1-Visitor
June 10, 2014
Question

k-means worksheet


Hello,

I'm trying to write an algorithm to calculate centroids of 3 clusters obtained from a data set using the k-means method.

Attached is the MC12 worksheet with the algorithm. For some reason, I only get the correct centroid for the first cluster. When I use the same function to calculate the centroids of the second and third clusters, I get the same wrong answer for both.
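For comparison, here is a minimal sketch of the centroid step in Python, not the attached Mathcad worksheet. It assumes each point already carries a cluster label from 0 to k-1; the names `cluster_centroids`, `points`, and `labels` are made up for illustration. Each centroid is accumulated from its own cluster's points only, which is one thing worth checking when several clusters return the same value.

```python
def cluster_centroids(points, labels, k):
    """Return the mean point of each cluster 0..k-1 (None for an empty cluster)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]  # per-cluster coordinate sums
    counts = [0] * k                        # per-cluster point counts
    for point, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += point[d]
    # Divide each sum by its own cluster's count, not by a shared total.
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(k)
    ]


# Hypothetical sample data: six 2-D points already assigned to three clusters.
points = [(1.0, 1.0), (3.0, 1.0), (10.0, 10.0), (12.0, 14.0), (-5.0, 0.0), (-7.0, 2.0)]
labels = [0, 0, 1, 1, 2, 2]
centroids = cluster_centroids(points, labels, 3)
print(centroids)
```

With this sample data the three centroids differ, so a result where clusters two and three coincide would point to the selection of points per cluster rather than to the averaging itself.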

I found that somebody had posted a worksheet with k-means algorithm a long time ago at this thread:

http://communities.ptc.com/thread/17242

but the worksheet has been removed from the thread.

Thank you,

Gigi


25-Diamond I
June 10, 2014

I found that somebody had posted a worksheet with k-means algorithm a long time ago at this thread:

http://communities.ptc.com/thread/17242

but the worksheet has been removed from the thread.

Yes, it's a sad story that PTC wasn't able to merge the old and valuable collab forum into this community thingy.

Attached from my archive are the three files that were attachments in that thread, as well as a PDF of how the thread originally looked. Hope it helps.

19-Tanzanite
June 10, 2014

Those don't work in MC15. I forget why, but at some point I fixed it. Here's a new version.

I am not sure that I ever fixed the way Dunn's validity index works. If not, then although it is a validity index, it's not Dunn's.
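For reference, a minimal Python sketch of Dunn's index in one common form: the smallest distance between points of different clusters divided by the largest distance between points of the same cluster. This is not the worksheet's implementation. The function name `dunn_index` and the sample data are made up for illustration, and other choices of between-cluster distance and cluster diameter exist. It assumes at least two clusters and at least one cluster with two or more distinct points.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    # Diameter: largest distance between two points of the same cluster.
    max_diameter = max(
        dist(p, q) for cluster in clusters for p, q in combinations(cluster, 2)
    )
    # Separation: smallest distance between points of two different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter


# Hypothetical sample data: two small, well-separated clusters.
clusters = [
    [(0.0, 0.0), (0.0, 3.0)],
    [(10.0, 0.0), (10.0, 4.0)],
]
value = dunn_index(clusters)
print(value)
```

Larger values indicate compact, well-separated clusters under this definition.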

Edit: where did you get the original thread layout from? The Wayback Machine?

25-Diamond I
June 10, 2014

Edit: where did you get the original thread layout from? The Wayback Machine?

No, before the old collab closed down I tried to make a copy of the site using HTTrack. Unfortunately that software wasn't smart enough to sort out multiple links to the same page, so the very same page was loaded over and over again, and eventually the software would stop downloading further files. The result is an incomplete copy which consists of a myriad of duplicate (very small) files. I guess most of the pages are there at least once, but as soon as a series of "next-next-..." links is broken, I get no access to the following pages. So it's pure luck whether I am able to retrieve an old thread or not. I tried several times before the site closed and kept three copies. The smallest consists of approx. 150,000 files and takes up 4.5 GB, while the largest consists of more than half a million files and takes up about 30 GB. I host them on an older USB 2.0 external drive, which makes access time even worse. Copying or zipping one of those copies takes hours.

I never was successful in accessing the collab using the Wayback Machine.