In a previous post I outlined a GPS project I am working on.
http://collab.mathsoft.com/read?131519,15e#131519
The work has been interesting. I'm posting my "proof of concept" work that relates to identifying an unknown scan.
The logic is this: GPS scans keep coming in, and we want to classify each unknown scan into a specific premises collection group (these groups are then assigned to specific premises later).
When a new GPS scan occurs we can then use the current group information to classify it into one of the existing groups.
Note - a simple "confidence interval" will not work very well, as some houses and larger complexes move their bins up or down the street by up to 20 metres or so. This gives a cluster shaped like an ellipse rather than a circle. Cluster analysis handles this quite nicely, though.
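To illustrate why an elliptical cluster needs more than a simple radius check, here is a minimal Python sketch (not the production code). It assumes scan positions have already been converted to local x/y offsets in metres, with x running along the street. The sample points and the function names `mean_and_cov` and `mahalanobis_sq` are made up for the example. It uses the squared Mahalanobis distance, which is one common way to measure distance relative to a cluster's shape; the actual cluster analysis may use something different.

```python
import math

def mean_and_cov(points):
    """Return the mean (mx, my) and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a point from a 2-D cluster."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance matrix
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # d^T * inverse(cov) * d, written out for the 2x2 case
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical scans for one premises: bins wander along the street (x)
# by up to about 20 m, but hardly move across it (y).
scans = [(-18.0, 0.5), (-9.0, -0.8), (0.0, 0.2), (8.0, -0.4), (19.0, 0.5)]
mean, cov = mean_and_cov(scans)

along_street = (15.0, 0.0)   # 15 m from the centre, along the street
across_street = (0.0, 15.0)  # 15 m from the centre, across the street

d_along = mahalanobis_sq(along_street, mean, cov)
d_across = mahalanobis_sq(across_street, mean, cov)
euclid_along = math.hypot(along_street[0] - mean[0], along_street[1] - mean[1])
euclid_across = math.hypot(across_street[0] - mean[0], across_street[1] - mean[1])
```

Both test points are the same straight-line distance from the cluster centre, so a circular tolerance treats them identically. The Mahalanobis distance scores the along-street point as far closer to the cluster, which matches how the bins actually move.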
The end process creates some 50,000 or so clusters. The dataset is so big that Mathcad can't handle it, and the precision required is so high that it is beyond SQL Server. I have to use the "Decimal" variable type in C# in order to process all of the data.
However, I have produced a simple table of "loadings" so that any new scan can be identified in moments. The driver can be presented with a list of "most likely" locations (the premises with the highest scores from the classifying function) and can then select the correct location.
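As a rough sketch of how a loadings table can drive the lookup, the Python below scores a scan against each group with a linear classification function (a constant plus one weight per coordinate) and returns the best few candidates. Everything in it is an assumption for illustration: the premises names, the group centres, the shared (pooled) covariance, and the helpers `build_loadings` and `rank_premises`. The real table may be built differently, and the real data needs more precision than the plain floats used here.

```python
def build_loadings(group_means, inv_cov):
    """Build (constant, weight_x, weight_y) per group from group means
    and the inverse of an assumed pooled 2x2 covariance matrix."""
    loadings = {}
    for name, (mx, my) in group_means.items():
        wx = inv_cov[0][0] * mx + inv_cov[0][1] * my
        wy = inv_cov[1][0] * mx + inv_cov[1][1] * my
        const = -0.5 * (wx * mx + wy * my)
        loadings[name] = (const, wx, wy)
    return loadings

def rank_premises(scan, loadings, top_n=3):
    """Return the top_n group names, highest classification score first."""
    x, y = scan
    scores = [(const + wx * x + wy * y, name)
              for name, (const, wx, wy) in loadings.items()]
    scores.sort(reverse=True)
    return [name for _, name in scores[:top_n]]

# Hypothetical group centres in local metres (x along the street).
group_means = {
    "premises_A": (0.0, 0.0),
    "premises_B": (30.0, 0.0),
    "premises_C": (60.0, 5.0),
}
# Inverse of an assumed pooled covariance [[200, 0], [0, 1]]:
# wide spread along the street, tight spread across it.
inv_cov = ((0.005, 0.0), (0.0, 1.0))

loadings = build_loadings(group_means, inv_cov)
candidates = rank_premises((12.0, 0.3), loadings, top_n=2)
```

Once the table exists, scoring a new scan costs one multiply-and-add per coordinate per group, which is why the lookup is quick even with many groups. The shortlist is what the driver would pick from.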
Philip
___________________
Nobody can hear you scream in Euclidean space.