How to score new data with ThingWorx Analytics?
The following applies to ThingWorx Analytics (TWA) 8.3.0 and later.
Overview
Once a model has been trained, one of the main objectives is to score new data, that is, to predict the value of the goal for that data.
ThingWorx Analytics can score new data in two ways: batch scoring and real-time scoring.
Batch scoring
Batch scoring is used when a large amount of data needs to be scored.
To perform batch scoring, we usually follow steps similar to these:
Uploading the new data can be done in different ways.
Here we focus on ThingWorx Analytics Builder, that is, uploading new data via a CSV file.
To run the scoring job only on the new data in step 4 above, we need to be able to filter the newly added rows.
If the dataset already has a suitable column/feature, such as a timestamp, we can use it to score only the rows where timestamp > newdate, assuming all data is in chronological order.
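The timestamp filter described above can be illustrated outside of ThingWorx Analytics. The following is a minimal sketch in plain Python, not ThingWorx Analytics code: the column names, the sample values and the helper name rows_after are all made up for illustration, and timestamps are assumed to be in ISO 8601 format.

```python
import csv
import io
from datetime import datetime


def rows_after(csv_text, timestamp_column, new_date):
    """Return the rows whose timestamp is strictly later than new_date."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        # Assumption: timestamps are ISO 8601 strings, e.g. 2018-06-01T00:00:00
        if datetime.fromisoformat(row[timestamp_column]) > new_date
    ]


# Hypothetical data: one historic row, two rows added after the cut-off date.
sample = """timestamp,temperature
2018-01-01T00:00:00,20.5
2018-06-01T00:00:00,22.1
2018-09-01T00:00:00,19.8
"""

new_rows = rows_after(sample, "timestamp", datetime(2018, 5, 1))
print([row["timestamp"] for row in new_rows])
```

Only the rows later than the cut-off date are kept, which is the same selection the scoring job filter needs to express.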
If the dataset has no such feature, we have to add one beforehand, when we first upload the historic data in step 1 above.
We often use a new column/feature named record_purpose for this.
The initial data can take the value training for this record_purpose feature, since it is used to create the initial model.
The newly added data to be scored can then take any value that identifies only those rows.
Note that the record_purpose feature must be set with the opType INFORMATIONAL so that the learning algorithms do not take it into account.
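One possible way to prepare the CSV files before uploading them is sketched below. This is plain Python run outside of ThingWorx Analytics; the helper name tag_rows, the sensor columns and the value scoring are assumptions chosen for illustration (any value that identifies the new rows would do).

```python
import csv
import io


def tag_rows(csv_text, purpose):
    """Return csv_text with an extra record_purpose column set to purpose."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["record_purpose"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["record_purpose"] = purpose
        writer.writerow(row)
    return out.getvalue()


# Hypothetical data: historic rows are tagged "training", new rows "scoring".
historic = "sensor_a,sensor_b\n1.0,2.0\n"
new = "sensor_a,sensor_b\n3.0,4.0\n"

tagged_historic = tag_rows(historic, "training")
tagged_new = tag_rows(new, "scoring")
print(tagged_new)
```

The scoring job can then be filtered on record_purpose so that only the rows tagged for scoring are processed.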
The video below shows these steps in ThingWorx Analytics Builder.
Real time scoring
Real-time scoring is better suited to small amounts of data.
Real-time scoring can be done either via the Analytics Server PredictionThing RealTimeScore service or using the Analytics Manager framework.
The posts "How to work with ordinal and categorical data in ThingWorx Analytics" and "Analytics: Prediction Methods Mashup" give examples of using the RealTimeScore service.
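For readers who want to call the RealTimeScore service from outside the platform, the sketch below shows how such a call could be prepared with the generic ThingWorx REST pattern for invoking a service on a Thing. Several things here are assumptions: the host name, the Thing name AnalyticsServer_PredictionThing and the application key are placeholders, and the service's input parameters are not listed in this post, so they are passed through as-is rather than guessed. The request is only built, not sent.

```python
import json
import urllib.request


def build_service_request(base_url, thing_name, service_name, app_key, params):
    """Build (but do not send) a POST request that invokes a Thing service."""
    url = (
        f"{base_url.rstrip('/')}/Thingworx/Things/"
        f"{thing_name}/Services/{service_name}"
    )
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "appKey": app_key,  # application key used for authentication
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Placeholder values: replace the host, Thing name, key and parameters
# with those of your own installation and model.
request = build_service_request(
    "https://twx.example.com",
    "AnalyticsServer_PredictionThing",  # assumed Thing name
    "RealTimeScore",
    "<your-app-key>",
    {},  # service inputs: see the Help Center for the actual parameters
)
print(request.full_url)
# urllib.request.urlopen(request) would send the request to the server.
```

Keeping the parameters as a plain dictionary means the same helper can be reused for other services exposed by the same Thing.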
We will concentrate below on the Analytics Manager.
The process involves the following steps:
The Help Center has more details about this process.
The following video shows these steps.
The following articles may also be of interest on this topic:
Note that the AnalyticsServerConnector connector in release 8.3 replaces the ThingPredictor connector from previous releases.
Hi Christophe,
Excellent post 🙂
Could you please share the sample dataset (beanpro) used in the demo videos?
Thanks
Tushar
Hi @tushar
The beanpro dataset used is a subset of the one provided with the extension.
I have simply removed some columns to make it easier to see in some UI.
I am attaching it here.
Kind regards
Christophe
Hi there. I think it is a quite interesting question. Thanks for this great post.