
How to score new data with ThingWorx Analytics 8.3.x?

 

The following applies to ThingWorx Analytics (TWA) 8.3.0 and later.

 

Overview

 

Once a training model has been created, one of the main objectives is to score new data, that is, to predict the value of the goal for that data.

ThingWorx Analytics can score new data in two ways:

  1. Batch scoring
  2. Real time scoring

Batch scoring

 

Batch scoring is used when a large amount of data needs to be scored.

A batch scoring run usually follows steps similar to these:

  1. Upload the historic data
  2. Create a new model with this historic data
  3. Upload the new data, that is, the data to be scored
  4. Run a prediction job to score the new data
  5. Retrieve the prediction job result

The new data can be uploaded in different ways.

  • For a large amount of data, it can be easier to upload it via a CSV file, in the same way as the historic data. This is the approach used in ThingWorx Analytics Builder.
  • For a more limited amount of data, it can be sent in the body of the scoring request.
    The post Analytics: Prediction Methods Mashup shows a good example of how to do this using the PredictionThing.BatchScore service.

Below we focus on ThingWorx Analytics Builder, that is, on uploading the new data via a CSV file.

To run the scoring job only on the new data in step 4 above, we need to be able to filter on the added rows.
If the dataset already has a suitable column/feature, such as a timestamp, we can use it to score only the rows where timestamp > newdate, assuming all data is in chronological order.
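
The timestamp cutoff described above can be illustrated with a small Python sketch. It only shows the selection logic; in ThingWorx Analytics Builder the filter itself is defined on the prediction job. The column name timestamp, the sample rows and the cutoff date are assumptions made up for this example.

```python
from datetime import datetime


def rows_after(rows, cutoff, column="timestamp"):
    """Keep only the rows whose timestamp is strictly later than `cutoff`."""
    return [r for r in rows if datetime.fromisoformat(r[column]) > cutoff]


# Hypothetical rows: the first one stands for historic data,
# the last two for data added after the model was trained.
rows = [
    {"timestamp": "2018-09-01T08:00:00", "temperature": "71.2"},
    {"timestamp": "2018-09-10T08:00:00", "temperature": "88.9"},
    {"timestamp": "2018-09-11T08:00:00", "temperature": "90.4"},
]

newdate = datetime(2018, 9, 5)  # assumed cutoff between historic and new data
to_score = rows_after(rows, newdate)
print([r["timestamp"] for r in to_score])
```

This only separates old from new rows correctly if every new row has a later timestamp than every historic row, which is the chronological-order assumption mentioned above.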

If the dataset has no such feature, we have to add one beforehand, when we first upload the historic data in step 1 above.

We often use a new column/feature named record_purpose for this.

The initial data can take the value training for this record_purpose feature, since it is used to create the initial model.

The newly added data to be scored can then take any other value that identifies only those rows.
It is important to note that the record_purpose feature must be set with the opType INFORMATIONAL so that it is not taken into account by the learning algorithms.
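
As a rough illustration, the record_purpose column can be added to the CSV files before they are uploaded. The sketch below uses only the Python standard library. The sample columns (temperature, pressure, failure) and the value scoring for the new rows are assumptions for this example, not values required by ThingWorx Analytics.

```python
import csv
import io


def tag_rows(csv_text, purpose, column="record_purpose"):
    """Return the CSV text with an extra column set to `purpose` on every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = purpose
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data; the goal column is empty for the rows to be scored.
historic = "temperature,pressure,failure\n71.2,1.01,0\n88.9,1.30,1\n"
new_rows = "temperature,pressure,failure\n90.4,1.27,\n"

tagged_historic = tag_rows(historic, "training")
tagged_new = tag_rows(new_rows, "scoring")
print(tagged_historic)
print(tagged_new)
```

The prediction job can then be filtered on the value given to the new rows. Marking record_purpose as INFORMATIONAL is still done in the dataset metadata, not in this script.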

 

The video below shows those steps while using ThingWorx Analytics Builder

 

Real time scoring

 

Real time scoring is better suited for small amounts of data.

Real time scoring can be done either via the Analytics Server PredictionThing RealTimeScore service or using the Analytics Manager framework.

The posts How to work with ordinal and categorical data in ThingWorx Analytics and Analytics: Prediction Methods Mashup give examples of the use of the RealTimeScore service.
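
For readers who want to call the RealTimeScore service from outside ThingWorx, the sketch below shows how such a call might be prepared through the generic ThingWorx REST pattern for Thing services (POST to /Thingworx/Things/<thing>/Services/<service>, with an appKey header). The host, the Thing name MyPredictionThing and the application key are placeholders. The service parameters are not described in this post, so the payload is left empty here and has to be taken from the service definition on your own server.

```python
import json
import urllib.request


def build_service_request(base_url, thing, service, app_key, payload):
    """Prepare (but do not send) a POST request to a ThingWorx Thing service."""
    url = f"{base_url.rstrip('/')}/Thingworx/Things/{thing}/Services/{service}"
    headers = {
        "appKey": app_key,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values: replace them with your own server, Thing name and key.
# The empty payload is an assumption; fill it in from the service definition.
request = build_service_request(
    "https://thingworx.example.com",
    "MyPredictionThing",
    "RealTimeScore",
    "<app-key>",
    {},
)
print(request.full_url)
```

The request is only built, not sent. Passing it to urllib.request.urlopen would perform the call once the payload matches what the service expects.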

 

We will concentrate below on the Analytics Manager.
The process involves the following steps:

  1. In Analytics Manager
    1. Create an Analysis Provider that uses the AnalyticsServerConnector connector
    2. Publish the model created in ThingWorx Analytics Builder to Analytics Manager
    3. Enable the model created
    4. Create an Analysis Event
    5. Map the properties to the datashape field
    6. Enable the Event
  2. In ThingWorx Composer
    1. Relevant properties of the Thing used in the Analysis Event are updated in some way
    2. This triggers the analysis job
    3. The scoring result is populated into the result property mapped in the Analysis Event

The Help Center has more details about this process.

The following video shows those steps

The following articles may also be of interest for this topic:

Note that the AnalyticsServerConnector connector in release 8.3 replaces the ThingPredictor connector from previous releases.

Comments

Hi Christophe,

Excellent post  🙂

Could you please share the sample dataset (bean pro) used in the demo videos?

 

Thanks

Tushar

Hi @tushar

 

The beanpro dataset used is a subset of the one provided with the extension.

I have simply removed some columns to make it easier to see in some UI.

 

I am attaching it here.

Kind regards

Christophe

 

 

Hi there. I think it is a quite interesting question. Thanks for this great post.

 

Version history
Last update: Sep 11, 2018 10:00 AM