1-Visitor
December 3, 2021
Solved

How to interpret Predictive Scoring & Important Field Weights


Hello community, I hope you are all doing well.


I have a question about ThingWorx Analytics, specifically about how to use the Predictive Scoring option available in Analytics Builder and how to interpret its results. I finished the learning path "Vehicle Predictive Pre-Failure Detection with ThingWorx Platform", which helped me understand several ThingWorx Analytics concepts and let me generate predictions for values in "real time".

 

I would like to complement the predictions with an uncertainty estimate or other practical information. Unfortunately, the guide does not cover topics such as Predictive Scoring or confidence models. I experimented on my own, using the data and the model from the guide to run Predictive Scoring. The tests ran successfully, but I do not know how to give the Important Field Weights a practical meaning. In addition, according to the ThingWorx Analytics 9 documentation, confidence models (which provide a measure of uncertainty about the prediction) are only available for continuous or ordinal data.

 

So I would like to know if there is extra information with which I can complement the predictions for the example "Vehicle Predictive Pre-Failure Detection with ThingWorx Platform", and how I could interpret the Important Field Weights. 

 

At the end of this post I attach an image with two predictive scoring results and their Important Field Weights (Feature Weight).

 

Thank you for reading.

 

predictive_scoring_results.png

 

Best answer by sniculescu

Hello @unknown ,

 

To get practical value from this, I recommend discussing it with our Field team. They can advise you once you have a concrete use case, or walk through sample use cases.

 

Regarding your questions:

 

1) You can check model calibration by splitting your validation-set predictions into bins: [0, 0.1), [0.1, 0.2), ..., [0.9, 1]. Then, for each bin, compute the predicted risk (the average prediction) and the actual risk (the percentage of failures in your data). Plot predicted vs. actual risk for each bin. If the model is calibrated as a probability, the points will lie close to the main diagonal. Note that even if the model is not calibrated, it can still be used very successfully for risk prediction, but the predictions cannot be interpreted as probabilities. In that case, the model automatically identifies the "optimal" threshold to transform the scores ("_mo" values) into failure predictions.
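
The binning check described in 1) can be sketched in plain Python. This is only an illustration, not something ThingWorx Analytics provides: the function name `calibration_table`, the assumption that the scores (for example the "_mo" values) already lie in [0, 1], and the toy numbers in the demo are all hypothetical. It prints a table instead of plotting, so the rows can be pasted into any charting tool.

```python
def calibration_table(scores, outcomes, n_bins=10):
    """Group predictions into equal-width bins on [0, 1].

    Returns one tuple per non-empty bin:
    (bin_low, bin_high, count, mean_predicted, actual_failure_rate).
    `outcomes` holds 1 for an observed failure and 0 otherwise.
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    bins = [[] for _ in range(n_bins)]
    for score, outcome in zip(scores, outcomes):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores are assumed to lie in [0, 1]")
        # The last bin is closed on the right, i.e. [0.9, 1] for ten bins.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append((score, outcome))
    rows = []
    for index, members in enumerate(bins):
        if not members:
            continue  # nothing to compare in an empty bin
        mean_predicted = sum(s for s, _ in members) / len(members)
        actual_rate = sum(o for _, o in members) / len(members)
        rows.append((index / n_bins, (index + 1) / n_bins,
                     len(members), mean_predicted, actual_rate))
    return rows


if __name__ == "__main__":
    # Toy validation data, made up purely for illustration.
    demo_scores = [0.05, 0.12, 0.18, 0.45, 0.52, 0.58, 0.85, 0.91, 0.97]
    demo_outcomes = [0, 0, 1, 0, 1, 1, 1, 1, 1]
    for low, high, count, predicted, actual in calibration_table(demo_scores, demo_outcomes):
        print(f"[{low:.1f}, {high:.1f}) n={count} predicted={predicted:.2f} actual={actual:.2f}")
```

For a calibrated model, the `predicted` and `actual` columns should be close to each other in every bin that has a reasonable number of records. Bins with very few records are noisy and should not be over-interpreted.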

 

2) Typically, the requirements for the solution depend on the use case. From a data science perspective, you want enough high-quality data to build accurate models. There is no hard and fast number for the size of the dataset, but generally speaking, the more variables you want to model, the more data you need. Also, if failures are rare, you need to track the process or assets over a longer period of time to make sure enough failures are collected. Before building models you may want to perform some data cleanup (drop variables with too much missing data, fill in the remaining missing values, check for outliers and incorrect values, etc.).
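
As a rough sketch of the cleanup steps mentioned in 2), the following plain-Python example drops sparse columns, fills the remaining gaps, and flags outliers. Everything here is an assumption made for illustration: the helper names, the 50% missing-data cutoff, median imputation, and the 1.5 x IQR outlier rule are common defaults, not ThingWorx Analytics settings, and the right choices depend on your data.

```python
from statistics import median, quantiles


def missing_fraction(rows, column):
    """Fraction of records in which `column` is missing (None)."""
    return sum(1 for row in rows if row.get(column) is None) / len(rows)


def clean(rows, max_missing=0.5):
    """Drop columns with too much missing data, then fill the rest.

    Assumes numeric columns, a non-empty list of dicts sharing the same
    keys, and max_missing < 1 so every kept column has at least one value.
    """
    columns = list(rows[0].keys())
    kept = [c for c in columns if missing_fraction(rows, c) <= max_missing]
    medians = {c: median(row[c] for row in rows if row[c] is not None)
               for c in kept}
    return [{c: (row[c] if row[c] is not None else medians[c]) for c in kept}
            for row in rows]


def flag_outliers(rows, column, k=1.5):
    """Indices of records outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([row[column] for row in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, row in enumerate(rows) if not low <= row[column] <= high]


if __name__ == "__main__":
    # Made-up sensor records; "pressure" is mostly missing and gets dropped.
    records = [
        {"temperature": 71.0, "pressure": None},
        {"temperature": None, "pressure": None},
        {"temperature": 73.0, "pressure": 2.1},
        {"temperature": 75.0, "pressure": None},
    ]
    print(clean(records))
```

Flagged outliers are candidates for review rather than automatic deletion. In a failure-prediction setting an extreme reading may be exactly the signal that precedes a failure.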

 

Regards,

 

--Stefan

1 reply

17-Peridot
December 3, 2021

@unknown ,

 

Thank you for posting to the PTC Community.

 

Your questions are a bit advanced, but I will do my best to assist you.

 

Have you had the opportunity to review our HelpCenter documentation, "Working With Predictive Models"?

 

We also have an older Community Post, which can be found here: PTC Community - Predictive Analytics

Regards,

 

Neel

 

1-Visitor
December 9, 2021

Hi Neel, thanks for your time.

 

My goal is to explore the scope and limitations of ThingWorx Analytics and the technical requirements for turning a business problem into an IoT + Analytics project, and to be able to explain these points to potential customers. Ideally it would be great to have a mockup of Prescriptive Models to show the potential of ThingWorx Analytics, but Predictive Models are enough for now. Any information that complements the prediction, such as the likelihood of the prediction, is appreciated. For this purpose I have been studying the field weights of the Predictive Scores.

 

The documentation explains the field weights as follows:
"Important field weights - For each important field, a field weight represents the relative impact of that field on the target variable. If the field weights of all fields in a training data set could be summed for a record, the sum would equal 1. In the sample results shown below, the weights of the important fields in each row add up to something less than one."
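
To make the arithmetic in that quote concrete, here is a minimal sketch. It only illustrates why the displayed weights add up to less than 1 when the weights over all fields sum to 1 and only the top few fields are shown. The field names and numbers are made up, and the code says nothing about how ThingWorx Analytics actually computes the weights.

```python
def top_fields(weights, k):
    """Return the k (field, weight) pairs with the largest weights."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical weights for ONE scored record over ALL training fields.
record_weights = {
    "field_a": 0.30,
    "field_b": 0.25,
    "field_c": 0.20,
    "field_d": 0.15,
    "field_e": 0.10,
}

if __name__ == "__main__":
    important = top_fields(record_weights, 3)
    shown_total = sum(weight for _, weight in important)
    print("all fields sum to:", round(sum(record_weights.values()), 6))
    print("important fields:", important)
    print("important fields sum to:", round(shown_total, 6))
```

Under this reading, the gap between the displayed sum and 1 is the combined weight of the fields that were not listed as important for that record.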

The interpretation of the weights seems to be similar to Signals, where each signal is given a value for its relevance to the target, but in this case the weights apply to a single record only. Unlike Signals, there is no indication of the measurement method (for example, Mutual Information in the case of Signals).

It would be great if you could confirm or refute this hypothesis.