Retraining the Model in ThingWorx Analytics
When using ThingWorx Analytics Products to build Prediction Models, it is not enough to end up with models that are a Technical Success. The purpose is to ultimately have models that are a Business Success. What the user would want to achieve is to have Models that remain reliable and accurate in a potentially changing production environment.
Therefore, when your environment changes, the model that you have used and relied on might no longer provide the same quality of results. Hence the need to retrain your model.
Types of Models to be retrained:
There are currently two types of models that are created with ThingWorx Analytics:
- Predictive models
- Anomaly Detection models
Each of those models could require retraining based on the context in which they are created then used.
When to retrain your Model:
- Predictive models:
For predictive analytics models, the main initiator for retraining would be a change in the production environment. resulting in the change of collected Dataset. This could nonetheless be caused by many factors:
- An overall change in the business objective: This could include a change in the granularity at which the Dataset is used. An Example, in a Company HR Dataset, could be moving from making predictions on a Department Level to making predictions on an Employee Level.
- The addition of either new features in the Dataset or even new values in the existing features which did not figure within the values of the training dataset. This type of change in the Dataset would require the retraining of the Model.
- The emergence of new trends in the marketplace: These new trends would appear in the generated Datasets. This could be detected by the degradation of the results that are provided by the existing Prediction Models.
- Anomaly Detection models:
In anomaly detection, the need to retrain the models originates mainly from a change in what is considered to be a Normal behavior of a certain monitored property. The could be caused by the following factors:
- A change in the context in which the Property values are measured then monitored: An example is monitoring the Traffic in a Street in the Working weekdays while excluding the weekends then adding the Weekend days to the monitored behavior. Here the change in the Traffic is normal however would be detected as an Anomaly unless the model is retrained.
- A change in the Thresholds of values accepted to be normal in a certain property. An example is the temperatures measured on a running device. The Device, previously, never run at full power when the model was built but since it started running at full power the temperature increased beyond the usual threshold and thus the model needs to be retrained to include the new Normal temperatures.
- Another reason that could justify retraining the anomaly detection model is simply that when the model was trained the Property values that were used were not representing its normal state. For example, the temperature of an Engine was being measured on a "Turned off" state when we are actually trying to build a model that would detect temperature anomalies on a running Engine.
This might not be an exhaustive list of the reasons that would require either a Predictive or an Anomaly Detection Model to be retrained.
As a general rule of thumb, if the model starts delivering results that are below expected or if the business context for the model is not valid, then it might be a wise decision to retrain the Analytics Model.