Predicting time to failure (TTF) or remaining useful life (RUL) is a common need in the IIoT world.
Here we look at some ways to implement it.
We are going to use one of the publicly available NASA datasets, which simulates turbofan engine degradation (https://c3.nasa.gov/dashlink/resources/139/).
The original dataset has 26 columns, as follows:
Column 1 – asset id
Column 2 – cycle/time of sensor data collection
Columns 3-5 – operational settings
Columns 6-26 – sensor measurements
In the training dataset, the sensor measurements end when the failure occurs.
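Because each asset's records stop at the failure, a per-row TTF can be derived by subtracting the current cycle from that asset's last recorded cycle. The snippet below is a minimal sketch of that idea, not the exact procedure used for the attached files. The field names `asset_id`, `cycle` and `ttf` are hypothetical, the sample rows are made up, and it assumes the last record of an asset gets a TTF of 0.

```python
def add_ttf(rows):
    """Return copies of the rows with a 'ttf' field (cycles left before failure)."""
    # First pass: find the last recorded cycle of each asset (assumed to be the failure).
    last_cycle = {}
    for row in rows:
        asset = row["asset_id"]
        last_cycle[asset] = max(last_cycle.get(asset, 0), int(row["cycle"]))
    # Second pass: TTF = last cycle of the asset minus the current cycle.
    return [dict(row, ttf=last_cycle[row["asset_id"]] - int(row["cycle"])) for row in rows]


# Made-up sample (not real NASA data): asset 1 runs 3 cycles, asset 2 runs 2 cycles.
sample = [
    {"asset_id": "1", "cycle": "1"},
    {"asset_id": "1", "cycle": "2"},
    {"asset_id": "1", "cycle": "3"},
    {"asset_id": "2", "cycle": "1"},
    {"asset_id": "2", "cycle": "2"},
]
with_ttf = add_ttf(sample)
print([row["ttf"] for row in with_ttf])  # [2, 1, 0, 1, 0]
```

Whether the final record should count as 0 or 1 remaining cycle is a convention to settle with the business need.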
Since the prediction model is based on historic data, the data collection is a critical point.
In some cases the data will already have been collected in the past and you need to make the best of it. See the Data preparation chapter below.
In situations where you are still collecting data, a few points are good to keep in mind; some may or may not apply depending on the type of data to be collected.
TTF business need
Before going into data preparation and model creation, we need to understand what information is important in terms of TTF prediction for our business need.
There are several ways to conceive the TTF, for example:
The picture below shows the 3 different types of TTF listed above
Once this TTF column is defined, we may need to transform it further depending on the path we choose for TTF prediction, as described in the TTF business need chapter.
In the case of the NASA dataset we are choosing a range TTF with values of more100, 50to100, 10to50 and less10 to represent the number of remaining cycles till the predicted failure.
This is the information we need to predict in order to plan a suitable maintenance action.
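As a rough illustration, a numeric TTF can be mapped to these four labels with a small function. This is only a sketch: the post does not say which range the boundary values 10, 50 and 100 belong to, so the choices below (10 goes to 10to50, 50 and 100 go to 50to100) are assumptions, and the function name is hypothetical.

```python
def ttf_to_range(ttf):
    """Map a remaining-cycle count to one of the four range labels."""
    # Assumption: boundary handling is our own choice, the post does not specify it.
    if ttf < 10:
        return "less10"
    if ttf < 50:
        return "10to50"
    if ttf <= 100:
        return "50to100"
    return "more100"


print([ttf_to_range(t) for t in (0, 9, 10, 49, 50, 100, 101)])
# ['less10', 'less10', '10to50', '10to50', '50to100', '50to100', 'more100']
```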
Our transformed TTF column looks as below:
Once the data in CSV format is ready, we need to create the JSON file that represents the metadata.
In the case of a range TTF, this will be defined as an ordinal goal as below (see the attachment for the full metadata JSON file)
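The attached JSON file remains the authoritative version. Purely as an unofficial sketch, the entry for the goal column might be generated as follows. The key names (`fieldName`, `dataType`, `opType`, `values`) reflect my understanding of the ThingWorx Analytics metadata layout and should be checked against the attachment and the product documentation; the column name `TTF` is an assumption.

```python
import json

# Assumption: key names follow the ThingWorx Analytics metadata layout as I understand it.
ttf_field = {
    "fieldName": "TTF",   # hypothetical name of the goal column in the CSV
    "dataType": "STRING",
    "opType": "ORDINAL",
    # For an ordinal field the order of the values carries meaning: lowest range first.
    "values": ["less10", "10to50", "50to100", "more100"],
}

# A full metadata file would hold one such entry per column of the dataset.
metadata_json = json.dumps([ttf_field], indent=2)
print(metadata_json)
```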
Once the data is ready it can be uploaded into ThingWorx Analytics and work on the prediction model can start.
ThingWorx Analytics is designed to make machine learning easy and accessible to non data scientists, so these steps will be easier than with other solutions.
However, some trial and error is needed to refine the model, which may also involve reworking the dataset.
In the case of the NASA dataset, since we are using an ordinal goal, we need to run the training through the API.
A more productive way to do this is through a mashup and services (see How to work with ordinal and categorical data in ThingWorx Analytics ? for an example).
As a test, the TrainingThing.CreateJob service can be called directly from the Composer, as shown below:
Once the model is created we can check some performance statistics in ThingWorx Analytics Builder or, in the case of an ordinal goal, via the ValidationThing.RetrieveResults service. The most relevant result for an ordinal goal is the confusion matrix.
Here is the confusion matrix I get:
Another validation is to compute some PVA (Predicted Vs Actual) results for some validation data.
ThingWorx Analytics runs validation automatically when using ThingWorx Analytics Builder and presents some useful performance metrics and graphs. In the case of an ordinal goal, we still get this automatic validation run (hence the confusion matrix above), but no PVA graph or data is available. This can be done manually if some data is kept aside and not passed to the training microservice. Once the model is completed, we can score this validation dataset (using PredictionThing.RealTimeScore or BatchScore for an ordinal goal, or the Builder UI for other goals) and compare the predicted results with the actual values.
Here is one example:
Depending on the business case, this model can be deemed acceptable or may need rework, such as changing the range values, changing the learners’ parameters, modifying the dataset, and so on.
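To help with that judgement, the manual comparison of predicted and actual ranges described above can be sketched in a few lines of Python. This is an illustration only and does not call any ThingWorx service: the two lists stand in for the actual values of a held-out validation set and the scoring results, the sample values are made up, and the function names are hypothetical.

```python
from collections import Counter

LABELS = ["less10", "10to50", "50to100", "more100"]  # ordered from lowest to highest TTF


def confusion_matrix(actual, predicted, labels=LABELS):
    """Rows are actual ranges, columns are predicted ranges."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of predictions that hit the exact range (diagonal of the matrix)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def within_one_range(actual, predicted, labels=LABELS):
    """Share of predictions at most one range away from the actual one."""
    rank = {label: i for i, label in enumerate(labels)}
    hits = sum(abs(rank[a] - rank[p]) <= 1 for a, p in zip(actual, predicted))
    return hits / len(actual) if actual else 0.0


# Made-up validation values, for illustration only.
actual = ["less10", "10to50", "10to50", "50to100", "more100", "more100"]
predicted = ["less10", "10to50", "50to100", "50to100", "more100", "10to50"]

matrix = confusion_matrix(actual, predicted)
for label, row in zip(LABELS, matrix):
    print(f"{label:>8}: {row}")
print("exact:", round(accuracy(matrix), 3))
print("within one range:", round(within_one_range(actual, predicted), 3))
```

Since the goal is ordinal, a prediction that lands in the neighbouring range is usually a smaller error than one that is two ranges off, which is why the second measure can be useful next to plain accuracy.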
A fair amount of experimentation is certainly needed before reaching the optimal model, but hopefully this post gives some good starting points.
Original Dataset attached as train_FD001-original.csv
Transformed dataset attached as train_FD001-TTF-transformed.csv
json metadata file for transformed dataset attached as train_FD001-ttford.json
Very nice article. Thanks for the post.
Could you please help me understand how to create the graph shown below (predicted vs actual for ordinal goals)? I could not find a suitable service to draw it. I see that RetrieveResult from PredictionThing gives the predicted values in an infotable, but I could not find any service that returns the actual values in an infotable to compare with.
Thank you for your comment.
The PVA can be retrieved using the RetrievePVAs service of the ValidationThing. However, this does not support categorical or ordinal goals, so if you are using a goal of this type there is no direct way.
I created the mentioned graph in a very manual way (there might be ways to automate this, but I did not investigate as I just needed a quick graph for illustration purposes)
- I scored a validation set (for which I therefore knew the values of the goal)
- I then compared the predicted values (result of the scoring job) with the actual values from the original validation set.
- the graph was simply created in Excel with the 2 series as input.
Hope this helps