IoT Tips

This video is Module 11: ThingWorx Analytics Mashup Exercise of the ThingWorx Analytics Training videos. It shows you how to create a ThingWorx project and populate it with entities that collectively comprise a functioning application. 
View full tip
This video concludes Module 9: Anomaly Detection of the ThingWorx Analytics Training videos. It gives an overview of the "Statistical Process Control (SPC) Accelerator".
View full tip
This video continues Module 3: Data Profiling of the ThingWorx Analytics Training videos. It describes metadata, and how it is used to ensure that your data is handled appropriately when running Signals, Profiles, Training, Scoring, and other jobs inside ThingWorx Analytics.
View full tip
This video begins Module 3: Data Profiling of the ThingWorx Analytics Training videos. It describes the process of examining your data to make sure that it is suitable for the use case you would like to explore.
View full tip
Welcome to the ThingWorx Analytics Training Course! Through these 11 modules, you will learn all about the functionality of this software, as well as techniques to help you build a successful and meaningful predictive analytics application.
View full tip
This video begins Module 1: ThingWorx Analytics Overview of the ThingWorx Analytics Training videos. It covers some of the functionality of the ThingWorx platform, as well as ThingWorx Analytics capabilities.
View full tip
A feature is a piece of information that is potentially useful for prediction. Any attribute could be a feature, as long as it is useful to the model.

Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data. It’s a loosely defined set of tasks related to designing feature sets for Machine Learning applications.

Components:
First, understanding the properties of the task you’re trying to solve and how they might interact with the strengths and limitations of the model you are going to use.
Second, experimental work where you test your expectations and find out what actually works and what doesn’t.

Feature engineering as a technique has three subcategories: feature selection, dimension reduction, and feature generation.

Feature Selection: Sometimes called feature ranking or feature importance, this is the process of ranking the attributes by their value to the predictive ability of a model. Algorithms such as decision trees automatically rank the attributes in the data set. The top few nodes in a decision tree are considered the most important features from a predictive standpoint. As part of a process, feature selection using entropy-based methods like decision trees can be employed to filter out less valuable attributes before feeding the reduced dataset to another modeling algorithm. Regression-type models usually employ methods such as forward selection or backward elimination to select the final set of attributes for a model.

[Figure: Project Development decision tree]

Dimension Reduction: This is sometimes called feature extraction. The classic example of dimension reduction is principal component analysis, or PCA.
PCA allows us to combine existing attributes into a new data frame consisting of a much reduced number of attributes by utilizing the variance in the data. The combined attributes that "explain" the highest amount of variance in the data form the first few principal components, and we can ignore the remaining components if data dimensionality is a problem from a computational standpoint.

Feature Generation or Feature Construction: Quite simply, this is the process of manually constructing new attributes from raw data. It involves intelligently combining or splitting existing raw attributes into new ones that have higher predictive power. For example, a date stamp may be used to generate two new attributes, such as AM and PM, which may be useful in discriminating whether day or night has a higher propensity to influence the response variable. Feature construction is essentially a data transformation process.

Tips for Better Feature Engineering

Tip 1: Think about inputs you can create by rolling up existing data fields to a higher/broader level or category.
As an example, a person’s title can be categorized into strategic or tactical. Those with titles of “VP” and above can be coded as strategic. Those with titles “Director” and below become tactical. Strategic contacts are those that make high-level budgeting and strategic decisions for a company. Tactical are those in the trenches doing day-to-day work. Other roll-up examples include:
- Collating several industries into a higher-level industry: collate oil and gas companies with utility companies, for instance, and call it the energy industry, or fold high tech and telecommunications industries into a single area called “technology.”
- Defining “large” companies as those that make $1 billion or more and “small” companies as those that make less than $1 billion.

Tip 2: Think about ways to drill down into more detail in a single field.
As an example, a contact within a company may respond to marketing campaigns, and you may have information about his or her number of responses. Drilling down, we can ask how many of these responses occurred in the past two weeks, one to three months, or more than six months in the past. This creates three additional binary (yes=1/no=0) data fields for a model. Other drill-down examples include:
- Cadence: number of days between consecutive marketing responses by a contact: 1–7, 8–14, 15–21, 22+
- Multiple responses on same day flag (multiple responses = 1, otherwise = 0)

Tip 3: Split data into separate categories, also called bins.
For example, annual revenue for companies in your database may range from $50 million (M) to over $1 billion (B). Split the revenue into sequential bins: $50–$200M, $201–$500M, $501M–$1B, and $1B+. When a company falls within a revenue bin, that bin's field receives a one; otherwise the value is zero. There are now four new data fields created from the annual revenue field. Other examples are:
- Number of marketing responses by contact: 1–5, 6–10, 11+
- Number of employees in company: 1–100, 101–500, 501–1,000, 1,001–5,000, 5,001+

Tip 4: Think about ways to combine existing data fields into new ones.
As an example, you may want to create a flag (0/1) that identifies whether someone is a VP or higher and has more than 10 years of experience. Other examples of combining fields include:
- Title of director or below and in a company with fewer than 500 employees
- Public company and located in the Midwestern United States
You can even multiply, divide, add, or subtract one data field by another to create a new input.

Tip 5: Don’t reinvent the wheel. Use variables that others have already fashioned.

Tip 6: Think about the problem at hand and be creative. Don’t worry about creating too many variables at first; just let the brainstorming flow.
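The roll-up, binning, and combination tips above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names, the field names, the list of "strategic" titles, and the exact handling of bin edges are assumptions made for the example, not something prescribed by the tips.

```python
from datetime import datetime

# Assumed list of titles treated as "VP and above" (hypothetical, extend as needed).
STRATEGIC_TITLES = {"VP", "SVP", "EVP", "CEO", "CFO", "CTO"}


def title_level(title):
    """Tip 1: roll a job title up to 'strategic' or 'tactical'."""
    return "strategic" if title.strip().upper() in STRATEGIC_TITLES else "tactical"


def revenue_bins(revenue_musd):
    """Tip 3: one-hot encode annual revenue (in $M) into four sequential bins.

    Edge handling is an assumption: exactly $1B falls in the top bin.
    """
    return {
        "rev_50_200M": int(50 <= revenue_musd <= 200),
        "rev_201_500M": int(200 < revenue_musd <= 500),
        "rev_501M_1B": int(500 < revenue_musd < 1000),
        "rev_1B_plus": int(revenue_musd >= 1000),
    }


def am_pm_flags(timestamp):
    """Feature generation: derive AM/PM indicator fields from a date stamp."""
    is_am = int(timestamp.hour < 12)
    return {"is_am": is_am, "is_pm": 1 - is_am}


def senior_strategic_flag(title, years_experience):
    """Tip 4: combine two fields into one 0/1 flag (VP or higher AND > 10 years)."""
    return int(title_level(title) == "strategic" and years_experience > 10)


if __name__ == "__main__":
    print(title_level("Director"))
    print(revenue_bins(350))
    print(am_pm_flags(datetime(2024, 1, 1, 9, 30)))
    print(senior_strategic_flag("CEO", 12))
```

Each helper returns plain values or dictionaries, so the results can be merged into whatever row structure the dataset already uses.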
View full tip
In ThingWorx Analytics, you have the possibility to use an external model for scoring. In this written tutorial, I would like to provide an overview of how you can use a model developed in Python, using the scikit-learn library, in ThingWorx Analytics.

The provided attachment contains an archive with the following files:
- iris_data.csv: A dataset for pattern recognition that has a categorical goal.
- TestRFToPmml.ipynb: A Jupyter notebook file with the source code for the Python model as well as the steps to export it to PMML.
- RF_Iris.pmml: The PMML file with the model that you can directly upload in Analytics without going through the steps of training the model in Python.

The tutorial assumes you already have some knowledge of ThingWorx and ThingWorx Analytics. Also, if you plan to run the Python code and train the model yourself, you need to have Jupyter notebook installed (I used the one from the Anaconda distribution).

For demonstration purposes, I have created a very simple random forest model in Python. To convert the model to PMML, I have used the sklearn2pmml library. Because ThingWorx Analytics supports PMML format 4.3, you need to install sklearn2pmml version 0.56.2 (the highest version that supports PMML 4.3). Furthermore, to use your model with the older version of sklearn2pmml, I have installed scikit-learn version 0.23.2. You will find the commands to install the two libraries in the first two cells of the notebook.
Code Walkthrough

The first step is to import the required libraries (please note that the pandas library is also required to transform the .csv to a DataFrame object):

    import pandas
    from sklearn.ensemble import RandomForestClassifier
    from sklearn2pmml import sklearn2pmml
    from sklearn.model_selection import GridSearchCV
    from sklearn2pmml.pipeline import PMMLPipeline

After importing the required libraries, we convert iris_data.csv to a pandas DataFrame and then create the features (X) as well as the goal (y) vectors:

    iris_df = pandas.read_csv("iris_data.csv")
    iris_X = iris_df[iris_df.columns.difference(["class"])]
    iris_y = iris_df["class"]

To best tune the random forest, we will use GridSearchCV and cross-validation. We want to test which parameters have the best validation metrics, and for this we will use a utility function that prints the results:

    def print_results(results):
        print('BEST PARAMS: {}\n'.format(results.best_params_))
        means = results.cv_results_['mean_test_score']
        stds = results.cv_results_['std_test_score']
        for mean, std, params in zip(means, stds, results.cv_results_['params']):
            print('{} (+/-{}) for {}'.format(round(mean, 3), round(std * 2, 3), params))

We create the random forest model and train it with different numbers of estimators and maximum depths. We then call the previous function to compare the results for the different parameters:

    rf = RandomForestClassifier()
    parameters = {
        'n_estimators': [5, 50, 250],
        'max_depth': [2, 4, 8, 16, 32, None]
    }
    cv = GridSearchCV(rf, parameters, cv=5)
    cv.fit(iris_X, iris_y)
    print_results(cv)

To convert the model to a PMML file, we need to create a PMMLPipeline object, to which we pass the RandomForestClassifier with the tuning parameters we identified in the previous step (please note that in your case, the parameters can be different from those in my example).
You can check the sklearn2pmml documentation to see other examples for creating this PMMLPipeline object:

    pipeline = PMMLPipeline([
        ("classifier", RandomForestClassifier(max_depth=4, n_estimators=5))
    ])
    pipeline.fit(iris_X, iris_y)

Then we perform the export:

    sklearn2pmml(pipeline, "RF_Iris.pmml", with_repr=True)

The model has now been exported as a PMML file in the same folder as the Jupyter Notebook file, and we can upload it to ThingWorx Analytics.

Uploading and Exploring the PMML in Analytics

To upload and use the model for scoring, there are two steps that you need to do:
- First, the PMML file needs to be uploaded to a ThingWorx File Repository.
- Then, go to your Analytics Results thing (the name should be YourAnalyticsGateway_ResultsThing) and execute the service UploadModelFromRepository. Here you will need to specify the repository name and path for your PMML file, as well as a name for your model (and optionally a description).

If everything goes well, the result of the service will be an id. Save this id to a separate file, because you will use it later on. You can verify the status of this model, and whether it is ready to use, by executing the service GetDetails.

If you want to use the PMML for scoring but were not the one who developed the model, you may not know what the expected inputs and outputs of the model are. There are two services that can help you with this:
- QueryInputFields: to verify the fields expected as input parameters for a scoring job
- QueryOutputFields: to verify the expected output of the model
The resultType input parameter can be either MODELS or CLUSTERS, depending on the type of model.

Using the PMML for Scoring

With all this information at hand, we are now ready to use this PMML for real-time scoring. In a Thing of your choice, define a service to test out the scoring for the PMML we have just uploaded. Create a new service with an infotable as the output (don’t add a datashape).
The input data for scoring will be hardcoded in the service, but you can also add it as service input parameters and pass them via a Mashup or from another source. The script will be as follows:

    // Values: INFOTABLE dataShape: ""
    let datasetRef = DataShapes["AnalyticsDatasetRef"].CreateValues();

    // Values: INFOTABLE dataShape: ""
    let data = DataShapes["IrisData"].CreateValues();

    data.AddRow({
        sepal_length: 2.7,
        sepal_width: 3.1,
        petal_length: 2.1,
        petal_width: 0.4
    });

    datasetRef.AddRow({ data: data });

    // predictiveScores: INFOTABLE dataShape: ""
    let result = Things["AnalyticsServer_PredictionThing"].RealtimeScore({
        modelUri: "results:/models/" + "97471e07-137a-41bb-9f29-f43f107bf9ca", // replace with your own id
        datasetRef: datasetRef /* INFOTABLE */,
    });

Once you execute the service, the output should match the output fields in the PMML model.

[Image: scoring service output]

As you have seen, it is easy to use a model built in Python in ThingWorx Analytics. Please note that you may use it only for scoring, and the model will not appear in Analytics Builder since you have created it on a different platform. If you have any questions about this brief written tutorial, let me know.
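Since the tutorial depends on the exported file being PMML 4.3, it can be worth checking the version before uploading. The sketch below is an assumption-laden illustration rather than part of the original notebook: the helper name pmml_version and the sample string are made up for the example, and it relies only on the fact that a PMML document has a root PMML element carrying a version attribute.

```python
import xml.etree.ElementTree as ET


def pmml_version(pmml_text):
    """Return the `version` attribute of the root <PMML> element.

    `pmml_version` is a hypothetical helper, not part of sklearn2pmml.
    """
    root = ET.fromstring(pmml_text)
    # The root tag is namespaced, e.g. "{http://www.dmg.org/PMML-4_3}PMML".
    if root.tag.rsplit("}", 1)[-1] != "PMML":
        raise ValueError("not a PMML document")
    return root.get("version")


# Minimal stand-in for an exported file (a real export contains much more).
SAMPLE = '<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3"><Header/></PMML>'

if __name__ == "__main__":
    print(pmml_version(SAMPLE))  # prints 4.3
```

For a file on disk such as RF_Iris.pmml, the same check can be done by reading the file contents and passing them to the helper, or by using ET.parse(path).getroot() and inspecting the same attribute.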
View full tip
Analytics projects typically involve using the Analytics API rather than the Analytics Builder to accomplish different tasks. The attached documentation provides examples of code snippets that can be used to automate the most common analytics tasks on a project, such as:
- Creating a dataset
- Training a model
- Real-time scoring (predictive and prescriptive)
- Retrieving the validation metrics for a model
- Appending additional data to a dataset
- Retraining the model

The documentation also provides examples that are specific to time series datasets. The attached .zip file contains both the document and some entities that you need to import in ThingWorx to access the services provided in the examples.
View full tip
This document provides API information for all 51.0 releases of ThingWorx Machine Learning.
View full tip
There have been a number of questions from customers and partners on when they should use different tools for calculation of descriptive analytics within ThingWorx applications. The platform includes two different approaches for the implementation of many common statistical calculations on data for a property: descriptive services and property transforms. Both of these tools are easy to implement and orchestrate as part of a ThingWorx application. However, these tools are targeted at different scenarios and also differ in their utilization of compute resources. When choosing between these two approaches, it is important to consider the specific use case being implemented along with how the implemented approach will fit into the overall design and architecture of the ThingWorx environment. This article provides some guidance on scenarios for each of these approaches in ThingWorx applications and things to consider with each approach.

Let's look at the two different approaches and some guidelines for when they should be used.

Descriptive services provide a set of ThingWorx services to analyze a data set and perform many common data transformations. These services are targeted at performing calculations and transformations on the recent operating history of a single property. Descriptive services are called on demand to perform batch calculations.

Scenarios to use descriptive services:
- On-demand calculations performed within a mashup, a service call, or an event to determine action, where calculation results are not (always) stored
- Regularly occurring calculations on logged property values or generated datasets (batch calculations)
- Calculations are done regularly in minutes, hours, or days on a discrete set of data. Examples: average value in last hour, median value in last day, or max value in last half hour.
- Time between data creation and analysis is minutes or hours.
  Some latency in the calculation result is acceptable for the use case.
- Input data set has 10s to 100s to 1000s of values. Keep the size of the input data at 10,800 values or less. If larger data sizes are required, then break them into micro batches if possible or use other tools to handle the processing.
- Multiple calculations need to be done from the same set of input data. Examples: average value in last hour, max value in the last hour, and standard deviation value in the last hour are all required to be calculated.

Things to consider when using descriptive services:
- Requires the input dataset to be in the specific datashape format that is used by descriptive services. If property values are logged in a value stream, there is a service to query the values and prepare the dataset for processing. In scenarios where the data is not from a logged property, another service or SQL query can be used to prepare the dataset for processing.
- Requires JavaScript development work to implement. This includes creation of a service to execute the descriptive services and usage of subscriptions and events to orchestrate calculations. An example of the JavaScript to execute descriptive services is available in the help center.
- Typically, retrieval of the input data from the value stream (QueryTimedValuesForProperty) is the slowest part of the process.
- The input data is sent to an out-of-process platform analytics service for all calculations.
- Broader set of calculation services available (see table at the end of this article)
- Remember that these services are not meant to be used for big data calculations or big data preparation. Look for other approaches if the input data sets grow larger than 10,800 values.

Property Transforms provide a set of transformation services for streaming data as it enters ThingWorx.
Property transforms are targeted at performing continuous calculations on recent values in the stream of a single property and delivering results in (near) real time. Since property transforms are continuous calculations, they are always running and using compute resources. Before implementing property transforms, review the information in the property transform sizing guide to better understand factors that impact the scaling of property transforms.

Scenarios to use:
- Continuous calculations on a stream for a single property as new data comes into ThingWorx
- New values enter the stream faster than one value per minute (as a general guideline)
- Calculations required to be done in seconds or minutes. Examples: average electrical current in last 10 seconds, median pressure in the last 10 readings, or max torque in last minute
- Time between data creation and analysis is small (in seconds). Results of the property transform are required for rapid decisions and action, so reducing latency is critical
- Data sets used for calculation are small and contain 10s to 100s of values
- Calculated results are stored in a new property in the ThingModel

Things to consider when using property transforms:
- Codeless process to create new property transforms on a single property in the ThingModel
- Does not require input property values to be logged, as calculations are performed on streaming data as it enters ThingWorx
- Unlike descriptive services, which only execute when called, each property transform creates a continuously running job that will always be using compute resources. Resource allocations for property transforms must be included in the overall system architecture. Before selecting the property transform approach, refer to the Property Transform Sizing Guide for more information about how different parameters affect the performance of Property Transforms and results of performance load test scenarios.
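To make the "continuous calculation on recent values" idea concrete, here is a minimal Python sketch of a rolling-window calculation. It is only a conceptual illustration of the streaming behaviour described above; the class name RollingTransform is hypothetical, and this is not how property transforms are implemented or configured inside ThingWorx.

```python
from collections import deque
from statistics import median


class RollingTransform:
    """Keep the last `lookback` values and recompute a statistic on every new value."""

    def __init__(self, lookback, func):
        # deque(maxlen=...) silently drops the oldest value once the window is full.
        self.window = deque(maxlen=lookback)
        self.func = func

    def push(self, value):
        """Add one incoming reading and return the statistic over the current window."""
        self.window.append(value)
        return self.func(self.window)


if __name__ == "__main__":
    # Example: max torque over the last 3 readings (a small lookback keeps the demo short).
    max_torque = RollingTransform(lookback=3, func=max)
    for reading in [1, 5, 2, 0, 0, 0]:
        print(reading, "->", max_torque.push(reading))

    # Example: median pressure over the last 3 readings.
    median_pressure = RollingTransform(lookback=3, func=median)
    print([median_pressure.push(v) for v in [10, 12, 11, 30]])
```

Note that the window lives in memory for as long as the transform runs, which mirrors the point above that a continuously running calculation always holds some compute resources.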
Let’s apply these guidelines to a few different use cases to determine which approach to select.

1. Mashup application that allows users to calculate and view median temperature over a selected time window

In this scenario, the calculation will be executed on demand with a user-defined time window. Descriptive services are the only option here, since there is no predefined schedule and the user can select which data to use for the calculation.

2. Calculate the max torque (readings arriving one per second) on a press over each minute without storing all of the individual readings

In this scenario, the calculation will be executed without storing the individual readings coming from the machine. The transformation is applied to the data on its way into ThingWorx and is continuously recalculated based on new values. Property transforms are the only option here, since the individual values are not being stored.

3. Calculation of average pressure value (readings arriving one per second) over a five-minute window to monitor conditions and raise an alert when the average value is more than two standard deviations from expected

In this scenario, both descriptive services and property transforms can perform the calculation required. The calculation is going to occur every 5 minutes and each data set will have about 300 values. The selection of batch (descriptive services) or streaming (property transforms) will likely be determined by the usage of the result. In this case, the calculation result will be used to raise an alert for a specific five-minute window, which will likely require immediate action. Since the alert needs to be raised as soon as possible, property transforms are the best option (although descriptive services will also handle this case, with lower compute resource requirements).

4. Calculation of median temperature (readings each 20 seconds) over a 48-hour period to use as input to predict error conditions in the next week
In this scenario, the calculation will be performed relatively infrequently (once every 48 hours) over a larger data set (about 8,640 values). Descriptive services are the best option in this case due to the data size and calculation frequency. If property transforms were used, then compute resources would be tied up holding all of the incoming values in memory for an extended period before performing a calculation. With descriptive services, the calculation will only consume resources when needed, that is, once every 48 hours.

Hopefully the information above provides some more insight and guidelines to help choose between property transforms and descriptive services. The table below provides some additional comparisons between the two approaches.

| | Descriptive Services | Property Transforms |
|---|---|---|
| Purpose | Provide a set of ThingWorx services to analyze a data set and perform many common data transformations. | Provide a set of prescribed transformation services for streaming data as it enters ThingWorx. |
| Processing Mode | Batch | Streaming / Continuous |
| Delivery | API / Service | Composer interface; API / Service |
| Input Data | Discrete data set; must be logged; single property; configurable by time or lookback | Rolling data set on property X; persistence is optional; single property; configurable by time or lookback |
| Output Data | Return object handled programmatically; single output for discrete data set | New property f_X in the input model; continuous output at configurable frequency; output time aligned with input data |
| Available Services | Statistics (min, max, mean, median, mode, std deviation); SPC calculations (# continuous data points: above threshold, in / out of range, increasing / decreasing, alternating); data distribution: count by bins (histogram); five numbers (min, lower quartile, median, upper quartile, max); confidence interval; sampling frequency; frequency transform (FFT) | Statistics (min, max, mean, median, mode, std deviation); SPC calculations (# continuous data points: above threshold, in / out of range, increasing / decreasing, alternating) |
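The data-set sizes quoted in the use cases (about 300 and about 8,640 values) and the 10,800-value guideline can be checked with a little arithmetic. The Python sketch below does that and also shows one way the "micro batch" suggestion might look for statistics that can be combined across batches (max and mean). All function names are hypothetical, and the code is a plain-Python illustration, not a call into the descriptive services themselves.

```python
MAX_VALUES = 10_800  # input-size guideline quoted in the article


def values_in_window(window_seconds, seconds_between_readings):
    """Number of readings that arrive in a time window at a fixed sampling interval."""
    return window_seconds // seconds_between_readings


def micro_batches(values, size=MAX_VALUES):
    """Yield consecutive slices of at most `size` values."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def batched_max_and_mean(values, size=MAX_VALUES):
    """Compute max and mean by combining per-batch results.

    This works because max and sum/count combine across batches;
    a median cannot be combined this way.
    """
    if not values:
        raise ValueError("values must not be empty")
    total, count, peak = 0.0, 0, float("-inf")
    for batch in micro_batches(values, size):
        total += sum(batch)
        count += len(batch)
        peak = max(peak, max(batch))
    return peak, total / count


if __name__ == "__main__":
    print(values_in_window(5 * 60, 1))       # use case 3: one reading per second for 5 minutes
    print(values_in_window(48 * 3600, 20))   # use case 4: one reading per 20 seconds for 48 hours
    print(batched_max_and_mean(list(range(25_000))))
```

With these numbers, both use cases stay under the guideline, while the 25,000-value example is processed in three batches.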
View full tip
Time series prediction uses a model to predict future values based on previously observed values. Time series data differs somewhat from non-time series data in both the formatting of the data and the training of predictive models. This article highlights several important considerations when dealing with time series data.

Preparing Time Series Data:

The data must contain exactly one field with Op Type “TEMPORAL” and one field with Op Type “ENTITY_ID”, which defines the identifier for an entity, such as a machine serial number. The ENTITY_ID field should remain the same as long as there are no missing timestamps and it is within the same asset, but should be different for different assets or asset runs in order to accurately assign history during model training and scoring.

The TEMPORAL field is a numeric field indicating the order of the data rows for a specific entity. One should also ensure that data is formatted such that the timestamps are equally spaced (for example, one data point every minute) and that no gaps exist in the sequence of numbers.

If there are gaps in the time series data, it is recommended to restart the series after the gap as a new entity. Alternatively, if the gap is small enough (a few data points), linear interpolation based on the gap endpoint values within the same entity is generally acceptable.

Model Creation in Time Series:

When creating a time series model in Analytics Builder, you will be asked to specify a lookback size and lookahead parameter. The lookback size determines how many historical data points (including the current row) will be used in the model. The lookahead indicates how many time steps ahead to predict. If the value of the goal variable is not known at time of scoring, unchecking Use Goal History will use the goal column during training but not its history during scoring.

Time series models can also be created in Services using the Training Thing.
The lookback size and lookahead parameter are specified in the CreateJob service. The virtualSensor field is used to indicate whether the model should be trained to predict values for a field that will not be available during scoring. For example, one can train a time series model to predict Volume using evolving Temperature and Pressure, based on sensor data for these three variables over a period of time. However, the Volume sensor may be removed from further assets in order to reduce costs, and the predictive model can be used instead.

Two important considerations:

ThingWorx Analytics will expand historical data in the time series into new columns. This process creates new features using the values of the previous time steps. Additionally, low-order derivatives, together with average and standard deviation features, are computed over small contiguous subgroups of the historical data.

The expansion process can make the dataset exceptionally wide, so time series training is generally significantly slower than training with no history on the same dataset. This is exacerbated when lookback size = 0 (auto-windowing, a process where the system tries to find the optimal lookback). If there are columns that do not change or change infrequently (such as a device serial number or the zip code of the device’s location), these should be marked as Static when importing the data. Any columns labeled Static will not be expanded to create new features. Care also needs to be taken to exclude any features that are known not to be relevant to the prediction.

Using a large lookback can reduce how many examples / entities the model has available to train on. For example, if a lookback of 8 is used, then any entities that have fewer than 8 examples will not be used in training. For the same reason, scoring for time series produces fewer results than the number of rows provided as input: if 10 rows are provided and lookback is 6, then only 5 predictions will be produced.
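The row counting in the last paragraph, and the advice to restart a series as a new entity after a gap, can be illustrated with a small Python sketch. It is a simplified model under stated assumptions (integer TEMPORAL values with a fixed step, lookahead ignored, hypothetical function names) and is not the expansion logic that ThingWorx Analytics actually runs.

```python
def split_on_gaps(timestamps, step=1):
    """Split sorted numeric TEMPORAL values into gap-free runs.

    Each run could then be given its own ENTITY_ID, as the article recommends.
    """
    runs = []
    for t in timestamps:
        if runs and t - runs[-1][-1] == step:
            runs[-1].append(t)   # continues the current run
        else:
            runs.append([t])     # first value, or a gap: start a new run
    return runs


def lookback_windows(rows, lookback):
    """Return one window per row that has a full history of `lookback` rows.

    The window includes the current row, matching the article's definition.
    """
    return [rows[i - lookback + 1:i + 1] for i in range(lookback - 1, len(rows))]


if __name__ == "__main__":
    # 10 input rows with a lookback of 6 leave 10 - 6 + 1 = 5 scorable rows.
    print(len(lookback_windows(list(range(10)), lookback=6)))

    # An entity with fewer rows than the lookback contributes nothing.
    print(lookback_windows(list(range(7)), lookback=8))

    # A gap between 3 and 7 splits the series into two runs.
    print(split_on_gaps([1, 2, 3, 7, 8]))
```

The general pattern is that a run of n rows yields n - lookback + 1 windows when n is at least the lookback, and none otherwise, which is why splitting on gaps and choosing a large lookback both reduce the usable data.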
View full tip
Design and Implement Data Models to Enable Predictive Analytics Learning Path

Design and implement your data model, create logic, and operationalize an analytics model.

NOTE: Complete the following guides in sequential order. The estimated time to complete this learning path is 390 minutes.

- Data Model Introduction
- Design Your Data Model: Part 1, Part 2, Part 3
- Data Model Implementation: Part 1, Part 2, Part 3
- Create Custom Business Logic
- Implement Services, Events, and Subscriptions: Part 1, Part 2
- Build a Predictive Analytics Model: Part 1, Part 2
- Operationalize an Analytics Model: Part 1, Part 2
View full tip
Getting Started on the ThingWorx Platform Learning Path

Learn hands-on how ThingWorx simplifies the end-to-end process of implementing IoT solutions.

NOTE: Complete the following guides in sequential order. The estimated time to complete this learning path is 210 minutes.

- Get Started with ThingWorx for IoT: Part 1, Part 2, Part 3, Part 4, Part 5
- Data Model Introduction
- Configure Permissions: Part 1, Part 2
- Build a Predictive Analytics Model: Part 1, Part 2
View full tip
Build a Predictive Analytics Model Guide Part 2

Step 5: Profiles

The Profiles section of ThingWorx Analytics looks for combinations of data which are highly correlated with your desired goal.

1. On the left, click ANALYTICS BUILDER > Profiles.
2. Click New.... The New Profile pop-up will open.
   NOTE: Notice the Text Data Only section, which is new in ThingWorx 9.3.
3. In the Profile Name field, enter vibration_profile.
4. In the Dataset field, select vibration_dataset.
5. Leave the Goal field set to the default of low_grease.
6. Leave the Filter field set to the default of all_data.
7. Leave the Excluded Fields from Profile field set to the default of empty.
8. Click Submit.
9. After ~30 seconds, the Signal State will change to COMPLETED. The results will be displayed at the bottom.

The results show several Profiles (combinations of data) that appear to be statistically significant. Only the first few Profiles, however, have a significant percentage of the total number of records. The later Profiles can largely be ignored. Of those first Profiles, Frequency Bands from both Sensor 1 and Sensor 2 appear. But in combination with the result from Signals (where Sensor 1 was always more important), this could possibly indicate that Sensor 1 is still the most important overall. In other words, since Sensor 1 is statistically significant both by itself and in combination (but Sensor 2 is only significant in combination with Sensor 1), Sensor 2 may not be necessary.

Step 6: Create Model

Models are primarily used by Analytics Manager (which is beyond the scope of this guide), but they can still be used to measure the accuracy of predictions. When Models are calculated, they inherently withhold a certain amount of data. The prediction model is then run against the withheld data.
This provides a form of "accuracy measure", which we'll use to determine whether Sensor 2 is necessary to the detection of a low grease condition by creating two different Models. The first Model (which you will create below) will contain all the data, while the second Model (in the next step) will exclude Sensor 2.

1. On the left, click ANALYTICS BUILDER > Models.
2. Click New…. The New Predictive Model pop-up will open.
3. In the Model Name field, enter vibration_model.
4. In the Dataset field, select vibration_dataset.
5. Leave the Goal field set to the default of low_grease.
6. Leave the Filter field set to the default of all_data.
7. Leave the Excluded Fields from Model section at its default of empty.
8. Click Submit.
9. After ~60 seconds, the Model Status will change to COMPLETED.

View Model

Now that the prediction model is COMPLETED, you can view the results. Select the model that was created in the previous step, i.e. vibration_model. Click View… to open the Model Information page.

Review the visualization of the validation results. Note that your results may differ slightly from the picture, as the automatically withheld "test" portion of the dataset is randomly chosen. Click on the ? icon to the right of the chart for details on the information displayed.

The desired outcome is for the model to have a relatively high level of accuracy. The True Positive Rate shown on the Receiver Operating Characteristic (ROC) chart is much higher than the False Positive Rate. The curve is relatively high and to the left, which indicates a high accuracy level. You may also click on the Confusion Matrix tab in the top-left, which will show you the number of True Positives and True Negatives in comparison to False Positives and False Negatives.

Note that the number of correct predictions is much higher than the number of incorrect predictions.
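For readers who want to see how the quantities on the ROC chart and Confusion Matrix tab relate to each other, the following Python sketch computes the standard rates from the four confusion-matrix counts. The function name and the example counts are hypothetical and chosen only for illustration; they are not the results of the vibration_model in this guide.

```python
def classification_rates(tp, fp, tn, fn):
    """Derive standard rates from confusion-matrix counts.

    tp/fn: actual positives predicted correctly/incorrectly
    tn/fp: actual negatives predicted correctly/incorrectly
    """
    return {
        "true_positive_rate": tp / (tp + fn),    # share of real positives that were caught
        "false_positive_rate": fp / (fp + tn),   # share of real negatives flagged by mistake
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Made-up counts for a withheld test set of 100 records.
    rates = classification_rates(tp=40, fp=5, tn=50, fn=5)
    for name, value in rates.items():
        print(f"{name}: {value:.3f}")
```

A model whose true positive rate is high while its false positive rate stays low corresponds to a ROC curve that sits high and to the left, as described above.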
These validation results indicate that our Sensors have a relatively good chance of predicting an impending failure by detecting low grease conditions before they cause catastrophic engine failure.

Step 7: Refine Model

We will now compare this first Model, which includes both Sensors, to a simpler Model using only Sensor 1. We do this because we suspect that Sensor 2 may not be necessary to achieve our goal.

1. On the left, click ANALYTICS BUILDER > Models.
2. Click New...
3. In the Model Name field, enter vibration_model_s1_only.
4. In the Dataset field, select vibration_dataset.
5. Leave the Goal field set to the default of low_grease.
6. Leave the Filter field set to the default of all_data.
7. On the right beside Excluded Fields from Model, click the Excluded Fields button. The Fields To Be Excluded From Job pop-up will open.
8. Click s2_fb1 to select the first Sensor 2 Frequency Band.
9. Select the rest of the Frequency Bands through s2_fb5 to choose all of the Sensor 2 frequencies.
10. While all the s2 values are selected, click the green "right arrow", i.e. the > button in the middle.
11. At the bottom-left, click Save. The Fields To Be Excluded From Job pop-up will close.
12. Click Submit.
13. After ~60 seconds, the Model State will change to COMPLETED.
14. With vibration_model_s1_only selected, click View...

The ROC chart is comparable to that of the original model (which included Sensor 2). Likewise, the Confusion Matrix (on the other tab) indicates a good ratio of correct to incorrect predictions.

NOTE: Your own final scores may vary slightly from these Models, because the split between data used for prediction and data used for evaluation is random.

The ThingWorx Analytics Models have indicated that you are likely to get roughly the same accuracy in predicting a low-grease condition whether you use one sensor or two!
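The two ideas used in this step, excluding fields from a job and randomly withholding part of the dataset for evaluation, can be sketched in plain Python. This is only an illustration under stated assumptions, not how ThingWorx Analytics implements them: the helper names, the `s1_fb1` field name, the 20% test fraction, and the sample values are all hypothetical. Only `s2_fb1` through `s2_fb5` and `low_grease` come from the guide.

```python
import random

# Sensor 2 Frequency Bands named in the guide: s2_fb1 .. s2_fb5.
S2_FIELDS = [f"s2_fb{i}" for i in range(1, 6)]


def exclude_fields(records, excluded):
    """Return copies of the records without the excluded field names."""
    excluded = set(excluded)
    return [{k: v for k, v in r.items() if k not in excluded} for r in records]


def holdout_split(records, test_fraction=0.2, seed=None):
    """Randomly withhold a fraction of the records for evaluation."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records; real data would come from vibration_dataset.
records = [
    {"s1_fb1": 0.1 * i, "s2_fb1": 0.2 * i, "s2_fb5": 0.3 * i,
     "low_grease": i % 2 == 0}
    for i in range(10)
]

reduced = exclude_fields(records, S2_FIELDS)
train, test = holdout_split(reduced, test_fraction=0.2, seed=42)
print(sorted(reduced[0]))
print(len(train), len(test))
```

Because the split is random, running it with a different seed withholds different records, which is why two otherwise identical Models can report slightly different scores.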
If we can get an accurate early warning of the low grease condition with just one sensor, it then becomes a business decision as to whether the extra cost of Sensor 2 is necessary.

Step 8: Next Steps

Congratulations! You've successfully completed the Build a Predictive Analytics Model guide, and learned how to load an IoT dataset, generate machine learning predictions, and evaluate the analytics output to gain insight.

This is the last guide in the Getting Started on the ThingWorx Platform learning path.

This is the last guide in the Monitor Factory Supplies and Consumables learning path.

The next guide in the Design and Implement Data Models to Enable Predictive Analytics learning path is Operationalize an Analytics Model.

Additional Resources

If you have questions, issues, or need additional information, refer to:

Resource: Support
Link: Analytics Builder Help Center
View full tip
This video is Module 10: ThingWorx Foundation & Analytics Integration of the ThingWorx Analytics Training videos. It gives a brief review of core ThingWorx Platform functionality, and how the Analytics server works on top of the platform. It also describes the process of creating a simple application, complete with a mashup to display the information from a predictive model.
View full tip
This video continues Module 9: Anomaly Detection of the ThingWorx Analytics Training videos. It begins with a ThingWatcher exercise, and concludes by describing Statistical Process Control (SPC). The "SPC Accelerator" will be covered in Module 9 Part 3.
View full tip
This video begins Module 9: Anomaly Detection of the ThingWorx Analytics Training videos. It describes how ThingWatcher can be set up to monitor values streaming from connected assets, and send an alert if an asset's behavior deviates from its 'normal' behavior.
View full tip
This video concludes Module 8: Time Series Modeling of the ThingWorx Analytics Training videos. 
View full tip
This video continues Module 8: Time Series Modeling of the ThingWorx Analytics Training videos. It continues to show how ThingWorx Analytics automatically transforms time series datasets into ones that are ready for machine learning. It also describes the concept of virtual sensors. It finishes by describing the time series dataset that will be used in the following modules.
View full tip
This video begins Module 8: Time Series Modeling of the ThingWorx Analytics Training videos. It describes the differences between time series and cross-sectional datasets. It begins to show how ThingWorx Analytics automatically transforms time series datasets into ones that are ready for machine learning. 
View full tip