IoT Tips

Jan 20, 2018

Nov 28, 2016

The accuracy of a predictive model can be boosted in two ways: either by investing in feature engineering or by applying boosting algorithms straight away. There are multiple boosting algorithms, such as Gradient Boosting, XGBoost, AdaBoost, and Gentle Boost. Each algorithm has its own underlying mathematics, and they behave slightly differently when applied.

When working with boosting algorithms, two terms come up frequently: bagging and boosting.

Bagging: An approach where you take random samples of the data, build a learner on each sample, and take a simple mean of their outputs to obtain the bagged probabilities.

Boosting: Similar to bagging, but the samples are selected more intelligently. In each subsequent round, more and more weight is given to hard-to-classify observations.

Below are the default algorithms used in predictive models generated in ThingWorx Analytics:

- Decision Tree
- Gradient Boost
- Linear Regression
- Neural Net
- Random Forest
- Logistic Regression

Gradient boosting is a machine learning technique for regression and classification problems that produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion, as other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function.

Let’s begin with an easy example. Assume you are given a previous model M to improve on. Currently the model has an accuracy of 80% (by any metric). How do you go further? One way is to build an entirely different model using a new set of input variables and to try better ensemble learners. We suggest a much simpler way instead. It goes like this:

Y = M(x) + error

What if the error is not white noise but has some correlation with the outcome (Y)? Then we can develop a model on this error term, like:

error = G(x) + error2

The accuracy will probably improve to a higher number, say 84%. Let’s take another step and regress against error2:

error2 = H(x) + error3

Now we combine all of these together:

Y = M(x) + G(x) + H(x) + error3

This will probably have an accuracy of even more than 84%. What if we can also find optimal weights for each of the three learners?

Y = alpha * M(x) + beta * G(x) + gamma * H(x) + error4

How Gradient Boosting Works:

1. Loss Function: The loss function used depends on the type of problem being solved. It must be differentiable, but many standard loss functions are supported and you can define your own. A benefit of the gradient boosting framework is that a new boosting algorithm does not have to be derived for each loss function you may want to use. Instead, the framework is generic enough that any differentiable loss function can be used.

2. Weak Learner: Decision trees are used as the weak learner in gradient boosting. Specifically, regression trees are used, which output real values for splits and whose outputs can be added together. This allows the outputs of subsequent models to be added to “correct” the residuals in the predictions. Trees are constructed in a greedy manner, choosing the best split points based on purity scores like Gini or to minimize the loss.

3. Additive Model: Trees are added one at a time, and existing trees in the model are not changed. A gradient descent procedure is used to minimize the loss when adding trees. Here, instead of a set of parameters, the quantities being updated are weak learner sub-models, or more specifically decision trees. After calculating the loss, the gradient descent step consists of adding a tree to the model that reduces the loss.

Improvements to Basic Gradient Boosting:

1. Tree Constraints: It is important that the weak learners have skill but remain weak. Below are some constraints that can be imposed on the construction of decision trees:

- Number of trees: Gradient boosting is generally very slow to overfit as trees are added, so the advice is to keep adding trees until no further improvement is observed.
- Tree depth: Deeper trees are more complex, and shorter trees are preferred. Generally, better results are seen with 4-8 levels.
- Number of nodes or number of leaves: Like depth, this can constrain the size of the tree, but the tree is not constrained to a symmetrical structure if other constraints are used.
- Number of observations per split: Imposes a minimum on the amount of training data at a node before a split can be considered.
- Minimum improvement to loss: A constraint on the improvement that any split added to a tree must achieve.

2. Weighted Updates: The contribution of each tree to the sum can be weighted to slow down the learning of the algorithm. This weighting is called shrinkage or a learning rate: “Each update is simply scaled by the value of the ‘learning rate parameter’ v.”

3. Stochastic Gradient Boosting: At each iteration, a subsample of the training data is drawn at random (without replacement) from the full training data set. The randomly selected subsample is then used, instead of the full sample, to fit the base learner.

4. Penalized Gradient Boosting: A regularization term added to the objective helps to smooth the final learnt weights and avoid overfitting. Intuitively, the regularized objective will tend to select a model that employs simple and predictive functions.
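The residual-fitting idea above (Y = M(x) + G(x) + H(x) + ...) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used by ThingWorx Analytics: it assumes squared-error loss (for which the negative gradient is simply the residual), a single numeric feature with at least two distinct values, and depth-1 regression trees (stumps) as weak learners. The function names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: pick the threshold that minimizes squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Additive model: start from a constant, then add one stump at a time."""
    base = sum(ys) / len(ys)  # initial model M(x): the mean of the targets
    trees = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)  # existing trees are never modified
        # Weighted update: each tree's contribution is shrunk by the learning rate.
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (hypothetical): a nonlinear target that a single stump cannot fit.
xs = [float(i) for i in range(1, 11)]
ys = [x * x for x in xs]
model = gradient_boost(xs, ys)
print("training MSE:", mse(model, xs, ys))
```

On the training data, each added stump can only lower the squared error, which is why the number of trees and the learning rate are usually tuned together against a held-out set rather than the training error.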

Sep 28, 2017

Jun 13, 2017

Oct 10, 2016

Jun 13, 2017

ThingWorx Analytics can be assembled on multiple operating systems. In this post, we discuss common issues that other users have encountered.

Permission Denied – Read/Write access to Third Party Components

This is encountered when executing the desired shell script to begin the creation process. On macOS and Linux you may encounter a “Permission Denied” error on the two components required for the creation: the packer-post-processor-vhd and packer components.

Error Message

This results in a Terminal message that reads “Process Completed, No Artifacts Created”. It indicates that the Packer script failed to complete the task and the desired appliance images were not created. To correct this issue, change the permissions of the packer-post-processor-vhd and packer components so that they are readable and executable by the user account that is attempting to create the appliance.

Solution

Run the following commands in the Virtual Machine terminal (you may need to run them with sudo or as root):

    chmod +x packer-post-processor-vhd
    chmod +x packer

After running the above commands, run the shell script for the desired VM Appliance output. This should resolve the “Permission Denied” issue when executing the build scripts.

Error Starting Appliance in VirtualBox

Users have experienced this issue at the first run of the Appliance, right after it has been assembled. This issue is unique to VirtualBox versions 5.0 and above.

Error Message – Dialog Box

If you encounter an error dialog when starting the Appliance, check the settings of the imported OVA for any errors. This issue is the result of invalid settings in the Appliance configuration. Check for invalid settings by navigating to the Settings menu for the Appliance. The “Invalid settings detected” message indicates that, when the product was assembled, some configuration settings were not applied correctly by the creation tool scripts.

Solution

Hover your mouse over the “Invalid settings detected” message and it will point you to the cause; in this case it is the remote display setup. Change the settings in Display (Remote Display tab) by unchecking the “Enable Server” option. Press OK after unchecking it, and start the Appliance.
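For build automation, the permission fix described above can also be scripted. The following is a minimal sketch, assuming a POSIX system and that the two Packer components sit in the current working directory; the helper names are hypothetical. Note that it sets the execute bit for user, group, and others, which is close to, but not exactly, what chmod +x does (the shell command also honors the umask).

```python
import os
import stat


def missing_exec_permission(paths):
    """Return the paths that the current user is not allowed to execute."""
    return [p for p in paths if not os.access(p, os.X_OK)]


def make_executable(path):
    """Add the execute bit for user, group, and others, keeping all other bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    # Component names taken from the post; files that are not present are skipped.
    for name in ("packer-post-processor-vhd", "packer"):
        if os.path.exists(name) and missing_exec_permission([name]):
            make_executable(name)
            print("added execute permission:", name)
```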

Sep 29, 2016

Jul 4, 2018

Jun 29, 2017

Users of ThingWorx Analytics (TWA) may choose to create a predictive model using TWA or to import a predictive model that was created using other software. When importing into or exporting out of TWA, the predictive model must be in PMML (Predictive Model Markup Language) format, version 4.3 or later. This post describes how to complete the import and export processes.

Exporting:

The user may create a model in two main ways inside of TWA: using the Builder user interface, or using the ‘Create Job’ service that exists on the Training Thing. Whichever method is used, TWA automatically creates a model Job Id for that model. This Job Id identifies the model inside of TWA, regardless of what is being done with that model.

If a model is trained using Builder, the user may highlight that model, click ‘Job Details’, and then copy the Job Id.

Next, the user will navigate to Browse --> Things --> …TrainingThing. This is the Training Microservice inside of TWA, where all the functionality involved with training a model exists. Within the …TrainingThing, the user will execute the ‘RetrieveModel’ service under Services. When executing the service, the user will paste the model Job Id (ex. 49704f1a-7fcd-4e38-ab53-84ef46517d0a) copied earlier, and press ‘Execute’. The resulting text can then be highlighted, copied to Notepad or some other text editor, and saved with a .pmml extension (ex. ‘ModelExport.pmml’).

Importing Through Results Microservice:

To import a model that has been saved in PMML 4.3+ format into TWA using the Results Microservice, the user will navigate to Manage --> Repositories (ex. AnalyticsUploadStorage) --> Actions --> Upload, and choose the PMML file. The user will then navigate to Browse --> Things --> …ResultsThing. This is the Results Microservice inside of TWA, where all the functionality related to previously trained models exists.

Within the …ResultsThing, the user will execute the ‘UploadModel’ service under Services. Alternatively, the user can upload the model from any repository using the ‘UploadModelFromRepository’ service. To create a model from the uploaded PMML inside of TWA, the user will fill out the filePath and name, then execute the service.

Note: This model will not show up in Builder, as that would require model validation information that is not part of the imported PMML file.

The resulting Job Id can be used to make predictions, such as by using the …PredictionThing’s BatchScore or RealtimeScore services. At this point, the uploaded model behaves as if it had been created inside of that TWA environment.

Importing Through Analytics Manager:

To import a model that has been saved in PMML 4.3+ format into TWA using the Analytics Manager, the user will navigate to Analytics --> Analytics Manager --> Analysis Models, and click the green “New” button. Next, the user will choose the provider name (or create a new one by navigating to Analytics --> Analytics Manager --> Analysis Providers). The user will also check the “Upload Model” box, and click the grey “Choose File” button to find the PMML file. Finally, the user will click the black “Upload” button, then the green “Save” button.

At this point, the model is uploaded into ThingWorx Analytics, and the user may progress through the subsequent steps to set up the “Analysis Events” and “Analysis Jobs” that will be powered by the imported model.
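Because the import only accepts PMML 4.3 or later, it can be useful to check a file before uploading it. The sketch below is an assumption-laden illustration, not part of TWA: it relies on the fact that a PMML document declares its version in the version attribute of the root PMML element, and the function names are hypothetical. It checks the declared version only and does not validate the model itself.

```python
import xml.etree.ElementTree as ET


def pmml_version(source):
    """Return the version declared on the PMML root element as a tuple of ints.

    `source` may be a file path or an open file object.
    """
    root = ET.parse(source).getroot()
    # The root tag is usually namespaced, e.g. "{http://www.dmg.org/PMML-4_3}PMML".
    if root.tag.rsplit("}", 1)[-1] != "PMML":
        raise ValueError("not a PMML document: root element is %r" % root.tag)
    version = root.get("version")
    if version is None:
        raise ValueError("PMML root element has no version attribute")
    return tuple(int(part) for part in version.split("."))


def is_supported(source, minimum=(4, 3)):
    """True if the declared PMML version is at least `minimum`."""
    return pmml_version(source) >= minimum


if __name__ == "__main__":
    import sys
    # Usage: python check_pmml.py ModelExport.pmml
    for path in sys.argv[1:]:
        print(path, "->", "ok" if is_supported(path) else "version too old")
```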

Oct 4, 2019

Jan 25, 2019

Video Author: Asia Garrouj
Original Post Date: June 13, 2017
Applicable Releases: ThingWorx Analytics 8.0
Description: This video is the second of a three-part series walking you through how to set up ThingWatcher for Anomaly Detection. In this second video you will learn how to use the "Discover UI" from the NextGen Composer to bind simulated data coming through KEPServer for Anomaly Detection.

Jan 20, 2018

Video Author: Christophe Morfin
Original Post Date: October 2, 2017
Applicable Releases: ThingWorx Analytics 8.1
Description: In this video we walk through the installation steps for ThingWorx Analytics Server 8.1. It covers the native Linux installation, although the steps are similar for a Docker installation on Windows or Linux.

Jan 21, 2018

Jun 13, 2017

Preface

In this blog post, we discuss how to start and stop ThingWorx Analytics, as well as some other useful triaging and troubleshooting commands. This applies to all flavors of the native Linux installation of the application.

To perform these steps, you need sudo or root access on the host machine, because you will have to execute a shell script and view its output. The examples were taken on a virtual CentOS 7 server with a GUI, as the root user.

Checking ThingWorx Analytics Server Application Status

1. Change directory to the installation destination of the ThingWorx Analytics (TWA) application. In this example, the application is installed in the /opt/ThingWorxAnalyticsServer directory.
2. The install directory contains a series of folders and files. You can use the ls command to list them.
   a. Navigate one more level down into the ./ThingWorxAnalyticsServer/bin directory by using the command cd ./bin
   b. Use the pwd command to verify that you are in the correct directory.
3. In the ./ThingWorxAnalyticsServer/bin directory, there should be three shell files: configure-apirouter.sh, configure-user.sh, and twas.sh
   a. To run a status check of the application, use the command ./twas.sh status
      i. This prints a list of outputs and a few warning messages. This is normal.
   b. Each service in the output is shown as either active (running), in green, or not active (stopped), in red.
      i. List of services:
         twas-results-ms.service - ThingWorx Analytics - Results Microservice
         twas-data-ms.service - ThingWorx Analytics - Data Microservice
         twas-analytics-ms.service - ThingWorx Analytics - Analytics Microservice
         twas-profiling-ms.service - ThingWorx Analytics - Profiling Microservice
         twas-clustering-ms.service - ThingWorx Analytics - Clustering Microservice
         twas-prediction-ms.service - ThingWorx Analytics - Prediction Microservice
         twas-training-ms.service - ThingWorx Analytics - Training Microservice
         twas-validation-ms.service - ThingWorx Analytics - Validation Microservice
         twas-apirouter.service - ThingWorx Analytics - API Router
         twas-edge-ms.service - ThingWorx Analytics - Edge Microservice

Starting and Stopping ThingWorx Analytics

If the status check shows any errors or stopped services, a good solution is to restart the TWA Server application. There are two ways to do this: use the restart command, or use the stop and start commands.

Method 1 - Restart Command:

1. In the same ./ThingWorxAnalyticsServer/bin directory, run the following command: ./twas.sh restart
2. The restart should only take a few seconds to complete.

Method 2 - Stop / Start Commands:

1. In the same ./ThingWorxAnalyticsServer/bin directory, run the following command: ./twas.sh stop
2. After the application stops, run the following command: ./twas.sh start

Note: You can confirm the status of the TWA Server application by following the steps in the "Checking ThingWorx Analytics Server Application Status" section above.
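If these checks need to run from a monitoring or deployment script, the twas.sh calls above can be wrapped as follows. This is a hedged sketch: the four actions and the install path come from the post, while the helper names are hypothetical and nothing is assumed about the script's exit codes or output format. The caller still needs sudo or root privileges.

```python
import os
import subprocess

# Actions mentioned in the post for twas.sh.
VALID_ACTIONS = ("status", "start", "stop", "restart")


def twas_command(install_dir, action):
    """Build the argument list for `<install_dir>/bin/twas.sh <action>`."""
    if action not in VALID_ACTIONS:
        raise ValueError("unsupported action: %r" % action)
    return [os.path.join(install_dir, "bin", "twas.sh"), action]


def run_twas(install_dir, action):
    """Run twas.sh from its bin directory; return (exit code, combined output)."""
    command = twas_command(install_dir, action)
    result = subprocess.run(
        command,
        cwd=os.path.dirname(command[0]),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # warnings are printed alongside normal output
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    install_dir = "/opt/ThingWorxAnalyticsServer"  # install location used in the post
    if os.path.exists(twas_command(install_dir, "status")[0]):
        code, output = run_twas(install_dir, "status")
        print(output)
```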

Dec 5, 2017

In this video we cover the configuration steps required for the ThingWorx Analytics Builder extension.

This video applies to ThingWorx Analytics 52.1 through 8.1. Note, though:
- This video uses the Classic Composer; the same operations can be done using the New Composer starting with version 8.0, as illustrated in the Help Center.
- For release 8.1, the Settings menu differs from previous versions; see Video Link : 2079 between 00:12 and 00:40 for the up-to-date menu selection.

Updated link for access to this video: Installing Thingworx Analytics Builder: Part 2 of 3

Sep 13, 2016

Video Author: Asia Garrouj
Original Post Date: June 13, 2017
Applicable Releases: ThingWorx Analytics 8.0
Description: This video is the first of a three-part series walking you through how to set up ThingWatcher for Anomaly Detection. In this first video you will learn the basics of how to establish connectivity between KEPServer and the ThingWorx Platform.

Jan 20, 2018

Concepts of Anomaly Detection used in ThingWatcher

ThingWatcher is based on anomaly detection with the normal distribution. What does that mean? Normally distributed metrics follow a set of probabilistic rules. Upcoming values that follow those rules are recognized as “normal” or “usual”, whereas values that break those rules are recognized as unusual.

What is a normal distribution?

A normal distribution is a very common probability distribution. In real life, the normal distribution approximates many natural phenomena. A data set is said to be “normally distributed” when most of the data aggregates around its mean in a symmetric way, and its extreme values become less and less likely to appear.

Example

When a factory is making 1 kg sugar bags, it doesn’t always produce exactly 1 kg. In reality, the weight is around 1 kg: most of the time very close to 1 kg, and very rarely far from 1 kg. The production of 1 kg sugar bags follows a normal distribution.

Mathematical rules

When a metric is normally distributed, it follows some interesting laws, as the sugar bag example does:

- The mean and the median are the same. Both are equal to 1000 g, because of the perfectly symmetric “bell shape”.
- The standard deviation, called sigma (σ), defines how the normal distribution is spread around the mean. In this example, σ = 20.
- 68% of all values fall within [mean-σ; mean+σ]. For the sugar bag: [980; 1020].
- 95% of all values fall within [mean-2*σ; mean+2*σ]. For the sugar bag: [960; 1040].
- 99.7% of all values fall within [mean-3*σ; mean+3*σ]. For the sugar bag: [940; 1060].

The last three rules are also known as the 68–95–99.7 rule, or the three-sigma rule of thumb.

When the rules get broken: it’s an anomaly

As previously stated, when a system has been shown to be normally distributed, it follows a set of rules. Those rules become the model representing the normal behavior of the metric. Under normal conditions, upcoming values will match the normal distribution and the model will be followed. But what happens when the rules get broken? That is a sign that something unusual is happening.

In theory, no value is impossible in a normal distribution. If the weights of the sugar bags were truly normally distributed, we would find a bag of 860 g (7σ below the mean) roughly once in a trillion bags. In reality, we only approximate the sugar bag example as normally distributed, and almost impossible values are approximated as impossible.

Techniques of Anomaly Detection

Technique n°1: outlier value

An almost impossible value can be considered an anomaly. When a value deviates too much from the mean, let’s say by ± 4σ, we can consider this almost impossible value an anomaly. (This limit can also be calculated using percentiles.) Sugar bags that weigh less than 920 g or more than 1080 g are considered anomalous. Chances are, there is a problem in the production chain. This provides a simple way to define maximum and minimum thresholds.

Technique n°2: detecting change in the normal distribution

Technique n°1 can detect unusual values quickly, using only a few points, but it can’t detect anomalies in which the values move from one sigma (σ) band to another while each value still looks usual on its own. To detect this kind of anomaly, we use a “window” of the last n elements. If the mean and standard deviation of this window change too much from their usual values, we can deduce an anomaly. The bigger the window, the more stable the detection becomes, but the more time it takes to detect the anomaly, because more values have to be aggregated first.
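Both techniques can be sketched with the Python standard library. This is an illustration of the concepts, not ThingWatcher's actual algorithm. The mean (1000 g), σ (20 g), and the 4σ limit come from the sugar bag example; the window size and the tolerances `mean_tol` and `std_ratio` are hypothetical values chosen for the demonstration.

```python
from collections import deque
from statistics import mean, stdev

MEAN, SIGMA = 1000.0, 20.0  # sugar bag example: weights in grams


def sigma_band(k, mu=MEAN, sigma=SIGMA):
    """Interval [mu - k*sigma, mu + k*sigma]."""
    return (mu - k * sigma, mu + k * sigma)


def is_outlier(value, k=4, mu=MEAN, sigma=SIGMA):
    """Technique 1: flag a value that lies more than k sigma from the mean."""
    low, high = sigma_band(k, mu, sigma)
    return value < low or value > high


class WindowDetector:
    """Technique 2: compare the mean and spread of the last n values with the usual ones.

    mean_tol and std_ratio are hypothetical tolerances: the window is flagged when
    its mean drifts by more than mean_tol * sigma, or when its standard deviation
    exceeds std_ratio * sigma. Requires n >= 2.
    """

    def __init__(self, n, mu=MEAN, sigma=SIGMA, mean_tol=1.0, std_ratio=2.0):
        self.window = deque(maxlen=n)  # keeps only the last n values
        self.mu, self.sigma = mu, sigma
        self.mean_tol, self.std_ratio = mean_tol, std_ratio

    def add(self, value):
        """Add a value; return True if the full window looks anomalous."""
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            return False  # a bigger window needs more values before it can decide
        mean_shifted = abs(mean(self.window) - self.mu) > self.mean_tol * self.sigma
        spread_changed = stdev(self.window) > self.std_ratio * self.sigma
        return mean_shifted or spread_changed


print(sigma_band(1), sigma_band(2), sigma_band(3))

# Every value below is within 2 sigma, so technique 1 stays silent,
# but the window mean has drifted upward.
detector = WindowDetector(n=5)
drifting = [1030, 1035, 1028, 1032, 1031]
print([is_outlier(v) for v in drifting])
print([detector.add(v) for v in drifting])
```

The drifting series shows why the two techniques complement each other: no single reading crosses the 4σ threshold, yet the window mean sits more than one σ above the usual mean once the window fills.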

Jan 30, 2017

In this blog, we shed some light on Gradient boost, which is a default algorithm in our Analytics platform. We will touch on:

1) The main purpose of Gradient boost and how the technique works.
2) Its advantages and constraints.
3) Some “nice to know” tips when working with Gradient boost.

Gradient boost is a machine learning technique whose main purpose is to help weak prediction models become stronger. It works by building one tree at a time, with each tree correcting the errors made by the previous trees. The underlying theory supports reweighting, which allows badly weighted cases to be reweighted: for example, misclassified cases gain weight, and those that are classified correctly lose weight.

It is much the same strategy as when dealing with stocks: you balance the investment between bonds and shares. An analogy can also be drawn with illnesses: if a doctor informs you that you have a rare disease, you want to get a few more opinions from other doctors, and you will evaluate all the information to make a better decision about how to cure yourself.

Why use gradient boost:

- Gradient boost provides the user with a powerful tool to boost/improve weak prediction models.
- Gradient boost works well with regression and classification problems; therefore Decision tree models can benefit from applying gradient boost.
- Gradient boost is known in the industry as one of the best techniques to use for model improvement.
- Gradient boost works in a stagewise fashion: once it has adjusted a tree, it does not go back and readjust it when dealing with the next tree.

As with all machine learning algorithms, gradient boost also has some constraints:

- There is a chance of overfitting.

“Nice to know” tips:

- A natural way to reduce the risk of overfitting is to monitor and adjust the number of iterations.
- The depth of the tree might have an influence on the prediction error; observe what happens if the tree is a stump (1 level deep).
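The reweighting described above, where misclassified cases gain weight and correctly classified ones lose weight, is the scheme used by AdaBoost; gradient boosting in the strict sense fits each new tree to the residual errors instead. As an illustration of the reweighting idea only, here is a minimal sketch of one AdaBoost-style weight update. It assumes the weights sum to 1 and that the weak learner does better than chance (weighted error strictly between 0 and 0.5); the function name is hypothetical.

```python
import math


def reweight(weights, misclassified):
    """One AdaBoost-style update: raise weights of errors, lower the rest, renormalize.

    weights       -- current observation weights, assumed to sum to 1
    misclassified -- booleans, True where the current weak learner was wrong
    """
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0 < error < 0.5:
        raise ValueError("weighted error must be strictly between 0 and 0.5")
    # Weight of this weak learner in the final vote.
    alpha = 0.5 * math.log((1 - error) / error)
    updated = [w * math.exp(alpha if wrong else -alpha)
               for w, wrong in zip(weights, misclassified)]
    total = sum(updated)
    return [w / total for w in updated]


# Four observations with equal weight; the first one was misclassified.
new_weights = reweight([0.25, 0.25, 0.25, 0.25], [True, False, False, False])
print(new_weights)
```

After the update, the misclassified observations together carry half of the total weight, so the next weak learner is pushed to get them right.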

Jan 6, 2017

Video Author: Asia Garrouj
Original Post Date: June 13, 2017
Applicable Releases: ThingWorx Analytics 8.0
Description: This video is the third of a three-part series walking you through how to set up ThingWatcher for Anomaly Detection. In this third video you will learn how to use the Anomaly Mashup to visualize data received from a remote device.

Jan 20, 2018

In our interactions with PTC customers, we often learn that they have previously performed Analytics modeling in Python, Matlab, or R, or even built home-grown analyses in languages such as Java or C++. As expected, when adopting an Industrial Innovation Platform such as ThingWorx, which also has its own ThingWorx Analytics module, customers do not want to reimplement everything from scratch. They would rather integrate their previous work into the Smart Applications built in ThingWorx, leveraging a combination of their existing toolset and ThingWorx Analytics modeling.

That is certainly possible, and there are multiple ways to do it. In this article we focus on several general approaches, but keep in mind that language-specific approaches are also possible, and we are happy to discuss those in the specific context of the customer.

Here are five different ways to bring existing Analytics into ThingWorx:

1. If the task is to reuse an existing predictive model developed in a language such as Python, R, or Matlab, one can typically export that model in PMML (Predictive Model Markup Language), an XML format, and import it into ThingWorx Analytics using the AnalyticsServer_ResultsThing -> UploadModel service. Libraries such as sklearn2pmml and r2pmml can be used toward that goal. The imported model can then be used in the same fashion as a model developed in ThingWorx Analytics to power smart applications built in ThingWorx.

If the analysis involves more complex tasks than predictive modeling, such as custom data normalizations, non-standard Machine Learning models, or home-grown algorithms, one can use the options below.

2. Call the REST Web API exposed by ThingWorx from Python, Matlab, R, Java, or JavaScript. Every ThingWorx service can be called that way, and the API can also be used to push analysis results into ThingWorx for further consumption in the smart applications built there, perhaps together with other sources of data such as sensor readings. The documentation for the ThingWorx REST API can be found here.

3. Expose the existing Analytics through a thin layer of REST Web Services. For example, in Python this can be done using Flask, with a few lines of code. The orchestration can then happen from ThingWorx, by calling the exposed Web Service and weaving the results back into smart applications.

4. Often our customers' current architecture involves a relational database (e.g. SQL Server, Oracle, etc.) that powers the existing Analytics and stores the end results (predictions, correlations, etc.). In this scenario, we can connect ThingWorx directly to that database to read these results.

5. Finally, in the case of complex Analytics where a tighter integration with ThingWorx is desired, existing Analytics and algorithms can be wrapped into a ThingWorx Extension or an Analytics Provider using the corresponding PTC SDKs.

When choosing an integration option, customers need to carefully balance the complexity of the integration, the constraints of their architecture, the complexity of the Analytics modeling, and end-user consumption requirements.
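As an illustration of calling a ThingWorx service over REST from Python, here is a minimal sketch using only the standard library. It assumes the usual ThingWorx URL pattern /Thingworx/Things/<thing>/Services/<service>, a JSON request body, and authentication with an application key sent in the appKey header; check these against the REST API documentation for your release. The host name, Thing name, service name, parameters, and key shown here are all hypothetical placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_service_request(base_url, thing, service, app_key, params=None):
    """Build (but do not send) a POST request that invokes a service on a Thing."""
    url = "%s/Thingworx/Things/%s/Services/%s" % (
        base_url.rstrip("/"),
        urllib.parse.quote(thing, safe=""),
        urllib.parse.quote(service, safe=""),
    )
    body = json.dumps(params or {}).encode("utf-8")
    headers = {
        # urllib normalizes header-name case; HTTP header names are case-insensitive.
        "appKey": app_key,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call_service(request, timeout=30):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical values for illustration only; nothing is sent here.
request = build_service_request(
    "https://thingworx.example.com",
    thing="ExampleThing",
    service="ExampleService",
    app_key="YOUR-APP-KEY",
    params={"input": 42},
)
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the sketch testable without a server; in a real integration, `call_service(request)` would be wrapped in error handling for network and HTTP failures.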

May 27, 2021

IoT Tips

Installing ThingWorx Analytics 8.0

Getting Started with ThingWorx Analytics Part-1

Gradient Boost Algorithm

Anomaly Detection 8.0 –Part 3. Viewing Data via Anomaly Mashup

Installing ThingWorx Analytics Server 52.x - part 2 - end

Anomaly Detection 8.0 –Part 2. Configure Anomaly Alert

Troubleshooting Steps for ThingWorx Analytics – Getting Started

ThingWorx Analytics Publish Model using TW.AnalysisServices.AnalyticsServer.AnalyticsServerConnector

Connecting Existing Things to ThingWorx Industrial Gateway for Anomaly Detection

Exporting and Importing PMML Models with ThingWorx Analytics

Configuring and using Anomaly Detection in ThingWorx 8.4

Anomaly Detection 8.0: Configuring Anomaly Alerts: Part 2 of 3

Installing ThingWorx Analytics Server 8.1 - Native Linux

Anomaly Detection 8.0 –Part 1. Connecting KEPServer to ThingWorx

Starting & Stopping ThingWorx Analytics 8.1 (Native Linux Deployment)

Installing Thingworx Analytics Builder part 2 of 3

Anomaly Detection 8.0: Connecting KEPServer to ThingWorx: Part 1 of 3

Concepts of Anomaly Detection used in ThingWatcher

Why use Gradient Boost and how does it work?

Anomaly Detection 8.0: Viewing Data via Anomaly Mashup: Part 3 of 3

Five ways to integrate external Analytics such as Python, R, Matlab in the ThingWorx Platform

ThingWorx Learning Paths

Getting Started on the ThingWorx Platform Learning Path