IoT Tips

Predictive models: Predictive modeling is one of the core techniques of predictive analytics: models are trained on historical data and then used to make predictions on new data. These models are built to analyze current data records in combination with historical data.

Use of Predictive Analytics in ThingWorx Analytics and How to Access Predictive Analysis Functionality via ThingWorx Analytics

Bias and variance are the two components of imprecision in predictive models. Bias is a measure of model rigidity and inflexibility; it means that your model is not capturing all the signal it could from the data. High bias is also known as under-fitting. Variance, on the other hand, is a measure of model inconsistency: high-variance models tend to perform very well on some data points and very badly on others. This is also known as over-fitting, and means that your model is too flexible for the amount of training data you have and ends up picking up noise in addition to the signal.

If your model performs really well on the training set but much worse on the hold-out set, it is suffering from high variance. If it performs poorly on both the training and test datasets, it is suffering from high bias.

Techniques to improve a model:

Add more data: Having more data is almost always a good idea. It lets the data speak for itself instead of relying on assumptions and weak correlations, and it generally produces more accurate models. When should you ask for more data? There is no general quantity; it depends on the problem you are working on and the algorithm you are implementing. For example, when working with time series data you should look for at least one year of data, and when dealing with neural network algorithms you are advised to get more training data, otherwise the model won't generalize.

Feature engineering: Adding new features decreases bias at the expense of variance. New features can help the algorithm explain the variance in the data more effectively. During hypothesis generation, enough time should be spent on the features required for the model; those features should then be created from the existing datasets.

Feature selection: This is one of the most important aspects of predictive modeling. It is always advisable to identify the important features and rebuild the model with only the significant ones. For example, say we have 100 variables; a subset of them will drive most of the variance of the model. If we select features only on a p-value basis, we may still have more than 50 variables. In that case, look at other measures, such as the contribution of each individual variable to the model. If 90% of the model's variance is explained by only 15 variables, choose only those 15 variables for the final model.

Multiple algorithms: Hitting on the right machine learning algorithm is the ideal approach to achieve higher accuracy. Some algorithms are better suited to particular types of datasets than others, so apply all relevant models and compare their performance.

Algorithm tuning: Machine learning algorithms are driven by parameters, and these parameters strongly influence the outcome of the learning process. The objective of parameter tuning is to find the optimum value of each parameter to improve the accuracy of the model. To tune these parameters, you must have a good understanding of their meaning and their individual impact on the model. You can repeat this process with a number of well-performing models. For example, in a random forest we have parameters like max_features, number_trees, random_state, and oob_score. Intuitive optimization of these parameter values will result in better and more accurate models.

Cross validation: Cross validation is one of the most important concepts in data modeling: leave out a sample on which you do not train the model, and test the model on this sample before finalizing it. This method helps achieve more generalized relationships. (A minimal sketch appears at the end of this post.)

Ensemble methods: This is the most common approach found in the winning solutions of data science competitions. The technique simply combines the results of multiple weak models to produce better results, and it can be achieved in several ways:
- Bagging: uses several versions of the same model trained on slightly different samples of the training data to reduce variance, without any noticeable effect on bias. Bagging can be computationally intensive, especially in terms of memory.
- Boosting: a slightly more complicated concept that relies on training several models successively, each trying to learn from the errors of the models preceding it. Boosting decreases bias and hardly affects variance.
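To make the cross-validation idea concrete, here is a minimal k-fold sketch in plain JavaScript. The dataset and the trainModel/evaluate callbacks are hypothetical placeholders (ThingWorx Analytics handles validation internally), so treat this as an illustration of the concept rather than platform code:

// Minimal k-fold cross-validation sketch (illustrative only).
// `data` is an array of records; `trainModel` and `evaluate` are
// hypothetical callbacks standing in for any learner and any metric.
function crossValidate(data, k, trainModel, evaluate) {
    var foldSize = Math.floor(data.length / k);
    var scores = [];
    for (var i = 0; i < k; i++) {
        // Hold out fold i for testing; train on everything else.
        var testSet = data.slice(i * foldSize, (i + 1) * foldSize);
        var trainSet = data.slice(0, i * foldSize)
                           .concat(data.slice((i + 1) * foldSize));
        var model = trainModel(trainSet);
        scores.push(evaluate(model, testSet));
    }
    // The average score estimates how well the model generalizes.
    return scores.reduce(function (a, b) { return a + b; }, 0) / k;
}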
View full tip
Alerts via Anomaly Detection

This document's objective is to provide information and links about alerts used for anomaly detection. It covers the following topics:
- What Is Anomaly Detection
- Implementing Anomaly Detection
- Creating an Anomaly Alert and Prerequisites
- Anomaly Stats
- Certainty Parameter
- Video Example On How To Create An Alert for Anomaly Detection
- Tips and troubleshooting

What Is Anomaly Detection
Anomaly Detection in ThingWorx is implemented via the built-in ThingWatcher functionality. ThingWatcher detects anomalies by monitoring a data stream from a device, calculating an expected distribution of the data, and validating that the current data point is a member of the expected distribution.

Implementing Anomaly Detection
Anomaly Detection is enabled by default in ThingWorx. However, several steps are required to configure the functionality for your specific environment, including the prerequisite activities below.

Creating an Anomaly Alert and Prerequisites
Configure Anomaly Detection to monitor a stream of data. For information about setting up Anomaly Detection, see Preparing ThingWorx for Anomaly Detection.

Anomaly Stats
An Anomaly Alert moves through several statuses as it works its way through the corresponding phases:
- Initialized
- Calibrating
- Training
- Buffering
- Monitoring
- Failed

Certainty Parameter
Choosing the Certainty parameter for anomaly detection requires weighing a number of factors. At its most basic, ThingWatcher compares two sets of data: a validation set (collected during the Calibrating phase) and a test dataset (data streaming from a remote device). ThingWatcher tries to determine the likelihood that the distribution of values in the test dataset comes from the same distribution as the values in the validation dataset. The accuracy of the model plays a large role in this determination, but so does the Certainty parameter used for the statistical analysis of the two data sets. (A conceptual sketch follows at the end of this post.)

Video Example On How To Create An Alert for Anomaly Detection
- Anomaly Detection Part 1. Create connectivity between KEPServerEX and ThingWorx Platform.
- Anomaly Detection Part 2. Configure an Anomaly Alert to bind simulated data coming through KEPServerEX for Anomaly Detection.
- Anomaly Detection Part 3. View data via the Anomaly Mashup.

Tips and troubleshooting
Diagnose and fix the most common issues that may be encountered when working with ThingWatcher. It cannot be stressed strongly enough that you should be familiar with your data, including the average time interval between data points, and the collection duration and certainty threshold you specified.
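To build intuition for the certainty trade-off, here is a purely conceptual sketch: the similarity score, the scale, and the threshold comparison below are invented placeholders, not ThingWatcher's actual statistical test.

// Conceptual only: `similarity` is a hypothetical 0-100 score of how
// well incoming data matches the expected distribution.
function isAnomalous(similarity, certainty) {
    // Higher certainty demands stronger evidence before flagging,
    // trading false positives away at the risk of false negatives.
    return (100 - similarity) >= certainty;
}

console.log(isAnomalous(5, 90));  // true: data very unlike the expected pattern
console.log(isAnomalous(25, 90)); // false: not confident enough to flag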
View full tip
Video Author: Mohammed Amine Chehaibi
Original Post Date: June 29, 2017
Applicable Releases: ThingWorx Analytics 8.0 to 8.1

Description: In this video you will learn how to:
- Bind a property of an existing entity to the KEPServerEX data feed
- Create an Alert on that property and monitor its behavior
View full tip
Video Author: Christophe Morfin
Original Post Date: June 9, 2017
Applicable Releases: ThingWorx Analytics 8.0

Description: In this video we go through the steps to install ThingWorx Analytics Server 8.0.
View full tip
Use Case: You've published a model from Analytics Builder to Analytics Manager, and then used the service CreateOrUpdateThingTemplateForModel on the resource TW.AnalysisServices.ModelManagementServicesAPI. A Thing created from the resulting template will have an infotable called "data" which needs to be populated in order to trigger an Analysis Event & Job. For example, you might have been following the online documentation for Analytics Manager > Working with Thing Predictor > Demo: Using Thing Predictor, link here.

The script below makes it easy to create a line of test data in the field "data" on your Thing to trigger the analysis event & job. The fields causalTechnique, goalName, and importantFieldCount are also set programmatically, as these are needed for the analysis event & job. This script may also be useful as a general example of how to write to an infotable property on a Thing.

The JavaScript code is shown here and is also attached as a text file to this post:

me.causalTechnique = 'FULL_RANGE';
me.goalName = 'predict_Compressor_failure';
me.importantFieldCount = 3;

// 1 - CREATE AN EMPTY INFOTABLE FROM THE INPUT PARAMS DATA SHAPE
// CreateInfoTableFromDataShape(infoTableName:STRING, dataShapeName:STRING):INFOTABLE
var params = {
    infoTableName : "InfoTable",
    dataShapeName : "ThingPredictor.test-integer_afebaef3-b2cf-4347-824c-a39c11ddbb4a.InputParamsdataDataShape"
};
var myInfoTable = Resources["InfoTableFunctions"].CreateInfoTableFromDataShape(params);

// 2 - CREATE AN INFOTABLE ROW AS A PLAIN OBJECT
var newEntry = new Object();
newEntry._Pressure = 10.5;    // NUMBER
newEntry._Temperature = 45.1; // NUMBER
newEntry._VibrationX = 81;    // NUMBER
newEntry._VibrationY = 65;    // NUMBER

// 3 - ADD THE ROW TO THE INFOTABLE
myInfoTable.AddRow(newEntry);

// 4 - PERSIST THE INFOTABLE TO THE THING PROPERTY 'data'
me.data = myInfoTable;
View full tip
Mapping previous versions of the ThingWorx Analytics API to ThingWorx Analytics 8.1 Services

Since ThingWorx Analytics 8.1, the classic server monolith has been replaced by a series of independent microservices. This new structure groups services around specific elements of functionality (data, training, results). The previous API commands for accessing ThingWorx Analytics functions have therefore been replaced by ThingWorx Services, which live on specific Microservice Things accessible in ThingWorx Platform 8.1. The mapping below covers the most common API commands from version 8.0 and earlier and their 8.1 counterparts; it is not an exhaustive listing of either API commands or Services. The sample API calls may require additional information (such as headers and a body) when used and are shown for reference purposes. An example invocation of one of the new services appears after the list.

1. Version Info
   8.0 API: GET http://<IP Address>:8080/1.0/about/versioninfo
   8.1 Service: VersionInfo (available on each Microservice Thing inheriting from Analytics Server)
   Returns the internal version number for a specific microservice: the first two digits are the ThingWorx Core version, and the next three digits are the version of the microservice.

2. Registering a new Dataset
   8.0 API: POST http://<IP Address>:8080/1.0/datasets/
   8.1 Service: CreateDataset (Data Microservice)
   Creates the dataset, uploads the data along with its metadata, and optimizes it automatically.

3. Checking Dataset Status
   8.0 API: GET http://<IP Address>:8080/1.0/datasets/<DataSet Name>
   8.1 Service: ListCreatedDatasets (Data Microservice)
   The old functionality is replaced by a service that lists all created Datasets.

4. Creating Metadata
   8.0 API: POST http://<IP Address>:8080/1.0/datasets/<DataSet Name>/configuration
   8.1 Service: CreateDataset (Data Microservice; see entry 2)

5. Checking Dataset Configuration
   8.0 API: GET http://<IP Address>:8080/1.0/datasets/<DataSet Name>/configuration
   8.1 Service: GetDatasetSchema (Data Microservice)
   Retrieves the metadata from a dataset.

6. Loading Dataset CSV
   8.0 API: POST http://<IP Address>:8080/1.0/datasets/<DataSet Name>/data
   8.1 Service: CreateDataset (Data Microservice; see entry 2)

7. Checking Job Status
   8.0 API: GET http://<IP Address>:8080/1.0/status/<Job ID>
   8.1 Service: GetJobStatus (available on all Microservice Things inheriting from AnalyticsJob Server)
   Retrieves the status of a specific job.

8. Signals Job
   8.0 API: POST http://<IP Address>:8080/1.0/datasets/<DataSet Name>/signals
   8.1 Service: CreateJob (Signals Microservice)
   Creates a job to identify signals.

9. Signals Results Job
   8.0 API: GET http://<IP Address>:8080/1.0/datasets/<DataSet Name>/signals/<Job ID>/results
   8.1 Service: RetrieveResult (Signals Microservice)
   Retrieves the result of a Signals job.

10. Profile Job
    8.0 API: POST http://<IP Address>:8080/1.0/datasets/<DataSet Name>/profiles
    8.1 Service: CreateJob (Profiling Microservice)
    Creates a job to generate profiles.

11. Profile Result Job
    8.0 API: GET http://<IP Address>:8080/1.0/datasets/<DataSet Name>/profiles/<Job ID>/results
    8.1 Service: RetrieveResult (Profiling Microservice)
    Retrieves the results of a profiles job.

12. Train Model Job
    8.0 API: POST http://<IP Address>:8080/1.0/datasets/<DataSet Name>/prediction
    8.1 Service: CreateJob (Training Microservice)
    Creates a prediction model job.

13. Train Model Result Job
    8.0 API: GET http://<IP Address>:8080/1.0/datasets/<DataSet Name>/prediction/<Job ID>/results
    8.1 Service: RetrieveModel (Training Microservice)
    Only retrieves the PMML model. However, if a holdout for validation was specified in CreateJob, a validation job is auto-created and runs.

14. Scoring Job
    8.0 API: POST http://<IP Address>:8080/1.0/datasets/<DataSet Name>/predictive_scores
    8.1 Service: BatchScore (Prediction Microservice)
    Submits a predictive scoring job.

15. Scoring Job Result
    8.0 API: GET http://<IP Address>:8080/1.0/datasets/<DataSet Name>/predictive_scores/<Job ID>/results
    8.1 Service: RetrieveResult (Prediction Microservice)
    Retrieves results from predictive scoring jobs.
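As an example of the new invocation pattern, the sketch below shows what the old job-status REST call might look like as an 8.1 service invocation from a ThingWorx script. The Thing name and parameter name are assumptions made for illustration only; use the actual entity and parameter names from your ThingWorx Composer:

// Sketch: invoking the 8.1 GetJobStatus service from a ThingWorx script.
// "TW.AnalysisServices.TrainingThing" and "jobId" are hypothetical
// placeholders, not confirmed entity or parameter names.
var status = Things["TW.AnalysisServices.TrainingThing"].GetJobStatus({
    jobId: "<Job ID>" // the job identifier returned by CreateJob
});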
View full tip
In the last while I've seen a few things which got me thinking about how value is created, or unlocked, from connected data. Multiple components are required to create, send, store, and manage the data created by edge devices; doing these things enables value to be unlocked, but what does it take to unlock that value?

It was this article which first got me interested in considering the question. In particular, it was a section near the bottom of the article where the author describes a number of creative business use cases for car manufacturers which could be enabled by connected data. If this author could come up with several creative and potentially valuable use case examples for one industry, I started to wonder what other sorts of use cases could exist in other industries. Could there be a series of use cases to which a little variation could be applied across different industries?

The second article which further sparked my interest in where the value originates is this one, on using a cryptocurrency with IoT. While the idea of using a blockchain-like technology with IoT is intriguing, it was the second image in the article (below) which resonated with me on value. This image is a graphical representation of the connection between the key components of a connected system, and it makes clear that each component has a critical role to play: remove any one of the parts and the system doesn't function. The image also makes clear that it's the "Analyze" phase which drives the action to do something, and taking an action is the system's reason for existing.

Which brings me to the third and final article, describing Industry 4.0. Like the other two articles, it wasn't the main point of the article I found most interesting; rather it was the image below, and in particular the sidebar 'Value Creation through', which brought me back to the question of where value comes from. The idea that in a manufacturing setting value can be created through product or process innovations, as well as through new business models, is intriguing. I think a fourth idea missing from this list is one where network effects from accumulating more and more proprietary data create a compounding effect, as with Facebook or LinkedIn. If there are at least four modes of value creation, maybe there are others?

While these articles caused me to ask some questions, none of them really answered where the value is unlocked. To answer that, I restated the question as "how is value unlocked from data", making the assumption that the value is derived from the data. This question is a little easier to address. The best visual representation of the answer I've seen is the data value road map (below) from the book Creating a Data-Driven Organization, which was released a couple of years ago. While I think the author is probably missing at least two boxes above 'optimization' ("new business models" and "data-driven network effects"), the graphic does a good job of communicating that as the value created from data increases, the complexity of the analytic task also increases, suggesting the value is unlocked by the analytics. For me, the value from a system of connected devices is unlocked in the "analysis" phase, as seen in the first image. But performing the "analysis" requires two things: first, asking the right high-value questions of the data (product managers, beginning with the end in mind, use cases); and second, using the right set of technologies to address those questions, which in many instances means artificial intelligence of some sort. Interestingly, although artificial intelligence is required for many high-value use cases, both parts of the analysis require distinctly human skills (choosing the right use cases and controlling the technology) to create externalized intelligence and generate value.

Reference: Carl Anderson, Creating a Data-Driven Organization: Practical Advice from the Trenches (eBook, Amazon.com)
View full tip
Parquet Data Format used in ThingWorx Analytics

Starting with ThingWorx Analytics 8.1, data storage no longer requires the installation of a PostgreSQL database. Instead, uploaded CSV data is converted to the optimized Apache Parquet format and stored directly in the file system. This blog explains some of the features of Apache Parquet that justify this transition in ThingWorx Analytics data storage.

What is Apache Parquet:
Apache Parquet is a column-oriented data store of the Apache Hadoop ecosystem. It is compatible with most of the data processing frameworks in the Hadoop environment. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk. [Illustration: the columnar storage model]

Apache Parquet features and benefits:
Apache Parquet is implemented using the record shredding and assembly algorithm, taking into account the complex data structures that can be used to store the data. Apache Parquet stores data such that the values in each column are physically stored in contiguous memory locations. Due to this columnar storage, Apache Parquet provides the following benefits:
- Column-wise compression is efficient and saves storage space
- Compression techniques specific to a type can be applied, as the column values tend to be of the same type
- Queries that fetch specific column values need not read the entire row data, thus improving performance
- Different encoding techniques can be applied to different columns

Some advantages of using Parquet for ThingWorx Analytics:
Apart from the above benefits, which amount to higher efficiency and increased performance, the following advantages apply specifically to ThingWorx Analytics:
- The change from a database to Parquet removes the limitations on the number of data columns the system can handle.
- It streamlines the dataset creation process: since the data is converted to the Parquet format, there is no need to separately optimize the dataset. Even when new data is appended to an existing dataset, a new partition is added, and re-optimization is optional rather than required.
- Data can be appended easily, so there is no longer a need to re-load the full dataset when new data values are added.

[Illustration: the row-based storage model vs. Parquet's columnar storage] A sketch contrasting the two layouts follows.
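To make the row-versus-column distinction concrete, here is a minimal sketch, with invented sample data, of the same records in both layouts. Grouping each field's values contiguously is what enables per-column compression and column-pruned reads:

// Row-oriented layout: each record's fields are stored together.
var rows = [
    { deviceId: "A1", temperature: 45.1, pressure: 10.5 },
    { deviceId: "B2", temperature: 47.3, pressure: 11.0 }
];

// Column-oriented (Parquet-style) layout: each column's values are
// stored contiguously, so same-typed values compress well and a query
// can read only the columns it needs.
var columns = {
    deviceId:    ["A1", "B2"],
    temperature: [45.1, 47.3],
    pressure:    [10.5, 11.0]
};

// Fetching one column does not touch the other columns' data.
var avgTemp = columns.temperature.reduce(function (a, b) { return a + b; }, 0)
              / columns.temperature.length;
console.log(avgTemp); // 46.2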
View full tip
First we need to understand the terms below.

Quantitative variable: A quantitative variable is naturally measured as a number for which meaningful arithmetic operations make sense. Examples: height, age, crop yield, GPA, salary, temperature, area, air pollution index (measured in parts per million), etc.

Categorical variable: Any variable that is not quantitative is categorical. Categorical variables take a value that is one of several possible categories. As naturally measured, categorical variables have no numerical meaning. Examples: hair color, gender, field of study, college attended, political affiliation, status of disease infection.

Ordinal variable: An ordinal variable is a categorical variable for which the possible values are ordered. Ordinal variables can be considered "in between" categorical and quantitative variables. Example: educational level might be categorized as
1: Elementary school education
2: High school graduate
3: Some college
4: College graduate
5: Graduate degree

- In this example (and for many ordinal variables), the quantitative differences between the categories are uneven, even though the differences between the labels are the same (e.g., the difference between 1 and 2 is four years, whereas the difference between 2 and 3 could be anything from part of a year to several years).
- Thus it does not make sense to take a mean of the values.
- Common mistake: treating ordinal variables like quantitative variables without considering whether this is appropriate in the particular situation at hand.

Ordinal regression: In statistics, ordinal regression (also called "ordinal classification") is a type of regression analysis used for predicting an ordinal variable. The ordinal regression procedure allows you to build models, generate predictions, and evaluate the importance of various predictor variables in cases where the dependent (target) variable is ordinal in nature.

Ordinal dependents and linear regression: When you are trying to predict ordinal responses, the usual linear regression models don't work very well. Those methods work only by assuming that the outcome (dependent) variable is measured on an interval scale. Because this is not true for ordinal outcome variables, the simplifying assumptions on which linear regression relies are not satisfied, and the regression model may not accurately reflect the relationships in the data. In particular, linear regression is sensitive to the way you define categories of the target variable. With an ordinal variable, the important thing is the ordering of categories. So, if you collapse two adjacent categories into one larger category, you are making only a small change, and models built using the old and new categorizations should be very similar. Unfortunately, because linear regression is sensitive to the categorization used, a model built before merging categories could be quite different from one built after.

Below are some examples of ordered logistic regression:

Example 1: A marketing research firm wants to investigate what factors influence the size of soda (small, medium, large, or extra large) that people order at a fast-food chain. These factors may include what type of sandwich is ordered (burger or chicken), whether or not fries are also ordered, and the age of the consumer. While the outcome variable, size of soda, is obviously ordered, the difference between the various sizes is not consistent: the difference between small and medium is 10 ounces, between medium and large 8, and between large and extra large 12.

Example 2: A researcher is interested in what factors influence medaling in Olympic swimming. Relevant predictors include training hours, diet, age, and the popularity of swimming in the athlete's home country. The researcher believes that the distance between gold and silver is larger than the distance between silver and bronze.

Example 3: A study looks at factors that influence the decision of whether to apply to graduate school. College juniors are asked if they are unlikely, somewhat likely, or very likely to apply to graduate school. Hence, our outcome variable has three categories. Data on parental educational status, whether the undergraduate institution is public or private, and current GPA is also collected. The researchers have reason to believe that the "distances" between these three points are not equal. For example, the "distance" between "unlikely" and "somewhat likely" may be shorter than the distance between "somewhat likely" and "very likely".

How to use and get results with ordinal regression: click this link for the PDF.

PDF source: http://www.norusis.com
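For reference, the most common formulation of ordinal regression is the proportional-odds (cumulative logit) model. This is standard statistical background rather than material from the linked PDF, so treat it as supplementary:

\[ \log \frac{P(Y \le j)}{P(Y > j)} = \theta_j - \beta^\top x, \qquad j = 1, \dots, J-1 \]

Each of the J-1 cut-points gets its own threshold \(\theta_j\), while a single coefficient vector \(\beta\) is shared across all categories. This shared slope is the "proportional odds" assumption, and it is what lets the model respect the ordering of the categories without assuming equal distances between them.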
View full tip
The accuracy of a predictive model can be boosted in two ways: either by embracing feature engineering or by applying boosting algorithms straight away. There are multiple boosting algorithms, such as Gradient Boosting, XGBoost, AdaBoost, and GentleBoost. Every algorithm has its own underlying mathematics, and a slight variation is observed when applying each of them. While working with boosting algorithms, two buzzwords come up frequently: bagging and boosting.

Bagging: an approach where you take random samples of data, build a learner on each, and combine the results by simple averaging.
Boosting: similar, but the selection of samples is made more intelligently; we successively give more and more weight to hard-to-classify observations.

Below are the default algorithms used in predictive models generated by ThingWorx Analytics:
- Decision Tree
- Gradient Boost
- Linear Regression
- Neural Net
- Random Forest
- Logistic Regression

Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion, like other boosting methods, and generalizes them by allowing optimization of an arbitrary differentiable loss function.

Let's begin with an easy example. Assume you are given a previous model M to improve on, and you observe that the model has an accuracy of 80% (by some metric). How do you go further? One simple way is to build an entirely different model using a new set of input variables and better ensemble learners. On the contrary, there is a much simpler way. It goes like this:

Y = M(x) + error

What if we are able to see that the error is not white noise, but has some correlation with the outcome (Y)? What if we can develop a model on this error term?

error = G(x) + error2

We will probably see the error rate improve, say to 84%. Let's take another step and regress against error2:

error2 = H(x) + error3

Now we combine all of these together:

Y = M(x) + G(x) + H(x) + error3

This will probably have an accuracy of even more than 84%. What if we can find optimal weights for each of the three learners?

Y = alpha * M(x) + beta * G(x) + gamma * H(x) + error4

(A runnable sketch of this residual-fitting idea appears at the end of this post.)

How gradient boosting works:
1. Loss function: The loss function used depends on the type of problem being solved. It must be differentiable; many standard loss functions are supported, and you can define your own. A benefit of the gradient boosting framework is that a new boosting algorithm does not have to be derived for each loss function: the framework is generic enough that any differentiable loss function can be used.
2. Weak learner: Decision trees are used as the weak learner in gradient boosting. Specifically, regression trees are used, which output real values for splits and whose outputs can be added together, allowing subsequent models' outputs to "correct" the residuals in the predictions. Trees are constructed in a greedy manner, choosing the best split points based on purity scores like Gini, or to minimize the loss.
3. Additive model: Trees are added one at a time, and existing trees in the model are not changed. A gradient descent procedure is used to minimize the loss when adding trees: after calculating the loss, we add a tree (the weak-learner sub-model) whose addition reduces the loss.

Improvements to basic gradient boosting:
1. Tree constraints: It is important that the weak learners have skill but remain weak. Some constraints that can be imposed on the construction of decision trees:
- Number of trees: generally, adding more trees to the model overfits only very slowly. The advice is to keep adding trees until no further improvement is observed.
- Tree depth: deeper trees are more complex, so shorter trees are preferred. Generally, better results are seen with 4-8 levels.
- Number of nodes or number of leaves: like depth, this can constrain the size of the tree, but the tree is not constrained to a symmetrical structure if other constraints are used.
- Number of observations per split: imposes a minimum on the amount of training data at a node before a split can be considered.
- Minimum improvement to loss: a constraint on the improvement of any split added to a tree.
2. Weighted updates: The contribution of each tree to the running sum can be weighted to slow down learning. This weighting is called shrinkage or the learning rate: each update is simply scaled by the value of the learning rate parameter v.
3. Stochastic gradient boosting: At each iteration, a subsample of the training data is drawn at random (without replacement) from the full training dataset. This randomly selected subsample, instead of the full sample, is then used to fit the base learner.
4. Penalized gradient boosting: An additional regularization term helps smooth the final learned weights to avoid over-fitting. Intuitively, the regularized objective tends to select a model employing simple and predictive functions.
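Here is a toy sketch of the residual-fitting idea above (Y = M(x) + G(x) + H(x) + ...) using one-dimensional data and regression stumps. Everything in it, including the data, the stump learner, and the learning rate, is invented for illustration; it is not how ThingWorx Analytics implements gradient boosting internally.

// Toy gradient boosting with squared-error loss on 1-D data.
// Assumes xs is sorted ascending with at least two distinct values.
function fitStump(xs, residuals) {
    var mean = function (a) {
        return a.reduce(function (p, c) { return p + c; }, 0) / a.length;
    };
    var best = null;
    for (var s = 0; s < xs.length - 1; s++) {
        var split = (xs[s] + xs[s + 1]) / 2;
        var left = [], right = [];
        for (var i = 0; i < xs.length; i++) {
            (xs[i] <= split ? left : right).push(residuals[i]);
        }
        if (!left.length || !right.length) { continue; }
        var lMean = mean(left), rMean = mean(right);
        var err = 0;
        for (var j = 0; j < xs.length; j++) {
            var pred = xs[j] <= split ? lMean : rMean;
            err += Math.pow(residuals[j] - pred, 2);
        }
        if (best === null || err < best.err) {
            best = { split: split, lMean: lMean, rMean: rMean, err: err };
        }
    }
    // The stump is itself a tiny regression tree: one split, two leaves.
    return function (x) { return x <= best.split ? best.lMean : best.rMean; };
}

function gradientBoost(xs, ys, rounds, learningRate) {
    var stumps = [];
    var predict = function (x) {
        var sum = 0;
        stumps.forEach(function (f) { sum += learningRate * f(x); });
        return sum;
    };
    for (var r = 0; r < rounds; r++) {
        // For squared error, the negative gradient is just the residual,
        // so each new stump is trained on what is still unexplained.
        var residuals = xs.map(function (x, i) { return ys[i] - predict(x); });
        stumps.push(fitStump(xs, residuals));
    }
    return predict;
}

// Example: after enough rounds the ensemble fits the training data closely.
var f = gradientBoost([1, 2, 3, 4], [2, 4, 8, 16], 50, 0.5);
console.log(f(3)); // close to 8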
View full tip
This blog is intended to help diagnose and fix the most common issues that may be encountered when working with ThingWatcher. It cannot be stressed strongly enough that you should be familiar with your data, including the average time interval between data points, and the collection duration and certainty threshold you specified.

Before you start troubleshooting ThingWatcher, check that the results and training microservices are running. To test the results microservice, open a web browser and paste in the results URL: http://<IP of microservices>:<Port of results microservice>/results/models (e.g., http://localhost:8096/results/models). To test the training microservice, open a web browser and paste in the training URL: http://<IP of microservices>:<Port of training microservice>/training (e.g., http://localhost:8091/training). If you see either {"values":[],"total":0,"next":null,"previous":null} or a list of training jobs in JSON format, the results and training microservices are available. (A small script that automates these two checks appears at the end of this post.)

1. Question: I haven't seen an anomaly, but I believe that my property is anomalous.
This can have different causes; here are the most common:
- The certainty is too high. If the certainty is too high, ThingWatcher is conservative in its categorization of true positives and may therefore emit more false negatives. Reducing the certainty changes this behavior, but note that ThingWatcher may then categorize too many false positives as a result: it may detect the desired anomalies but also some non-anomalies.
- The property was anomalous during training data collection. If ThingWatcher creates a predictive model from anomalous data, it may not be able to detect the desired anomalies during MONITORING because the data does not really appear anomalous, so ThingWatcher treats this pattern as normal. Ensure that property values are non-anomalous during training.
- There are long time gaps during the monitoring state, so ThingWatcher stays in Buffering and categorizes these data points as non-anomalous.

2. Question: ThingWatcher detects an anomaly, but my property is non-anomalous.
- The certainty might be too low. In this case, ThingWatcher reports anomalies when the incoming data pattern looks even slightly different from the expected data pattern.
- ThingWatcher might need more training data. If the property data has a pattern that occurs over a long time span, ThingWatcher needs to collect multiple cycles of all these patterns in order to detect a true anomaly without emitting too many false positives.

3. Question: ThingWatcher is in the FAILED state. Why?
There are many possible reasons for a failed state; here are the most likely:
- The training service has not been set up or is down, or the results service is not available. You may see log messages such as: messageText=Unexpected exception. {Throwable=[ConnectException: Operation timed out]} or messageText=Unexpected exception. {Throwable=[ConnectException: Connection refused]}. Note that ThingWatcher is still able to collect all training data; you will only begin to see these failed states after ThingWatcher tries to post the training request.
- Time gaps prevent the data collection for training. You will see this warning in the log messages: "A long time gap was detected in the data that is greater than the threshold of {n}". This means you have a long gap in the training data, and ThingWatcher will recollect the data. If there are more than 3 recollections due to long time gaps, ThingWatcher transitions to a failed state and will not be able to recover. In this case you can either instruct ThingWatcher to retrain and try again, or check the data source to make sure it does not have long gaps.

4. Question: Why does ThingWatcher remain in Buffering?
There are many possible reasons for ThingWatcher to remain in Buffering, but the most likely issue is time gaps, which cause ThingWatcher to remain stuck in Buffering. If the incoming data regularly contains long time gaps, you will notice that ThingWatcher keeps alternating between the monitoring and buffering states. You may need to provide better quality data, i.e., more evenly spaced data.

Source: Alex Meng, Specialist Software Engineer
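The two browser checks described at the top of this post can also be scripted. A minimal sketch, assuming Node.js and the default host and ports cited above (adjust for your installation):

// Health-check sketch for the ThingWorx Analytics microservices,
// using the Node.js core 'http' module.
var http = require("http");

function checkService(name, url) {
    http.get(url, function (res) {
        var body = "";
        res.on("data", function (chunk) { body += chunk; });
        res.on("end", function () {
            // An empty values list or a JSON list of jobs both mean "available".
            console.log(name + " responded (" + res.statusCode + "): " + body.slice(0, 80));
        });
    }).on("error", function (err) {
        console.log(name + " is NOT reachable: " + err.message);
    });
}

checkService("Results microservice", "http://localhost:8096/results/models");
checkService("Training microservice", "http://localhost:8091/training");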
View full tip
Learning is a wide domain, and thus the field of machine learning has branched into several subfields dealing with different types of learning tasks.

Supervised vs. Unsupervised Learning
Since learning involves an interaction between the learner and the environment, we can divide tasks according to the nature of that interaction. Consider the task of detecting hand-written numbers among lots of digital numbers, versus the task of anomaly detection. For the hand-written number detection task, we consider a setting in which the learner receives a set of training numbers, labeled as 'Hand-Written' or 'Digital'. On the basis of such training, the learner should figure out a rule for labeling a newly arriving number. In contrast, for the task of anomaly detection, all the learner gets as training is a large set of numbers without any labels, and the learner's task is to detect 'unusual' numbers.

Consider learning as a process of 'using experience to gain expertise'. Supervised learning describes a scenario in which the 'experience', a training example, contains significant information (the labels 'Hand-Written' or 'Digital') that is missing from the future arriving numbers to which the learned expertise is to be applied. Here, the acquired expertise is aimed at predicting the missing information for the validation data. In unsupervised learning, however, there is no difference between the training data and the validation data. The learner processes input data with the goal of coming up with some summary or compressed version of the data. Clustering a dataset into subsets of similar objects is a good example of such a task.

There is also an intermediate learning setting in which the training set contains more information than the validation set, and the learner is required to predict even more information for the validation set. Consider a game of chess, where a value function describes each configuration of the chess board. One may want to learn the degree by which White's position is better than Black's. The only information available to the learner at training time is positions that occurred in actual chess games and who won them. Such learning frameworks are investigated under reinforcement learning.

Source: 'Understanding Machine Learning: From Theory to Algorithms' by Shai Shalev-Shwartz and Shai Ben-David
View full tip
Connecting Existing Things to ThingWorx Industrial Gateway for Anomaly Detection

In this video you will learn how to:
- Bind a property of an existing entity to the KEPServerEX data feed
- Create an Alert on that property and monitor its behavior

Updated link for access to this video: Connecting Existing Things to ThingWorx Industrial Gateway for Anomaly Detection
View full tip
Many users of our software have submitted cases regarding the third-party components and their functions within ThingWorx Analytics. This short blog post lists the main components used by our software and explains their functionality.

ThingWorx Analytics uses the following components in its default installation:

Apache ZooKeeper
- ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
- ThingWorx Analytics uses ZooKeeper as the gatekeeper for API calls and processes to the Application.
- Component homepage: https://zookeeper.apache.org/

Apache Tomcat
- Apache Tomcat is an open source implementation of the Java Servlet, JavaServer Pages, Java Expression Language, and Java WebSocket technologies.
- ThingWorx Analytics uses Tomcat to handle web services and API communications. This enables the use of ThingWorx Foundation (Core) mashups with ThingWorx Analytics Server.
- Component homepage: http://tomcat.apache.org/

PostgreSQL Server
- PostgreSQL is an open source object-relational database system.
- ThingWorx Analytics uses PostgreSQL Server to store analytical results for later retrieval.
- Component homepage: https://www.postgresql.org/
View full tip
Retraining the Model in ThingWorx Analytics

When using ThingWorx Analytics products to build prediction models, it is not enough to end up with models that are a technical success; the purpose is to ultimately have models that are a business success. What the user wants is models that remain reliable and accurate in a potentially changing production environment. When your environment changes, the model that you have relied on might no longer provide the same quality of results, hence the need to retrain your model.

Types of models to be retrained:
There are currently two types of models created with ThingWorx Analytics:
- Predictive models
- Anomaly detection models
Each of these models may require retraining depending on the context in which it is created and used.

When to retrain your model:

Predictive models: The main trigger for retraining is a change in the production environment that results in a change in the collected dataset. This could be caused by several factors:
- An overall change in the business objective, including a change in the granularity at which the dataset is used. An example, for a company HR dataset, would be moving from making predictions at a department level to making predictions at an employee level.
- The addition of new features to the dataset, or of new values in existing features that did not figure among the values of the training dataset. This type of change in the dataset requires retraining the model.
- The emergence of new trends in the marketplace, which appear in the generated datasets. This can be detected through degradation of the results provided by the existing prediction models.

Anomaly detection models: The need to retrain originates mainly from a change in what is considered the normal behavior of a monitored property. This could be caused by the following factors:
- A change in the context in which the property values are measured and monitored. An example is monitoring the traffic in a street on working weekdays while excluding weekends, and then adding the weekend days to the monitored behavior. Here the change in traffic is normal, but it would be detected as an anomaly unless the model is retrained.
- A change in the thresholds of values accepted as normal for a certain property. For example, consider temperatures measured on a running device: if the device never ran at full power when the model was built, then once it starts running at full power the temperature rises beyond the usual threshold, and the model needs to be retrained to include the new normal temperatures.
- The property values used when the model was trained did not represent the property's normal state. For example, the temperature of an engine was measured in a turned-off state, when we are actually trying to build a model that detects temperature anomalies on a running engine.

This is not an exhaustive list of the reasons that would require a predictive or anomaly detection model to be retrained. As a general rule of thumb, if the model starts delivering results that are below expectations, or if the business context for the model is no longer valid, it is a wise decision to retrain the analytics model.
View full tip
Behavior of ThingWorx Analytics Anomaly Detection with Data Gaps

In ThingWorx Analytics, anomaly detection is performed through the ThingWatcher API framework. ThingWatcher observes the data from an edge device, learns what the data stream should look like, and then monitors for any unexpected sequences within the incoming data stream. Ideally, for this process to work properly, there should be no data gaps. However, data gaps do occur; this blog describes how ThingWatcher deals with them in order to maintain high anomaly detection performance.

Data gaps and the phases they affect:
In anomaly detection, ThingWatcher goes through three consecutive phases: Initializing, Calibrating, and Monitoring. The Initializing and Monitoring phases both involve collecting or monitoring streamed data, so these two phases are sensitive to gaps in the data stream. The Calibrating phase uses already-collected data to create the anomaly detection model, so it is not directly affected by data gaps.

Dealing with long and short data gaps:

Initializing phase: During this phase, data is collected, and as part of the collection process the sampling rate is imputed. When short data gaps occur, they are interpolated so that there are no missing values. However, long gaps can also occur. A data gap is considered long when more than three data points are missing, i.e., the gap exceeds three times the sampling rate: if the timestamp on a data point is greater than the previous timestamp by more than three times the sampling rate, that is a long gap (see the sketch at the end of this post). If a long gap occurs, ThingWatcher restarts the data collection process, since long data gaps are not acceptable. The data recollection process can be initiated up to three times when there are long gaps; if the gaps persist, collection fails and the data source is no longer considered reliable.

Monitoring phase: In this phase, the data stream is monitored to detect any unexpected behavior. If a short time gap occurs between the previous and the current TimedValue data points, the lookback buffer is cleared; ThingWatcher re-enters the Buffering state and remains there until the lookback window buffer is completely filled. For more information on the functionality of ThingWatcher, please refer to the ThingWatcher Deployment Guide: https://support.ptc.com/WCMS/files/173109/en/ThingWatcher-Deployment-Guide-8.0.pdf. If the gaps are long and exceed three times the sampling rate, data filling is no longer a valid solution, and data collection restarts.

It is important to note that imputed values decrease the accuracy of the anomaly detection model, so data monitored by ThingWatcher should arrive at regular intervals. In general, persistent data gaps should be avoided by ensuring that data is streamed such that the timestamps increase in regular increments and any gaps that exist are incidental and small.
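The long-gap rule above reduces to a simple timestamp comparison. A minimal sketch (the function and variable names are invented for illustration):

// A gap is "long" when the spacing between consecutive data points
// exceeds three times the sampling rate, per the rule above.
function isLongGap(prevTimestampMs, currTimestampMs, samplingRateMs) {
    return (currTimestampMs - prevTimestampMs) > 3 * samplingRateMs;
}

// Example: 1-second sampling rate, 5-second gap -> long gap.
console.log(isLongGap(0, 5000, 1000)); // true
console.log(isLongGap(0, 2000, 1000)); // false (short gap, can be interpolated)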
View full tip
In this video we go through the steps to install ThingWorx Analytics Server 8.0.

Updated link for access to this video: ThingWorx Analytics Server 8.0
View full tip
This blog addresses a few points related to scoring with ThingWorx Analytics. In particular, it clarifies the concepts behind the values of the scores that are generated when performing a scoring job.

Scoring outputs:

It is important to note that when training an analytics model, the method is to create a generalizable model from a relatively small training dataset. By its nature, we expect the training process to see a limited subset, not an exhaustive list, of all possible values for many constraints, especially for reasons of time and practicality. As such, these generalized models are expected to handle unseen data in the form of new combinations, or values outside previously observed ranges (more on this below).

One common way to see scores that exceed the ranges observed in training, assuming the goals are continuous, is to use prescriptive scoring. Prescriptive scoring attempts to find optimal values for lever (meaning tunable) features in order to maximize or minimize score values. See the prescriptive scoring documentation and functionality for more information.

Min/max constraints: these are constraints placed upon the inputs for training and the expected inputs for scoring.

- For training: if these ranges were provided as part of the upload process, training will raise exceptions regarding invalid data. However, if the ranges are not provided, they will be inferred from the data, and as such training will not see values outside the observed ranges.
- For scoring: validation of the ranges is only performed on the inputs, not the outputs. It is very important to note that the handling of these "constraints" depends upon the data type. For categorical data (e.g., colors) and ordinal data (e.g., shirt sizes), the constraints are strict, and data that was not observed in training will raise exceptions during scoring. However, for continuous values (e.g., temperature ranges), these constraints are more informational in nature: for predictive scoring, our code will accept records with values outside those ranges. The rule of thumb is that values slightly outside these ranges are acceptable, and that as values stray farther from the ranges, the accuracy of the model degrades very quickly. For prescriptive scoring, these constraints determine the acceptable ranges of values to try when determining the optimal values; values outside these constraints will NOT be tried. (A conceptual sketch of this per-type behavior follows.)
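As a conceptual sketch of the per-type behavior described above (strict for categorical and ordinal features, informational for continuous ones), consider the following. It is illustrative only and is not the actual validation code in ThingWorx Analytics:

// Sketch: scoring-input validation per feature type (illustrative).
function validateFeature(featureType, value, constraint) {
    if (featureType === "categorical" || featureType === "ordinal") {
        // Strict: values unseen in training raise an exception at scoring time.
        if (constraint.allowedValues.indexOf(value) === -1) {
            throw new Error("Value '" + value + "' was not observed in training");
        }
    } else if (featureType === "continuous") {
        // Informational: out-of-range values are accepted, but model
        // accuracy degrades quickly as values stray farther from the range.
        if (value < constraint.min || value > constraint.max) {
            console.warn("Value " + value + " outside observed range [" +
                         constraint.min + ", " + constraint.max + "]");
        }
    }
    return value;
}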
View full tip
This blog is about decision trees, and it is aimed at providing the Analytics user with additional information about our default algorithm, the decision tree. More specifically, we will clarify which structures build the decision tree, understand the purpose of these structures, and finally look at a few pros and cons of applying decision trees.

A decision tree is a great tool to help us make good decisions based on a huge amount of data. The algorithm maps information provided by the dataset and constructs a tree to predict our goal.

Classification and regression trees are the structures behind the decision tree; when we refer to "decision tree", we collectively include classification and regression as being part of it. But what is the difference between classification and regression?
1) Classification is used for predicting dependent categorical variables. For example, if you need to predict what type of failure occurs with a machine, or what type of car a person will buy, you would use a classification tree.
2) Regression is used for dependent continuous numerical variables. For example, if you want to predict the amount of sugar in a person's blood, or the price of oil per gallon in 2020, regression is used for the prediction.

Regression addresses predictions where the value can be continuous, while a classification tree predicts the correct label/type for the class. [Image: example of a classification tree] Keep in mind that it is the goal variable that determines the type of decision tree needed. (A toy classification tree, written as code, follows at the end of this post.)

Decision trees are a powerful tool for prediction:
- Easy to understand and interpret.
- Help us make the best decisions on the basis of existing information.
- Can handle missing values without needing to resort to imputation.

Considerations: As with all analytics models, the decision tree also has limitations users must be aware of:
- Decision trees can be subject to overfitting and underfitting, particularly when using a small dataset.
- High correlation between different variables may cause deceptively high model accuracy.
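A decision tree's prediction logic is just a cascade of feature tests. The toy classification tree below (with invented features, thresholds, and labels, purely for illustration) shows that structure, using this post's own machine-failure example:

// Toy classification tree: predicts a machine failure type from
// sensor readings. Splits and labels are invented for illustration.
function classifyFailure(record) {
    if (record.temperature > 90) {
        return record.vibration > 0.7 ? "bearing failure" : "overheating";
    } else {
        return record.pressure < 5 ? "seal leak" : "no failure";
    }
}

console.log(classifyFailure({ temperature: 95, vibration: 0.9, pressure: 7 }));
// -> "bearing failure"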
View full tip
Steps

1. Get the IP address of the ThingWorx Analytics Server: type ip a. (Your IP address may be different from the one shown in the picture above.)
2. Enter that IP address into the desired web browser.
3. Add the port number of the server to the end of the IP address, with a colon ":" between the end of the IP address and the start of the port number (e.g., http://<IP address>:8080). The default port number is 8080, though it could be different in some cases, depending on whether it was configured differently during installation.
4. Hit Enter and the main page will load.
View full tip
Signals indicate the predictive strength or weakness of specific features on the goal variable. Use Signals to explore which features are important to predicting outcomes, and which are not. Note: Please be aware that the video states that a model has to be created before Signals can run, but this is no longer the case for version 8.1.   Updated Link for access to this video:  Create Signals In ThingWorx Analytics Builder
View full tip