
IoT Tips

The following Expert Session videos are now available for viewing within the ThingWorx Community:

ThingWorx Analytics Installation - This Expert Session walks you through the complete installation of ThingWorx Analytics, from the prerequisites to confirming that the installation is successful, and all steps in between. The first half of the video gives a breakdown of the components and the installation process; the second half is an actual demo of the installation.

ThingWorx Analytics API Overview - This Expert Session is designed to help beginners get up and running with ThingWorx Analytics. It covers basic concepts such as what APIs are and how to configure the metadata file, and includes a live demo that shows how to interact with and use ThingWorx Analytics in real time. It is also useful for experienced users who need a refresher course.

Decision Tree, ThingWorx Analytics Builder - This Expert Session reviews the concept of “Decision Trees” and the functionality available in ThingWorx Analytics Builder. First, you will learn how to create and upload a dataset in ThingWorx Analytics Builder. After that, it shows you how to train a model and score against the model that was just generated. It then goes into detail on how the "Decision Tree" prediction learner operates and classifies inputs.

Use Case Identification - This Expert Session goes over ways to identify and develop a successful use case for ThingWorx Analytics. The example use case presented here is employee retention in a fictional company, with the goal of maximizing employee retention. This presentation provides the fundamentals you need to develop your own ThingWorx Analytics use cases from the ground up.

ThingWorx Analytics Signals - This Expert Session provides an in-depth explanation of how Signals are calculated in ThingWorx Analytics, what purpose they serve, and why we use them. Some basic mathematical concepts are discussed so viewers have a better idea of how ThingWorx Analytics operates behind the scenes.

Related Links: For more information, you can visit a new space dedicated to these helpful technical videos. Additional Expert Sessions will be highlighted here in the ThingWorx Community every few weeks. Visit the Online Success Guide to access additional information about ThingWorx training and services.
View full tip
ThingWorx's JDBC extensions - the Relational Database Management System (RDBMS) and JDBC Extensions - allow ThingWorx to connect to a variety of different databases. With that comes a natural question: how, and what sort of, SQL statements can be executed via these extensions?

Note: Importing the JDBC extensions (i.e. the RDBMS and JDBC Extensions) creates a Database Template for that particular database. If you are working with the RDBMS extension, a Template for the corresponding database is created with a similar name, e.g. importing the RDBMS Extension for Oracle 12 creates a Template named OracleDBServer12. Importing a JDBC driver using the JDBC extension creates a Template named after the JDBC driver used, or a custom name can be given. The following examples and SQL statements adhere to Oracle's SQL*Plus standard; however, they can easily be adapted to the type of RDBMS you intend to work with.

Topics:
How to create a SQL service in a ThingWorx entity
Types of SQL statements
Examples of SQL service usage and some extended use cases / examples

How to create a SQL service in ThingWorx
1. Navigate to the Thing implementing the Database Template, e.g. OracleDBServer12
2. Click on the Services section under the Entity Information and click Add My Service
3. A new service creation section comes up; change the Service type from JavaScript (the default selection) to either SQL (Query) or SQL (Command), depending on the type of SQL you are creating under this particular service
4. Here's a quick example of creating a SQL (Query) service which takes a name as input for a select * SQL statement, i.e. it returns the complete set of rows and columns from any given table on which the user has Select access
Note: The Base Type defaults to InfoTable when creating a SQL (Query) service, and the number of returned rows is restricted to 500. Therefore, if the table contains more than 500 rows, ensure you change the Max Rows parameter
5. Example of creating a SQL (Command) service that deletes all the rows from a database table
Note: The Base Type defaults to Number when using SQL (Command)

Additional information: When creating a SQL service, apart from changing the Service Info and Inputs/Outputs, a third section, Tables/Columns, allows users to explore the tables and their respective columns that are part of that particular user's schema - meaning the objects on which the user has select rights in the database.

Types of SQL statements
This is not an exhaustive list; rather, it contains the most commonly used types of SQL statements:
1. Data Definition Language (DDL)
   a. Create, Alter and Drop schema objects
   b. Grant and Revoke privileges and roles
2. Data Manipulation Language (DML)
   a. Insert
   b. Delete
   c. Select

Examples of SQL service usage and some extended use cases / examples
1. Data Definition Language (DDL)
   a. Create statement
   b. Alter statements
   c. Drop statement
   d. Flashback statements (Oracle specific)
   e. Grant statement
   f. Rename statement
2. Data Manipulation Language (DML)
   a. Insert statement
   b. Delete statement
   c. Select statements

Use cases - Case 1: Backing up a DataTable
DataTable objects in ThingWorx are intended for quick lookup of data and are most performant up to ~100K rows. Exceeding 100K rows in a DataTable makes it highly susceptible to performance issues when querying or writing to it. Unless there is sharding on the persistence provider, or multiple persistence providers are used, JDBC connectivity to external data stores like RDBMS systems can help keep up with a growing number of rows in DataTables. RDBMS tables are more than capable of storing a very large number of rows without performance being taxed. The JDBC extension can be used to do just that, for a use case requiring a backup of a DataTable - or any ThingWorx data storage object, for that matter. Here's one quick example using the Insert SQL service shown above to back up an entire DataTable to an Oracle DB table. The following ThingWorx JavaScript service wraps the InsertIntoBULKDATAINSERTDT SQL service:

// result: INTEGER
// getting the total row count in the DataTable
var totalCount = Things["BulkInsertDT"].GetDataTableEntryCount();

var params = {
    maxItems: totalCount /* NUMBER */
};

// result: INFOTABLE
// DataTable service to fetch all the rows from it
var allData = Things["BulkInsertDT"].GetDataTableEntries(params);

// looping over the result fetched above to get all the rows for insertion
for (var i = 0; i < totalCount; i++) {
    var row = allData.getRow(i);

    // mapping the data for insert
    var insertParams = {
        LongCol3: row.LongCol3 /* LONG */,
        numcol1: row.NumCol1 /* NUMBER */,
        StringCol2: row.StringCol2 /* STRING */,
        IntCol4: row.IntCol4 /* INTEGER */
    };

    try {
        // result: NUMBER
        // calling the SQL service InsertIntoBULKDATAINSERTDT created under a DB Thing called OracleDBThingNew
        var result = Things["OracleDBThingNew"].InsertIntoBULKDATAINSERTDT(insertParams);
    } catch (err) {
        logger.info("Failed to insert the values: " + err);
    }
}
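As a rough sketch of the other direction - reading from the database - a wrapper service might call a generated SQL (Query) service and walk the returned InfoTable. The Thing and service names below (OracleDBThingNew, GetAllRowsFromTable) are illustrative placeholders, not entities created by the steps above:

// call a SQL (Query) service defined on the DB Thing (hypothetical service name)
var rows = Things["OracleDBThingNew"].GetAllRowsFromTable({
    tableName: "BULKDATAINSERTDT" /* STRING */
});

// rows is an InfoTable; iterate it the same way as in the backup example above
for (var i = 0; i < rows.getRowCount(); i++) {
    var row = rows.getRow(i);
    logger.debug("StringCol2 = " + row.StringCol2);
}

// result of this wrapper service: INFOTABLE
var result = rows;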
View full tip
We are pleased to announce that the Expert Sessions video series is now available in the ThingWorx Community. We are kicking off this availability with a new space dedicated to these helpful technical videos. In the first round of videos, we are highlighting two ThingWorx Foundation videos that are designed to provide foundational knowledge to get you up and running on the ThingWorx IoT platform. New Expert Sessions Available Now ThingWorx Foundation - Installation is an introduction to installing the ThingWorx platform. The video includes information on the environment, prerequisites, and configuration steps when installing ThingWorx, and includes walkthroughs of installing with H2 and PostgreSQL databases, an introduction and demonstration of the Linux installation script, solutions to common installation problems and more. ThingWorx Foundation - Scalability talks about platform sizing with dependency on the type of environment and correlated scalability options. The video educates you about federation and high availability as well as provides visual diagrams to understand the architecture of different ThingWorx solutions. What is an Expert Session? Expert Sessions are focused, technical webcasts (both recorded and live) where PTC subject matter experts share knowledge and best practices on topics related to the design, development, deployment and operation of PTC software. Expert Sessions are designed using five categories: Get Started, Design, Develop, Deploy, and Operate. Additional Expert Sessions will be highlighted here in the ThingWorx Community every few weeks. Visit the Online Success Guide to access our Expert Session videos at any time as well as additional information about ThingWorx training and services.
View full tip
First we need to understand the terms below.

Quantitative variable: A quantitative variable is naturally measured as a number for which meaningful arithmetic operations make sense. Examples: height, age, crop yield, GPA, salary, temperature, area, air pollution index (measured in parts per million), etc.

Categorical variable: Any variable that is not quantitative is categorical. Categorical variables take a value that is one of several possible categories. As naturally measured, categorical variables have no numerical meaning. Examples: hair color, gender, field of study, college attended, political affiliation, status of disease infection.

Ordinal variables: An ordinal variable is a categorical variable for which the possible values are ordered. Ordinal variables can be considered “in between” categorical and quantitative variables. Example: educational level might be categorized as
    1: Elementary school education
    2: High school graduate
    3: Some college
    4: College graduate
    5: Graduate degree
•    In this example (and for many ordinal variables), the quantitative differences between the categories are uneven, even though the differences between the labels are the same (e.g., the difference between 1 and 2 is four years, whereas the difference between 2 and 3 could be anything from part of a year to several years).
•    Thus it does not make sense to take a mean of the values.
•    Common mistake: treating ordinal variables like quantitative variables without thinking about whether this is appropriate in the particular situation at hand.

Ordinal regression: In statistics, ordinal regression (also called "ordinal classification") is a type of regression analysis used for predicting an ordinal variable. The Ordinal Regression procedure allows you to build models, generate predictions, and evaluate the importance of various predictor variables in cases where the dependent (target) variable is ordinal in nature.

Ordinal dependents and linear regression: When you are trying to predict ordinal responses, the usual linear regression models don't work very well. Those methods can work only by assuming that the outcome (dependent) variable is measured on an interval scale. Because this is not true for ordinal outcome variables, the simplifying assumptions on which linear regression relies are not satisfied, and thus the regression model may not accurately reflect the relationships in the data. In particular, linear regression is sensitive to the way you define categories of the target variable. With an ordinal variable, the important thing is the ordering of categories. So, if you collapse two adjacent categories into one larger category, you are making only a small change, and models built using the old and new categorizations should be very similar. Unfortunately, because linear regression is sensitive to the categorization used, a model built before merging categories could be quite different from one built after.

Below are some examples of ordered logistic regression:

Example 1: A marketing research firm wants to investigate what factors influence the size of soda (small, medium, large or extra large) that people order at a fast-food chain. These factors may include what type of sandwich is ordered (burger or chicken), whether or not fries are also ordered, and the age of the consumer. While the outcome variable, size of soda, is obviously ordered, the difference between the various sizes is not consistent.
The difference between small and medium is 10 ounces, between medium and large 8, and between large and extra large 12.

Example 2: A researcher is interested in what factors influence medaling in Olympic swimming. Relevant predictors include training hours, diet, age, and the popularity of swimming in the athlete’s home country. The researcher believes that the distance between gold and silver is larger than the distance between silver and bronze.

Example 3: A study looks at factors that influence the decision of whether to apply to graduate school. College juniors are asked if they are unlikely, somewhat likely, or very likely to apply to graduate school. Hence, our outcome variable has three categories. Data on parental educational status, whether the undergraduate institution is public or private, and current GPA is also collected. The researchers have reason to believe that the “distances” between these three points are not equal. For example, the “distance” between “unlikely” and “somewhat likely” may be shorter than the distance between “somewhat likely” and “very likely”.

How to use and get results with Ordinal Regression: click this link for the PDF. PDF source: http://www.norusis.com
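For readers who want the model written out, a common formulation of ordinal (ordered logistic) regression is the proportional odds model sketched below, where j indexes the ordered categories, the theta_j are the threshold (cut-point) parameters, and beta the coefficients:

\[
P(Y \le j \mid x) = \frac{1}{1 + e^{-(\theta_j - x^{\top}\beta)}}, \qquad j = 1, \dots, J-1,
\]

so that each cumulative log-odds is linear in the predictors:

\[
\log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \theta_j - x^{\top}\beta .
\]

The same slope vector beta is shared across all thresholds, which is exactly the "ordering matters, but spacing does not" assumption described in the examples above.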
View full tip
Excited to announce ThingWorx 8.1 is officially available in our Support Portal. Please find the release notes below. The following feature enhancements and bug fixes exist in ThingWorx 8.1.0:

Enhancements

Platform:
• Metrics Reporting is enabled by default, which allows usage, performance, and diagnostics data to be sent to a PTC server daily. For more information about this setting, see Platform Subsystem.
• You can add and configure Notifications in New Composer. For more information, see Adding Notifications.
• License files are now instance specific.
• Security for application keys has been enhanced. The default expiration date has been changed to 24 hours if it is not explicitly set.
• Additional capability has been added to New Composer.
• Improvements to anomaly detection accuracy have been added. As a result, data collection restart is no longer necessary after a long gap, and the H2 database that installs with the Training Microservice is stored in memory, not as a persisted file. For more information, see Anomaly Detection.
• You can now load configuration/project files from KEPServerEX instances.

Bug Fixes

Platform
• Fixed an issue where Tomcat failed to start when using SAP HANA. TW-22191
• Fixed an issue that was preventing ThingWorx from starting after the File Transfer Subsystem was disabled. TW-22177
• Fixed an issue where the change history of a Mashup was automatically updated even if no changes were made. TW-22114
• Fixed an issue that was preventing the ServiceInvokeCompleted event from working after performing an in-place upgrade. TW-21784
• Fixed an issue where alert notifications were not being sent to recipients after removing a recipient. TW-21585
• Fixed an issue where the Add button in the Services page did not display after creating a Data Table. TW-21518
• Fixed an issue with alert notifications for entities containing periods in the name. TW-21347
• Fixed an issue that was causing connected assets to display as disconnected in ThingWorx Utilities. UTL-4698
• Fixed an issue where data bind was lost after changing Read-Only settings to Read/Write in Composer. TW-23506
• Fixed an issue that was causing a MetricsReportingTask error after enabling ThingWorx Performance Advisor. TW-21141
• Fixed an issue with the ThingWorx authentication window when specifying the site while using FF and IE. TW-21271

Mashup Builder
• Fixed an issue with the List widget that was causing incorrect tooltips to display. TW-24012 TW-23961 TW-23038
• Fixed an issue where Chrome was automatically retrying Remote Service calls when a timeout occurred. TW-23828
• Fixed an issue after restarting the ThingWorx web app where the Runtime or Composer’s index.html were missing. TW-23984
• Fixed an issue where closing a modal dialogue did not remove the disabled state from an element. TW-11217
• Fixed an issue when creating a popup with the Navigation widget. The tab sequence of the popup was dependent on the original mashup. TW-11151
• Fixed an issue with localized values of data columns when using the Data Filter widget. TW-11059

Extensions
• Fixed an issue where CSV parser extension import failed if the text file that was being imported did not include a new line character at the end of the last line of text. TW-21863
• Fixed an issue with the Advanced Grid widget where the Reset button was not localized. TW-21457
• Fixed an issue with the jQuery library used by the WebSocketTunnel_ExtensionPackage widget.
Note: If you are using the WebSocketTunnel_ExtensionPackage, you will need to upgrade to version 3.0.2 if you are upgrading to ThingWorx 8.1.0. To upgrade the extension, go to the Web Sockets Tunnel Widget and Library page of the ThingWorx Marketplace. TW-24465

End of Life Information
SQUEAL functionality has been discontinued in 8.1.

System requirements: http://support.ptc.com/WCMS/files/173583/en/ThingWorx_Core_8.1_System_Requirements_1.0.pdf
Installation guide: http://support.ptc.com/WCMS/files/173600/en/Installing_ThingWorx_8.1_1.0_.pdf
ThingWorx 8.1 Cross Platform Highlights: Security
ThingWorx 8.1 Cross Platform Highlights and Q&A: Licensing
View full tip
About

This is part of a ThingBerry related blog post series. ThingBerry is ThingWorx installed on a Raspberry Pi, which can be used for portable demonstrations without the need to utilize e.g. customer networks. Instead, the ThingBerry provides its own custom WIFI hotspot and allows Things to connect and send / receive demo data on a small scale.

In this particular blog post we'll discuss how to connect an ESP8266 module to the ThingBerry WIFI hotspot and send data from a DHT-11 sensor via the MQTT protocol.

As the ThingBerry is a highly unsupported environment for ThingWorx, please see this blog post for all related warnings.

Install an MQTT broker on the ThingBerry

To install mosquitto as an MQTT broker, log in to the ThingBerry and run

sudo apt-get install mosquitto

This will provide a basic broker installation, which is good enough for this example. MQTT clients (including ThingWorx) will connect to this broker to exchange messages. There will be no added security, like encrypted traffic, shown in this example; it's however good practice to secure MQTT broker / client connections.

While the ESP8266 module is publishing information, ThingWorx will subscribe to the corresponding topics to update its internal property values with what is sent by the ESP8266 module.

For more information on MQTT, how to configure it for ThingWorx, or more security-relevant information, also see

https://community.thingworx.com/message/5063#5063
https://community.thingworx.com/community/developers/blog/2016/08/08/securing-mqtt-connection-to-thingworx-platform?sr=tcontent

Configure the ESP8266

There are already many instructions on the web on how to initially set up the ESP8266 and use it with the Arduino IDE. I'll therefore just refer to Google, which covers the topic more extensively than I ever could.

All coding in this example is done in the Arduino IDE and is pushed to the ESP8266 (NodeMCU) via USB. For this you might need to install a CH340g USB driver for the NodeMCU.

In the Arduino IDE under Tools, I have set my environment to

Board: NodeMCU 1.0 (ESP-12E Module)
CPU Frequency: 80 MHz
Flash Size: 4M (3M SPIFFS)
Upload Speed: 115200
Port: COM3

Under Sketch > Include Library > Manage Libraries add / install the following libraries:

DHT sensor library by Adafruit
Adafruit Unified Sensor by Adafruit
PubSubClient by Nick O'Leary

These bring in the libraries necessary to read data from the DHT-11 sensor and to configure the ESP8266 as an MQTT client.

Wiring the DHT-11 sensor

The following image shows the PINs on the ESP8266.

I'm using a DHT-11 sensor with cables included and already fixed to a board with 3 PINs. In case you're using a different version, there might be additional components and wiring required, like a resistor etc. Google might help here as well.

Ensure that neither board nor sensor are plugged in, and the ESP8266 is powered off.

To hook the sensor up to the ESP8266, join

( - ) to GND
( + ) to 3.3V
(out) to D3

After all the connections are made, connect the ESP8266 via USB to a computer / laptop with the Arduino IDE configured.

Coding

In the Arduino IDE use the following code - adjust the WIFI settings and the MQTT broker configuration. Ensure you rename the ESP_xx name / topic to something more meaningful, e.g. a specific device name (or just leave it as is if in doubt).

Use the ssid and wpa_passphrase from the hostapd.conf used to configure the ThingBerry as a WIFI hotspot.
Copy & paste the code below into the Arduino IDE, verify it and upload it to the ESP8266.

While searching for a WIFI connection, the device's blue LED will blink. A successful connection to the broker and publishing the values will result in a static blue LED. In case the LED is off, the connection to the broker is lost or messages cannot be published.

For troubleshooting, use the Serial Monitor function (at 115200 baud) in the Arduino IDE. In case sensor data cannot be read, but the wiring is correct and the code is addressing the correct PIN, verify the sensor is indeed working. It took me a long time to figure out that the first sensor I used was a defective device.

The current configuration sends updates every 10 seconds - longer intervals might make more sense, but can trigger a timeout for the MQTT broker. In this case the program will re-connect automatically and log corresponding messages in the Serial Monitor. This might seem like an error, but is indeed intended behavior of the code and the MQTT broker.

Configure the MQTT Thing in ThingWorx

Create a new Thing in ThingWorx based on the MQTT Template. Add two properties:

temperature
humidity

Both are set to persistent and logged, with Data Change Type set to ALWAYS. Also configure a Value Stream to log a history of values.

In the configuration, add two more subscriptions. Activate the "subscribe" checkbox and map name (local property) to topic (MQTT topic), e.g.

name = temperature; topic = ESP_xx/temp
name = humidity; topic = ESP_xx/hum

Ensure the correct servernames, ports etc. are configured (an empty servername will use the localhost).

Save the configuration. Property values should now be updated from the MQTT broker, depending on what the device is sending.

Code

#include "DHT.h"
#include "PubSubClient.h"
#include "ESP8266WiFi.h"

/*
 * Configure parameters for sensor and network / MQTT connections
 */

// setup DHT 11 pin and sensor

#define DHTPin D3
#define DHTTYPE DHT11

// setup WiFi credentials

#define WLAN_SSID "mySSID"
#define WLAN_PASS "WIFIpassword"

// setup MQTT

#define MQTTBROKER "mqttbrokerhostname"
#define MQTTPORT 1883

// setup built-in blue LED

#define LED 2

/*
 * ============================================================
 *
 * DO NOT CHANGE ANYTHING BELOW
 * (unless you know what you're doing)
 */

// initiate DHT

DHT dht(DHTPin, DHTTYPE);

// initiate MQTT client

WiFiClient wifiClient;
PubSubClient client(MQTTBROKER, MQTTPORT, wifiClient);

/*
 * setup
 */

void setup() {

  // switch off internal LED

  pinMode(LED, OUTPUT);
  digitalWrite(LED, HIGH);

  // start serial monitor

  Serial.begin(115200);

  // start DHT

  dht.begin();

  // start WiFi

  WiFi.begin(WLAN_SSID, WLAN_PASS);
}

/*
 * the loop
 */

void loop() {

  // while not connected to WiFi, print "."
  // after connection exit the loop
  // blink LED while having no WiFi signal

  boolean wifiReconnect = false;

  while (WiFi.status() != WL_CONNECTED) {
    digitalWrite(LED, LOW);
    delay(200);
    Serial.print(".");
    digitalWrite(LED, HIGH);
    delay(300);
    wifiReconnect = true;
  }

  // if WiFi has reconnected, print new connection information and turn on LED

  if (wifiReconnect == true) {

    // print connection information and local IP address, mac address

    Serial.println();
    Serial.println("WiFi connected");
    Serial.println(WiFi.localIP());
    Serial.println(WiFi.macAddress());
    Serial.println();

    // turn on built-in LED to indicate successful WiFi connection

    digitalWrite(LED, LOW);
  }

  // if MQTT client is not connected, connect again
  // turn on built-in LED to indicate a successful connection

  if (!client.connected()) {

    Serial.println("Disconnected from MQTT server... trying to connect");

    if (client.connect("ESP_xx")) {
      Serial.println("Connected to MQTT server");
      Serial.println("Topic = ESP_xx");
      digitalWrite(LED, LOW);
    } else {
      Serial.println("MQTT connection failed");
      digitalWrite(LED, HIGH);
    }

    Serial.println();
  }

  // read temperature and humidity from sensor

  float t = dht.readTemperature();
  float h = dht.readHumidity();

  if (isnan(t) || isnan(h)) {

    // if temperature or humidity is not a number, print error

    Serial.println("Failed retrieving data from DHT sensor");

  } else {

    // print temperature and humidity

    Serial.print(t);
    Serial.print("° - ");
    Serial.print(h);
    Serial.print("%");
    Serial.println();

    // only send values to MQTT broker, if client is connected

    if (client.connected()) {

      // boolean to check for errors during payload transfer

      bool isError = false;

      // create payload and publish values via MQTT client
      // use buffer to convert float to char*

      char buffer[10];

      dtostrf(t, 0, 0, buffer);

      if (client.publish("ESP_xx/temp", buffer)) {
        Serial.print("  published /temp  ");
      } else {
        Serial.print("  failed /temp  ");
        isError = true;
      }

      dtostrf(h, 0, 0, buffer);

      if (client.publish("ESP_xx/hum", buffer)) {
        Serial.print("  published /hum  ");
      } else {
        Serial.print("  failed /hum  ");
        isError = true;
      }

      Serial.println();

      // on error, turn off LED

      if (isError == true) {
        digitalWrite(LED, HIGH);
      } else {
        digitalWrite(LED, LOW);
      }
    }
  }

  // sleep for 10 seconds
  // if sleep > default mosquitto timeout: a reconnect is forced for each update-cycle

  delay(10000);
}
View full tip
The accuracy of a predictive model can be boosted in two ways: either by embracing feature engineering or by applying boosting algorithms straight away. There are multiple boosting algorithms like Gradient Boosting, XGBoost, AdaBoost, Gentle Boost etc. Every algorithm has its own underlying mathematics, and a slight variation is observed while applying them.

While working with boosting algorithms, we come across two frequently occurring buzzwords: bagging and boosting.
Bagging: An approach where you take random samples of data, build learning algorithms, and take simple means to find bagging probabilities.
Boosting: Boosting is similar; however, the selection of samples is made more intelligently. We subsequently give more and more weight to hard-to-classify observations.

Below are the default algorithms used in predictive models generated in ThingWorx Analytics:
Decision Tree
Gradient Boost
Linear Regression
Neural Net
Random Forest
Logistic Regression

Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function.

Let's begin with an easy example: Assume you are given a previous model M to improve on. Currently you observe that the model has an accuracy of 80% (by some metric). How do you go about improving it? One simple way is to build an entirely different model using a new set of input variables and trying better ensemble learners. On the contrary, we have a much simpler way to suggest. It goes like this:

Y = M(x) + error

What if we are able to see that the error is not white noise but has some correlation with the outcome (Y) value? What if we can develop a model on this error term?

error = G(x) + error2

Probably, we will see the error rate improve to a higher number, say 84%. Let's take another step and regress against error2:

error2 = H(x) + error3

Now we combine all of these together:

Y = M(x) + G(x) + H(x) + error3

This probably will have an accuracy of even more than 84%. What if we can find optimal weights for each of the three learners:

Y = alpha * M(x) + beta * G(x) + gamma * H(x) + error4

How Gradient Boosting Works:
1. Loss Function: The loss function used depends on the type of problem being solved. It must be differentiable, but many standard loss functions are supported and you can define your own. A benefit of the gradient boosting framework is that a new boosting algorithm does not have to be derived for each loss function that may want to be used; instead, it is a generic enough framework that any differentiable loss function can be used.
2. Weak Learner: Decision trees are used as the weak learner in gradient boosting. Specifically, regression trees are used that output real values for splits and whose output can be added together, allowing subsequent models' outputs to be added to "correct" the residuals in the predictions. Trees are constructed in a greedy manner, choosing the best split points based on purity scores like Gini or to minimize the loss.
3. Additive Model: Trees are added one at a time, and existing trees in the model are not changed. A gradient descent procedure is used to minimize the loss when adding trees. Here the weak learner sub-models are, more specifically, decision trees.
After calculating the loss, to perform the gradient descent procedure, we must add a tree to the model that reduces the loss. Improvements to Basic Gradient Boosting: 1. Tree Constraints: It is important that the weak learners have skill but remain weak. Below are some constraints that can be imposed on the construction of decision trees: Number of trees: ​Generally adding more trees to the model can be very slow to over fit. The advice is to keep adding trees until no further improvement is observed. Tree depth: Deeper trees are more complex trees and shorter trees are preferred. Generally, better results are seen with 4-8 levels. Number of nodes or number of leaves: like depth, this can constrain the size of the tree, but is not constrained to a symmetrical structure if other constraints are used. Number of observations per split: Imposes a minimum constraint on the amount of training data at a training node before a split can be considered Minimum improvement to loss: Is a constraint on the improvement of any split added to a tree. 2. Weighted Updates: The contribution of each tree to this sum can be weighted to slow down the learning by the algorithm. This weighting is called a shrinkage or a learning rate. "Each update is simply scaled by the value of the “learning rate parameter v". 3. Stochastic Gradient Boosting: At each iteration a sub sample of the training data is drawn at random (without replacement) from the full training data set. The randomly selected sub sample is then used, instead of the full sample, to fit the base learner. 4. Penalized Gradient Boosting: The additional regularization term helps to smooth the final learnt weights to avoid over-fitting. Intuitively, the regularized objective will tend to select a model employing simple and predictive functions.
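To make the stage-wise idea above concrete, the additive update with shrinkage (the learning rate) is commonly written as:

\[
F_m(x) = F_{m-1}(x) + \nu \, h_m(x), \qquad 0 < \nu \le 1,
\]

where each new weak learner h_m is fit to the negative gradient (pseudo-residuals) of the loss evaluated at the current model F_{m-1}. Smaller values of the learning rate slow learning down and typically require more trees, which is exactly the trade-off described under "Weighted Updates" above.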
View full tip
A common issue seen when trying to deploy, design or scale up a ThingWorx application is performance. Slow response, delayed data and the application stopping have all been seen when a performance problem either slowly grows or suddenly pops up. There are some common themes when these occur, typically around the application model or design. Here are a few of the common problems and some thoughts on what to do about them or how to avoid them.

Service Execution
This covers a wide range of possibilities and is most commonly seen when trying to scale an application. Data access within a loop is one particular thing to avoid. Accessing data from a Thing, another service or a query may be fast when only testing it on 100 loops, but when the application grows and you have 1000, suddenly it's slow. Access all data in one query and use that as an in-memory reference. Writing data to a data store (Stream, DataTable or ValueStream) and then querying that same data in one service can cause problems as well. Run the query first, then use all the data you have in the service variables.

To troubleshoot service executions there are a few methods that can be used. Some of these will not be practical for a production system, since it is not always advisable to change code without testing first.
Use browser development tools to see the execution time of a service. This is especially helpful when a mashup is slow to load or respond. It allows you to quickly identify which of multiple services may be the issue.
Add logging in a service. Once a service is identified, adding simple logging points in the service can narrow down what code in the service causes the slowdown (it may be another service call). These logging statements show up in the script logs with time stamps (you can also log the current time with the logging statements); a minimal timing sketch is shown below.
Use the test button in Composer. This is a simple one, but if the service does not have many parameters (or has defaults) it's a fast and easy way to see how long a service takes to return.
When all else fails you can get thread dumps from the JVM. ThingWorx Support created an extension that assists with this. You can find it on the Marketplace with instructions on how to use it. You can manually examine the output files or open a ticket with support to allow them to assist. Just be careful of doing memory dumps; they are much larger, hard to analyze and take a lot of memory. https://marketplace.thingworx.com/tools/thingworx-support-tools

Queries
These of course are services too, but of a specific type. Accessing data in ThingWorx storage structures or from external sources seems fairly straightforward, but can be tricky when dealing with large data sets. When designing and dealing with internal platform storage, refer to this guide as a baseline to decide where to store data: Where Should I Store My Thingworx Data? NEVER store historical data in InfoTable properties. These are held in memory (even if they are persistent) and as they grow so will the JVM memory use, until the application runs out of it. We all know what happens then. Finally, one other note that causes occasional confusion: the setting on a custom or standard ThingWorx query service that limits the number of records returned. This is how many records are returned from the service at the end of processing, not how many are processed or loaded in memory. That number may be much higher and could cause the same types of issues.
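As a rough illustration of the logging approach mentioned above, here is a minimal sketch of timing instrumentation inside a ThingWorx service script. The Thing name is a placeholder, and the logging output lands in the Script log; this is a sketch, not a prescribed pattern:

// capture a start timestamp at the top of the service
var start = new Date();

// ... the work being measured, e.g. a query against another Thing (placeholder Thing name)
var data = Things["MyDataThing"].QueryDataTableEntries({ maxItems: 10000 });

// log the elapsed time; this shows up in the Script log with a timestamp
var elapsedMs = new Date() - start;
logger.info("QueryDataTableEntries took " + elapsedMs + " ms for " + data.getRowCount() + " rows");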
Subscriptions and Events
This is similar to services; however, there is an added element: frequency. Typical events are data changes and timers/schedulers. This again is often an issue only when scaling up the number of Things or the amount of data that needs to be referenced. A general reference on timers and schedulers can be found here; it also describes some of the event processing that takes place on the platform: Timers and Schedulers - Best Practice

For data change events, be very cautious about adding these to very rapidly changing property values. When a property is updating very quickly, for example two times each second, the subscription to that event must be able to complete in under 0.5 seconds to stay ahead of processing. Again, this may work for 5-10 Things with such properties but will not work with 500, due to resources, speed, and the need to briefly lock the value to get an accurate current read. In these cases any data processing should be done at the edge when possible (or in the originating system) and pushed to the platform in a separate property or service call. This allows for more parallel processing since it is de-centralized.

A good practice for allowing easier testing of this type of subscription code is to take all of the script/logic and move it to a service call. Then pass any of the needed event data as parameters to the service. This allows for easier debugging since the event does not need to fire to make the logic execute. In fact it can essentially be run standalone via the test button in Composer (a sketch of this pattern appears at the end of this post).

Mashup Performance
This one can be very tricky since additional browser elements and rendering come into play. Sometimes service execution is the root of the issue, as reviewed above; other times it is UI elements and design that cause the slowdown. The Repeater widget is a common culprit. The biggest thing to note here is that each repeater will need to render every element that is repeated, and all of the data and formatting for each of those widgets in the repeated mashup. So any complex mashup that is repeated many times may become slow to load. You can minimize this to a degree based on the Load/Unload setting of the widget and when the slowness is more acceptable (when loading or when scrolling). When a mashup is launched from Composer it comes with some debugging tools built in to see errors and execution. Using these with browser debug tools can be very helpful.

Scaling an Application
When initially modeling an application, scale must be considered from the start. It is a challenge (but not impossible) to modify an application after deployment or design to be very efficient. Many times new developers on the ThingWorx platform fall into what I call the .Net trap. Back when .Net was released, one of the quotes I recall hearing about its inefficiencies was "memory is cheap". It was more cost efficient to purchase and install more memory than to take extra development time to optimize memory use. This was absolutely true for installed applications where all of the code was compiled and stored on every system. Web based applications are not quite as forgiving, since most processing and execution is done on the single central web server. Keep this in mind especially when creating Shapes, Templates and Subscriptions. While you may be writing one piece of code, when this code is repeated on 1,000 Things they will all be in memory and all be executing this code in parallel.
You can quickly see how competition for resources, locks on databases, and clean access to in-memory structures can slow everything down (and just think when there are 10,000 pieces of that same code!!). Two specific things around this must be stated again (though they were covered in the sections above). First, data held in properties has fast access since it is in JVM memory. But this is held in memory for each individual Thing, so holding 5 MB of information in one Thing seems small, yet loading 10,000 Things means instant use of 50 GB of memory!! Second, the execution of a service. When 10 Things are running, a service execution takes 2 seconds. Slow, but not too bad, and it may not be too noticeable in the UI. Now 10,000 Things are competing for the same data structure and resources. I have seen execution time jump to 2 minutes or more. Aside from design, the best thing you can do is TEST on a scaled-up structure. If you will have 1,000 Things next year, test your application early at that level of deployment to help identify any potential bottlenecks early. Never assume more memory will alleviate the issue. Also do NOT test scale on your development system. This introduces edits, changes and other variables which can affect actual real-world results. Have a QA system set up that mirrors a production environment and simulate data and execution load. Additional suggestions are welcome in the comments, and this post will likely be updated as additional tools and platform updates appear.
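To illustrate the earlier suggestion of keeping subscription bodies thin, here is a minimal sketch. The Thing, service and property names are placeholders, and the exact fields available on eventData depend on the event type (this assumes a standard DataChange subscription):

// Subscription body (DataChange on a "temperature" property): delegate immediately,
// so the real logic can also be exercised from the service's test button in Composer
me.HandleTemperatureChange({
    newValue: eventData.newValue.value /* NUMBER */,
    oldValue: eventData.oldValue.value /* NUMBER */
});

// Service "HandleTemperatureChange" (inputs: newValue, oldValue) holds the actual logic
// and can be executed standalone with test values:
if (newValue > 100 && newValue > oldValue) {
    logger.warn(me.name + ": temperature rising past threshold: " + newValue);
}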
View full tip
This Expert Session walks you through the components involved in the ThingWorx Studio Augmented Reality environment, a detailed architecture, supported devices, and the available resources. The session provides insight into the inner workings and technical details of ThingWorx Studio.

For full-sized viewing, click on the YouTube link in the player controls.

Visit the Online Success Guide to access our Expert Session videos at any time, as well as additional information about ThingWorx training and services.
View full tip
There are four types of Analytics:

Prescriptive analytics: What should I do about it? Prescriptive analytics is about using data and analytics to improve decisions and therefore the effectiveness of actions. Prescriptive analytics is related to both Descriptive and Predictive analytics. While Descriptive analytics aims to provide insight into what has happened and Predictive analytics helps model and forecast what might happen, Prescriptive analytics seeks to determine the best solution or outcome among various choices, given the known parameters. “Any combination of analytics, math, experiments, simulation, and/or artificial intelligence used to improve the effectiveness of decisions made by humans or by decision logic embedded in applications.” These analytics go beyond descriptive and predictive analytics by recommending one or more possible courses of action. Essentially they predict multiple futures and allow companies to assess a number of possible outcomes based upon their actions. Prescriptive analytics uses a combination of techniques and tools such as business rules, algorithms, machine learning and computational modelling procedures. Prescriptive analytics can also suggest decision options for how to take advantage of a future opportunity or mitigate a future risk, and illustrate the implications of each decision option. In practice, prescriptive analytics can continually and automatically process new data to improve the accuracy of predictions and provide better decision options.

Prescriptive analytics can be used in two ways:

Inform decision logic with analytics: Decision logic needs data as an input to make the decision. The veracity and timeliness of data will ensure that the decision logic operates as expected. It doesn’t matter if the decision logic is that of a person or embedded in an application — in both cases, prescriptive analytics provides the input to the process. Prescriptive analytics can be as simple as aggregate analytics about how much a customer spent on products last month, or as sophisticated as a predictive model that predicts the next best offer to a customer. The decision logic may even include an optimization model to determine how much, if any, discount to offer to the customer.

Evolve decision logic: Decision logic must evolve to improve or maintain its effectiveness. In some cases, decision logic itself may be flawed or degrade over time. Measuring and analyzing the effectiveness or ineffectiveness of enterprise decisions allows developers to refine or redo decision logic to make it even better. It can be as simple as marketing managers reviewing email conversion rates and adjusting the decision logic to target an additional audience. Alternatively, it can be as sophisticated as embedding a machine learning model in the decision logic for an email marketing campaign to automatically adjust what content is sent to target audiences.

Different technologies of Prescriptive analytics to create action:

Search and knowledge discovery: Information leads to insights, and insights lead to knowledge. That knowledge enables employees to become smarter about the decisions they make for the benefit of the enterprise. Developers can embed search technology in decision logic to find knowledge used to make decisions in large pools of unstructured big data.

Simulation: Simulation imitates a real-world process or system over time using a computer model.
Because digital simulation relies on a model of the real world, the usefulness and accuracy of simulation to improve decisions depends a lot on the fidelity of the model. Simulation has long been used in multiple industries to test new ideas or how modifications will affect an existing process or system.

Mathematical optimization: Mathematical optimization is the process of finding the optimal solution to a problem that has numerically expressed constraints.

Machine learning: “Learning” means that the algorithms analyze sets of data to look for patterns and/or correlations that result in insights. Those insights can become deeper and more accurate as the algorithms analyze new data sets. The models created and continuously updated by machine learning can be used as input to decision logic or to improve the decision logic automatically.

Pragmatic AI: Enterprises can use AI to program machines to continuously learn from new information, build knowledge, and then use that knowledge to make decisions and interact with people and/or other machines.

Use of Prescriptive Analytics in ThingWorx Analytics:

Thing Optimizer: Thing Optimizer functionality provides the prescriptive scoring and optimization capabilities of ThingWorx Analytics. While predictive scoring allows you to make predictions about future outcomes, prescriptive scoring allows you to see how certain changes might affect future outcomes. After you have generated a prediction model (also called training a model), you can modify the prescriptive attributes in your data (those attributes marked as levers) to alter the predictions. The prescriptive scoring process evaluates each lever attribute and returns an optimal value for that feature, depending on whether you want to minimize or maximize the goal variable. Prescriptive scoring results include both an original score (the score before any lever attributes are changed) and an optimized score (the score after optimal values are applied to the lever attributes). In addition, for each attribute identified in your data as a lever, original and optimal values are included in the prescriptive scoring results.

How to Access Thing Optimizer Functionality: ThingWorx Analytics prescriptive scoring can only be accessed via the REST API Service. Using a REST client, you can access the Scoring service, which includes a series of API endpoints to submit scoring requests, retrieve results, list jobs, and more. This requires installation of the ThingWorx Analytics Server.

How to avoid mistakes - below are some common mistakes made while doing Prescriptive analytics:
Starting digital analytics without a clear goal
Ignoring core metrics
Choosing overkill analytics tools
Creating beautiful reports with little business value
Failing to detect tracking errors

Image source: Wikipedia, Content: go.forrester.com (partially)
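As a very rough sketch of what submitting a scoring request from a ThingWorx service script could look like, using the generic ContentLoaderFunctions resource: the endpoint path, port and request body below are placeholders only - check the ThingWorx Analytics API documentation for your version for the actual URLs and payload format:

// placeholder URL - the real Scoring endpoint and port depend on your Analytics Server install
var scoringUrl = "http://analytics-server:8080/<scoring-endpoint-placeholder>";

// placeholder request body - the actual fields are defined by the Analytics Scoring API
var request = {
    jobName: "prescriptiveScoringExample",
    goalField: "employee_retention",
    maximize: true
};

// POST the request with the generic HTTP helper available to service scripts
var response = Resources["ContentLoaderFunctions"].PostJSON({
    url: scoringUrl,
    content: request,
    timeout: 60
});

logger.info("Scoring request submitted, response: " + JSON.stringify(response));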
View full tip
Check out this new KCS article which links to all known best practice documents available for ThingWorx. This article will get larger in time as more articles are published related to the Dos and Don'ts of building an IoT application! Do you know when to use timers, and where to implement their subscriptions? How about ensuring info tables are used at the proper time, and data tables at others? Pesky performance issues wherein ThingWorx runs slow for apparently no reason? All of these questions and more are addressed here!
View full tip
New Generation Composer is available from ThingWorx 7.4 and later. Each subsequent release of ThingWorx will contain additional New Composer features/functionality. This video is focused on the layout changes and new features implemented from ThingWorx 7.4.

How to enable the new Composer?
1. In the top right-hand corner, click on the User Menu and select the Preferences option (Figure 1)
2. Click the check box for Turn on New Composer Features.
3. Click Done. A New Composer link is added to the menu bar at the top of the Composer window. (Figure 2)
4. Click the New Composer link to open a new tab for the New Composer view (Figure 3)

What are the layout changes in New Composer?

Three-area layout (Figure 4)

Menu bar on the left (Area 1)
Set Project Context to set the default project name for new entities
Two views: Recent and Browse
The Recent view quickly locates recently accessed entities
Browse is almost the same as the old menu navigation bar
It can be resized or hidden; the main idea here is to increase screen real estate to allow a bigger view/edit area

Main area for listing and editing entities in the middle (Area 2)
It provides a wider area to edit entities (author services, Mashup Builder, etc.)

An extra area on the right for preview, properties/events editing, etc. (Area 3)
It gives you an easier, handier way to get a glance at an entity’s basic information

Layout changes in an entity’s editing page
When you create a new Thing, you will find all the facets (general information, properties, services, events, subscriptions) of the Thing are listed in a dropdown list (Figure 5). This saves more area for editing.

Properties and Alerts
Properties and Alerts are now listed separately, in two different tabs (Figures 6 and 7)
Properties and Alerts are now edited in the right area of the page (see Figure 4, Area 3)

Services and Subscriptions
A bigger editor area

Events
Events are edited in the right area of the page (see Figure 4, Area 3)

What are the main function/feature changes in New Composer?

Industrial Connections
The Industrial Connections entity allows you to connect with and configure industrial things in ThingWorx. With the “Discover” feature, you can easily bind the industrial device’s (e.g., Kepware) tags to a ThingWorx entity (Figure 8)

From ThingWorx 8.0.0, the Anomaly Alert type is supported
An anomaly alert is only useful if you have configured Anomaly Detection to monitor a stream of data
Anomaly metrics settings can be configured in the Alert edit page (Figure 9)

Subscriptions
You can now manually remove a subscription permanently from the system in New Composer, which is impossible in the old Composer

Services
New Composer provides assistive scripting tools like static code analysis, string search and replace, etc. (Figure 10)

How can I switch back to the old Composer when editing an entity?
It is easy! As long as you have opened a ThingWorx entity, you will notice there is a button “Edit in Composer”; it will lead you back to the old Composer, and all editing that has been saved will be reflected in the old Composer.

A video demonstration of the New Composer is also available now. Feel free to review it here: New Composer Video
View full tip
Welcome to the ThingWorx Manufacturing Apps Community! The ThingWorx Manufacturing Apps are easy to deploy, pre-configured role-based starter apps that are built on PTC’s industry-leading IoT platform, ThingWorx. These Apps provide manufacturers with real-time visibility into operational information, improved decision making, accelerated time to value, and unmatched flexibility to drive factory performance.   This Community page is open to all users-- including licensed ThingWorx users, Express (“freemium”) users, or anyone interested in trying the Apps. Tech Support community advocates serve users on this site, and are here to answer your questions about downloading, installing, and configuring the ThingWorx Manufacturing Apps.     A. Sign up: ThingWorx Manufacturing Apps Community: PTC account credentials are needed to participate in the ThingWorx Community. If you have not yet registered a PTC eSupport account, start with the Basic Account Creation page.   Manufacturing Apps Web portal: Register a login for the ThingWorx Manufacturing Apps web portal, where you can download the free trial and navigate to the additional resources discussed below.     B. Download: Choose a download/packaging option to get started.   i. Express/Freemium Installer (best for users who are new to ThingWorx): If you want to quickly install ThingWorx Manufacturing Apps (including ThingWorx) use the following installer: Download the Express/Freemium Installer   ii. 30-day Developer Kit trial: To experience the capabilities of the ThingWorx Platform with the Manufacturing Apps and create your own Apps: Download the 30-day Developer Kit trial   iii. Import as a ThingWorx Extension (for users with a Manufacturing Apps entitlement-- including ThingWorx commercial customers, PTC employees, and PTC Partners): ThingWorx Manufacturing apps can be imported as ThingWorx extensions into an existing ThingWorx Platform install (v8.1.0). To locate the download, open the PTC Software Download Page and expand the following folders:   ThingWorx Platform | Release 8.x | ThingWorx Manufacturing Apps Extension | Most Recent Datacode     C. Learn After downloading the installer or extensions, begin with Installation and Configuration.   Follow the steps laid out in the ThingWorx Manufacturing Apps Setup and Configuration Guide 8.2   Find helpful getting-started guides and videos available within the 'Get Started' section of the ThingWorx Manufacturing Apps Portal.     D. Customize Once you have successfully downloaded, installed, and configured the Manufacturing Apps, begin to explore the deeper potential of the Apps and the ThingWorx Platform.   Follow along with the discussion and steps contained in the ThingWorx Manufacturing Apps and Service Apps Customization Guide  8.2   Also contained within the the 'Get Started' page of the ThingWorx Manufacturing Apps Portal, find the "Evolve and Expand" section, featuring: -Custom Plant Layout application -Custom Asset Advisor application -Global Plant View application -Thingworx Manufacturing Apps Technical Lab with Sigma Tile (Raspberry Pi application) -Configuring the Apps with demo data set and simulator -Additional Advanced Documentation     E. Get help / give feedback / interact Use the ThingWorx Manufacturing Apps Community page as a resource to find documentation, peruse past forum threads, or post a question to start a discussion! For advanced troubleshooting, licensed users are encouraged to submit support tickets to the PTC My eSupport portal.
View full tip
While it is not a requirement, it is a best practice to install KEPServerEX (v6.2 or higher) before installing ThingWorx (v8.0.1 or higher). If ThingWorx is already installed, close the application and complete the install of KEPServerEX by following these install instructions: How do I download and install KEPServerEX? Now, when you attempt to launch ThingWorx, if you are presented with a "null pointer exception" error, follow this workaround: 1. Navigate to the 'PostgreSQL\installer' directory, within the directory where the Manufacturing Apps are installed. By default this will be: <ThingWorx install path>\ThingWorxManufacturingApps\PostgreSQL\installer 2. Run the 'vcredist.exe' located there. This application should re-install the conflicting redistributables, and you should be able to launch ThingWorx again normally.
View full tip
KEPServerEX requires the 32-bit version of Java if you are using the IoT Gateway Plug-in. If you do not have the 32-bit version installed and attempt to connect the IoT Gateway, the KEPServerEX Event Log will report the following error: “IoT Gateway failed to start, 32-bit JRE required." Some of the Manufacturing Applications training content relies on this Plug-in, as well. As a best practice, it is recommended that both the 32-bit and 64-bit versions of Java be installed. This install is available for download from the Oracle website, here: Java SE Runtime Environment 8 - Downloads
View full tip
You might have seen the Performance Advisor for some of your other favorite PTC products like Creo, Windchill or Integrity. Good news: it's now also available for ThingWorx!

In case you're not familiar with the Performance Advisor, it's new functionality that allows you to work more closely with the PTC / ThingWorx team to improve how you use ThingWorx and to improve ThingWorx itself in the areas that matter most to you.

ThingWorx Performance Advisor:
- delivers information dashboards driven by data on the features, usage and performance of your ThingWorx systems
- unlocks information that can reduce wasted development and improve design cycles
- allows comprehensive visibility into software versions in use to manage software upgrade plans
- simplifies compliance and revenue allocation by monitoring usage
- enables quick access to system and usage statistics across your organization
- uses personalized dashboards for viewing, reporting and trend analysis

The Performance Advisor for ThingWorx has just been released, so we want you to share your experience and data to get you and us started on analyzing usage statistics and needs for further features.

The Performance Advisor is easy to connect. It just takes three simple steps and a minute of your time. This will result in improved transparency, stability, productivity, product performance and compliance administration, as well as increased administrative efficiency, and it allows the ThingWorx R&D team to continuously improve the platform through analytical insights from the data collected.

As ThingWorx is growing fast, be sure to participate and actively shape the way you're using ThingWorx and the way that ThingWorx is designed.

With newer versions of ThingWorx, the capabilities and benefits of the Performance Advisor will be improved to ensure we're capturing the most accurate information to help you grow your Internet of Things business and scale your solutions to your / your application's needs and requirements. We're just at the beginning of the journey...

How to enable ThingWorx Performance Advisor

Enabling Metrics Reporting and setting up the Performance Advisor capabilities is described in detail in CS262960. Just follow the steps and: Congratulations! It's as simple and fast as that - you enabled the ThingWorx Performance Advisor... quite easy, right?

Where can I see the data / metrics I have sent to PTC?

The information can be seen on the Performance Advisor Homepage. Here is how the current views look - they might change over time, introducing new features and views to maximize the impact and benefit for you.

At first glance, the basic information about what has been collected can be seen in the Summary. The Connection System Details view shows more about which systems are currently connected, with their user counts and number of remote things. The Connected System History shows a historical overview of how those parameters changed over time. For a more detailed historical overview of all the data being sent, check out the Historical Property Data.

Questions?

For specific questions, check out article CS262967, which holds the FAQs for the Performance Advisor. If you have specific questions not addressed in the article, you can always comment on this blog post, open a new community thread or open a case with Support Services.

We want your feedback

After enabling metrics collection and reviewing the Performance Advisor dashboards, what do you think?
What features would you like to see in the future? Is there anything missing that would make your life as a System Administrator easier?

As we work to improve functionality over time, make sure your voice is heard as well, and feel free to leave some feedback.
View full tip
There are Four Types of Analytics:

Descriptive: What Happened?
Descriptive analytics is a preliminary stage of data processing that creates a summary of historical data to yield useful information and possibly prepare the data for further analysis. It uses data aggregation and data mining to provide insight into the past and answer the question: “What has happened?”

Descriptive analysis, or descriptive statistics, does exactly what the name implies: it describes, or summarizes, raw data and turns it into something interpretable by humans. These are analytics that describe the past. The past refers to any point in time at which an event has occurred, whether it is one minute ago or one year ago. Descriptive analytics are useful because they allow us to learn from past behaviors and understand how they might influence future outcomes.

The vast majority of the statistics we use fall into this category (think basic arithmetic like sums, averages, and percent changes). Usually, the underlying data is a count or aggregate of a filtered column of data to which basic math is applied. For all practical purposes, there are an infinite number of these statistics. Descriptive statistics are useful for showing things like total stock in inventory, average dollars spent per customer, and year-over-year change in sales. Common examples of descriptive analytics are reports that provide historical insights regarding the company’s production, financials, operations, sales, inventory, and customers.

Note: Use descriptive analytics when you need to understand, at an aggregate level, what is going on in your company, and when you want to summarize and describe different aspects of your business.

Different techniques of Descriptive Analytics (a short code sketch at the end of this post illustrates a few of these):
- Sampling
- Mean
- Mode
- Median
- Standard Deviation
- Range and Variance
- Stem and Leaf Diagram
- Histogram
- Quartiles
- Frequency Distributions

Use of Descriptive Analytics in ThingWorx Analytics:

Signal Detection: When analyzing volumes of data, it is helpful to know which data is actually useful and which data is just noise. Signals are based on a correlation algorithm that examines historical data to identify the strength of a given input in predicting future outcomes. Signals can identify meaningful correlations within the data. Signals are useful during initial analysis to determine which features you want to curate in a given dataset for predictive model generation. For example, knowing the month of the year is more important to accurately predicting tomorrow’s weather than knowing the day of the week; the month has a much stronger signal than the day of the week for this prediction. ThingWorx Analytics reports signal strength as a mutual information (MI) score that represents the probability of predicting the goal variable when a given feature is provided. It can effectively capture non-linear relationships. ThingWorx Analytics evaluates each feature, or combination of features, to identify the top signals.

Cluster Analysis: Cluster analysis categorizes data into groups based on similarities relative to a goal variable. Like a clique, objects in a cluster minimize intra-distances (distances within the cluster) while maximizing inter-distances (distances between clusters). Clusters are mutually exclusive, meaning that each record can belong to only one cluster. However, ThingWorx Analytics supports a user-defined cluster hierarchy that can include sub-clusters inside other clusters.
The higher the number of clusters in the data, the smaller each cluster’s population will be, but the stronger the potential insights can be.

How to Access Descriptive Analysis Functionality via ThingWorx Analytics:

REST API Service — Using a REST client, you can access the Signals Service and the Clusters Service. Each service includes a series of API endpoints to submit analysis requests, retrieve results, list jobs, and more. Requires installation of the ThingWorx Analytics Server.

Analytics Builder — As part of the ThingWorx Analytics Extension, Analytics Builder provides a user interface for interacting with your data. In addition to generating and scoring predictive models in Analytics Builder, you can also run procedures to generate signals.

How to avoid mistakes - useful tips for the different techniques of Descriptive Analytics:
- Crystallize the research problem and make it operational.
- Read the literature on data analysis techniques.
- Evaluate the various techniques that can do similar things with respect to the research problem.
- Know what a technique does and what it doesn't.
- Consult other people, especially your supervisor.
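To make a few of the basic techniques listed earlier (mean, median, mode, standard deviation, range) concrete, here is a minimal JavaScript sketch. It is a plain, standalone script, not tied to any ThingWorx Analytics API, and the sample values are invented purely for illustration.

// Minimal descriptive-statistics sketch (illustrative only).
// 'values' could be any numeric column, e.g. daily sensor readings.
var values = [12, 15, 15, 18, 21, 21, 21, 30];

// Mean: sum of all values divided by the count.
var mean = values.reduce(function (a, b) { return a + b; }, 0) / values.length;

// Median: middle value of the sorted list (average of the two middle values for even counts).
var sorted = values.slice().sort(function (a, b) { return a - b; });
var mid = Math.floor(sorted.length / 2);
var median = (sorted.length % 2 === 0) ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];

// Mode: the most frequently occurring value.
var counts = {};
var mode = sorted[0];
sorted.forEach(function (v) {
    counts[v] = (counts[v] || 0) + 1;
    if (counts[v] > counts[mode]) { mode = v; }
});

// Standard deviation: square root of the average squared deviation from the mean.
var variance = values.reduce(function (acc, v) { return acc + Math.pow(v - mean, 2); }, 0) / values.length;
var stdDev = Math.sqrt(variance);

// Range: difference between the largest and smallest value.
var range = sorted[sorted.length - 1] - sorted[0];

Running something like this against a column of historical data gives exactly the kind of "describe the past" summary (totals, averages, spread) discussed above; signals and clusters then build on top of such summaries.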
View full tip
This blog addresses a few points related to scoring with ThingWorx Analytics. In particular, it brings a clearer understanding of the concepts behind the values of the scores that are generated when performing a scoring job.

Scoring Outputs:

It is important to note that when training an analytics model, the method is to create a generalizable model from a relatively small training dataset. By its nature, we expect the training process to see a limited subset and not an exhaustive list of all possible values for many constraints, especially for time and practicality. As such, these generalized models will be expected to handle unseen data in the form of new combinations or values outside of previously observed ranges (more on this below).

One common way to see scores that exceed the ranges observed in training, under the assumption that the goals are continuous, is to use prescriptive scoring. Prescriptive scoring attempts to find optimal values for lever (i.e., tunable) features in order to maximize or minimize score values. See the prescriptive scoring documentation and functionality for more information.

Min/max constraints: these are constraints placed upon the inputs for training and the expected inputs for scoring.

• For training: If these ranges were provided as part of the upload process, then training will raise exceptions regarding invalid data. However, if the ranges are not provided, they will be inferred from the data and, as such, training will not see values outside of observed ranges.

• For scoring: Validation of the ranges will only be performed on the inputs, not the outputs. It is very important to note that the handling of these "constraints" is dependent upon the data type.

For categorical data (e.g. colors) and ordinal data (e.g. shirt sizes), the constraints are strict, and data that was not observed in training will raise exceptions during scoring. However, for continuous values (e.g. temperature ranges) these constraints are more informational in nature. For predictive scoring, our code will accept records with values outside of those ranges. The rule of thumb is that values slightly outside these ranges are acceptable and that, as the values stray farther from the ranges, the accuracy of the model degrades very quickly. For prescriptive scoring, these constraints are used to determine the acceptable ranges of values to try when determining the optimal values. Values outside of these constraints will NOT be tried.
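The strict vs. informational behavior described above can be illustrated with a small, hypothetical JavaScript sketch. This is not the actual ThingWorx Analytics validation code; the constraint object layout, field names, and tolerance behavior are assumptions made purely for illustration.

// Hypothetical constraints inferred (or supplied) during training.
var constraints = {
    temperature: { type: "CONTINUOUS", min: 10, max: 90 },                 // informational for predictive scoring
    color:       { type: "CATEGORICAL", values: ["red", "green", "blue"] } // strict for scoring
};

// Validate one scoring record against the constraints.
function validateRecord(record) {
    var warnings = [];
    for (var field in constraints) {
        var c = constraints[field];
        var value = record[field];
        if (c.type === "CATEGORICAL") {
            // Strict: a category never observed in training is rejected,
            // mirroring the exception raised during scoring.
            if (c.values.indexOf(value) === -1) {
                throw new Error("Unseen categorical value '" + value + "' for field '" + field + "'");
            }
        } else if (c.type === "CONTINUOUS") {
            // Informational: values outside the observed range are still accepted,
            // but model accuracy degrades the farther they stray from that range.
            if (value < c.min || value > c.max) {
                warnings.push(field + "=" + value + " is outside the observed range [" + c.min + ", " + c.max + "]");
            }
        }
    }
    return warnings;
}

// Accepted, with a warning about the out-of-range temperature.
var warnings = validateRecord({ temperature: 95, color: "red" });

// Would throw: 'purple' was never observed during training.
// validateRecord({ temperature: 50, color: "purple" });

In prescriptive scoring, by contrast, the continuous ranges are treated as hard boundaries: only values inside them are tried when searching for optimal lever values.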
View full tip
A Feature - a piece of information that is potentially useful for prediction. Any attribute can be a feature, as long as it is useful to the model.

Feature engineering – the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data. It is a loosely defined space of tasks related to designing feature sets for machine learning applications.

Components: First, understanding the properties of the task you're trying to solve and how they might interact with the strengths and limitations of the model you are going to use. Second, experimental work where you test your expectations and find out what actually works and what doesn't.

Feature engineering, as a technique, has three sub-categories of techniques: feature selection, dimension reduction, and feature generation.

Feature Selection: Sometimes called feature ranking or feature importance, this is the process of ranking the attributes by their value to the predictive ability of a model. Algorithms such as decision trees automatically rank the attributes in the data set. The top few nodes in a decision tree are considered the most important features from a predictive standpoint. As part of a process, feature selection using entropy-based methods like decision trees can be employed to filter out less valuable attributes before feeding the reduced dataset to another modeling algorithm. Regression-type models usually employ methods such as forward selection or backward elimination to select the final set of attributes for a model. An example is a project development decision tree.

Dimension Reduction: This is sometimes called feature extraction. The most classic example of dimension reduction is principal component analysis, or PCA. PCA allows us to combine existing attributes into a new data frame consisting of a much reduced number of attributes by utilizing the variance in the data. The attributes which "explain" the highest amount of variance in the data form the first few principal components, and we can ignore the rest of the attributes if data dimensionality is a problem from a computational standpoint.

Feature Generation or Feature Construction: Quite simply, this is the process of manually constructing new attributes from raw data. It involves intelligently combining or splitting existing raw attributes into new ones which have a higher predictive power. For example, a date stamp may be used to generate two new attributes such as AM and PM, which may be useful in discriminating whether day or night has a higher propensity to influence the response variable. Feature construction is essentially a data transformation process.

Tips for Better Feature Engineering

Tip 1: Think about inputs you can create by rolling up existing data fields to a higher/broader level or category. As an example, a person's title can be categorized into strategic or tactical. Those with titles of "VP" and above can be coded as strategic. Those with titles of "Director" and below become tactical. Strategic contacts are those that make high-level budgeting and strategic decisions for a company. Tactical contacts are those in the trenches doing day-to-day work.
Other roll-up examples include:
- Collating several industries into a higher-level industry: collate oil and gas companies with utility companies, for instance, and call it the energy industry, or fold the high tech and telecommunications industries into a single area called "technology."
- Defining "large" companies as those that make $1 billion or more and "small" companies as those that make less than $1 billion.

Tip 2: Think about ways to drill down into more detail in a single field. As an example, a contact within a company may respond to marketing campaigns, and you may have information about his or her number of responses. Drilling down, we can ask how many of these responses occurred in the past two weeks, one to three months, or more than six months in the past. This creates three additional binary (yes=1/no=0) data fields for a model. Other drill-down examples include:
- Cadence: number of days between consecutive marketing responses by a contact: 1–7, 8–14, 15–21, 21+
- Multiple responses on same day flag (multiple responses = 1, otherwise = 0)

Tip 3: Split data into separate categories, also called bins (a short code sketch after these tips illustrates this). For example, annual revenue for companies in your database may range from $50 million (M) to over $1 billion (B). Split the revenue into sequential bins: $50–$200M, $201–$500M, $501M–$1B, and $1B+. Whenever a company falls within a revenue bin it receives a one; otherwise the value is zero. There are now four new data fields created from the annual revenue field. Other examples are:
- Number of marketing responses by contact: 1–5, 6–10, 10+
- Number of employees in company: 1–100, 101–500, 501–1,000, 1,001–5,000, 5,000+

Tip 4: Think about ways to combine existing data fields into new ones. As an example, you may want to create a flag (0/1) that identifies whether someone is a VP or higher and has more than 10 years of experience. Other examples of combining fields include:
- Title of director or below and in a company with less than 500 employees
- Public company and located in the Midwestern United States
You can even multiply, divide, add, or subtract one data field by another to create a new input.

Tip 5: Don't reinvent the wheel – use variables that others have already fashioned.

Tip 6: Think about the problem at hand and be creative. Don't worry about creating too many variables at first, just let the brainstorming flow.
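As a concrete illustration of Tips 2 and 3 (drill-down and binning) and of the AM/PM example from feature generation, here is a small, hypothetical JavaScript sketch. The field names and bin boundaries are assumptions chosen to mirror the examples above; they are not part of any ThingWorx API.

// Hypothetical raw record for one contact/company.
var record = {
    annualRevenueMillions: 750,                               // $750M
    daysSinceLastResponse: 10,
    responseTimestamp: new Date("2017-06-01T14:30:00Z")
};

// Tip 3: split annual revenue into sequential bins (one binary field per bin).
function revenueBins(revenueMillions) {
    return {
        rev_50_200M:  (revenueMillions >= 50  && revenueMillions <= 200)  ? 1 : 0,
        rev_201_500M: (revenueMillions >= 201 && revenueMillions <= 500)  ? 1 : 0,
        rev_501M_1B:  (revenueMillions >= 501 && revenueMillions <= 1000) ? 1 : 0,
        rev_1B_plus:  (revenueMillions > 1000) ? 1 : 0
    };
}

// Tip 2: drill down into response recency (binary flags for each time window).
function recencyFlags(days) {
    return {
        responded_last_2_weeks:  (days <= 14) ? 1 : 0,
        responded_1_to_3_months: (days > 30 && days <= 90) ? 1 : 0,
        responded_over_6_months: (days > 180) ? 1 : 0
    };
}

// Feature generation: derive an AM/PM flag from a raw timestamp.
function amPmFlag(timestamp) {
    return { response_was_am: (timestamp.getUTCHours() < 12) ? 1 : 0 };
}

// Combine the engineered features into a single row for modeling.
var engineered = Object.assign({},
    revenueBins(record.annualRevenueMillions),
    recencyFlags(record.daysSinceLastResponse),
    amPmFlag(record.responseTimestamp));

Each helper turns one raw attribute into several binary inputs, which is exactly the kind of transformation a modeling algorithm can then rank or prune during feature selection.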
View full tip
Sampling Strategy

This blog post covers the four sampling strategies that are available in ThingWorx Analytics. For each strategy, it explains how the sampling runs behind the scenes, when you may want to use that strategy, and the pros and cons of each. (A short code sketch at the end of this post illustrates the selection mechanics of the first two strategies.)

SAMPLE_WITH_REPLACEMENT
This strategy is not often used by professionals but still may be useful in certain circumstances. When you sample with replacement, the value that you randomly selected is then returned to the sample pool, so there is a chance that you can have the same record multiple times in your sample.
Example: Let's say you have a hat that contains three cards with different people's names on them: John, Sarah, Tom. Let's say you make two random selections. On the first selection you pull out the name Tom. When you sample with replacement, you would put the name Tom back into the hat and then randomly select a card again. For your second selection, you could get another name, like Sarah, or the same one you selected, Tom.
Pros:
- May find improved models in smaller datasets with low row counts
Cons:
- The accuracy of the model may be artificially inflated due to duplicates in the sample

SAMPLE_WITHOUT_REPLACEMENT
This is the default setting in ThingWorx Analytics and the sampling strategy most commonly used by professionals. With this strategy, after a value is randomly selected from the sample pool, it is not returned. This ensures that all the values selected for the sample are unique.
Example: Let's say you have a hat that contains three cards with different people's names on them: John, Sarah, Tom. Let's say you make two random selections. On the first selection you pull out the name Tom. When you sample without replacement, you would randomly select a card from the hat again without putting the Tom card back. For your second selection, you could only get the Sarah or John card.
Pros:
- This is the sampling strategy that is most commonly used
- It will deliver the best results in most cases
Cons:
- May not be the best choice if the desired goal is underrepresented in the dataset

UPSAMPLE_AND_SAMPLE_WITHOUT_REPLACEMENT
This is useful when the desired goal is underrepresented in the dataset. The records that represent the desired outcome of the goal are copied multiple times so they represent a larger share of the total dataset.
Example: Let's say you are trying to discover whether a patient is at risk of developing a rare condition, like chronic kidney failure, that affects around 0.5% of the US population. In this case, the most accurate model that would be generated would say that no one will get this condition, and according to the numbers, it would be right 99.5% of the time. But in reality, this is not helpful at all for the use case, since you want to know if the patient is at risk of developing the condition. To avoid this, copies are made of the records where the patient did develop the condition so they represent a larger share of the dataset. Doing this gives ThingWorx Analytics more examples to help it generate a more accurate model.
Pros:
- Patterns from the original dataset remain intact
Cons:
- Longer training time

DOWNSAMPLE_AND_SAMPLE_WITHOUT_REPLACEMENT
This is also useful when the desired goal is underrepresented in the dataset. In downsample and sample without replacement, some records that do not represent the desired goal outcome are removed. This is done to increase the desired records' percentage of the dataset.
Example: Let's continue using the medical example from above.
Instead of creating copies of the desired records, undesired records are removed from the dataset. This causes the records where patients did develop the condition to occupy a larger percentage of the dataset.
Pros:
- Shorter training time
Cons:
- Patterns from the original dataset may be lost
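To make the difference between the first two strategies concrete, here is a small JavaScript sketch of sampling with and without replacement from an array. It only illustrates the selection mechanics; it is not the internal ThingWorx Analytics implementation.

// Candidate records (the "hat" of cards from the examples above).
var names = ["John", "Sarah", "Tom"];

// SAMPLE_WITH_REPLACEMENT: each pick goes back into the pool,
// so the same record can appear more than once in the sample.
function sampleWithReplacement(pool, n) {
    var sample = [];
    for (var i = 0; i < n; i++) {
        sample.push(pool[Math.floor(Math.random() * pool.length)]);
    }
    return sample;
}

// SAMPLE_WITHOUT_REPLACEMENT: each pick is removed from the pool,
// so every record in the sample is unique.
function sampleWithoutReplacement(pool, n) {
    var remaining = pool.slice();
    var sample = [];
    for (var i = 0; i < n && remaining.length > 0; i++) {
        var index = Math.floor(Math.random() * remaining.length);
        sample.push(remaining.splice(index, 1)[0]);
    }
    return sample;
}

var withRep = sampleWithReplacement(names, 2);       // e.g. ["Tom", "Tom"] is possible
var withoutRep = sampleWithoutReplacement(names, 2); // e.g. ["Tom", "Sarah"], never a duplicate

Upsampling and downsampling simply change the composition of the pool (duplicating or removing records) before sampling without replacement is applied.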
View full tip
1. Add a JSON input parameter (for example, named json). Example value:

{
    "rows": [
        {
            "email": "example1@ptc.com"
        },
        {
            "name": "Qaqa",
            "email": "example2@ptc.com"
        }
    ]
}

2. Create an InfoTable with a DataShape using CreateInfoTableFromDataShape(params).

3. Using a for loop, iterate through each JSON object in rows and add it to the InfoTable using InfoTableName.AddRow(yourRowObjectHere). Example:

// Build an empty InfoTable based on the 'jsontest' DataShape (fields: name, email)
var params = {
    infoTableName: "InfoTable",
    dataShapeName: "jsontest"
};
var infotabletest = Resources["InfoTableFunctions"].CreateInfoTableFromDataShape(params);

// 'json' is the JSON input parameter of the service; index into each row with [i]
for (var i = 0; i < json.rows.length; i++) {
    // Missing fields (e.g. 'name' in the first row) are simply left empty in the InfoTable
    infotabletest.AddRow({ name: json.rows[i].name, email: json.rows[i].email });
}
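Usage note: with the sample JSON above, the loop adds two rows to the InfoTable, the first with only an email value and the second with both name and email. If the service's output base type is INFOTABLE with the jsontest DataShape, assigning the table to the result variable (var result = infotabletest;) returns it to the caller.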
View full tip