IoT & Connectivity Tips

A common issue seen when trying to deploy, design or scale up a ThingWorx application is performance. Slow response, delayed data and the application stopping have all been seen when a performance problem either grows slowly or suddenly pops up. There are some common themes when these occur, typically around the application model or design. Here are a few of the common problems and some thoughts on what to do about them or how to avoid them.

Service Execution
This covers a wide range of possibilities and is most commonly seen when trying to scale an application. Data access within a loop is one particular thing to avoid. Accessing data from a Thing, another service or a query may be fast when you only test it over 100 loops, but when the application grows and you have 1,000 it is suddenly slow. Access all data in one query and use that as an in-memory reference (see the sketch below, just before the Subscriptions and Events section). Writing data to a data store (Stream, DataTable or ValueStream) and then querying that same data within one service can cause problems as well. Run the query first, then use the data you already have in the service variables.

To troubleshoot service execution there are a few methods that can be used. Some will not be practical for a production system, since it is not always advisable to change code without testing first.
Use browser development tools to see the execution time of a service. This is especially helpful when a mashup is slow to load or respond; it allows you to quickly identify which of multiple services may be the issue.
Add logging to a service. Once a service is identified, adding simple logging points in the service can narrow down which code in the service causes the slowdown (it may be another service call). These logging statements show up in the script logs with timestamps (you can also log the current time with the logging statements).
Use the Test button in Composer. This is a simple one, but if the service does not have many parameters (or has defaults) it is a fast and easy way to see how long a service takes to return.
When all else fails you can get thread dumps from the JVM. ThingWorx Support created an extension that assists with this. You can find it on the Marketplace with instructions on how to use it: https://marketplace.thingworx.com/tools/thingworx-support-tools. You can manually examine the output files or open a ticket with Support to allow them to assist. Just be careful with memory dumps; they are much larger, hard to analyze and take a lot of memory.

Queries
These of course are services too, but a specific type. Accessing data in ThingWorx storage structures or from external sources seems fairly straightforward, but can be tricky when dealing with large data sets. When designing and dealing with internal platform storage, refer to this guide as a baseline to decide where to store data: Where Should I Store My Thingworx Data? NEVER store historical data in infotable properties. These are held in memory (even if they are persistent) and as they grow so will the JVM memory use, until the application runs out of it. We all know what happens then. Finally, one other note that causes occasional confusion: the setting on a custom or standard ThingWorx query service that limits the number of records returned. This is how many records are returned from the service at the end of processing, not how many are processed or loaded in memory. That number may be much higher and could cause the same types of issues.
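To make the "avoid data access in a loop" point concrete, here is a rough sketch in ThingWorx service script. The Thing name MappingTable, the field assetId and the assets infotable are hypothetical, and the query shape would need adjusting for real data.

// Anti-pattern: one DataTable query per loop iteration
for (var i = 0; i < assets.rows.length; i++) {
    var result = Things["MappingTable"].QueryDataTableEntries({
        maxItems: 1,
        query: { filters: { type: "EQ", fieldName: "assetId", value: assets.rows[i].assetId } }
    });
    // ... use result.rows[0] ...
}

// Preferred: query once, then work from an in-memory lookup
var allMappings = Things["MappingTable"].GetDataTableEntries({ maxItems: 100000 });
var lookup = {};
for (var j = 0; j < allMappings.rows.length; j++) {
    lookup[allMappings.rows[j].assetId] = allMappings.rows[j];
}
for (var k = 0; k < assets.rows.length; k++) {
    var mapped = lookup[assets.rows[k].assetId];   // no additional database round trip
    // ... use mapped ...
}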
Subscriptions and Events
This is similar to services, however there is an added element: frequency. Typical events are data changes and timers/schedulers. This again is often an issue only when scaling up the number of Things or the amount of data that needs to be referenced. A general reference on timers and schedulers can be found here: Timers and Schedulers - Best Practice. It also describes some of the event processing that takes place on the platform. For data change events, be very cautious about adding these to very rapidly changing property values. When a property updates very quickly, for example twice each second, the subscription to that event must be able to complete in under 0.5 seconds to stay ahead of processing. Again, this may work for 5-10 Things with such properties but will not work with 500, due to resources, speed and the need to briefly lock the value to get an accurate current read. In these cases any data processing should be done at the edge when possible (or in the originating system) and pushed to the platform in a separate property or service call. This allows for more parallel processing since it is de-centralized. A good practice that allows easier testing of this type of subscription code is to take all of the script/logic and move it to a service call, then pass any of the needed event data to parameters of the service (a minimal sketch follows at the end of this post). This allows for easier debugging since the event does not need to fire to make the logic execute. In fact it can essentially be run stand-alone via the Test button in Composer.

Mashup Performance
This one can be very tricky, since additional browser elements and rendering can come into play. Sometimes service execution is the root of the issue, as reviewed above; other times it is UI elements and design that cause the slowdown. The Repeater widget is a common culprit. The biggest thing to note here is that each repeater will need to render every element that is repeated, with all of the data and formatting for each of those widgets in the repeated mashup. So any complex mashup that is repeated many times may become slow to load. You can minimize this to a degree with the Load/Unload setting of the widget, based on when the slowness is more acceptable (when loading or when scrolling). When a mashup is launched from Composer it comes with some built-in debugging tools to see errors and execution. Using these together with browser debug tools can be very helpful.

Scaling an Application
When initially modeling an application, scale must be considered from the start. It is a challenge (but not impossible) to modify an application after deployment or design to make it very efficient. Many times new developers on the ThingWorx platform fall into what I call the .Net trap. Back when .Net was released, one of the quotes I recall hearing about its inefficiencies was "memory is cheap". It was more cost efficient to purchase and install more memory than to spend extra development time optimizing memory use. This was absolutely true for installed applications where all of the code was compiled and stored on every system. Web-based applications are not quite as forgiving, since most processing and execution is done on the single central web server. Keep this in mind especially when creating Shapes, Templates and Subscriptions. While you may be writing one piece of code, when this code is repeated on 1,000 Things they will all be in memory and all be executing this code in parallel.
You can quickly see how competition for resources, locks on databases and clean access to in-memory structures can slow everything down (and just think when there are 10,000 pieces of that same code!). Two specific things around this must be stated again (though they were covered in the sections above). First, data held in properties has fast access since it is in JVM memory, but it is held in memory for each individual Thing: holding 5 MB of information in one Thing seems small, but loading 10,000 Things means an instant use of 50 GB of memory! Second, service execution. When 10 Things are running, a service execution takes 2 seconds. Slow, but not too bad, and it may not be too noticeable in the UI. Now have 10,000 Things competing for the same data structure and resources; I have seen execution time jump to 2 minutes or more. Aside from design, the best thing you can do is TEST on a scaled-up structure. If you will have 1,000 Things next year, test your application early at that level of deployment to help identify any potential bottlenecks early. Never assume more memory will alleviate the issue. Also do NOT test scale on your development system. This introduces edits, changes and other variables which can affect actual real-world results. Have a QA system set up that mirrors a production environment and simulate data and execution load. Additional suggestions are welcome in the comments, and this post will likely be updated as additional tools and platform updates become available.
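As a minimal sketch of the testing tip above (moving subscription logic into a service), the property, service and threshold names below are hypothetical and the event data shape is simplified:

// DataChange subscription body: keep it thin and delegate to a service
me.ProcessTemperatureChange({
    newValue: eventData.newValue.value,     // value carried by the DataChange event
    changeTime: eventData.newValue.time
});

// ProcessTemperatureChange(newValue, changeTime): all real logic lives here,
// so it can be run stand-alone from the Test button in Composer
if (newValue > me.temperatureThreshold) {
    logger.warn("Threshold exceeded at " + changeTime + ": " + newValue);
    // ... alerting / storage logic ...
}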
View full tip
This Expert Session will walk you through the components involved in the ThingWorx Studio Augmented Reality environment, a detailed architecture, supported devices, and the available resources. The session provides insight into how ThingWorx Studio works and the technicalities involved.   For full-sized viewing, click on the YouTube link in the player controls.   Visit the Online Success Guide to access our Expert Session videos at any time as well as additional information about ThingWorx training and services.
View full tip
There are four types of Analytics. Prescriptive analytics: What should I do about it? Prescriptive analytics is about using data and analytics to improve decisions and therefore the effectiveness of actions. Prescriptive analytics is related to both Descriptive and Predictive analytics. While Descriptive analytics aims to provide insight into what has happened and Predictive analytics helps model and forecast what might happen, Prescriptive analytics seeks to determine the best solution or outcome among various choices, given the known parameters. “Any combination of analytics, math, experiments, simulation, and/or artificial intelligence used to improve the effectiveness of decisions made by humans or by decision logic embedded in applications.” These analytics go beyond descriptive and predictive analytics by recommending one or more possible courses of action. Essentially they predict multiple futures and allow companies to assess a number of possible outcomes based upon their actions. Prescriptive analytics uses a combination of techniques and tools such as business rules, algorithms, machine learning and computational modelling procedures. Prescriptive analytics can also suggest decision options for how to take advantage of a future opportunity or mitigate a future risk, and illustrate the implications of each decision option. In practice, prescriptive analytics can continually and automatically process new data to improve the accuracy of predictions and provide better decision options. Prescriptive analytics can be used in two ways: Inform decision logic with analytics: Decision logic needs data as an input to make the decision. The veracity and timeliness of data will ensure that the decision logic operates as expected. It doesn’t matter if the decision logic is that of a person or embedded in an application — in both cases, prescriptive analytics provides the input to the process. Prescriptive analytics can be as simple as aggregate analytics about how much a customer spent on products last month, or as sophisticated as a predictive model that predicts the next best offer to a customer. The decision logic may even include an optimization model to determine how much, if any, discount to offer to the customer. Evolve decision logic: Decision logic must evolve to improve or maintain its effectiveness. In some cases, decision logic itself may be flawed or degrade over time. Measuring and analyzing the effectiveness or ineffectiveness of an enterprise's decisions allows developers to refine or redo decision logic to make it even better. It can be as simple as marketing managers reviewing email conversion rates and adjusting the decision logic to target an additional audience. Alternatively, it can be as sophisticated as embedding a machine learning model in the decision logic for an email marketing campaign to automatically adjust what content is sent to target audiences. Different technologies of Prescriptive analytics to create action: Search and knowledge discovery: Information leads to insights, and insights lead to knowledge. That knowledge enables employees to become smarter about the decisions they make for the benefit of the enterprise. Developers can embed search technology in decision logic to find knowledge used to make decisions in large pools of unstructured big data. Simulation: Simulation imitates a real-world process or system over time using a computer model.
Because digital simulation relies on a model of the real world, the usefulness and accuracy of simulation for improving decisions depends a lot on the fidelity of the model. Simulation has long been used in multiple industries to test new ideas or how modifications will affect an existing process or system. Mathematical optimization: Mathematical optimization is the process of finding the optimal solution to a problem that has numerically expressed constraints. Machine learning: “Learning” means that the algorithms analyze sets of data to look for patterns and/or correlations that result in insights. Those insights can become deeper and more accurate as the algorithms analyze new data sets. The models created and continuously updated by machine learning can be used as input to decision logic or to improve the decision logic automatically. Pragmatic AI: Enterprises can use AI to program machines to continuously learn from new information, build knowledge, and then use that knowledge to make decisions and interact with people and/or other machines. Use of Prescriptive Analytics in ThingWorx Analytics: Thing Optimizer: Thing Optimizer functionality provides the prescriptive scoring and optimization capabilities of ThingWorx Analytics. While predictive scoring allows you to make predictions about future outcomes, prescriptive scoring allows you to see how certain changes might affect future outcomes. After you have generated a prediction model (also called training a model), you can modify the prescriptive attributes in your data (those attributes marked as levers) to alter the predictions. The prescriptive scoring process evaluates each lever attribute and returns an optimal value for that feature, depending on whether you want to minimize or maximize the goal variable. Prescriptive scoring results include both an original score (the score before any lever attributes are changed) and an optimized score (the score after optimal values are applied to the lever attributes). In addition, for each attribute identified in your data as a lever, original and optimal values are included in the prescriptive scoring results. How to Access Thing Optimizer Functionality: ThingWorx Analytics prescriptive scoring can only be accessed via the REST API Service. Using a REST client, you can access the Scoring service, which includes a series of API endpoints to submit scoring requests, retrieve results, list jobs, and more. This requires installation of the ThingWorx Analytics Server. How to avoid mistakes - below are some common mistakes made while doing Prescriptive analytics: Starting digital analytics without a clear goal Ignoring core metrics Choosing overkill analytics tools Creating beautiful reports with little business value Failing to detect tracking errors                     Image source: Wikipedia, Content: go.forrester.com (partially)
View full tip
Check out this new KCS article which links to all known best practice documents available for ThingWorx. This article will get larger in time as more articles are published related to the Dos and Don'ts of building an IoT application! Do you know when to use timers, and where to implement their subscriptions? How about ensuring info tables are used at the proper time, and data tables at others? Pesky performance issues wherein ThingWorx runs slow for apparently no reason? All of these questions and more are addressed here!
View full tip
New Generation Composer is available from ThingWorx 7.4 and later. Each subsequent release of ThingWorx will contain additional New Composer features/functionality. This video is focused on the layout changes and new features implemented from ThingWorx 7.4.     How to enable the new Composer? 1. In the top right-hand corner, click on the User Menu and select the Preferences option                Figure 1 2. Click the check box for Turn on New Composer Features. 3. Click Done. A New Composer link is added to the menu bar at the top of the Composer window.     Figure 2   4. Click the New Composer link to open a new tab with the New Composer view   Figure 3   What's the layout change in New Composer?     Three-area layout   Figure 4     Menu bar on the left (Area 1) Set Project Context to set a default project name for new entities Two views: Recent and Browse The Recent view quickly locates recently accessed entities Browse is almost the same as the old menu navigation bar Can be resized or hidden; the main idea here is to increase screen real estate to allow a bigger view/edit area Main area for listing and editing entities in the middle (Area 2) It provides a wider area to edit entities (author services, mashup builder, etc.) An extra area on the right for preview, properties/events editing, etc. (Area 3) It gives you an easy and handy way to get a glance at an entity’s basic information   Layout change in an entity’s editing page When you create a new Thing, you will find all the facets (general information, properties, services, events, subscriptions) of the Thing listed in a dropdown list Figure 5       Doing so saves more area for editing   Properties and Alerts Properties and Alerts are now listed separately (in two different tabs) Figure 6 Figure 7  Properties and Alerts are now edited in the right area of the page (see Figure 4, Area 3) Services and Subscriptions A bigger editor area Events Events are edited in the right area of the page (see Figure 4, Area 3)   What's the main function/feature change in New Composer?   Industrial Connections The Industrial Connections entity allows you to connect with and configure industrial things in ThingWorx. With the “Discover” feature, you can easily bind an industrial device’s (e.g., Kepware) tags to a ThingWorx Entity Figure 8 From ThingWorx 8.0.0, the Anomaly Alert type is supported An anomaly alert is only useful if you have configured Anomaly Detection to monitor a stream of data Anomaly metrics settings can be configured in the Alert edit page Figure 9 Subscriptions You can now manually remove a subscription permanently from the system in New Composer, which was impossible in the old Composer Services New Composer provides assistant scripting tools like static code analysis, string search and replace, etc. Figure 10   How can I switch back to the old Composer when editing an entity? It is easy! As long as you have opened a ThingWorx Entity, you will notice there is a button “Edit in Composer”; it will lead you back to the old Composer, and all edits that have been saved will show up in the old Composer.     A video demonstration of the New Composer is also available now. Feel free to review it here: New Composer Video
View full tip
Welcome to the ThingWorx Manufacturing Apps Community! The ThingWorx Manufacturing Apps are easy to deploy, pre-configured role-based starter apps that are built on PTC’s industry-leading IoT platform, ThingWorx. These Apps provide manufacturers with real-time visibility into operational information, improved decision making, accelerated time to value, and unmatched flexibility to drive factory performance.   This Community page is open to all users-- including licensed ThingWorx users, Express (“freemium”) users, or anyone interested in trying the Apps. Tech Support community advocates serve users on this site, and are here to answer your questions about downloading, installing, and configuring the ThingWorx Manufacturing Apps.     A. Sign up: ThingWorx Manufacturing Apps Community: PTC account credentials are needed to participate in the ThingWorx Community. If you have not yet registered a PTC eSupport account, start with the Basic Account Creation page.   Manufacturing Apps Web portal: Register a login for the ThingWorx Manufacturing Apps web portal, where you can download the free trial and navigate to the additional resources discussed below.     B. Download: Choose a download/packaging option to get started.   i. Express/Freemium Installer (best for users who are new to ThingWorx): If you want to quickly install ThingWorx Manufacturing Apps (including ThingWorx) use the following installer: Download the Express/Freemium Installer   ii. 30-day Developer Kit trial: To experience the capabilities of the ThingWorx Platform with the Manufacturing Apps and create your own Apps: Download the 30-day Developer Kit trial   iii. Import as a ThingWorx Extension (for users with a Manufacturing Apps entitlement-- including ThingWorx commercial customers, PTC employees, and PTC Partners): ThingWorx Manufacturing apps can be imported as ThingWorx extensions into an existing ThingWorx Platform install (v8.1.0). To locate the download, open the PTC Software Download Page and expand the following folders:   ThingWorx Platform | Release 8.x | ThingWorx Manufacturing Apps Extension | Most Recent Datacode     C. Learn After downloading the installer or extensions, begin with Installation and Configuration.   Follow the steps laid out in the ThingWorx Manufacturing Apps Setup and Configuration Guide 8.2   Find helpful getting-started guides and videos available within the 'Get Started' section of the ThingWorx Manufacturing Apps Portal.     D. Customize Once you have successfully downloaded, installed, and configured the Manufacturing Apps, begin to explore the deeper potential of the Apps and the ThingWorx Platform.   Follow along with the discussion and steps contained in the ThingWorx Manufacturing Apps and Service Apps Customization Guide 8.2   Also contained within the 'Get Started' page of the ThingWorx Manufacturing Apps Portal, find the "Evolve and Expand" section, featuring: -Custom Plant Layout application -Custom Asset Advisor application -Global Plant View application -Thingworx Manufacturing Apps Technical Lab with Sigma Tile (Raspberry Pi application) -Configuring the Apps with demo data set and simulator -Additional Advanced Documentation     E. Get help / give feedback / interact Use the ThingWorx Manufacturing Apps Community page as a resource to find documentation, peruse past forum threads, or post a question to start a discussion! For advanced troubleshooting, licensed users are encouraged to submit support tickets to the PTC My eSupport portal.
View full tip
While it is not a requirement, it is a best practice to install KEPServerEX (v6.2 or higher) before installing ThingWorx (v8.0.1 or higher). If ThingWorx is already installed, close the application and complete the install of KEPServerEX by following these install instructions: How do I download and install KEPServerEX? Now, when you attempt to launch ThingWorx, if you are presented with a "null pointer exception" error, follow this workaround: 1. Navigate to the 'PostgreSQL\installer' directory, within the directory where the Manufacturing Apps are installed. By default this will be: <ThingWorx install path>\ThingWorxManufacturingApps\PostgreSQL\installer 2. Run the 'vcredist.exe' located there. This application should re-install the conflicting redistributables, and you should be able to launch ThingWorx again normally.
View full tip
KEPServerEX requires the 32-bit version of Java if you are using the IoT Gateway Plug-in. If you do not have the 32-bit version installed and attempt to connect the IoT Gateway, the KEPServerEX Event Log will report the following error: “IoT Gateway failed to start, 32-bit JRE required." Some of the Manufacturing Applications training content relies on this Plug-in, as well. As a best practice, it is recommended that both the 32-bit and 64-bit versions of Java be installed. This install is available for download from the Oracle website, here: Java SE Runtime Environment 8 - Downloads
View full tip
  You might have seen the Performance Advisor for some of your other favorite PTC products like Creo, Windchill or Integrity.  Good news: it's now also available for ThingWorx!   In case you're not familiar with the Performance Advisor, it is new functionality that allows you to work more closely with the PTC / ThingWorx team to improve your usage of ThingWorx and to improve ThingWorx itself in the areas that matter most to you.      ThingWorx Performance Advisor   delivers information dashboards driven by data on the features, usage and performance of your ThingWorx systems unlocks information that can reduce wasted development and improve design cycles allows comprehensive visibility into software versions in use to manage software upgrade plans simplifies compliance and revenue allocation by monitoring usage enables quick access to system and usage statistics across your organization uses personalized dashboards for viewing, reporting and trend analysis   The Performance Advisor for ThingWorx has just been released, so we want you to share your experience and data to get you and us started on analyzing usage statistics and needs for further features.   The Performance Advisor is easy to connect. It just takes three simple steps and a minute of your time. This will result in improved transparency, improved stability, improved productivity, improved product performance, improved compliance administration and increased administrative efficiency, and it allows the ThingWorx R&D team to continuously improve the platform through the analytical insights from the data collected.   As ThingWorx is growing fast, be sure to participate and actively shape the way you're using ThingWorx and the way that ThingWorx is designed.   With newer versions of ThingWorx, capabilities and benefits of the Performance Advisor will be improved to ensure we're capturing the most accurate information to help you grow your Internet of Things business and scale your solutions to your application's needs and requirements. We're just at the beginning of the journey...   How to enable ThingWorx Performance Advisor   Enabling Metrics Reporting and setting up the Performance Advisor capabilities is described in detail in CS262960. Just follow the steps and: Congratulations!   It's as simple and fast as that - you enabled the ThingWorx Performance Advisor... quite easy, right?   Where can I see the data / metrics I have sent to PTC?   The information can be seen on the Performance Advisor Homepage   Here's what the current views look like - they might change over time, introducing new features and views to maximize the impact and benefit for you.   At a first glance, the basic information of what has been collected can be seen in the Summary.     The Connected System Details show more about which systems are currently connected, with their user counts and number of remote things. The Connected System History shows a historical overview of how those parameters changed over time.   For a more detailed historic overview of all the data being sent, check out the Historical Property Data.     Questions?   For specific questions, check out article CS262967, which holds the FAQs for the Performance Advisor.   If you have specific questions not addressed in the article, you can always comment on this blog post, open a new community thread or open a case with Support Services.   We want your feedback   After enabling metrics collection and reviewing the Performance Advisor dashboards, what do you think? 
What features would you like to see in the future? Is there anything missing that would make your life as a System Administrator easier?   As we're trying to improve functionality over time, make sure your voice is heard as well, and feel free to leave some feedback.
View full tip
There are four types of Analytics. Descriptive: What happened? Descriptive analytics is a preliminary stage of data processing that creates a summary of historical data to yield useful information and possibly prepare the data for further analysis. It uses data aggregation and data mining to provide insight into the past and answer: “What has happened?” Descriptive analysis or statistics does exactly what the name implies: it “describes”, or summarizes, raw data and makes it something that is interpretable by humans. These are analytics that describe the past. The past refers to any point in time that an event has occurred, whether it is one minute ago or one year ago. Descriptive analytics are useful because they allow us to learn from past behaviors and understand how they might influence future outcomes. The vast majority of the statistics we use fall into this category (think basic arithmetic like sums, averages, percent changes). Usually, the underlying data is a count or aggregate of a filtered column of data to which basic math is applied. For all practical purposes, there are an infinite number of these statistics. Descriptive statistics are useful to show things like total stock in inventory, average dollars spent per customer and year-over-year change in sales. Common examples of descriptive analytics are reports that provide historical insights regarding the company’s production, financials, operations, sales, finance, inventory and customers. Note: Use Descriptive Analytics when you need to understand at an aggregate level what is going on in your company, and when you want to summarize and describe different aspects of your business.                                     Different techniques of Descriptive Analytics: Sampling Mean Mode Median Standard Deviation Range and Variance Stem and Leaf Diagram Histogram Quartiles Frequency Distributions Use of Descriptive Analytics in ThingWorx Analytics: Signal Detection: When analyzing volumes of data, it is helpful to know which data is actually useful and which data is just noise. Signals are based on a correlation algorithm that examines historical data to identify the strength of a given input in predicting future outcomes. Signals can identify meaningful correlations within the data. Signals are useful during initial analysis to determine which features you want to curate in a given data set for predictive model generation. For example, knowing the month of the year is more important to accurately predicting tomorrow’s weather than knowing the day of the week. The month has a much stronger signal than the day of the week for this prediction. ThingWorx Analytics reports signal strength as a mutual information (MI) score that represents the probability of predicting the goal variable when a given feature is provided (the standard definition of mutual information is recalled at the end of this post). It can effectively capture non-linear relationships. ThingWorx Analytics evaluates each feature, or combination of features, to identify the top signals. Cluster Analysis: Cluster analysis categorizes data into groups based on similarities relative to a goal variable. Like a clique, objects in a cluster minimize intra-distances (distances within the cluster) while maximizing inter-distances (distances between clusters). Clusters are mutually exclusive, meaning that each record can belong to only one cluster. However, ThingWorx Analytics supports a user-defined cluster hierarchy that can include sub-clusters inside other clusters. 
The higher the number of clusters in the data, the smaller each cluster’s population will be, but the stronger the potential insights can be. How to Access Descriptive Analysis Functionality via ThingWorx Analytics: REST API Service — Using a REST client, you can access the Signals Service and the Clusters Service. Each service includes a series of API endpoints to submit analysis requests, retrieve results, list jobs, and more. This requires installation of the ThingWorx Analytics Server. Analytics Builder — As part of the ThingWorx Analytics Extension, Analytics Builder provides a user interface for interacting with your data. In addition to generating and scoring predictive models in Analytics Builder, you can also run procedures to generate signals. How to avoid mistakes - useful tips for the different techniques of Descriptive Analytics: Crystallize the research problem and make sure it is operable. Read literature on data analysis techniques. Evaluate the various techniques that can do similar things with respect to the research problem. Know what a technique does and what it doesn’t. Consult people, especially your supervisor.
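For reference, the mutual information score mentioned in this post follows the standard information-theoretic definition (shown in LaTeX below). ThingWorx Analytics computes and reports this internally, so the formula is background only, not something you need to implement:

I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \, \log \frac{p(x,y)}{p(x)\, p(y)}

Higher values indicate that knowing feature X removes more uncertainty about the goal variable Y; a value of zero means the feature carries no information about the goal.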
View full tip
This blog addresses a few points related to scoring with ThingWorx Analytics. In particular, it brings a clearer understanding of the concepts behind the values of the scores that are generated when performing a scoring job.   Scoring Outputs:   It is important to note that when training an analytics model, the method is to create a generalizable model from a relatively small training dataset.   By its nature, we expect the training process to see a limited subset and not an exhaustive list of all possible values for many constraints, especially for reasons of time and practicality.   As such, these generalized models will be expected to handle unseen data in the form of new combinations or values outside of previously observed ranges (more on this below).   One common way to see scores that exceed the observed ranges in training, under the assumption that the goals are continuous, is to use prescriptive scoring.   Prescriptive scoring attempts to find optimal values for lever (meaning tunable) features in order to maximize or minimize score values. See the prescriptive scoring documentation and functionality for more information.   min/max constraints: these are constraints that are placed upon the inputs for training and the expected inputs for scoring.   •          For training: If these ranges were provided as part of the upload process, then training will raise exceptions regarding invalid data. However, if the ranges are not provided, they will be inferred from the data and, as such, training will not see values outside of the observed ranges.   •          For scoring: Validation of the ranges will only be performed on the inputs - not the outputs. It is very important to note that the handling of these "constraints" is dependent upon the data type (a small sketch of the behavior follows at the end of this post).   For categorical (e.g. colors) and ordinal data (e.g. shirt sizes), the constraints are strict and data that was not observed in training will raise exceptions during scoring.   However, for continuous values (e.g. temperature ranges) these constraints are more informational in nature. For predictive scoring, our code will accept records with values outside of those ranges.   The rule of thumb is that values slightly outside these ranges are acceptable and that as the values stray farther from the ranges, the accuracy of the model degrades very quickly.   For prescriptive scoring, these constraints are used to determine the acceptable ranges of values to try when determining the optimal values. Values outside of these constraints will NOT be tried.
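The following is a purely illustrative JavaScript sketch of how the two kinds of constraints behave. It is not the ThingWorx Analytics implementation, and the field-definition object (name, opType, observedValues, min, max) is invented for the example:

function checkInputAgainstConstraints(fieldDef, value) {
    if (fieldDef.opType === "CATEGORICAL" || fieldDef.opType === "ORDINAL") {
        // strict: a value never observed in training is rejected at scoring time
        if (fieldDef.observedValues.indexOf(value) === -1) {
            throw new Error("Unseen category '" + value + "' for field " + fieldDef.name);
        }
    } else if (fieldDef.opType === "CONTINUOUS") {
        // informational: out-of-range values are accepted for predictive scoring,
        // but model accuracy degrades quickly the farther they stray from the range
        if (value < fieldDef.min || value > fieldDef.max) {
            logger.warn(fieldDef.name + " value " + value + " is outside the training range ["
                        + fieldDef.min + ", " + fieldDef.max + "]");
        }
    }
    return value;
}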
View full tip
A Feature - a piece of information that is potentially useful for prediction. Any attribute could be a feature, as long as it is useful to the model. Feature engineering – Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data. It’s a loosely agreed-upon space of tasks related to designing feature sets for Machine Learning applications. Components: First, understanding the properties of the task you’re trying to solve and how they might interact with the strengths and limitations of the model you are going to use. Second, experimental work where you test your expectations and find out what actually works and what doesn’t. Feature engineering, as a technique, has three sub-categories of techniques: feature selection, dimension reduction and feature generation. Feature Selection: Sometimes called feature ranking or feature importance, this is the process of ranking the attributes by their value to the predictive ability of a model. Algorithms such as decision trees automatically rank the attributes in the data set. The top few nodes in a decision tree are considered the most important features from a predictive standpoint. As part of a process, feature selection using entropy-based methods like decision trees can be employed to filter out less valuable attributes before feeding the reduced dataset to another modeling algorithm. Regression-type models usually employ methods such as forward selection or backward elimination to select the final set of attributes for a model. For example: Project Development decision-tree:                                                  Dimension Reduction: This is sometimes called feature extraction. The most classic example of dimension reduction is principal component analysis, or PCA. PCA allows us to combine existing attributes into a new data frame consisting of a much reduced number of attributes by utilizing the variance in the data. The attributes which "explain" the highest amount of variance in the data form the first few principal components, and we can ignore the rest of the attributes if data dimensionality is a problem from a computational standpoint. Feature Generation or Feature Construction: Quite simply, this is the process of manually constructing new attributes from raw data. It involves intelligently combining or splitting existing raw attributes into new ones which have a higher predictive power. For example, a date stamp may be used to generate 2 new attributes such as AM and PM, which may be useful in discriminating whether day or night has a higher propensity to influence the response variable. Feature construction is essentially a data transformation process. Tips for Better Feature Engineering Tip 1: Think about inputs you can create by rolling up existing data fields to a higher/broader level or category. As an example, a person’s title can be categorized into strategic or tactical. Those with titles of “VP” and above can be coded as strategic. Those with titles “Director” and below become tactical. Strategic contacts are those that make high-level budgeting and strategic decisions for a company. Tactical are those in the trenches doing day-to-day work.  
Other roll-up examples include: Collating several industries into a higher-level industry: Collate oil and gas companies with utility companies, for instance, and call it the energy industry, or fold high tech and telecommunications industries into a single area called “technology.” Defining “large” companies as those that make $1 billion or more and “small” companies as those that make less than $1 billion.   Tip 2: Think about ways to drill down into more detail in a single field. As an example, a contact within a company may respond to marketing campaigns, and you may have information about his or her number of responses. Drilling down, we can ask how many of these responses occurred in the past two weeks, one to three months, or more than six months in the past. This creates three additional binary (yes=1/no=0) data fields for a model. Other drill-down examples include: Cadence: Number of days between consecutive marketing responses by a contact: 1–7, 8–14, 15–21, 21+ Multiple responses on same day flag (multiple responses = 1, otherwise = 0) Tip 3: Split data into separate categories, also called bins (see the short sketch at the end of this post). For example, annual revenue for companies in your database may range from $50 million (M) to over $1 billion (B). Split the revenue into sequential bins: $50–$200M, $201–$500M, $501M–$1B, and $1B+. Whenever a company falls within a revenue bin it receives a one; otherwise the value is zero. There are now four new data fields created from the annual revenue field. Other examples are: Number of marketing responses by contact: 1–5, 6–10, 10+ Number of employees in company: 1–100, 101–500, 501–1,000, 1,001–5,000, 5,000+ Tip 4: Think about ways to combine existing data fields into new ones. As an example, you may want to create a flag (0/1) that identifies whether someone is a VP or higher and has more than 10 years of experience. Other examples of combining fields include: Title of director or below and in a company with less than 500 employees Public company and located in the Midwestern United States You can even multiply, divide, add, or subtract one data field by another to create a new input. Tip 5: Don’t reinvent the wheel – use variables that others have already fashioned. Tip 6: Think about the problem at hand and be creative. Don’t worry about creating too many variables at first, just let the brainstorming flow.
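Here is a small JavaScript sketch of Tip 3; the field name and cut points are just the example values from the tip, not a fixed rule:

// Turn one continuous field (annual revenue in $M) into four binary bin fields
function revenueBins(annualRevenueMillions) {
    return {
        rev_50_200M:  (annualRevenueMillions >= 50  && annualRevenueMillions <= 200)  ? 1 : 0,
        rev_201_500M: (annualRevenueMillions >  200 && annualRevenueMillions <= 500)  ? 1 : 0,
        rev_501M_1B:  (annualRevenueMillions >  500 && annualRevenueMillions <= 1000) ? 1 : 0,
        rev_1B_plus:  (annualRevenueMillions >  1000) ? 1 : 0
    };
}

// Example: revenueBins(750) -> { rev_50_200M: 0, rev_201_500M: 0, rev_501M_1B: 1, rev_1B_plus: 0 }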
View full tip
Sampling Strategy This blog post will cover the 4 sampling strategies that are available in ThingWorx Analytics. It will tell you how each sampling strategy works behind the scenes, when you may want to use it, and the pros and cons of each strategy. SAMPLE_WITH_REPLACEMENT This strategy is not often used by professionals but still may be useful in certain circumstances. When you sample with replacement, the value that you randomly selected is then returned to the sample pool, so there is a chance that you can have the same record multiple times in your sample. Example: Let’s say you have a hat that contains 3 cards with different people’s names on them: John, Sarah, Tom. Let’s say you make 2 random selections. On the first selection you pull out the name Tom. When you sample with replacement, you would put the name Tom back into the hat and then randomly select a card again. For your second selection, it is possible to get another name like Sarah, or the same one you selected, Tom. Pros: May find improved models in smaller datasets with low row counts. Cons: The accuracy of the model may be artificially inflated due to duplicates in the sample. SAMPLE_WITHOUT_REPLACEMENT This is the default setting in ThingWorx Analytics and the sampling strategy most commonly used by professionals. The way this strategy works is that after a value is randomly selected from the sample pool, it is not returned. This ensures that all the values selected for the sample are unique (a small sketch of this appears at the end of this post). Example: Let’s say you have the same hat that contains 3 cards: John, Sarah, Tom. Let’s say you make 2 random selections. On the first selection you pull out the name Tom. When you sample without replacement, you would randomly select a card from the hat again without putting the Tom card back. For your second selection, you could only get the Sarah or John card. Pros: This is the sampling strategy that is most commonly used, and it will deliver the best results in most cases. Cons: May not be the best choice if the desired goal is underrepresented in the dataset. UPSAMPLE_AND_SAMPLE_WITHOUT_REPLACEMENT This is useful when the desired goal is underrepresented in the dataset. The records that represent the desired outcome of the goal are copied multiple times so they represent a larger share of the total dataset. Example: Let’s say you are trying to discover if a patient is at risk for developing a rare condition, like chronic kidney failure, that affects around 0.5% of the US population. In this case, the most accurate model that would be generated would say that no one will get this condition, and according to the numbers, it would be right 99.5% of the time. But in reality, this is not helpful at all to the use case, since you want to know if the patient is at risk of developing the condition. To avoid this from happening, copies are made of the records where the patient did develop the condition so they represent a larger share of the dataset. Doing this gives ThingWorx Analytics more examples to help it generate a more accurate model. Pros: Patterns from the original dataset remain intact. Cons: Longer training time. DOWNSAMPLE_AND_SAMPLE_WITHOUT_REPLACEMENT This is also useful when the desired goal is underrepresented in the dataset. In downsample and sample without replacement, some records that do not represent the desired goal outcome are removed. This is done to increase the desired records' percentage of the dataset. Example: Let’s continue using the medical example from above.  
Instead of creating copies of the desired records, undesired records are removed from the dataset. This causes the records where patients did develop the condition to occupy a larger percentage of the dataset. Pros: Shorter training time. Cons: Patterns from the original dataset may be lost.
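To make the difference between the strategies concrete, here is a small illustrative JavaScript sketch of sampling without replacement. ThingWorx Analytics does this internally, so treat this only as a conceptual model, not platform code:

function sampleWithoutReplacement(records, sampleSize) {
    var pool = records.slice();              // copy so the original array is untouched
    var sample = [];
    while (sample.length < sampleSize && pool.length > 0) {
        var idx = Math.floor(Math.random() * pool.length);
        sample.push(pool[idx]);
        pool.splice(idx, 1);                 // remove the picked record: no duplicates possible
    }
    return sample;
}

// Sampling WITH replacement would simply skip the splice call,
// so the same record could be drawn more than once.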
View full tip
1. Add a JSON parameter. Example:
{
    "rows":[
        {
            "email":"example1@ptc.com"
        },
        {
            "name":"Qaqa",
            "email":"example2@ptc.com"
        }
    ]
}
2. Create an InfoTable with a DataShape using CreateInfoTableFromDataShape(params).
3. Using a for loop, iterate through each JSON object and add it to the InfoTable using InfoTableName.AddRow(YourRowObjectHere). Example:
var params = {
    infoTableName: "InfoTable",
    dataShapeName: "jsontest"
};
var infotabletest = Resources["InfoTableFunctions"].CreateInfoTableFromDataShape(params);
for (var i = 0; i < json.rows.length; i++) {
    // index into the current row; the [i] was missing in the original snippet
    infotabletest.AddRow({ name: json.rows[i].name, email: json.rows[i].email });
}
View full tip
There are multiple approaches to improve performance: Increase the network bandwidth between the client PC and the ThingWorx server. Reduce unnecessary handover when clients submit requests to the ThingWorx server through the network. Here are suggestions to reduce unnecessary handover between client and server: Eliminate the use of proxy servers between the client and ThingWorx. It is compulsory to download the Combined.version.date.xxx.xxx.js file the first time a mashup page is loaded (TTFB and Content Download time), and loading performance is affected by the use of proxy servers between the client and the ThingWorx server. This is the test result with a proxy server set up. This is the test result after eliminating the proxy server from the same environment. Cut off extensions that are not used in the ThingWorx server. After extensions are installed, the size of Combined.version.date.xxx.xxx.js increases. Avoid HTTP/HTTPS request errors. There is an HTTPS request error when calling Google Maps; it takes more than 20 seconds.
View full tip
Welcome to the Thingworx Community area for code examples and sharing.   We have a few how-to items and basic guidelines for posting content in this space.  The Jive platform that our community runs on provides some tools for posting and highlighting code in the document format that this area is based on.  Please try to follow these settings to make the area easy to use, read and follow.   At the top of your new document please give a brief description of the code sample that you are posting. Use the code formatting tool provided for all parts of code samples (including if there are multiple in one post). Try your best to put comments in the code to describe behavior where needed. You can edit documents, but note each time you save them a new version is created.  You can delete old versions if needed. You may add comments to others' code documents and modify your own code samples based on comments. If you have alternative ways to accomplish the same as an existing code sample please post it to the comments. We encourage everyone to add alternatives noted in comments to the main post/document.   Format code: The double blue arrows allow you to select the type of code being inserted and will do keyword highlighting as well as add line numbers for reference and discussions.
View full tip
1. Create a network and add all Entities that implement a specific ThingShape to the network 2. Create a ThingShape mashup as below Note: Bind the Entity parameter to DynamicThingShapes_TracotrShape's service GetProperties input EntityName. Also bind the mashup's RefreshRequested event to that service 3. Create a mashup named ContentShape, and add a Tree widget and a ContainedMashup widget to it 4. Bind the service GetNetworkConnection's Selected Row(s) result and SelectedRowsChanged event to the ContainedMashup widget Note: A Master can totally replace the ThingShape mashup. It is suggested to use a Master with ThingWorx 6.0 and later
View full tip
Timers and schedulers can be a useful tool in a ThingWorx application.  Their only purpose, of course, is to create events that can be used by the platform to perform any number of tasks.  These can range from requesting data from an edge device, to doing calculations for alerts, to running archive functions for data.  Sounds like a simple enough process.  Then why do most platform performance issues seem to come from these two simple templates? It all has to do with how the event is subscribed to and how the platform needs to process events and subscriptions.  The task of handling most events and their related subscription logic falls to the EventProcessingSubsystem.  You can see the metrics of this via the Monitoring -> Subsystems menu in Composer.  This will show you how many events have been processed and how many events are waiting in the queue to be processed, along with some other settings.  You can often identify issues with Timers and Schedulers here: you will see the number of queued events climb and the number of processed events stagnate. But why?  Shouldn't this multi-threaded processing take care of all of that?  Most times it can easily do this, but when you suddenly flood it with transactions all trying to access the same resources at the same time, it can grind to a halt. This typically occurs when you create a timer/scheduler and subscribe to its event at a template level.  To illustrate this, let's look at an example of what might occur.  In this scenario let's imagine we have 1,000 edge devices that we must pull data from.  We only need to get this information every 5 minutes.  When we retrieve it we must look up some data mapping from a DataTable and store the data in a Stream.  At the 5-minute interval the timer fires its event.  Suddenly, all at once, the EventProcessingSubsystem gets 1,000 events.  This by itself is not a problem, but it will concurrently try to process as many as it can to be efficient.  So we now have multiple transactions all trying to query a single DataTable at once.  In order to read this table the database (no matter which back-end persistence provider) will lock parts or all of the table (depending on the query).  As you can probably guess, things begin to slow down because one transaction holds the lock while many others are trying to acquire one.  This happens over and over until all 1,000 transactions are complete.  In the meantime we are also executing other commands in the subscription and writing Stream entries to the same database inside the same transactions.  Additionally, remember all of these transactions and the data they access must be held in memory while they are running.  You will also see a memory spike, and depending on resources you can run into a problem here as well. Regular events can easily be part of any use case, so how would that work?  The trick to know here comes in two parts.  First, any event a Thing raises can be subscribed to on that same Thing.  When you do this the subscription transaction does not go into the EventProcessingSubsystem.  It will execute on the threads already open in memory for that Thing.  So subscribing to a timer event on the Timer Thing that raised the event will not flood the subsystem. In the previous example, how would you go about polling all of these Things?  Simple: you take the exact logic you would have executed on the template subscription and move it to the timer subscription (see the sketch at the end of this post).  
To keep the context of the Thing, use the GetImplementingThings service of the template to retrieve the list of all 1,000 Things created based on it.  Then loop through these Things and execute the logic.  This also means that all of the DataTable queries and logic will be executed sequentially, so the database locking issue goes away as well.  Memory issues decrease also, because the memory allocated for the queries is either reused or can be cleaned during garbage collection, since the variable that held the result is reassigned on each loop. Overall it is best not to use Timers and Schedulers whenever possible.  Use data-triggered events, UI interactions or REST API calls to initiate transactions whenever possible.  It lowers the overall risk of flooding the system with resource demands, from processor, to memory, to threads, to database.  Sometimes, though, they are needed.  Follow the basic guidance laid out here and things should run smoothly!
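As a rough sketch of this pattern (the template name, DataTable name and the PollAndStoreData service are hypothetical), the subscription placed on the Timer Thing itself could look like this:

// Subscription on the Timer Thing (not on the template)
var implementingThings = ThingTemplates["EdgeDeviceTemplate"].GetImplementingThings();

// Query the shared DataTable once, outside the loop
var mappings = Things["DeviceMappingTable"].GetDataTableEntries({ maxItems: 100000 });

for (var i = 0; i < implementingThings.rows.length; i++) {
    var thingName = implementingThings.rows[i].name;
    // the logic runs sequentially per Thing, so there is no flood of parallel transactions
    Things[thingName].PollAndStoreData({ mappings: mappings });
}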
View full tip
Concepts of Anomaly Detection used in ThingWatcher ThingWatcher is based on anomaly detection with the normal distribution. What does that mean? Normally distributed metrics follow a set of probabilistic rules. Upcoming values that follow those rules are recognized as being “normal” or “usual”, whereas values that break those rules are recognized as being unusual. What is a normal distribution? A normal distribution is a very common probability distribution. In real life, the normal distribution approximates many natural phenomena. A data set is known as “normally distributed” when most of the data aggregates around its mean in a symmetric way. Also, its extreme values get less and less likely to appear. Example: When a factory is making 1 kg sugar bags it doesn’t always produce exactly 1 kg. In reality, it is around 1 kg: most of the time very close to 1 kg and very rarely far from 1 kg. Indeed, the production of a 1 kg sugar bag follows a normal distribution. Mathematical rules: When a metric appears to be normally distributed it follows some interesting laws, as does the sugar bag example. The mean and the median are the same; both are equal to 1000. This is because of the perfectly symmetric “bell shape”. It is the standard deviation, called sigma σ, that defines how the normal distribution is spread around the mean. In this example σ = 20. 68% of all values fall between [mean-σ; mean+σ]; for the sugar bag: [980; 1020]. 95% of all values fall between [mean-2*σ; mean+2*σ]; for the sugar bag: [960; 1040]. 99.7% of all values fall between [mean-3*σ; mean+3*σ]; for the sugar bag: [940; 1060]. The last 3 rules are also known as the 68–95–99.7 rule, also called the three-sigma rule of thumb. When the rules get broken: it’s an anomaly. As previously stated, when a system has been proven normally distributed, it follows a set of rules. Those rules become the model representing the normal behavior of the metric. Under normal conditions, upcoming values will match the normal distribution and the model will be followed. But what happens when the rules get broken? This is when things turn different, as something unusual is happening. In theory, in a normal distribution, no values are impossible. If the weights of the bags of sugar were truly normally distributed, we would probably find a bag of sugar of 860 g every billion products. In reality, we approximate this sugar bag example as normally distributed, and almost impossible values are approximated as impossible. Techniques of Anomaly Detection. Technique n°1: outlier value. An almost impossible value could be considered an anomaly. When the value deviates too much from the mean, let’s say by ± 4σ, then we can consider this almost impossible value as an anomaly. (This limit can also be calculated using a percentile.) Sugar bags that weigh less than 920 g or more than 1080 g are considered anomalous. Chances are, there is a problem in the production chain. This provides a simple way to define maximum and minimum thresholds. Technique n°2: detecting change in the normal distribution. Technique n°2 can detect an unusual distribution fast, using only some points, but it can’t detect anomalies that move from one sigma σ to another in a usual manner. To detect this kind of anomaly we use a “window” of the n last elements. If the mean and standard deviation of this window change too much from the usual ones, then we can deduce an anomaly (a simplified sketch of both techniques follows at the end of this post). Using a big window with a lot of values is more stable, but it requires more time to detect the anomaly. 
The bigger the window is the more stable it becomes. But it would require more time to detect the anomaly as it needs to aggregate more values for the detection.
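A simplified JavaScript sketch of the two techniques above follows. The 4σ outlier limit matches the example in the post, while the window cut-offs are illustrative only and would be tuned per application:

// Technique 1: flag a single value that strays more than 4 sigma from the mean
function isOutlier(value, mean, sigma) {
    return Math.abs(value - mean) > 4 * sigma;
}
// isOutlier(860, 1000, 20) -> true for the sugar bag example

// Technique 2: compare a window's own mean and sigma to the usual (baseline) ones
function windowLooksAnomalous(windowValues, baselineMean, baselineSigma) {
    var n = windowValues.length;
    var sum = 0;
    for (var i = 0; i < n; i++) { sum += windowValues[i]; }
    var mean = sum / n;
    var varSum = 0;
    for (var j = 0; j < n; j++) { varSum += Math.pow(windowValues[j] - mean, 2); }
    var sigma = Math.sqrt(varSum / n);
    // "changes too much" is application specific; these cut-offs are only an example
    return Math.abs(mean - baselineMean) > baselineSigma || sigma > 2 * baselineSigma;
}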
View full tip
In this particular scenario, the server is experiencing a severe performance drop. The first thing to check is the overall state of the server -- CPU consumption, memory, disk I/O. Not seeing anything unusual there, the second step is to check the ThingWorx condition through the status tool available with the Tomcat manager. Per the observation: Despite 12 GB of memory being allocated, only 1 GB is in use. A large number of threads currently running on the server are experiencing long run times (up to 35 minutes). Checking the Tomcat configuration didn't show any errors or potential causes of the problem, thus moving on to the second bullet: the threads need to be analyzed. One thread had been running for 200,936 milliseconds -- more than 3 minutes for an operation that should take less than a second. Also, it was noted that there were 93 busy threads running concurrently. Cause: Concurrency in writing session variable values to the server. The threads are kept alive and block the system. Tracing the issue back to a piece of code in a service recently included in the application, the problem was solved by adding an IF condition so that the session variable values are updated only when needed (a hypothetical sketch of this kind of guard follows below). As a result, the update only happens once per shift. Conclusion: Using Tomcat to view mashup-generated threads helped identify the service involved in the root cause. The modification required to resolve it was a small code change to reduce the frequency of the session variable update.
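Since the actual service code is not shown in this post, the following is only a hypothetical sketch of the kind of guard described; the helper names and the shift-detection mechanism are invented for illustration:

// Only perform the expensive session variable write when the shift actually changes
var currentShift = me.GetCurrentShift();                         // assumed helper returning the active shift
if (me.lastWrittenShift !== currentShift) {
    me.WriteShiftToSessionVariables({ shift: currentShift });   // assumed helper doing the write
    me.lastWrittenShift = currentShift;                          // remember what was written so we skip it next time
}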
View full tip
Hi everybody, In this blog post I want to share my local ThingWorx installation with you, with some optimizations that I did for local development. - Use -XX:+UseConcMarkSweepGC. This uses the older garbage collector from the JVM, instead of the newer G1GC recommended by the ThingWorx Installation Guide since version 7.2. The advantage of ConcMarkSweepGC is that the startup time is faster and the total memory footprint of Tomcat is far lower than with G1GC. - Use -agentlib:jdwp=transport=dt_socket,address=1049,server=y,suspend=n. This allows using your Java IDE of choice to connect directly to the Tomcat server, then debugging your Extension code, or even the ThingWorx code using the Eclipse Class Decompilers, for example. Please change 1049 to your port of choice for exposing the server debugging port. - Use -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=60000 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false. This sets up the server to allow JMX monitoring. I usually use VisualVM from the JDK bin folder, but you can use any JMX monitoring tool. This uses no authentication, no SSL, and port 60000 - modify if you need to. I usually start Tomcat manually from a folder via startup.bat, and the setenv.bat looks like:

set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_102
set JRE_HOME=C:\Program Files\Java\jdk1.8.0_102
REM this is where the platform-settings.json file is located
set THINGWORX_PLATFORM_SETTINGS=D:\Work\servers\apache-tomcat-8.0.33
set CATALINA_OPTS=-d64 -XX:+UseNUMA -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8 -agentlib:jdwp=transport=dt_socket,address=1049,server=y,suspend=n -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=60000 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false

In this mode I can look at any errors in almost real time from the console, and it makes killing the server for a Java Extension reload a breeze -> Ctrl+C. Please don't hesitate to provide feedback on this document, I certainly welcome it. Be warned: THESE ARE NOT PRODUCTION SETTINGS. Best regards, Vladimir
View full tip