Customers and partners often ask when they should use each of the tools available for calculating descriptive analytics within ThingWorx applications. The platform includes two different approaches for implementing many common statistical calculations on the data for a property: descriptive services and property transforms. Both tools are easy to implement and orchestrate as part of a ThingWorx application. However, they are intended for different scenarios and also differ in how they use compute resources. When choosing between the two approaches, it is important to consider the specific use case being implemented, along with how the chosen approach will fit into the overall design and architecture of the ThingWorx environment. This article provides guidance on when to use each approach in ThingWorx applications and what to consider with each one.
Let's look at the two different approaches and some guidelines for when they should be used.
Descriptive services provide a set of ThingWorx services to analyze a data set and perform many common data transformations. These services are intended for performing calculations and transformations on the recent operating history of a single property. Descriptive services are called on demand to perform batch calculations.
Scenarios to use descriptive services:
On-demand calculations performed within a mashup, a service call, or an event to determine an action, where the calculation results are not (always) stored
Regularly occurring calculations on logged property values or generated datasets (batch calculations)
Calculations are done at regular intervals of minutes, hours, or days on a discrete set of data. Examples: average value in the last hour, median value in the last day, or max value in the last half hour.
Time between data creation and analysis is minutes or hours. Some latency in the calculation result is acceptable for the use case.
The input data set has tens to thousands of values. Keep the size of the input data at 10,800 values or fewer. If larger data sizes are required, break them into micro batches if possible, or use other tools to handle the processing.
Multiple calculations need to be done from the same set of input data. Examples: average value in last hour, max value in the last hour and standard deviation value in the last hour are all required to be calculated.
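As an illustration of the last scenario, the sketch below computes several statistics from one input data set that is prepared once and then reused. It is plain JavaScript and does not call the descriptive services themselves; the `{timestamp, value}` row shape, the `summarize` helper, and the choice of the sample standard deviation are all assumptions made for this example.

```javascript
// Hypothetical helper: compute several statistics from one array of rows.
// The {timestamp, value} row shape is an assumption for illustration only.
function summarize(rows) {
  if (rows.length === 0) {
    throw new Error("summarize requires at least one row");
  }
  const values = rows.map((row) => row.value);
  const count = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / count;
  const max = values.reduce((m, v) => (v > m ? v : m), values[0]);
  // Sample standard deviation (divides by count - 1); 0 for a single value.
  const squaredDiffs = values.reduce(
    (sum, v) => sum + (v - mean) * (v - mean),
    0
  );
  const stdDev = count > 1 ? Math.sqrt(squaredDiffs / (count - 1)) : 0;
  return { count, mean, max, stdDev };
}

// A small example data set, prepared once and reused for all three results.
const recentRows = [
  { timestamp: 0, value: 2 },
  { timestamp: 1, value: 4 },
  { timestamp: 2, value: 4 },
  { timestamp: 3, value: 6 },
];
const stats = summarize(recentRows);
console.log(stats.mean, stats.max, stats.stdDev);
```

The point of the sketch is that the (comparatively expensive) step of preparing the input happens once, after which each additional statistic is cheap to derive from the same rows.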
Things to consider when using descriptive services
Requires input dataset to be in the specific datashape format that is used by descriptive services.
If property values are logged in a value stream, there is a service to query the values and prepare the dataset for processing.
In scenarios where the data does not come from a logged property, another service or SQL query can be used to prepare the dataset for processing.
Requires JavaScript development work to implement. This includes creating a service to execute the descriptive services and using subscriptions and events to orchestrate calculations. An example of the JavaScript to execute descriptive services is available in the help center.
Typically, retrieval of the input data from the value stream (QueryTimedValuesForProperty) is the slowest part of the process. The input data is sent to an out-of-process platform analytics service for all calculations.
Broader set of calculation services available (see table at the end of this article)
Remember that these services are not meant to be used for big data calculations or big data preparation. Look for other approaches if the input data sets grow larger than 10,800 values.
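One possible way to stay within the 10,800-value guideline is to split a larger input into micro batches and merge the partial results. The sketch below is plain JavaScript with hypothetical helper names (`toMicroBatches`, `combinedMaxAndMean`), and in a real application each batch would be handled by its own service call. Note that this only works for statistics that can be merged from partial results, such as max, or mean via a running sum and count; a median generally cannot be combined from per-batch medians.

```javascript
// Guideline from the article: keep each input data set at 10,800 values or fewer.
const MAX_BATCH_SIZE = 10800;

// Hypothetical helper: split an array of values into micro batches.
function toMicroBatches(values, batchSize = MAX_BATCH_SIZE) {
  const batches = [];
  for (let start = 0; start < values.length; start += batchSize) {
    batches.push(values.slice(start, start + batchSize));
  }
  return batches;
}

// Merge per-batch partial results into an overall max and mean.
function combinedMaxAndMean(values) {
  let max = -Infinity;
  let sum = 0;
  let count = 0;
  for (const batch of toMicroBatches(values)) {
    // Stand-in for one batch calculation; each batch is processed separately.
    for (const v of batch) {
      if (v > max) max = v;
      sum += v;
    }
    count += batch.length;
  }
  return { max, mean: count > 0 ? sum / count : NaN };
}

// 25,000 example values are split into batches of 10,800, 10,800 and 3,400.
const manyValues = Array.from({ length: 25000 }, (_, i) => i % 100);
const batchSizes = toMicroBatches(manyValues).map((batch) => batch.length);
const combined = combinedMaxAndMean(manyValues);
console.log(batchSizes, combined.max, combined.mean);
```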
Property transforms provide a set of transformation services for streaming data as it enters ThingWorx. They are intended for performing continuous calculations on recent values in the stream of a single property and delivering results in (near) real time. Since property transforms are continuous calculations, they are always running and using compute resources. Before implementing property transforms, review the Property Transform Sizing Guide to better understand the factors that affect how property transforms scale.
Scenarios to use property transforms:
Continuous calculations on a stream for a single property as new data comes into ThingWorx
New values enter the stream faster than one value per minute (as a general guideline)
Calculations required to be done in seconds or minutes. Examples: average electrical current in last 10 seconds, median pressure in the last 10 readings, or max torque in last minute
Time between data creation and analysis is small (seconds). The results of the property transform are required for rapid decisions and action, so reducing latency is critical.
Data sets used for calculation are small and contain 10s to 100s of values.
Calculated results are stored in a new property in the ThingModel
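To make the idea of a continuous calculation over recent values more concrete, the sketch below keeps a rolling window of the last N readings and produces a new maximum every time a value arrives. It is a conceptual model in plain JavaScript only, not a description of how ThingWorx implements property transforms (which are configured rather than coded); the `RollingMax` class is hypothetical.

```javascript
// Conceptual model of a streaming calculation over a lookback of N readings.
// Hypothetical class; ThingWorx property transforms are configured, not coded.
class RollingMax {
  constructor(windowSize) {
    if (!Number.isInteger(windowSize) || windowSize < 1) {
      throw new Error("windowSize must be a positive integer");
    }
    this.windowSize = windowSize;
    this.window = [];
  }

  // Accept one new reading and return the max of the current window.
  push(value) {
    this.window.push(value);
    if (this.window.length > this.windowSize) {
      this.window.shift(); // drop the oldest reading
    }
    return Math.max(...this.window);
  }
}

const maxTorque = new RollingMax(3);
const outputs = [5, 9, 4, 3, 2].map((reading) => maxTorque.push(reading));
console.log(outputs); // [ 5, 9, 9, 9, 4 ]
```

The window has to stay in memory for as long as the calculation is active, which is a simple way to picture why a continuously running calculation keeps consuming compute resources even when nobody is looking at the result.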
Things to consider when using property transforms
Codeless process to create new property transforms on a single property in the ThingModel
Does not require input property values to be logged as calculations are performed on streaming data as it enters ThingWorx
Unlike descriptive services which only execute when called, each property transform creates a continuously running job that will always be using compute resources. Resource allocations for property transforms must be included in the overall system architecture. Before selecting the property transform approach, refer to the Property Transform Sizing Guide for more information about how different parameters affect the performance of Property Transforms and results of performance load test scenarios.
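The guidelines in the two sections above can be condensed into a rough rule of thumb. The sketch below is one hedged way to encode them in plain JavaScript; the `suggestApproach` function and its field names are hypothetical, and the 1,000-value cutoff for a "small" streaming window is an assumption standing in for "tens to hundreds of values". It is a starting point for discussion, not a substitute for the sizing guide.

```javascript
// Thresholds: 10,800 comes from the article; 1,000 is an assumed cutoff
// for what counts as a small streaming window (tens to hundreds of values).
const MAX_DESCRIPTIVE_INPUT = 10800;
const SMALL_WINDOW_LIMIT = 1000;

// Hypothetical rule of thumb that mirrors the guidelines above.
function suggestApproach(useCase) {
  const { onDemand, inputLogged, needsImmediateResult, valuesPerWindow } = useCase;
  if (valuesPerWindow > MAX_DESCRIPTIVE_INPUT) {
    return "micro batches or other tools";
  }
  if (onDemand) {
    return "descriptive services"; // no fixed schedule, user picks the data
  }
  if (!inputLogged) {
    return "property transforms"; // values are only available as they stream in
  }
  if (needsImmediateResult && valuesPerWindow < SMALL_WINDOW_LIMIT) {
    return "property transforms"; // low latency on a small rolling window
  }
  return "descriptive services"; // batch on logged data, resources used only when called
}

console.log(
  suggestApproach({
    onDemand: false,
    inputLogged: true,
    needsImmediateResult: false,
    valuesPerWindow: 3600, // e.g. one hour of one-per-second readings
  })
); // descriptive services
```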
Let’s apply these guidelines to a few different use cases to determine which approach to select.
1. Mashup application that allows users to calculate and view median temperature over a selected time window
In this scenario, the calculation will be executed on demand with a user-defined time window. Descriptive services are the only option here, since there is no predefined schedule and the user can select which data to use for the calculation.
2. Calculate the max torque (readings arriving one per second) on a press over each minute without storing all of the individual readings.
In this scenario, the calculation will be executed without storing the individual readings coming from the machine. The data is transformed on its way into ThingWorx, and the result is continuously recalculated as new values arrive. Property transforms are the only option here, since the individual values are not being stored.
3. Calculation of average pressure value (readings arriving one per second) over a five-minute window to monitor conditions and raise an alert when the average value is more than two standard deviations from expected.
In this scenario, both descriptive services and property transforms can perform the required calculation. The calculation occurs every five minutes, and each data set will have about 300 values. The choice between batch (descriptive services) and streaming (property transforms) will likely be determined by how the result is used. In this case, the result is used to raise an alert for a specific five-minute window, which will likely require immediate action. Since the alert needs to be raised as soon as possible, property transforms are the best option (although descriptive services can also handle this case, and with lower compute resource requirements).
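The alert condition in this use case can be sketched as a simple comparison. The article does not say where the expected value and the standard deviation come from, so in the plain JavaScript below they are treated as inputs supplied by the application; the `isOutOfBounds` function and the example numbers are hypothetical.

```javascript
// Hypothetical check: is the window result more than `limit` standard
// deviations away from the expected value? The expected value and standard
// deviation are assumed to be supplied by the application.
function isOutOfBounds(windowValue, expected, stdDev, limit = 2) {
  return Math.abs(windowValue - expected) > limit * stdDev;
}

// Example numbers only: expected pressure 100 with a standard deviation of 1.5.
console.log(isOutOfBounds(102.5, 100, 1.5)); // false (2.5 is within 3.0)
console.log(isOutOfBounds(103.5, 100, 1.5)); // true (3.5 exceeds 3.0)
```

The same check applies whichever approach produces the window result; what differs between the two approaches is how soon after the window closes the result is available to evaluate.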
4. Calculation of median temperature (readings every 20 seconds) over a 48-hour period to use as input to predict error conditions in the next week.
In this scenario, the calculation will be performed relatively infrequently (once every 48 hours) over a larger data set (about 8,640 values). Descriptive services are the best option in this case due to the data size and calculation frequency. If property transforms were used, compute resources would be tied up holding all of the incoming values in memory for an extended period before performing a calculation. With descriptive services, the calculation will only consume resources when needed, that is, once every 48 hours.
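The data set sizes quoted in use cases 3 and 4 follow from dividing the window length by the time between readings. The sketch below shows that arithmetic in plain JavaScript and compares the result with the 10,800-value guideline; the `expectedValueCount` helper is hypothetical and assumes readings arrive at a steady rate.

```javascript
// Guideline from the article for descriptive services input size.
const MAX_INPUT_VALUES = 10800;

// Hypothetical helper: approximate number of readings in a time window,
// assuming readings arrive at a steady interval.
function expectedValueCount(windowSeconds, secondsBetweenReadings) {
  return Math.floor(windowSeconds / secondsBetweenReadings);
}

const fiveMinuteCount = expectedValueCount(5 * 60, 1); // use case 3
const twoDayCount = expectedValueCount(48 * 60 * 60, 20); // use case 4

console.log(fiveMinuteCount, fiveMinuteCount <= MAX_INPUT_VALUES); // 300 true
console.log(twoDayCount, twoDayCount <= MAX_INPUT_VALUES); // 8640 true
```

For scale, 10,800 values is what three hours of one-per-second readings would produce, so a one-per-second property would exceed the guideline well before a 48-hour window is reached.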
Hopefully the information above provides more insight and guidelines to help choose between property transforms and descriptive services. The table below provides some additional comparisons between the two approaches.
Purpose
Descriptive Services: Provide a set of ThingWorx services to analyze a data set and perform many common data transformations.
Property Transforms: Provide a set of prescribed transformation services for streaming data as it enters ThingWorx.

Processing Mode
Descriptive Services: Batch
Property Transforms: Streaming / Continuous

Delivery
Descriptive Services: API / Service
Property Transforms: Composer interface; API / Service

Input Data
Descriptive Services: Discrete data set; must be logged; single property; configurable by time or lookback
Property Transforms: Rolling data set on property X; persistence is optional; single property; configurable by time or lookback

Output Data
Descriptive Services: Return object handled programmatically; single output for discrete data set
Property Transforms: New property f_X in the input model; continuous output at configurable frequency; output time aligned with input data

Available Services
Descriptive Services: Statistics (min, max, mean, median, mode, std deviation); SPC calculations (number of continuous data points: above threshold, in / out of range, increasing / decreasing, alternating); data distribution: count by bins (histogram); five numbers (min, lower quartile, median, upper quartile, max); confidence interval; sampling frequency; frequency transform (FFT)
Property Transforms: Statistics (min, max, mean, median, mode, std deviation); SPC calculations (number of continuous data points: above threshold, in / out of range, increasing / decreasing, alternating)