Discover how we embed security throughout the entire lifecycle of the ThingWorx platform in our latest “ThingWorx on Air” episode!
Hear Walter walk through how the ThingWorx platform is secured from end to end. Walter breaks it down into three simple parts: secure design, secure coding practices and continuous security improvements via our maintenance releases.
Listen to Episode 07 to hear the steps we’re taking in each of these areas and how security is at the forefront of what we do.
Finally, Walter mentions the Secure Deployment Hub, our brand-new set of resources to help you securely deploy your ThingWorx apps. Check out my last tech tip to learn more.
As always, stay connected,
Monitoring ThingWorx performance is crucially important, both during the load testing of a newly completed application, and after the deployment of new code in an existing application. Monitoring performance ensures that everything works as expected at the Enterprise level. This tutorial steps you through configuring and installing a tool authored by EDC team member Desheng Xu ( @xudesheng ), which runs on the same network as the ThingWorx instance. This tool collects data from the Platform and translates it into something visual and easy to understand via Grafana.
Attached to this blog post is a file containing the tool, called "tsample". It is small and customizable, and it plays a similar role to telegraf. Its focus is on gathering ThingWorx performance metrics. Historically, this tool also supported collecting OS-level performance metrics, but that is no longer supported; it is highly recommended to collect OS-level metrics using telegraf, a tool designed specifically for that purpose (and not discussed here). This is not the only way to monitor ThingWorx performance, but this tool uses an approach that has been proven effective both at customer sites and internally by PTC to monitor scale tests.
Recommended Deployment Architecture
tsample can be deployed on the same box where ThingWorx Tomcat is running, but it's recommended to deploy it on a separate box to minimize any performance impact caused by the collector. tsample supports exporting to InfluxDB and/or a local file. In this document, it is assumed that InfluxDB will be used for monitoring purposes. Please note that this is not the same InfluxDB instance used by ThingWorx (if configured). This article will not cover setting up InfluxDB or NGINX (if necessary), so please configure these before beginning this tutorial.
tsample has been tested on Windows 2016, macOS 10.15, Ubuntu 16.04, and Red Hat 7.x. It is expected to work on other recent Ubuntu/Red Hat/macOS/Windows releases as well. Please leave a comment or contact the author, @xudesheng, if Raspberry Pi support is needed.
Where to Store the Configuration File
tsample will pick up the configuration file in the following sequence:
from the command line... ./tsample -c <path to configuration file>
from the environment...
Linux: export TSAMPLE_CONFIG=<path to configuration file>
Windows: set TSAMPLE_CONFIG=<path to configuration file>
from a default location... tsample will try to find a file with the name "config.toml" in the same folder from which it starts.
How to Craft a Configuration File
You can use either of the following commands to generate a sample file:
./tsample -c config.toml -e
./tsample -c config.toml --export
A file with the name "config.toml" will be generated with a sample configuration. You can then adjust its content as described in the following sections.
Configuration File Content
The configuration file must be in TOML format.
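As a rough illustration only, a minimal configuration might look like the sketch below. Only sampling_cycle_inseconds and the section names discussed in this article come from the tool itself; every other key (host, port, app_key, and the metric fields) is a hypothetical placeholder — treat the file generated by --export as the authoritative template.

```toml
title = "tsample configuration"              # optional

[owner]                                      # optional
name = "EDC team"

[TestMachine]                                # required: defines where the tool runs
sampling_cycle_inseconds = 30

[[thingworx_servers]]                        # one block per ThingWorx server to monitor
host = "thingworx.example.com"               # hypothetical
port = 8443                                  # hypothetical
app_key = "xxxxxxxx-xxxx-xxxx"               # hypothetical

[[thingworx_servers.metrics]]                # one block per metric to collect
name = "ValueStreamProcessingSubsystem"      # measurement name later seen in Grafana
```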
title and owner sections: Both sections are optional. They are intended to support documentation tooling in the future.
TestMachine section: This section is required, and it defines where this tool will run.
thingworx_servers sections: This is where you define the target ThingWorx applications. Multiple ThingWorx servers can be defined, with the same or different metrics to be collected.
thingworx_servers.metrics sections: Underneath each thingworx_servers section, there are several metrics. In the default example, the following metrics are included:
You can add your own customized metrics, as long as the result follows the same Data Shape. The default Data Shape has 3 columns:
If the output Data Shape exceeds these three columns, the tool will likely not work properly.
This section defines the target InfluxDB as a sink of collected performance metrics.
This section defines the target file storage for collected performance metrics.
Grafana Configuration Example
Monitor Value Stream
Step 1. Connect Grafana to InfluxDB
Step 2. Create a New Dashboard
Step 3. Create a New Query
Depending on which metrics you defined to collect in the tsample configuration file, you will see a different choice of measurement in Grafana. Here, we will use ValueStreamProcessingSubsystem as an example.
Step 4. Choose the Right Platform and Storage Provider
Some metrics depend on the database storage provider, like Value Stream and Stream.
Step 5. Choose the Metrics Figures
Select "remove" to get rid of the default 'mean' calculation.
Select "non_negative_difference" from Transformations. Using this transformation, Grafana can show us the speed of writes.
Then, remove the default GROUP BY "time" clause.
Assign a meaningful alias of this query.
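For reference, the query that Grafana builds in this step is roughly equivalent to the following InfluxQL; the field name totalWritesPerformed and the alias are assumptions — substitute whichever field your collected measurement actually contains.

```sql
SELECT non_negative_difference("totalWritesPerformed") AS "Value Stream Written Speed"
FROM "ValueStreamProcessingSubsystem"
WHERE $timeFilter
```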
Step 6. Add Another Query
You can add another query as 'Value Stream Queued Speed' by following the same steps.
Step 7. Assign a Panel Title
Step 8. Review the Result
Let's go back to the dashboard page and select "last 15 minutes" or "last 5 minutes" from the top right corner. It should show a result similar to the chart below.
Step 9. Save the Dashboard
Don't forget to save your dashboard before we add more panels.
Step 10. Refine the Panel
It's difficult to figure out the high-level write speed from the above panel, so let's enhance it.
Add a new query with the following configuration:
In the above query, there are two additional figures: 20s and 1m. How do you choose them? The 20s value should be the same as sampling_cycle_inseconds in your tsample configuration file; if you choose a different value, you could end up with misleading results.
Larger values such as "1m" may give you a smoother result, but they could also hide system instability. Going larger than 1m is not recommended in most cases.
With this new query, it's much easier to see the average write speed during the current test.
Tip: if your sampling_cycle_inseconds is 30s, you may not need this additional query. The following image is a sample at a 30s interval; no additional averaging query is needed to get a smooth write speed.
The next example is a sample at the 10s interval time. Without additional queries, you may not be able to get a meaningful understanding of the write speed.
Based on the three examples above, it's recommended to configure the sampling interval at 30s, or anything larger than 20s. You can then decide whether you need additional queries based on the visualization result.
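The trade-off between the sampling interval and the averaging window can be illustrated with a small, standalone Python sketch (not part of tsample; the counter values are made up):

```python
# Sketch: how Grafana turns a cumulative counter (e.g. total writes performed)
# into a write speed, and how a larger averaging window smooths the result.

def non_negative_difference(samples):
    """Per-interval deltas of a cumulative counter; negative deltas
    (e.g. after a counter reset) are dropped, as Grafana does."""
    diffs = []
    for prev, cur in zip(samples, samples[1:]):
        d = cur - prev
        if d >= 0:
            diffs.append(d)
    return diffs

def smoothed_rate(samples, interval_s, window_s):
    """Average rate (units/second) per window; window_s should be a multiple
    of the sampling interval, otherwise the result is misleading."""
    per_window = window_s // interval_s
    diffs = non_negative_difference(samples)
    return [sum(diffs[i:i + per_window]) / window_s
            for i in range(0, len(diffs), per_window)]

# Counter sampled every 20 s (matching sampling_cycle_inseconds = 20):
counter = [0, 400, 900, 1200, 1200, 2000, 2600]

print(non_negative_difference(counter))   # raw per-20s deltas (spiky)
print(smoothed_rate(counter, 20, 60))     # writes/second averaged per minute
```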
Step 11. Further Refinement
The above charts illustrate the queuing and writing speed. However, it is possible that the Value Stream may perform at a reasonable speed, but the Value Stream queue may be growing and could exceed its capacity. Let's add another query to monitor this:
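A sketch of such a queue-size query in InfluxQL follows; the field name queueSize is an assumption — use whichever field your measurement actually reports:

```sql
SELECT max("queueSize") AS "Value Stream Queue Size"
FROM "ValueStreamProcessingSubsystem"
WHERE $timeFilter
GROUP BY time($__interval)
```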
However, it is difficult to read this chart, since it has a different value range on the y-axis:
Let's move this query to a second y-axis on the right:
This will make the view much easier to see:
The current queue size or remaining queue size will always move up and down; it is healthy as long as it does not continue to grow to a high level.
What Else Can Be Monitored?
The following metrics would be monitored by most customers:
Value Stream write and queue speed
Value Stream queue size
Stream write and queue speed
Stream queue size
Event performed speed (completedTaskCount)
Event submitted speed (submittedTaskCount)
Event queue size
ThingWorx Memory Usage Monitoring
Create a new panel and add a new query:
In a running system, memory usage will always move up and down - at times sharply or quickly - when the system is busy. The system is healthy as long as memory doesn't go up continuously or stay at a maximum for a long period of time.
Setting up monitoring is absolutely crucial to managing the performance of an enterprise ThingWorx application. Using Grafana makes tracking and visualizing the performance much easier. Stay tuned to the EDC tag for more monitoring tips to come!
We’re actively working towards the ThingWorx 9.0 release and we’re ready to provide a sneak peek into the biggest feature of 9.0: Active-Active Clustering for High Availability configuration.
You may be wondering: Doesn’t ThingWorx already offer High Availability?
Yes, ThingWorx already supports a High Availability configuration. Previous versions of ThingWorx, such as the 8.X releases, support an Active-Passive configuration, in which one "active" ThingWorx server performs all processing and maintains live connections to other systems such as databases and connected assets. In parallel, a second "passive" ThingWorx server is kept as a mirror image and regularly updated with data, but it does not maintain active connections to any of the other systems. If the active server fails, the passive server is promoted to primary, but it can take a few minutes to establish connections to the other systems.
So, how is ThingWorx Active-Active different?
Active-Active configuration differs from Active-Passive in that all the ThingWorx servers in the cluster are “active.” Not only is data mirrored across all ThingWorx servers, but all of the servers, instead of only one, maintain live connections with the other systems. This way, if any of the ThingWorx servers fail, the other ThingWorx servers take over instantaneously with no recovery time.
Since all ThingWorx servers are active, they process in parallel and, as a result, the cluster can process more data than a single server or a cluster with an Active-Passive configuration. Simply put, multiple servers working together outperform a single server. This allows customers to scale their deployment simply by adding more ThingWorx servers to the cluster (horizontal scaling), which avoids the limits inherent in scaling vertically by increasing the performance of a single server.
What does that mean for me?
Higher Availability - You can avoid single points of failure and configure the ThingWorx Foundation platform in an Active-Active cluster mode to achieve the highest availability for your IIoT systems and applications.
Increased Scalability - Now, you can horizontally scale from one to many ThingWorx servers to easily manage large amounts of your IIoT data at scale more smoothly than ever before.
Stay tuned! We’ll be posting more information on Active-Active Clustering—how it's achieved in ThingWorx, architectural component overviews, and what it means for your ThingWorx deployment!
In the meantime, we're running the Active-Active Clustering Beta Program. Interested in participating? Reach out to Ryan Servais (firstname.lastname@example.org) or Ayush Tiwari (email@example.com) to learn more about participation!
Support for Microsoft Edge browser (version 79 or greater)
Bug fixes and minor improvements
Bug fixes and minor improvements
Experience Service 8.5.6 addresses a critical security issue (CVE) in Node.js
Bug fixes and minor improvements
Time series prediction uses a model to predict future values based on previously observed values. Time series data differs somewhat from non-time series data in both the formatting of the data and the training of predictive models. This article will highlight several important considerations when dealing with time series data.
Preparing Time Series Data :
The data must contain exactly one field with Op Type “TEMPORAL” and one field with Op Type “ENTITY_ID”, which identifies an entity, such as a machine serial number. The ENTITY_ID value should remain the same within a single asset as long as there are no missing timestamps, but it should differ across assets or asset runs so that history is assigned accurately during model training and scoring.
The TEMPORAL field is a numeric field indicating the order of the data rows for a specific entity. You should also ensure that the timestamps are equally spaced (for example, one data point every minute) and that no gaps exist in the sequence.
If there are gaps in the time series data, it is recommended to restart the series after the gap as a new entity. Alternatively, if the gap is small enough (few data points), linear interpolation based on the gap endpoint values within the same entity is generally acceptable.
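Both gap-handling strategies can be sketched in a few lines of Python (a standalone illustration; the data, function names, and 60-second step are made up):

```python
# Sketch of the two gap-handling strategies for time series data:
# (1) restart the series as a new entity after a gap, or
# (2) linearly interpolate a small gap from its endpoint values.

def split_entities_at_gaps(rows, step):
    """Split a list of (timestamp, value) rows into chunks with equally
    spaced timestamps; each chunk would become its own entity."""
    chunks, current = [], [rows[0]]
    for prev, cur in zip(rows, rows[1:]):
        if cur[0] - prev[0] == step:
            current.append(cur)
        else:                       # gap detected -> start a new entity
            chunks.append(current)
            current = [cur]
    chunks.append(current)
    return chunks

def interpolate_small_gap(before, after, step):
    """Linearly interpolate the missing points between the gap endpoints
    (acceptable only for small gaps within the same entity)."""
    t0, v0 = before
    t1, v1 = after
    n = (t1 - t0) // step
    return [(t0 + i * step, v0 + (v1 - v0) * i / n) for i in range(1, n)]

rows = [(0, 1.0), (60, 2.0), (120, 3.0), (300, 6.0), (360, 7.0)]
print(split_entities_at_gaps(rows, 60))          # two entities: 3 rows + 2 rows
print(interpolate_small_gap((120, 3.0), (300, 6.0), 60))  # fills the gap
```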
Model Creation in Time Series :
When creating a time series model in Analytics Builder, you will be asked to specify lookback size and lookahead parameters. The lookback size determines how many historical data points (including the current row) are used by the model. The lookahead indicates how many time steps ahead to predict. If the value of the goal variable is not known at scoring time, unchecking Use Goal History will use the goal column during training but not its history during scoring.
Time Series models can also be created in Services using the Training Thing. The lookback size and lookahead parameter are specified in the CreateJob service. The virtualSensor field is used to indicate if the model should be trained to predict values for a field that will not be available during scoring. For example, one can train a time series model to predict Volume using evolving Temperature and Pressure, based on sensor data for these three variables over a period of time. However, the Volume sensor may be removed from further assets in order to reduce costs, and the predictive model can be used instead.
Two important considerations:
ThingWorx Analytics will expand the historical data in the time series into new columns. This process creates new features using the values of the previous time steps. Additionally, low-order derivatives, together with average and standard deviation features, are computed over small contiguous subgroups of the historical data.
The expansion process can make the dataset exceptionally wide, so time series training is generally significantly slower compared to training with no history on the same dataset. This gets exacerbated when lookback size = 0 (auto-windowing, a process where the system is trying to find the optimal lookback). If there are columns that are not changing or change infrequently (such as a device serial number or zip code of the device’s location), these should be marked as Static when importing the data. Any columns labeled Static will not be expanded to create new features. Care also needs to be taken to exclude any features that are known to not be relevant to the prediction.
Using a large lookback can reduce the number of examples/entities the model has available for training. For example, if a lookback of 8 is used, any entities with fewer than 8 data points will not be used in training. For the same reason, scoring for time series produces fewer results than the number of rows provided as input: if 10 rows are provided and the lookback is 6, only 5 predictions will be produced.
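The arithmetic above can be verified with a short Python sketch (illustrative only — this is not how ThingWorx Analytics is implemented internally):

```python
# Sketch: why scoring a time series with lookback L over N rows yields only
# N - L + 1 predictions -- each prediction needs a full history window.

def lookback_windows(rows, lookback):
    """All contiguous windows of length `lookback` (current row included),
    one window per prediction the model could produce."""
    if len(rows) < lookback:
        return []            # entity too short: contributes no examples
    return [rows[i:i + lookback] for i in range(len(rows) - lookback + 1)]

rows = list(range(10))                 # 10 input rows
print(len(lookback_windows(rows, 6)))  # 10 - 6 + 1 = 5 predictions
print(lookback_windows([1, 2, 3], 8))  # entity with < 8 points is unusable
```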
Check out the video below, which explores the Creo As A Service (CAAS) feature of the Product Insight extension.
It allows you to retrieve Creo analysis computations in ThingWorx through the Analytics Manager framework.