cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
Showing results for 
Search instead for 
Did you mean: 

ThingWorx Navigate is now Windchill Navigate Learn More

IoT & Connectivity Tips

Sort by:
Hi Community,   Although we have reference architectures and integration paths for connecting devices to ThingWorx through Azure IoT; no one has ever written anything about doing the same from one ThingWorx to another.  I thought I’d change that and put some ideas out there around how one might go about doing this.  Although this is not officially supported or recommended by PTC; I have consulted with a number of leading SMEs on the subject, which have participated in forming the basis of my thinking outlined here.   Components Required (in order of communication path): On-premise ThingWorx Platform Protocol Adapter Toolkit* (CXS) - MQTT Azure IoT Edge Azure IoT Hub ThingWorx Azure IoT Hub Connector (CXS) Azure Cloud-hosted ThingWorx Platform   PAT (2) with codec to encode MQTT messages publishes to on-premise IoT Edge MQTT endpoint which handles store-and-forward of messages to IoT Hub.  An Azure IoT device would exist for each Thing you wish to represent on the ThingWorx servers.  The Azure IoT Hub Connector would pick-up the incoming messages and pass them on to the cloud ThingWorx which would decode the MQTT payload and map to Thing property updates.   The only part that I presently don’t like about this approach is that you’ll need to decode the MQTT messages on the ThingWorx platform in the cloud when they are received from the IoT Hub, and this mechanism will need to also need to handle encoding and publishing back to the IoT Hub if C2D (Cloud-to-Device) messages are to be implemented (aka bi-directional).  This is required as ThingWorx only supports AlwaysOn as an application level protocol so some form of mapping needs to be done.   * Another approach would be to replace the PAT with a custom agent which implements both the ThingWorx Edge SDK and the Azure IoT device SDK   Regards,   Greg Eva
View full tip
Hi Community,   I've recently had a number of questions from colleagues around architectures involving MQTT and what our preferred approach was.  After some internal verification, I wanted to share an aggregate of my findings with the ThingWorx Architect and Developer Community.   PTC currently supports four methods for integrating with MQTT for IoT projects. ThingWorx Azure IoT Hub Connector ThingWorx MQTT Extension ThingWorx Kepware Server Choice is nice, but it adds complexity and sometimes confusion.  The intent of this article is to clarify and provide direction on the subject to help others choose the path best suited for their situation.   ThingWorx MQTT Extension The ThingWorx MQTT extension has been available on the marketplace as an unsupported “PTC Labs” extension for a number of years.  Recently its status has been upgraded to “PTC Supported” and it has received some attention from R&D getting some bug fixes and security enhancements.  Most people who have used MQTT with ThingWorx are familiar with this extension.  As with anything, it has advantages and disadvantages.  You can easily import the extension without having administrative access to the machine, it’s easy to move around and store with projects, and can be up and running quite quickly.  However it is also quite limited when it comes to the flexibility required when building a production application, is tied directly to the core platform, and does not get feature/functionality updates.   The MQTT extension is a good choice for PoCs, demos, benchmarks, and prototypes as it provides MQTT integration relatively quickly and easily.  As an extension which runs with the core platform, it is not a good choice as a part of a client/enterprise application where MQTT communication reliability is critical.   ThingWorx Azure IoT Hub Connector Although Azure IoT Hub is not a fully functional MQTT broker, Azure IoT does support MQTT endpoints on both IoT Hub and IoT Edge.  This can be an interesting option to have MQTT devices publish to Azure IoT and be integrated to ThingWorx using the Azure IoT Hub Connector without actually requiring an MQTT broker to run and be maintained.  The Azure IoT Hub Connector works similarly to the PAT and is built on the Connection Server, but adds the notion of device management and security provided by Azure IoT.   When using Azure IoT Edge configured as a transparent gateway with buffering (store and forward) enabled, this approach has the added benefit of being able to buffer MQTT device messages at a remote site with the ability to handle Internet interruptions without losing data.   This approach has the added benefit of having far greater integrated security capabilities by leveraging certificates and tying into Azure KeyVault, as well as easily scaling up resources receiving the MQTT messages (IoT Hub and Azure IoT Hub Connector).  Considering that this approach is build on the Connection Server core, it also follows our deployment guidance for processing communications outside of the core platform (unlike the extension approach).   ThingWorx Kepware Server As some will note, KepWare has some pretty awesome MQTT capabilities: both as north and southbound interfaces.  The MQTT Client driver allows creating an MQTT channel to devices communicating via MQTT with auto-tag creation (from the MQTT payload).  Coupled with the native ThingWorx AlwaysOn connection, you can easily connect KepWare to an on-premise MQTT broker and connect these devices to ThingWorx over AlwaysOn.   The IoT Gateway plug-in has an MQTT agent which allows publishing data from all of your KepWare connected devices to an MQTT broker or endpoint.  The MQTT agent can also receive tag updates on a different topic and write back to the controllers.  We’ve used this MQTT agent to connect industrial control system data to ThingWorx through cloud platforms like Azure IoT, AWS, and communications providers.   ThingWorx Product Segment Direction A key factor in deciding how to design your solution should be aligned with our product development direction.  The ThingWorx Product Management and R&D teams have for years been putting their focus on scalable and enterprise-ready approaches that our partners and customers can build upon.  I mention this to make it clear that not all supported approaches carry the same weight.  Although we do support the MQTT extension, it is not in active development due to the fact that out-of-platform microservices-based communication interfaces are our direction forward.   The Azure IoT Hub Connector, being built on the Connection Server is currently the way forward for MQTT communications to the ThingWorx Foundation.   Regards,   Greg Eva
View full tip
  Ever dreamed of participating in the design of the latest ThingWorx features? Now’s your chance! Join the UX Lab in usability and research sessions to help inform the design of your favorite ThingWorx features from the Edge to Kepware to remote monitoring to Solution Central and more!   For those of you newer to PTC, the UX Lab is an opportunity to see early views of product mockups, wireframes, etc. to provide your direct feedback; the UX Lab is part of LiveWorx, our definitive event for digital transformation, and this year both are virtual!   Click here to influence the design of Edge, Kepware, remote monitoring, Solution Central, and so much more!   Reach out with any questions!   Stay connected, Kaya
View full tip
We are bringing the Liveworx UX Lab to you this year! Contribute to the design and development of PTC’s IoT & AR products from the comfort of your home.   Participate in online 1:1 session with PTC Researchers and Designers to take our prototypes & conceptual designs for a spin, share your expertise to directly impact the experience of our future products.   For all Liveworx 2020 UX Lab sessions, click here   IoT & AR session links: SOLUTION CENTRAL: User Management SOLUTION CENTRAL: PTC Solution Deployment SERVICE: Remote Monitoring Solution THINGWORX: Managing Building Blocks in Composer THINGWORX KEPWARE EDGE: Container Based Software at Scale EDGE: Next Generation IoT Edge (Asset) Manager THINGWORX: Asset Modeling DIGITAL PERFORMANCE MANAGEMENT SOLUTION: Bottleneck ID & Balanced Scorecard MANUFACTURING: PTC Solution Strategy using Building Blocks IMPLEMENTING AR and IOT: Successes and Difficulties VUFORIA EDITOR: Authoring Work Instructions VUFORIA EDITOR: Authoring Work Containing Augmented Reality
View full tip
ThingWorx 8.5 Architecture Deployment Guide The ThingWorx 8.5 Architecture Deployment Guide has recently been updated with a few bug fixes and semantic corrections. Note that older guides are still available as well. Look forward to the 9.0 Deployment Guide coming soon.   Happy developing!
View full tip
  Hi, everyone!   In previous tech tips, we’ve introduced 9.0’s biggest feature, active-active clustering, and covered some of the main architectural components. Today, I’ll cover some other architectural configurations and the flexibility that active-active clustering provides to meet your IoT deployment needs.   Deployment Flexibility With Active-Active Clustering, ThingWorx allows you to achieve the highest availability of your IIoT system while still retaining the high degree of deployment flexibility that ThingWorx is known for.   It’s important to emphasize flexibility not only in where you deploy ThingWorx, but also in how  you deploy it. Flexibility is embedded in the ThingWorx architecture to suit your deployment needs. Let’s look at two interesting options.   Cost-Efficient Deployment The first example is deploying ThingWorx in a cost-efficient architecture for smaller or cost-sensitive deployments. Instead of each component on its own separate VM or box, ThingWorx Connection Server, ThingWorx Foundation Server, ZooKeeper, and Ignite are all installed on the same box.   With three boxes in parallel, the system still achieves high availability while reducing deployment costs by minimizing the number of servers utilized. A minimum of three servers is needed to achieve an odd number of instances required by ZooKeeper while still providing redundancy.   In order to optimize for cost while still providing high availability, there are a few factors to consider with this deployment. In this configuration, Ignite is run within the same JVM as ThingWorx Foundation, rather than on a different server. While this does reduce the overall footprint, Ignite will consume additional memory, sharing the resources with the ThingWorx Foundation platform. Since ThingWorx Connection Server is run on the same VM as ThingWorx Foundation Server, this introduces socket limitations, which can restrict the maximum number of connected devices.     In short, the example above illustrates how, while being in an active-active clustered environment, you still retain the deployment flexibility to optimize for your deployment needs. In this scenario, the architecture is optimized for a leaner, more cost-effective deployment at the expense of numbers of connections and memory.   Now, let’s walk through some common use cases where ThingWorx is deployed on premise on VMs or baremetal to support use cases that require a balance of availability and scalability at an affordable cost. A few of these examples include: On-premise deployment of ThingWorx Foundation to support a dedicated factory production line on a plant floor to improve its connected manufacturing operations. VM based on-site deployment of ThingWorx Foundation to support IoT applications for a Smart Building scenario. Smart Ship deployments managing key ship operations where ThingWorx is deployed on a ship and sends data intermittently to private data centers running ThingWorx on shore through a satellite network leveraging ThingWorx federation technology. A failover-proof redundant setup required to capture key service metrics for medical equipment for predictive failures and maintenance where ThingWorx is deployed in hospital premises. OEM Deployment with Common Components The second example covers how active-active clustering architectural components can be shared across multiple clusters to reduce the cost and complexity of deployment. This can include scenarios where a large deployment spans multiple sites, geographies or end customers.   In this deployment, there are two ThingWorx clusters that share a common ZooKeeper cluster and database layer. Within each cluster, both the Connection Server and ThingWorx Foundation with Ignite running as embedded are deployed on a VM.   For the shared components, proper separation measures must be taken to ensure security. For ZooKeeper, separate namespaces and authentication from the clients are required. For the shared database layer, each cluster must have a unique schema, a separate shared file system and separate user credentials with database-level isolation. Please note that with a large shared database across tenants, rather than a dedicated database for each tenant of HA cluster, there is a risk of impacting performance and availability of other tenants due to issues caused by one of the shared ones.     This architecture is optimized for a larger deployment comprised of multiple ThingWorx clusters requiring separation between the clusters, while still allowing the provider to share a common infrastructure for cost reduction.   Again, let’s walk through another few common use cases, including: A ThingWorx OEM hosting multiple ThingWorx Foundation instances in their managed data center or public cloud to offer applications with further customization abilities to their end users. In this case, an OEM vendor can offer multiple ThingWorx clusters in active-active cluster mode with shared pieces of infrastructure across multiple tenants. A ThingWorx enterprise customer looking to build and host different ThingWorx-based applications in their IT data center where each application is powered by an individual active-active cluster satisfying specific digital transformation use cases across the enterprise value chain, including those related to engineering, product, sales, marketing, supply chain, partners, service and support. A ThingWorx factory customer looking to host multiple ThingWorx applications from their main regional factory IT data center to cater to the IoT needs of different factories located in surrounding geographic regions. In this case, a customer could dedicate one HA cluster to each factory to enable that factory/site to power multiple IoT applications efficiently.   These two configurations are some of the lesser known—yet equally powerful—deployments of this new architecture. Again, flexibility of our deployment architecture is one of the key superpowers of the ThingWorx platform. And, the active-active fun doesn’t stop there! Be on the lookout for an upcoming LiveWorx session titled Active-Active Clustering with ThingWorx 9.0  (Session ID: IP1117B), future tech tips like this one, and, of course, the GA release of ThingWorx 9.0 (targeted for June 2020) where you can take advantage of this new functionality!   Let us know what you think in the comments!   Stay connected, Kaya      
View full tip
Here is a spreadsheet that I created which helps to estimate data transfer volumes for the purpose of estimating egress costs when transferring data out of region.   You find that there are a number of input parameters like numbers of assets, properties, file sizes, compression ratio, as well as a page with the cost elements which can be updated from the Interweb.    
View full tip
  Hi everyone,   In anticipation of ThingWorx 9.0’s biggest feature, active-active clustering, we’d like to provide an architectural overview of a sample active-active configuration and its underlying components. If you haven’t already seen it, we invite you to read our previous Community tech tip, where we introduce the concept of active-active clustering for ThingWorx Foundation, which enables you to: significantly reduce unplanned downtime for your mission-critical services and apps support horizontal scalability of the ThingWorx Server where you can scale your services up and down based on your requirements easily run, package, deploy, and operate advanced apps and services with the help of intuitive browser-based navigation, interactive monitoring and debugging tools, and more deploy anywhere - public cloud, private datacenter, on-premise, hybrid, or even locally on your laptop with deep optimizations for Azure Now, before we go too deep, we’d like to let you know that you can continue to seamlessly upgrade from previous versions of ThingWorx releases to upcoming ThingWorx 9.0  releases. Previously, you could deploy ThingWorx Foundation in a “single server” mode, and, for a high availability in “active-passive cluster” mode (see here for details). In the ThingWorx 9.0 release and onwards, you’ll be able to continue to deploy ThingWorx Foundation in a “single server” mode and for high availability scenarios via our new “active-active cluster” mode. Please note that active-passive clustering configuration will no longer be supported in ThingWorx 9.0 or onwards.   Let’s start with a quick recap on how the ThingWorx Foundation 9.0 release would look like in single server deployment.   ThingWorx 9.0 Deployed in a "Single-Server" Architecture Below is a high-level diagram depicting the main architectural layers and components of ThingWorx Foundation deployed in a single server mode.   Deployment Components Below is a brief summary of all major architectural components and their purpose in the deployment architecture:   The Client Layer This layer is comprised of everything that connects with, sends data to, and receives content from the ThingWorx platform. It be broken down into two groups: Devices/Things: Things, devices, agents, and other assets. Users/Clients: People and the respective products (primarily web browsers) they use to access ThingWorx.   The Application Layer This layer is where ThingWorx Foundation and other applications deployed with ThingWorx Foundation reside, such as ThingWorx Analytics, ThingWorx Connection Server, ThingWorx Azure IoT Hub Connector and others. This layer provides connectivity to the client layer, performs authentication and authorization checks, ingests/processes/analyzes content, and reacts to conditions by sending alerts. For a specific ThingWorx Foundation deployment that needs basic device data ingestion, processing and storage, you can setup only ThingWorx Foundation server. In some cases, with large number of device connections, you may want to setup ThingWorx Connection Server with ThingWorx Foundation in a single server for further scalable connectivity. ThingWorx Foundation: ThingWorx Foundation is a java-based application that serves as a rapid, model-based application development platform. Shared File Storage: Shared Disk space to contain ThingWorx Storage repositories, store and archive log files accessed by all ThingWorx Foundation servers. A NAS file storage, AWS Elastic File System, or Azure Files and others could be used for this purpose.   The Data Layer ThingWorx Foundation includes several persistence provider implementations that enable you to choose a database option that best fits your use case. A persistence provider enables the connection to a data store and the ability to perform a CRUD operation on that data. See here for more information. Currently, there are two basic variations of persistence providers: Model Provider – Responsible for ThingWorx model metadata and system data. Data Provider – Responsible for runtime data ingested against the model elements, including streams, value streams, data tables, etc. ThingWorx supports H2 (in-memory Database), PostgreSQL, MS SQL Server and AzureSQL as both model and data providers, and InfluxDB as only a data provider. Please see here for model and data best practices.   ThingWorx 9.0 Deployed in an Active-Active Clustering Reference Architecture Below is a reference architecture diagram for ThingWorx 9.0 with multiple ThingWorx Foundation servers configured in an active-active cluster deployment. Please note that this is only one reference example of how ThingWorx 9.0 can be deployed in an active-active clustered environment. There could be other architectural configurations dependent upon the needs of the specific deployment. Deployment Components Once you have developed an understanding of the basic architectural components in a single server mode, below are the additional components required to run ThingWorx in active-active cluster mode.   The Client Layer This will be similar to what has been mentioned in the above single server configuration.   The Application Layer In this layer, if you’re familiar with ThingWorx active-passive cluster configuration, then you may be aware of most of the components used below—with the exception of a new component: Apache Ignite that provides Distributed Caching for the horizontally scalable ThingWorx Foundation servers. Load Balancer: A third-party device that receives network traffic and distributes requests among available servers. In active-active cluster configuration, the load balancer is used to direct WebSocket-based traffic to the ThingWorx Connection Servers while user requests (http/https) traffic is directly distributed to the ThingWorx Foundation servers. Users can continue to use a load balancer with ThingWorx 9.0 that they might already be using for their existing active-passive or single server deployments with ThingWorx 8.X or previous releases.  Some example load balancers include, but are not limited to: HAProxy, Azure Application Gateway, and AWS Application Load Balancer. ThingWorx Connection Services: These services handle all messages routing to and from devices, providing scalable connectivity to the ThingWorx Foundation Server. With the ThingWorx 9.0 release, ThingWorx Connection Services have been upgraded with many additional features to support active-active clustering of the ThingWorx Foundation servers, where now they route all WebSocket traffic in a round robin fashion to the connected ThingWorx Foundation servers. Depending upon the various use cases, one could use multiple ThingWorx Connection Services available, such as ThingWorx Connection Server, ThingWorx Azure IoT Hub Connector, and ThingWorx Protocol Adapter Toolkit. Please see here for further details. Please note that for ThingWorx 9.0 releases, ThingWorx Connection Server would be required in an active-active configuration to support all the WebSocket-based traffic routing, including egress of files and device messages from multiple ThingWorx Foundation servers back to the devices, and it would also serve as WebSocket communication from ThingWorx Mashup-based applications to ThingWorx Composer. ThingWorx Foundation: With ThingWorx 9.0 and onwards, you can set up ThingWorx Foundation servers in an N-active-active cluster model to provide higher availability to your applications and horizontally scale the Foundation server nodes up and down based on your scalability needs. Apache Zookeeper: Apache ZooKeeper is a centralized service for maintaining configuration information and naming as well as providing distributed synchronization and group services. It is a coordination service for distributed applications that enables synchronization across a cluster. Specific to ThingWorx, ZooKeeper is used for distributed locking, selecting a singleton server during the server initialization, service discovery for Apache Ignite, allowing it to find instances of ThingWorx Foundation servers. Apache Ignite: This offers a distributed cache for the active-active cluster setup. It is used by ThingWorx Foundation Servers to share state. It may be embedded with each ThingWorx instance or can be run as a standalone cluster for larger scale. In this configuration, Ignite is set up in a standalone cluster but can be run embedded within the ThingWorx Foundation. Running Ignite in a standalone cluster is more ideal for larger scale, as it supports higher vertical scale of memory in the deployment setup.   The Data Layer ThingWorx is largely database agnostic. You can continue to use officially supported persistence providers that you may already be using in your existing deployments based off of ThingWorx 8.X or previous releases. Please look out for an upcoming ThingWorx update as well as enhanced installation documents to help with your upgrade and migration questions with the general availability of ThingWorx 9.0. Please note that this diagram does not make the distinction between model and data providers; depending on your data ingestion needs, separate model and data providers can be used. As a reminder, all databases should be deployed in a high-availability configuration to help eliminate any single point of failure.   In closing, we can't wait to launch active-active clustering in 9.0 to help you: dramatically further reduce application downtime scale your deployments and more efficiently manage your apps, regardless of where they’re deployed   If you have any questions about active-active clustering or its architecture, please do not hesitate to reach out!   Stay connected! Kaya
View full tip
I created this recently for another group -  might be useful    Video Link to Create a Database connection to you Postgres Thingwork Database 
View full tip
Still not sure what the reference benchmarks have to offer? Check out this short video abstract reviewing the purpose of the reference benchmarks, some notes on how to read the guide, and information about what these guides will have to offer. Then, check out the reference benchmark for remote monitoring here!   ~~
View full tip
  Hi, everyone!   We’re actively working towards the ThingWorx 9.0 release and we’re ready to provide a sneak peek into the biggest feature of 9.0: Active-Active Clustering for High Availability configuration.   You may be wondering: doesn’t ThingWorx already offer High Availability?   Yes, ThingWorx already supports a High Availability configuration. Previous versions of ThingWorx, such as ThingWorx 8.X version releases, support Active-Passive configuration, where one “active” ThingWorx server performs all processing and maintains the live connections to other systems such as databases and connected assets. Meanwhile, in parallel, there is a second “passive” ThingWorx server that is a mirror image and regularly updated with data but does not maintain active connections to any of the other systems. If the “active” ThingWorx server fails, the “passive” ThingWorx server is made the primary server, but this can take a few minutes to establish connections to the other systems.   So, how is ThingWorx Active-Active different?   Active-Active configuration differs from Active-Passive in that all the ThingWorx servers in the cluster are “active.” Not only is data mirrored across all ThingWorx servers, but all of the servers, instead of only one, maintain live connections with the other systems. This way, if any of the ThingWorx servers fail, the other ThingWorx servers take over instantaneously with no recovery time.   Since all ThingWorx servers are active, they are processing in parallel and, as a result, the cluster can process more data than that of a single server or a cluster with an Active-Passive configuration. Simply put, multiple servers working together outperform a single server. This allows customers to scale their deployment by simply adding more ThingWorx servers to the cluster (horizontally scaling), which does not have the same limitations of scale that is achieved by increasing the performance of the server itself.   What does that mean for me? Higher Availability - You can avoid single points of failure and configure the ThingWorx Foundation platform in an Active-Active cluster mode to achieve the highest availability for your IIoT systems and applications. Increased Scalability - Now, you can horizontally scale from one to many ThingWorx servers to easily manage large amounts of your IIoT data at scale more smoothly than ever before.   Stay tuned! We’ll be posting more information on Active-Active Clustering—how it's achieved in ThingWorx, architectural component overviews, and what it means for your ThingWorx deployment!   In the meantime, we're running the Active-Active Clustering Beta Program. Interested in participating? Reach out to Ryan Servais (rservais@ptc.com) or Ayush Tiwari (atiwari@ptc.com) to learn more about participation!   Stay connected! Kaya
View full tip
ThingWorx Performance Monitoring with Grafana authored by EDC team member Desheng Xu ( @xudesheng )   Monitoring ThingWorx performance is crucially important, both during the load testing of a newly completed application, and after the deployment of new code in an existing application. Monitoring performance ensures that everything works as expected at the Enterprise level.  This tutorial steps you through configuring and installing a tool  which runs on the same network as the ThingWorx instance. This tool collects data from the Platform and translates it into something visual and easy to understand via Grafana.    tsample is  small and customizable, and it plays a similar role to telegraf. Its focus is on gathering ThingWorx performance metrics. Historically, this tool also supported collecting OS level performance metrics, but this is no longer supported. It is highly recommended to collect OS level performance metrics by using telegraf, a tool designed specifically for that purpose (and not discussed here). This is not the only way to go about monitoring ThingWorx performance, but this tool uses a very good approach that has been proven effective both at customer sites and internally by PTC to monitor scale tests.   Find the most recent release here.   Recommended Deployment Architecture tsample can be deployed in the same box where ThingWorx Tomcat is running, but it's recommended to deploy it on a separated box to minimize any performance impact caused by the collector. tsample supports export to InfluxDB and/or local file. In this document, it is assumed that InfluxDB will be used for monitoring purpose. Please note that this is not the same instance of InfluxDB being used by ThingWorx (if configured). This article will not cover setting up InfluxDB or NGINX (if necessary), so please configure these before beginning this tutorial.   Supported Platform tsample has been tested on Windows 2016, MacOS 10.15, Ubuntu 16.04, and Redhat 7.x.  It's anticipated to work on a more general Ubuntu/Redhat/Mac/Windows release as well. Please leave a comment or contact the author, @xudesheng , if Raspberry Pi support is needed.    Configuration File Where to Store the Configuration File tsample will pick up the configuration file in the following sequence: from the command line...   ./tsample -c <path to configuration file>​     from the environment... Linux:   export TSAMPLE_CONFIG=<path to configuration file> ./tsample​   Windows:   set TSAMPLE_CONFIG=<path to configuration file> tsample.exe   from a default location... tsample will try to find a file with the name "config.toml " from the same folder in which it starts.   How to Craft a Configuration File You can use following command to generate a sample file:     ./tsample -c config.toml -e     or:     ./tsample -c config.toml --export       A file with the name "config.toml " will be generated with a sample configuration. You can then adjust its content in accordance with the following.   Configuration File Content Format Configuration file must be in toml format. title and owner sections Both sections are optional. The intention of these two sections is to support doc tool in future. TestMachine section This is section is required, and it defines where this tool will run.   thingworx_servers section This section is where you define targeted ThingWorx applications. Multiple ThingWorx servers can be defined with the same or different metrics to be collected.   thingworx_servers.metrics sections Underneath each thingworx_servers section, there are several metrics. In default example, following metrics have been included: ValueStreamProcessingSubsystem DataTableProcessingSubsystem EventProcessingSubsystem PlatformSubsystem StreamProcessingSubsystem WSCommunicationsSubsystem WSExecutionProcessingSubsystem TunnelSubsystem AlertProcessingSubsystem FederationSubsystem You can add your own customized metrics, as long as the result follows the same Data Shape. The default Data Shape has 3 columns: If the output Data Shape exceeds this limit, the tool will likely not work properly.   result_export_to_db section This section defines the target InfluxDB as a sink of collected performance metrics.   result_export_to_file section This section defines the target file storage for collected performance metrics.   Grafana Configuration Example Monitor Value Stream Step 1. Connect Grafana to InfluxDB   Step 2: Create a New Dashboard   Step 3. Create a New Query Depending on which metrics you defined to collect in the tsample configuration file, you will see a different choice of measurement in Grafana. Here, we will use ValueStreamProcessingSubsystem as an example.   Step 4. Choose the Right Platform and Storage Provider Some metrics depend on the database storage provider, like Value Stream and Stream.   Step 5. Choose the Metrics Figures   Select "remove" to get rid of the default 'mean' calculation. Select "non_negative_difference" from Transformations. Using this transformation, Grafana can show us the speed of writes.     Then, remove the default GROUP BY "time" clause. Assign a meaningful alias of this query.   Step 6. Add Another Query You can add another query as 'Value Stream Queued Speed' by following the same steps.   Step 7. Assign a Panel Title   Step 8. Review the Result Let's go back to the dashboard page and select "last 15 minutes" or "last 5 minutes" from the top right corner. It should show a result similar to the chart below.   Step 9. Save the Dashboard Don't forget to save your dashboard before we add more panels.   Step 10. Refine the Panel It's difficult to figure out the high-level write speed from the above panel, so let's enhance it. Add a new query with the following configuration: In the above query, there are two additional figures: 20s and 1m... How do you choose? 20s should be the same as sampling_cycle_inseconds in your tsample configuration file. If you choose a different value, then you could end up with misleading results. Larger values such as "1m" may give you a smoother result, but they could also hide system instability. Going larger than 1m is not recommended in most cases. With this new query, it's much easier to figure out what the average write speed in current testing is.   Tips: if your sampling_Cycle_inseconds is 30s, then you may not need this additional query. The following image is a sample at the 30s interval time. You would not need an additional average query to get a smooth write speed.   The next example is a sample at the 10s interval time. Without additional queries, you may not be able to get a meaningful understanding of the write speed. From the above three examples, it's recommended to configure the sampling interval time at 30s, or anything larger than 20s. You can then choose whether you need additional queries based on the visualization result.   Step 11. Further Refinement The above charts illustrate the queuing and writing speed. However, it is possible that the Value Stream may perform at a reasonable speed, but the Value Stream queue may be growing and could exceed its capacity. Let's add another query to monitor this: However, it is difficult to read this chart, since it has a different value range on the y-axis: Let's move this query to a second y-axis on the right: This will make the view much easier to see: The current queue size or remaining queue size will always move up and down; it is healthy as long as it does not continue to grow to a high level.   What Else Can Be Monitored? The following metrics would be monitored by most customers: Value Stream write and queue speed Value Stream queue size Stream write and queue speed Stream queue size Event performed speed (completedTaskCount) Event submitted speed (submittedTaskCount) Event queue size Websocket communication Websocket connection   ThingWorx Memory Usage Monitoring Create a new panel and add a new query: In a running system, memory usage will always move up and down - at times sharply or quickly - when the system is busy. The system is healthy as long as memory doesn't go up continuously or stay at a maximum for a long period of time.   Conclusion Setting up monitoring is absolutely crucial to managing the performance of an enterprise ThingWorx application. Using Grafana makes tracking and visualizing the performance much easier. Stay tuned to the EDC tag for more monitoring tips to come!
View full tip
  Hello, everyone! Discover how we embed security throughout the entire lifecycle of the ThingWorx platform in our latest “ThingWorx on Air” episode!   Hear Walter walk through how the ThingWorx platform is secured from end to end. Walter breaks it down into three simple parts: secure design, secure coding practices and continuous security improvements via our maintenance releases.   Listen to Episode 07 to hear the steps we’re taking in each of these areas and how security is at the forefront of what we do.   Finally, Walter mentions the Secure Deployment Hub, our brand-new set of resources to help you securely deploy your ThingWorx apps. Check out my last tech tip to learn more.   As always, stay connected, Kaya
View full tip
Hi,   If you need to change the used hostname at installation of Thingworx Flow, some manual changes should be done without re-installing Flow. Basically, hostname for Flow should be changed in the nginx configuration and in Flow modules configuration; whenever you see the hostname used at Flow installation, change it with the new hostname.   Change the following configurations after renaming the ThingWorx Flow server in Windows OS : 1. Stop Flow, Nginx and ThingWorx Tomcat services 2. Update C:\Program Files\ <nginx>\conf\conf.d\vhost-flow.conf server_name : change hostname with new one      3. Update C:\Program Files (x86)\ <ThingWorxFlow>\modules\lookup\deploymentConfig.json ENDPOINT : change hostname with new one      4. Update <ThingWorxFlow>\modules\oauth\deploymentConfig.json UI_ENDPOINT : change hostname with new one ENDPOINT : change hostname with new one      5. Update <ThingWorxFlow>\modules\trigger\deploymentConfig.json DOMAIN : change hostname with new one TRIGGER_HOST : change hostname with new one 6. Update <ThingWorxFlow>\modules\ux\deploymentConfig.json api_endpoint : change hostname with new one view > oauth_server : change hostname with new one service_api_endpoint : change hostname with new one      If the ThingWorx Platform is installed on the Flow server : enterprise > built > host + prefix_url : change hostname with new one 7. If the ThingWorx Platform is not installed on the Flow server: Stop Thingworx Tomcat service Update <ThingworxPlatform>\platform-settings.json         PlatformSettingsConfig >  OrchestrationSettings > QueueHost : change flow hostname with new one 8. Restart the Thingworx, Flow and Nginx services   After these steps, Flow should be accessible with the new hostname: https://new_hostname:port/Thingworx/Composer/apps/flow/     Regards, Raluca Edu    
View full tip
  Whether your ThingWorx instances are deployed on premise, in the cloud or a hybrid of the two, I’d like you to imagine: You have a super cool app. You want to deploy it securely. You’re not a security expert. What do you do? How do you know how to securely deploy your app? Where do you go for security best practices? Introducing the new ThingWox Secure Deployment Hub!   The ThingWorx Secure Deployment Hub is a new section of our support site that will introduce you to the ThingWorx security landscape and direct you to security resources pertaining to the Edge, the platform and beyond.   From permissions and provisioning, to subsystems and SSO, the hub is packed with our recommendations and best practices for you to deploy your app in a secure fashion.   Happy deploying! Kaya
View full tip
Recently I needed to be able to parse and handle XML data natively inside of a ThingWorx script, and this XML file happened to have a SOAP namespace as well. I learned a few things along the way that I couldn’t find a lot of documentation on, so am sharing here.   Lessons Learned The biggest lesson I learned is that ThingWorx uses “E4X” XML handling. This is a language that Mozilla created as a way for JavaScript to handle XML (the full name is “ECMAscript for XML”). While Mozilla deprecated the language in 2014, Rhino, the JavaScript engine that ThingWorx uses on the server, still supports it, so ThingWorx does too. Here’s a tutorial on E4X - https://developer.mozilla.org/en-US/docs/Archive/Web/E4X_tutorial The built-in linter in ThingWorx will complain about E4X syntax, but it still works. I learned how to get to the data I wanted and loop through to create an InfoTable. Hopefully this is what you want to do as well.   Selecting an Element and Iterating My data came inside of a SOAP envelope, which was meaningless information to me. I wanted to get down a few layers. Here’s a sample of my data that has made-up information in place of the customer's original data:                <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" headers="">     <SOAP-ENV:Body>         <get_part_schResponse xmlns="urn:schemas-iwaysoftware-com:iwse">             <get_part_schResult>                 <get_part_schRow>                     <PART_NO>123456</PART_NO>                     <ORD_PROC_DIV_CD>E</ORD_PROC_DIV_CD>                     <MFG_DIV_CD>E</MFG_DIV_CD>                     <SCHED_DT>2020-01-01</SCHED_DT>                 </get_part_schRow>                 <get_part_schRow>                     <PART_NO>789456</PART_NO>                     <ORD_PROC_DIV_CD>E</ORD_PROC_DIV_CD>                     <MFG_DIV_CD>E</MFG_DIV_CD>                     <SCHED_DT>2020-01-01</SCHED_DT>                 </get_part_schRow>             </get_part_schResult>         </get_part_schResponse>     </SOAP-ENV:Body> </SOAP-ENV:Envelope> To get to the schRow data, I need to get past SOAP and into a few layers of XML. To do that, I make a new variable and use the E4X selections to get there: var data = resultXML.*::Body.*::get_part_schResponse.*::get_part_schResult.*; Note a few things: resultXML is a variable in the service that contains the XML data. I skipped the Envelope tag since that’s the root. The .* syntax does not mean “all the following”, it means “all namespaces”. You can define and specify the namespaces instead of using .*, but I didn’t find value in that. I found some sample code that theoretically should work on a VMware forum: https://communities.vmware.com/thread/592000. This gives me schRow as an XML List that I can iterate through. You can see what you have at this point by converting the data to a String and outputting it: var result = String(data); Now that I am to the schRow data, I can use a for loop to add to an InfoTable: for each (var row in data) {      result.AddRow({         PartNumber: row.*::PART_NO,         OrderProcessingDivCD: row.*::ORD_PROC_DIV_CD,         ManufacturingDivCD: row.*::MFG_DIV_CD,         ScheduledDate: row.*::SCHED_DT     }); } Shoo! That’s it! Data into an InfoTable! Next time, I'll ask for a JSON API. 😊
View full tip
This might be a well-known topic for some, but I recently had a need that Event Routers fit into perfectly and wanted to share. If you have some neat applications for Event Routers on mashups, feel free to reply!   What? Event Routers are a function on Mashups that let you connect multiple inputs to a single output. For my use case, this was extremely helpful to let me have two different Service Outputs go to the same Widget. They are a really simple tool that can save a lot of headache.   How? Event Routers work by funneling the latest data through to a single output. This is particularly useful for  user-activated actions with the output tied to a widget or another service. The Event Router automatically activates when any one of the Inputs changes.   Example I have two services that generate HTML from different sources, but I want to display just the latest one that the user had activated in a single HTML Text Area widget. The two different services are activated with two different buttons. But how do I show these two outputs in a single widget? Create an Event Router with two HTML inputs!         Now I just tie each service output to the Inputs and tie the Output to the HTML Text Area Text (note: the icon for Input2 is incorrect—it should be HTML as well; this system is running 8.5.1, perhaps it's an issue in that release).         Now when the user clicks on either button, the correct service’s HTML is sent to the HTML Text Area. Ta-da!   P.S. I noticed in some older posts that Event Routers used to be a widget or extension that came and went. Now (8.5+) it is baked into the Functions on the far right side of Mashup Builder.
View full tip
The Property Set Approach This article details an approach developed by Prachi Rath and Roy Clarke, refined by the EDC team in the December 2019's Remote Monitoring of Assets Reference Benchmark , and used to handle multi-property business rules in an Enterprise ThingWorx application.   Introduction If there are logic rules which depend upon multiple properties, and each property receives its updates one at a time, then each property will need to have an identical subscription, because there is no way for any one subscription to know the most up-to-date values for the other properties. This inefficient approach would create redundancy and sizing constraints, reducing the capacity of the application to scale up to the Enterprise level. The Property Set Approach resolves this issue by sending in all property updates as one Info Table or JSON property (called the “property set”), which can then have a single subscription. The property set is assembled on the Edge when an update needs to be sent, and then the Platform dissects, processes, and stores the data within this property set as required by the business logic.   This approach also involves caching the last property value into a runtime variable so that it can be referenced within the business logic subscription without having to be retrieved from the database. This can significantly improve the runtime of the subscription, reducing the number of resources required to sustain the business logic and ensuring that any alerts or events resulting from the business logic occur as soon as possible. It also reduces the load on the database, ensuring that data ingestion can complete unhindered.   So, while there are many benefits to this approach, it is also more complicated. It tightly couples the development of the Edge and Platform code and increases the application complexity, making it slightly less easy to maintain the application in the long run. The property set also requires a little more bandwidth and a more stable internet connection between Edge devices and the Platform since there is more metadata in an Info Table property, and therefore every update is slightly larger than it would be otherwise. So this approach is only recommended when multi-property rules are a requirement of the application and a stable internet connection exists between the Edge and Platform.   Platform Implementation I. Create an Info Table (or JSON) Property This tutorial uses the out-of-the-box Data Shape called NamedVTQ for the Info Table property, which is defined on a Thing Template as a remote property. It is important that this is not marked as persistent or logged, as the purpose is to reduce the amount of database writes and reads required by the Platform. The Info Table property has the following property definition:     <PropertyDefinition aspect.dataChangeType="ALWAYS" aspect.dataShape="NamedVTQ" aspect.isPersistent="false" baseType="INFOTABLE" isLocalOnly="false" name="numberPropertySetAsInfotable"/>       II. Create a Data Change Event Subscription for the Info Table Property The subscription has three parts: Cache the last value for the property in a runtime variable Start off the business rules processing, sending in the whole Info Table Send the Info Table to be logged as individual local properties in the database     // First step caches the last Value, refer to the next step… // Second step sets off the business rules processing with the Info Table me.ScaleTestBusinessRuleForPropertySetAsInfotable({ PropertySetAsInfotable: eventData.newValue.value }); // Third step sends the Info Table as one property into a service which parses it into the // individual properties, updating both the runtime properties on the remote thing and the database me.UpdatePropertyValues({ values: eventData.newValue.value /* INFOTABLE */ });       III. Set-Up Caching Each property which needs to be cached should be created on the Thing Template level and named in a similar way, say by placing the word “Last” at the end, such as “Property1” => “Property1Last”, “Property2” => “Property2Last”, etc. This property should NOT be logged or persistent, as the point of this is to store the most recent value in memory, removing any superfluous dependency on database queries in the process. Note that while storing the property in runtime memory makes it much more accessible, it also means that the property needs to be rewritten manually upon Platform restart. Additional code (not provided here) must be written to populate these properties from the database upon application start-up.   The following code should be placed in the data change event subscription (option 1 in the case where only a few properties need caching, or option 2 if every property value needs to be cached):   Option 1: Some but Not All Properties Need Caching     // Names of properties for which you want to cache the last value var propertyNames = ['number1', 'number2']; // Loop through the properties and cache their time if they are found in the property set propertyNames.map(assignLast); // This function can be split into two functions for Age and Last separately if need be function assignLast(propertyName) { logger.debug("Looping for property -> "+ propertyName); var searchprop = new Object(); searchprop.name = propertyName; property = eventData.newValue.value.Find(searchprop); if(property){ logger.debug("Found Row. Name= " + property.name); var lastPropertyName = propertyName+"Last"; if(property.value) { // Set the cache property on me, this entity, to the current property value me[lastPropertyName] = me[propertyName]; } } else { logger.debug("Property Not Found in property set -> " + propertyName); } }       Option 2: All Properties Need Caching     var rowCount = eventData.newValue.value.getRowCount(); for(var i=0; i<rowCount; i++){ logger.warn("property name->" + eventData.newValue.value[i].name + "----- property new value->" + eventData.newValue.value[i].value.value); var propertyName = eventData.newValue.value[i].name; var lastPropertyName = propertyName+"Last"; me[lastPropertyName] = me[propertyName]; logger.warn("done last subscription, last property value for lastPropertyName" + me[lastPropertyName]); }         Useful Platform Code Snippets I. Age Calculation     var date1 = new Date(); var date2 = me.GetPropertyTime({ propertyName: propertyName /* STRING */ }); var result = millisToMinutesAndSeconds (dateDifference(date1, date2) ); // This function converts from an unintelligibly large number in milliseconds to something formatted in minutes and seconds function millisToMinutesAndSeconds(millis) { var minutes = Math.floor(millis / 60000); var seconds = ((millis % 60000) / 1000).toFixed(0); return (seconds == 60 ? (minutes+1) + ":00" : minutes + ":" + (seconds < 10 ? "0" : "") + seconds); }       II. Sort the Info Table by Time     var params = { sortColumn: "time" /* STRING */, t: me.propertySet/* INFOTABLE */, ascending: ascending /* BOOLEAN */ }; var result = Resources["InfoTableFunctions"].Sort(params);       III. Search the Info Table for a Property     var searchprop = new Object(); searchprop.name = propertyName; property = PropertySetAsInfotable.Find(searchprop); if(property === null){ logger.info("Property Not Found -> " + propertyNumber1); } else { logger.info("Found Row. Name= [" + property.name + "], value= " + property.value.value); }         Edge Implementation This example implementation uses the .NET Edge SDK to build a property set Info Table at the Edge.   I. Define the Data Shape A standard Data Shape is used (NamedVTQ), but because this Data Shape is not exposed in the Edge SDK code, it has to be created manually.     // Data Shape definition for NamedVTQ FieldDefinitionCollection namedVTQFields = new FieldDefinitionCollection(); namedVTQFields.addFieldDefinition(new FieldDefinition(CommonPropertyNames.PROP_NAME, BaseTypes.STRING)); namedVTQFields.addFieldDefinition(new FieldDefinition(CommonPropertyNames.PROP_VALUE, BaseTypes.VARIANT)); namedVTQFields.addFieldDefinition(new FieldDefinition(CommonPropertyNames.PROP_TIME, BaseTypes.DATETIME)); namedVTQFields.addFieldDefinition(new FieldDefinition(CommonPropertyNames.PROP_QUALITY, BaseTypes.STRING)); base.defineDataShapeDefinition("NamedVTQ", namedVTQFields);     II. Define the Info Table Property The property defined should NOT be logged or persistent, and it can be read-only, since data is always pushed from the Edge and read from the server cache when accessed on the Platform. Note that the push type of the info table property MUST be set to "ALWAYS" (if set to "VALUE", the data change event will only fire if the number of rows changes).   // Property Set Definitions [ThingworxPropertyDefinition( name = "DevicePropertySet", description = "Alternative representation of properties as an Info Table for rules processing", baseType = "INFOTABLE", category = "Status", aspects = new string[] { "isReadOnly:true", "isPersistent:false", "isLogged:false", "dataShape:NamedVTQ", "cacheTime:0", "pushType:ALWAYS" } ) ]     III. Define a Property to Store the GOOD Quality Status   private static String QUALITY_STATUS_GOOD = QualityStatus.GOOD.name();     IV. Define Functions to Populate the Value Collections  An Info Table is really just made up of many Value Collections, where each Value Collection is considered a row. These services take in the name and value of a property and return a Value Collection object which can be added to the property set Info Table.   public ValueCollection createNumberValueCollection(String name, double value) { ValueCollection vc = new ValueCollection(); // Add quality and time entries to the Value Collection vc.SetStringValue(CommonPropertyNames.PROP_QUALITY, QUALITY_STATUS_GOOD); vc.SetDateTimeValue(CommonPropertyNames.PROP_TIME, new DatetimePrimitive(DateTime.UtcNow)); vc.SetStringValue(CommonPropertyNames.PROP_NAME, name); vc.SetNumberValue(CommonPropertyNames.PROP_VALUE, value); return vc; } public ValueCollection createBooleanValueCollection(String name, Boolean value) { ValueCollection vc = new ValueCollection(); // Add quality and time entries to the Value Collection vc.SetStringValue(CommonPropertyNames.PROP_QUALITY, QUALITY_STATUS_GOOD); vc.SetDateTimeValue(CommonPropertyNames.PROP_TIME, new DatetimePrimitive(DateTime.UtcNow)); vc.SetStringValue(CommonPropertyNames.PROP_NAME, name); vc.SetBooleanValue(CommonPropertyNames.PROP_VALUE, value); return vc; }     V. Build the Property Set Call this code from the processScanRequest method to build the property set.   // Create an instance of a new Info Table using the standard "NamedVTQ" Data Shape InfoTable propertySet = new InfoTable(getDataShapeDefinition("NamedVTQ")); // Set name/value for Temperature using convenience function propertySet.addRow(createNumberValueCollection("Temperature", temperature)); // Set name/value for Pressure using convenience function propertySet.addRow(createNumberValueCollection("Pressure", pressure)); // Set name/value for TotalFlow using convenience function propertySet.addRow(createNumberValueCollection("TotalFlow", this._totalFlow)); // Set name/value for InletValve using convenience function propertySet.addRow(createBooleanValueCollection("InletValve", inletValveStatus)); // Set name/value for FaultStatus using convenience function propertySet.addRow(createBooleanValueCollection("FaultStatus", faultStatus)); // Set the property set Info Table property base.setProperty("DevicePropertySet", propertySet);     VI. Update the subscribed properties These two lines of code update the properties and events, actually sending the property set (containing all property updates) to the Platform.   base.updateSubscribedProperties(15000); base.updateSubscribedEvents(60000);     Conclusion Following these steps will enable the Edge to build a property set before sending any property updates to the Platform. The Platform can then rely on caching to process the business logic with no database dependency, which is faster and more efficient than any other approach. Finally the updates are still written to the database, so in the end, there is no functional difference between using a property set and binding each property individually. Please don't hesitate to comment here with any questions about this approach.
View full tip
Checkout the below video which explores the Creo As A Service (CAAS) feature of the Product Insight extension. This allows to retrieve Creo analysis computation inside ThingWorx through the Analytics Manager framework.  
View full tip
Time series prediction uses a model to predict future values based on previously observed values. Time series data differs somewhat from non-time series data in both the formatting of the data and the training of predictive models. This article will highlight several important considerations when dealing with time series data. Preparing Time Series Data: The data must contain exactly one field with Op Type “TEMPORAL” and one field with Op Type “ENTITY_ID”, which defines the identifier for an entity, such as a machine serial number. The ENTITY_ID field should remain the same as long as there are no missing timestamps and it is within the same asset but should be different for different assets or asset runs in order to accurately assign history during model training and scoring.     The TEMPORAL field is a numeric field indicating the order of the data rows for a specific entity . One should also ensure that data is formatted such that the timestamps are equally spaced (for example, one data point every minute) and that no gaps exist in the sequence of numbers.   If there are gaps in the time series data, it is recommended to restart the series after the gap as a new entity. Alternatively, if the gap is small enough (few data points), linear interpolation based on the gap endpoint values within the same entity is generally acceptable.   Model Creation in Time Series: When creating a timeseries model in Analytics Builder, you will be asked to specify a lookback size and lookahead parameter. The lookback size determines how many historical datapoints (including the current row) will be used in the model. The lookahead indicates how many time steps ahead to predict.  If the value of the goal variable is not known at time of scoring, unchecking Use Goal History will use the goal column during training but not its history during scoring.   Time Series models can also be created in Services using the Training Thing. The lookback size and lookahead parameter are specified in the CreateJob service. The virtualSensor field is used to indicate if the model should be trained to predict values for a field that will not be available during scoring. For example, one can train a time series model to predict Volume using evolving Temperature and Pressure, based on sensor data for these three variables over a period of time. However, the Volume sensor may be removed from further assets in order to reduce costs, and the predictive model can be used instead.   Two important considerations: ThingWorx Analytics will expand historical data in the time series into new columns. This process creates new features using the values of the previous time steps. Additionally, low order derivatives, together with average and standard deviation features are computed over small contiguous subgroups of the historical data.   The expansion process can make the dataset exceptionally wide, so time series training is generally significantly slower compared to training with no history on the same dataset. This gets exacerbated when lookback size = 0 (auto-windowing, a process where the system is trying to find the optimal lookback). If there are columns that are not changing or change infrequently (such as a device serial number or zip code of the device’s location), these should be marked as Static when importing the data. Any columns labeled Static will not be expanded to create new features. Care also needs to be taken to exclude any features that are known to not be relevant to the prediction. Using a large lookback can eliminate how many examples / entities the model has available to train. For example, if a lookback of 8 is used, then any entities that have less than 8 examples will not be used in training. For the same reason, scoring for time series produces less results than the number of rows provided as input: if 10 rows are provided and lookback is 6, then only 5 predictions will be produced.
View full tip
With the advent of faster and cheaper hardware, and owing to vast improvements in connectivity such as Gigabit networks and 5G, data is collected and processed at an ever increasing pace. As such, there is a related need for the Analytical Models built on top of such data to evolve over time. As part of this evolution, modelling experiments with more data, new variables, new techniques, and different training parameters will be performed. The resulting models need to be tracked, monitored, and appropriate versions need to be used for scoring in various scenarios. This creates the need for version control of Analytical Models.   ThingWorx Analytics makes it easy to implement an “Analytics Model Version Control” system via two mechanisms: A job Id and timestamp based identification system, so if a model with the same jobName (model name) is retrained, the old model will still exist and can be easily retrieved A tagging mechanism for the above jobs   Before using the above techniques to version control Analytics Models, it is critical to decide what represents “the same model” to be versioned. Unlike in the world of Software Engineering where the concept to be versioned is a file / folder that evolves over time, in the world of Analytics Models, there could be several, potentially customer specific definitions. For example, one definition could be “same training parameters, but trained on the latest, most comprehensive data available”. Yet another, more relaxed definition could be “any training parameters, but same training dataset and goal” or even “any training parameters, on any historical version of the training dataset”.   Once this concept is agreed upon within the customer's organization, and if training is done in a ThingWorx service by calling the APIs, for a given jobName (model name) one can simply query the tags for a LatestVersion type tag, increment, and create the new model with the same jobName and incremented tag. Any model version with the same jobName and its corresponding performance metrics can then be accessed using the tag. Additional tags (such as techniques used, dataset version, etc) can be added if desired to make retrieval of context dependent models more efficient.
View full tip
Announcements