Timers and Schedulers can be useful tools in a ThingWorx application. Their only purpose, of course, is to raise events that the platform can use to perform any number of tasks: requesting data from an edge device, running alert calculations, archiving data, and so on. Sounds like a simple enough process. Then why do most platform performance issues seem to come from these two simple templates?
It all has to do with how the event is subscribed to and how the platform processes events and subscriptions. Handling most events and their related subscription logic falls to the EventProcessingSubsystem. You can see its metrics via the Monitoring -> Subsystems menu in Composer, which shows how many events have been processed, how many are waiting in the queue, and some other settings. You can often spot Timer and Scheduler problems here: the number of queued events climbs while the number of processed events stagnates.
But why!? Shouldn't this multi-threaded processing take care of all of that? Most of the time it can, but when you suddenly flood it with transactions that all try to access the same resources at the same time, it can grind to a halt.
This typically occurs when you create a Timer or Scheduler and subscribe to its event at the template level. To illustrate, let's look at an example. Imagine we have 1,000 edge devices that we must poll for data every 5 minutes. When we retrieve the data we must look up some mapping in a DataTable and store the result in a Stream. At the 5-minute interval the Timer fires its event, and the EventProcessingSubsystem suddenly receives 1,000 events at once. That by itself is not a problem, but to be efficient the subsystem will try to process as many of them concurrently as it can. We now have many transactions all querying a single DataTable at the same time. To read that table, the database (no matter which back-end persistence provider) will lock part or all of it, depending on the query. As you can probably guess, things slow down: each transaction holds the lock while many others wait to acquire it, over and over until all 1,000 transactions are complete. Meanwhile, inside those same transactions, we are also running the rest of the subscription logic and writing Stream entries to the same database. And remember, all of these transactions and the data they touch must be held in memory while they run, so you will also see a memory spike and, depending on resources, can run into a problem there as well.
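The flood can be sketched in plain Node.js (this is a simulation, not ThingWorx service code): 1,000 handlers all start at once, but a shared lock on the DataTable forces them to run one at a time, while every transaction stays open in memory. `Mutex`, `handleTimerEvent`, and `DEVICE_COUNT` are illustrative names invented for this sketch.

```javascript
// Simulates a template-level subscription: one timer event fans out into
// 1,000 concurrent handlers that all contend for the same database lock.
class Mutex {
  constructor() { this.chain = Promise.resolve(); }
  // Queue fn behind every previously queued fn, like a table/row lock.
  lock(fn) {
    const run = this.chain.then(fn);
    this.chain = run.catch(() => {});
    return run;
  }
}

const dataTableLock = new Mutex();
let inFlight = 0;
let peakInFlight = 0;

async function handleTimerEvent(deviceId) {
  inFlight += 1;
  peakInFlight = Math.max(peakInFlight, inFlight);
  // Every handler needs the same lock to read the mapping DataTable.
  await dataTableLock.lock(async () => {
    // query the DataTable, then write the Stream entry, still serialized
  });
  inFlight -= 1;
}

const DEVICE_COUNT = 1000;
// The template-level subscription fires all handlers for one timer event.
const done = Promise.all(
  Array.from({ length: DEVICE_COUNT }, (_, i) => handleTimerEvent(i))
);
done.then(() => console.log("peak open transactions:", peakInFlight)); // 1000
```

All 1,000 "transactions" are open simultaneously even though the lock serializes the actual work, which is exactly the memory-spike-plus-contention pattern described above.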
Regular events can easily be part of any use case, so how do we make this one work? The trick comes in two parts. First, any event a Thing raises can be subscribed to on that same Thing. When you do this, the subscription transaction does not go through the EventProcessingSubsystem; it executes on the threads already open in memory for that Thing. So subscribing to a Timer's event on the Timer Thing that raised it will not flood the subsystem.
Second, in the previous example, how would you go about polling all of those Things? Simple: take the exact logic you would have executed in the template subscription and move it into the Timer's own subscription. To keep the context of each Thing, call the template's GetImplementingThings service to retrieve the list of all 1,000 Things based on it, then loop through them and execute the logic. This also means the DataTable queries and logic run sequentially, so the database locking issue goes away as well. Memory pressure drops too, because the memory allocated for each query is either reused or freed during garbage collection, since the variable holding the result is reassigned on each pass of the loop.
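The shape of that fix can be sketched in plain Node.js (again a simulation, not ThingWorx service code): one subscription on the Timer Thing that loops over the implementing Things sequentially. `getImplementingThings` and `poll` are stand-ins for the template's GetImplementingThings service and for the logic moved out of the template-level subscription.

```javascript
// Stand-in for ThingTemplates["EdgeDeviceTemplate"].GetImplementingThings().
function getImplementingThings() {
  return Array.from({ length: 1000 }, (_, i) => ({
    name: "EdgeDevice_" + i,
    poll() { return { device: this.name, value: 42 }; } // per-Thing logic
  }));
}

let openWorkItems = 0;
let peakOpenWorkItems = 0;

// Subscription on the Timer Thing itself: ONE handler, one transaction,
// visiting the Things one after another instead of 1,000 at once.
function onTimerEvent() {
  let result; // reassigned each pass, so old results can be garbage collected
  for (const thing of getImplementingThings()) {
    openWorkItems += 1;
    peakOpenWorkItems = Math.max(peakOpenWorkItems, openWorkItems);
    result = thing.poll(); // DataTable lookup + Stream write would go here
    openWorkItems -= 1;
  }
  return peakOpenWorkItems;
}

console.log("peak concurrent work items:", onTimerEvent()); // 1, not 1,000
```

Because only one work item is ever in flight, the DataTable is queried by one "transaction" at a time and the lock contention from the earlier scenario never appears.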
Overall, it is best not to use Timers and Schedulers whenever possible. Use data-triggered events, UI interactions, or REST API calls to initiate transactions instead. That lowers the overall risk of flooding the system with resource demands, from processor to memory to threads to database. Sometimes, though, they are needed. Follow the basic guidance here and things should run smoothly!
First of all, thanks for the amazing post.
But with this approach you will have one big, long-running transaction that locks a lot of resources for its whole execution, and in my experience that can be worse... As you know, when you call a service a new transaction lock is acquired and it isn't released until the service call ends. It would be good to have partial transactions for your approach, I mean stopping and restarting the transaction after each Thing's service call.
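Assuming, as this comment describes, that each service invocation acquires and releases its own transaction lock, one way to approximate "partial transactions" would be to move the per-Thing work into a helper service and call it per chunk, so locks are released between chunks instead of being held for the entire loop. This is a hedged sketch in plain Node.js; `chunk` and `processChunkService` are illustrative names, not platform APIs.

```javascript
// Split the full list of Things into smaller batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Stand-in for a helper service; under the stated assumption, each call
// would run as its own shorter transaction.
function processChunkService(thingNames) {
  return thingNames.length; // real logic: mapping lookup + Stream writes
}

const thingNames = Array.from({ length: 1000 }, (_, i) => "EdgeDevice_" + i);
const batches = chunk(thingNames, 100); // 10 shorter units of work
const processed = batches.reduce(
  (total, batch) => total + processChunkService(batch),
  0
);
console.log(batches.length, "batches,", processed, "things processed");
```

The trade-off is between one long lock-holding transaction and many shorter ones; whether nested service calls actually get separate transactions depends on the platform's transaction semantics, so treat this purely as a structural sketch.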
The solution isn't that easy either. What if only a few of the implementing Things need to be triggered? And when I say few: let's say we have 10k Things based on the given template and only 1k have to be triggered. Will you iterate over all 10k?
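One answer to this concern, sketched in plain Node.js: iterating over 10k in-memory objects is cheap compared to the per-Thing queries and writes, so a simple flag check can keep the expensive work proportional to the subset that is actually due. `needsPolling` is a hypothetical property invented for this sketch, not a platform field.

```javascript
// 10,000 stand-in Things, of which only every 10th one is due for work.
const allThings = Array.from({ length: 10000 }, (_, i) => ({
  name: "Thing_" + i,
  needsPolling: i % 10 === 0 // pretend 1,000 of the 10,000 are due
}));

let expensiveCalls = 0;
for (const thing of allThings) {
  if (!thing.needsPolling) continue; // cheap in-memory check, no query, no lock
  expensiveCalls += 1;               // the costly per-Thing logic runs only here
}
console.log(expensiveCalls, "of", allThings.length, "things did real work");
```

The loop still visits every Thing, but only the 1,000 flagged ones pay for database access.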
We have lots of Timers and Schedulers on our platform, for instance to decouple the different systems of our solution, and for plenty of other reasons. This Timer/Scheduler-centered approach doesn't let you componentize the solution well.
One use case that we find common, which depends on Timers/Schedulers and could be avoided if the platform shipped the feature, is a queue implementation. Do you know if there's any plan to implement a queue system out of the box?
Just one last thing: it's hard to know which Timer/Scheduler is actually the bottleneck. The total number of events doesn't help when you have lots of them. It would be good to have more in-depth telemetry (service level, Thing level, ...).
I've seen the negative effects of this happen both ways: a single ME/THING-level subscription transaction looping through many Things, and a whole bunch of Things subscribed to a single event. Generally, I think Adam is right about the best practice here; you can loop through a lot of stuff very fast in a single transaction. Security plays a role too: if the loop executes in an Administrator context it will be quite fast, but it will be slower as a non-Administrator, because security is checked on every Thing, property, and service you access in the loop. The best practice there is to use Template.QueryImplementingThings() and not Template.QueryImplementingThingsWithData(), which is far slower because it checks security on every property of every Thing.
Great post, glad to have sort of an unofficial best practice published here now.
Agree with Carles,
We would deeply appreciate a detailed performance monitor (task monitor). Sometimes it is very difficult to find the root cause, and in some cases the logs are not helpful at all (mainly for WebSocket communication).
Although the platform is packed with functionality, it is a shame that I see a lot of statements like this one:
Overall it is best not to use Timers and Schedulers (and this and that) whenever possible....
Without a doubt it is a great post, but it is a shame that one needs to find out these issues for oneself.
So which one is better?
Shall I implement the subscription on the ThingTemplate, or create the subscription on the Timer itself and have it call all of the Things from there?
The best way is to subscribe on the Timer Thing itself and call the services from there; that avoids creating many threads and executes everything in one thread.