
Timers and Schedulers - Best Practice


Timers and schedulers can be useful tools in a ThingWorx application.  Their only purpose, of course, is to create events that the platform can use to perform any number of tasks.  These can range from requesting data from an edge device, to doing calculations for alerts, to running archive functions for data.  Sounds like a simple enough process.  Then why do most platform performance issues seem to come from these two simple templates?

It all has to do with how the event is subscribed to and how the platform needs to process events and subscriptions.  The task of handling MOST events and their related subscription logic falls to the EventProcessingSubsystem.  You can see its metrics via the Monitoring -> Subsystems menu in Composer.  This will show you how many events have been processed and how many events are waiting in the queue to be processed, along with some other settings.  You can often identify issues with Timers and Schedulers here: you will see the number of queued events climb and the number of processed events stagnate.

But why!?  Shouldn't this multi-threaded processing take care of all of that?  Most times it easily can, but when you suddenly flood it with transactions all trying to access the same resources at the same time, it can grind to a halt.

This typically occurs when you create a timer/scheduler and subscribe to its event at a template level.  To illustrate this, let's look at an example of what might occur.  In this scenario let's imagine we have 1,000 edge devices that we must pull data from.  We only need to get this information every 5 minutes.  When we retrieve it we must look up some data mapping from a DataTable and store the data in a Stream.  At the 5 minute interval the timer fires its event.  Suddenly, all at once, the EventProcessingSubsystem gets 1,000 events.  This by itself is not a problem, but it will concurrently try to process as many as it can to be efficient.  So we now have multiple transactions all trying to query a single DataTable at once.  In order to read this table the database (no matter which back end persistence provider) will lock parts or all of the table (depending on the query).  As you can probably guess, things begin to slow down because each transaction holds the lock while many others are trying to acquire one.  This happens over and over until all 1,000 transactions are complete.  In the meantime we are also running other commands in the subscription and writing Stream entries to the same database inside the same transactions.  Additionally, remember that all of these transactions and the data they access must be held in memory while they are running.  So you will also see a memory spike and, depending on available resources, can run into a problem here as well.

Regular events can easily be part of any use case, so how can that work?  The trick to know here comes in two parts.  First, any event a Thing raises can be subscribed to on that same Thing.  When you do this the subscription transaction does not go into the EventProcessingSubsystem.  It will execute on the threads already open in memory for that Thing.  So subscribing to a timer event on the Timer Thing that raised the event will not flood the subsystem.

In the previous example, how would you go about polling all of these Things?  Simple: you take the exact logic you would have executed in the template subscription and move it to the timer subscription.  To keep the context of the Thing, use the GetImplementingThings service of the template to retrieve the list of all 1,000 Things created from it.  Then loop through these Things and execute the logic.  This also means that all of the DataTable queries and logic will be executed sequentially, so the database locking issue goes away as well.  Memory issues decrease too, because the variable that holds the query result is reassigned on each loop, so the memory allocated for the queries is either reused or can be cleaned up during garbage collection.
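
The loop described above might look roughly like the following sketch. Inside ThingWorx the script would use the platform's own ThingTemplates and Things collections; here both are replaced by small in-memory stand-ins so the loop can be run outside the platform. The template name EdgeDeviceTemplate, the service name PollAndStore, and the shape of the stand-ins are assumptions made for illustration, not names from the original post.

```javascript
// Stand-ins for the platform collections (assumptions, so the sketch runs outside ThingWorx).
var Things = {};
var pollLog = [];

function makeDevice(name) {
    return {
        // Hypothetical service holding the logic that used to live in the
        // template-level subscription: look up the mapping, read the device,
        // write the Stream entry. Here it only records that it ran.
        PollAndStore: function () {
            pollLog.push(name);
        }
    };
}

for (var n = 1; n <= 1000; n++) {
    Things["Device" + n] = makeDevice("Device" + n);
}

var ThingTemplates = {
    EdgeDeviceTemplate: {
        // Mimics an InfoTable-like result: a list of rows with a "name" field.
        GetImplementingThings: function () {
            return {
                rows: Object.keys(Things).map(function (key) {
                    return { name: key };
                })
            };
        }
    }
};

// Body of the subscription placed on the Timer Thing itself (its own Timer event).
function onTimerEvent() {
    var things = ThingTemplates["EdgeDeviceTemplate"].GetImplementingThings();
    var processed = 0;
    for (var i = 0; i < things.rows.length; i++) {
        var thingName = things.rows[i].name;
        Things[thingName].PollAndStore(); // runs sequentially, one Thing at a time
        processed++;
    }
    return processed;
}

var processedCount = onTimerEvent();
console.log(processedCount); // 1000
```

Because every call happens inside one loop, only one DataTable query is in flight at any moment, which is the point of moving the logic onto the Timer Thing.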

Overall it is best not to use Timers and Schedulers whenever possible.  Use data triggered events, UI interactions or REST API calls to initiate transactions whenever possible.  It lowers the overall risk of flooding the system with resource demands, from processor, to memory, to threads, to database.  Sometimes, though, they are needed.  Follow the basic guidelines here and things should run smoothly!

Comments

Hi Adam,

First of all thanks for the Amazing Post.

But with this approach you will have one big running transaction which will lock a lot of resources during its execution, and in my experience that can be worse... As you know, when you call a Service a new transaction lock is taken, and it isn't freed until the service call ends. It would be good to have partial transactions for your approach, I mean stopping and starting the transaction after each Thing's service call.

The solution is not that easy. What if only a few of the implementing Things need to be triggered? And by few I mean, let's say we have 10k Things based on the given template and only 1k have to be triggered: will you iterate over all 10k?

We have lots of Timers and Schedulers on our platform, for instance to split our solution into separate systems, and for a lot of other reasons. This Timer/Scheduler-centered approach doesn't let you componentize a solution well.

One use case that we find common, which depends on Timers/Schedulers and could be overcome if the feature came with the platform, is a queue implementation. Do you know if there's any plan to implement a queue system out of the box?

Just one last thing: it's hard to know which Timer/Scheduler is actually the bottleneck. The total number of events doesn't help when you have lots of them. It would be good to have more in-depth telemetry (Service level, Thing level, ...).

Best Regards,

Carles.

I've seen the negative effects of this happen both ways: in a single subscription at ME/THING level looping through many Things in one transaction, and with a whole bunch of Things subscribed to a single event. Generally, I think Adam is right regarding the best practice here: you can loop through a lot of stuff very fast in a single transaction. Security plays a role here. If this is executing in an Administrator context it will be quite fast, but it will be slower if you are executing as a non-Administrator, because security will be checked on all Things/properties/services that you are accessing in the loop. The best practice there is to use Template.QueryImplementingThings() and not Template.QueryImplementingThingsWithData(), which is far slower as it checks security on every property on every Thing.

Great post, glad to have sort of an unofficial best practice published here now.

Agree with Carles,

We would deeply appreciate a detailed performance monitor (task monitor). Sometimes it is very difficult to find the root cause. And in some cases the logs are not helpful at all (mainly for websocket communication).

Although the platform is packed with functionality, it is a shame that I see a lot of these around:

Overall it is best not to use Timers and Schedulers (and this and that) whenever possible....


Without a doubt it is a great post, it is a shame that one needs to find out these issues for oneself.

Hi,

So which one is better to have?

Shall I subscribe to the Timer at the Thing Template level, or create the subscription on the Timer itself and call all the Things from that timer subscription?

 

Thanks,

Sathishkumar C.

Hi Sathishkumar,

The best way is to call the services from the Timer Thing itself, which will not create many threads and will execute in one thread.

 

Regards,

Pankaj Phopse

Thanks for the input Pankaj.

@CarlesColl  @Sathishkumar_C @Pankajphopse1 

Thanks guys for such an amazing post, with a detailed discussion of the pros and cons of timer implementation.

Can you help with an implementation for the problem statement below?

1. I need to execute one service on each Thing every 5 seconds

2. This service triggers another service available on the same Thing

3. The Thing count can be around 10K

Can you help me decide which approach I should use for the above use case?

Any help appreciated.

Hello Sapnil,

 

For timers, PTC recommends not using intervals of less than a minute, as this might cause performance issues.

If you still want to use a 5-second timer, create a 5-second Timer Thing and execute your services from its own subscription. Create a loop with a try/catch snippet; this way it will execute all your services one by one, and if any error occurs you can either break or continue the loop.

But I would suggest not going with a 5-second timer, as you are going to execute 10k Things within a 5-second interval. If any Thing or service takes more time to execute, the queue size will grow, which might end up causing performance issues.
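
To make the numbers concrete: 5,000 ms divided by 10,000 Things leaves an average budget of 0.5 ms per Thing if one pass has to finish before the next timer event. The sketch below shows that arithmetic together with the try/catch loop suggested above. It is only an illustration that runs outside ThingWorx: runService is a hypothetical stand-in for the real service call, and the failure pattern is made up so the error path gets exercised.

```javascript
var TIMER_INTERVAL_MS = 5000; // the 5-second timer from the question
var THING_COUNT = 10000;      // roughly 10K Things

// Average time each Thing may take if one pass must finish before the next event.
var budgetPerThingMs = TIMER_INTERVAL_MS / THING_COUNT; // 0.5

// Hypothetical stand-in for calling the service on one Thing.
// Every 2,500th call fails so the catch branch is exercised.
function runService(index) {
    if (index % 2500 === 0) {
        throw new Error("service failed on Thing " + index);
    }
}

var succeeded = 0;
var failed = [];
var startedAt = Date.now();

for (var i = 1; i <= THING_COUNT; i++) {
    try {
        runService(i);
        succeeded++;
    } catch (err) {
        // Record the failure and continue with the next Thing.
        // Replace this with "break" to stop at the first error instead.
        failed.push(i);
    }
}

var elapsedMs = Date.now() - startedAt;
// If this is true, the next timer event fires before this pass has finished.
var overran = elapsedMs > TIMER_INTERVAL_MS;

console.log(budgetPerThingMs, succeeded, failed.length, overran);
```

Logging the elapsed time of each pass against the timer interval is a cheap way to notice when the loop is close to overrunning.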

 

 

Regards,

Pankaj Phopse.

Another thought/recommendation to add to Adam's excellent write up here.

 

Although he states that you should try to avoid Timers and Schedulers whenever possible, I would put on my architect hat and remind everyone of the balance of constraints you need to strike when designing any system.  As a general rule of thumb I agree; however, Data Change events and HistoricalDataLogged events can fire in very large numbers on busy systems, and they sit in the data flow path, so they risk jamming up that data flow.  Hence, you should think about finding the right balance between periodic tasks and how they're safely handled, and event-driven tasks and how they're handled.

 

Remember that the SDLC (Software Development Life Cycle) is an iterative process and requires analysis and redesign along the journey.  I often see performance situations where a particular design applied 3-4 years ago, when there were 100 machines, has broken down now that there are 1000 machines.  This is normal and not specific to ThingWorx; it is systems design - the scope of the performance envelope is very different and you don't know in advance where it will take an exponential turn.  The first approach was perhaps appropriate 4 years ago, and today a new approach is required.  Keep in mind that as applications scale, they will need to evolve iteratively - and that load/stress testing is the only way to safely and professionally ensure such levels of performance when scaling enterprise applications.

 

A downside to Adam's proposed approach is that only 1 EventProcessor thread will be used to execute all the work that you run from the for loop, which is safe but could take too long and might not benefit from the resources provisioned (threads, DB connections, CPU, memory, etc.).  You need to be much more careful doing this, but you can come up with ways to get some parallel advantages that fit your use case and leverage more threads.  An example of this could be a single Timer with a "TimerWorkerTemplate" which is subscribed to the Timer event, where you instantiate a number of entities corresponding to your worker threads.  As long as you have some defensive coding safeguards in place, you could also split the target entity list from the Timer into X groups, and then fork them off with an async service call - but you need to be 100% sure that the previous executions have completed.  One way to do this is to have them exit no matter what at 90% of the Timer interval; another is to put a sort of lock property somewhere that you set and unset (just be aware of script timeouts that might cause the unlock to be missed).
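
A rough sketch of those safeguards, under stated assumptions: the target list is split into X groups, each group stops at 90% of the Timer interval, and a lock flag skips a cycle while the previous one is still running. The names splitIntoGroups, processGroup, runTimerCycle and the lock object are hypothetical, and the groups run one after another here; in ThingWorx each group would be handed to an async service call, and the lock would be a property that is only cleared once all workers have finished.

```javascript
// Distribute names round-robin into groupCount groups.
function splitIntoGroups(names, groupCount) {
    var groups = [];
    for (var g = 0; g < groupCount; g++) {
        groups.push([]);
    }
    for (var i = 0; i < names.length; i++) {
        groups[i % groupCount].push(names[i]);
    }
    return groups;
}

// Work through one group, but exit no matter what once the deadline is reached.
function processGroup(group, deadlineMs, work, now) {
    var done = 0;
    for (var i = 0; i < group.length; i++) {
        if (now() >= deadlineMs) {
            break;
        }
        work(group[i]);
        done++;
    }
    return done;
}

// Stand-in for a lock property on some Thing (assumption).
var lock = { busy: false };

// Returns the number of Things handled, or -1 if the previous cycle is still running.
function runTimerCycle(names, intervalMs, groupCount, work, now) {
    if (lock.busy) {
        return -1;
    }
    lock.busy = true;
    var total = 0;
    try {
        // Deadline at 90% of the Timer interval.
        var deadlineMs = now() + Math.floor(intervalMs * 9 / 10);
        var groups = splitIntoGroups(names, groupCount);
        for (var g = 0; g < groups.length; g++) {
            total += processGroup(groups[g], deadlineMs, work, now);
        }
    } finally {
        // A script timeout on the platform may skip this, as noted above.
        lock.busy = false;
    }
    return total;
}

var names = [];
for (var n = 1; n <= 10; n++) {
    names.push("Device" + n);
}
var groups = splitIntoGroups(names, 3);
var visited = [];
var handled = runTimerCycle(names, 300000, 3, function (name) {
    visited.push(name);
}, Date.now);
console.log(groups.length, handled);
```

Passing the clock in as a parameter keeps the deadline logic testable without waiting for real time to pass.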

 

As I say, be very careful with this approach, because if it is not done right you can get into trouble.  Also keep in mind that the default Event Processor core pool size is 16, so ideally you could think about using 5 of those threads for processing that might take a couple of minutes and 1 of them for very long processing, leaving 10 for normal ThingWorx operations.  Planning the use of these threads is required when you have long-running Services triggered by Subscriptions, as the threads can become hoarded by certain tasks, making other operations like Mashup access or telemetry processing slow down.  This is the noisy neighbour problem presenting in the Event Processor - be sure to be a good neighbour, use Timer/Scheduler best practices, and keep fast lanes open in the Event Processor.

Version history
Last update: Feb 26, 2017 03:58 PM