Team,
I have a customer with a small application: 100 assets with 3 properties each, logging time series data into Postgres every 5 seconds.
We plan to use PostgreSQL for time series historization, but we have the following questions:
Are there any performance tests for persisting time series data into PostgreSQL vs. SQL Server vs. InfluxDB?
What performance tests has PTC done for PostgreSQL?
What best-practice limits should we consider when historizing time series data into PostgreSQL, and what aspects/limitations should we analyze before moving to another historization solution?
TKs,
Chris
Check the Why Use InfluxDB in a Small ThingWorx Application topic, which compares PostgreSQL vs Influx (mostly in terms of ingestion speed and query response time).
I also shared some thoughts about storage scalability (with regard to time series data) in the Scalability of Thingworx in terms of storing data topic, but it didn't get much attention.
In my case the source is a machine with 140 sensors generating data at 1/60Hz (1 per minute), so ~2.5 datapoints per second.
For your case (100x3/5 = 60 datapoints per second), I don't think PostgreSQL would work well; I'd use Influx.
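If it helps, here's the same arithmetic as a quick Python sketch (only the numbers already mentioned in this thread, nothing measured):

```python
# Back-of-the-envelope ingestion rates for the two cases above.

# My case: 140 sensors, each sampled once per minute.
sensors = 140
sample_period_s = 60
my_rate = sensors / sample_period_s          # ~2.3 datapoints per second

# Your case: 100 assets x 3 properties, logged every 5 seconds.
assets, properties, period_s = 100, 3, 5
your_rate = assets * properties / period_s   # 60 datapoints per second

print(f"my case:   {my_rate:.1f} datapoints/s")
print(f"your case: {your_rate:.1f} datapoints/s")
```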
If you need HA, you'd have to buy the Enterprise version; if you don't, you can get away with the free version.
Due to the way ThingWorx uses PostgreSQL, there is quite a big overhead in terms of number of rows when storing data (each datapoint for each sensor takes one row, so for 100 assets x 3 properties you'd get 100x3 = 300 rows every 5 seconds in the Stream table). That would be 300x12x60x24 = ~5,200,000 rows per day.
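Just to make the growth visible, here is the same row math extrapolated (a sketch using only the figures above):

```python
# Rows written to the stream table, assuming one row per datapoint
# per property (100 assets x 3 properties, logged every 5 seconds).
rows_per_write = 100 * 3              # 300 rows every 5 seconds
writes_per_day = 12 * 60 * 24         # one write every 5 s -> 12 per minute
rows_per_day   = rows_per_write * writes_per_day
rows_per_year  = rows_per_day * 365

print(f"{rows_per_day:,} rows/day")    # 5,184,000 rows/day
print(f"{rows_per_year:,} rows/year")  # ~1.9 billion rows/year
```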
Storage-wise there is also quite a major overhead. The screenshot below shows a sample row (storing one datapoint for one sensor).
I'm not prepared to share exact numbers (it would take some time to double-check everything), but unless I've confused something major, PostgreSQL needs roughly 40 times more storage space.
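If you want a figure for your own setup rather than my rough ~40x, you can measure it directly on the PostgreSQL side. A minimal sketch, assuming the time series rows land in a table named value_stream (check your ThingWorx schema, the name may differ by version/persistence provider) and with placeholder connection details:

```python
# Measure the actual per-row storage cost on your own PostgreSQL instance.
# "value_stream" is an assumed table name -- verify it in your schema.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="thingworx",
                        user="twadmin", password="changeme")
with conn, conn.cursor() as cur:
    cur.execute("SELECT count(*) FROM value_stream")
    rows = cur.fetchone()[0]
    cur.execute("SELECT pg_total_relation_size('value_stream')")
    total_bytes = cur.fetchone()[0]

print(f"{rows:,} rows, {total_bytes:,} bytes total "
      f"(~{total_bytes / max(rows, 1):.0f} bytes per datapoint)")
```

pg_total_relation_size includes indexes and TOAST data, which is what you actually pay for on disk, so dividing it by the row count gives a realistic bytes-per-datapoint figure to compare against whatever InfluxDB reports.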