ThingWorx with the built-in Neo4j database has a 500K DataTable record limitation. Does the same limit also apply to ThingWorx with built-in PostgreSQL?
It does not have such a limitation. It is also not strictly true that Neo4j has such a limitation; however, the Neo4j database should be kept below 50 GB in size, because beyond that point querying and database backups become significantly slower.
I wouldn't store anything in a DataTable if you know you're going to have more than 10,000 rows. Use a proper relational database, and create stored procedures and SQL query services to access your data. I have extensive experience with ThingWorx performance issues that appear when DataTable sizes grow beyond a certain threshold; the exact size limit varies depending on the number of columns.
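As a rough illustration of that advice, here is a minimal sketch of a purpose-built table plus a stored function on an external PostgreSQL instance. Everything here is hypothetical (host, user, database, table, column, and function names are placeholders, not part of any ThingWorx schema); a ThingWorx Database Thing would then call the function from a SQL query service.

```shell
# Hypothetical example: create a properly indexed table and a stored
# function on an external PostgreSQL server. All names are placeholders.
psql -h db.example.com -U twuser -d appdata <<'SQL'
CREATE TABLE IF NOT EXISTS asset_readings (
    id         bigserial   PRIMARY KEY,
    asset_id   text        NOT NULL,
    reading_ts timestamptz NOT NULL,
    value      double precision
);

-- Composite index tailored to the access pattern, something the
-- generic DataTable blob storage cannot give you.
CREATE INDEX IF NOT EXISTS idx_readings_asset_ts
    ON asset_readings (asset_id, reading_ts DESC);

-- Filtering and sorting happen on the database server, not in ThingWorx.
CREATE OR REPLACE FUNCTION latest_readings(p_asset text, p_limit int)
RETURNS SETOF asset_readings AS $$
    SELECT *
    FROM   asset_readings
    WHERE  asset_id = p_asset
    ORDER  BY reading_ts DESC
    LIMIT  p_limit;
$$ LANGUAGE sql STABLE;
SQL
```

A ThingWorx SQL query service would then just issue `SELECT * FROM latest_readings('pump-01', 100);` and receive only the rows it needs.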
Thanks for the clarification, Polina.
I am not too concerned about the 50 GB size limit for now, because by the time that limit is reached, the ThingWorx platform will likely have been enhanced (in terms of its underlying architecture) for better performance.
Another question: do we expect the built-in Neo4j database to remain supported in the future, and until when?
Since you have extensive experience with performance issues caused by large data volumes, could you please help us understand the criteria for switching from the ThingWorx built-in database to an external one, e.g. size of data, minimum number of columns, etc.?
I believe that fetching large amounts of data from an external database across multiple tables would also cause performance issues under multiple concurrent requests. How would we overcome this problem when moving to an external database? Please guide us.
One more question: I have installed and set up an external Postgres database. How do I port my existing data from the ThingWorx built-in Postgres database to the external Postgres database? Is there a way to access the internal database?
10,000 rows is not 'a lot' of data.
To understand this a little better:
ThingWorx is designed to store all DataTable data in a single underlying table structure (whether the database is H2, PostgreSQL, or even Cassandra). It therefore has limited indexing. Checking this in PostgreSQL, Streams have an ID and timestamp index, while DataTables have an ID and some sort of composite index on the JSON blob values field. The rest of the information is stored as a JSON blob (this is also true for Streams, Wiki, and Blog entries).
This means that when you query these data structures, for a Stream it first applies a datetime filter, brings back the records, reconstructs the information into an actual InfoTable, and then applies the remainder of your query, if any. I'm not exactly sure how the DataTable index works.
I believe the DataTable indexes may be kept on the side as Solr indexes, though I'm not sure, since I do see an index on the values column in the schema.
However, it is true that DataTable performance isn't that good. Streams actually perform much better, and you can get reasonable performance out of Streams that hold millions of records.
DataTables hit their limit not at 10K rows, but rather somewhere in the 250-500K range.
Depending on how much data you have, and especially on what you need to do with it, an additional relational database (even another PostgreSQL instance) with your own table design, accessed over a JDBC connection, can speed up performance a lot, because you'll be leveraging another server to do database-side processing of your query before the results come back to ThingWorx.
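For instance, an aggregation that would otherwise mean pulling every raw row back into ThingWorx can run entirely on the database server, so only the summary rows travel over the JDBC connection. This is a sketch only: the host, user, database, and the `asset_readings(asset_id, reading_ts, value)` table are all hypothetical placeholders.

```shell
# Hypothetical: compute a 7-day daily average on the PostgreSQL server
# itself; ThingWorx receives a handful of summary rows, not millions of
# raw readings. Connection details and table name are placeholders.
psql -h db.example.com -U twuser -d appdata -c "
  SELECT asset_id,
         date_trunc('day', reading_ts) AS day,
         avg(value)                    AS avg_value,
         count(*)                      AS samples
  FROM   asset_readings
  WHERE  reading_ts >= now() - interval '7 days'
  GROUP  BY asset_id, day
  ORDER  BY asset_id, day;"
```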
If performance is a requirement and you anticipate a lot of data over the life of the app, then use a relational DB. I have an app running for a very large, well-known computer manufacturer with 50k users and over 100k new data rows stored daily. There are no performance issues. We've implemented stored procedures and created multiple indices to ensure queries are optimal. Additionally, we page datasets that return more than 100 rows per query. This keeps things very fast.
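The paging approach can be sketched like this, assuming a hypothetical `asset_readings` table and placeholder connection details (not the actual schema behind the app described above):

```shell
# Hypothetical paging query: fetch page 3 (rows 201-300) of a result set,
# 100 rows per page, keeping each round trip to ThingWorx small.
psql -h db.example.com -U twuser -d appdata -c "
  SELECT *
  FROM   asset_readings
  ORDER  BY id
  LIMIT  100          -- page size
  OFFSET 200;         -- pages already consumed * page size"
```

For very deep pages, keyset pagination (`WHERE id > :last_seen_id ORDER BY id LIMIT 100`) scales better than a growing `OFFSET`, since the server can seek the index instead of skipping rows.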
Do you have any comments on the question below?
I have installed and set up an external Postgres database. How do I port my existing data from the ThingWorx built-in Postgres database to the external Postgres database? Is there a way to access the internal database?
There isn't a ThingWorx edition with Postgres embedded. If you go to Persistence Providers, you will find the connection "ThingworxPersistenceProvider" to Postgres there; you just need to change the Configuration Settings on that entity to point to the other PostgreSQL instance. If you go to Persistence Providers and don't see anything, go to Advanced in the top search and set System Objects to true.
But if you believe you are on an embedded PostgreSQL ThingWorx installation and you never actually installed PostgreSQL yourself, then you may instead be on an H2 or Neo4j ThingWorx installation, which are indeed embedded; to move to PostgreSQL you will need a full export and import.
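If both ends really are PostgreSQL, one straightforward way to copy the ThingWorx data is a dump and restore at the database level (with the ThingWorx server stopped). A minimal sketch, assuming the hostnames, user, and database name are placeholders you would replace with your own:

```shell
# Hypothetical dump/restore from one PostgreSQL instance to another.
# Hosts, user, and database name are placeholders; the target database
# must already exist. Stop ThingWorx before dumping to get a clean copy.
pg_dump    -h old-host -U twadmin -d thingworx -F c -f thingworx.dump
pg_restore -h new-host -U twadmin -d thingworx --clean --if-exists thingworx.dump
```

After the restore, repoint the persistence provider (or the platform-settings connection details) at the new host. If the source is H2 or Neo4j rather than PostgreSQL, this does not apply, and you need the full ThingWorx export/import instead.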
Hi Pai Chung,
I just created a small proof of concept (POC) based on your comment about the 500K record/row limitation in the ThingWorx database.
However, we were able to store 1,000K (one million) records/rows, with 8 columns in each record (along with a JSON object). We were able to fetch the data successfully, which makes me believe the indexing is also working correctly. Even the query processing time is quite efficient, i.e. 40-50 ms.
Do you have any comments on this? Is there some other limit you were referring to?
Please find below some additional details:
Please let us know in case you need any other information.