ThingWorx with the built-in Neo4j database is said to have a 500K data table record limitation. Does the same limit also apply to ThingWorx with the built-in PostgreSQL database?
No, PostgreSQL does not have such a limitation. It is also not strictly true that Neo4j has one; however, the Neo4j database should be kept below 50 GB in size, because beyond that point querying and database backups become significantly slower.
Thanks for the clarification, Polina.
I am not too concerned about the 50 GB size limit for now, because by the time we reach it the ThingWorx platform will likely have been enhanced (in terms of underlying architecture) for better performance.
Another query: do we expect the built-in Neo4j database to remain supported in the future, and until when?
I wouldn't store anything in a DataTable if you know you're going to have more than 10,000 rows. Use a proper relational database and create stored procedures and SQL query services to access your data. I have extensive experience with ThingWorx performance issues when DataTable sizes grow beyond a certain threshold; the exact size limit varies with the number of columns.
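To illustrate the kind of SQL query service being suggested, here is a minimal sketch using Python's sqlite3 as a stand-in for a real external database such as PostgreSQL. The table, column, and asset names are all invented for the example:

```python
import sqlite3

# sqlite3 stands in for PostgreSQL here; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sensor_readings (
        id INTEGER PRIMARY KEY,
        asset_id TEXT NOT NULL,
        reading REAL NOT NULL,
        recorded_at TEXT NOT NULL
    )
""")
# A dedicated index on the columns you filter by is exactly what the
# generic built-in DataTable storage cannot give you.
conn.execute("CREATE INDEX idx_asset_time ON sensor_readings (asset_id, recorded_at)")

rows = [("pump-01", 10.5 * i, f"2024-01-{i+1:02d}") for i in range(10)]
conn.executemany(
    "INSERT INTO sensor_readings (asset_id, reading, recorded_at) VALUES (?, ?, ?)",
    rows,
)

# Parameterized query: in ThingWorx this SELECT would be the body of a SQL
# query service on a Database Thing; in PostgreSQL it could be wrapped in a
# stored procedure/function.
cur = conn.execute(
    "SELECT reading FROM sensor_readings WHERE asset_id = ? AND recorded_at >= ?",
    ("pump-01", "2024-01-05"),
)
result = [r[0] for r in cur.fetchall()]
```

The point is that the database server does the filtering against a purpose-built index before any data reaches ThingWorx.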
Since you have extensive experience with performance issues caused by large data volumes, could you please help us understand the criteria for switching from the ThingWorx built-in database to an external one, e.g. data size, number of columns, etc.?
I believe fetching large amounts of data from an external database across multiple tables would also cause performance issues under many concurrent requests. How would we overcome this problem when moving to an external database? Please guide.
One more question: I have installed and set up an external PostgreSQL database. How do I port my existing data from the ThingWorx built-in PostgreSQL database to the external PostgreSQL database? Is there a way to access the internal database?
10,000 rows is not 'a lot' of data.
To understand this a little better.
ThingWorx is designed to store all DataTable data in a single table structure in the underlying database (H2/PostgreSQL, and yes, even Cassandra). It therefore has limited indexing. Checking this in PostgreSQL: Streams have an ID and a timestamp, while DataTables have an ID and some sort of composite index on the JSON blob values field. The rest of the information is stored as a JSON blob (this is also true for Streams, Wiki, and Blog entries).
This means that when you query these data structures, a Stream query first applies a datetime filter, brings back the records, reconstructs the information into an actual InfoTable, and then applies the remainder of your query, if any. I'm not exactly sure how the DataTable index works.
I believe the DataTable indexes may be kept on the side as Solr indexes, though I'm not sure, since I do see an index on the values column in the schema.
However, it is true that DataTable performance isn't that good. Streams actually perform much better, and you can get reasonable performance out of Streams holding millions of records.
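A rough sketch of the fetch-then-filter behaviour described above (an assumed model for illustration, not the actual ThingWorx schema), again using sqlite3 and JSON blobs:

```python
import json
import sqlite3

# Assumed model of the storage pattern described above: each entry is one row
# with an indexed timestamp, and the remaining fields packed into a JSON blob.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stream_entries (id INTEGER PRIMARY KEY, ts TEXT, value TEXT)")
conn.execute("CREATE INDEX idx_ts ON stream_entries (ts)")

entries = [
    (f"2024-01-{i+1:02d}", json.dumps({"temp": 20 + i, "site": "A" if i % 2 else "B"}))
    for i in range(10)
]
conn.executemany("INSERT INTO stream_entries (ts, value) VALUES (?, ?)", entries)

# Step 1: the datetime filter can use the index...
cur = conn.execute("SELECT value FROM stream_entries WHERE ts >= ?", ("2024-01-06",))

# Step 2: ...but the rest of the query can only be applied after decoding the
# blobs in application code, which is why such queries slow down at scale.
decoded = [json.loads(v) for (v,) in cur.fetchall()]
site_a = [e for e in decoded if e["site"] == "A"]
```

With a dedicated table design, the `site` filter in step 2 would instead run database-side against an indexed column.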
DataTables hit their limit not at 10K rows but more in the 250-500K range.
Depending on how much data you have, and especially what you need to do with it, an additional relational database (even another PostgreSQL instance) with your own table design and a JDBC connection can speed up performance a LOT, because you'll be leveraging another server to do database-side processing of your query before the results come back to ThingWorx.
Hi Pai Chung,
I just created a small proof of concept (POC) based on your comment about a 500K records/rows limitation in the ThingWorx database.
However, we were able to store 1,000K records/rows with 8 columns in each record (along with a JSON object). We were able to fetch the data successfully, which makes me believe that the indexing is also correct. Even the query processing time is quite efficient, i.e. 40-50 ms.
Do you have any comments on this? Is there some other limit you were referring to?
Please find below some additional details:
Please let us know in case you need any other information.
If performance is a requirement and you anticipate a lot of data over the life of the app, then use a relational DB. I have an app running for a very large, well-known computer manufacturer with 50K users and over 100K new data rows stored daily. There are no performance issues. We've implemented stored procedures and created multiple indices to ensure queries are optimal. Additionally, we page datasets that return more than 100 rows per query. This keeps things very fast.
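The paging approach described above can be sketched roughly like this, with sqlite3 standing in for the external relational database and all names invented for the example:

```python
import sqlite3

# Illustrative sketch of paging query results; sqlite3 stands in for the
# external relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"event-{i}",) for i in range(250)],
)

PAGE_SIZE = 100  # cap each response at 100 rows, as described above

def fetch_page(conn, page):
    """Return one page of results. For very large tables, keyset pagination
    (WHERE id > last_seen_id) scales better than OFFSET, but OFFSET keeps
    this sketch simple."""
    cur = conn.execute(
        "SELECT id, payload FROM events ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    )
    return cur.fetchall()

first = fetch_page(conn, 0)   # full page of 100 rows
last = fetch_page(conn, 2)    # final partial page of 50 rows
```

The client then requests page after page instead of pulling the whole result set into ThingWorx in one call.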
Do you have any comments on the question below?
I have installed and set up an external PostgreSQL database. How do I port my existing data from the ThingWorx built-in PostgreSQL database to the external PostgreSQL database? Is there a way to access the internal database?
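One common way to copy data between two PostgreSQL instances is pg_dump/pg_restore. Below is a sketch that only assembles the commands; the host names, role name, and database names are hypothetical placeholders, and you should confirm the exact repointing steps for the persistence provider against the PTC installation guide:

```python
# Sketch of moving data between two PostgreSQL instances with pg_dump /
# pg_restore. Host names, role names, and database names below are
# hypothetical placeholders; adjust them to your environment.
dump_cmd = [
    "pg_dump",
    "--host=localhost",       # instance currently used by ThingWorx
    "--username=twadmin",     # hypothetical role
    "--format=custom",        # custom format enables pg_restore options
    "--file=thingworx.dump",
    "thingworx",              # hypothetical source database name
]
restore_cmd = [
    "pg_restore",
    "--host=db.example.com",  # hypothetical external PostgreSQL server
    "--username=twadmin",
    "--dbname=thingworx",
    "--no-owner",             # let the target role own the restored objects
    "thingworx.dump",
]
# Once the placeholders are filled in, run e.g.:
#   import subprocess
#   subprocess.run(dump_cmd, check=True)
#   subprocess.run(restore_cmd, check=True)
```

Note that the ThingWorx schema stores DataTable/Stream rows as JSON blobs, so a dump/restore moves the data as-is rather than remodeling it into your own tables.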
There is no ThingWorx with PostgreSQL embedded. If you go to Persistence Providers, you will find the connection "ThingworxPersistenceProvider" to PostgreSQL; you just need to change the Configuration Settings on that entity to point to the other PostgreSQL instance. If you go to Persistence Providers and don't see anything, go to Advanced in the top search and set System Objects to true.
But if you believe you are on an embedded PostgreSQL ThingWorx installation and you never installed PostgreSQL yourself, then you are probably on an H2 or Neo4j installation, which are indeed embedded. To move to PostgreSQL you will need a full export and import.
Hi Carles Coll,
It would be helpful if you could share more details (process, commands, or scripts) on how to export the data model from the ThingWorx embedded PostgreSQL database to an external PostgreSQL database.
Does this article help, at least partially, with your question on exporting the data model?
Exactly as Carles mentioned, there's no embedded Postgres DB.
First, I would like to clarify the vocabulary: there is a ThingWorx version with Neo4j embedded (inside the WAR file), but the PostgreSQL version does not have the server embedded in the same way. You need to install and configure the PostgreSQL server before you deploy the WAR file in Tomcat.
After you have installed PostgreSQL, there are some steps you need to follow from the installation guide.
Note: there are two different installation packages, one that supports PostgreSQL and one that supports Neo4j. (Actually there are more than these two, but that's another discussion.) You need to deploy only one of them; you cannot use the Neo4j package with the PostgreSQL persistence provider.
I would also follow Wayne's advice regarding a relational DB. First, test your persistence provider against the queries you will actually execute. Is a simple fetch representative of your app? Try to estimate how many rows you'll have over the application's lifecycle: will it be 1 million? 10 million? A realistic query will also affect the search time, so try to think of a representative query for your app and run the test again.
There is also a useful article in the support section: https://support.ptc.com/appserver/cs/view/solution.jsp?n=CS204089&lang=en_US
Hope it helps in making your choice,
In case you need any clarification please don't hesitate to come back!
I need clarification on the points below:
Eklavya, yes, you will have to create SQL services for CRUD. You can check my blog, How to configure Oracle 12c's Pluggable Database as External DataStorage for ThingWorx. While it's written specifically for Oracle 12c, if you check the section Creating Services to access the PDB as external database source for ThingWorx, it should give you an idea of what the services would look like with SQL scripts. Let me know if you need more examples/info.
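As a rough idea, these are the four statements such CRUD services would wrap (a sketch only, with sqlite3 standing in for the external database and invented table/column names):

```python
import sqlite3

# sqlite3 stands in for the external database; in ThingWorx each statement
# below would be the body of a SQL command or SQL query service.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")

# CREATE - a SQL command service taking name/status as service parameters
conn.execute("INSERT INTO assets (name, status) VALUES (?, ?)", ("pump-01", "ok"))

# READ - a SQL query service returning an InfoTable
row = conn.execute(
    "SELECT name, status FROM assets WHERE name = ?", ("pump-01",)
).fetchone()

# UPDATE - a SQL command service
conn.execute("UPDATE assets SET status = ? WHERE name = ?", ("maintenance", "pump-01"))

# DELETE - a SQL command service
conn.execute("DELETE FROM assets WHERE name = ?", ("pump-01",))
remaining = conn.execute("SELECT COUNT(*) FROM assets").fetchone()[0]
```

Keeping each statement parameterized (rather than concatenating values into the SQL string) is what lets the services stay safe and reusable.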
Which external database are you using? Is there something specific you plan to do with transaction management, given that most RDBMSs have quite advanced built-in transaction management? If you are using PostgreSQL as your external database, you'll want to check the PostgreSQL administration guide (available from the PostgreSQL homepage).
Neo4j is reaching the end-of-support stage and is no longer provided to new customers. It is being replaced by the built-in H2 database option.
Hi Polina Osipova,
Could you provide a link to the support page where this end-of-support information is available?