We have a requirement to store information from devices in the field in ThingWorx. Each device will be streaming property changes to a ThingWorx Stream. Each device could have several (1-100) properties, and we expect to have fewer than 10,000 connected devices. Which persistence provider would be the best to use for this requirement?
- Neo4j
- PostgreSQL
- DataStax
- Other (if there is one... I'm not sure)
Please use DataStax - Cassandra for this.
If you already have a data warehouse, though, you could go with the PostgreSQL option and then transfer the information from PostgreSQL into your data warehouse.
Where would the "cut-off" be between Neo4j, PostgreSQL, and DataStax? Or can you explain when you would use one persistence store over the other?
At this point I usually do not recommend Neo4j; I believe we might slowly deprecate it.
As for the cutoff point: for a single Stream, it is probably a record count in the low millions, less than 10 million for sure.
Above that you should go to Cassandra rather than PostgreSQL.
So if you take 50 properties and 5K devices and then add an update rate, I believe you will hit that limit very quickly.
This is a scenario where information is all stored to the same Stream.
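To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes every property update becomes one Stream record, takes the upper bound mentioned above as 10 million records, and uses a made-up update rate of one update per property per hour. The function name and the update rate are illustrative only and are not ThingWorx settings.

```python
def hours_until_limit(devices, properties_per_device,
                      updates_per_property_per_hour, record_limit):
    """Hours until a single Stream reaches record_limit.

    Assumes one Stream record per property update (an assumption,
    not something ThingWorx guarantees).
    """
    records_per_hour = (devices * properties_per_device
                        * updates_per_property_per_hour)
    return record_limit / records_per_hour


# 5K devices x 50 properties, one update per property per hour (hypothetical),
# against an assumed ceiling of 10 million records in one Stream.
hours = hours_until_limit(5_000, 50, 1, 10_000_000)
print(hours)  # 40.0
```

So even at a modest rate of one update per property per hour, a single shared Stream would cross the 10 million mark in under two days, and faster update rates shorten that proportionally.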
I haven't heard any numbers yet about the total database size and what the limit would be. You could decide to store to separate Streams; that is not considered a best practice, but it is not prohibited either.
Robert/Pai,
Excellent post, right in line with our requirements: we must retain data streams for a specific time period in order to provide historical perspective (redisplaying a dashboard/mashup as of a period in the past) and to support analytics processing, prediction, and prescription.
A few follow-up questions.
1. Are Neo (built-in) and DataStax Cassandra currently the only two supported persistence providers? (We're at version 6.)
2. It is implied that DataStax custom-built an interface to become a selectable persistence provider. Is there a general-purpose ThingWorx API that allows me to "glue in" my own provider (PostgreSQL, MySQL, Oracle, other...)? Can I do this with open source Cassandra?
3. What is the best documentation to get started with configuring a persistence provider other than the default Neo (DataStax or otherwise)? We have a data warehousing requirement for streaming data, just as Robert originally stated. Again, I'm ready to do the legwork and development to figure it out; I just need a good starting point to get acquainted with the concepts as they relate to the ThingWorx platform.
Any pointers, snippets, or hints are appreciated.
Thanks
We also now support PostgreSQL.
Here are the 6.5 docs: http://support.ptc.com/cs/help/thingworx_hc/thingworx_6.5_hc/
For PostgreSQL and Cassandra you probably want to go here:
Thanks for the doc references. I will dig into them to set up a test.
Thanks,