Official name: DataStax Enterprise (DSE), sometimes referred to as Cassandra.
Note: DBA skills are required. Free self-paced training is available at Training | DataStax. The extension package can be obtained through Technical Support.
ThingWorx 6.0 introduces DSE as a backend database that scales to much larger data volumes, since Neo4j runs into performance limitations at around 50 GB. Some of the main reasons to consider DSE are:
1. Elastic scalability -- Allows you to easily add capacity online to accommodate more customers and more data when needed.
2. Always on architecture -- Contains no single point of failure (as with traditional master/slave RDBMSs and other NoSQL solutions), resulting in continuous availability for business-critical applications that can't afford to go down.
3. Fast linear-scale performance -- Enables sub-second response times with linear scalability (double the throughput with two nodes, quadruple it with four, and so on) to keep response times fast as load grows.
4. Flexible data storage -- Easily accommodates the full range of data formats - structured, semi-structured and unstructured -- that run through today's modern applications.
5. Easy data distribution -- Read and write to any node with all changes being automatically synchronized across a cluster, giving maximum flexibility to distribute data by replicating across multiple datacenters, cloud, and even mixed cloud/on-premise environments.
Note: Windows+DSE is currently not fully supported.
Prerequisite: a fully configured DSE database.
1. Obtain the dse_persistancePackage
2. Import as an extension in Composer.
3. In Composer, create a new persistence provider.
4. Select the imported package as Persistence Provider Package.
5. In Configuration tab:
- For Cassandra Cluster Host, enter the IP address set in cassandra.yaml, or localhost if DSE is hosted locally
- Enter a new or existing Cassandra Keyspace name
- Enter Solr Cluster URL
- Other fields can be left at default (*)
6. Go to Services and execute the TestConnectivity service to confirm that it returns true.
7. When creating a new Stream, Value Stream, or Data Table, set its Persistence Provider to the one created in the previous steps.
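If TestConnectivity fails, it can help to first confirm that the machine can reach the address entered for Cassandra Cluster Host at all. The following is a minimal sketch of such a pre-check, not part of the extension: it assumes the cluster listens on Cassandra's default native transport port 9042 (adjust if your cassandra.yaml sets a different one), and the function name can_reach is made up for this illustration.

```python
import socket

# Assumption: 9042 is Cassandra's default native transport (CQL) port.
# Change it if cassandra.yaml configures a different port.
CQL_NATIVE_PORT = 9042


def can_reach(host: str, port: int = CQL_NATIVE_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only proves that something is listening on the port; it does not
    authenticate or run any CQL, so it complements TestConnectivity rather
    than replacing it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False
```

For example, can_reach("localhost") checks a locally hosted node. A False result points at a network, firewall, or address problem rather than at the ThingWorx configuration.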
Currently all reads and writes are done through ThingWorx, and all ThingWorx data is encoded in DSE. OpsCenter still lets you see the connected streams, data tables, and value streams.
* SimpleStrategy can be used for a single data center, but NetworkTopologyStrategy is recommended for most deployments because it makes it much easier to expand to multiple data centers later.
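To make the difference between the two replication strategies concrete, here is a small sketch that builds the corresponding CQL CREATE KEYSPACE statements as strings. It is for illustration only: the keyspace name thingworx, the data center name DC1, and the helper keyspace_cql are assumptions, and whether you need to create the keyspace by hand at all depends on your setup.

```python
from typing import Dict, Optional


def keyspace_cql(
    name: str,
    replication_factor: int = 1,
    datacenters: Optional[Dict[str, int]] = None,
) -> str:
    """Build a CREATE KEYSPACE statement.

    With no datacenters given, SimpleStrategy is used with a single
    replication factor. With a mapping of data center name to replica
    count, NetworkTopologyStrategy is used instead.
    """
    if datacenters:
        options = {"class": "NetworkTopologyStrategy", **datacenters}
    else:
        options = {"class": "SimpleStrategy", "replication_factor": replication_factor}
    # String values are quoted, integer replica counts are left bare.
    body = ", ".join(
        f"'{key}': '{value}'" if isinstance(value, str) else f"'{key}': {value}"
        for key, value in options.items()
    )
    return f"CREATE KEYSPACE IF NOT EXISTS {name} WITH replication = {{{body}}};"


# Hypothetical names, single data center:
print(keyspace_cql("thingworx"))
# Hypothetical names, three replicas in one named data center:
print(keyspace_cql("thingworx", datacenters={"DC1": 3}))
```

With NetworkTopologyStrategy, adding a second data center later means adding another entry to the replication map, which is why it is usually the safer starting point.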
Is there a limit of data per node?
1 TB is a reasonable limit on how much data a single node can handle, but in reality, a node is not at all limited by the size of the data, only the rate of operations. A node might have only 80 GB of data on it, but if it's continuously hit with random reads and doesn't have a lot of RAM, it might not even be able to handle that number of requests at a reasonable rate. Similarly, a node might have 10 TB of data, but if it's rarely read from, or there is a small portion of data that is hot (so it could be effectively cached), it will do just fine. If the replication factor is above 1 and there are no reads at consistency level ALL, other replicas will be able to respond quickly to read requests, so there won't be a large difference in latency seen from a client perspective.
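The point about replication factor and consistency level can be illustrated with a little arithmetic. The sketch below covers only the ONE, QUORUM, and ALL levels within a single data center, and the function names are made up for this example. It computes how many replicas must answer a read and how many can be slow or unavailable without the read being affected.

```python
def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a read to succeed."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def replicas_to_spare(consistency_level: str, replication_factor: int) -> int:
    """Replicas that may be slow or down without blocking the read."""
    return replication_factor - replicas_required(consistency_level, replication_factor)


for level in ("ONE", "QUORUM", "ALL"):
    print(level, replicas_required(level, 3), replicas_to_spare(level, 3))
```

With a replication factor of 3, ONE and QUORUM leave two and one replicas to spare respectively, while ALL leaves none, so a single busy node is enough to slow down every read at that level.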
Thank you for the article. About the "Cassandra Cluster Host" part, do you have a guide for setting one up, like the one for PostgreSQL ("Getting_Started_with_PostgreSQL_HA_and_ThingWorx-Administrators_Guide")? For now I just need a database that is as simple as possible, rather than a complete system like the Amazon EC2 one in the guide "Getting_Started_with_DataStax_Enterprise_and_ThingWorx". I asked this question once here: Thingworx Persistence Provider guidelines.
I'm interested in using Cassandra to try the Neuron extension. The database would hold a lot of timestamped numeric values, and some complex calculations would be run regularly on the collected data.
Thank you in advance for your answer
There is no guide as of right now. However, the free training mentioned above goes through the details and the practical side of setting up and working with Cassandra. This article covers more of the basics and how to connect an already configured database to ThingWorx.
Thank you for your interest!
Wow, great article on Cassandra. This will be very useful for Cassandra and DSE readers. I have one extra point to mention:
DSE delivers the only production certified version of Cassandra to the market and is the only version of Cassandra which ensures that both current and previous versions of DSE Cassandra are stable and trusted for production environments.