Based on Google's Spanner DB, CockroachDB is a distributed SQL database that scales horizontally and survives disk, machine, rack & even datacenter failures. It is built to automatically replicate, rebalance & recover with minimal configuration. See What is CockroachDB? for more.
Useful in use cases requiring distributed or replicated OLTP, multi-datacenter deployments and cloud migrations.
Note: CockroachDB in its current state isn't suitable for heavy analytics / OLAP workloads.
Features that make it really attractive
As mentioned above, it scales horizontally with minimal configuration out of the box, allowing a quick setup: starting from a local laptop/machine as shown below, it can scale easily to a single dedicated server or a development/public cloud cluster. Thanks to this easy setup, adding new nodes is as simple as starting the cockroach utility. See the CockroachDB FAQ for more. To top it off, it uses the PostgreSQL wire protocol and PostgreSQL's SQL dialect, further reducing configuration and special JDBC driver requirements when ThingWorx is configured with PostgreSQL as the persistence provider.
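Because it speaks the PostgreSQL wire protocol, any standard PostgreSQL client can also talk to a CockroachDB node. As a minimal sketch (assuming psql is installed and a node is running in insecure mode, as set up below), the connection URL the node prints at startup can be used directly:
psql "postgresql://root@10.128.13.183:26257/system?sslmode=disable"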
Setting up a CockroachDB cluster
Download the required binary or Docker image from Install CockroachDB, available for Mac, Linux & Windows
PS: The following setup uses the Windows binary on a VM with Windows 10 64-bit and 6 GB RAM.
Starting a cluster node
Open a command prompt, navigate to the directory where cockroach.exe is unzipped, and launch the node with the following command
cockroach.exe start --insecure --host=10.128.13.183 --http-port=8082
This will start a node on the defined host in insecure mode, with its web-based DB administration console on port 8082 and the DB listening on the default port 26257. Note that it will log a security warning, since the node is started in insecure mode via the --insecure flag, like so
*
* WARNING: RUNNING IN INSECURE MODE!
*
* - Your cluster is open for any client that can access 10.128.13.183.
* - Any user, even root, can log in without providing a password.
* - Any user, connecting as root, can read or write any data in your cluster.
* - There is no network encryption nor authentication, and thus no confidentiality.
*
* Check out how to secure your cluster: https://www.cockroachlabs.com/docs/stable/secure-a-cluster.html
*
CockroachDB node starting at 2018-03-16 11:52:57.164925 +0000 UTC (took 2.1s)
build:      CCL v1.1.6 @ 2018/03/12 18:04:35 (go1.8.3)
admin:      http://10.128.13.183:8082
sql:        postgresql://root@10.128.13.183:26257?application_name=cockroach&sslmode=disable
logs:       C:\CockroachDb\cockroach116\cockroach-data\cockroach-data\logs
store[0]:   path=C:\CockroachDb\cockroach116\cockroach-data\cockroach-data
status:     restarted pre-existing node
clusterID:  012d011e-acef-47e2-b280-3efc39f2c6e7
nodeID:     1
Ensure that the secure mode is used when deploying in production.
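The node's health can also be confirmed from the command line via the cockroach CLI (a minimal sketch, assuming the same host/port flags used above; run it from a second command prompt):
cockroach.exe node status --insecure --host=10.128.13.183 --port=26257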
Starting 2 additional nodes
Starting node 2
cockroach.exe start --insecure --store=node2 --host=10.128.13.183 --port=28258 --http-port=8083 --join=10.128.13.183:26257
Starting node 3
cockroach.exe start --insecure --store=node3 --host=10.128.13.183 --port=28259 --http-port=8084 --join=10.128.13.183:26257
Note: Both of these nodes join the cluster via 10.128.13.183:26257 (the port of node 1)
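Before checking the web console, cluster membership can also be confirmed from the CLI (a minimal sketch using the cockroach node ls command, with the same host/port flags as above; it should list node IDs 1-3):
cockroach.exe node ls --insecure --host=10.128.13.183 --port=26257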
Verifying the live cluster and nodes via the web-based CockroachDB admin console
Connecting to ThingWorx as an external datastore
Good news: if your ThingWorx is running with the PostgreSQL persistence provider, then no additional JDBC driver is needed, since CockroachDB uses the PostgreSQL wire protocol and its SQL dialect is that of PostgreSQL.
For any other persistence provider, download and install the PostgreSQL Relational Database Connector from the ThingWorx Marketplace.
Creating a database in the cluster
cockroach sql --insecure --port=26257
create database thingworx;
show databases;
root@10.128.13.183:26257/> SHOW databases;
+--------------------+
|      Database      |
+--------------------+
| crdb_internal      |
| information_schema |
| pg_catalog         |
| system             |
| thingworx          |
| thingworxdatastore |
+--------------------+
(6 rows)
CREATE USER cockroach WITH PASSWORD 'admin';
grant all on database thingworx to cockroach;
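To double-check the privileges just granted, they can be listed from the same SQL shell (a quick sketch using CockroachDB's SHOW GRANTS statement):
show grants on database thingworx;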
Creating a Thing & connecting to CockroachDB via ThingWorx Composer
For the example below, ThingWorx is using PostgreSQL as the persistence provider.
JDBC Driver Class Name : org.postgresql.Driver
JDBC Connection String : jdbc:postgresql://<serverIp/name>:26257/thingworx?sslmode=disable
Database User Name     : cockroach
Database Password      : <password>
Creating table(s)
Now that the Thing is connected to the database, DB objects can be created in a couple of ways, e.g. directly from the built-in SQL shell or via a service on the Database Thing in ThingWorx Composer.
The following command will create a small demo table
CREATE TABLE demo ( id INT, demovalue STRING)
Use SQLCommand as the JavaScript handler when using the above statement to create the table directly from ThingWorx's Database Thing
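As a quick sanity check, the new table can also be listed from the built-in SQL shell started earlier (a minimal sketch):
show tables from thingworx;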
Verifying the database & the table created within it via the CockroachDB web admin console
From the home page of one of the nodes' web admin consoles, e.g. http://localhost:8084, click on Databases in the left panel
Apart from other useful information about the database, e.g. its size and total number of tables, clicking on the table name will also show the SQL used to create it (including the defaults).
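The same definition can be retrieved from the SQL shell as well, without going through the web console (a minimal sketch):
show create table demo;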
Creating a couple of Database Thing services to perform bulk inserts into the table from ThingWorx Composer
Insert service created as a SQL Command with the code snippet below; the service takes 2 inputs, of type INTEGER and STRING
insert into demo values ([[id]], [[demoValue]])
JavaScript service executing a bulk demo data insert by wrapping the SQL service created above
for (var i = 0; i < 2000; i++) {
    var params = {
        id: i /* INTEGER */,
        demoValue: 'Insert from node 0 while node 3 is down' /* STRING */
    };
    // result: NUMBER
    var result = me.InsertDemo(params);
}
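Once the service has run, the inserted rows can be verified from the SQL shell on any node (a minimal sketch; the count should reach 2000 after the loop above finishes):
select count(*) from demo;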
At this point, different users in ThingWorx with sufficient access rights can create their own DB Things in ThingWorx Composer and can use any of the node addresses to read/write data to the CockroachDB cluster. For the purpose of the demo, one node was stopped while the other 2 were running, and data was written to the cluster via the test service created above. Once the 3rd node was restarted, the automatic replication between the nodes could be seen live via the web-based admin console of any of the running nodes.
As highlighted above, at a certain point in time, i.e. after 1500 hrs, all nodes were synced with the data, including node 3, which as mentioned above was down while data was being inserted into the cluster. All of the above replication was done using the default configuration.
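The catch-up on node 3 can also be spot-checked directly by pointing the SQL shell at node 3's port and re-running the count (a minimal sketch; it should match the count seen on the other nodes once replication completes):
cockroach sql --insecure --port=28259
select count(*) from demo;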
Hello Sushant,
Very useful blog, I will try this myself and reach out in case I get an exception. I do have a few other queries related to storing data external to the ThingWorx PostgreSQL DB.
I currently have a solution which writes machine data to a ThingWorx Stream for all Thing instances. I think in the future the Stream may not be able to handle such a load and may have to be split into multiple Streams. But I am not sure whether that will help with performance, as all stream data (even belonging to different Streams) goes into the same backend STREAM table, which is indexed on EntityId (stream Id). So my question is: if we split up a single Stream into multiple Streams (same persistence provider), would it really help performance? I did read Adam's blog where he recommends splitting a Stream into multiple Streams when it reaches a volume close to a million. But I am wondering whether it will still have some performance issues, as all the data is going to the same STREAM table with different entity Ids. Please let me know what you think?
2nd question: There was a talk on the PTC product roadmap about changing query services to run directly on the ThingWorx backend database or something similar to improve performance (it was slated for some 8.x version). Is that already there or is it in a future release?
3rd Question:
We do store some data related to Things in an external database, which is queried using a JDBC connection. The volume coming back is not high, as we delegate aggregating functions to the database server itself, which is faster to run through SQL. But my question is from a Mashup standpoint. We are planning to implement mashups that directly access the DB. If 100 users are trying to access data through mashups, which in turn send the SQL query to the DB for the same table name, will that create a bottleneck? Does the queuing happen automatically, where some users see the result for their query quickly while others have a minor lag? Will increasing ‘Maximum Number Of Connections’ help (the default is 5), and is there a limit on how high this number can be? Are there any other strategies to address external DB access?
Also, what about inserts/updates/deletes? If multiple users try to run these commands on the same table and rows (for updates), will such commands be automatically queued by the database server when sent from ThingWorx?
Thanks in advance !
@supandey, @CarlesColl -- can one of you share your thoughts on the questions I asked above... thanks in advance !
Any chance that CockroachDB will be available for Windchill? I would like to test it in that context.