Community Tip - You can Bookmark boards, posts or articles that you'd like to access again easily! X
For one of the customers, we are working on application development where Snowflake DB data (Power bi View tables) need to be inserted in postgres through ThingWorx schedular but while doing so ThingWorx Apache Tomcat went down. We are handling here around 22 lac rows to insert it in database. While doing so we are experiencing this. Please help us to improve this.
Is there other way to insert this many rows from snowflake DB to postgres without affecting ThingWorx performance?
Solved! Go to Solution.
In step 3 of the link that I sent you it says "Transferring data is accomplished by sending a POST request to SQLpipe's /transfer/create route. You can send this kind of command any way you want - from an Airflow DAG, a Cron command, or just Curl on your command line."
So you can use the TWX ContentLoader Functions, triggered by subscribing to the Scheduler event to send a POST request to start the transfer. There is no other guide, you will have to figure out the implementation yourself.
1) 22 "lac", is that 2.2 million rows?
2) Why would you do that via ThingWorx? This it a potentially long-running transaction which is not a task ThingWorx is optimized for.
3) You did not specify why it went down. Did you investigate? This is important information
4) You did not clearly state how you are extracting the data, which API/interface you are using
5) Chunk the data. 2.2 million rows will create a large transaction and eat up a lot of memory when done in one block. Create chunks of 10k rows or so and do the iterations.
1) yes its 2.2 million rows
2)we replicating power bi dashboard which are using snowflake db where data is been changed frequently even for last month as well in snowflake. So what we are doing is, we are using schedular to write the data for last 12 month in postgress once in a day around 12 am UTC.
3) i tried to investigate it, while writing this data CPU usage got almost 99% and thingworx stopped working. then i had the end the task in task manager and restart the tomcat.
4)we are connecting to snowflake by database thing by providing necessary configuration
Aside from the scheduler, ThingWorx has no real job here, IMO.
If you want to keep using ThingWorx for this task, break the transactions down into smaller ones. the REST interface of snowflake supports result paging, giving you only a couple hundreds lines per page. Not sure if that is also available with the JDBC driver, but you can just run multiple queries.
Or use an ETL tool for the job, e.g. sqlpipe https://www.sqlpipe.com/data-transfer-guides/snowflake-to-postgresql
i am exploring sqlpipe, is there any link where it will guide how to use this and integrate with snowflake and thingworx....
In step 3 of the link that I sent you it says "Transferring data is accomplished by sending a POST request to SQLpipe's /transfer/create route. You can send this kind of command any way you want - from an Airflow DAG, a Cron command, or just Curl on your command line."
So you can use the TWX ContentLoader Functions, triggered by subscribing to the Scheduler event to send a POST request to start the transfer. There is no other guide, you will have to figure out the implementation yourself.