Ingestion and processing of large CSV files into InfluxDB

AaronD · ‎Apr 16, 2020

Hello PTC community,

I'm currently developing a project which needs to process a lot of data.

Some sensors report directly to a ThingWorx thing, so their data is automatically saved in a stream.

Other sensors cannot be connected to ThingWorx, but they write their data to CSV files which I can access at the end of the process.

The files will be quite large.

A rather small workpiece generates about 400 MB of CSV files, while larger workpieces could reach 1+ GB.

The files contain over 1 million entries.

1. The files will be available on a client machine. Do I need to transfer them to the ThingWorx server, or can I read them directly?

2. Should I use JavaScript or Java? (Most certainly Java, as the performance will be better, right?)

3. What is the best technique to transfer that much data into InfluxDB? Direct access? Streams?

4. Can I query multiple tables into one time series graph? Do I always need a thing to display entries?

Are custom queries possible, to reduce the data and shift the heavy lifting to InfluxDB?

Thanks for your help.

This project collects all this data to make it possible to certify the workpieces.

Greetings from Munich

Aaron

PaiChung · ‎Apr 16, 2020

Reading the size and scale of what you have in the CSV files, I'm not so sure you want to use the 'main' ThingWorx platform to do this processing. I think the load it would put on the system would be enormous.

I would recommend engaging with PTC's services/technical team to understand what might be the best way.

I think:

Yes, the file has to be moved so that it is local to the machine that processes it

Ingest the file using a separate server

Find a way to write the rows to InfluxDB as efficiently as possible
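
To illustrate the last point, here is a minimal sketch of what such a separate ingestion step could look like. It reads the CSV row by row (so a 1 GB file never has to fit in memory), converts each row to InfluxDB line protocol, and groups the lines into batches. With 1 million rows and a batch size of 5,000, that would be 200 write requests instead of 1 million. The CSV layout (a `timestamp` column in epoch nanoseconds plus numeric sensor columns), the measurement name `workpiece_sensors`, the `workpiece_id` tag, and the function names are all assumptions made for the example, not something taken from the actual files.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List, TextIO


def escape_key(value: str) -> str:
    """Escape commas, equals signs and spaces, which are special in
    line protocol tag keys, tag values and field keys."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def rows_to_lines(csv_file: TextIO, measurement: str, workpiece_id: str,
                  time_column: str = "timestamp") -> Iterator[str]:
    """Lazily turn CSV rows into line protocol strings.

    Assumes (hypothetically) that `time_column` holds epoch nanoseconds
    and that every other column is a numeric sensor channel.
    """
    for row in csv.DictReader(csv_file):
        timestamp = int(row.pop(time_column))
        fields = ",".join(
            f"{escape_key(name)}={float(value)}"
            for name, value in row.items()
            if name is not None and value not in (None, "")  # skip empty cells
        )
        if fields:  # line protocol needs at least one field per point
            yield (f"{measurement},workpiece_id={escape_key(workpiece_id)} "
                   f"{fields} {timestamp}")


def batched(lines: Iterable, size: int) -> Iterator[List]:
    """Group an iterable into lists of at most `size` items."""
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Tiny in-memory stand-in for a real sensor file.
sample = io.StringIO(
    "timestamp,temperature,force\n"
    "1587000000000000000,21.5,103\n"
    "1587000000001000000,21.6,\n"
)
for batch in batched(rows_to_lines(sample, "workpiece_sensors", "WP 42"), 5000):
    payload = "\n".join(batch)  # one request body per batch
    print(payload)
```

Each `payload` is a newline-separated block of points that could be sent in a single request, for example to the `/write` HTTP endpoint of InfluxDB 1.x. The right batch size depends on the setup and would need to be measured.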

AaronD · ‎Apr 16, 2020

Hi PaiChung,

thank you for your expertise.

Can you give me a link or email address for the right team? :)

Best regards

Aaron

PaiChung · ‎Apr 16, 2020

I'm not sure through what means you are subscribed to ThingWorx.

If you are already a client/partner you should have a contact.

Aside from that, I would use the contact information provided with the different programs that we have available.

Sorry that I'm not able to directly give you the right contact.

AaronD · ‎Apr 17, 2020

Ah no that's ok.

Then I will ask my colleagues.

My company is a ThingWorx partner.
