
Migrating 300GB+ data not using UI export functionality



Does anyone know what the most efficient way would be to migrate data from an old version of ThingWorx to the latest release?

We are using PostgreSQL as the persistence provider, running in cluster/HA mode.

The database is on a different machine than the Tomcat server hosting ThingWorx.

We tried an in-place upgrade and had some problems with it. The suggestion was to do a migration instead:

  1. New DB, installation scripts, etc.
  2. Export extensions and entities from the old platform.

We have tried the data export via Composer (Export to ThingWorx Storage), but that failed, and we don't have a lot of space on our application server.

I would like to use pgAdmin or psql to copy the content of the ValueStream, Stream and DataTable tables (perhaps wiki and blog as well).

Can anyone suggest some concrete SQL statements to do that?

Thanks a lot. 




Hello Thomas,


Just as an alternative idea that might work if you don't get a better answer -- you can try mounting a remote filesystem inside your exports directory (using NFS, for example). This way your exported data will be uploaded "automatically".


/ Constantine


What do you mean by uploaded "automatically"?

I still need to import the data from storage using Composer to get it into the database, don't I?

And our PROD environment fails to export data under normal operation, even when I do one stream at a time.

This could be overcome if we cut off all incoming traffic (edge things) and stop all Timers and Schedulers, but that takes too long, and we can only work with a 1-2 hour window.

My thinking was along the following sequence:

  1. Map /ThingworxStorage/exports/remote on your new machine to /ThingworxStorage/exports on your legacy box via NFS, Samba or whatever network filesystem is available in your OS
  2. Launch "Export to ThingWorx Storage" and wait until it completes. It will take quite some time, because all the data is transferred over the network at the same time.
  3. On the second system, simply launch "Import from ThingWorx Storage"
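Step 1 above could be sketched roughly like this -- hostnames ("new-machine", "legacy-box") are placeholders, and this assumes an NFS server is already installed on the new machine:

```shell
# On the NEW machine: export the target directory over NFS.
# Add a line like this to /etc/exports ("legacy-box" = old server's hostname/IP):
#   /ThingworxStorage/exports/remote  legacy-box(rw,sync,no_subtree_check)
sudo exportfs -ra   # reload the NFS exports table

# On the LEGACY machine: mount the remote directory into the exports folder,
# so "Export to ThingWorx Storage" writes straight onto the new machine.
sudo mount -t nfs new-machine:/ThingworxStorage/exports/remote /ThingworxStorage/exports
```

Unmounting (`umount /ThingworxStorage/exports`) once the export completes would leave the legacy box untouched.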

WARNING: I seriously don't recommend doing this directly in PROD, at least not before trying it on a similarly sized test environment a couple of times (what we call a "dry run").


On a side note, there are some base rules for doing major upgrades right:

  1. Start by creating an upgrade manual and update it every time you do a dry run. This is the most important document you'll have on a "go-live" day.
  2. Every time you perform an action -- record its duration and put it in your upgrade manual. That way you will know how long the whole process will take, and it will also help you identify issues (e.g. some step completed much faster than planned -- strange!)
  3. Define a rollback sequence (also with timing) and "plan B" procedures, e.g. "if the data import hasn't finished within 90 minutes, then execute the rollback"
  4. Practice taking backups and (even more importantly) practice restoring them -- also keep this as part of your rollback sequence, if necessary
  5. Document smoke tests, i.e. describe some test cases that will act as go/no-go criteria. Most of them (in practice not necessarily all) should pass. On the D-day execute them as carefully as possible, because typically the person who does the final acceptance checks is the one formally responsible for the success of the complete upgrade. Our golden rule is to have at least one test case for each mashup, and at least 5 test cases per user group, thing template, data table, etc. That comes to around 100 -- 200 test cases for an average upgrade, taking roughly one to two hours to execute manually.
  6. Communicate your plan to the customer well ahead of time and involve somebody from the customer side in important decision making, if possible
  7. Define a communication plan for the upgrade. For example, you need to be able to call your network admin if something goes wrong, so he needs to be aware of your upgrade and available throughout it. There are usually between 5 and 10 people who need to be available to minimize all risks (ideally including someone from PTC, if you manage to arrange it). Since upgrades are often done during weekends, this is something that shouldn't be underestimated.
  8. (Most importantly) Do several dry runs -- more is better, but no fewer than three full-scale ones
  9. Avoid doing early dry runs against the productive environment; upgrade your preprod instead. It would be very good to do at least one dry run with the real prod, though.
  10. Make sure that your preprod runs exactly the same code as prod and is sized similarly. Ideally it should also have a reasonably up-to-date data set

Violating any of those rules increases the risk of failure. But since you are already in production, most of those items should be familiar to you (e.g. I guess you already have a preprod environment), so this list is actually not as scary as it may look.


Hope it helps.


/ Constantine

That is a very extensive and elaborate answer :)

And yes, we had all of that.

Except for point 5.

Our tests were not that extensive, since we cannot properly simulate all incoming data (KepServerEX does not support sending the same data to several ThingWorx instances).

And we have found no major problems. 

Moreover, we did an in-place upgrade, so there was no need for a data transfer.

I talked to our customer; a network disk will not be possible :/


I found this for DB data transfer that might work:

pg_dump -U <Username> -h <host> -a -t <TableToCopy> <SourceDatabase> | psql -h <host> -p <portNumber> -U <Username> -W <TargetDatabase>


I would need to verify whether someone has done it and what the results were.
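As a sketch, the one-liner above could be looped over the runtime tables -- but note that the hostnames, the database name ("thingworx"), the user ("twadmin") and especially the table names below are assumptions; verify the actual table names against your schema (e.g. with \dt in psql) before running anything:

```shell
# Copy data-only dumps of selected tables from the old DB host to the new one.
# -a = data only (the new installation scripts have already created the schema),
# -t = restrict the dump to one table. All names here are placeholders.
for TABLE in value_stream stream data_table; do
    pg_dump -U twadmin -h old-db-host -a -t "$TABLE" thingworx \
        | psql -U twadmin -h new-db-host -p 5432 -W thingworx
done
```

Since -W forces a password prompt on every iteration, a ~/.pgpass file would make this scriptable. And as Constantine said above, I'd try it on preprod first.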