Windchill System Backup and Disaster Recovery Replication with VMware and Veeam

GaryMansell · ‎Sep 04, 2012

Hi all,

I am wondering if anyone has any experience of running a Windchill 10 System in a VMware environment with Veeam Backup and Disaster Recovery Replication (http://www.veeam.com) configured that they might like to share? Or does anyone have any views on the suitability of a System configured this way for Disaster Recovery purposes?

Currently our physical Windchill 9.1 System is replicated to a second set of servers in a Co-Location facility using Microsoft DFS for the Vaults and PTC Loadpoint, DBvisit for automated Oracle Log Shipping and WindchillDS Replication for LDAP.

This system has served us well for the last three years but it takes a lot of administrative effort to maintain both of the Systems and failover is a complex and manual process that takes a significant amount of time.

I want to virtualise our newly upgraded Windchill 10 System onto the VMware platform this year to gain all the benefits inherent in such a system (snapshots/cloning etc). It seems to me that Veeam would greatly reduce the administrative effort required to maintain a second Disaster Recovery copy of the System at a remote location, and would also improve our Backup System performance by backing up at the VM level rather than at the OS level. Further, it seems that Veeam could be used to generate an always up-to-date exact copy of the Production System as a TEST environment running in an isolated Sandbox (it can retain the same Windchill System name).

If I ran Oracle 11g on a Windows 2008 R2 platform, both Oracle and the Windows OS itself are Windows VSS aware and hence can be quiesced before the Veeam snapshots are taken, so that the OS and the DB are stored in the VMDK image as a transactionally sound backup.

What concerns me somewhat is that Veeam backup/replication of VMs does not save the complete state of the running machine (i.e. it does not include the VMware memory snapshot); it only snapshots the VMDK. So what about the Windchill Java app itself and WindchillDS LDAP: are they going to start reliably each time from a crash-consistent VMDK image backup (i.e. as if started after a power failure of the original host)?

Any advice or experience on this type of Disaster Recovery System would be greatly appreciated.

Best Regards

Gary Mansell

ptc-4901412 · ‎Feb 07, 2013

Gary,

We are looking to do the same thing. I see that you never got a response but wonder if you have implemented it yet and what experiences you would like to share. I really like the capability of Veeam to stand up a copy of production as a test server without having to change the name of the server.

Thanks

Dwight

MichaelSchumach · ‎Nov 12, 2013

Gary,

Like Dwight, we are looking at how best to create a disaster recovery strategy for our Windchill implementation. During our last Windchill migration from 9.1 to 10.1, we created a VM and have Windchill 10.1 running in a VM environment.

We too are looking for others who are using DFS and VMware, and for the strategies they use for disaster recovery.

Mike

GaryMansell · ‎Nov 13, 2013

In the end I decided that Veeam was not the correct solution for DR/Failover of our new Windchill 10.1 System. Instead I chose to reproduce the Application level replication setup that we use on our current 9.1 System (DFS for App and Vaults, DBvisit for Oracle Log Shipping and WindchillDS Replication for Administrative LDAP). The improvement that I have made, though, is to virtualise all the servers with VMware so that I can now snapshot the system (before installing patches/upgrades etc) and clone it to create a Validation Environment more easily.

I investigated several technologies, from block level disk replication to Veeam, and there were several reasons for my decision. I was not happy that the failover system would perform well enough under Veeam, and I was also not happy with all the snapshots and the effect that they might have on the performance of the PROD server. We were also looking at very long cycle times between snapshot replications, which would not have been as good as our Application level replication cycle times. Also, if the system gets out of sync with these sorts of techniques, the whole disk can be corrupt and unreadable at the time of failure. Complete re-syncs can take a very long time, and a colleague reports that he has had to keep doing them to fix issues, so that counted against Veeam.

I came to the conclusion that Application level replication is much safer and better performing, and this is what I went with. It requires some manual steps and takes about 30 minutes to switch the system between servers, but it works and is tried and tested here.

The worst case scenario is that the DB server fails, because it has a DBvisit replication period of 2 minutes, so we could lose up to 2 minutes of transactions in this case (if we could not recover the live Oracle logs from the failed server's disk). If the application server fails, the other sources are replicated in real time, so we should be in a better place. I was also told by an expert at PTC that the Windchill application always persists its data in the DB immediately rather than holding transactions in RAM; this way the DB should be crash consistent if the machine just dies.
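
To make the arithmetic above explicit, here is a minimal Python sketch of the worst-case figures. It assumes the numbers quoted in this post (a 2 minute DBvisit period and roughly 30 minutes to fail over) and treats the "real time" DFS and WindchillDS replication as a zero-second interval, which is an idealisation. The class and function names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicatedComponent:
    """One replicated part of the system (hypothetical helper type)."""
    name: str
    interval_seconds: float  # 0 is used here to model "real time" replication


def worst_case_data_loss_seconds(components):
    """Worst-case data loss is bounded by the slowest replication interval."""
    return max(c.interval_seconds for c in components)


# Figures taken from the post; the 0-second intervals are an assumption.
components = [
    ReplicatedComponent("Oracle (DBvisit log shipping)", 120),
    ReplicatedComponent("Vaults and loadpoint (DFS)", 0),
    ReplicatedComponent("WindchillDS LDAP replication", 0),
]
FAILOVER_MINUTES = 30  # approximate manual switch-over time quoted above

rpo_seconds = worst_case_data_loss_seconds(components)
print(f"Worst-case data loss window: {rpo_seconds / 60:.0f} minutes")
print(f"Approximate time to switch servers: {FAILOVER_MINUTES} minutes")
```

The point of taking the maximum is that the slowest replicated component sets the overall exposure; here that is the database, which matches the worst case described above.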

I hope that helps you.

Rgds

Gary

MichaelSchumach · ‎Nov 19, 2013

Gary,

Thanks for taking time to respond with additional information. This is helpful as we analyze our options for disaster recovery.

-Mike
