Gareth mentions this, but I use the concept of "normalizing" the XML/SGML. I use a $30 program called Search & Replace by Funduc Software (www.funduc.com). You can run it through a UI or through scripts. I have some scripts that run to hundreds of lines, manipulating the heck out of a document for one reason or another. The one I use to normalize simply puts everything on its own line, start tags always start the line... and so on.
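For anyone without Search & Replace, here is a rough Python sketch of the same normalizing idea. It is not my Funduc script, just an illustration: the function name normalize and the exact rules (every tag on its own line, every run of text on its own line, whitespace collapsed) are assumptions for the example. It also assumes no ">" characters inside comments, CDATA sections, or attribute values, so treat it as a starting point rather than a real parser.

```python
import re


def normalize(markup: str) -> str:
    """Put every tag and every run of text on its own line.

    Hypothetical helper for illustration only; a plain regex split,
    not an XML/SGML parser.
    """
    # Split on tags; the capture group keeps the tags in the result.
    parts = re.split(r"(<[^>]+>)", markup)
    lines = []
    for part in parts:
        # Collapse all whitespace, including the original line breaks,
        # so the editor's line wrapping no longer matters.
        text = " ".join(part.split())
        if text:
            lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "<para>Some text that was\n   wrapped by the editor <b>here</b>.</para>"
    print(normalize(sample))
```

Run both versions of a document through the same rules and the line breaks depend only on the markup, not on where the editor happened to wrap.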
The problem with relying on Arbortext to save a document is that it uses the recordlength setting to determine where to break a line. When you modify the document, the line breaks can shift a lot, even from entering the letter "a". Point is, just because you save both documents with Arbortext doesn't guarantee that they will line up when doing a line-by-line comparison. I use this concept every day trying to figure out why a new version of a document doesn't publish the same way its previous version did (barring changes, of course).
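To show the shifting effect, here is a small Python simulation. It does not use Arbortext at all: textwrap.fill stands in for the recordlength wrapping, and the width of 40, the sample text, and the helper name changed_lines are all made up for the example.

```python
import textwrap


def changed_lines(before: str, after: str, width: int = 40) -> int:
    """Count line positions that differ once both texts are wrapped.

    Hypothetical helper; textwrap.fill only imitates a fixed
    record length, it is not how Arbortext actually saves files.
    """
    old = textwrap.fill(before, width=width).splitlines()
    new = textwrap.fill(after, width=width).splitlines()
    # Compare position by position, the way a naive line diff would.
    differing = sum(1 for a, b in zip(old, new) if a != b)
    # Lines that exist in only one version also count as changed.
    return differing + abs(len(old) - len(new))


before = (
    "<para>The quick brown fox jumps over the lazy dog and then keeps "
    "running across the field until it reaches the river.</para>"
)
# The only real edit: one extra "a" near the start of the paragraph.
after = before.replace("The quick", "The a quick", 1)

if __name__ == "__main__":
    print(changed_lines(before, after), "wrapped lines differ")
```

One inserted letter pushes words onto later lines, so a line-by-line comparison reports several changed lines even though only one word was added. Normalizing both files first avoids that noise.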