SGML Change Tracking in XML documents



We are investigating using Change Tracking in our documents. Arbortext
Editor uses PIs for change tracking in SGML documents, but uses
namespace elements in an XML document.

Our documents are XML, but instead of a Schema, which would be
namespace-aware, we use a DTD, which is not. So whenever we try to
validate one of these files, it fails.

Does anyone know if there is a way to use the PIs for change tracking in
an XML document, instead of namespace elements?

We are currently using Editor 5.2 M051, but if there is a solution, we
would like to know if it would also work in 5.4.


That's odd. We use namespace elements in our XML, in 5.3 M030, using DTDs, and with no problem when validating the files.

Could it be that your files are not getting the proper namespace declaration(s) in the opening root tag?

Steve Thompson
TAD Technical
Boeing-IDS Technical Publications
MC K83-08
The truth is the truth even if nobody believes it, and a lie is a lie even if everyone believes it.


Hi Ed-

Well, things get a little complicated with change tracking. (Almost as
bad as tables; you can imagine what it's like when you start trying to
track changes *within* tables....) Even if you get the namespace/PI
issue sorted out, you'll probably have trouble validating change-tracked
documents outside of Arbortext. (I assume you're talking about something
other than Arbortext failing to validate the document; Arbortext itself
should grok the CT markup correctly.)

Here's an example: suppose you have a <figure> element which can contain
exactly one <graphic> element. Now suppose you edit a document, and
replace an out of date graphic with a new one. That part of the document
will now look like this (with attributes omitted for brevity):


<atict:del><graphic src="old.gif"/></atict:del>

<atict:add><graphic src="new.gif"/></atict:add>


If you replace the atict: elements with PIs, then an external
application will see this as a figure with two graphic child elements,
and consider the document invalid. Even with namespaced elements, an
external application that doesn't understand what <atict:del> and
<atict:add> mean will see this as a <figure> without the required child
<graphic>, and again you'll get the invalid document error.

If you need to validate the document outside Arbortext, you may need to
preprocess it to do something with the CT markup first (whether it's
namespaced elements or PIs). This could be done in any of the usual
ways, e.g. XSLT, Perl script, Java DOM code, etc.
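As a rough illustration of such a preprocessing step, here is a minimal Python sketch that "accepts" all tracked changes before the file goes to an external validator: it drops each <atict:del> together with its content and unwraps each <atict:add>. The namespace URI, the function names, and the idea of accepting (rather than rejecting) changes are assumptions made for the example, not something taken from Arbortext's documentation. It only handles the namespaced-element form of the markup, not the PI form.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the atict: prefix. Check the xmlns:atict
# declaration in your own documents and adjust if it differs.
ATICT_NS = "http://www.arbortext.com/namespace/atict"
DEL = f"{{{ATICT_NS}}}del"
ADD = f"{{{ATICT_NS}}}add"


def _append_text(parent, index, text):
    """Attach text immediately before position `index` among parent's children."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def accept_changes(elem):
    """Drop atict:del (with its content) and unwrap atict:add, in place."""
    i = 0
    while i < len(elem):
        child = elem[i]
        if child.tag == DEL:
            # Deleted content disappears; text that followed the wrapper is kept.
            elem.remove(child)
            _append_text(elem, i, child.tail)
        elif child.tag == ADD:
            # Added content is promoted to the wrapper's position.
            elem.remove(child)
            _append_text(elem, i, child.text)
            kids = list(child)
            for offset, kid in enumerate(kids):
                elem.insert(i + offset, kid)
            _append_text(elem, i + len(kids), child.tail)
            # i is not advanced, so the promoted children are visited next.
        else:
            accept_changes(child)
            i += 1
    return elem


if __name__ == "__main__":
    tracked = (
        f'<figure xmlns:atict="{ATICT_NS}">'
        '<atict:del><graphic src="old.gif"/></atict:del>'
        '<atict:add><graphic src="new.gif"/></atict:add>'
        '</figure>'
    )
    figure = accept_changes(ET.fromstring(tracked))
    print(ET.tostring(figure, encoding="unicode"))
```

Note that ElementTree does not keep the DOCTYPE declaration when it parses a file, so a real preprocessing script would have to write it back before handing the result to a DTD validator.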


Arbortext Editor doesn't complain. But we are using a third-party parser (NSGMLS) to validate our files, and it doesn't know what the namespace elements are. We figured that if we could use the PIs in XML, we could easily validate the files. We were hoping there might be a way to "fool" the Change Tracking feature into thinking that an XML document is SGML, or something similar.

Yeah, I see what you mean about having extra tags where they're not
allowed, even if inside a PI. We may have to scale this Change Tracking
idea down to a much more ignorant, home-grown one with just inline tags.

After thinking about this a little more, if your <figure> tag can only
contain one <graphic>, the Arbortext Editor interface would not let you
delete the tag, nor would it let you add a second one, so I don't think
that would be a problem.

Actually, Arbortext will let you do this by copying a <graphic> from
somewhere else and pasting it to replace an existing graphic. In that
case, the change tracking markup you get is exactly as I described.

This situation can also come up where you have alternate child elements.
For example, suppose <figure> can contain either a <graphic> or a
<subfigure> (but not both). You can delete the graphic and add a
subfigure, which will result in something like this:


<atict:del><graphic src="old.gif"/></atict:del>

<atict:add><subfigure>...</subfigure></atict:add>


And again, the external application will balk at this, not understanding
that it should ignore the contents of the <atict:del> as far as context
checking is concerned.
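To make the problem concrete, here is a small Python sketch of what a parser with no knowledge of the CT markup finds directly under <figure> in the graphic-to-subfigure case. The namespace URI and the sample document are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the atict: prefix.
ATICT_NS = "http://www.arbortext.com/namespace/atict"

# Hypothetical tracked document: graphic deleted, subfigure added.
TRACKED = f"""<figure xmlns:atict="{ATICT_NS}">
  <atict:del><graphic src="old.gif"/></atict:del>
  <atict:add><subfigure/></atict:add>
</figure>"""


def direct_child_names(xml_text):
    """Return the local names of the root element's direct children."""
    root = ET.fromstring(xml_text)
    return [child.tag.split("}")[-1] for child in root]


names = direct_child_names(TRACKED)
print(names)
```

Neither a graphic nor a subfigure shows up as a direct child, only the two change-tracking wrappers, which is why a content model of "graphic or subfigure" fails.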

I'm not sure what your workflow is like, i.e. where the validation in
NSGMLS is happening, but I see three possible approaches if you want to
stick with Arbortext change tracking:

a) insert a preprocessing step to remove/resolve the CT markup
before validating

b) somehow make NSGMLS understand what to do with the CT markup

c) when you save the doc, actually make two saved copies: one with
the CT markup and the other with changes applied (there's a save flag
for this, IIRC). The latter would be the version that NSGMLS sees, at
least as far as validation goes.

Of course, developing your own change tracking markup standard would
also work, but you'd better be ready to invest significant time and
energy into it. Having done something similar before, I can tell you
it's far from trivial.


Not to nitpick (who do I think I'm kidding?), but that example uses
namespace elements, not PIs. Can't say I've ever examined the Change
Tracking 'stuff', so didn't know firsthand what was used.

Steve Thompson

Hi Steve-

The PI version looks something like this:


<graphic src="old.gif"/>

<graphic src="new.gif"/>


The problems are somewhat different w.r.t. external parsers for PIs vs.
namespaced markup, but they still exist.

You can find out more about CT markup by looking at Arbortext's document
on the subject:

Thanks for the input, Clay. We are probably going to go with something
similar to your option c) below.
