Question
Arbortext Editor Parsers: XML vs. SGML
Good Day!
Hope someone here can shed some light on this situation. Thanks in advance for your attention to this somewhat long post.
We have a customer who wants their data in SGML format, rather than the XML that most of our customers use. We do our input and publishing using XML, then change the DTD declaration (and remove , etc.) to point to an "identical" SGML DTD. It seemed to be working fine until, during this revision cycle, someone thought to click the completeness check in Editor with the SGML open. Context checking was still 'on', but the SGML was reporting missing content in a place where the XML did not and does not.
Changing the PROCEDURE ELEMENT declaration in the SGML DTD to "...(topic | %text; |graphic)*))..." (use an '*' in place of the '+') eliminates the error message. My belief is that the XML parser sees the %text; entity being satisfied (%text;'s '*'s allow for zero content) as also satisfying the '+', but the SGML parser sees the '+' as overriding the entity's '*'s.
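To make that belief concrete, here is a minimal sketch that treats the content model fragment as an ordinary regular expression. It is only an illustration of regular-expression semantics, not of how either Arbortext parser works internally, and it does not explain the SGML behavior. The one-letter codes, the "text-item" name (standing in for any single member of %text;, whose full definition is not shown in this post), and the helper function are all hypothetical.

```python
import re

# Hypothetical one-letter codes for the elements named in the post.
# "text-item" stands for any single element allowed by %text; (assumption:
# the entity's real members are not shown in full here).
CODES = {"pfmatr": "p", "topic": "o", "text-item": "x", "graphic": "g"}


def children_match(model: str, children: list[str]) -> bool:
    """Return True if the whole child sequence matches the regex model."""
    encoded = "".join(CODES[name] for name in children)
    return re.fullmatch(model, encoded) is not None


# (pfmatr?, (topic | %text; | graphic)+), with %text; expanding to (...)*
PLUS_MODEL = r"p?(?:o|x*|g)+"
# The same fragment with the outer '+' changed to '*'
STAR_MODEL = r"p?(?:o|x*|g)*"
# What the '+' would demand if %text; could not match zero content
STRICT_MODEL = r"p?(?:o|x|g)+"

# The failing instance: after the title, only <pfmatr> is present.
after_title = ["pfmatr"]

plus_ok = children_match(PLUS_MODEL, after_title)      # '+' satisfied by an empty %text;
star_ok = children_match(STAR_MODEL, after_title)      # '*' allows zero iterations
strict_ok = children_match(STRICT_MODEL, after_title)  # needs at least one real child
```

Under these assumptions, the '+' and '*' versions both accept a procedure containing only pfmatr, because the empty %text; alternative can satisfy the single required iteration. Only the strict version, where every alternative must consume something, asks for more content, which is the behavior the SGML error message resembles.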
While the easy answer may seem to be to make both use '*' rather than '+', the question remains: why don't they both return the error? I'm not sure which one, if either, should be considered 'wrong' regarding the intent behind the structure, but shouldn't both parsers return the same response to what are essentially identical inputs?
Error message with SGML:
"/procedure Unexpected end tag encountered. More content is required."
Here's a 'sanitized' example of the tagged data, and the relevant portions of the DTDs.
Data:
...<procedure...><title>Title Data</title><pfmatr><pretopic>...</pretopic><pretopic>...</pretopic></pfmatr>(cursor here as result of error)</procedure>...
XML DTD:
| table | revst | revend)*" >
mfmatr?, chapter*) | increv | tr+) >
(pfmatr?, (topic | %text; |graphic)+))
| %deleted;) >
SGML DTD:
| table)*" >
mfmatr?, chapter*) | increv | tr+)
+(revst|revend) >
(pfmatr?, (topic | %text; |graphic)+))
| %deleted;) >
Thanks for any insight,
Steve Thompson
TAD Technical
Boeing-IDS Technical Publications
+1(316)977-0515
MC K83-08
The truth is the truth even if nobody believes it, and a lie is a lie even if everyone believes it.

