1-Visitor
June 3, 2011
Question

entityrefs

This is a pretty open-ended question, but hopefully it will be an easy fix.

Over the past 6 years or so, we would get the occasional entity (&reg;,
&ndash;) appearing in our content (usually due to copy/paste from other non-XML
documents).

For some reason, this year we're getting all sorts of &ndash; and &mdash;
entities showing up, but I'm not aware of any significant changes made to
our DTD or configs.

Is there a simple way to force Arbortext to disallow (or automatically
resolve) these entity refs? We tell the writers to use only characters
that are on the keyboard, but copy/paste still lets a rogue &rsquo; or
the like show up.

Any thoughts appreciated,
Keith Berard
Milliman Care Guidelines

    5 replies

    1-Visitor
    June 3, 2011
    Play with this and see what you need:
    I think there are two preferences: "Convert Character Entities on Save:" and "Write Non-ASCII Characters As:".


    -Andy

    Andy Esslinger
    LM Aero - Tech Order Data
    (817) 279-0442 / (817) 777 3047
    1 Lockheed Blvd, MZ 4285
    Fort Worth, TX 76108-3916

    18-Opal
    June 3, 2011
    Or, in ACL-ese, "set entityoutputconvert" and "set writenonasciichar".
    🙂



    --C



    Clay Helberg

    Senior Consultant

    TerraXML


    berard (Author)
    1-Visitor
    June 3, 2011
    Enabling both didn't seem to do what I wanted, but setting just
    "Write Non-ASCII Characters As" to "Characters" did the trick.

    Thanks also for the ACL; I'll add that to our startup scripts.

    keith

    1-Visitor
    June 3, 2011
    You might also want to look at 'entityinputconvert', which should ensure entities are converted to characters when opening a document or pasting in content from an external source.

    Something to be aware of: if you place these options in a startup script, their settings will not necessarily be reflected in the preferences panel. You can use eval option() on the command line to get the setting for the current document.
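
    A minimal sketch of that check, assuming option() takes the option name as a quoted string (the option names are the ones mentioned in this thread; verify the call form against the ACL reference for your release):

    ```
    # Report the current value of each option for the active document.
    eval option("entityinputconvert")
    eval option("entityoutputconvert")
    eval option("writenonasciichar")
    ```

    Comparing these values with the preferences panel should show whether a startup script has set something the panel does not display.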

    David

    David S. Taylor

    Project Manager, Structured Information
    Institute for Research in Construction
    National Research Council Canada
    Bldg. M-23A, Room 239
    1200 Montreal Road, Ottawa, ON K1A 0R6
    1-Visitor
    June 3, 2011

    Keith,


    If you're talking about the em dash and en dash showing up in your document, it may be because automatic character substitution is enabled. It will seem erratic because it depends on where the cursor is when you press the '-' key: Arbortext enters a dash, an ndash, or an mdash depending on whether the character just before it is a letter, a number, or a space. You can also cycle through them by pressing the '-' key repeatedly. Since there is no ASCII character for those two, I assume they're being saved as entities.


    If this is the issue, you can turn it off either in the DCF file for the doctype or by remapping the '-' key in your init ACL file, depending on the scope you want to address.


    Hope this helps,


    Bob