finding non-keyboard characters, code, find char entity bug?

naglists
1-Newbie

Hi,

Had a need to find non-keyboard characters during a recent 5.1 to 5.3
upgrade. Developed an ACL to make it easy. Code included below. It is not
"user" ready, but it mostly gets the job done.

Interestingly, there are a handful of character entities that exhibit
strange behavior in Editor. They appear to always be preceded by a space.
You can delete this space if you edit the entity with Edit XML as Source;
however, Editor restores the space. One of the cedils is always preceded by
an open parenthesis. What?! Strange.

Is anyone else aware of this behavior? I'll be submitting a call against it,
but thought I'd see if it was known (or, by some stretch of the imagination,
a feature).

Anyhow, there were a couple of entities that survived the trip through the
pipeline and composition in 5.1 but did not in 5.3. The ACL helped find and
fix them. It also revealed some very strange characters that we think are
conversion artifacts, not from the upgrade, but from the FrameMaker to XML
via Interchange conversion years ago. These apparently have no impact on
formatting, but they are kinda creepy.
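
For readers without the ACL to hand, the general idea of scanning for non-keyboard characters can be sketched in Python. This is only an illustration, not the ACL described above: the function name `find_non_keyboard`, the `ignore` parameter, and the definition of "non-keyboard" as anything outside printable ASCII (plus newline, carriage return, and tab) are all assumptions made for the example.

```python
import unicodedata

def find_non_keyboard(text, ignore=()):
    """Return (offset, char, Unicode name) for each character outside printable ASCII.

    Hypothetical helper: "non-keyboard" is assumed to mean anything that is not
    printable ASCII, a newline, a carriage return, or a tab.
    """
    allowed_whitespace = {"\n", "\r", "\t"}
    hits = []
    for offset, ch in enumerate(text):
        if ch in allowed_whitespace or ch in ignore:
            continue
        if not (0x20 <= ord(ch) <= 0x7E):
            # Some code points have no name, so supply a fallback label.
            hits.append((offset, ch, unicodedata.name(ch, "<unnamed>")))
    return hits

# A cedilla letter, a standalone diaeresis, and an invisible zero-width space.
sample = "fa\u00e7ade \u00a8 ok\u200b"
for offset, ch, name in find_non_keyboard(sample):
    print(f"{offset}: U+{ord(ch):04X} {name}")
```

Reporting the code point and name alongside the offset helps with the "creepy" cases, since invisible characters such as a zero-width space are otherwise hard to spot in an editor.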

> Interestingly, there are a handful of character entities that exhibit
> strange behavior in Editor. They appear to always be preceded by a space.
> You can delete this space if you edit the entity with Edit XML as Source;
> however, Editor restores the space. One of the cedils is always preceded by
> an open parenthesis. What?! Strange.


This behavior, by the way, shows up when you try to make the acl ignore
these entities. The acl always finds them, even when told to ignore them,
because selection_markup() returns, for example, "^¨" (where ^ is a space)
instead of "¨". I guess I could modify the acl to ignore the entity and
its strange preceding space ... Anyhow, I wanted to clarify how that
behavior relates to this acl.
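
The "ignore the entity and its strange preceding space" idea might look something like the following, sketched in Python rather than ACL. The function `is_ignored` and the contents of the ignore list are hypothetical, and the strings stand in for whatever selection_markup() actually returns. The open-parenthesis case mentioned earlier is not handled.

```python
def is_ignored(selection, ignore):
    """True if the selection is on the ignore list, with or without one leading space."""
    if selection in ignore:
        return True
    # Tolerate exactly one leading space, and only when something follows it,
    # so that a lone " " is never treated as an ignored entity.
    if len(selection) > 1 and selection.startswith(" "):
        return selection[1:] in ignore
    return False

# Hypothetical ignore list: standalone diaeresis and cedilla.
ignore = {"\u00a8", "\u00b8"}
print(is_ignored(" \u00a8", ignore))  # leading space is tolerated
print(is_ignored(" x", ignore))       # "x" is not on the list
```

Stripping only a single leading space keeps the workaround narrow, so ordinary text that happens to start with whitespace is still compared as-is.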

--
Paul Nagai

Paul,

You encountered an edge condition having to do with how we pass around combining diacriticals (I hope I have the right phrase) internally. Under normal circumstances, if you had an "x" with a cedilla following it in the character flow then you should see an x with a cedilla under it on the screen. And if you selected it, you would get back both the "x" and the cedilla as two distinct characters. But when passing around just a cedilla, we do something special so that the diacritical does not try to combine. And you see this "something special" as that leading space.
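
The distinction between a combining mark and a standalone one can be seen in plain Unicode terms with Python's standard `unicodedata` module. This says nothing about how Editor is implemented internally. It only shows that "x" plus a combining cedilla is two code points, and that Unicode's own compatibility decomposition of the standalone cedilla and diaeresis is a space followed by the combining mark, which at least resembles the leading space described above. The helper name `standalone_form` is made up for the example.

```python
import unicodedata

def standalone_form(ch):
    """Return the compatibility decomposition (NFKD) of a single character."""
    return unicodedata.normalize("NFKD", ch)

# "x" followed by COMBINING CEDILLA (U+0327): renders as one glyph, stored as two code points.
combined = "x\u0327"
print(len(combined))
print(unicodedata.combining("\u0327") > 0)  # a non-zero class marks a combining character

# Standalone CEDILLA (U+00B8) and DIAERESIS (U+00A8) decompose to SPACE + combining mark.
print(unicodedata.decomposition("\u00b8"))
print(standalone_form("\u00b8") == " \u0327")
print(standalone_form("\u00a8") == " \u0308")
```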

This is something we can address at some point, because we are slowly getting rid of the legacy "non-unicode" parts of the software, but we don't have any work in this area scheduled today.

John Dreystadt
Software Development Director
Arbortext - PTC
734-352-2835

Hey John,
Thanks for the information. I'll save everyone some time and skip opening a
call.
