finding (and removing) AT processing instructions with ACL

jsulak
1-Visitor

Hello Adepters,

I'm trying to search for and remove specific Arbortext processing
instructions (such as ) using ACL.
However, when I use xpath_nodeset($arr, "//processing-instruction()"),
the resulting nodeset does not include any of the Arbortext-specific
processing instructions. Unfortunately I think this is the correct
documented behavior.

So, my question is, since I can't do it that way, is there another
approach I could take? I don't want to remove ALL of them (which I
could use "save -nopi" for) but just specific ones.

More generally, what I'm trying to accomplish is removing the shading
from a table, so if there's another way to accomplish that in ACL without
directly manipulating the PIs, I'm all ears.

Thanks,

-James

18 REPLIES

Hi James--

Yes, you're right. For whatever reason, Arbortext chose to hide their
own PIs from the XPath processor. Luckily, the
oid_find_children() function groks those PIs, so you can do this pretty
easily using it instead of XPath:

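# untested, debugging left as an exercise
function nukeCellShading(doc) {
    local $pis[];
    local i;
    # collect every _cellfont PI under the document root
    oid_find_children(oid_root(doc), $pis, "_cellfont");
    for (i = 1; i <= count($pis); i++) {
        # delete only the PIs that carry a Shading attribute
        if (oid_has_attr($pis[i], "Shading")) {
            oid_delete($pis[i]);
        }
    }
}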

HTH.

--Clay

bibach
1-Visitor
(To:jsulak)

Hey, James...

Just off the top of my head (or, the top of the help file, anyway... haven't
actually played with this), I'd suggest using xpath_nodeset to iterate
through all table cells in the document and tbl_oid_cell (I think) to get
the cell "ID", which you can use with tbl_cell_fontpi to get the _cellfont
PI OID.

-Brandon 🙂

jsulak
1-Visitor
(To:jsulak)

Thanks, Brandon. Trying it out on a single cell, tbl_cell_fontpi() does
return an OID that I can delete, and in fact, that function can do the
deletion itself, which is convenient.

Thanks,

-James


jsulak
1-Visitor
(To:jsulak)

Thanks, Clay.

For anyone who's interested, this code (using oid_find_children to
access the PIs directly):

    oid_find_children($table_oid, $nodeset, "_cellfont");
    for ($i in $nodeset) {
        if (oid_has_attr($nodeset[$i], "Shading")) {
            oid_delete($nodeset[$i]);
        }
    }

is significantly faster than this code (translating the OIDs into cell
IDs):

    oid_xpath_nodeset($table_oid, $nodeset, ".//entry");
    for ($i in $nodeset) {
        $cell_id = tbl_oid_cell($nodeset[$i]);
        tbl_cell_fontpi($cell_id, 'delete');
    }


-James

bibach
1-Visitor
(To:jsulak)

Yeah, "not the quickest one" has been said about me, too. After all those
years hanging around with Ed, I'm afraid some of it may have rubbed off on
him, too. 😉

Given previous findings on the relative speed of XPath vs. native (OID)
navigation functions, I'm not surprised that the oid_find_children version
is faster, particularly given that all the navigation is done in a single
function (which is probably written in C or C++, possibly with a thin ACL
wrapper around it), leaving just iteration of the results to the interpreted
ACL code.

-Brandon 🙂

This subject is fortuitous for me. I'm trying to figure out how, as part of the save process, to remove some Arbortext PIs, but not all of them.

Here's an example of my current document:


<xref format="dita" scope="local" type="topic">

</xref>

<ph id="save_oppy_intro"><ph id="save_oppy_intro2">The Spring '09 ...

I want to keep the "Pub_previewtext" PI, but I do not want the "Pub Caret" PI.

I think that the code presented in this thread can help me do that, but I don't know how to hook into the save process.

Can someone help me grope my way through this?

Steve

bibach
1-Visitor
(To:jsulak)

Steve,

You need to set up a "save" (and possibly "saveas") callback, which is a
function you write, then register with Editor so that it will be called on
save. In your case, you'll probably run your "PI fixup" code, then return
from the function with the "continue the save" code.

Type "help 148" at the Editor command line and it should bring up the
relevant help topic, which starts by describing the "doc_add_callback"
function (search for it, if "help 148" doesn't get you there), which is how
you register your function to be called. The rest of the topic describes
the various types of document callbacks you can create, including save and
saveas.

HTH...

-Brandon 🙂

Brandon Ibach
Developer, Single-Sourcing Solutions, Inc.

ebenton
1-Visitor
(To:jsulak)

Nah, I was already slow.


naglists
1-Visitor
(To:jsulak)

Steven,
If you are successful at removing the Pub Caret selectively, please publish
how you did it. We, too, would like to remove this on Save but have not been
successful at doing so. Arbortext does not handle every PI equally. I wish
there were an option to tell Editor NOT to create this PI ever.

If I'm doing this right, it appears that Paul is correct: the Pub Caret is special.

Here's my code:

    $retval = doc_add_callback(current_doc(), 'save', 'clean_up');
    function clean_up (doc, op) {
        oid_find_children(oid_null(), $nodeset, "Caret");
        for ($i in $nodeset) {
            response("Found a caret PI");
        }
        response("Got save. doc = " . doc . "; op = " . op);
    }

I get the response that indicates the save callback is triggered, but I do not get the response indicating that a PI has been found. If I change the match from "Caret" to an element that I know is in the document, I do get the expected response.

Looking into the save callback event, it says that save is executed before the file is saved. I think this PI is written as part of the save, so, it doesn't exist then. I then tried the same thing based on the destroy event, but with the same lack of success.

Any ideas?

Steve
naglists
1-Visitor
(To:jsulak)

>
> Looking into the save callback event, it says that save is executed before
> the file is saved. I think this PI is written as part of the save, so, it
> doesn't exist then. I then tried the same thing based on the destroy event,
> but with the same lack of success.
>
I hadn't considered when I last attempted this ... that the Pub Caret
didn't exist at the time my code ran. I don't know why ... it's possible I
was looking for it at other times, it's been a bit and I don't remember ...
it's possible it just didn't occur to me. In any case, that is an
interesting thought ... that it isn't handled differently, it is simply
inserted after any hook/callback might find it.

--
Paul Nagai

In that case, you might be stuck having to run a post-processor on the
file after it's saved, either a quick-and-dirty Perl script or something
more robust that parses the file and deletes the PI from the doc
structure before resaving it.
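
A quick-and-dirty post-processor along those lines might look like the sketch below. It is written in Python rather than Perl, and it assumes the caret PI is serialized as `<?Pub Caret?>`, possibly with some trailing data before the closing `?>`; check a saved file to confirm the exact form. The name `strip_caret_pis` and the sample string are made up for illustration.

```python
import re

# Assumption: the caret PI is written as "<?Pub Caret ...?>". Verify this
# against a real saved file; the pattern is illustrative, not documented syntax.
CARET_PI = re.compile(r"<\?Pub\s+Caret.*?\?>", re.DOTALL)


def strip_caret_pis(text: str) -> str:
    """Return text with caret PIs removed; every other PI is left alone."""
    return CARET_PI.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample; the _previewtext PI stands in for a PI worth keeping.
    sample = '<p>Spring<?Pub Caret?> release<?Pub _previewtext text="x"?></p>'
    print(strip_caret_pis(sample))
```

To use it on a real document, read the saved file into a string, pass it through `strip_caret_pis`, and write the result back once Editor has finished the save.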


Shouldn't the destroy callback be able to catch it, though? The file has been written before the destroy, but it appears that oid_find_children doesn't find it there, either.

    $retval2 = doc_add_callback(current_doc(), 'destroy', 'remove_pis');
    function remove_pis (doc) {
        oid_find_children(oid_null(), $nodeset2, "Caret");
        for ($i in $nodeset2) {
            response("Found a caret PI");
        }
        response("Got destroy. doc = " . doc);
    }
bibach
1-Visitor
(To:jsulak)

I suspect the problem is that the caret PI is *never* in the document, as
far as the in-memory model is concerned. Its location is noted as the document
is read and set as it is written, but it only ever exists on disk.

I'm guessing the goal is to avoid the problems that can occur with
downstream processing due to the caret PI popping up in unexpected or
inconvenient locations. I know we had to fix up our publishing code at
least a few times for this reason. So, if you've decided that maintaining
the last cursor position across editing sessions is not important, why not
use the save callback to just move the cursor to a location where you know
the PI won't cause a problem? You could move the cursor back after the save
is complete, either by calling the save command during the callback (making
sure the callback doesn't go into a loop in the process) or by setting a
timer to restore the cursor.

-Brandon 🙂

naglists
1-Visitor
(To:jsulak)

You are correct in your guess about our intent anyhow. Documentum
occasionally chokes on the PI for some reason.

We knew something about where the PI upset Documentum so we actually check
for that bad-ish location and move the PI "up a few lines" if we find it in
the bad-place so as to preserve, as much as possible, the friendly behavior
authors like.

ebenton
1-Visitor
(To:jsulak)

I'm not a Documentum expert, nor do I play one on TV, but it seems to me
that a product with such close ties to Arbortext Editor should have a
way of dealing with any and all Arbortext PIs.


Well, that's a good point. But, we're XyEnterprise Contenta users and their custom Perl scripts that "aid" in an Epic checkout go out of their way to remove certain Arbortext PIs (mainly Pub carets).

Dave

I would imagine all these problems are to do with the bursting (or
chunking, or whatever you call it). If a PI appeared at a burst point,
in which CMS object would it live? If you are not bursting the XML to
create virtual/compound documents then the PI can simply be stored with
the rest of the XML in a single CMS object blob.

Bit of a bummer for the CMS vendors, but it's quite clear from the XML
spec that the intention of PIs is that they are ignored by all XML
processors (well, except the one they are meant for ;]).

-Gareth
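
To illustrate: a generic XML parser sees a PI as a target plus opaque data, so a tool outside Editor can drop selected PIs from the parsed structure and leave the rest untouched. The sketch below uses Python's standard `xml.dom.minidom` and assumes the caret PI has the target `Pub` and data beginning with `Caret`. The helper names are hypothetical. Note that minidom reserializes the whole document, so the XML declaration and some formatting may differ from what Editor wrote.

```python
from xml.dom import minidom, Node


def is_caret_pi(node) -> bool:
    # Assumption: the caret PI has target "Pub" and data starting with "Caret".
    return (
        node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
        and node.target == "Pub"
        and node.data.lstrip().startswith("Caret")
    )


def remove_caret_pis(xml_text: str) -> str:
    """Parse xml_text, delete caret PIs from the tree, and reserialize it."""
    doc = minidom.parseString(xml_text)
    # Collect matches first, then remove, so the tree is not mutated mid-walk.
    doomed = []
    stack = [doc]
    while stack:
        node = stack.pop()
        if is_caret_pi(node):
            doomed.append(node)
        stack.extend(node.childNodes)
    for node in doomed:
        node.parentNode.removeChild(node)
    return doc.toxml()
```

The same walk could keep or drop any other PI by changing the test in `is_caret_pi`.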
