
ACL - How to remove string with forward slashes?

ED_9927023
5-Regular Member


Objective is to remove a specific string with special characters from a file programmatically.

 

The problem is that I can remove everything leading up to the forward slash, but as soon as I include a forward slash in the $stringToRemove variable, the program fails to perform the substitution with the value "test". Ultimately I will replace "test" with an empty space. The lack of debugging tools makes this a bit tricky to solve.

 

$stringToRemove =  '<!string1 % string2 "string3//string4/string5"> %string6; '

 

$Double_Quote_Symbol  = chr(34)
$Percent_Symbol             = chr(37)
$Forward_Slash_Symbol = chr(47)

 

$stringToRemove = "string1 " . $Percent_Symbol . "string2" . $Double_Quote_Symbol . "string3//string4/string5" 

execute("substitute -a -c -noe -ws -noq /" . $stringToRemove . "/test")

 

Note: this is performed on an XML file opened as a text file using "edit -untagged"

 

I've attempted:

1) Using chr() with the decimal value of a forward slash, to ensure it is treated as a literal string value rather than as built-in command syntax.

2) Using the quote() function, in an attempt to return the literal string value so that the forward slash is not misread by the built-in functions. Unsuccessful.

 

 

$stringToRemove = quote('<!string1 % string2 "string3//string4/string5"> %string6;')

 

 

3) Building the string piece by piece and escaping each forward slash with a backslash, which did not work for me. Example:

 

 

 $stringToRemove = $string1 . $string2 . $string3 . "\/\/" . $string4

 

 

 

I noticed in the 'substitute' docs that ACL may perform what is essentially a tag-balancing check. Does ACL interpret the forward slash in the string as the start of an end tag, and fail because of this? All I hope to accomplish is to ensure this string can be removed programmatically.

 

 

 

 

 

3 REPLIES

Hello again. It's not 100% clear what you're trying to achieve. Are you looking to rename some elements across a whole bunch of files? If so, then the "Arbortext way" is not to use this sort of script. You would either use the oid_XXX functions to implement a recursive tree-walker algorithm or a simple XSLT to apply the markup transformation.

If you're just looking to do a string replace then this works to replace all ABC/XYZ with ZZZ: subs -a -c -ws "ABC/XYZ"ZZZ"

If you want to do some basic markup replacement then this works to rename <abbrev> tags to <acronym>, but again is not really the Arbortext way: subs -a -c -ws -m "<abbrev>(.*)</abbrev>"<acronym>\1</acronym>"

ED_9927023
5-Regular Member
(To:GarethOakes)

@GarethOakes You have been a lifesaver with all the help, thank you.

 

So, to conclude this particular issue: I noticed that instead of forward slashes you used double quotes to delimit the oldtext and newtext parameters of the ACL command.

 

The ah-ha moment came when I noticed that, because the execute function interprets the string passed into it, the forward slashes in the string I wanted to replace conflicted with the forward-slash delimiters of the ACL command syntax passed to execute.
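To make the delimiter conflict concrete, here is a minimal Python sketch of a toy parser. It is only a model: it assumes the first character of the argument is taken as the delimiter and that the first two delimited fields are read as oldtext and newtext. The function name `naive_parse` is hypothetical, and ACL's real parsing rules are not shown in this thread.

```python
from typing import Tuple


def naive_parse(arg: str) -> Tuple[str, str]:
    """Toy model of a delimiter-based substitute argument (not ACL's real parser).

    The first character is taken as the delimiter; the first two fields that
    follow are treated as oldtext and newtext.
    """
    delim = arg[0]
    fields = arg[1:].split(delim)
    old = fields[0]
    new = fields[1] if len(fields) > 1 else ""
    return old, new


# Slash delimiter: the slashes inside the target cut it short.
slash_form = '/string3//string4/string5/test'
# Quote delimiter: the slashes are now ordinary characters.
quote_form = '"string3//string4/string5"test"'

print(naive_parse(slash_form))
print(naive_parse(quote_form))
```

Under this model the slash form yields oldtext `string3` with an empty newtext, while the quote form keeps the whole `string3//string4/string5` as oldtext and `test` as newtext, which matches the behaviour described above.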

 

The string I wanted to delete is the entity declaration, up to the ending semicolon. (The format is exact, but for work reasons I have replaced the sensitive info.)

 

 

 

<!DOCTYPE pm [
    <!ENTITY % ISOEntities PUBLIC "aaaa//bbbb//cccc//dddd" "eeee//ffff/gggg/hhhh/iiii"> %ISOEntities;
]>

 

 

 

 

So it may very well be that I should be using the oid functions to iterate over the DOCTYPE declaration. Judging by the square brackets, it appears to hold a collection of data.

It's clear now that if there are multiple items within the DOCTYPE pm brackets and I simply substitute a line with an empty space, there could be a problem with the formatting of the XML in that particular collection. I'm not certain whether this will actually be an issue, but I will be experimenting with it next!

 

I just thought converting the file to text to remove this single string in all files would be the quickest way to add value to the operation. I'm only about a month into ACL, so still learning!

 

This is the resulting successful solution.

 

$stringToRemove_ISOEntities   = '<!ENTITY % ISOEntities PUBLIC " '
$stringToRemove_ISOEntities_2 = 'aaaa//bbbb//cccc//dddd'
$stringToRemove_ISOEntities_3 = '" "eeee//ffff/gggg/hhhh/iiii"'
$stringToRemove_ISOEntities_4 = '"> %ISOEntities;'

execute('substitute -a -c -ws -m "' . $stringToRemove_ISOEntities   . '"' . 'test1-"' )
execute('substitute -a -c -ws -m /' . $stringToRemove_ISOEntities_2 . '/' . 'test2-' )
execute('substitute -a -c -ws -m "' . $stringToRemove_ISOEntities_3 . '"' . 'test3-"')
execute('substitute -a -c -ws -m /' . $stringToRemove_ISOEntities_4 . '/' . 'test4-' )

 

 

output:

 <!DOCTYPE pm [test1-test2-test3-test4-
]>
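The manual splitting above (alternating between `/` and `"` as the delimiter so that no piece contains its own delimiter) can be generated mechanically. The Python sketch below is only an illustration of that idea. It assumes that `/` and `"` are both accepted as delimiters, as the commands in this thread suggest, and it copies the `-a -c -ws` flags and the trailing-delimiter form from the reply above. The helper names `split_for_delimiters` and `build_commands` are hypothetical, and the generated commands have not been run in Arbortext.

```python
from typing import List, Tuple


def split_for_delimiters(target: str, delims: str = '/"') -> List[Tuple[str, str]]:
    """Greedily cut target into (delimiter, chunk) pairs.

    Each chunk is the longest run from the current position that does not
    contain its paired delimiter.
    """
    pieces = []
    pos = 0
    while pos < len(target):
        best_delim, best_end = None, pos
        for d in delims:
            end = target.find(d, pos)
            if end == -1:
                end = len(target)
            if end > best_end:
                best_delim, best_end = d, end
        if best_delim is None:
            raise ValueError("no usable delimiter at position %d" % pos)
        pieces.append((best_delim, target[pos:best_end]))
        pos = best_end
    return pieces


def build_commands(target: str, replacement: str = "") -> List[str]:
    """Build one substitute command string per chunk (command form is an assumption)."""
    return [
        "substitute -a -c -ws {d}{c}{d}{r}{d}".format(d=d, c=c, r=replacement)
        for d, c in split_for_delimiters(target)
    ]


TARGET = ('<!ENTITY % ISOEntities PUBLIC "aaaa//bbbb//cccc//dddd" '
          '"eeee//ffff/gggg/hhhh/iiii"> %ISOEntities;')

for command in build_commands(TARGET):
    print(command)
```

Note that each generated command would apply to the whole file, so a short chunk could also match text outside the entity declaration; checking the chunks against a sample file first would be prudent.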

Ah OK the DOCTYPE is particularly awkward, as it's not a regular tag or PI. Entities in particular are a bit of a pain with XML because they are replaced during parsing so you don't really get access to them with most XML APIs. I think Arbortext has a way to deal with it programmatically but I've not tried. If your search and replace works across your data set then that is probably easiest! I guess the edit -untagged trick would be enough to get around most of the issues so you're probably pretty safe. Glad to have helped!
