On 2/19/2010 4:02:30 PM, pbaxter wrote:
== Just realized that the xmllines function results in a double-spaced text file. Maybe in post-processing there's a way to simply ignore every other line, since I don't want to delete ALL the blank lines.
I don't think all of it is double spaced.
You could certainly write a simple algorithm to scan for and remove triple lines. Alternatively, you could manually add a placeholder character to the 'empty' lines you want to keep, delete the remaining empty lines, and then (automatically) replace the placeholder characters with "".
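As a rough illustration of the scanning idea, here is a minimal Python sketch. It assumes the double spacing comes from one extra blank line after every line, so a blank line in the original shows up as a run of three blanks in the output. Under that assumption, halving each run of blank lines (rounding down) recovers the original spacing. The function name undo_double_spacing is made up for this example, and if only part of the file is double spaced the rule will need adjusting.

```python
from itertools import groupby


def undo_double_spacing(lines):
    """Halve every run of blank lines: a run of n blanks becomes n // 2.

    Assumption: each original line was followed by one extra blank line,
    so a single blank separator becomes nothing and a run of three blanks
    becomes one intentional blank line.
    """
    result = []
    # groupby splits the list into alternating runs of blank / non-blank lines.
    for is_blank, run in groupby(lines, key=lambda s: s.strip() == ""):
        run = list(run)
        if is_blank:
            result.extend([""] * (len(run) // 2))
        else:
            result.extend(run)
    return result
```

For example, undo_double_spacing(["A", "", "", "", "B", ""]) gives ["A", "", "B"]: the intentional blank line between A and B survives, and the padding blanks are dropped.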
== I'm gathering that there isn't a simple way (via script or program) to open the XML - rendered as text - then copy the text and write to another file? Currently we're doing this manually, but would MUCH prefer to find an automated method.
I'm not quite sure exactly what you want to do, but the method presented is sufficient for what you describe. Once you've got an array of the 'text' elements of the XML file (which your loop should produce; see attached if it doesn't), you can create a simple program to output it to a file.
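Outside of the worksheet, the same two steps (collect the text elements, then write them out) could be sketched in Python along these lines. This is only an illustration under assumptions: the tag name "text" is a guess at how the elements are named in your file, the function names extract_text and write_lines are invented for the example, and a file that uses XML namespaces would need the namespace included in the tag.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_source, tag="text"):
    """Return the text of every <tag> element, in document order.

    Assumption: the elements of interest are literally named `tag`
    and carry no XML namespace prefix.
    """
    root = ET.fromstring(xml_source)
    # Empty elements have text None; treat them as empty lines.
    return [elem.text or "" for elem in root.iter(tag)]


def write_lines(lines, out_path):
    """Write one array entry per line to a plain-text file."""
    with open(out_path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
```

A typical use would be to read the XML file into a string, call extract_text on it, optionally clean up the blank lines, and then pass the result to write_lines with the name of the output file.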
Do you want to retain the page numbers and headings? It looks like the structure is the same for all pages except the first, so a general header-removal function should also be straightforward.
Stuart