Can you run spell check from the command line on a directory of XML files?
We, uh, well... the guy next to me just did a search and replace to take out all line breaks, but forgot to put a space in. XML Tidy has done a fabulous job of fixing all 300 files, BUT now we have a few spots per file where two words got run together where a line was broken, hence "when the temperatureis reached..." and such.
This is gonna be a pain. It wasn't backed up lately and I'm not sure what I'll do, but it would be a start if I could get the editor to parse each file and tell me which ones have spelling errors, and preferably open those.
Luckily these are the procedures chapters, so most of the sentences are very short.
John T. Jarrett CDT Sr. Tech Writer, Tech Pubs, ILS, Land & Armaments/Global Tactical Systems
Unfortunately, I'm not sure that what you're hoping for is possible. You could certainly write a script that would go through the files in a directory, open each one, and invoke the spell checker on it. But you would still need a user sitting there to examine each misspelling to determine a) whether it is a real misspelling or just a specialized jargon word that isn't in the dictionary, and b) if it is one of your omitted-space issues, where the space ought to go to correct the problem. As far as I know, the spell checker doesn't have an auto-correct feature.
A script like the one I described might make the process a little more efficient, but I think you're still in for a fair amount of manual labor on this one.
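For what it's worth, here is a rough sketch of that kind of script in Python, working outside the editor. It only reads the files and builds a report; nothing is changed. Everything in it is an assumption for illustration: the function names are made up, the word list is a plain one-word-per-line text file you would supply yourself, and the files are assumed to parse with Python's standard XML library (files that rely on DTD-declared entities may not). For each unknown word it also tries to guess whether it is two known words run together, which is the "temperatureis" case.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

WORD_RE = re.compile(r"[A-Za-z]+")


def load_words(path):
    """Load a one-word-per-line list into a lowercase set."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def suggest_split(word, known):
    """Return (left, right) if the word is two known words run together, else None.

    Only the first split found is returned, so a human still has to confirm it.
    """
    lower = word.lower()
    for i in range(1, len(lower)):
        left, right = lower[:i], lower[i:]
        if left in known and right in known:
            return left, right
    return None


def check_text(text, known):
    """Yield (word, split_suggestion) for each word not in the known set."""
    for word in WORD_RE.findall(text):
        if word.lower() not in known:
            yield word, suggest_split(word, known)


def check_directory(directory, known):
    """Map each XML file name to its unknown words. Files are read, never written."""
    report = {}
    for xml_file in sorted(Path(directory).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        text = " ".join(root.itertext())  # element text only, tags are skipped
        hits = list(check_text(text, known))
        if hits:
            report[xml_file.name] = hits
    return report


def format_report(report):
    """Turn the report into printable lines, one per unknown word."""
    lines = []
    for name, hits in report.items():
        for word, split in hits:
            hint = f" (maybe '{split[0]} {split[1]}')" if split else ""
            lines.append(f"{name}: {word}{hint}")
    return lines
```

You would call something like `format_report(check_directory("path/to/xml", load_words("words.txt")))` and print the lines. Jargon that isn't in the word list will still show up, so someone has to read the report, but at least it tells you which of the 300 files need a look.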
This is kind of a random idea, but you might try using Aspell, which is an open-source command-line spell checker. It claims to be able to check XML documents, but I’ve never tried it. You could probably script it with a batch script.
Assuming you’re on Windows, here’s a link (the port looks a bit outdated):
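If Aspell is installed and on your PATH, a script along these lines might do the batch run. This is an untested sketch, not something I have run against your files. It uses Aspell's `list` command, which reads text on standard input and prints the words it doesn't recognize without changing anything, together with `--mode=sgml` so markup is skipped. The function names, the `en` language default, and the assumption that the files are UTF-8 are mine.

```python
import shutil
import subprocess
from pathlib import Path


def aspell_command(lang="en"):
    """Build the argument list for a non-interactive, report-only Aspell run."""
    return ["aspell", "list", "--mode=sgml", f"--lang={lang}", "--encoding=utf-8"]


def misspelled_words(xml_path, lang="en"):
    """Return the sorted, de-duplicated words Aspell flags in one file."""
    text = Path(xml_path).read_text(encoding="utf-8")  # assumes UTF-8 files
    result = subprocess.run(
        aspell_command(lang),
        input=text,
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return sorted(set(result.stdout.split()))


def report_directory(directory, lang="en"):
    """Map each XML file name to its flagged words, skipping clean files."""
    if shutil.which("aspell") is None:
        raise RuntimeError("aspell was not found on PATH")
    report = {}
    for xml_file in sorted(Path(directory).glob("*.xml")):
        words = misspelled_words(xml_file, lang)
        if words:
            report[xml_file.name] = words
    return report
```

The keys of the returned dictionary are the files worth opening in the editor.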
I have asked PTC about the ability to run a non-interactive spell check from a Windows command line or a script. I have not heard back yet, and that was a few months ago. My criterion is that it runs and only gives me a report of misspelled words, NOT making any changes.
Gary Nadeau Tech Lead - Data Support Boeing Defense, Space & Security - St. Louis Phone: (314)233-5231 Email: firstname.lastname@example.org
The only thing that pops into mind is that if your lost CRs always fell after a predictable number of characters, say always 80, and nothing else has changed, you could devise a script or search routine to count 80 characters in a line and insert a space where the CR used to be.
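A minimal sketch of that counting idea in Python, assuming the breaks really did fall at a fixed width and nothing else in the text has shifted since. The function name and the default width of 80 are just for illustration. If the original lines wrapped at word boundaries rather than at a hard column, the positions won't line up and this won't help.

```python
def reinsert_breaks(text, width=80, filler=" "):
    """Insert `filler` after every `width` characters of `text`.

    Assumes every original line was exactly `width` characters long
    before the line breaks were stripped out.
    """
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    return filler.join(chunks)
```

For example, `reinsert_breaks("abcdefgh", width=4)` gives `"abcd efgh"`.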