1-Visitor
February 28, 2014
Question

automate export of text from folder of Illustrator graphics?

Hiya,

We have a document we are contemplating migrating from FrameMaker to XML.
It contains several hundred mainframe screens that are currently authored
in Adobe Illustrator ... but are essentially just text. I am wondering if
there is a way to automatically export the text from the Illustrator files
into simple text files (after which I could convert/import them into our
XML where a warm, happy content model is just waiting to format them all
pretty-like).

So for example, the process would take a folder full of:
abcd1.ai
abcd2.ai
...

And output:
abcd1.txt
abcd2.txt
...

Also happy to hear about other (semi-)automated paths anyone else might
have taken for such an endeavor.

--
Paul Nagai

    10 replies

    1-Visitor
    February 28, 2014
    Paul:

    Doesn't AI have some automation built into it which can output text from an ai file?

    John Sillari
    Chief Technologist
    Dayton T. Brown, Inc.
    12-Amethyst
    February 28, 2014
    Just did a quick Google search and came across the following. It talks about translation, but it's basically the same idea. Looks like data sets might work, but they might also require modification to the source file.
    18-Opal
    February 28, 2014
    IIRC, AI files are really just PostScript with some extra info embedded in comment fields, or at least they were way back when. Is that still true? If so, you might be able to use Ghostscript to extract the text bits.
    naglists (Author)
    1-Visitor
    February 28, 2014
    Thanks for the tips and pointers and links so far, everyone. The
    conversation is advancing.

    As always, balancing the acquisition costs (I don't currently have AI) plus
    the development time vs. offshore (assuming licenses already exist there)
    time to just cut and paste is the trick ...

    Clay: Hmmm. I opened an .ai file with a text editor. While I can read some
    portions, none of the readable bits contain the text of the screen, at
    least not in plain text. It could be hashed or mashed or otherwise bashed.
    (That might not have been quite what you meant, but I know I vaguely
    remember being able to open many PostScript files and read some of the text
    portions.)

    Ghostscript might be smarter about translating the text into human readable
    text.


    18-Opal
    March 1, 2014
    Hi Paul--

    A quick check to see if it's PostScript is to look at the first line. If it starts with "%!PS-Adobe-{version number}", then it's PostScript, and hopefully Ghostscript would be able to do something sensible with it.
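A minimal sketch of that first-line check, run over a whole folder rather than one file at a time. The helper names `sniff_format` and `sniff_folder` are hypothetical, and the `%PDF-` branch is included only for comparison, since PDF files carry their own header signature.

```python
from pathlib import Path


def sniff_format(path):
    """Guess a file's format from its leading bytes (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    if head.startswith(b"%!PS-Adobe-"):
        return "postscript"
    if head.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"


def sniff_folder(folder):
    """Map each .ai file name in `folder` to its guessed format."""
    return {p.name: sniff_format(p) for p in sorted(Path(folder).glob("*.ai"))}
```

Calling `sniff_folder("screens")` on a folder of graphics (the folder name is made up) would show at a glance whether the files are all the same flavor before picking a tool.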

    PostScript files, especially ones from design programs like AI, usually won't have the text contents in human-readable form, because text may be stored with a custom (sometimes binary) encoding, and even if it is there in ASCII format, it may be broken down by individual characters so that each glyph can be placed precisely on the page. But Ghostscript is usually pretty good about sorting all that out and giving you back usable text, as long as the text layout in the original document isn't too wacky.
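One way to ask Ghostscript for the text is its `txtwrite` output device. The sketch below only assembles and runs that command line. It assumes a Ghostscript build that includes `txtwrite` and an executable named `gs` (on Windows it is typically `gswin64c`); the function names are hypothetical, and how well the output reads depends on the file.

```python
import subprocess


def ghostscript_text_command(source, target, gs_binary="gs"):
    """Build a Ghostscript command that writes the text of `source` to `target`."""
    return [
        gs_binary,
        "-q",                     # suppress startup messages
        "-dNOPAUSE",              # do not pause between pages
        "-dBATCH",                # exit when the input is finished
        "-sDEVICE=txtwrite",      # text-extraction output device
        f"-sOutputFile={target}",
        str(source),
    ]


def extract_text_with_ghostscript(source, target, gs_binary="gs"):
    """Run the command above; raises CalledProcessError if Ghostscript fails."""
    subprocess.run(ghostscript_text_command(source, target, gs_binary), check=True)
```

Keeping the command builder separate from the call makes it easy to print the command and try it by hand on one file first.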

    --Clay
    naglists (Author)
    1-Visitor
    March 1, 2014
    Hilarious. First line: %PDF-1.4

    I can open it w/Acrobat and copy the text. LOL! Doesn't get me automation
    (at least not directly) but it sure gets around the licensing issues
    associated with Illustrator.




    18-Opal
    March 1, 2014
    naglists (Author)
    1-Visitor
    March 1, 2014
    Cool. Thanks for the pointer. The PDF-guy is looking at it, too. I bet, if
    nothing else comes up, pdftotext will work nicely. The AI/PDFs are super
    simple and pretty much only contain the text and a box around it.


    1-Visitor
    March 3, 2014
    Hi,



    You might try saving your file from AI as SVG and then importing the
    SVG into IsoDraw. We've been bringing various drawing formats into
    IsoDraw and have found that SVG gives better fidelity and editable text.
    For example, we've converted other vector drawing formats to PDF, opened
    that in AI (CS4), saved it as SVG, and then imported that. It's a bit of
    a workaround, but there have been fewer glitches than when directly
    importing some drawing formats or dealing with completely incompatible
    ones. In your case I'd save your AI file as SVG and see what happens when
    you open the SVG file in IsoDraw; you may find you get better fidelity.
    It's worth a try.



    Greg


    18-Opal
    March 3, 2014
    That sounds pretty doable. I'm using pdftotext right now for a data conversion project and it works really well for the documents we're converting.

    --Clay
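
Pulling the thread together: since the .ai files here start with `%PDF-1.4`, a batch run of pdftotext could produce the abcd1.txt, abcd2.txt, ... output asked for in the original question. This is a minimal sketch, not a tested pipeline. It assumes `pdftotext` (from Xpdf or Poppler) is on the PATH and that it accepts the .ai files as PDF input, which was not verified in the thread; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def plan_conversions(folder):
    """Pair every .ai file in `folder` with a .txt path of the same base name."""
    folder = Path(folder)
    return [(src, src.with_suffix(".txt")) for src in sorted(folder.glob("*.ai"))]


def convert_folder(folder, pdftotext="pdftotext"):
    """Run pdftotext on each pair; -layout tries to keep the physical layout."""
    for src, dst in plan_conversions(folder):
        subprocess.run([pdftotext, "-layout", str(src), str(dst)], check=True)
```

The `-layout` option seems a reasonable default for mainframe screens, where column alignment matters; dropping it gives plain reading-order text instead.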