automate export of text from folder of Illustrator graphics?
‎Feb 28, 2014
03:36 PM
Hiya,
We have a document we are contemplating migrating from FrameMaker to XML.
It contains several hundred mainframe screens that are currently authored
in Adobe Illustrator ... but are essentially just text. I am wondering if
there is a way to automatically export the text from the Illustrator files
into simple text files (after which I could convert/import them into our
XML where a warm, happy content model is just waiting to format them all
pretty-like).
So for example, the process would take a folder full of:
abcd1.ai
abcd2.ai
...
And output:
abcd1.txt
abcd2.txt
...
Also happy to hear about other (semi-)automated paths anyone else might
have taken for such an endeavor.
--
Paul Nagai
10 REPLIES
‎Feb 28, 2014
03:50 PM
‎Feb 28, 2014
04:08 PM
‎Feb 28, 2014
04:12 PM
IIRC, AI files are really just PostScript with some extra info embedded in comment fields, or at least they were way back when. Is that still true? If so, you might be able to use Ghostscript to extract the text bits.
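For anyone who wants to try the Ghostscript route, here is a minimal sketch of what a call might look like from Python. It assumes Ghostscript is installed and on the PATH as `gs` (on Windows the executable is usually named differently, so treat the default as a placeholder), and it uses Ghostscript's `txtwrite` output device. The function names `gs_text_command` and `extract_text` are made up for illustration, and how well the output matches the original screen layout is untested here.

```python
import shutil
import subprocess
from pathlib import Path


def gs_text_command(src: Path, dest: Path, gs_exe: str = "gs") -> list[str]:
    """Build a Ghostscript command line that writes the text of src to dest."""
    return [
        gs_exe,
        "-q",                    # suppress the startup banner
        "-dNOPAUSE",             # do not pause between pages
        "-dBATCH",               # exit when the input has been processed
        "-sDEVICE=txtwrite",     # Ghostscript's text-extraction device
        f"-sOutputFile={dest}",
        str(src),
    ]


def extract_text(src: Path, dest: Path, gs_exe: str = "gs") -> None:
    """Run Ghostscript on one file; raises if gs is missing or fails."""
    if shutil.which(gs_exe) is None:
        raise RuntimeError(f"Ghostscript executable {gs_exe!r} not found on PATH")
    subprocess.run(gs_text_command(src, dest, gs_exe), check=True)
```

Calling `extract_text(Path("abcd1.ai"), Path("abcd1.txt"))` would attempt the conversion for a single file; looping over a folder is a small step from there.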
‎Feb 28, 2014
05:58 PM
Thanks for the tips and pointers and links so far, everyone. The
conversation is advancing.
As always, balancing the acquisition costs (I don't currently have AI) plus
the development time vs. offshore (assuming licenses already exist there)
time to just cut and paste is the trick ...
Clay: Hmmm. I opened an .ai file with a text editor. While I can read some
portions, none of the readable bits contain the text of the screen, at
least not in plain text. It could be hashed or mashed or otherwise bashed.
(That might not have been quite what you meant, but I know I vaguely
remember being able to open many PostScript files and read some of the text
portions.)
Ghostscript might be smarter about translating the text into human readable
text.
‎Feb 28, 2014
07:11 PM
Hi Paul--
A quick check to see if it's PostScript is to look at the first line. If it starts with "%!PS-Adobe-{version number}", then it's PostScript, and hopefully Ghostscript would be able to do something sensible with it.
PostScript files, especially ones from design programs like AI, usually won't have the text contents in human-readable form, because text may be stored with a custom (sometimes binary) encoding, and even if it is there in ASCII format, it may be broken down by individual characters so that each glyph can be placed precisely on the page. But Ghostscript is usually pretty good about sorting all that out and giving you back usable text, as long as the text layout in the original document isn't too wacky.
--Clay
‎Feb 28, 2014
07:41 PM
Hilarious. First line: %PDF-1.4
I can open it w/Acrobat and copy the text. LOL! Doesn't get me automation
(at least not directly) but it sure gets around the licensing issues
associated with Illustrator.
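With several hundred files, it may be worth scripting the first-line check rather than opening each file by hand. The sketch below looks only at the leading bytes of a file and reports whether they match the `%!PS-Adobe-` or `%PDF-` signatures mentioned in this thread. The function name `sniff_format` is hypothetical, and the check says nothing about whether the rest of the file is well formed.

```python
from pathlib import Path


def sniff_format(path: Path) -> str:
    """Return 'postscript', 'pdf', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(16)  # both signatures fit well within 16 bytes
    if head.startswith(b"%!PS-Adobe-"):
        return "postscript"
    if head.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"
```

Running this over the whole folder first would show whether every .ai file is PDF-based or whether a few older PostScript-style files are mixed in and need a different tool.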
‎Feb 28, 2014
10:21 PM
‎Mar 01, 2014
04:52 PM
Cool. Thanks for the pointer. The PDF-guy is looking at it, too. I bet, if
nothing else comes up, pdftotext will work nicely. The AI/PDFs are super
simple and pretty much only contain the text and a box around it.
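A rough sketch of the folder-in, folder-out process from the original question, using pdftotext, might look like the following. It assumes the `pdftotext` command-line tool is installed and on the PATH, and that it will read the PDF-based .ai files despite their extension (not verified here). The helper names `plan_conversions` and `convert_folder` are invented for this example.

```python
import subprocess
from pathlib import Path


def plan_conversions(folder: Path) -> list[tuple[Path, Path]]:
    """Pair each .ai file in folder with a .txt path that has the same stem."""
    return [(src, src.with_suffix(".txt")) for src in sorted(folder.glob("*.ai"))]


def convert_folder(folder: Path, pdftotext_exe: str = "pdftotext") -> None:
    """Run pdftotext once per .ai file, writing abcd1.txt next to abcd1.ai."""
    for src, dest in plan_conversions(folder):
        # -layout asks pdftotext to keep the physical layout of the page,
        # which may help preserve the column alignment of mainframe screens.
        subprocess.run([pdftotext_exe, "-layout", str(src), str(dest)], check=True)
```

Splitting the planning from the running makes it easy to print the list of input/output pairs for review before converting anything.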
‎Mar 03, 2014
06:33 AM
Hi,
You might try saving your files from AI as SVG, and then importing
the SVG into IsoDraw. We've been bringing various drawing formats into
IsoDraw and have found that SVG gives better fidelity and editable text.
For example: we've gone from other vector drawing formats to PDF, opened
that in AI (CS4), saved it as SVG, and then imported that. It's a bit of
a workaround, but there have been fewer glitches than with directly
importing some drawing formats or completely incompatible drawing
formats. In your case, I'd save your AI file as SVG and see what happens
when you open the SVG file in IsoDraw; you may find you get better
fidelity. It's worth a try.
Greg
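If the files do end up as SVG, the text could also be pulled out without a drawing tool, since SVG is XML. The sketch below is only an illustration and assumes the text is stored in standard `<text>` elements (possibly with nested `<tspan>` elements) in the SVG namespace; whether Illustrator's SVG export keeps each screen line together in one element is not something this thread confirms. The function name `svg_text_lines` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard SVG namespace, in the {uri}tag form ElementTree uses.
SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_text_lines(svg_source: str) -> list[str]:
    """Return the non-empty text content of each <text> element, in document order."""
    root = ET.fromstring(svg_source)
    lines = []
    for node in root.iter(f"{SVG_NS}text"):
        # itertext() also collects text inside nested <tspan> elements.
        text = "".join(node.itertext()).strip()
        if text:
            lines.append(text)
    return lines
```

Document order is not guaranteed to match the visual top-to-bottom order of the screen, so the output might need sorting by the elements' position attributes.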
‎Mar 03, 2014
11:01 AM
That sounds pretty doable. I'm using pdftotext right now for a data conversion project and it works really well for the documents we're converting.
--Clay