Question about PDF search indexes

Question

Adepters:

Some of our deliverables are in the form of PDF collections (dozens of PDFs that can be launched from a Web page). We generate these using PE 5.3, direct-to-PDF. The group that maintains the PDF collection has a manual process for generating a PDF search index of the entire collection using the standard Acrobat tool. This, of course, is a manual and somewhat arduous task. So, I'm being asked if there's any way to automate this process (building the Acrobat search index).

I'm pretty sure PE can't help - at least I couldn't find any mention of that in the PE Programmer's Guide, so I'm wondering if any of you have created these indexes and perhaps found a better method (automating the Acrobat tool, third party software, etc.)?

Thanks in advance,
Dave

ClayHelberg · Answer

Hi David--

DMP might be of use here. If you build a DMP project that includes all
your PDFs, it will build a full-text search index as part of the DMC
package it generates. You can search within the DMC viewer, of course (or
in the web app if you produce a webapp.jar file). But the index it
generates is a standard Lucene index, so you may be able to extract the
index part of the DMC build and integrate it with your own search
interface, assuming that interface also uses a Lucene-compatible index.

The PDFs you index this way do not need to be published using PE or
Arbortext (though of course they can be).
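
On extracting the index part of the DMC build: a rough sketch of how one might locate it, assuming the DMC package has already been unpacked into an ordinary directory. Where the index actually sits inside the package is not stated above, so this simply looks for the "segments_N" commit file that Lucene writes into every index directory. The function name find_lucene_index_dirs is made up for illustration.

```python
import os
import re

# Lucene writes a commit file named "segments_N" into each index directory;
# older Lucene versions also write "segments.gen".
SEGMENTS_FILE = re.compile(r"^segments(_[0-9a-z]+|\.gen)$")


def find_lucene_index_dirs(root):
    """Return sorted paths of directories under root that look like Lucene indexes.

    root is assumed to be an unpacked DMC build; its internal layout is unknown
    here, so every subdirectory is checked.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(SEGMENTS_FILE.match(name) for name in filenames):
            matches.append(dirpath)
    return sorted(matches)
```

Calling find_lucene_index_dirs on the unpacked build folder returns the candidate directories to point your own search interface at. Note that the Lucene version in that interface has to be able to read the index format written by the DMP build.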

--Clay

Clay Helberg

Senior Consultant

TerraXML
