PDF differences between 5.2M020 and 5.3M030

ebheadrick

As part of our upgrade to 5.3, I am trying to compare our old PDF files (5.2M020) with the new ones created with 5.3. I have checked most of the external variables, so I think everything is using the same environment, but I might be missing something. What I have run into is that the word spacing (full justification) is slightly different, so words are sometimes hyphenated in one version and not in the other. This causes the pages not to match when compared.

The other thing is that when you cut and paste text from sections of the PDF to a text editor you get different results.

5.2M020 for a minimum 28 days’

5.3M030 for a minimum28 days’

Sometimes the spaces are missing in the other direction. This throws off the application we use to compare pages for changes for our loose-leaf page count. From my understanding, this application uses the open source PDFBox code. I haven't dug into that aspect of the code.
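
For illustration, here is a minimal sketch of a comparison that tolerates the missing-space difference shown above. It assumes the text has already been extracted from both PDFs into plain strings; it does not use PDFBox, and the function names are hypothetical, not part of any Arbortext or PDFBox API.

```python
import re


def strip_whitespace(text: str) -> str:
    """Remove every whitespace character (regular and no-break spaces, tabs, newlines)."""
    return re.sub(r"\s+", "", text)


def same_ignoring_spaces(old_text: str, new_text: str) -> bool:
    """Return True when the two strings differ only in whitespace."""
    return strip_whitespace(old_text) == strip_whitespace(new_text)


# The two extracted strings quoted in this post.
old = "for a minimum 28 days’"   # pasted from the 5.2M020 PDF
new = "for a minimum28 days’"    # pasted from the 5.3M030 PDF

print(old == new)                      # False
print(same_ignoring_spaces(old, new))  # True
```

The trade-off is that a real edit which only adds or removes a space would also be ignored, so this is probably only suitable as a first pass.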

Thanks,
Ellen


Not that this will help you, or is even true, but I remember once upon a
time that Arbortext (when it was just Arbortext, not PTC) made a
statement that it did not guarantee page fidelity between versions.

I don't remember from whom this statement came, or where it might be
archived, if anywhere. However, I think it might still be true.


I was wrong about the dates on this, since this is actually from 2007,
but I looked it up on PTC's Arbortext Knowledge Base and found TPI
136881, which reads as follows:



Solution: 136881
Type: TPI
Created Date: 12-Apr-2007
Last Updated: 11-May-2007

Title: Page Fidelity of Composed Output Between Versions

Details

Description

As you upgrade, you may notice that composed output varies slightly and
does not exactly match the paged output from the prior version you are
using.

Resolution

Arbortext cannot promise page fidelity from release to release. New
releases include product fixes and refinements, including in the area of
composition. This may result in page infidelity between releases. The
composed output from new releases is more desirable than in previous
releases.




At least with the example posted, I don't think I would call this a
page fidelity problem. I'm assuming that Ellen checked the input and
the stylesheet and the missing space is an issue in the composition
stage. I would expect that a cut and paste from the PDF would at
least preserve the spaces between words, no matter how that space was
created. So, for instance, if it was set with a non-breaking, em, or en
space in the PDF, that should still paste as an ASCII space. At the
very least, this breaks basic functionality that an end user would
expect: the ability to select the text, cut it out, paste it into
another document, and have the text be correct.

Ellen, you mention the PDFBox tool. What does its text extraction come
up with? Is it also missing the space?

I would consider the missing space a bug.

..dan

Well, her original message consisted of two problems. The first one,
where she was trying to "...compare our old PDF files (5.2M020) to the
new ones created with 5.3." would be a page fidelity issue. The second
one, or "the other thing" as it was described, would fall into the area
you were writing about.
The page fidelity issues to which I and Arbortext refer occur between
versions when there are no changes to the source file or any
stylesheets, FOSIs, XSL or anything else. Only the version of Arbortext
Editor/PE is different.

agreed

Well, my guess was that the pages would agree if the spaces were
consistent. I need to figure out how much of an impact there will be
when, or if, we change to the newer version. I am now also wondering
whether this is an Arbortext problem or an Adobe Distiller/PDF problem,
because the missing spaces have been showing up all along with the
application that uses PDFBox.
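
One rough way to gauge the impact would be to sort page pairs into "identical", "spacing-only", and "content" differences. The sketch below assumes the per-page text from both versions is already available as pairs of strings; the names and labels are made up for illustration and are not taken from the comparison application.

```python
import re
from collections import Counter


def classify_page(old_text: str, new_text: str) -> str:
    """Label one page pair by the kind of difference found."""
    if old_text == new_text:
        return "identical"
    # Compare again with all whitespace removed.
    if re.sub(r"\s+", "", old_text) == re.sub(r"\s+", "", new_text):
        return "spacing-only"
    return "content"


def summarize(page_pairs) -> Counter:
    """Count how many page pairs fall under each label."""
    return Counter(classify_page(old, new) for old, new in page_pairs)


# Hypothetical sample: (text from old PDF, text from new PDF) per page.
pages = [
    ("for a minimum 28 days", "for a minimum 28 days"),
    ("for a minimum 28 days", "for a minimum28 days"),
    ("for a minimum 28 days", "for a minimum 30 days"),
]
print(summarize(pages))
```

Note that a word hyphenated at a line end in only one version would still be counted as a "content" difference here, since the hyphen is not whitespace.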

Ellen

"Dan Vint" <dvint@dvint.com> wrote on 03/20/2009 11:26:12 AM:

> agreed