PE/Local composer performance with large PDFs

JasonBuss
7-Bedrock

Hello all,

I'm using Editor and PE 5.4 M100, and I'm having difficulty publishing a large doc to PDF.

Ok, so the document itself is XML, ATA-style markup, and uses FOSI. It's roughly 2.2MB in size, contains references to about 200 bi-level TIFF graphic ents, and also contains several hundred instances of ID/IDREF pairs.

By turning off graphic display and gentext, I was able to get the document to load pretty quickly into the Editor (about 15 seconds). No completeness errors or anything.

Whether I send this to PE or use the local Print Composer (with Distiller for PDF), I cannot get this document to render. I had my maxBusyInterval set to 30 minutes, and it considered the subprocess for this composition hung and restarted it. Other than that, I can get no indication of errors or issues that I can address. My PC for local composition runs XP, with a dual-core Intel CPU and 3 GB of RAM.

The development server I'm using for PE has 4 quad-core CPUs and 48 GB of RAM.

Anyone have any ideas? I tried searching the adepters a few times, but for some reason the filtering isn't working on the site (selecting just Adepters for searching pulls in results for Windchill, ProE, etc.).

Thanks,

-Jason

Hi there name-bruthah,



Divide and conquer.



If you can, try the first half of the doc separately and see if you get
the error. If yes, divide again until you can isolate the specific
graphic or piece of data triggering the problem.
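
The halving approach above can be sketched as a small bisection loop. This is only an illustration under stated assumptions: `publishes_ok` is a hypothetical stand-in for "publish this subset and see whether it completes", and the sketch assumes a single bad chunk that fails whenever it is included.

```python
from typing import Callable, Sequence


def find_failing_chunk(chunks: Sequence[str],
                       publishes_ok: Callable[[Sequence[str]], bool]) -> int:
    """Return the index of the single chunk that makes publishing fail.

    Assumes exactly one chunk is bad and that the failure shows up whenever
    that chunk is part of the subset being published.
    """
    lo, hi = 0, len(chunks)          # the bad chunk is somewhere in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if publishes_ok(chunks[lo:mid]):
            lo = mid                 # first half is clean, look in the second
        else:
            hi = mid                 # failure reproduced in the first half
    return lo


# Stand-in for a real publish attempt: pretend chapter 5 holds the bad graphic.
chapters = [f"chapter-{n}" for n in range(8)]
bad = find_failing_chunk(chapters, lambda part: "chapter-5" not in part)
print(chapters[bad])
```

With 200 graphics, this kind of search needs about eight publish attempts instead of 200, provided each attempt fails or succeeds reasonably quickly.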



Regards,

Jason







1380 Forest Park Circle, Suite 100

Lafayette, CO 80027

Hi Jason--



Here are some troubleshooting ideas off the top of my head:



When you try to publish locally, what happens when it fails? Does Editor
crash? Does it give an error in the log window? Do you end up killing
the process after some long length of time assuming it's hung?



Have you tried setting the maxbusyinterval higher on PE and republishing
there? I've seen cases of large documents that require longer than 30
mins to publish, but once you bump up the limit to avoid timeouts they
eventually get there.



Have you tried removing/commenting different portions of the document to
see if there's something specific in the content that is causing the
trouble, e.g. a corrupted image file or an unusual XML structure that
triggers something erroneous in the stylesheet?



--Clay



Clay Helberg

Senior Consultant

TerraXML


Jason:

Can you try to publish directly from the server using PE Interactive? If so,
monitor the process activity via Task Manager to see which process is hanging
(editor.exe, pubtex.exe, pubview.exe, et al.). If you can, turn on full logging
in PE, run the job, and inspect the PE logs in the Apache Tomcat directory to
see how the job was handled. You may find some clues buried in the intermediate
files which PE normally deletes once the job completes. Examine the javavmmemory
setting and bump it up if it's low. We set the heap size to 900K before java
started crapping out; you may have more luck with your larger physical memory
pool. If you have an Acrobat distiller available on your server, try Print
Composed... -> PDF from PE Interactive (make sure Adobe PDF is set as your
default printer first!). We're running XML files in the 60 - 80 MB range (plus
loads of CGM graphics) without issues on our PE 5.4 M140 Windows 2008 R2 server.
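
If watching Task Manager by hand gets tedious, one option is to sample `tasklist /FO CSV /NH` periodically and pick out the processes named above. The sketch below only parses that output; the sample text is made up, and the set of watched names comes from this post rather than from any PTC documentation.

```python
import csv
import io

# Process names taken from the post above; extend the set as needed.
WATCHED = {"editor.exe", "pubtex.exe", "pubview.exe"}


def watched_processes(tasklist_csv: str) -> list[tuple[str, int, str]]:
    """Pick the watched processes out of `tasklist /FO CSV /NH` output.

    Returns (image name, PID, memory usage string) tuples.
    """
    rows = csv.reader(io.StringIO(tasklist_csv))
    return [(row[0], int(row[1]), row[4])
            for row in rows
            if len(row) >= 5 and row[0].lower() in WATCHED]


# Made-up sample in the column layout tasklist prints:
# "Image Name","PID","Session Name","Session#","Mem Usage"
sample = (
    '"System Idle Process","0","Services","0","24 K"\n'
    '"editor.exe","4312","Console","1","412,880 K"\n'
    '"pubtex.exe","5120","Console","1","1,903,444 K"\n'
)
for name, pid, mem in watched_processes(sample):
    print(name, pid, mem)
```

On the server itself, the input could come from `subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout`, sampled every minute or so. A process whose memory figure stops changing for a long stretch is a reasonable first suspect.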

John


Jason,


It sounds like your subprocesses are simply being shut down when the time limit expires. I say this since I've got one document that takes about 32 HOURS to publish. It's an index that uses pseudo tables that are built on the fly.


Anyway, what I would do is turn up the timeout in PE's configuration to a couple of hours. If you still get the indication that your subprocess just died, double the time until you get a finished document.
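
The doubling strategy can be written down as a short loop. This is a sketch, not a PE feature: `job_finishes_within` is a hypothetical hook standing in for "set the timeout to this many minutes, republish, and report whether the job completed", and the four-day cap is an arbitrary safety limit.

```python
from typing import Callable, Optional


def find_sufficient_timeout(job_finishes_within: Callable[[int], bool],
                            start_minutes: int = 120,
                            cap_minutes: int = 4 * 24 * 60) -> Optional[int]:
    """Double the timeout until the job finishes or the cap is exceeded.

    Returns the first timeout (in minutes) that worked, or None if even
    the largest value tried was not enough.
    """
    timeout = start_minutes
    while timeout <= cap_minutes:
        if job_finishes_within(timeout):
            return timeout
        timeout *= 2
    return None


# Stand-in job that needs 32 hours, like the index mentioned above.
needed = 32 * 60
print(find_sufficient_timeout(lambda minutes: minutes >= needed))
```

Doubling keeps the number of failed attempts small: starting from two hours, a 32-hour job is reached on the fifth try. If the loop hits the cap without success, the job is more likely hung than merely slow.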


As for the server and your CPU/RAM status, if it's not running on a 64-bit server, the extra RAM is not really being used. If it is a 64-bit server, you should be good there.


At any rate, give the configuration boost a try and see what happens.


Hope this Helps,


Bob

We have seen examples of this sort of really slow publishing too, particularly when Styler stylesheets are involved (as opposed to native FOSI or native APP). As an additional anecdote, one job we were testing recently peaked at almost 14GB of RAM usage. That was a Styler stylesheet.

-G

We've had TIFFs fail PE runs before. Give it a shot without graphics.

After that (and trying other suggestions) it's time to start halving the
document and seeing what happens.

Hi Bob--



Wow! 32 hours? I think that's a record of some kind.



--C





Clay Helberg

Senior Consultant

TerraXML


Another thing that has worked for me using Print Composer in the past is to run Print Preview. This is much, much faster than making PDF because I think it is the creating of PostScript for sending to Distiller that takes up the most time in Print Composer. When print/PDF jobs fail, often Print Preview will produce a DVI file that is viewable right up to the page where the problem occurs.
Another thing to try is, if you use the filename method for referencing illustrations and use an APTGRPATH environment variable, just set this variable to some local path where there are no graphics. This will speed up your processing, since no graphics will be found.

Hi,


The following is information I used to provide to my military customers with really big documents. Also, per the PE documentation, set maxBusyInterval to 0 if you don't want any timeouts or, since you seem to have a general idea of how long the job takes, set it to something like 48 hours. Hope this helps!


Thanks


Susan Fort


Susan Fort
Product Manager, SLM Segment
T 937.743.9091 F 781.707.0602



Variable settings that can reduce the time to open very large documents in Arbortext Editor:



set gentext=off
set gentextautoupdate=none
set gentextwarnings=3
set tabletags=on
set tabletagdisplay=none
set graphicdisplay=off
set tagdisplay=off
set equationdisplay=off
set bitmapdisplay=off
set fosiwarning=on
set inlineediting=off
set showentities=none



====================


Passing data to PE in chunks


Set the environment variable APTPEHTTPMODE to a value of CSM.


With CSM the chunks are much smaller, and the chunk size can be set with APTPECHUNKSIZE.


FLSM is also an option, but its fixed length is 4 GB.
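
One way to try the CSM setting without touching the system-wide environment is to build a modified environment for a single child process. The sketch below is hedged accordingly: the two variable names come from this post, but the helper `pe_chunked_env` is made up for illustration, and the thread does not state what units or range APTPECHUNKSIZE accepts, so no value is supplied by default.

```python
import os
from typing import Optional


def pe_chunked_env(chunk_size: Optional[str] = None) -> dict[str, str]:
    """Copy the current environment and switch PE HTTP transfer to CSM."""
    env = dict(os.environ)
    env["APTPEHTTPMODE"] = "CSM"
    if chunk_size is not None:
        # Units and valid range are not given in this thread; check the
        # PE documentation before choosing a value.
        env["APTPECHUNKSIZE"] = chunk_size
    return env


env = pe_chunked_env()
print(env["APTPEHTTPMODE"])
```

The resulting dictionary can be passed as `env=` to `subprocess.Popen` when launching the Editor executable from a script; the executable path is site-specific and is left out here.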



Here are the methods for increasing available virtual memory:


______________________


Setting bigjobthreshold:



1. See this section in the help for details of the "bigjobthreshold" option: http://www.ptc.com/ae53M30_hc/index.jspx?id=ID552467643&action=show



At the Arbortext Editor command line, enter the command "eval doc_estimate_dfs(); "



Then take the number shown in the panel that pops up and use it as the value of the "bigjobthreshold" set option, i.e., set bigjobthreshold=n, where n is the number that appeared in the panel after the eval command.



For example, after entering the command



eval doc_estimate_dfs();



a panel popped up with the number 352 in it. Now set "bigjobthreshold" option accordingly:



set bigjobthreshold=352



Set this on both the Arbortext Editor / Styler client machine and the Arbortext Publishing Engine server (in an .acl file placed in CUSTOMPATH/init).



_____________________


Setting javavmmemory:



On both the Arbortext Publishing Engine server and the client machine place the following lines in an ACL file in the ...\custom\init directory of the install tree,



if (!java_init(0)) {
    set javavmmemory=764
    set javavmargs="-Xss2m"
}



______________________


Setting 3GB parameter:



When composing very large documents to PDF on Windows, the composition process may fail with an "Out of virtual memory space" error message. This situation can occur when the composition process encounters a Windows 2GB memory addressing limitation. After installing this release, enable this fix by performing the following steps.

CAUTION: Ensure that you make the following change exactly as described. Incorrectly modifying system files can leave your workstation in an unstable state.

The following steps update the contents of the file c:\boot.ini using the instructions provided on Microsoft’s web site at:

http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx

Review the contents of that web page before making these changes.


1. Stop Arbortext Editor and Arbortext Publishing Engine.

2. From Windows Explorer, locate the file c:\boot.ini. If you do not see the file boot.ini in c:\, ensure that system and hidden files are visible using the following steps:

   a. With Windows Explorer open, choose Tools->Folder Options.
   b. Select the View tab.
   c. In Advanced Settings, select Show hidden files and folders.
   d. In Advanced Settings, remove the check mark from Hide protected operating system files (Recommended). Answer Yes when you are prompted whether you are sure you want to display these files.
   e. Select OK to close the Folder Options dialog box. boot.ini should be visible after the contents of c:\ are refreshed.

3. By default, boot.ini is set to be read-only. To allow changes to the file, right-click on the file name and choose Properties. Uncheck Read-only and select OK.


4. Open the file c:\boot.ini in an ASCII editor such as Notepad and add the /3GB parameter as described on Microsoft’s web site. The contents of your updated boot.ini file will be similar to the following example:

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(2)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(2)\WINNT="????" /3GB

"????" in the example can be the programmatic name of any of the following operating system versions:

Windows XP Professional
Windows Server 2003
Windows Server 2003, Enterprise Edition
Windows Server 2003, Datacenter Edition
Windows 2000 Advanced Server
Windows 2000 Datacenter Server
Windows NT Server 4.0, Enterprise Edition



If your boot.ini file lists more than one operating system, ensure that you append the /3GB parameter to the operating system you use when running Arbortext Editor or the Arbortext Publishing Engine.

5. Save boot.ini.

6. Reset boot.ini to be read-only.

7. Optionally reset the states of Show hidden files and folders and Hide protected operating system files (Recommended) in the Folder Options dialog box.

8. Restart your workstation or server.

9. Restart Arbortext Editor and the Arbortext Publishing Engine.
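
For context on what the /3GB switch buys, here is the address-space arithmetic as a small worked example. The figures are the standard ones for 32-bit Windows (a 4 GiB virtual address space per process, split 2 GiB user / 2 GiB kernel by default, and 3 GiB / 1 GiB with /3GB for executables built to use the larger range); nothing here is specific to Arbortext.

```python
GIB = 2 ** 30

address_space = 2 ** 32            # bytes addressable with 32-bit pointers
default_user = 2 * GIB             # default user-mode share per process
with_3gb_switch = 3 * GIB          # user-mode share when booted with /3GB

print(address_space // GIB)                     # total address space, in GiB
print((with_3gb_switch - default_user) // GIB)  # extra GiB gained by /3GB
```

The gain is real but modest. A job that needs far more than 3 GiB in a single process will still fail on a 32-bit system, no matter how much physical RAM the server has.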

Clay,


The book was an IPB. The tagging for IPBs is such that there are no entries, but rather the individual pieces that eventually make up a row, such as <partno>...</partno><desc>...</desc>, etc. So the stylesheet (FOSI) ends up building everything in RAM. The numerical index for our planes is some 1500 pages. Think about it... 1500 pages of pseudo tables... BLAAH!


Anyway, we don't restart our services over the weekend, so these are sent on Fridays. Of course we don't print them often.


Bob

You can create a .dvi file using format allpasses force at the command line or in a startup file. Then use pubview.exe to preview the .dvi file. The format command is a little faster than the preview command because the preview contents don't have to be rendered.


Good luck!
Suzanne Napoleon
www.FOSIexpert.com
"WYSIWYG is last-century technology!"


Bob,

Table markup, whether authored or FOSI-generated, takes longer to format than flowing text. If you don't need formatting available only with tables, you may want to consider using algroup, indent, ruling, and possibly boxing instead of gentables in order to speed up the formatting process.

If you decide to test this, please let us know your results:-)

Thanks!
Suzanne


That's interesting as an Arbortext consultant once told me that the first thing that happens is that the Styler stylesheet is converted to a FOSI (just as if you exported the stylesheet as a FOSI), so it never actually processes with a Styler stylesheet. Anyone else heard that one?

Dave

That is true. There is no "Styler engine" that can run a Styler sheet
directly without first compiling it to an XSL-FO, FOSI or APP template.

However, unlike languages like C, for which compiler technology is very
mature and even a highly-skilled programmer would be hard-pressed to write
more efficient assembler code than what's produced by the compiler,
Styler's "compilers" for the various formatter backends don't necessarily
produce the most efficient results compared to a hand-built FOSI or APP
template.

There are a variety of reasons for this, but one that comes to mind for
FOSI is numbering. Automatic numbering is usually best handled using
FOSI's built-in counters. More sophisticated numbering may require calls
to ACL functions, which can involve a performance hit. There are cases
where a Styler-generated FOSI makes calls to ACL for numbering, in order to
support the full range of numbering features that the Styler UI offers,
even if the Styler sheet in question hasn't used any features beyond what
could have been handled with counters. More sophisticated or aggressive
optimization strategies within Styler's "compiler" for FOSI might catch
this. A well-tuned hand-built FOSI almost certainly would.

-Brandon 🙂


Suzanne,


Ok, to get a little deeper... I am involved with DOD spec doctypes and, to make matters worse, use PTC's CPD product. Though I can do all kinds of FOSI-related trickery with both style and ACL, I have a limit to what I can change and still have a viable document for change packages.


On another note about the speed of FOSI... It's direct and to the point, so it's quite fast. If I had nothing but tables/rows/entries, the formatter would do its thing relatively fast. However, as you know Suzanne, pseudo tables are built one piece at a time. Then when enough pieces are put together for the entries within a row, the row is stored while the next row is built, and so on. Goes like that through the whole table. Because of the complexity of the data and how it's built, it is far easier to build the SGML structure using the data types than to build the rows/entries and all their attributes based on what type of entry they are.
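
To make the "one piece at a time" description concrete, here is a rough Python analogy. It is not how the FOSI formatter works internally: the tag names echo the partno/desc example from earlier in the thread, and the sample data is invented. It only shows the shape of the work, where rows are assembled from a flat stream and every finished row is held in memory until the end.

```python
from typing import Iterable


def build_rows(pieces: Iterable[tuple[str, str]],
               row_start: str = "partno") -> list[dict[str, str]]:
    """Group a flat stream of (tag, text) pieces into table-like rows.

    A new row begins each time `row_start` appears. Every finished row is
    kept in memory until the whole stream has been read.
    """
    rows: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for tag, text in pieces:
        if tag == row_start and current:
            rows.append(current)     # store the finished row
            current = {}
        current[tag] = text
    if current:
        rows.append(current)         # do not lose the last row
    return rows


# Invented sample data in the flat, row-less style of an IPB.
flat = [("partno", "100-1"), ("desc", "Bracket"),
        ("partno", "100-2"), ("desc", "Bolt")]
for row in build_rows(flat):
    print(row)
```

Scaled up to 1500 pages, the list of stored rows is what grows, which fits the observation that everything ends up being built in RAM.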


I have nothing against FOSI whatsoever. Being DOD-oriented, it never has to change as far as I'm concerned. There's not much I can't do with it for paper output. When I mix in ACL, the sky's the limit.


Have a great day,


Bob

Bob,

I'm with you -- FOSI (plus ACL if needed) is great for techdocs, especially because of its speed compared to XSL-FO and APP. XSL-FO would probably be out of the question in your case.


Good luck!
Suzanne


I object! My native APP could beat your native FOSI! 😉

Suzanne Napoleon <suzannenapoleon@fosiexpert.com> wrote:



Bob,

I'm with you -- FOSI (plus ACL if needed) is great for techdocs, especially because of its speed compared to XSL-FO and APP. XSL-FO would probably be out of the question in your case.

Good luck!
Suzanne

Hi Gareth!

Ya think? 🙂

My understanding is that APP is slower than FOSI. The Help info for Print Engine Comparison sez to consider FOSI if you "require the fastest possible performance." I've heard there is an effort to speed up APP. Has that happened?

Thanks!
Suzanne

Glad I managed to get someone fired up with my response 🙂

What you're referring to is true in the context of Styler. It comes down to what Brandon referred to earlier. Styler is basically a high-level description of a stylesheet which is then "compiled" into a native stylesheet language such as XSL-FO, FOSI or APP.

It turns out the design of Styler stylesheets is very close to FOSI and XSL-FO but not as close to native APP templates. Therefore Styler's APP compiler is not as mature as the FOSI compiler and produces less optimal code. PTC are actively working to improve the APP compiler and it is certainly getting better/faster with each release.

Native APP templates that have not come from Styler can be very fast indeed; showdown at high noon? 🙂

-G

Thanks for the clarification 🙂

I tested the formatting speed for my book on my laptop with Print Composer. The DTD is based on docbook. The stylesheet is a custom native FOSI. The book currently has 888 pages without the index. It has a lot of tables, most of which are not very long, and there are a lot of graphics throughout. The flowing text is ragged right, no hyphenation, and there are margin notes. There is a fair amount of as-is, line-for-line text. A few ACL functions are used. Two formatting passes are needed because of the TOC and xrefs. With no pre-existing cache files, the doc formatted in less than 34 seconds. With pre-existing cache files, the doc formatted in less than 28 seconds. When the index bug is fixed, I'll test again with the index.

That is pretty good! You’ll have to send me the files one day so I can see if I can do it in 33 seconds 😉

I’m sure we will all be interested to see your book. What a massive endeavour, it should be very rewarding to finally get it out there. A few of us have thought about something similar for APP but it seems a bit like climbing Mt Everest. I don’t know how you do it 🙂

-G