Generating a large number of small PDF files

I currently need to generate a large number of small PDF files (around 18,000).

Each one takes between one and two seconds to generate; to keep it simple, let's say 2 seconds.

So 18,000 × 2 seconds = 36,000 seconds = 10 hours, which is far too long.

I generate each PDF by calling the FOP Java executable once per file, passing it the XML and the XSL (which transforms the XML into the XSL-FO format that FOP reads).

As I understand it, every call to FOP starts a new Java VM process (java) to run it. I wonder whether a lot of time is wasted creating and tearing down the Java VM process 18,000 times, once for each PDF file.

I am not familiar with how the Java VM process works, but I guess its resources might be reused and managed by the Java VM, so that re-creating the process many times is not a big overhead?

If re-creating the Java VM process is a big overhead, would the way to go be to write another Java program that calls the FOP API directly (instead of calling the FOP executable) and does the loop there?

Any suggestions are highly appreciated.

1 Answer(s)

You could build the FOP transformation calls into a Java program of your own. Looping over your long list of XML/XSL/PDF jobs inside a single JVM would cut down on the VM startup cost per PDF, and the second and subsequent runs would also get faster thanks to JIT compilation. Some example code is available on the Apache FOP website.
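As a rough illustration of the single-JVM loop described above, here is a minimal sketch that uses only the JDK's own XSLT API (`javax.xml.transform`). It compiles the stylesheet once into a `Templates` object and reuses it for every input document. The class name `BatchTransform`, the sample stylesheet, and the `<invoice>` inputs are made up for the example, and the FOP step itself is left out: with FOP embedded, the `StreamResult` would be swapped for a `SAXResult` wrapping the content handler of a `Fop` instance, along the lines of the embedding examples on the Apache FOP site.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class BatchTransform {

    // Hypothetical stylesheet: emits a single fo:block fragment,
    // not a complete XSL-FO document.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/invoice'>"
        + "<fo:block>Invoice <xsl:value-of select='@id'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Parse and compile the stylesheet once; the Templates object can be reused.
    static Templates compile(String xsl) throws TransformerException {
        return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Transform every input inside the same JVM, so startup cost is paid once.
    static List<String> transformAll(Templates templates, List<String> xmlDocs)
            throws TransformerException {
        List<String> results = new ArrayList<>();
        for (String xml : xmlDocs) {
            // A Transformer is created per document from the compiled stylesheet.
            Transformer transformer = templates.newTransformer();
            StringWriter out = new StringWriter();
            // Assumption: with FOP embedded, this StreamResult would be replaced
            // by a SAXResult that feeds FOP, which then writes the PDF.
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            results.add(out.toString());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile(SAMPLE_XSL);
        List<String> inputs = Arrays.asList("<invoice id='1'/>", "<invoice id='2'/>");
        List<String> outputs = transformAll(templates, inputs);
        System.out.println("Transformed " + outputs.size() + " documents in one JVM");
    }
}
```

In a real batch job the inputs would be read from files rather than held as strings, and each result would be written straight to its own output file instead of being collected in memory.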

Another possibility with the embedded approach is to build your FOP transformation service as an RMI server, so that it can be called many times: the client application holds the logic for all your input data, and the server does the PDF generation.
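The RMI variant could look roughly like the sketch below, which uses only the standard `java.rmi` classes. Everything specific in it is an assumption made for illustration: the `RenderService` interface, its `render` method, the binding name, and port 1099. The implementation is a placeholder that just echoes its input; a real server would run the XSLT and FOP pipeline at that point and return the PDF bytes.

```java
import java.nio.charset.StandardCharsets;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RenderServer {

    // Hypothetical remote interface: one call per document to render.
    public interface RenderService extends Remote {
        byte[] render(String xml) throws RemoteException;
    }

    // Placeholder implementation; a real one would invoke FOP here
    // and return the generated PDF bytes.
    static final class Impl implements RenderService {
        @Override
        public byte[] render(String xml) {
            return ("rendered:" + xml).getBytes(StandardCharsets.UTF_8);
        }
    }

    // Strong references so the service and registry are not
    // garbage-collected while the server is running.
    private static Impl service;
    private static Registry registry;

    public static void main(String[] args) throws Exception {
        service = new Impl();
        // Export the object on an anonymous port and obtain its stub.
        RenderService stub =
                (RenderService) UnicastRemoteObject.exportObject(service, 0);
        // Start an in-process registry and publish the stub under a name.
        registry = LocateRegistry.createRegistry(1099);
        registry.rebind("RenderService", stub);
        System.out.println("RenderService bound on port 1099");
    }
}
```

A client would then obtain the stub with `LocateRegistry.getRegistry("localhost", 1099).lookup("RenderService")`, cast it to `RenderService`, and call `render` once per input document, so the server JVM stays warm across all 18,000 requests.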
