
Optimizing iText performance using AppDynamics and YourKit

27.11.2010 | 2 minutes of reading time

The following example shows how easy it is to combine a performance monitoring solution with a profiler.
On a regular patrol through the AppDynamics monitoring of our continuously integrated projects, I found an interesting hotspot in iText. iText, a previously free but now commercial Java library, lets you parse PDFs easily by just using a PdfReader:

PdfReader reader = new PdfReader(filename);

That's really easy!
But in fact it does more than expected, as you can easily see in the following screenshot showing code hotspots in AppDynamics:

[Screenshot: code hotspots in AppDynamics]

About 20% of the whole transaction is spent on opening 2 PDFs. I got curious. What is happening there? Could slow code be involved?

The invocation trace shows that inside the constructor of PdfReader a PRTokeniser is created, which itself creates a RandomAccessFileOrArray. This in turn opens a RandomAccessFile or a MappedRandomAccessFile. The latter uses a FileChannel to read the file into memory. It looks like a simple approach, but with a lot of delegation, so there should be alternatives to play with.
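To make the last step of that chain more concrete, here is a minimal sketch of reading a file through a memory-mapped FileChannel using only the JDK. It is not iText's MappedRandomAccessFile; the class name MappedReadSketch, the methods readHeader and demo, and the temporary file contents are hypothetical names and data chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration, not iText code.
public class MappedReadSketch {

    /** Maps the whole file read-only and returns up to 'count' leading bytes as ASCII text. */
    public static String readHeader(Path file, int count) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping covers the complete file, regardless of how much is used later.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] head = new byte[Math.min(count, buffer.remaining())];
            buffer.get(head);
            return new String(head, StandardCharsets.US_ASCII);
        }
    }

    /** Writes a small placeholder file and reads its first 8 bytes back through the mapping. */
    public static String demo() {
        try {
            Path file = Files.createTempFile("mapped-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, "%PDF-1.4\nplaceholder body".getBytes(StandardCharsets.US_ASCII));
            return readHeader(file, 8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Mapping is convenient because the rest of the code can treat the file like a byte buffer, but the whole file is still made addressable up front, even if only a small part is needed.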

So I wrote a little microbenchmark for our code, which reads fields from PDF files.
After 3000 warmup calls, I ran 10000 benchmarked calls, which took a total of 12.3 seconds on my machine (Win 7 32bit, Java 1.6.20, -server).
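The structure of such a microbenchmark can be sketched roughly as follows. This is not the original benchmark code: the class MicroBenchmark, its run method, and the placeholder workload are hypothetical, and the real task would be the code that reads form fields from a PDF.

```java
// Hypothetical harness, assuming the workload can be expressed as a Runnable.
public class MicroBenchmark {

    /**
     * Runs the task 'warmupCalls' times without measuring, so the JIT can compile
     * the hot paths, then runs it 'measuredCalls' times and returns the elapsed
     * time of the measured calls in nanoseconds.
     */
    public static long run(Runnable task, int warmupCalls, int measuredCalls) {
        for (int i = 0; i < warmupCalls; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredCalls; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute the code under test here.
        StringBuilder sink = new StringBuilder();
        Runnable task = () -> {
            sink.setLength(0);
            sink.append(System.nanoTime());
        };
        long nanos = run(task, 3000, 10000);
        System.out.printf("10000 calls took %.3f s%n", nanos / 1e9);
    }
}
```

A hand-rolled loop like this is only a rough measuring tool, but it is enough to compare two variants of the same call on the same machine.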
To understand what the code was doing, I used YourKit as a profiler. After the run, it produced this report:

I found an alternative constructor that takes a RandomAccessFileOrArray directly. I tried it so that I could later substitute the way the file is read. To approach this in baby steps, I first did what the other constructor would have done:

PdfReader reader = new PdfReader(new RandomAccessFileOrArray(filename), null);

But I was surprised! This code took a lot less time for the 10k loops: 5.8 seconds. By just doing the same thing as before? Really?
The JavaDoc revealed what was different here:

/**
* Reads and parses a pdf document. Contrary to the other constructors only the xref is read
* into memory. The reader is said to be working in "partial" mode as only parts of the pdf
* are read as needed.
**/

Interesting! This could perform a lot better for my use cases, which only cover reading form fields.
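Why reading only parts of a file can be so much cheaper can be sketched with plain java.io. This is not how iText implements partial mode; it only contrasts loading every byte with seeking to the end of the file, where a PDF keeps the startxref pointer to its cross-reference table. The class PartialReadSketch, its methods, and the generated file contents are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration, not iText code.
public class PartialReadSketch {

    /** Eager variant: loads every byte of the file into memory. */
    public static byte[] readAll(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    /** Partial variant: seeks towards the end and reads at most 'tailSize' bytes. */
    public static byte[] readTail(Path file, int tailSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            int toRead = (int) Math.min(length, tailSize);
            byte[] tail = new byte[toRead];
            raf.seek(length - toRead);
            raf.readFully(tail);
            return tail;
        }
    }

    /** Builds a 1 MB placeholder file with a PDF-like trailer and returns its last 32 bytes as text. */
    public static String demo() {
        try {
            Path file = Files.createTempFile("partial-sketch", ".bin");
            try {
                Files.write(file, new byte[1_000_000]);
                Files.write(file,
                        "startxref\n12345\n%%EOF".getBytes(StandardCharsets.US_ASCII),
                        StandardOpenOption.APPEND);
                return new String(readTail(file, 32), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().trim());
    }
}
```

In this sketch readTail touches 32 bytes where readAll would copy about a megabyte, which is the kind of difference that adds up over thousands of calls.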

YourKit proved this. No more traces of expensive constructors:

So this second variant is faster, and I was not even looking for it. When working on performance, I recommend reaching for the low-hanging fruit first. Very often you will find that you need to change a slightly different place than the one exposing the performance issue. Since I made this improvement in our PDFService code, it has never shown up in performance reports again.

For me this is a very good example of how you should work towards improved performance:

  1. Monitor performance in production and test.
  2. Analyze anomalies.
  3. Evaluate different approaches to solving the problem using a profiler.
  4. Prefer doing small simple changes with large effect.
  5. Continue observing performance.
