The following example shows how easy it is to combine a performance monitoring solution with a profiler.
On a regular patrol through the AppDynamics monitoring of our continuously integrated projects, I found this interesting HotSpot in iText. iText, a previously free but now commercial Java library, makes it easy to parse PDFs by just using a PdfReader:
PdfReader reader = new PdfReader(filename);
That's really easy!
But in fact this does more than expected, as you can easily see in the following screenshot showing code HotSpots in AppDynamics:
[Screenshot: code HotSpots in AppDynamics]
About 20% of the whole transaction is spent on opening 2 PDFs. I got curious. What is happening there? Could this be slow code?
The invocation trace shows that inside the constructor of the PdfReader a PRTokeniser is created, which itself creates a RandomAccessFileOrArray. This in turn opens a RandomAccessFile or a MappedRandomAccessFile. The latter uses a FileChannel to read the file into memory. It looks like a simple approach, but with a lot of delegation, so there should be room to try alternatives.
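The two access strategies named in the trace can be illustrated with plain JDK classes. This is only a sketch of the general technique and not iText's actual implementation; the class FileAccessSketch and its method names are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration class; not part of iText.
public class FileAccessSketch {

    /** Reads one byte at the given offset by seeking in a RandomAccessFile. */
    public static int readViaRandomAccessFile(String filename, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(filename, "r")) {
            raf.seek(offset);
            return raf.read(); // 0..255, or -1 at end of file
        }
    }

    /** Reads one byte at the given offset through a memory-mapped view of the file. */
    public static int readViaMappedFile(String filename, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(filename, "r");
             FileChannel channel = raf.getChannel()) {
            // Map the whole file read-only; the OS pages the content in on access.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return buffer.get((int) offset) & 0xFF; // convert signed byte to 0..255
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java FileAccessSketch <file>");
            return;
        }
        System.out.println("first byte via RandomAccessFile: " + readViaRandomAccessFile(args[0], 0));
        System.out.println("first byte via mapped file:      " + readViaMappedFile(args[0], 0));
    }
}
```

Both methods return the same byte for a valid offset; they differ only in how the bytes reach the caller, which is the kind of detail that is interesting to swap out when hunting for a faster path.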
So I wrote a little microbenchmark for our code, which reads fields from PDF files.
After 3000 warmup calls, I ran 10000 benchmarked calls, which took a total of 12.3 seconds on my machine (Win 7 32-bit, Java 1.6.20, -server).
To understand what the code was doing, I used YourKit as the profiler. After the code had run, it produced this report:
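A warmup-then-measure loop of this kind could look roughly like the following. It is a minimal sketch and not the benchmark used here: the class MicroBenchmark, its run method, and the placeholder workload are assumptions for illustration, and the real workload would be the PDF field-reading call.

```java
import java.util.function.Supplier;

// Hypothetical harness; names and structure are illustrative only.
public class MicroBenchmark {

    // Results are folded into this field so the JIT cannot discard the calls as dead code.
    private static volatile long sink;

    /**
     * Calls the task warmupCalls times untimed, then measuredCalls times timed.
     * Returns the elapsed time of the measured calls in nanoseconds.
     */
    public static long run(Supplier<?> task, int warmupCalls, int measuredCalls) {
        long local = 0;
        for (int i = 0; i < warmupCalls; i++) {
            local += System.identityHashCode(task.get());
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredCalls; i++) {
            local += System.identityHashCode(task.get());
        }
        long elapsed = System.nanoTime() - start;
        sink = local;
        return elapsed;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for the code that reads fields from a PDF.
        Supplier<String> task = () -> Integer.toString(12345);
        long nanos = run(task, 3000, 10000);
        System.out.printf("10000 calls took %.1f ms%n", nanos / 1_000_000.0);
    }
}
```

The warmup calls give the JIT compiler a chance to optimize the hot path before timing starts, so the measured loop reflects steady-state behavior rather than interpreter overhead.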
I found an alternative constructor that takes a RandomAccessFileOrArray directly, so I tried it in order to be able to substitute the method for reading the file. To approach this in baby steps, I first did what the other constructor would have done:
PdfReader reader = new PdfReader(new RandomAccessFileOrArray(filename), null);
But I was surprised! This code took a lot less time for the 10k loops: 5.8 seconds, less than half of the original 12.3 seconds. Just by doing the same thing as before? Really?
The JavaDoc revealed what was different here:
* Reads and parses a pdf document. Contrary to the other constructors only the xref is read
* into memory. The reader is said to be working in "partial" mode as only parts of the pdf
* are read as needed.
Interesting! This could perform a lot better for my use case, which only covers reading form fields.
YourKit proved this. No more traces of expensive constructors:
So this second variant is faster, and I was not even looking for it. But when working on performance, I recommend reaching for low-hanging fruit. Very often you find that you need to change a slightly different place than the one exposing the performance issue. Since I made this improvement in our PDFService code, it has never shown up in performance reports again.
For me this is a very good example of how you should work towards improved performance:
- Monitor performance in production and test.
- Analyze anomalies.
- Evaluate different approaches to solving the problem using a profiler.
- Prefer doing small simple changes with large effect.
- Continue observing performance.