What do you do when your Garbage Collection runs all the time? Well usually you would try to introduce some magic GC parameters to fix your issue. But let me tell you: Excessive garbage production cannot be fixed by parameters!
You use the parameters to tune an “ok” GC to become a “good” GC. In complex applications it is very easy to lose control over your garbage. I have seen logs where gigabytes of garbage is produced in a second. Visualized it looks like this:
But usually it happens only in production, or, if you are lucky, maybe even in load testing, but the main problem remains: How can you figure out what kind garbage is produced? Garbage leaves no trace.
There are a few possible ways to do that:
- Take a heap dump. Use a tool like Eclipse MAT to inspect the references of garbage.
This works. But taking a heap dump will stop your machine for a while, so you do not want to do that in production. Also, it is limited in what it sees. Maybe your timing is bad and you take the heap dump right after a GC when there is no significant garbage in memory. Even if there is garbage, you will never really know if that is really the biggest contributor.
- Use a profiler like JProfiler. Allocation profiling however is actually quite expensive and requires a lot of tuning to bring the overhead down. Even then it is risky to run in production. Using it on a development machine works, but still you might be unsure if that object creation you are looking at is really creating the garbage you can see from your GC logs.
- Add logging. This is for the desperate. The amount of logging and additional bookkeeping usually creates a lot of additional pressure on memory and GC as well as log file pollution. Also it requires deployment of many temporary debugging versions.
I did not like any of those solutions, so I decided to create my own tool. Maybe somebody else has done this already, but in that case I was unable to find it.
I created and open sourced Allocation Tracker.
Allocation Tracker is a performance and production optimized object creation counter. All it does is to add a counter to each
call of a class matching a given prefix. It is very fast on startup, where it does load time byte code manipulation. And it is very fast on object creation, where it only adds a counter per class name. It is also designed to consume very little memory.
After startup it does nothing until turned on via JMX. Eventually you want to look at what has been recorded. That functionality is also exposed via JMX.
At codecentric we use the Vaadin Framework a lot. We also contribute back features and optimizations. When starting our application with
com.vaadin classes will get instrumented.
After letting the tracker work for about a minute while working with the application, we can generate a report of the most frequently created instances:
com/vaadin/server/EncodeResult 102800 com/vaadin/server/AbstractClientConnector$AllChildrenIterable 9442 com/vaadin/server/AbstractClientConnector$CombinedIterator 9442 com/vaadin/util/CurrentInstance 3507 com/vaadin/event/ConnectorEvent 3445 com/vaadin/shared/communication/SharedState 1703 com/vaadin/shared/AbstractComponentState 1693 com/vaadin/server/ClientConnector$AttachEvent 1598 com/vaadin/server/ClientConnector$DetachEvent 1460
Do not get confused by the slashes, that is bytecode speak. As a developer you usually use dots instead.
So Vaadin produces a lot of
EncodeResults. No surprise for me. I have contributed a few patches already on this, because the encoding showed up as a CPU hotspot in profiling. Version 7.1.15 used here for profiling does not have them merged yet.
What was a surprise for me, however, is the sheer amount of
The code doing that can be seen in AbstractClientConnector#L582.
Whenever a component is attached, detached or marked as dirty, it will propagate the event to its children.
To iterate over its children, the two classes mentioned above are created in line 331.
It turns out, this is not needed in most cases and can be easily changed to much more efficient code, as I have done in commit a3578d3ac293442b64cbc5f5feb4cb9106799fdb.
Iterables does not really seem to be the correct solution here, but because
getAllChildrenIterable() is a public API it needs to be fulfilled.
After applying the change, a recoding of approximately the same actions results in this report:
com/vaadin/server/EncodeResult 108604 com/vaadin/event/ConnectorEvent 3654 com/vaadin/util/CurrentInstance 2998 com/vaadin/shared/communication/SharedState 1768 com/vaadin/shared/AbstractComponentState 1757 com/vaadin/server/ClientConnector$AttachEvent 1720 com/vaadin/server/ClientConnector$DetachEvent 1579
The inner classes (with the change now also anonymous) now appear quite at the end:
com/vaadin/server/AbstractClientConnector$1 32 com/vaadin/server/AbstractClientConnector$1$1 32
So without any noticeable impact the allocation tracker helped us find excessive object creation and we were able to craft and apply a fix.
Not everything is perfect. Because of the high performance design, the agent does have a few limitations:
- It does not record allocation stacks. While a profiler can tell you where in code you allocate objects, or a heap dump can tell you where objects are referenced (but maybe not so nicely for garbage, which by its nature is no longer referenced), this tracker cannot do that. The reason is that with a high frequency of instantiations it would create a lot of stack traces, and need to store and compare them. That would be a lot of overhead.
The reason why the allocation tracker still helps is that in most applications there are very limited places where a “hot” constructor is invoked.
The Vaadin Iterables mentioned above were only created in one place. Same usually applies to any object typical programs use. If it really happens to be a class that is used everywhere, you know at least which it is and can maybe build your own counter into that specific class to identify the hot path.
- It records superclass construction. As you can see in above report it is listing
com/vaadin/event/ConnectorEvent. If you consult Vaadin code, you will see, and maybe have even guessed, that
ConnectorEvent. In Java constructors are always invoked up the chain until
java.lang.Object. The agent has instrumented the constructor of
ConnectorEvent, without looking at the class hierarchy. Figuring out if a class is abstract, or non-abstract but never instantiated directly at load time is a bit of work, especially considering dynamic class loading. The other option would be to suppress tracking superclass creations at runtime. However this would be quite a lot of extra code. In the end the output of superclasses is not that problematic, as a developer can easily recognize and filter them. Or maybe their total count can give some extra hints.
- It is not really suitable for JVM classes. The agent will produce a warning when the instrumented prefix is “java” and maybe in future will get a set of blacklisted JVM classes. Messing with them is not really safe. Also the results would not be really useful. There are plenty of instances created in dark areas of the JVM or application servers, so the agent’s use would be quite limited. If you are a JVM or app server developer wanting to use the agent, you for sure will be able to modify it to suit you needs 🙂
If you want to reduce your object allocation, most likely motivated from frequent or long GCs, the allocation tracker will be a easy to use and safe tool for you. It will be maintained by codecentric and we will merge any performance improving pull requests. Feature wise it will stay mostly as it is right now, solving the limitations it has would affect its performance to a large extend and defeat its primary purpose.
All my plans for it are documented in Github issues.
If you found any issues, fixed problems or improved your code using the allocation tracker, please let us know or create a Pull Request!