<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>codecentric Blog &#187; Fabian Lange</title>
	<atom:link href="http://blog.codecentric.de/en/author/fla/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.codecentric.de/en/</link>
	<description>Expert knowledge on agile software development, Java and performance solutions.</description>
	<lastBuildDate>Sun, 16 Jun 2013 17:19:42 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>International #TableTopDay at codecentric</title>
		<link>http://blog.codecentric.de/en/2013/03/international-tabletopday-at-codecentric/</link>
		<comments>http://blog.codecentric.de/en/2013/03/international-tabletopday-at-codecentric/#comments</comments>
		<pubDate>Sat, 30 Mar 2013 21:39:49 +0000</pubDate>
		<dc:creator>Fabian Lange</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.codecentric.de/?p=17961</guid>
		<description><![CDATA[All the geeks at codecentric were pretty excited about TableTopDay, a table top board gaming holiday invented by the fine folks at GeekAndSundry, Felicia Day and Wil Wheaton. As a response to their extremely successful YouTube Show &#8220;TableTop&#8221;, which accumulated already &#8230; <a href="http://blog.codecentric.de/en/2013/03/international-tabletopday-at-codecentric/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><img src="http://blog.codecentric.de/files/2013/03/TableTopDay_200x200.jpg" alt="TableTopDay_200x200" width="200" height="200" class="alignright size-full wp-image-17986" />
<p>All the geeks at codecentric were pretty excited about TableTopDay, a table top board gaming holiday invented by the fine folks at GeekAndSundry, Felicia Day and Wil Wheaton. As a response to their extremely successful <a href="http://www.youtube.com/playlist?list=PL4F80C7D2DC8D9B6C">YouTube Show &#8220;TableTop&#8221;</a>, which has already accumulated over 8.5 million views, they proclaimed March 30th to be International TableTopDay. Over 2500 groups registered their event on <a href="http://www.tabletopday.com">tabletopday.com</a> and so did we.
</p>
<p>
While I expected only codecentrics to attend, we ended up with one guy working for one of our customers, one random guy who registered on our Google+ Event page and 4 strangers who showed up without any notice. All the codecentrics also brought their families. It was a successful and fun event, for us and for all board game nerds worldwide, as proven by the hashtag <a href="https://twitter.com/search?q=%23TableTopDay&#038;src=hash">#TableTopDay</a>, which was trending on Twitter the whole Saturday, and by the many people posting pictures or even livestreaming their event.
</p>
<p><span id="more-17961"></span><br />
Games we played:</p>
<ul>
<li><a href="http://www.amazon.de/gp/product/B001BAUGW0/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B001BAUGW0&#038;linkCode=as2&#038;tag=exfuror">Pandemic</a></li>
<li><a href="http://www.amazon.de/gp/product/B001REM4KC/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B001REM4KC&#038;linkCode=as2&#038;tag=exfuror">Small World</a></li>
<li><a href="http://www.amazon.de/gp/product/B0014LETVU/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B0014LETVU&#038;linkCode=as2&#038;tag=exfuror">Stone Age</a></li>
<li><a href="http://www.amazon.de/gp/product/B000KY536C/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B000KY536C&#038;linkCode=as2&#038;tag=exfuror">Fearsome Floors</a></li>
<li><a href="http://www.amazon.de/gp/product/B002IUFSPM/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B002IUFSPM&#038;linkCode=as2&#038;tag=exfuror">Castle Panic</a></li>
<li><a href="http://www.amazon.de/gp/product/B0036X00EE/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B0036X00EE&#038;linkCode=as2&#038;tag=exfuror">Hansa Teutonica</a></li>
<li><a href="http://www.amazon.de/gp/product/B004LWF076/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B004LWF076&#038;linkCode=as2&#038;tag=exfuror">Zombie Dice</a></li>
<li><a href="http://www.amazon.de/gp/product/B006WY1X1S/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=B006WY1X1S&#038;linkCode=as2&#038;tag=exfuror">Munchkin</a></li>
</ul>
<p>And maybe others; I was busy winning and could not watch all the games being played <img src='http://blog.codecentric.de/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  Thanks to everyone who attended and made this possible, and special thanks to codecentric for hosting this event! Looking forward to next year, or earlier whenever possible!</p>
<p>Play more games!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.codecentric.de/en/2013/03/international-tabletopday-at-codecentric/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The CenterDevice Cloud Architecture</title>
		<link>http://blog.codecentric.de/en/2013/02/the-centerdevice-cloud-architecture/</link>
		<comments>http://blog.codecentric.de/en/2013/02/the-centerdevice-cloud-architecture/#comments</comments>
		<pubDate>Thu, 14 Feb 2013 21:40:48 +0000</pubDate>
		<dc:creator>Fabian Lange</dc:creator>
				<category><![CDATA[Cloud @en]]></category>
		<category><![CDATA[Performance @en]]></category>

		<guid isPermaLink="false">http://blog.codecentric.de/?p=16913</guid>
		<description><![CDATA[In this post I want to give you an insight into the architecture of CenterDevice, a document management and collaboration tool for the enterprise hosted in our own cloud datacenter in Germany. CenterDevice is a startup of codecentric, with a &#8230; <a href="http://blog.codecentric.de/en/2013/02/the-centerdevice-cloud-architecture/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In this post I want to give you an insight into the architecture of <a href="https://www.centerdevice.de">CenterDevice</a>, a document management and collaboration tool for the enterprise hosted in our own cloud datacenter in Germany. CenterDevice is a startup of codecentric, with a few codecentrics working full or part time on it.</p>
<p>Let me start with a screenshot from our AppDynamics monitoring.<br />
<img src="http://blog.codecentric.de/files/2013/01/centerdevice-architecture.png" alt="centerdevice-architecture" width="699" height="316" class="aligncenter size-full wp-image-16915" /><br />
<span id="more-16913"></span></p>
<p>
The green boxes are all Java 7 based services monitored with AppDynamics. The number &#8220;4&#8221; within the blue circle indicates that there are currently 4 instances of each service running. This number can change when services are scaled up and down. We currently run a mix of <strong>KVM</strong> virtualized and non-virtualized <strong>CentOS</strong> instances on high-end Dell machines with a total of about 148 cores, 600GB memory and 150TB disk space.<br />
The lines connecting the boxes indicate call directions, their average response times and how often the calls were made in that time interval. What is really great about AppDynamics is that it automatically detects the services and how they talk to each other.
</p>
<p>
In the middle of the picture is the heart of our architecture &#8220;tomcat-rest&#8221;. When you talk to
<pre>api.centerdevice.de/v1</pre>
<p> you arrive there (and see a message telling you that you are missing our OAuth authorization &#8211; we are planning to publish our API soon). It hosts all of our REST services, which are implemented with <strong>Jersey</strong>. We chose Jersey because it makes implementing REST services very easy, is a proven piece of software and is also easily testable in unit and integration tests, as outlined in this excellent blog post by <em>Michael Lex</em> about <a href="http://blog.codecentric.de/en/2012/05/writing-lightweight-rest-integration-tests-with-the-jersey-test-framework/">Writing lightweight REST integration tests with the Jersey Test Framework</a>.
</p>
<p>
Services accessing any kind of data use our <strong>MongoDB</strong> backend, visualized by AppDynamics as blue boxes on the top. As a scalable database with easy access and painless schema changes (because there is no schema), it suits our performance requirements, as well as the changeability we require for a young product with frequent releases. If you want to know more about MongoDB I recommend reading <em>Thomas Jaspers'</em> <a href="http://blog.codecentric.de/en/2012/12/mongodb-tutorial/">MongoDB tutorial</a>. We talk to it via the Java driver, have set up replica sets and shard across multiple nodes. The major challenge here is accepting eventual consistency: even code that has just written data might not be able to read it back on the next line.
</p>
<p>
Uploaded documents are stored on a <strong>Gluster</strong>-backed XFS file system. All data is encrypted using a per-user 256-bit AES key before it is persisted on the file system. Gluster then takes care of replicating the data to all servers.
</p>
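<p>To illustrate the encrypt-before-persist step, here is a minimal sketch using only the JDK's <code>javax.crypto</code> API. The cipher mode (GCM), the IV handling and all class and method names are assumptions made for this example; this post does not describe how CenterDevice actually manages keys or which mode it uses. On older Java 7 runtimes, 256-bit keys may additionally require the unlimited strength policy files.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper, not CenterDevice code: encrypts bytes with a per-user AES key.
final class PerUserEncryptionSketch {
    private static final int IV_BYTES = 12;  // common nonce size for GCM
    private static final int TAG_BITS = 128; // authentication tag length

    // Generates a fresh 256-bit AES key; the sketch assumes one such key per user.
    static SecretKey newUserKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the document and prepends the random IV to the ciphertext.
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] encrypted = cipher.doFinal(plain);
            byte[] stored = new byte[IV_BYTES + encrypted.length];
            System.arraycopy(iv, 0, stored, 0, IV_BYTES);
            System.arraycopy(encrypted, 0, stored, IV_BYTES, encrypted.length);
            return stored;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the front of the stored bytes and decrypts the rest.
    static byte[] decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey userKey = newUserKey();
        byte[] stored = encrypt(userKey, "cute baby animals".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(userKey, stored), StandardCharsets.UTF_8));
    }
}
```

<p>In this sketch only the bytes returned by <code>encrypt</code> would ever reach the file system, so the replication layer never sees plain text.</p>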
<p>
For clients, we developed two native apps: <a href="https://itunes.apple.com/us/app/centerdevice-for-ipad/id557148108">CenterDevice for iPad</a> and <a href="https://play.google.com/store/apps/details?id=de.centerdevice.android&#038;hl=en">CenterDevice for Android Phone</a>. The reason for going native is that some emulated or cross-compiled platforms have plenty of issues. Mark Zuckerberg also recently gave a widely noted talk about the <a href="http://techcrunch.com/2012/09/11/mark-zuckerberg-our-biggest-mistake-with-mobile-was-betting-too-much-on-html5/">problems HTML5 mobile clients have</a> at the moment. And because iOS dominates the tablet market and Android rules the phone market, those are the platforms we started with.
</p>
<p>
<img src="http://blog.codecentric.de/files/2013/01/centerdevice-mobile-clients.png" alt="centerdevice-mobile-clients" width="696" height="361" class="aligncenter size-full wp-image-17002" />
</p>
<p>Note that our screenshots show cute baby animals, instead of boring documents, because they are more adorable.<br/>Besides our native iPad and Android Phone application, and a few third party clients, the main user of the REST Server is our <strong>Vaadin 6</strong> based web client. In fact we have two of them
<pre>app.centerdevice.de</pre>
<p> and
<pre>public.centerdevice.de</pre>
<p>Both are logically hosted on &#8220;tomcat-centerdevice&#8221;.<br />
Vaadin, as a framework, allows us to implement complex rich web interfaces easily, using SWT-style Java code. It compiles to GWT JavaScript which is then delivered to the browser. The bulk of the work can be done easily in Vaadin, which we combine with <strong>CDI-utils</strong>, a plugin that allows implementing the MVP pattern easily using <strong>Weld</strong> as CDI library. Developing complex components, however, sometimes requires writing GWT widgets yourself, which is not that easy. For easy copy to clipboard functionality, <em>Henning Treu</em> wrote a <a href="http://blog.codecentric.de/en/2012/07/copy-to-clipboard-with-vaadin/">Copy to clipboard Vaadin addon</a>, which we open sourced.
</p>
<p>
<img src="http://blog.codecentric.de/files/2013/01/centerdevice-web-app.png" alt="centerdevice-web-app" width="690" height="382" class="aligncenter size-full wp-image-17004" />
</p>
<p>
Communication between the web application servers and the REST server is unidirectional only, but sometimes we want to send notifications (like new documents somebody just shared with you) back to the web server.
</p>
<p>
That is where <strong>RabbitMQ</strong> comes into play. The application map from AppDynamics shows our 3 usages of messaging:</p>
<ul>
<li>Sending Notifications from REST to Web.</li>
<li>Sending requests to send e-Mails (currently sent from and consumed by the REST Server).</li>
<li>Sending processing requests from REST server to doc-server.</li>
</ul>
<p>RabbitMQ is set up using HA queues and has worked flawlessly so far. <em>Tobias Trelle</em> wrote an <a href="http://blog.codecentric.de/en/2011/04/amqp-messaging-with-rabbitmq/">introduction to RabbitMQ with Spring</a>, which provides more background about RabbitMQ and AMQP.</p>
<p>
The document processing is done by &#8220;doc-server&#8221;, which has multiple tasks depending on the type of input document:</p>
<ul>
<li>Generating PDF representations</li>
<li>Generating preview images for different sizes</li>
<li>Performing fulltext extraction</li>
<li>Performing OCR</li>
<li>Getting page count</li>
<li>Obtaining additional metadata</li>
<li>Detecting language</li>
</ul>
<p>There are basically two types of documents that we use as a basis: PDFs and images.<br />
If we get any other format, we use <strong>LibreOffice</strong> to convert it to PDF, or <strong>ImageMagick</strong> to convert it to images.<br />
ImageMagick is then also used to generate preview images in various sizes.<br />
Depending on the type of document, we can use <strong>Apache Tika</strong> to extract the fulltext. For documents where Tika cannot find any fulltext, we resort to OCR running on &#8220;tomcat-ocr&#8221;. OCR is done using <strong>tesseract</strong>. Further metadata is extracted using Tika and custom detectors.
</p>
<p>
Search capabilities are provided by <strong>Apache Solr 3</strong>, running on &#8220;tomcat-solr&#8221;. Solr is running in master-slave mode. One neat feature of CenterDevice is that it performs super fast search on everything we can extract from documents or their metadata.
</p>
<h3>Performance</h3>
<p>
As you can see in the screenshot, the overall performance is quite nice. However, it largely depends on the documents uploaded for individual processing requests. AppDynamics automatically learns the normal behavior and alerts us when there is a deviation. For uploads and downloads, however, we turned this learning off. Depending on the document and the client's connectivity, there simply is no such thing as a normal time for a client to upload or download a document. We do, however, gather diagnostic data for extraordinarily slow uploads and downloads so that we can investigate in case of customer complaints. A similar story applies to document processing. While in most cases the learned baselines are good, sometimes they do not match. That is why the connections to our external processing services are red in the screenshot: some heavy document processing was going on at the time it was taken. When performance degrades, AppDynamics captures important metrics and provides us with code-level insight into the root cause. So far most issues were the typical ones, like too many queries, too many API calls, inefficient indices etc.
</p>
<h3>ToDos</h3>
<p>
We always have ideas for improving stuff and are moving fast, so the architecture will change. While we do not have a pressing need at the moment, these changes will most likely come in the future:</p>
<p>* Introduce web session replication for failover using memcached or Redis. (Currently we lose web session data on failures, which have not happened so far, and on deployments, which happen at night when no sessions are alive.) The major challenge so far seems to be integrating it with the Servlet 3 async pushing we do.<br />
* Switch from Gluster to Hadoop File System (we were bitten by lots of Gluster problems, like running on ext-4 64bit, which we have now changed to xfs)<br />
* Switch from Solr 3 to Elasticsearch (the master-slave failover just does not work as nicely and does not scale, plus changes to Solr 3 need downtime)<br />
* Add a reverse proxy / load balancer layer to perform blue/green deployments and to redirect specific users to certain versions of the software.
</p>
<h3>Join the team!</h3>
<p>
I hope you enjoyed the overview of the architecture and software we use. If you are interested in helping us build and grow this stack, we have good news for you: CenterDevice and codecentric are hiring! We are especially looking for a dev-opsy Linux and hardware enthusiast to build out what I described above. <strong>Get in touch!</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.codecentric.de/en/2013/02/the-centerdevice-cloud-architecture/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Save Memory by Using String Intern in Java</title>
		<link>http://blog.codecentric.de/en/2012/03/save-memory-by-using-string-intern-in-java/</link>
		<comments>http://blog.codecentric.de/en/2012/03/save-memory-by-using-string-intern-in-java/#comments</comments>
		<pubDate>Mon, 12 Mar 2012 18:26:24 +0000</pubDate>
		<dc:creator>Fabian Lange</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Performance @en]]></category>

		<guid isPermaLink="false">http://blog.codecentric.de/?p=12272</guid>
		<description><![CDATA[During Attila Szegedi's talk about &#8220;lessons learned about the JVM&#8221; at QCon London, I was surprised that he emphasized the importance of knowing the amount of data you store in memory. It is not common to be concerned about object &#8230; <a href="http://blog.codecentric.de/en/2012/03/save-memory-by-using-string-intern-in-java/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>During Attila Szegedi's talk about &#8220;<a href="http://qconlondon.com/dl/qcon-london-2012/slides/AttilaSzegedi_JVMPerformanceOptimizationsAtTwittersScale.pdf">lessons learned about the JVM</a>&#8221; at <a href="http://qconlondon.com/">QCon London</a>, I was surprised that he emphasized the importance of knowing the amount of data you store in memory. It is not common to be concerned about object size in enterprise Java programming, but he gave a good example of what they had to do at Twitter.</p>
<h2>Recap: Memory Footprint of Data</h2>
<p>Question: How much memory does the String &#8220;Hello World&#8221; consume?<br />
Answer: 62/86 Bytes (32/64 bit Java)!<br />
This breaks down into 8/16 (Object Header for String) + 11 * 2 (characters) + [8/16 (Object Header char Array) + 4 (array length) padded to 16/24] + 4 (Offset) + 4 (Count) + 4 (HashCode) + 4/8 (Reference to char Array). [On 64Bit the size of String Object is padded to 40].</p>
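<p>The arithmetic above can be checked with a few lines of Java. This is only a sketch that follows the accounting used in this post (String with offset, count and hash fields, no compressed references, sizes rounded up to multiples of 8); the class and method names are made up for the example, and a real JVM may lay out objects differently.</p>

```java
// Hypothetical calculator that reproduces the breakdown given in the text.
final class StringFootprintSketch {
    // Rounds a size up to the next multiple of 8 bytes.
    static int align8(int bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Estimated bytes for a String with the given number of characters.
    static int footprint(int chars, boolean is64Bit) {
        int header = is64Bit ? 16 : 8;   // object header
        int reference = is64Bit ? 8 : 4; // reference to the char array
        // String object: header + offset + count + hash code + reference
        int stringObject = align8(header + 4 + 4 + 4 + reference);
        // char array: header + length field, padded as described in the text
        int arrayHeader = align8(header + 4);
        int characters = chars * 2;      // two bytes per char
        return stringObject + arrayHeader + characters;
    }

    public static void main(String[] args) {
        int length = "Hello World".length();
        System.out.println(footprint(length, false) + "/" + footprint(length, true)); // 62/86
    }
}
```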
<h2>The Problem</h2>
<p>Imagine you have a lot of Locations attached to tweets in your data store. The implementation of the location as a Java class could look like this</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java5" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> Location <span style="color: #009900;">&#123;</span>
	<span style="color: #003399; font-weight: bold;">String</span> city<span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">String</span> region<span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">String</span> countryCode<span style="color: #339933;">;</span>
	<span style="color: #006600; font-weight: bold;">double</span> lon<span style="color: #339933;">;</span>
	<span style="color: #006600; font-weight: bold;">double</span> lat<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>So if you load all the locations of tweets ever made, it is quite obvious that you load a lot of String objects, and at the scale Twitter has, there are for sure a lot of duplicate Strings. Attila said that this data did not fit into a 32 GB heap. So the question is: can we reduce memory consumption, so that all Locations fit into memory?<br />
<span id="more-12272"></span><br />
Let us have a look at two possible solutions, which can even augment each other.</p>
<h2>Attila's Solution</h2>
<p>There is a more or less hidden dependency between the data stored in the Location class, which, once realized, solves the problem in an elegant, non-technical way. We can just apply normalization to the Object and split it into two:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java5" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> SharedLocation <span style="color: #009900;">&#123;</span>
	<span style="color: #003399; font-weight: bold;">String</span> city<span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">String</span> region<span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">String</span> countryCode<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">class</span> Location <span style="color: #009900;">&#123;</span>
	SharedLocation sharedLocation<span style="color: #339933;">;</span>
	<span style="color: #006600; font-weight: bold;">double</span> lon<span style="color: #339933;">;</span>
	<span style="color: #006600; font-weight: bold;">double</span> lat<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>This is actually a neat solution, because cities rarely change the region and country they reside in. The combination of those Strings is unique. And it is flexible, so that even violations of that uniqueness can be handled. This always makes sense for data which could be user input. By doing so, multiple Tweets from &#8220;Solingen, NRW, DE&#8221; consume just one SharedLocation.<br />
But &#8220;Ratingen, NRW, DE&#8221; would still store 3 additional Strings in memory, rather than just the new one, &#8220;Ratingen&#8221;. By refactoring the data model like that, the amount of data for that Twitter research project dropped to about 20GB.</p>
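<p>For several Tweets to really share one SharedLocation, something has to hand out the same instance for the same combination of values. The talk did not cover how this was done at Twitter, so the following is only a sketch under my own assumptions: a hypothetical in-memory registry keyed by the three Strings, not thread-safe, with field names taken from the classes above.</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SharedLocationSketch {
    static final class SharedLocation {
        final String city;
        final String region;
        final String countryCode;

        SharedLocation(String city, String region, String countryCode) {
            this.city = city;
            this.region = region;
            this.countryCode = countryCode;
        }
    }

    static final class Location {
        final SharedLocation sharedLocation;
        final double lon;
        final double lat;

        Location(SharedLocation sharedLocation, double lon, double lat) {
            this.sharedLocation = sharedLocation;
            this.lon = lon;
            this.lat = lat;
        }
    }

    // Hypothetical registry: one SharedLocation per distinct (city, region, countryCode).
    private static final Map<List<String>, SharedLocation> REGISTRY = new HashMap<>();

    // Returns the already registered instance for these values, or registers a new one.
    static SharedLocation shared(String city, String region, String countryCode) {
        List<String> key = Arrays.asList(city, region, countryCode);
        return REGISTRY.computeIfAbsent(key, k -> new SharedLocation(city, region, countryCode));
    }

    public static void main(String[] args) {
        Location first = new Location(shared("Solingen", "NRW", "DE"), 7.08, 51.17);
        Location second = new Location(shared("Solingen", "NRW", "DE"), 7.09, 51.18);
        System.out.println(first.sharedLocation == second.sharedLocation); // true
    }
}
```

<p>Each Location then only pays for one reference and two doubles, while the three Strings exist once per distinct place.</p>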
<h2>String interning</h2>
<p>But what to do when you do not want to, or simply cannot, refactor the data model? Or what if the researchers at Twitter had not had a 20GB heap available?<br />
The answer is String interning, which keeps every String only once in memory. But there is huge confusion about String interning. Many people ask whether it speeds up equals checks, because equal Strings are actually identical Strings when using interning. Yes, it can, because equals checks for identity first (as any equals implementation should):</p>
<pre lang="java5">
// java.lang.String
public boolean equals(Object anObject) {
  if (this == anObject) {
    return true;
  }
  //...
}
</pre>
<p>But equals performance is not the reason you should do interning. String interning is intended to reuse String objects to save memory.</p>
<blockquote><p>Only use String.intern() on Strings you know are occurring multiple times, and only do it to save memory</p></blockquote>
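<p>A small demonstration of the difference between equal and identical Strings may help here. The class and method names are made up for this example; only <code>String.intern()</code> and <code>equals</code> come from the JDK.</p>

```java
final class InternIdentitySketch {
    // True when both arguments resolve to the very same interned instance.
    static boolean internedSame(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal content.
        String first = new String("Solingen");
        String second = new String("Solingen");

        System.out.println(first.equals(second));        // true
        System.out.println(first == second);             // false
        System.out.println(internedSame(first, second)); // true
    }
}
```

<p>After interning, both variables can point to one shared instance, and the two temporary copies become eligible for garbage collection once nothing references them.</p>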
<p>How effective interning is depends on the ratio of duplicate to unique strings, and on how easy it is to change the code at the places where the strings are created.</p>
<h3>So how does it work?</h3>
<p>String interning takes a String instance (so it already exists on the heap) and checks whether an identical copy already exists in a StringTable.<br />
That StringTable is basically a HashSet that stores the String in the Permanent Generation. The only purpose of that table is to keep a single instance of the String alive. If it is in there, that instance is returned. If it is not, it is added to the StringTable:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// OpenJDK 6 code</span>
JVM_ENTRY<span style="color: #009900;">&#40;</span>jstring<span style="color: #339933;">,</span> JVM_InternString<span style="color: #009900;">&#40;</span>JNIEnv <span style="color: #339933;">*</span>env<span style="color: #339933;">,</span> jstring str<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
  JVMWrapper<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;JVM_InternString&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  JvmtiVMObjectAllocEventCollector oam<span style="color: #339933;">;</span>
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>str <span style="color: #339933;">==</span> NULL<span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">return</span> NULL<span style="color: #339933;">;</span>
  oop string <span style="color: #339933;">=</span> JNIHandles<span style="color: #339933;">::</span><span style="color: #202020;">resolve_non_null</span><span style="color: #009900;">&#40;</span>str<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  oop result <span style="color: #339933;">=</span> StringTable<span style="color: #339933;">::</span><span style="color: #202020;">intern</span><span style="color: #009900;">&#40;</span>string<span style="color: #339933;">,</span> CHECK_NULL<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #b1b100;">return</span> <span style="color: #009900;">&#40;</span>jstring<span style="color: #009900;">&#41;</span> JNIHandles<span style="color: #339933;">::</span><span style="color: #202020;">make_local</span><span style="color: #009900;">&#40;</span>env<span style="color: #339933;">,</span> result<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
JVM_END
&nbsp;
oop StringTable<span style="color: #339933;">::</span><span style="color: #202020;">intern</span><span style="color: #009900;">&#40;</span>Handle string_or_null<span style="color: #339933;">,</span> jchar<span style="color: #339933;">*</span> name<span style="color: #339933;">,</span>
                        <span style="color: #993333;">int</span> len<span style="color: #339933;">,</span> TRAPS<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">unsigned</span> <span style="color: #993333;">int</span> hashValue <span style="color: #339933;">=</span> hash_string<span style="color: #009900;">&#40;</span>name<span style="color: #339933;">,</span> len<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #993333;">int</span> index <span style="color: #339933;">=</span> the_table<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-&gt;</span>hash_to_index<span style="color: #009900;">&#40;</span>hashValue<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  oop string <span style="color: #339933;">=</span> the_table<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-&gt;</span>lookup<span style="color: #009900;">&#40;</span>index<span style="color: #339933;">,</span> name<span style="color: #339933;">,</span> len<span style="color: #339933;">,</span> hashValue<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">// Found</span>
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>string <span style="color: #339933;">!=</span> NULL<span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">return</span> string<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">// Otherwise, add to symbol to table</span>
  <span style="color: #b1b100;">return</span> the_table<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-&gt;</span>basic_add<span style="color: #009900;">&#40;</span>index<span style="color: #339933;">,</span> string_or_null<span style="color: #339933;">,</span> name<span style="color: #339933;">,</span> len<span style="color: #339933;">,</span>
                                hashValue<span style="color: #339933;">,</span> CHECK_NULL<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>As a result, this specific instance of a String only exists once.</p>
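<p>The lookup-then-add logic of the native code above can be mimicked in plain Java. This is a simplified sketch with hypothetical names, not how the JVM implements it: the real StringTable is native, lives outside this kind of map and lets unused entries be collected, whereas this pool holds strong references forever.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical user-level pool that mirrors lookup() and basic_add().
final class StringPoolSketch {
    private final Map<String, String> table = new HashMap<>();

    // Returns the pooled instance with the same content, adding the candidate if absent.
    String intern(String candidate) {
        String existing = table.get(candidate); // lookup by hash and content
        if (existing != null) {
            return existing;                    // found
        }
        table.put(candidate, candidate);        // otherwise add to the table
        return candidate;
    }

    int size() {
        return table.size();
    }

    public static void main(String[] args) {
        StringPoolSketch pool = new StringPoolSketch();
        String first = pool.intern(new String("NRW"));
        String second = pool.intern(new String("NRW"));
        System.out.println(first == second); // true
        System.out.println(pool.size());     // 1
    }
}
```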
<h3>How to use interning</h3>
<p>The right place to use String interning is where you read from a data store and add the Objects/Strings to a larger scope. Note that all Strings which are hardcoded (as constants or as literals anywhere in the code) are interned automatically.<br />
An example would be:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java" style="font-family:monospace;"><span style="color: #003399;">String</span> city <span style="color: #339933;">=</span> resultSet.<span style="color: #006633;">getString</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #003399;">String</span> region <span style="color: #339933;">=</span> resultSet.<span style="color: #006633;">getString</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #003399;">String</span> countryCode <span style="color: #339933;">=</span> resultSet.<span style="color: #006633;">getString</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000066; font-weight: bold;">double</span> lon <span style="color: #339933;">=</span> resultSet.<span style="color: #006633;">getDouble</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000066; font-weight: bold;">double</span> lat <span style="color: #339933;">=</span> resultSet.<span style="color: #006633;">getDouble</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">5</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
Location location <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Location<span style="color: #009900;">&#40;</span>city.<span style="color: #006633;">intern</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, region.<span style="color: #006633;">intern</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, countryCode.<span style="color: #006633;">intern</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, lon, lat<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
allLocations.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span>location<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>All newly created location objects will use the interned strings. The temporary strings read from the database will be garbage collected.</p>
<h3>How to find out how effective string interning would be</h3>
<p>You are best off taking a heap dump of a quite full heap. It can even be collected at an OutOfMemoryError.<br />
Open it in MAT and select the <code>java.lang.String</code> from the histogram. On that one use &#8220;Java Basics&#8221; and &#8220;Group By Value&#8221;<br />
<img src="http://blog.codecentric.de/files/2012/03/Mat-find-duplicate-strings.png" alt="" title="Mat-find-duplicate-strings" width="752" height="475" class="aligncenter size-full wp-image-12303" /><br />
Depending on the heap size, this may take a long time. In the end the result will look like this. Sorting by either retained heap or number of objects will reveal interesting things:<br />
<img src="http://blog.codecentric.de/files/2012/03/Mat-duplicate-strings.png" alt="" title="Mat-duplicate-strings" width="728" height="476" class="aligncenter size-full wp-image-12302" /><br />
From this screenshot we can see that empty Strings take a lot of memory! 2 million empty Strings take a total of 130MB. After that we see some JavaScript that was loaded and a few more technical Strings, such as the keys used for localization. And we can see some Strings related to business logic.<br />
These business logic Strings are probably the easiest to intern, because we are likely to know where they are loaded into memory.<br />
For the other ones we need to use &#8220;Merge Shortest Paths to GC Roots&#8221; to identify where they are stored, which may or may not reveal to us where they are created.</p>
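<p>What &#8220;Group By Value&#8221; computes can be sketched in a few lines of Java. This is only an illustration of the idea on an in-memory list, not how MAT works internally; the class and method names are hypothetical.</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateStrings {
    /** Counts how many times each distinct value occurs. */
    static Map<String, Integer> groupByValue(List<String> values) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String value : values) {
            Integer old = counts.get(value);
            counts.put(value, old == null ? 1 : old + 1);
        }
        return counts;
    }

    /** Number of String objects that would become unnecessary if every value existed only once. */
    static int redundantCopies(List<String> values) {
        return values.size() - groupByValue(values).size();
    }

    public static void main(String[] args) {
        // Made-up sample: two empty Strings, two country codes, one unique value.
        List<String> sample = Arrays.asList("", "", "DE", "DE", "US");
        System.out.println(groupByValue(sample));
        System.out.println(redundantCopies(sample) + " redundant copies");
    }
}
```

<p>Values with a high count are the candidates worth interning; values that occur only once gain nothing.</p>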
<h3>Tradeoffs</h3>
<p>So why not always do String interning? Because it slows your code down!<br />
Here is a small example:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java5" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #006600; font-weight: bold;">int</span> MAX = <span style="color: #cc66cc;">40000000</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #006600; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399; font-weight: bold;">Exception</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #006600; font-weight: bold;">long</span> t = <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> arr = <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399; font-weight: bold;">String</span><span style="color: #009900;">&#91;</span>MAX<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000000;  font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #006600; font-weight: bold;">int</span> i = <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> MAX<span style="color: #339933;">;</span> i++<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		arr<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> = <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399; font-weight: bold;">String</span><span style="color: #009900;">&#40;</span>DB_DATA<span style="color: #009900;">&#91;</span>i <span style="color: #339933;">%</span> <span style="color: #cc66cc;">10</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #666666; font-style: italic;">// and: arr[i] = new String(DB_DATA[i % 10]).intern();</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> - t<span style="color: #009900;">&#41;</span> + <span style="color: #0000ff;">&quot;ms&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">gc</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>arr<span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>The code uses a String array to keep strong references to the String objects, and we print the first element at the end so that the optimizer cannot remove the structure. The loop cycles through 10 different Strings, which stand in for values loaded from the database. I used <code>new String()</code> here to illustrate the temporary String allocation that always happens when you read from storage. At the end I trigger a GC, so that the reported heap usage is correct and contains no leftovers.<br />
This was run on 64-bit Windows with JDK 1.6.0_27 on an i5-2520M with 8GB RAM, using <code>-XX:+PrintGCDetails -Xmx6G -Xmn3G</code> to log all GC activity. Here is the output:<br />
<b>Without intern()</b></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;">1519ms
<span style="color: #7a0874; font-weight: bold;">&#91;</span>GC <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSYoungGen: 2359296K-<span style="color: #000000; font-weight: bold;">&gt;</span>393210K<span style="color: #7a0874; font-weight: bold;">&#40;</span>2752512K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span> 2359296K-<span style="color: #000000; font-weight: bold;">&gt;</span>2348002K<span style="color: #7a0874; font-weight: bold;">&#40;</span>4707456K<span style="color: #7a0874; font-weight: bold;">&#41;</span>, <span style="color: #000000;">5.4071058</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>Times: <span style="color: #007800;">user</span>=<span style="color: #000000;">8.84</span> <span style="color: #007800;">sys</span>=<span style="color: #000000;">1.00</span>, <span style="color: #007800;">real</span>=<span style="color: #000000;">5.40</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> 
<span style="color: #7a0874; font-weight: bold;">&#91;</span>Full GC <span style="color: #7a0874; font-weight: bold;">&#40;</span>System<span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSYoungGen: 393210K-<span style="color: #000000; font-weight: bold;">&gt;</span>392902K<span style="color: #7a0874; font-weight: bold;">&#40;</span>2752512K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSOldGen: 1954792K-<span style="color: #000000; font-weight: bold;">&gt;</span>1954823K<span style="color: #7a0874; font-weight: bold;">&#40;</span>1954944K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span> 2348002K-<span style="color: #000000; font-weight: bold;">&gt;</span>2347726K<span style="color: #7a0874; font-weight: bold;">&#40;</span>4707456K<span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSPermGen: 2707K-<span style="color: #000000; font-weight: bold;">&gt;</span>2707K<span style="color: #7a0874; font-weight: bold;">&#40;</span>21248K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>, <span style="color: #000000;">5.3242785</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>Times: <span style="color: #007800;">user</span>=<span style="color: #000000;">3.71</span> <span style="color: #007800;">sys</span>=<span style="color: #000000;">0.20</span>, <span style="color: #007800;">real</span>=<span style="color: #000000;">5.32</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> 
DE
Heap
 PSYoungGen      total 2752512K, used 440088K <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000740000000, 0x0000000800000000, 0x0000000800000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  eden space 2359296K, <span style="color: #000000;">18</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000740000000,0x000000075adc6360,0x00000007d0000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  from space 393216K, <span style="color: #000000;">0</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x00000007d0000000,0x00000007d0000000,0x00000007e8000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  to   space 393216K, <span style="color: #000000;">0</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x00000007e8000000,0x00000007e8000000,0x0000000800000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
 PSOldGen        total 1954944K, used 1954823K <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000680000000, 0x00000006f7520000, 0x0000000740000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  object space 1954944K, <span style="color: #000000;">99</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000680000000,0x00000006f7501fd8,0x00000006f7520000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
 PSPermGen       total 21248K, used 2724K <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x000000067ae00000, 0x000000067c2c0000, 0x0000000680000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  object space 21248K, <span style="color: #000000;">12</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x000000067ae00000,0x000000067b0a93e0,0x000000067c2c0000<span style="color: #7a0874; font-weight: bold;">&#41;</span></pre></td></tr></table></div>

<p><b>With intern()</b></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;">4838ms
<span style="color: #7a0874; font-weight: bold;">&#91;</span>GC <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSYoungGen: 2359296K-<span style="color: #000000; font-weight: bold;">&gt;</span>156506K<span style="color: #7a0874; font-weight: bold;">&#40;</span>2752512K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span> 2359296K-<span style="color: #000000; font-weight: bold;">&gt;</span>156506K<span style="color: #7a0874; font-weight: bold;">&#40;</span>2757888K<span style="color: #7a0874; font-weight: bold;">&#41;</span>, <span style="color: #000000;">0.1962062</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>Times: <span style="color: #007800;">user</span>=<span style="color: #000000;">0.69</span> <span style="color: #007800;">sys</span>=<span style="color: #000000;">0.01</span>, <span style="color: #007800;">real</span>=<span style="color: #000000;">0.20</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> 
<span style="color: #7a0874; font-weight: bold;">&#91;</span>Full GC <span style="color: #7a0874; font-weight: bold;">&#40;</span>System<span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSYoungGen: 156506K-<span style="color: #000000; font-weight: bold;">&gt;</span>156357K<span style="color: #7a0874; font-weight: bold;">&#40;</span>2752512K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSOldGen: 0K-<span style="color: #000000; font-weight: bold;">&gt;</span>18K<span style="color: #7a0874; font-weight: bold;">&#40;</span>5376K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span> 156506K-<span style="color: #000000; font-weight: bold;">&gt;</span>156376K<span style="color: #7a0874; font-weight: bold;">&#40;</span>2757888K<span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>PSPermGen: 2708K-<span style="color: #000000; font-weight: bold;">&gt;</span>2708K<span style="color: #7a0874; font-weight: bold;">&#40;</span>21248K<span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>, <span style="color: #000000;">0.2576126</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>Times: <span style="color: #007800;">user</span>=<span style="color: #000000;">0.25</span> <span style="color: #007800;">sys</span>=<span style="color: #000000;">0.00</span>, <span style="color: #007800;">real</span>=<span style="color: #000000;">0.26</span> secs<span style="color: #7a0874; font-weight: bold;">&#93;</span> 
DE
Heap
 PSYoungGen      total 2752512K, used 250729K <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000740000000, 0x0000000800000000, 0x0000000800000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  eden space 2359296K, <span style="color: #000000;">10</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000740000000,0x000000074f4da6f8,0x00000007d0000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  from space 393216K, <span style="color: #000000;">0</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x00000007d0000000,0x00000007d0000000,0x00000007e8000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  to   space 393216K, <span style="color: #000000;">0</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x00000007e8000000,0x00000007e8000000,0x0000000800000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
 PSOldGen        total 5376K, used 18K <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000680000000, 0x0000000680540000, 0x0000000740000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  object space 5376K, <span style="color: #000000;">0</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x0000000680000000,0x0000000680004b30,0x0000000680540000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
 PSPermGen       total 21248K, used 2725K <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x000000067ae00000, 0x000000067c2c0000, 0x0000000680000000<span style="color: #7a0874; font-weight: bold;">&#41;</span>
  object space 21248K, <span style="color: #000000;">12</span><span style="color: #000000; font-weight: bold;">%</span> used <span style="color: #7a0874; font-weight: bold;">&#91;</span>0x000000067ae00000,0x000000067b0a95d0,0x000000067c2c0000<span style="color: #7a0874; font-weight: bold;">&#41;</span></pre></td></tr></table></div>

<p>We can see that the difference is significant. The run with interning took 3.3 seconds longer, but the memory saved was enormous. When the code finished, the program using interning occupied 253472K (about 250MB) of memory, while the other one occupied 2397635K (about 2.4GB). That is quite a difference and nicely illustrates the tradeoff of String interning.</p>
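<p>If you want to repeat the experiment, the following is a self-contained variant of the benchmark. It is a sketch under assumptions: the contents of <code>DB_DATA</code> are invented country codes, the default size is much smaller than the 40 million used above, and <code>distinctInstances</code> is a helper added here to show how many String objects survive. Timings from a single run like this are only indicative, because the second pass benefits from JIT warm-up.</p>

```java
import java.util.IdentityHashMap;
import java.util.Map;

class InternBenchmark {
    // Hypothetical stand-in for ten different values read from a database.
    static final String[] DB_DATA = {"DE", "US", "FR", "GB", "IT", "ES", "NL", "PL", "SE", "CH"};

    /** Fills an array with freshly allocated Strings, optionally interning each one. */
    static String[] fill(int max, boolean intern) {
        String[] arr = new String[max];
        for (int i = 0; i < max; i++) {
            String temp = new String(DB_DATA[i % DB_DATA.length]); // temporary copy, as when reading from storage
            arr[i] = intern ? temp.intern() : temp;
        }
        return arr;
    }

    /** Counts distinct objects by identity, not by equals(). */
    static int distinctInstances(String[] arr) {
        Map<String, Boolean> seen = new IdentityHashMap<String, Boolean>();
        for (String s : arr) {
            seen.put(s, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int max = args.length > 0 ? Integer.parseInt(args[0]) : 1000000;
        for (boolean intern : new boolean[] {false, true}) {
            long start = System.nanoTime();
            String[] arr = fill(max, intern);
            long millis = (System.nanoTime() - start) / 1000000;
            System.out.println((intern ? "with" : "without") + " intern(): " + millis + "ms, "
                    + distinctInstances(arr) + " distinct String objects");
        }
    }
}
```

<p>Pass the array size as the first argument and add <code>-XX:+PrintGCDetails</code> to compare the heap figures with the ones above.</p>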
]]></content:encoded>
			<wfw:commentRss>http://blog.codecentric.de/en/2012/03/save-memory-by-using-string-intern-in-java/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Analyzing Production Outage Caused by Weblogic Compiler Threads</title>
		<link>http://blog.codecentric.de/en/2012/02/analyzing-production-outage-caused-by-weblogic-compiler-threads/</link>
		<comments>http://blog.codecentric.de/en/2012/02/analyzing-production-outage-caused-by-weblogic-compiler-threads/#comments</comments>
		<pubDate>Tue, 07 Feb 2012 17:19:35 +0000</pubDate>
		<dc:creator>Fabian Lange</dc:creator>
				<category><![CDATA[Performance @en]]></category>

		<guid isPermaLink="false">http://blog.codecentric.de/?p=11514</guid>
		<description><![CDATA[I was at a customer recently to further improve their application monitoring for which they were using AppDynamics. When I arrived, they told me: &#8220;Good that you came Fabian, we have something interesting to show&#8221;. Usually operations guys are very &#8230; <a href="http://blog.codecentric.de/en/2012/02/analyzing-production-outage-caused-by-weblogic-compiler-threads/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I was at a customer recently to further improve their application monitoring, for which they were using AppDynamics. When I arrived, they told me:<br />
&#8220;Good that you came Fabian, we have something interesting to show&#8221;. Usually operations guys are very concerned about something when they say this to me, but in this case, they were happy. &#8220;We had a big outage when we took our new software live on Saturday!&#8221;. Ooops?? Nobody likes a production outage on a Saturday! Why were they so happy about it? &#8220;Well it was a long Saturday, but thanks to our monitoring we knew what was going on&#8221;.</p>
<p>So let me walk you through what they did, and show you the problem that killed their server.</p>
<h2>An Overview of the Situation</h2>
<p>Here is what the situation looked like on Saturday at 8 in the morning.<br />
<img src="http://blog.codecentric.de/files/2012/02/thread-pool-exhausted.png" alt="" title="thread-pool-exhausted" width="700" height="475" class="aligncenter size-full wp-image-11491" style="border: 1px solid black" /><br />
<span id="more-11514"></span><br />
If you don&#8217;t know AppDynamics, here is a short introduction:</p>
<ul>
<li>The biggest area is the application map. AppDynamics autodiscovers and monitors all systems and their interaction.</li>
<li>On the right side there are some statistics. We can see some Stalls and Abnormal Slow requests. AppDynamics found out that something is not right.</li>
<li>On the bottom is the historical view. Because this is historical data now, we only see hourly data. During the incident, the data was more fine grained.</li>
</ul>
<p>The historical data is of most interest for us, looking back.<br />
We can see the server start on the left hand side (blue icons on top of green bar). The load on the system climbed rapidly (green bar). But then the system broke down. The response time increased massively (blue bar). And requests decreased. Soon the server crashed and had to be restarted.<br />
But the restart did not help, the system did not recover. Luckily it did later after the problem was fixed <img src='http://blog.codecentric.de/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<h2>What went wrong</h2>
<p>The system crashed with an OutOfMemoryError. In our <a href="http://blog.codecentric.de/en/2010/01/java-outofmemoryerror-a-tragedy-in-seven-acts/">OutOfMemoryError Series</a>, we already discussed some of them, and here is another: &#8220;unable to create new native thread&#8221;.</p>
<p>This error is interesting in itself. Outages with this kind of error are usually not caused by inefficient code, but by either infrastructure problems or buggy threading code. So let&#8217;s look at the number of threads on each server recorded by AppDynamics:</p>
<p><img src="http://blog.codecentric.de/files/2012/02/thread-pool-exhausted-2.png" alt="" title="thread-pool-exhausted-2" width="700" height="433" class="aligncenter size-full wp-image-11489" /></p>
<p>Ouch. When it crashed, the system had already created 1800 threads.<br />
What was producing these threads? I have seen this before at other customers, so I knew immediately. My client was also able to find out by looking at a few snapshots of stalled calls: JSP compilation was the culprit.<br />
All application servers can compile JSPs either on startup or when they are requested for the first time. But both settings are problematic when the server is started under high load.<br />
The problem was resolved on Sunday morning at 4AM. But the ops team did not have to stay awake for those 20 hours. They had reported the well-qualified issue to Oracle around noon on Saturday. The &#8220;fix&#8221; provided by Oracle was simple, but not documented anywhere. Or at least not where you can find it if you do not already know it.<br />
Here is the screenshot from AppDynamics discovering the property change and restart at 4AM.<br />
<img src="http://blog.codecentric.de/files/2012/02/thread-pool-exhausted-3.png" alt="" title="thread-pool-exhausted-3" width="601" height="419" class="aligncenter size-full wp-image-11490" /></p>
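<p>If no monitoring tool is at hand, the thread count charted above can also be sampled from inside the JVM through the standard <code>java.lang.management</code> API. The following is a minimal sketch; the class name <code>ThreadCountProbe</code> and the warning threshold are assumptions for illustration and have nothing to do with how AppDynamics collects its data.</p>

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class ThreadCountProbe {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Current number of live threads, daemon and non-daemon. */
    static int liveThreads() {
        return THREADS.getThreadCount();
    }

    /** Highest live thread count since the JVM started or the peak was last reset. */
    static int peakThreads() {
        return THREADS.getPeakThreadCount();
    }

    /** True if the live thread count is above the given (hypothetical) warning threshold. */
    static boolean exceeds(int threshold) {
        return liveThreads() > threshold;
    }

    public static void main(String[] args) {
        System.out.println("live threads: " + liveThreads());
        System.out.println("peak threads: " + peakThreads());
        // 1000 is an arbitrary example value; choose one that fits the server.
        System.out.println("above 1000 threads: " + exceeds(1000));
    }
}
```

<p>Logging these values periodically would have shown the climb towards 1800 threads well before the server ran out of native threads.</p>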
<h2>Limiting JSP Compiler Threads on Weblogic</h2>
<p>The &#8220;secret&#8221; switch was an environment variable you have to set:</p>
<pre>
BEA_COMPILER_NUM_THREADS = 1
</pre>
<p>While 1 is certainly very pessimistic, it seems to be a far better setting than unlimited, which is the default. No one should ever create an unlimited number of threads. In the classical view, you should only have as many threads as you have cores. Modern JVMs can easily handle about ten times that many threads without much starvation. But in this case we had about 100 compiler threads per CPU.</p>
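<p>In your own code the same rule is easy to follow with a fixed-size pool. The sketch below is not related to the Weblogic compiler; it only illustrates bounding concurrency. The class name, the sleep that simulates work, and sizing the pool by <code>availableProcessors()</code> are assumptions.</p>

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedCompilerPool {
    /** Runs the given number of simulated jobs on at most maxThreads threads and returns the peak concurrency seen. */
    static int runBounded(int tasks, int maxThreads) {
        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    int now = running.incrementAndGet();
                    // Remember the highest number of jobs that ran at the same time.
                    int seen = peak.get();
                    while (now > seen && !peak.compareAndSet(seen, now)) {
                        seen = peak.get();
                    }
                    try {
                        Thread.sleep(5); // simulated work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("peak concurrency with " + cores + " threads: " + runBounded(200, cores));
    }
}
```

<p>Excess jobs wait in the pool&#8217;s queue instead of each getting a thread of its own, so a burst of work makes requests slower but does not exhaust native threads.</p>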
]]></content:encoded>
			<wfw:commentRss>http://blog.codecentric.de/en/2012/02/analyzing-production-outage-caused-by-weblogic-compiler-threads/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Find Memory Leaks at Runtime &#8211; Addendum</title>
		<link>http://blog.codecentric.de/en/2012/02/find-memory-leaks-at-runtime-addendum/</link>
		<comments>http://blog.codecentric.de/en/2012/02/find-memory-leaks-at-runtime-addendum/#comments</comments>
		<pubDate>Fri, 03 Feb 2012 11:18:12 +0000</pubDate>
		<dc:creator>Fabian Lange</dc:creator>
				<category><![CDATA[Performance @en]]></category>

		<guid isPermaLink="false">http://blog.codecentric.de/?p=11284</guid>
		<description><![CDATA[In Act 5 of our OutOfMemoryError series, I talked about the lack of tool support for finding memory leaks at runtime. I got some negative feedback on dzone, because the tool I discussed was commercial. Shortly after I published that &#8230; <a href="http://blog.codecentric.de/en/2012/02/find-memory-leaks-at-runtime-addendum/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In <a href="http://blog.codecentric.de/en/2012/01/find-java-memory-leaks-at-runtime-act-5/">Act 5 of our OutOfMemoryError series</a>, I talked about the lack of tool support for finding memory leaks at runtime. I got some negative feedback on dzone, because the tool I discussed was commercial <img src='http://blog.codecentric.de/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /><br />
Shortly after I published that post, I got notified of a product called &#8220;<a href="http://www.plumbr.eu/">plumbr</a>&#8221;. Plumbr is in beta phase, which means it is free at the moment, and they even have a <a href="http://www.plumbr.eu/blog/how-much-should-our-product-cost">public pricing discussion</a>. But we still have to accept that there is no free solution. It&#8217;s not my fault, so don&#8217;t be angry <img src='http://blog.codecentric.de/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /><br />
In this short update, I will show you how plumbr finds the very same memory leak that AppDynamics found in my last article.<br />
<span id="more-11284"></span></p>
<h3>Installation</h3>
<p>Plumbr is as easy to install as AppDynamics. The downloaded package contains files for various platforms, which indicates that plumbr uses a JVMTI native agent in addition to the javaagent. After unzipping, I needed to add this to my Tomcat 7, which will be running the memory leak app:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;">SET <span style="color: #007800;">PATH</span>=C:\Tools\plumbr\plumbr\win\<span style="color: #000000;">64</span>;<span style="color: #000000; font-weight: bold;">%</span>PATH<span style="color: #000000; font-weight: bold;">%</span>
<span style="color: #000000; font-weight: bold;">set</span> <span style="color: #007800;">CATALINA_OPTS</span>=-agentlib:plumbr -javaagent:C:\Tools\plumbr\plumbr\plumbr.jar</pre></td></tr></table></div>

<h3>Finding Memory Leaks Fast</h3>
<p>Plumbr works very fast. I deployed my leaking application and started the load script. After only a few minutes this message appeared:</p>
<p>
<img src="http://blog.codecentric.de/files/2012/01/plumbr-found-leak.png" alt="" title="plumbr-found-leak" width="677" height="234" class="aligncenter size-full wp-image-11288" />
</p>
<p>So the only thing I needed to do was to open the generated report:</p>
<p>
<img src="http://blog.codecentric.de/files/2012/01/plumbr-found-leak-2.png" alt="" title="plumbr-found-leak-2" width="700" height="748" class="aligncenter size-full wp-image-11292" />
</p>
<p>Tadaa! The leak was correctly identified. And it also lists where the leak comes from, which is very important:</p>
<pre>
The objects are created at
  de.codecentric.memleak.leak.StringQueueLeak.leak(de.codecentric.memleak.domain.Bookmark):58
</pre>
<p>When compared to AppDynamics, this is slightly less information, because the call stack is shorter. And it also does not give invocation counts. However this is still great information, which helps us to identify and fix the leak. And it is presented in a dead simple way.</p>
<h3>Further improvements</h3>
<p>You might wonder why 3 leaks are identified, while AppDynamics showed only one of them. The second and third are classes used by the Derby database:</p>
<p>
<img src="http://blog.codecentric.de/files/2012/01/plumbr-finalizer.png" alt="" title="plumbr-finalizer" width="700" height="460" class="aligncenter size-full wp-image-11293" />
</p>
<p>So plumbr currently assumes that a reference from <a href="http://javasourcecode.org/html/open-source/jdk/jdk-6u23/java/lang/ref/Finalizer.java.html">java.lang.ref.Finalizer</a> is a memory leak. There is some controversy (for example on <a href="http://stackoverflow.com/questions/8355064/is-memory-leak-why-java-lang-ref-finalizer-eat-so-much-memory">stackoverflow</a>) about whether these are actual leaks. So I mailed this observation to the plumbr developers and got a quick answer: while they think Finalizers are problematic, they want to change the handling so that false positives become less likely.</p>
<p>I think plumbr is a really interesting solution. It solves the very specific problem of finding memory leaks in an elegant way. It reports fast and did not add significant overhead to my application, but I still have to try it in a real application server environment before I can make an educated statement.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.codecentric.de/en/2012/02/find-memory-leaks-at-runtime-addendum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Find Java Memory Leaks at Runtime (Act 5)</title>
		<link>http://blog.codecentric.de/en/2012/01/find-java-memory-leaks-at-runtime-act-5/</link>
		<comments>http://blog.codecentric.de/en/2012/01/find-java-memory-leaks-at-runtime-act-5/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 20:30:49 +0000</pubDate>
		<dc:creator>Fabian Lange</dc:creator>
				<category><![CDATA[Performance @en]]></category>

		<guid isPermaLink="false">http://blog.codecentric.de/?p=10800</guid>
		<description><![CDATA[Act 4 of our series on OutOfMemoryError closed with the promise of better approaches to find memory leaks. We explained that, while we can find big objects in heap dumps,  they only in case of an OutOfMemoryError give us the &#8230; <a href="http://blog.codecentric.de/en/2012/01/find-java-memory-leaks-at-runtime-act-5/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><a href="http://blog.codecentric.de/2011/08/java-heapdumps-erzeugen-und-verstehen-4-akt/">Act 4 of our series on OutOfMemoryError</a> closed with the promise of better approaches to find memory leaks. We explained that, while we can find big objects in heap dumps, they only give us an indication of a leak in the case of an OutOfMemoryError. To have a chance of finding something during a post-mortem analysis, one should always use the JVM parameter <code>-XX:+HeapDumpOnOutOfMemoryError</code>.</p>
<p>But not all leaks will cause an OutOfMemoryError and produce a dump, or it would take a very long time for one to occur. For example, the server and the JVM might be restarted at regular intervals for deployments or to fight memory issues.</p>
<p>To find slowly growing memory leaks, we have to perform more complicated and time-consuming analysis. We could use multiple dumps, which are spread out over time. While they theoretically would allow us to recognize growing structures, it would be tedious in practice, because the difference between multiple dumps will be mostly normal fluctuation, which makes it difficult to spot the relevant delta. You also have to know in advance which use cases produce the memory leak, so that you can invoke them between dumps. But perhaps the biggest problem of all is that creating a dump in production is not advisable, because it can hang the system for seconds to minutes, depending on the heap size.</p>
<p>A better solution is to monitor the heap and the relevant objects within during the runtime of our application. By doing so we can track every structure and get notified when something keeps on growing over time. And because the application is still running fine, we can also easily get the information about the code which is interacting with the leak. This is not possible at all using heap dumps, as they do not contain information about code.</p>
<p><span id="more-10800"></span></p>
<h2>The concept</h2>
<p>The pattern for runtime analysis of the heap is pretty simple:</p>
<ul>
<li>Find all objects which are created by the application.</li>
<li>Track those objects and record their size.</li>
<li>Alert on any &#8220;abnormal&#8221; behavior.</li>
<li>Provide content and invoking code for diagnosis.</li>
</ul>
<p>Unfortunately, every one of those points brings a lot of issues in practice. As a result, there are only a few implementations of this concept. Even finding all the relevant objects is not an easy task. In Act 4 I recommended focusing on our own packages, like <code>de.codecentric.memoryleak</code>. But what do we do when standard classes leak? While the number of objects in a demo application might be manageable, real applications have millions of objects. How can we ever efficiently store data on those complex structures? And what is &#8220;abnormal&#8221; behavior? Are there sizes and lifetimes of objects that we can consider &#8220;normal&#8221;?</p>
<h2>An Implementation</h2>
<p>As an example of this concept, I am going to showcase the <a>Leak Detection Feature</a> of the <a href="http://www.appdynamics.com/">APM solution AppDynamics</a>. The only other leak detection implementation I am currently aware of that does not use heap dumps is the <a href="http://www.ca.com/de/application-management.aspx">Introscope Leak Hunter</a>. Should you know a tool which works in a similar way, I would be happy to learn about it in the comments!</p>
<h3>Assumptions</h3>
<p>As you can guess, a solution as outlined above cannot realistically be implemented as is. We need to simplify the problem using a few assumptions. Luckily, there are quite a few assumptions you can make for any Java program. For example, the typical age distribution of objects is used by garbage collectors to work with different generations, as I described in Act 3.</p>
<p>AppDynamics makes the following assumptions:</p>
<ul>
<li>There is no need to monitor all objects. Experience shows us that most memory leaks are caused by putting data into collection type structures, like lists and maps, but not removing the data from there later. Custom cache implementations are a very typical example. Because of that, AppDynamics monitors just those classes.</li>
<li>We do not need to look at collections that are not used, like internal structures created by the application server on startup. We just need the structures our code interacts with.</li>
<li>From those active collections, we just need to monitor those which contain a relevant number of objects. And because we look for leaks, that number has to increase over time.</li>
<li>Those long and active collections could be leaks, but to become a relevant problem for the stability of our code, that collection has to dominate a significant amount of memory.</li>
<li>All those factors apply over a longer period of time.</li>
</ul>
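<p>The assumptions above can be turned into a small heuristic. The following is only a minimal sketch and not how AppDynamics actually works: the class name <code>GrowthHeuristic</code>, the size threshold and the sampling window are all assumptions made for illustration. It flags a collection once a full window of size samples has been collected, the latest sample is large, and the size never shrank within the window.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: decides whether a series of size samples looks like a leak. */
class GrowthHeuristic {
    private final int minSize;        // collections smaller than this are ignored
    private final int windowSamples;  // number of samples needed before judging
    private final Deque<Integer> samples = new ArrayDeque<>();

    GrowthHeuristic(int minSize, int windowSamples) {
        if (windowSamples < 2) {
            throw new IllegalArgumentException("need at least two samples to see growth");
        }
        this.minSize = minSize;
        this.windowSamples = windowSamples;
    }

    /** Records the collection size observed at one sampling point. */
    void sample(int size) {
        samples.addLast(size);
        if (samples.size() > windowSamples) {
            samples.removeFirst(); // keep only the most recent window
        }
    }

    /** True if the window is full, the collection is large, and it only grew. */
    boolean potentiallyLeaking() {
        if (samples.size() < windowSamples || samples.peekLast() < minSize) {
            return false;
        }
        int previous = -1;
        for (int size : samples) {
            if (size < previous) {
                return false; // the collection shrank, so something removes entries
            }
            previous = size;
        }
        return samples.peekLast() > samples.peekFirst();
    }
}
```

<p>A monitoring thread would call <code>sample()</code> every few minutes for each tracked collection and report those for which <code>potentiallyLeaking()</code> returns true.</p>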
<p>AppDynamics uses a process along these lines to find leaks, which minimizes the impact on the monitored JVM. Additionally, AppDynamics uses elaborate algorithms to calculate object tree sizes efficiently with very low overhead, even under high load. Nevertheless, memory analysis always comes with additional overhead.</p>
<h2>My Open Source Collection Analyzer</h2>
<p>Because AppDynamics is a commercial solution, I wanted to have my own shot at implementing such a memory leak finder.<br />
You can find my version of a basic <a href="https://github.com/CodingFabian/JavaCollectionAnalyzer">Java Memory Analyzer</a> on GitHub.</p>
<p>The fundamental idea is surprisingly easy to implement. My analyzer is coded using just two classes.</p>
<h3>CollectionAnalyzerAspect</h3>
<p>I make similar assumptions to AppDynamics and just watch collections. If I wanted, I could add any other possibly leaking classes here:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java5" style="font-family:monospace;">@Before<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;call(* java.util.Map.put(..)) &amp;&amp; &quot;</span>
        <span style="color: #339933;">+</span> <span style="color: #0000ff;">&quot;!this(de.codecentric.performance.memory.CollectionAnalyzerAspect)&quot;</span><span style="color: #009900;">&#41;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #006600; font-weight: bold;">void</span> trackMapPuts<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> JoinPoint thisJoinPoint<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #003399; font-weight: bold;">Map</span> target = <span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">Map</span><span style="color: #009900;">&#41;</span> thisJoinPoint.<span style="color: #006633;">getTarget</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	CollectionStatistics stats = getStatistics<span style="color: #009900;">&#40;</span>target<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	stats.<span style="color: #006633;">recordWrite</span><span style="color: #009900;">&#40;</span>getLocation<span style="color: #009900;">&#40;</span>thisJoinPoint<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	stats.<span style="color: #006633;">evaluate</span><span style="color: #009900;">&#40;</span>target.<span style="color: #006633;">size</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>This pointcut runs my code before every call to <code>Map.put()</code>. Because I use a map myself to store the statistical data, I need to exclude my own aspect to avoid a nasty recursion. Next, I get the statistics storage object for the collection instance being monitored, record the access and evaluate its usage.<br />
This is a simplistic approach. I think it would be much better to evaluate all statistics periodically in a separate thread than to do this synchronously on every call.<br />
There is one additional interesting problem: how can I identify the collection I am currently inspecting? For that I am using the &#8220;identityHashCode&#8221;, but I already know that this might not be a wise idea, as it is not guaranteed to be unique.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java5" style="font-family:monospace;"><span style="color: #006600; font-weight: bold;">int</span> identityHashCode = <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">identityHashCode</span><span style="color: #009900;">&#40;</span>targetCollection<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>
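<p>One way around the uniqueness problem is to key the statistics by the collection instance itself, compared by reference, and to hold it only weakly so that the analyzer does not keep collections alive. The following is a sketch of that idea, not the code from my analyzer: <code>IdentityRegistry</code> and its nested <code>Key</code> class are hypothetical names. The identity hash code is still used for bucketing, but collisions are resolved with <code>==</code>.</p>

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry that maps a monitored object to a value by reference identity. */
class IdentityRegistry<V> {

    /** Weak key that compares referents with == instead of equals(). */
    private static final class Key extends WeakReference<Object> {
        private final int hash;

        Key(Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.hash = System.identityHashCode(referent);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Key)) {
                return false;
            }
            Object mine = get();
            return mine != null && mine == ((Key) other).get();
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Map<Key, V> entries = new HashMap<>();

    /** Returns the value registered for exactly this instance, or null. */
    synchronized V get(Object target) {
        expungeStaleEntries();
        return entries.get(new Key(target, null));
    }

    /** Registers a value for exactly this instance. */
    synchronized void put(Object target, V value) {
        expungeStaleEntries();
        entries.put(new Key(target, queue), value);
    }

    /** Drops entries whose collection has already been garbage collected. */
    private void expungeStaleEntries() {
        for (Object stale = queue.poll(); stale != null; stale = queue.poll()) {
            entries.remove(stale);
        }
    }
}
```

<p>With such a registry, two distinct lists get separate statistics even if they are equal according to <code>equals()</code> or happen to share an identity hash code.</p>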

<h3>CollectionStatistics</h3>
<p>Ok, I have recorded all the invocation counts for the methods. So what do I do with this data?</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="java5" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #006600; font-weight: bold;">void</span> evaluate<span style="color: #009900;">&#40;</span><span style="color: #006600; font-weight: bold;">int</span> size<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #000000;  font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>size <span style="color: #339933;">&gt;</span>= DANGEROUS_SIZE<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>Information for Collection %s (id: %d)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>, className, id<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot; * Collection is very long (%d)!<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>, size<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000000;  font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>reads == <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span>	<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot; * Collection was never read!<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000000;  font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>deletes == <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot; * Collection was never reduced!<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Recorded usage for this Collection:<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000000;  font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">String</span> code : interactingCode<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot; * %s<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>, code<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>I did not really decide on an elaborate criterion for when a collection behaves &#8220;abnormally&#8221;. I just took a hardcoded collection length. It would be a great idea to hold a WeakReference to the collection and calculate its dominator tree, but <a href="http://www.javaspecialists.eu/archive/Issue142.html">calculating the deep size</a> is a pretty complex problem on its own.</p>
<p>Besides the length, I consider two factors as interesting:</p>
<ul>
<li>Is this collection ever read?</li>
<li>Was something ever deleted from it?</li>
</ul>
<p>Both are typical antipatterns for caches. A long list that nobody reads or deletes from is a clear indicator of a problem. That is why I warn about it. Finally, I print all the recorded invoking code, which is pretty useful information!</p>
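<p>To illustrate what &#8220;recording the invoking code&#8221; can look like without AspectJ, here is a minimal sketch that derives the caller from a stack trace. It is not how my aspect obtains the location: <code>CallSiteRecorder</code> is a hypothetical helper, and taking a stack trace on every call is considerably more expensive than using a join point.</p>

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper that remembers which code locations touched a collection. */
class CallSiteRecorder {
    private final Set<String> interactingCode = new LinkedHashSet<>();

    /** Records the location of the direct caller, formatted as "ClassName:line". */
    synchronized void recordCaller() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is recordCaller() itself, stack[1] is its direct caller
        if (stack.length > 1) {
            interactingCode.add(stack[1].getClassName() + ":" + stack[1].getLineNumber());
        }
    }

    /** Returns a copy of all distinct locations seen so far, in first-seen order. */
    synchronized Set<String> locations() {
        return new LinkedHashSet<>(interactingCode);
    }
}
```

<p>Because the locations are kept in a set, a loop that writes to the collection a million times still produces a single entry.</p>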
<h3>A Test Run</h3>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;">Information <span style="color: #000000; font-weight: bold;">for</span> Collection java.util.ArrayList <span style="color: #7a0874; font-weight: bold;">&#40;</span>id: <span style="color: #000000;">1813612981</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection is very long <span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #000000;">5000</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #000000; font-weight: bold;">!</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection was never reduced<span style="color: #000000; font-weight: bold;">!</span>
Recorded usage <span style="color: #000000; font-weight: bold;">for</span> this Collection:
 <span style="color: #000000; font-weight: bold;">*</span> de.codecentric.performance.LeakDemo:<span style="color: #000000;">19</span>
 <span style="color: #000000; font-weight: bold;">*</span> de.codecentric.performance.LeakDemo:<span style="color: #000000;">17</span>
 <span style="color: #000000; font-weight: bold;">*</span> de.codecentric.performance.LeakDemo:<span style="color: #000000;">18</span>
&nbsp;
Information <span style="color: #000000; font-weight: bold;">for</span> Collection java.util.ArrayList <span style="color: #7a0874; font-weight: bold;">&#40;</span>id: <span style="color: #000000;">1444378545</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection is very long <span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #000000;">5000</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #000000; font-weight: bold;">!</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection was never <span style="color: #c20cb9; font-weight: bold;">read</span><span style="color: #000000; font-weight: bold;">!</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection was never reduced<span style="color: #000000; font-weight: bold;">!</span>
Recorded usage <span style="color: #000000; font-weight: bold;">for</span> this Collection:
 <span style="color: #000000; font-weight: bold;">*</span> de.codecentric.performance.LeakDemo:<span style="color: #000000;">18</span>
&nbsp;
Information <span style="color: #000000; font-weight: bold;">for</span> Collection java.util.HashMap <span style="color: #7a0874; font-weight: bold;">&#40;</span>id: <span style="color: #000000;">515060127</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection is very long <span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #000000;">5000</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #000000; font-weight: bold;">!</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection was never <span style="color: #c20cb9; font-weight: bold;">read</span><span style="color: #000000; font-weight: bold;">!</span>
 <span style="color: #000000; font-weight: bold;">*</span> Collection was never reduced<span style="color: #000000; font-weight: bold;">!</span>
Recorded usage <span style="color: #000000; font-weight: bold;">for</span> this Collection:
 <span style="color: #000000; font-weight: bold;">*</span> de.codecentric.performance.LeakDemo:<span style="color: #000000;">19</span>
&nbsp;
Exception <span style="color: #000000; font-weight: bold;">in</span> thread <span style="color: #ff0000;">&quot;main&quot;</span> java.lang.OutOfMemoryError: Java heap space
	at de.codecentric.performance.DummyData.&lt;init&gt;<span style="color: #7a0874; font-weight: bold;">&#40;</span>DummyData.java:<span style="color: #000000;">5</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
	at de.codecentric.performance.LeakDemo.runAndLeak<span style="color: #7a0874; font-weight: bold;">&#40;</span>LeakDemo.java:<span style="color: #000000;">17</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
	at de.codecentric.performance.DemoRunner.main<span style="color: #7a0874; font-weight: bold;">&#40;</span>DemoRunner.java:<span style="color: #000000;">12</span><span style="color: #7a0874; font-weight: bold;">&#41;</span></pre></td></tr></table></div>

<p>Eureka, it works!</p>
<h3>Overhead</h3>
<p>Recording each and every invocation on a collection is quite memory intensive. Perhaps I should do this only when there is an indication that the collection could be leaking. But by using those AspectJ pointcuts, my code will always run. In a real environment with hundreds of thousands of such collections, that would certainly not be a great idea. Dynamic bytecode instrumentation should be used to avoid this. And of course an evaluation over a long period of time makes more sense than my quick check.</p>
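<p>One way to reduce the cost on the hot path is to only update cheap counters there and to move the evaluation into a background thread. The following sketch assumes Java 8 or later; <code>PeriodicEvaluator</code>, its <code>Counters</code> class and the integer ids are hypothetical and only loosely modeled on my two classes.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical evaluator: cheap bookkeeping on the hot path, analysis in the background. */
class PeriodicEvaluator {

    /** Per-collection counters that are cheap to update. */
    static final class Counters {
        final AtomicLong writes = new AtomicLong();
        final AtomicLong deletes = new AtomicLong();
        volatile int lastSize;
    }

    private final Map<Integer, Counters> countersById = new ConcurrentHashMap<>();
    private final int dangerousSize;

    PeriodicEvaluator(int dangerousSize) {
        this.dangerousSize = dangerousSize;
    }

    /** Called from instrumented code: no printing and no iteration, just counting. */
    void recordWrite(int id, int currentSize) {
        Counters c = countersById.computeIfAbsent(id, key -> new Counters());
        c.writes.incrementAndGet();
        c.lastSize = currentSize;
    }

    void recordDelete(int id, int currentSize) {
        Counters c = countersById.computeIfAbsent(id, key -> new Counters());
        c.deletes.incrementAndGet();
        c.lastSize = currentSize;
    }

    /** Returns the ids of large collections that were never reduced. */
    List<Integer> evaluateAll() {
        List<Integer> suspicious = new ArrayList<>();
        for (Map.Entry<Integer, Counters> entry : countersById.entrySet()) {
            Counters c = entry.getValue();
            if (c.lastSize >= dangerousSize && c.deletes.get() == 0) {
                suspicious.add(entry.getKey());
            }
        }
        return suspicious;
    }

    /** Runs evaluateAll() on a daemon thread at a fixed interval. */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "collection-evaluator");
            t.setDaemon(true);
            return t;
        });
        executor.scheduleAtFixedRate(
                () -> evaluateAll().forEach(id -> System.out.printf("Suspicious collection id: %d%n", id)),
                period, period, unit);
        return executor;
    }
}
```

<p>The instrumented code then pays only for a map lookup and a counter increment per call, while the interval passed to <code>start()</code> controls how often the more expensive evaluation runs.</p>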
<p>As we can see, the idea is easy to implement, but a production-ready solution requires a good amount of thinking and clever algorithms. If you would like to improve my analyzer, feel free to send patches via <a href="https://github.com/CodingFabian/JavaCollectionAnalyzer">GitHub</a>.</p>
<h2>Memory Analysis of a Demo Application using AppDynamics</h2>
<p>So let's have a look at what AppDynamics does in a professional solution.</p>
<h3>10:43 &#8211; Application Server Restart</h3>
<p>After starting the AppDynamics Leak Detection you will not get immediate results. It starts analyzing collections in the background, and possible leaks only show up after a while.</p>
<h3>11:00 &#8211; Collection detected</h3>
<p style="text-align: center;"><img class="aligncenter  wp-image-10690" title="01-collection-detected" src="http://blog.codecentric.de/files/2012/01/01-collection-detected.png" alt="" width="700" height="155" /></p>
<p>This <code>java.util.LinkedList</code> is monitored by AppDynamics. It has 56,881 entries, which indeed makes it interesting. But AppDynamics has no long-term information yet, so it is not marked as &#8220;potentially leaking&#8221;.</p>
<h3>11:10 &#8211; Collection potentially leaking</h3>
<p style="text-align: center;"><img class="aligncenter  wp-image-10689" title="02-collection-leaked" src="http://blog.codecentric.de/files/2012/01/02-collection-leaked.png" alt="" width="700" height="53" /></p>
<p>Time passed, and the collection continued to grow. With 98,850 entries it now holds almost twice as many as ten minutes ago. The internal heuristics now mark it as &#8220;potentially leaking&#8221;.</p>
<h3>11:17 &#8211; The leak is growing</h3>
<p style="text-align: center;"><img class="aligncenter  wp-image-10688" title="03-leak-overview" src="http://blog.codecentric.de/files/2012/01/03-leak-overview.png" alt="" width="695" height="370" /></p>
<p>The overview shows the growth of the leak. Garbage collection activity would also be drawn here, to visualize the effects of using SoftReferences.</p>
<h3>11:30 &#8211; Showing the memory leak</h3>
<p style="text-align: center;"><img class="aligncenter  wp-image-10687" title="04-leak-content" src="http://blog.codecentric.de/files/2012/01/04-leak-content.png" alt="" width="694" height="384" /></p>
<p>The Content Inspection shows us what is inside the Collection. In this case there are now 118,990 <code>java.lang.String</code> objects with a total size of 20MB.<br />
AppDynamics can also dump the collection and its contents to disc to allow a more detailed analysis of the contents.</p>
<h3>11:38 &#8211; Identifying the root cause</h3>
<p style="text-align: center;"><img class="aligncenter  wp-image-10691" title="05-leak-creator" src="http://blog.codecentric.de/files/2012/01/05-leak-creator.png" alt="" width="695" height="330" /></p>
<p>By using an Access Tracking Session, AppDynamics finds out who is creating this memory leak. While you might get this far using heap dumps, the listing of the call hierarchy is something special. LinkedLists containing Strings could have been used everywhere, but this leaking LinkedList is used by the &#8220;newbookmark&#8221; business transaction.<br />
The <code>BookmarkDaoImpl</code> is appending Strings to that list in line 50. However, AppDynamics did not see any code reading or deleting from this list.</p>
<p>So we now have all the information we need to fix this memory leak:</p>
<ul>
<li>We can see the potentially leaking structures.</li>
<li>We get notified about leaks automatically.</li>
<li>We can see the contents of those structures.</li>
<li>Business transactions (Use Cases) responsible for creating the leak are identified.</li>
<li>Accessing code is recorded and shown.</li>
</ul>
<p>The final decision on whether this is a memory leak or just strange code is of course still up to the developer.</p>
<h2>Wrap Up</h2>
<p>It is possible to find memory leaks at runtime without creating heap dumps. The information about the invoking code is very useful for correcting memory leaks. Unfortunately, there is no free or open source product for finding leaks in such a way. As you have seen, implementing it yourself is not advisable.<br />
AppDynamics has a free 30-day trial which includes the memory leak finder, so you can check for yourself whether it is something you can use.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.codecentric.de/en/2012/01/find-java-memory-leaks-at-runtime-act-5/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>