Posts from July 2008

Memory Analysis Part 1 – Obtaining a Java Heapdump

For troubleshooting Java memory leaks and high memory usage, the heapdump is one of the most important analysis tools. The advantage of heapdumps is that they can be produced in production environments, the place where the problems most frequently occur. All current Java Virtual Machines support the generation of heapdumps without the need for additional tools.
In this blog series I will show you how to analyze and fix memory problems in production. I will also provide a list of common antipatterns and memory problems.
The first part of this series deals with the vital task of heapdump generation, the main precondition for a successful analysis. The various JVM manufacturers (Sun, IBM, BEA) have different tools and formats to dump the heap of the JVM, so this post focuses on Sun's implementation. The Sun Java Virtual Machine contains several options and tools to create a heapdump:
- Automatically when a java.lang.OutOfMemoryError occurs

- With the command line tool jmap

- By using a provided MBean (Java Management Extension – JMX) and the tool jconsole

Of course, you can also use the Java Virtual Machine Tool Interface (JVMTI) to produce a dump, but for that you would have to implement an agent in C. Many profiling tools (JProfiler, dynaTrace Diagnostics) provide a JVMTI agent to create and evaluate a heapdump with a GUI.

To automatically generate a heap dump when an OutOfMemoryError is thrown you have to provide this JVM command line parameter:

-XX:+HeapDumpOnOutOfMemoryError

The parameter causes the JVM to write an HPROF heapdump to the current directory if an OutOfMemoryError occurs. By convention the dump is named java_pid<pid>.hprof, where <pid> is the id of the Java process. To specify the directory and the name of the file yourself, you can add the parameter -XX:HeapDumpPath=path_to_file to the JVM command line options.
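If you want to check from inside a running application whether the flag is active, HotSpot-based JVMs from Java 6 on expose their options through the HotSpotDiagnostic MXBean. The following is only a minimal sketch under that assumption; the class name OomDumpFlag is made up for illustration. As far as I know the flag is one of the "manageable" HotSpot options, which is why the sketch also tries to switch it on at runtime.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the JDK.
final class OomDumpFlag {
    private static final String FLAG = "HeapDumpOnOutOfMemoryError";

    // Looks up the HotSpot diagnostic bean of the current JVM.
    static HotSpotDiagnosticMXBean bean() throws IOException {
        return ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
    }

    // Returns true if the JVM will write a heapdump on OutOfMemoryError.
    static boolean isEnabled() throws IOException {
        return Boolean.parseBoolean(bean().getVMOption(FLAG).getValue());
    }

    // Switches the flag on at runtime (assumes the flag is manageable).
    static void enable() throws IOException {
        bean().setVMOption(FLAG, "true");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(FLAG + " enabled at startup: " + isEnabled());
        enable();
        System.out.println(FLAG + " enabled now: " + isEnabled());
    }
}
```

Setting the flag on the command line remains the more robust choice, because it is active from the very first allocation.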

The automatic production of dumps with these parameters is not always useful. In some situations you want to produce a heapdump at an arbitrary point during application execution. For this case, Java versions 1.4.2_09, 1.5.x and 1.6.x provide the tool jmap. An HPROF heapdump can be requested by executing the following command:

jmap -dump:file=path_to_file java_process_id

The provided Java process id determines which local JVM should be dumped. The process id can be determined with the JVM tool jps (note: the jmap tool is not available on every platform and JVM version). Alternatively, you can use the JVM parameter -XX:+HeapDumpOnCtrlBreak and send a SIGQUIT signal (kill -3 on Unix, Ctrl-Break on Windows) to the running Java process; the signal will also create a heapdump without aborting the JVM.

With Java 6, Sun introduced a JMX MBean which provides methods for generating a heapdump. To create a heapdump via JMX you first start the integrated JMX console with the command jconsole and connect it to the corresponding JVM. For a local connection you don’t need any additional configuration of the JVM – for a connection to a remote machine you have to configure JMX correctly.

You can use the MBean Explorer of jconsole to find the correct MBean within the JVM. The MBean com.sun.management.HotSpotDiagnostic contains the method dumpHeap(String, boolean) mentioned above. With the first method parameter you set the path and the name of the heapdump. The screenshot shows the view of the MBean. Pressing the dumpHeap button will create a heapdump with the given name.
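The same MBean operation can also be called from code instead of from jconsole. The following is a minimal sketch, assuming a HotSpot-based JVM of Java 6 or later; the class name HeapDumper and the default file name are made up for illustration. The second parameter of dumpHeap controls whether only live (reachable) objects are written. Newer JVMs, as far as I know, refuse to overwrite an existing file and expect the name to end in .hprof.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the JDK.
final class HeapDumper {
    private static final String HOTSPOT_BEAN_NAME =
            "com.sun.management:type=HotSpotDiagnostic";

    // Writes a heapdump of the current JVM to the given file and returns that file.
    static File dump(File target, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                HOTSPOT_BEAN_NAME,
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.getAbsolutePath(), liveObjectsOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // The default file name is an arbitrary choice for this sketch.
        File target = new File(args.length > 0 ? args[0] : "manual-dump.hprof");
        System.out.println("Heapdump written to " + dump(target, true).getAbsolutePath());
    }
}
```

For a remote JVM the same proxy can be created on top of a JMX connector instead of the local platform MBean server.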

The next part of this series will go into more detail on how to analyse heapdumps and find memory leaks and common memory antipatterns.

Information about heapdump generation with the IBM JVM can be found in the IBM JVM Diagnosis Documentation.

Information about BEA JRockit and the JRockit Memory Leak Detector can be found in the JRockit documentation.

Mirko Novakovic

 

Codecentric Presents Scrum for Virtual Teams at JUG Cologne

The JUG Cologne invited me to one of their regular meetings. On August 11th I will present the experience I gained as Spec Lead for Java Specification Request 264, the Order Management API, when introducing Scrum to the expert group. After a general introduction to Scrum, I will cover the needs that drove the decision to try Scrum with a virtual, globally distributed team, as well as the solutions the team developed to cope with issues like time zone differences, varying team member availability, and online Scrum tooling. I will also show some analysis of whether the new way of working had any effect on the group dynamics.

Andreas Ebbert-Karroum

 

Internet Explorer 8 will contain new AJAX functionality

A blog entry at MSDN reports that Internet Explorer 8 will contain an important new feature: it will be possible to control the navigation history from JavaScript.

Until now, it has been problematic for users to navigate back and forth through an AJAX-based web application. After the page has loaded completely, the state of the HTML document may be modified by AJAX-based interactions, for example by dynamically loading texts or data. If the user clicks the “Back” button, the browser goes back to the previously loaded page and the entire (potentially modified) state of the application is lost.

The new implementation in Internet Explorer 8, which is adopted from HTML 5, is supposed to solve this problem. It will be possible to add AJAX-related changes of state to the navigation history, enabling the user to navigate back and forth through the application, based on a history that contains all dynamically changed states. This would solve a huge usability problem of AJAX-based web applications.

The new feature is presented here as a video.

It is not yet possible to estimate whether other browser vendors will adopt the new functionality. Time will tell what they decide.

Robert Spielmann

 

Connection Sharing Side Effects

Today Christian and I fought hard with a mysterious connection problem. As part of a longer process, a web service call was failing from time to time. Actually, it failed only under load. Because of that we relied on debugging output rather than breakpoints, as we were afraid of changing the runtime behaviour by stopping threads. But that was not as trivial as we would have liked: the web service client is based on the Apache Commons HttpClient. The web service itself was implemented in Struts2 using its Restful2ActionMapper. Both got several services injected via Spring.

Initially we suspected the unusually large size of the request and response, but after some analysis that seemed ok. However, we were still flooded by variations of:

java.io.IOException: CRLF expected at end of chunk: 72/84
java.io.IOException: Bad chunk size: somexml

As we realized that this had to happen on the HTTP transport level, we had a look around and found the following Spring config for our bean:

<bean name="restClient" class="de.codecentric.framework.RestClient">
  <property name="httpClient">
    <bean class="org.apache.commons.httpclient.HttpClient" />
  </property>
</bean>

That doesn’t look bad, but by default the HttpClient uses a connection manager which always returns the same connection instance, even to different threads. Once you know this, you will immediately find

org.apache.commons.httpclient.MultiThreadedHttpConnectionManager

which unfortunately did not help much. In addition we now got a

java.io.IOException: connection closed

What might surprise you is that MultiThreadedHttpConnectionManager is capable of multithreading, but by default only allows 2 connections per host. After we increased that limit, our application was running fine again, even under load.

The Spring config now looks similar to this:

<bean name="restClient" class="de.codecentric.framework.RestClient">
  <property name="httpClient">
    <bean class="org.apache.commons.httpclient.HttpClient">
      <property name="httpConnectionManager">
        <bean class="org.apache.commons.httpclient.MultiThreadedHttpConnectionManager" destroy-method="shutdown">
          <property name="params">
            <bean class="org.apache.commons.httpclient.params.HttpConnectionManagerParams">
              <property name="defaultMaxConnectionsPerHost" value="20" />
            </bean>
          </property>
        </bean>
      </property>
    </bean>
  </property>
</bean>
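To make the effect of such a per-host limit more tangible, here is a deliberately simplified model in plain Java. It is not the code of MultiThreadedHttpConnectionManager, only a sketch of the general idea of a cap on concurrent connections; the class PerHostLimit and its methods are made up for illustration. With a limit of 2, a third caller gets nothing until one of the first two gives its connection back.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified model of a "max connections per host" limit.
final class PerHostLimit {
    private final Semaphore permits;

    PerHostLimit(int maxConnectionsPerHost) {
        // Fair semaphore: waiting callers are served in arrival order.
        this.permits = new Semaphore(maxConnectionsPerHost, true);
    }

    // Tries to reserve a connection slot, giving up after the timeout.
    boolean acquire(long timeoutMillis) throws InterruptedException {
        return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Gives a connection slot back so another caller can proceed.
    void release() {
        permits.release();
    }

    public static void main(String[] args) throws InterruptedException {
        PerHostLimit limit = new PerHostLimit(2);
        System.out.println("first caller:  " + limit.acquire(10));
        System.out.println("second caller: " + limit.acquire(10));
        System.out.println("third caller:  " + limit.acquire(10));
        limit.release();
        System.out.println("third caller after a release: " + limit.acquire(10));
    }
}
```

Under load, many threads compete for the same two slots, which is why raising the limit to match the expected concurrency made such a difference for us.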

Lesson learnt: when multiple threads are using the same connection, it can go fine, but more likely some very obscure things (exceptions) can (will) happen :-)

Fabian Lange

 

Comparison of Java and PHP for Web Applications

No other language has caused such controversial discussions for such a long time as PHP. codecentric GmbH specializes in Java, so we get some requests to migrate PHP applications.

This often raises the question of whether Java is better than PHP, which is actually not the main issue. Both Java and PHP have frameworks designed for creating web applications. Frameworks can of course offset the drawbacks of a language, but they can also negate its benefits.

To understand the comparison of Java and PHP we must go back to about the year 2000. With Servlets and Struts, Java brought the first concepts for web applications, but creating, configuring and deploying them was very complicated. With the boom of the Internet, a new developer community grew which quickly learned HTML. But pure HTML limits the possibilities for interaction, and CGI Perl scripts were cumbersome and difficult. PHP, however, offered an elegant and simple way: if we wanted a date in a web page, we renamed “.html” to “.php” and inserted <?php echo date('Y-m-d'); ?> where we wanted it. On an Apache web server that was already prepared for PHP, the new file worked out of the box.

Although JavaServer Pages also offered scriptlets, using them was frowned upon as unclean. Instead, the Java community advocated the use of components. In my opinion, this was a critical factor in Java being categorized as “Enterprise”.

For Internet applications, an attractive design is more important than a functional one, as it attracts more customers. While HTML built by a designer in Dreamweaver or FrontPage could easily be extended with dynamic functionality by PHP developers, Java's component-oriented frameworks could not really work with it. PHP could enrich a design with functionality. In Java, however, one had to beautify the functionality.

But in recent years both sides have improved. Java reduced its complexity, and frameworks like Tapestry or GWT allow the use of templates created by designers. With version 5, PHP gained useful object orientation, and frameworks such as Zend or symfony brought design concepts to PHP developers. Java libraries also found counterparts in PHP, for example the PHP ORMs Propel and Doctrine.

From today’s point of view, Java and PHP offer similar functionality. Nevertheless, other aspects have to be considered:

  • Stability
    PHP has, in my opinion, significant weaknesses here. Backward compatibility with its procedural past, the lack of a real deprecation mechanism, and a mess of only semi platform-independent libraries and functions are just some of PHP's issues. PHP lacks a clean cut, which is planned for version 6.
    Java, however, has clean platform independence and a fairly well-defined set of core libraries with appropriate quality standards.
  • Performance
    While Java was formerly often described as slow, today’s JVMs are highly optimized for speed, while scripting languages, including PHP, still struggle with this. For example, a first usable garbage collector will ship with PHP 5.3. Other optimizations have also been moving into the PHP runtimes very slowly. This might be due to the fact that PHP, in contrast to Java, restarts the VM after each request, which of course brings additional performance problems. For example, session data has to be read from disk for each request. Although there are solutions in PHP (MemCache, APC), these are rarely used and partly still under heavy development.
    Interestingly, this drawback makes scaling PHP applications fairly simple. As requests can be processed completely separately, additional hardware results in relatively linear improvements in server capacity. On the Web, the focus is on the number of requests rather than on the exact duration of an individual request.
  • Choice
    Ideally, you will never re-invent the wheel. It makes sense to look for existing solutions. Both in PHP and in Java there is a lot of modular software, partly with free and partly with non-free licenses. However, PHP modules exhibit significantly more problems than those written in Java. For example, some PHP module developers invented their own concepts (e.g. Zend Loader was created by Zend as a substitute for packages), or the modules are optimized for only one framework (like symfony plug-ins).
    Java is better prepared for modularization, especially through “complicated concepts” such as class loading and packages. Thanks to better tool support (Ant / Maven, Javadoc, JUnit), Java frameworks have artifacts that are easier to install, better documented and better tested. However, PHP tools for these tasks are also on the rise (pake / phing, PHPDocumentor, PHPUnit / lime).
  • Integration
    Integration is certainly the strength of Java. On the one hand, Java itself is almost an “industry standard”; on the other hand, many standards have implementations in Java. If a PHP web application has to communicate using a specific protocol, the selection of libraries is rather limited. Even worse, implementations are often only partial or very rudimentary (such as Zend OpenID). Integration of PHP applications with other services usually happens through the database layer.
  • Developer know-how
    Even 20 years ago, Frederick Brooks searched for the “Silver Bullet” and did not find it. In his article he came to the conclusion that software design, problem formulation and the capabilities of the developers are far more important than tools or languages. Therefore, it is certainly a good idea to have a designer who knows PHP implement a website with a state-of-the-art PHP framework. If it is the web front end of a Java EE backend application, Java would be the obvious choice.

However, developer knowledge should not hide the fact that some technological hurdles can be overcome only with certain technologies or other languages. So consulting an expert with specific skills makes much more sense than giving it a try with inadequate tools.

So I conclude that Java is still the better choice for many projects. For smaller projects that can be isolated, scripting languages might reach the goal faster. As a compromise, you might give Groovy and Grails a try.

Fabian Lange

 

XSL-Performance Tuning

During the last few days I have worked intensively on performance tuning of XSLs. These XSLs are used within a web application to render HTML for presenting screens to the user. In this case, the response time of the XSL processing was not really the problem, but rather its need for resources, e.g. CPU time. We wanted to optimize the XSLs or the processing itself to fix this issue.

First (of course) we “googled” the topic. But we realized that there is not much material about XSL tuning compared to Java tuning and profiling, and most of the entries and sites are old. Also, there are not many tools you can use to profile or tune an XSL transformation. There are some tools (like Stylus Studio or Oxygen), but in our case the XSL transformation is done within the web application. Even the XML is produced dynamically at runtime, and we didn’t want to change the application to write an XML output which could have been used to profile it “offline”. Some XSL processors can trace profiling information, e.g. Saxon can log this information when given a specific parameter. We were using Xalan and didn’t have this option. We actually tested Saxon with our XSLs, but there was no performance improvement in our case.

So we did it our “normal” way. We used a Java profiler and instrumented the Xalan classes. We also instrumented the classes (translets) that Xalan generates during processing if XSLTC is enabled. With the visualisation provided by the profiler we could easily see which templates were responsible for the heavy CPU usage. The advantage for us was that we had structured our XSLs into many templates and used includes and call-templates to call them, which is visible in the profiler. Once we knew which templates to focus on, we tried to optimize them.

During the optimization we found out that you need fundamental knowledge of XSL transformation: not only which XSL tags you can use as alternatives, but also how an XSL tag or XPath expression works. The main point is that all of XSLT consists of tree processing and tree operations. It doesn’t matter whether the concrete implementation uses SAX instead of DOM; even Xalan uses an optimized model (DTM). All tree operations that navigate through the tree are expensive (e.g. “..”, “//”, “descendant::”). In our case it was worth minimizing this kind of navigation. We also reduced the size of the XML input (so the tree is smaller), which increased performance. Other operations on the tree (e.g. “nodeSet()”, which we used to parse XML within XSL) were also not cheap in terms of performance.

In general we have tried out the following optimizations:

  1. Read the Xalan FAQ and follow the performance recommendations.
  2. We tested other XSLT implementations (e.g. Saxon).
  3. Xalan provides mechanisms to include your own Java code in your XSL. It’s not the best design style, but in some cases you can use it to replace “hot” XSL operations.

Even for XSL tuning, the #1 performance tuning rule still applies: always implement one (and only one!) optimization and then measure the success of this step.
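A small harness for such a measurement could look like the following sketch. It uses only the standard JAXP API and compares two variants of one stylesheet that differ in exactly one XPath expression: one searches the whole tree with “//”, the other names the path explicitly. The class XsltTiming, the sample document and the run counts are assumptions made up for illustration, not our real stylesheets, and the numbers will differ from processor to processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical measuring harness: one change, then measure.
final class XsltTiming {
    private static final String HEADER =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/><xsl:template match=\"/\">";
    private static final String FOOTER = "</xsl:template></xsl:stylesheet>";

    // Variant A searches the whole tree, variant B names the path explicitly.
    static final String SEARCH_ALL =
            HEADER + "<xsl:value-of select=\"count(//item)\"/>" + FOOTER;
    static final String EXPLICIT_PATH =
            HEADER + "<xsl:value-of select=\"count(/orders/order/item)\"/>" + FOOTER;

    // Compiles the stylesheet once so that only the transformation is timed.
    static Templates compile(String xsl) throws TransformerException {
        return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    static String transform(Templates templates, String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        templates.newTransformer()
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Builds <orders> with the given number of <order> and <item> elements.
    static String sampleXml(int orders, int itemsPerOrder) {
        StringBuilder xml = new StringBuilder("<orders>");
        for (int o = 0; o < orders; o++) {
            xml.append("<order>");
            for (int i = 0; i < itemsPerOrder; i++) {
                xml.append("<item/>");
            }
            xml.append("</order>");
        }
        return xml.append("</orders>").toString();
    }

    static long averageNanos(Templates templates, String xml, int runs)
            throws TransformerException {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            transform(templates, xml);
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) throws TransformerException {
        String xml = sampleXml(200, 50);
        for (String xsl : new String[] {SEARCH_ALL, EXPLICIT_PATH}) {
            Templates templates = compile(xsl);
            averageNanos(templates, xml, 50); // warm-up runs, result ignored
            System.out.println(transform(templates, xml) + " items, average ns per run: "
                    + averageNanos(templates, xml, 200));
        }
    }
}
```

Both variants must produce the same output; only then is the timing comparison meaningful. For real stylesheets a profiler, as described above, gives a far more reliable picture than such a micro-benchmark.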

For XSLT processing there are no general rules you can apply (apart from avoiding tree operations), because an operation can be problematic in one XSL (with specific XML data) and harmless in another.

If all the XSL tuning does not help enough, there is the option of letting optimized systems or products do the XSL transformations for you (e.g. IBM DataPower, Layer7). We tested the IBM DataPower appliance and the speed of the XSL transformations was incredible. Of course, if you want to use such an appliance, there are other things worth thinking about (administration, integration into the application architecture, etc.).

Rainer Vehns

 
