DON’T make an ASS out of U and ME when dealing with Hibernate caching!


In my current project a simple question came up: “Is Hibernate’s first-level cache limited to a single transaction?” Intuitively my answer was: “No, the first-level cache is also called the session cache, so it should rather be bound to a Hibernate session. And since the same session can handle multiple transactions during its lifetime, entities cached by one transaction would be accessible to another transaction within this context.” OK, that is the theory. A day later, I thought again about this small water-cooler chat. I had used words like “should” and “would” to answer a very concrete, technical question. That left a bitter taste. But how could I prove that entities are cached across transactions, thus confirming my “premature” assumption?

We don’t have a choice, let’s try it out!

Since “keeping things simple!” is a valuable goal, we will create a small, plain Hibernate project. It should contain a test case that accesses the same entity within two different transactions while trying not to talk to the database more than once. In other words, our approach to validating the theory is to count how often separate transactions within the same session have to execute SELECT queries to work with a single entity. If our assumption is right and transactions can share entities via a session-wide cache, only one of these transactions has to read the entity from the database; the other transaction will access it through the cache without reading it from the database again.

Sounds “simple”, but how can we observe our ORM’s database access without much effort? Do we have to parse database logs or write some smart interceptors? Fortunately, someone has already done that for us: ttddyy’s datasource-proxy project. This tiny library wraps your data source and lets you collect useful metrics about your ORM’s behavior. Leveraging such a DataSourceProxy, we can verify every database access on the Java level, which makes it very easy to write a JUnit test.
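Conceptually, such a proxy is nothing magical: it sits between the application and the real data source and counts the JDBC calls passing through before delegating them. The following stdlib-only sketch illustrates the idea with a dynamic proxy that counts getConnection() calls. It is a toy stand-in, not the library’s actual implementation — datasource-proxy hooks in deeper and counts individual queries by statement type:

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

class CountingProxyDemo {

    // Wraps any DataSource in a dynamic proxy that counts getConnection() calls
    // before delegating; a toy version of what datasource-proxy does for queries.
    static DataSource countingProxy(DataSource real, AtomicInteger counter) {
        return (DataSource) Proxy.newProxyInstance(
                DataSource.class.getClassLoader(),
                new Class<?>[] {DataSource.class},
                (proxy, method, args) -> {
                    if ("getConnection".equals(method.getName())) {
                        counter.incrementAndGet();
                    }
                    return method.invoke(real, args);
                });
    }

    // Calls the wrapped data source twice and returns the recorded count.
    static int demo() {
        try {
            AtomicInteger counter = new AtomicInteger();
            // A no-op stand-in for a real data source: every call just answers null.
            DataSource dummy = (DataSource) Proxy.newProxyInstance(
                    DataSource.class.getClassLoader(),
                    new Class<?>[] {DataSource.class},
                    (proxy, method, args) -> null);
            DataSource counted = countingProxy(dummy, counter);
            counted.getConnection();
            counted.getConnection();
            return counter.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("getConnection() calls seen: " + demo());
    }
}
```

The real library applies the same wrapping idea at the statement level, which is what makes per-type counts like getSelect() and getInsert() possible later on.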

What do we need?

To create a minimum viable test project, we only need a handful of dependencies and a database. The most important dependency is the ttddyy proxy.

<dependencies>
        ...
        <dependency>
                <groupId>net.ttddyy</groupId>
                <artifactId>datasource-proxy</artifactId>
                ...
        </dependency>
</dependencies>

The database should be up and running, provided with the proper schema. Our only entity contains just an identifier and a creation date, since we don’t need complex data for our use case.

@Entity
public class SomeEntity {
 
    ...
 
    @Id
    private Integer id;
 
    private Date createdDate;
 
    ...
}

The datasource configuration is a crucial part. Here we have to wrap our real datasource with a DataSourceProxy.

private static DataSource buildProxyDataSource() {
    return ProxyDataSourceBuilder.create(buildDataSource()) // wrap the real data source
        .name("ProxyDataSource")
        .countQuery() // enable query counting
        .build();
}
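For completeness, the wrapped buildDataSource() can be any ordinary JDBC data source. A minimal sketch, assuming an in-memory H2 database — the URL and credentials here are illustrative, not taken from the original project:

```java
// Illustrative only: any JDBC DataSource works here.
private static DataSource buildDataSource() {
    JdbcDataSource dataSource = new JdbcDataSource(); // org.h2.jdbcx.JdbcDataSource
    dataSource.setURL("jdbc:h2:mem:cachetest;DB_CLOSE_DELAY=-1"); // made-up in-memory DB
    dataSource.setUser("sa");
    dataSource.setPassword("");
    return dataSource;
}
```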

Well done. Now, what does our test flow look like?

Our test creates an entity (Transaction A). After that, we immediately clear the first-level cache to force at least one database read on the first entity access (Transaction B). If we did not clear the cache, it would already contain the entity from the moment of its creation, and we would not execute a single SELECT query in our entire test.

... session.beginTransaction();
...
createEntity(session, entityId);
 
transactionA.commit();
 
 
... session.beginTransaction();
 
// clear cache after entity creation, otherwise we would have no select at all
session.clear();
 
// the only intended select
... readEntityCreationDate(session, entityId);
 
transactionB.commit();
 
 
... session.beginTransaction();
 
// another read, but no further select expected although we opened a different transaction context
... readEntityCreationDate(session, entityId);
 
transactionC.commit();

Since we now start with an empty session cache and our test loads the entity explicitly, one SELECT query is intended. This operation also puts the entity right back into the first-level cache (session cache). After Transaction B commits, another transaction (Transaction C) accesses the entity again by its identifier. This call should be answered by the first-level cache, so we expect no further SELECT query even though we are in another transaction context.
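The helpers createEntity(...) and readEntityCreationDate(...) are elided in the listing above. As a rough sketch — assuming the plain Hibernate Session API and conventional getters/setters on SomeEntity, neither of which is shown in the original — they amount to little more than a persist and a lookup by id:

```java
// Sketch only; the helpers in the actual project may differ.
private static void createEntity(Session session, Integer entityId) {
    SomeEntity entity = new SomeEntity();
    entity.setId(entityId);
    entity.setCreatedDate(new Date());
    session.persist(entity);
}

private static Date readEntityCreationDate(Session session, Integer entityId) {
    // session.get() consults the first-level cache before issuing a SELECT
    return session.get(SomeEntity.class, entityId).getCreatedDate();
}
```

The important detail is that the read goes through session.get(), which checks the first-level cache before it ever talks to the database.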

Drum roll … The results:

We verify our assumption by counting the executed queries, grouped by statement type. The QueryCountHolder offers very convenient methods to do that.

final QueryCount grandTotal = QueryCountHolder.getGrandTotal();
assertThat(grandTotal.getInsert()).isEqualTo(1); // (Transaction A) Inserts our entity
assertThat(grandTotal.getSelect()).isEqualTo(1); // (Transaction B) Only one transaction reads the table
assertThat(grandTotal.getDelete()).isEqualTo(0); // No delete (after the last invocation of QueryCountHolder.clear())
assertThat(grandTotal.getUpdate()).isEqualTo(0); // No updates needed at all

We see that there is only one database INSERT to create our entity and one SELECT to read it again.

The full example test project is available on GitLab.

Summary

Finally I’m able to replace “it should be bound to a Hibernate session” with “it is bound to a Hibernate session”. And finally I can sleep peacefully again. 🙂 Joking aside, although this simple example does not even begin to exhaust the capabilities of ttddyy’s DataSourceProxy project, it shows how useful the library can be for purposes like ours. It can keep you from making rash decisions based on assumptions held in lack of proof. Next time you are in doubt about your ORM’s behavior, don’t ASS-U-ME! Maybe a DataSourceProxy can help you take a look behind the curtain.

P.S.: If you need more tests, don’t forget

QueryCountHolder.clear();

after each test 😉

Kevin Peters

“It’s all about data!” – Every application reads and writes information, so Kevin places great emphasis on the smooth and high-performance handling of that data. He has specialized in the use of the Spring Framework and Hibernate, and he is also interested in new technologies. Agile software development and the “Clean Code” concept are further foundations of his work.
