Today something completely different: I'll interview Oliver Gierke from SpringSource. Here we go …
Tobias Trelle: Hi Oliver. Would you mind introducing yourself to listeners who might not already know you?
Oliver Gierke: My name is Oliver Gierke. I work for the SpringSource division of VMware as part of the Spring Data engineering team. I am responsible for the core, JPA and MongoDB modules of the project. Beyond that I organize the release management of all Spring Data modules that build on top of the core module, and I travel to conferences and user groups quite a lot to spread the word.
Before that I worked as an architect and developer in the banking and automotive industries for quite a few years. I am also part of the JPA expert group.
TT: How did you actually get to SpringSource and the Spring Data project?
OG: My former employer, Synyx GmbH & Co. KG in Karlsruhe, used open source software quite heavily to implement its customer projects. As far as customers allowed, we extracted libraries from those projects and published them under an open source license. One of these libraries was called Hades. It was based on an article at IBM developerWorks and another by Eberhard Wolff in the German Java Magazine, both of which outlined ideas on how to significantly reduce the amount of code needed to implement data access layers with Hibernate and JPA.
At that time there was no open source implementation of these ideas around, so we started the project at Synyx and used it in customer projects. Eberhard and I exchanged thoughts about the library, and I was involved with Spring quite a lot back in those days. This led me to start working for SpringSource some time after that.
The Spring Data project had just been born around that time as well, and Mark Pollack, the lead of Spring Data, contacted me to evaluate whether Hades could be integrated into Spring Data (to form a JPA module) and to what extent the repository abstraction implemented in Hades and Spring Data JPA would make sense for other stores as well. It took us a weekend to separate the non-JPA-specific parts of the Hades codebase from the JPA-specific ones and implement a MongoDB layer on top of the common API. From that time on I also started to get involved in the other Spring Data modules.
TT: Spring Data supports both relational and non-relational data stores. How do those fit together? Are there really that many commonalities?
OG: That's indeed the biggest challenge. The different NoSQL data stores in particular are chosen for their special traits. We thought about that for a while and came to the conclusion that it doesn't make a lot of sense to try to hide all stores behind a unified API (like JPA), as this would mean we could only expose the least common denominator, and all the interesting features like map-reduce in MongoDB or graph traversals in Neo4j could not be exposed in an abstracted way.
We're actually supported in an interesting way by the fact that we build on top of Spring. It's not only the technological foundations like dependency injection, the configuration support and so on. It's much more the fact that Spring implements certain patterns so consistently that it almost coins a "Spring way" of solving problems. Spring developers are familiar with that; they know JdbcTemplate, JmsTemplate and so on. They are of course all different, as they abstract different technologies, but they essentially work the same way, have the same responsibilities and conform to the same programming model.
This is essentially the approach we try to implement with Spring Data as well. The main goal is to provide a consistent, Spring-based programming model while retaining the store-specific characteristics so developers can use them. In short, this means that if someone is currently using the Spring Data repositories with JPA, it should be very easy to get started with the MongoDB module, as the programming model is essentially the same.
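The store-independent part of that programming model is the repository abstraction, where queries are derived from method names such as `findByCityAndLastname`. As a rough illustration of that derivation idea, here is a minimal plain-Java sketch; it is not Spring Data's actual parser, and the class and method names are made up for this example:

```java
import java.util.*;
import java.util.stream.*;

// Illustrative sketch of Spring Data's query derivation idea:
// a method name like "findByCityAndLastname" is split into property
// names, which become equality filters over the data set.
public class QueryDerivation {

    // Derive property names from a "findBy..." method name.
    static List<String> properties(String methodName) {
        String criteria = methodName.replaceFirst("^findBy", "");
        return Arrays.stream(criteria.split("And"))
                .map(p -> Character.toLowerCase(p.charAt(0)) + p.substring(1))
                .collect(Collectors.toList());
    }

    // Execute the derived query against map-based "entities".
    static List<Map<String, Object>> execute(String methodName,
                                             List<Map<String, Object>> data,
                                             Object... args) {
        List<String> props = properties(methodName);
        return data.stream()
                .filter(row -> {
                    for (int i = 0; i < props.size(); i++) {
                        if (!Objects.equals(row.get(props.get(i)), args[i])) {
                            return false;
                        }
                    }
                    return true;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> customers = List.of(
                Map.<String, Object>of("city", "Karlsruhe", "lastname", "Gierke"),
                Map.<String, Object>of("city", "Berlin", "lastname", "Wolff"));
        // Derived query: WHERE city = ? AND lastname = ?
        System.out.println(execute("findByCityAndLastname", customers,
                "Karlsruhe", "Gierke").size()); // prints 1
    }
}
```

In the real framework the store module (JPA, MongoDB, …) translates the derived criteria into its native query language, which is exactly why the same method-name convention works across stores.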
TT: There are quite a few NoSQL stores to choose from. Why did you choose to support MongoDB, Neo4j, Redis and Gemfire in the first place?
OG: The selection of supported stores is mainly driven by the demand we see in the market. MongoDB is currently the number one choice among the all-purpose NoSQL stores. The Neo4j module is driven by Michael Hunger from Neo Technology, the company behind Neo4j, and has actually been the very first Spring Data module. The support for Redis and Gemfire is mostly driven by the fact that both are VMware technologies, and we of course strive for first-class Spring support for those.
Of course we see requests to support other stores, e.g. Cassandra, but our current focus is to not get lost in too many store implementations. With Spring Data Solr we now have an entirely community-driven project, which we actively support and which released its first milestone a couple of weeks ago. We're closely tracking all community activity around Spring Data and actively supporting it.
TT: What does the roadmap for Spring Data look like? What features can developers look for in the future?
OG: With the release in early October we published new stable versions of the Spring Data core module as well as the JPA, MongoDB, Neo4j and Gemfire modules built on top of it. Going forward, the focus is on the next major generation, so that we can incorporate a few major changes. The auditing feature in Spring Data JPA will be moved into the core module and extended to the other store implementations. Beyond that we're going to simplify some more advanced usage scenarios, like extending the repository API. On a fundamental level this can already be achieved, of course, but the programming model has quite a few corners that can be simplified. Beyond that we of course monitor community feedback and implement new features as the individual stores come up with them.
Besides the actual store modules there's the Spring Data REST project, which I recommend looking at. It allows you to expose Spring Data repositories as hypermedia-driven REST resources to easily work with data via HTTP. It covers the usual 80% of use cases and offers quite a few knobs to tweak what's exposed by default, plus additional hooks to easily implement custom behavior.
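The core idea is that the generic CRUD operations of a repository map naturally onto HTTP verbs and URIs. The following self-contained sketch illustrates that mapping with a toy router; it is not Spring Data REST's actual implementation, and all names (`CrudRepository`, `handle`, the `/orders` paths) are made up for this example:

```java
import java.util.*;

// Sketch of the idea behind Spring Data REST: generic repository
// operations are mapped onto HTTP verbs and URIs, so every
// repository yields a REST resource "for free".
public class RestExposureSketch {

    interface CrudRepository<T> {
        Optional<T> findById(String id);
        Collection<T> findAll();
    }

    // A naive router: GET /things       -> repo.findAll()
    //                 GET /things/{id}  -> repo.findById(id)
    static <T> String handle(CrudRepository<T> repo, String verb, String path) {
        if (!verb.equals("GET")) {
            return "405 Method Not Allowed";
        }
        String[] parts = path.split("/");     // "" / "orders" / "1"
        if (parts.length == 2) {              // collection resource
            return "200 " + repo.findAll();
        }
        return repo.findById(parts[2])        // item resource
                .map(t -> "200 " + t)
                .orElse("404 Not Found");
    }

    public static void main(String[] args) {
        Map<String, String> store = Map.of("1", "order-1");
        CrudRepository<String> orders = new CrudRepository<String>() {
            public Optional<String> findById(String id) {
                return Optional.ofNullable(store.get(id));
            }
            public Collection<String> findAll() { return store.values(); }
        };
        System.out.println(handle(orders, "GET", "/orders/1")); // prints "200 order-1"
    }
}
```

The real project additionally enriches each response with hypermedia links, which is where the "knobs and hooks" he mentions come in.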
TT: The Spring Data stack seems to be years ahead of the Java EE one. Do you think NoSQL ideas will make it into the standards world any time soon?
OG: Not sure about that. I already outlined that the big differences between the individual stores are the biggest challenge to hiding all of them behind a single unified API. I don't see a reasonable way to actually do that currently. There are a few attempts to do it behind JPA, which is quite difficult as the spec exposes quite a few relational concepts, expects transactions to be available, and so on. In the best case you can implement a slim profile of JPA, which is exactly what the currently available approaches achieve. Now as a developer you get told "We have JPA for NoSQL", followed by a ton of pages listing which parts of JPA actually don't work. That slims down the benefit of using JPA quite significantly.
At this point we haven't even mentioned support for the special features of NoSQL stores, which are often an important reason why one decided on that store in the first place.
To sum things up: I currently don't see a reasonable way of standardizing access to NoSQL databases in the Java world. The first possible option for something like this would be Java EE 8 anyway, which won't arrive before 2016. That is probably way too late for Java developers.
TT: The book Spring Data – Modern Data Access for Enterprise Java has been released recently. You're one of the authors. How did you come up with the idea for the book?
OG: We were asked by O'Reilly at SpringOne 2011 whether we wanted to write a book about the project, and we took this as a chance to provide a broad overview of it and show how easy it is to implement data access for relational and non-relational stores nowadays. Within the roughly 300 pages you'll get a decent impression of what it takes to work with every module of the project, which store makes sense in which usage scenario, and how to implement a sample domain – an online store in our case – with it.
TT: The team that wrote the book is spread over the entire globe. How does that affect the work?
OG: Working on the book didn't really differ from our day-to-day work on Spring Data itself. Most of the chapters were written by the module authors anyway, and the more general modules were distributed equally among the team as well. Book projects by definition always take longer than expected, especially as this of course was additional work on top of the actual one. But as we were six people in total, we actually could write the entire content in roughly two months and finish it in time for the recent release train of a variety of Spring Data modules. We also decided to donate all earnings to the Creative Commons organization, as it would have been close to impossible to come up with a fair distribution key.
TT: Do you also work on other projects? I've heard about the bulky acronym HATEOAS quite a bit recently.
OG: Ben Hale, a SpringSource colleague, actually described HATEOAS as "the word there's no pronunciation for". Spring HATEOAS is a tiny, young library that was coined in a variety of projects in which I helped customers implement hypermedia-driven REST web services. Basic problems like content negotiation and request mapping are usually solved by Java web frameworks already. But when it comes to hypermedia, i.e. enriching representations with links and thus implementing discoverability and guiding clients through the API, all the frameworks fall short. Spring HATEOAS now closes that gap by providing helper classes and APIs.
The Spring Data REST module in turn uses this API to expose entities managed by Spring Data repositories as hypermedia-driven REST resources. It's a quite neat example of how the different Spring projects work together to create a seamless developer experience. My GitHub account has a sample implementation (http://github.com/olivergierke/spring-restbucks) of the use case from the book "REST in Practice" by Jim Webber, Ian Robinson and Savas Parastatidis. Spring Data JPA, Spring Data REST and the Spring HATEOAS project form its foundation and show how easy it is to actually implement hypermedia-driven REST web services.
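The link enrichment he describes – representations carrying named links that tell the client what it may do next – can be sketched in plain Java. This is not Spring HATEOAS's actual API; the record and method names below are made up for illustration:

```java
import java.util.*;

// Minimal sketch of the hypermedia idea behind Spring HATEOAS:
// a representation carries its data plus named links that guide
// the client to related resources and valid next actions.
public class HypermediaSketch {

    record Link(String rel, String href) {}

    record OrderResource(String id, String status, List<Link> links) {
        // Discoverability: clients follow rels, not hard-coded URLs.
        Optional<Link> linkFor(String rel) {
            return links.stream().filter(l -> l.rel().equals(rel)).findFirst();
        }
    }

    static OrderResource toResource(String id, String status) {
        List<Link> links = new ArrayList<>();
        links.add(new Link("self", "/orders/" + id));
        // Guidance: only a pending order advertises a "cancel" transition,
        // so the client never has to guess which actions are allowed.
        if (status.equals("pending")) {
            links.add(new Link("cancel", "/orders/" + id + "/cancel"));
        }
        return new OrderResource(id, status, links);
    }

    public static void main(String[] args) {
        OrderResource resource = toResource("42", "pending");
        System.out.println(resource.linkFor("cancel").isPresent()); // prints true
    }
}
```

The spring-restbucks sample linked above demonstrates the same pattern against a real order domain, with the links generated from the Spring Data repositories.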
TT: Do you consider yourself a conference junkie? Looking at your Twitter account you seem to be speaking at conferences constantly. When do you actually find time to work on Spring Data?
OG: Autumn has traditionally been a season quite packed with conferences. The latest Spring Data release as well as the publication of the Spring Data book are of course things I really like to talk to developers about. Another aspect is that conference engagements are an important feedback channel for us, to make sure we know where developers still have issues and in what areas we can improve our support.
Travelling is exhausting of course, but there's always time to write some code at the hotel or even at the conferences themselves. Beyond that: writing code on a plane is probably the purest form of cloud computing, isn't it?
TT: What conferences are you going to speak at next?
OG: I'll be on an Asia tour in early December, visiting Beijing, Tokyo, Hyderabad and Bangalore. 2013 starts with OOP in Munich for me, where I'll be speaking about the Spring HATEOAS and Spring Data REST projects. Everything beyond that is still in the planning phase.
TT: Thank you for the interview.
OG: You're most welcome, Tobias!