
Transactions in Elasticsearch


Earlier this year a customer mentioned a search requirement that I hadn’t really thought about before: How to achieve transactions in Elasticsearch? Recently, the same requirement popped up again in a conversation I had with other search aficionados. Time to share some thoughts about transactions in Elasticsearch and how to implement them!

Before we start looking for solutions, we need to clarify what we are actually talking about as the term transaction is quite overloaded. In our scenario, there is a stream of documents to be indexed, some of which form logical groups. The transactional requirement is that, for each document group, no document of that group becomes searchable before any other document of the same group. Put differently, all documents in a group become visible for search at exactly the same point in time. Thus, we are talking about atomicity with respect to visibility for search, and I’m going to refer to that as a transaction. So how to achieve transactions with Elasticsearch?

Elasticsearch indexing operations

Let’s take a look at how Elasticsearch handles indexing operations so that we know what we can build on. Elasticsearch internally uses Lucene, so let’s check how the two work together.

With Lucene, documents to be indexed are first kept in an in-memory buffer, where they are neither searchable nor persisted. A Lucene flush() operation, which is executed based on certain heuristics, turns the in-memory buffer into a segment. A Lucene commit() goes one step further: apart from forcing a flush(), it persists any in-memory segments on disk and makes them searchable. Because a commit() is rather heavyweight in terms of I/O, Lucene has a feature called near-real-time search (NRT) which can make in-memory segments searchable even before commit() is called.

Illustration of the Lucene flush() and commit() operations.

Thus, with Lucene, a segment is the atomic unit with respect to search visibility, but only a commit() guarantees durability. If you are curious why Lucene uses segments in the first place and how this approach benefits search performance, I highly recommend reading Lucene in Action, Second Edition, Manning 2010.
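
To make this more tangible, here is a minimal sketch of these operations against the Lucene API (written for a recent Lucene version such as 8.x; the index directory, field names and query are made up for illustration):

    import java.io.IOException;
    import java.nio.file.Paths;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class NrtDemo {
        public static void main(String[] args) throws IOException {
            Directory dir = FSDirectory.open(Paths.get("/tmp/nrt-demo"));
            IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));

            Document doc = new Document();
            doc.add(new TextField("body", "hello transactions", Field.Store.YES));
            writer.addDocument(doc); // held in the in-memory buffer: neither searchable nor durable

            // NRT search: opening a reader on the writer turns the buffer into an
            // in-memory segment and makes it searchable without a full commit().
            try (DirectoryReader reader = DirectoryReader.open(writer)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                long hits = searcher.search(new TermQuery(new Term("body", "hello")), 10).totalHits.value;
                System.out.println("visible via NRT: " + hits); // 1 hit, but not yet durable
            }

            writer.commit(); // persists the segments on disk: searchable *and* durable
            writer.close();
        }
    }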

When Elasticsearch receives an indexing request via the REST API, it needs to persist the document so that it can send the client an acknowledgement of safe reception. As it would be way too costly to execute a Lucene commit() for each received document, Elasticsearch uses its own persistence mechanism called the transaction log. The transaction log supports two major operations which eventually map to the Lucene operations introduced above. The first operation, the Elasticsearch refresh(), turns transaction log contents into a segment and makes them available for search via NRT, which involves a Lucene flush(). The second operation, the Elasticsearch flush(), executes a Lucene commit() and then clears the transaction log as all its documents have now been persisted by Lucene. By default, the refresh() operation is scheduled every second and the flush() operation is executed depending on certain heuristics.

Illustration of the Elasticsearch transaction log, the refresh() and flush() operations and how they call Lucene operations.
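
To illustrate, here is a sketch of triggering both operations explicitly from a client application. It uses the Java high-level REST client as one possible client; the index name docs, the document and the host are assumptions made for this example:

    import org.apache.http.HttpHost;
    import org.elasticsearch.action.admin.indices.flush.FlushRequest;
    import org.elasticsearch.action.admin.indices.refresh.RefreshRequest;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.xcontent.XContentType;

    public class RefreshFlushDemo {
        public static void main(String[] args) throws Exception {
            try (RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

                // The document is acknowledged once it is safely in the transaction
                // log, but it only becomes searchable with the next refresh().
                client.index(new IndexRequest("docs").id("1")
                        .source("{\"body\":\"hello\"}", XContentType.JSON), RequestOptions.DEFAULT);

                // refresh(): turn transaction log contents into a segment and make
                // it searchable via NRT (involves a Lucene flush()).
                client.indices().refresh(new RefreshRequest("docs"), RequestOptions.DEFAULT);

                // flush(): execute a Lucene commit() and clear the transaction log.
                client.indices().flush(new FlushRequest("docs"), RequestOptions.DEFAULT);
            }
        }
    }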

Changing refresh behavior

Via the REST API, Elasticsearch allows us to redefine the refresh interval or even to disable automated refresh completely. One approach to achieving transactional search visibility would be to disable automated refresh and instead request a refresh manually each time we have sent Elasticsearch a full group of documents. This solution, however, works only under two assumptions:

  • The stream emits all documents of a group in direct succession, without any other documents in between. If the stream instead mixed documents of several groups, there would never be a safe point in time to call refresh because there might always be a group where only part of the documents have been sent to Elasticsearch so far.
  • There is no parallel indexing by multiple, independent clients. If there were multiple clients explicitly requesting refresh, any documents sent by other clients would become visible for search as well.

For our customer, neither of these assumptions held, so changing the refresh behavior was not an option. Apart from that, the approach is undesirable for performance reasons: issuing refresh at high frequency creates lots of small segments, which makes search more costly and requires many future segment merges.
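
For completeness, if both assumptions did hold, the approach could look roughly like the following sketch (again using the Java high-level REST client; index name, group identifier and settings are made up):

    import org.apache.http.HttpHost;
    import org.elasticsearch.action.admin.indices.refresh.RefreshRequest;
    import org.elasticsearch.action.admin.indices.settings.put.UpdateSettingsRequest;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.settings.Settings;
    import org.elasticsearch.common.xcontent.XContentType;

    public class GroupVisibilityDemo {
        public static void main(String[] args) throws Exception {
            try (RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

                // Disable automated refresh for the index entirely.
                client.indices().putSettings(new UpdateSettingsRequest("docs")
                        .settings(Settings.builder().put("index.refresh_interval", "-1")),
                        RequestOptions.DEFAULT);

                // Index all documents of one group; none of them is searchable yet.
                for (int i = 1; i <= 3; i++) {
                    client.index(new IndexRequest("docs").id("group1-" + i)
                            .source("{\"group\":\"group1\",\"body\":\"document " + i + "\"}",
                                    XContentType.JSON),
                            RequestOptions.DEFAULT);
                }

                // A single explicit refresh makes the whole group visible at once.
                client.indices().refresh(new RefreshRequest("docs"), RequestOptions.DEFAULT);
            }
        }
    }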

Explicitly modeling transactions

If changing refresh behavior does not help, we need to represent transactions explicitly in our data model. How about this simple idea: We extend our document mapping with a boolean field, say “completed”, which is initially set to false for newly indexed documents. We extend all search queries with a filter for documents where “completed” is true. Only once all documents of a group have been indexed do we update “completed” to true for all of them.
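
The article leaves the implementation open; a minimal sketch of the idea could look like this, assuming a hypothetical index docs whose documents carry a group field, and using update-by-query as one way to flip the flag:

    import org.apache.http.HttpHost;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.reindex.UpdateByQueryRequest;
    import org.elasticsearch.script.Script;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    public class CompletedFlagDemo {
        public static void main(String[] args) throws Exception {
            try (RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

                // Every search query is extended by a filter so that only
                // documents of completed groups can match.
                SearchRequest search = new SearchRequest("docs").source(new SearchSourceBuilder()
                        .query(QueryBuilders.boolQuery()
                                .must(QueryBuilders.matchQuery("body", "hello"))
                                .filter(QueryBuilders.termQuery("completed", true))));
                client.search(search, RequestOptions.DEFAULT);

                // Once all documents of group1 are indexed (with completed=false),
                // flip the flag for the whole group in one request.
                UpdateByQueryRequest update = new UpdateByQueryRequest("docs");
                update.setQuery(QueryBuilders.termQuery("group", "group1"));
                update.setScript(new Script("ctx._source.completed = true"));
                client.updateByQuery(update, RequestOptions.DEFAULT);
            }
        }
    }

One caveat to keep in mind: the flipped flags themselves only become visible to searches at the next refresh, so a refresh falling into the middle of the update could still expose a partially completed group.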

Comments

  • Michael Thoma

    Hello Patrick
    Thank you for your considerations. They helped me a lot. I tried to apply your thoughts to update and delete transactions too, but so far I have not found a proper solution, first of all because it does not seem possible to modify the parent of a document.
    Do you have a solution for this kind of thing? Best regards, Michael

  • Patrick Peschlow

    Hey Michael,

    it’s correct that you cannot apply the same idea, having queries consider some field in a common parent document, to updates and deletes. If you query for the new value of the field (the one you would set after the update/delete has completed), then you wouldn’t get the old versions of the documents anymore. With inserts, this is not a problem because the documents were not visible in the first place (there is no previous state to be missed).

    However, what you ask for would be possible at the Lucene level. So if you can make some additional assumptions, such as there being only a single writer and update/delete transactions that do not overlap, then you can probably use the Elasticsearch refresh mechanism to get the desired behavior. If you decide to go down that road, you should make sure to evaluate the performance of the resulting solution, because there will likely be some penalty.

    Depending on your exact requirements (what exactly is an update or delete transaction for you? Is it about visibility, exactly as I described, or something different?), there may be other ways.

    -Patrick

  • Michael Thoma

    Hey Patrick

    I’m working on the renewal of our portfolio management system.
    Elastic will play a leading role in our new architecture. For most of our data, Elastic is
    not the primary data store. Nevertheless, there will be some application-specific data, for example simulations.
    As we all come from an RDBMS world, we feel quite uncertain about the lack of transactions, most of all because we are building a financial app, not some kind of social media thing!
    Your considerations are very valuable to us, in terms of what we can do with Elastic but also what we perhaps cannot.
    At the moment I tend to let a hammer be a hammer and a nail be a nail, and to use an RDBMS where it’s really necessary.
    But you are right, one has to look carefully at the exact requirements of each and every use case.

    Again, thank you very much!
    Michael
