Behaviour Driven Development with Elasticsearch

Elasticsearch has been riding a wave of hype for a while now, and I expect adoption to accelerate even more with the release of 1.0. We will continue to see massive growth in various fields throughout the tech world, and even more use cases will be discovered and put into production at stunning speed.

While it’s all hot and fresh I want to urge every developer to include proper craftsmanship techniques in their daily work with Elasticsearch. We all strive to ensure great results without regressions. In this post I want to talk about a behaviour driven approach to Elasticsearch, something with which we at codecentric have had tremendous success so far.

Let’s imagine you have an Elasticsearch cluster up and running, and you’re trying to improve your search results for a specific use case, maybe by using the fantastic function score query. Would you feel safe pushing that change into production? Are you sure all the queries your customers throw at you will still be answered sufficiently? If your answer is “HELL NO!” then you know you have a problem.
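To make that concrete: a function score change might look like the following request body. This is a hypothetical sketch – the field names, the filtered boost and the `boost_factor` value are made up for illustration – but it shows how a small tweak reshapes the ranking of every query:

```json
{
  "query": {
    "function_score": {
      "query": { "match": { "message": "tweeting" } },
      "functions": [
        {
          "filter": { "term": { "user": "Chris" } },
          "boost_factor": 2.0
        }
      ],
      "score_mode": "multiply"
    }
  }
}
```

A change like the `boost_factor` above silently affects results for everyone – exactly the kind of change you want covered by tests before it ships.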

It’s not unsolvable, though: a decent set of tests will provide accurate safety against regressions and support agile development of new features – acceptance tests! We can keep our tests fast by starting up a whole Elasticsearch node with the NodeBuilder from the Java API, wrapped in a JUnit rule (as described by Florian Hopf here):

public class ElasticsearchTestNode extends ExternalResource {
 
    private Node node;
    private Path dataDirectory;
 
    @Override
    protected void before() throws Throwable {
        try {
            dataDirectory = Files.createTempDirectory("es-test", new FileAttribute[]{});
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
        ImmutableSettings.Builder elasticsearchSettings = ImmutableSettings.settingsBuilder()
                .put("http.enabled", "false")
                .put("path.data", dataDirectory.toString());
 
        node = NodeBuilder.nodeBuilder()
                .local(true)
                .settings(elasticsearchSettings.build())
                .node();
    }
 
    @Override
    protected void after() {
        node.close();
        try {
            FileUtils.deleteDirectory(dataDirectory.toFile());
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
    }
 
    public Client getClient() {
        return node.client();
    }
}

So let’s write our first test for it: create an index, index a document and retrieve it again – in only a couple of clean lines!

public class NodeCreationTest {
 
    @Rule
    public ElasticsearchTestNode testNode = new ElasticsearchTestNode();
 
    @Test
    public void indexAndGet() throws IOException {
        testNode.getClient().prepareIndex("myindex", "document", "1")
                .setSource(jsonBuilder().startObject().field("test", "123").endObject())
                .execute()
                .actionGet();
 
        GetResponse response = testNode.getClient().prepareGet("myindex", "document", "1").execute().actionGet();
        assertThat((String) response.getSource().get("test"),equalTo("123"));
    }
}

Run the test and we’ll see in the console log that the node boots up, actually handles the request and shuts down gracefully, awesome!

[Screenshot: test output of the NodeCreationTest run]

So we could be done right here and commence happy TDD – but let’s crank it up a notch and

  1. add JBehave to our stack
  2. create a custom mapping within our code that we want to test

Let’s imagine we are building the next Twitter application, and after careful consideration we come up with the following story:

 Scenario: Basic Tweet retrieval

 Given A user Chris submitted a tweet I luv tweeting
 When We list all tweets for the user Chris
 Then A tweet with the text I luv tweeting will be found

To introduce JBehave I can heartily recommend the fantastic JUnitReportingRunner from my workmates; grab it from Maven Central and create a story class that wires up our story with some sane defaults. For further explanation check out Andreas’ post here.
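As a dependency sketch, the runner lives under the codecentric group id on Maven Central (the version below is just a placeholder – check Maven Central for the latest release):

```xml
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>jbehave-junit-runner</artifactId>
    <version>1.0.1</version>
    <scope>test</scope>
</dependency>
```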

@RunWith(JUnitReportingRunner.class)
public class TwitterStories extends JUnitStories {
 
    private final CrossReference xref = new CrossReference();
 
    public TwitterStories() {
        super();
    }
 
    @Override
    protected List<String> storyPaths() {
        String codeLocation = codeLocationFromClass(this.getClass()).getFile();
        return new StoryFinder().findPaths(codeLocation, asList("Tweet.story"), asList(""), "");
    }
 
    @Override
    public InjectableStepsFactory stepsFactory() {
        return new InstanceStepsFactory(configuration(), new TweetRetrievalTest());
    }
 
    @Override
    public Configuration configuration() {
        Class<? extends Embeddable> embeddableClass = this.getClass();
        Properties viewResources = new Properties();
        viewResources.put("decorateNonHtml", "true");
        viewResources.put("reports", "ftl/jbehave-reports-with-totals.ftl");
        // Start from default ParameterConverters instance
        ParameterConverters parameterConverters = new ParameterConverters();
        // factory to allow parameter conversion and loading from external resources (used by StoryParser too)
        ExamplesTableFactory examplesTableFactory = new ExamplesTableFactory(new LocalizedKeywords(), new LoadFromClasspath(embeddableClass), parameterConverters);
        // add custom converters
        parameterConverters.addConverters(new ParameterConverters.DateConverter(new SimpleDateFormat("yyyy-MM-dd")),
                new ParameterConverters.ExamplesTableConverter(examplesTableFactory));
        return new MostUsefulConfiguration()
                .useStoryLoader(new LoadFromClasspath(embeddableClass))
                .useStoryParser(new RegexStoryParser(examplesTableFactory))
                .useStoryReporterBuilder(new StoryReporterBuilder()
                        .withCodeLocation(CodeLocations.codeLocationFromClass(embeddableClass))
                        .withViewResources(viewResources)
                        .withFormats(STATS)
                        .withFailureTrace(true)
                        .withFailureTraceCompression(true)
                        .withCrossReference(xref))
                .useParameterConverters(parameterConverters)
                        // use the default '$' prefix to identify step parameters
                .useStepPatternParser(new RegexPrefixCapturingPatternParser("$"))
                .useStepMonitor(xref.getStepMonitor());
    }
}

Here you can see we’re loading our story called “Tweet.story” and a steps class called “TweetRetrievalTest”. This class maps our story to actual executable code and takes care of the Elasticsearch node boot-up:

public class TweetRetrievalTest{
 
    public ElasticsearchTestNode testNode = new ElasticsearchTestNode();
 
    @BeforeStory
    public void setUp() throws Throwable {
        testNode.before();
 
        testNode.getClient().admin().indices().create(new CreateIndexRequest("twitter")).actionGet();
        testNode.getClient().admin().indices()
                .preparePutMapping("twitter")
                .setType("tweets")
                .setSource(mapping())
                .execute().actionGet();
    }
 
    @AfterStory
    public void after(){
        testNode.getClient().admin().indices().prepareGetFieldMappings("twitter").execute().actionGet();
        testNode.after();
    }
 
    private SearchResponse response;
 
    @Given("A user $user submitted a tweet $tweet")
    public void userTweets(@Named("tweet") String tweet , @Named("user") String user) throws IOException {
        testNode.getClient().prepareIndex("twitter", "tweets", "1")
                .setSource(jsonBuilder()
                        .startObject()
                        .field("user", user)
                        .field("message", tweet)
                        .endObject())
                .execute()
                .actionGet();
    }
 
    @When("We list all tweets for the user $user")
    public void retrieveTweetsForUser(@Named("user") String user) {
        response = testNode.getClient().prepareSearch("twitter").
                setTypes("tweets")
                .setQuery(QueryBuilders.termQuery("user", user))
                .setFrom(0).setSize(60).setExplain(true)
                .execute()
                .actionGet();
 
    }
 
    @Then("A tweet with the text $tweet will be found")
    public void expectTweet(@Named("tweet") String tweet)  {
        // the tweet text is stored in the "message" field of the document source
        for (SearchHit hit : response.getHits().getHits()) {
            if (tweet.equals(hit.getSource().get("message"))) {
                return;
            }
        }
        fail("expected tweet " + tweet + " not found");
    }
 
    /**
     * Custom mapping for the tweets type.
     */
    public XContentBuilder mapping() throws Exception {
        XContentBuilder xbMapping =
                jsonBuilder()
                        .startObject()
                        .startObject("tweets")
                        .startObject("properties")
                        .startObject("source")
                        .field("type", "string")
                        .endObject()
                        .startObject("user")
                        .field("type", "string")
                        .endObject()
                        .startObject("message")
                        .field("type", "string")
                        .endObject()
                        .endObject()
                        .endObject()
                        .endObject();
        return xbMapping;
    }
 
}

As a side note, see how easy it is to inject a custom mapping into the whole setup! Feel free to experiment with it:

  • provision your Elasticsearch production nodes with a custom mapping from a .yml file
  • make use of the API: boost values, give it a custom scoring, try out different filters or analyzers
  • run the test and know that it’s going to work. Awesome!
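As an example for the first two points, a provisioning file could carry the same mapping as JSON, extended with a custom analyzer. This is a hypothetical sketch – the `not_analyzed` user field and the `english` analyzer are just illustrative choices, not something from the example project:

```json
{
  "tweets": {
    "properties": {
      "source":  { "type": "string" },
      "user":    { "type": "string", "index": "not_analyzed" },
      "message": { "type": "string", "analyzer": "english" }
    }
  }
}
```

With the acceptance tests in place you can swap analyzers like this and immediately see whether the stories still pass.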

Happy testing folks! You can grab the code for this small example on our company github account here.
