It has been a few years since Spring introduced Spring Batch, a powerful framework for developing batch processing applications. It has eased our everyday work when it comes to importing data provided by another system, digesting larger sets of data into ready-to-use information, or doing any other job that follows the read-process-write pattern. When developing batch jobs, one of the important things to think about is how to test their components and how to run a complete job chain in an integration test environment.
In this post, I will share my testing experience from a project I’m working on, in which the Spring Batch framework is heavily used.
Testing Batch Components
Testing batch components such as Readers, Processors and Writers comes down to writing ordinary unit tests, with one thing to keep in mind: they can require Spring Batch domain objects to be set up and initialized up front. For example, an ItemReader requires an execution context to be opened before it can actually read. Luckily, Spring provides the spring-batch-test module, which can produce all kinds of batch domain objects. More specifically, it provides MetaDataInstanceFactory for creating step executions (with their execution contexts), job instances, etc.
Let’s say you want to test an ItemReader which reads CSV file rows into Items.
Here is how you would first open an execution context and then map a CSV row to an item.
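A minimal sketch of such a test, assuming JUnit 4 and a FlatFileItemReader configured inline. The `Item` class, its two fields and the sample row `apple,3` are hypothetical, and an in-memory resource stands in for a real CSV file to keep the example short.

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNull;

import java.nio.charset.StandardCharsets;

import org.junit.Test;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.test.MetaDataInstanceFactory;
import org.springframework.core.io.ByteArrayResource;

public class CsvItemReaderTest {

    /** Hypothetical item type; the real Item class is not shown in this post. */
    static class Item {
        final String name;
        final int quantity;

        Item(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }

    @Test
    public void readsCsvRowIntoItem() throws Exception {
        FlatFileItemReader<Item> reader = new FlatFileItemReader<>();
        // In-memory CSV content (assumption); a real test could point to a file resource instead.
        reader.setResource(new ByteArrayResource("apple,3\n".getBytes(StandardCharsets.UTF_8)));

        DefaultLineMapper<Item> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(new DelimitedLineTokenizer()); // comma is the default delimiter
        lineMapper.setFieldSetMapper(fieldSet -> new Item(fieldSet.readString(0), fieldSet.readInt(1)));
        reader.setLineMapper(lineMapper);

        // spring-batch-test supplies a ready-made step execution and its execution context.
        ExecutionContext executionContext =
                MetaDataInstanceFactory.createStepExecution().getExecutionContext();

        reader.open(executionContext); // the reader must be opened before read() is called
        try {
            Item item = reader.read();
            assertEquals("apple", item.name);
            assertEquals(3, item.quantity);
            assertNull(reader.read()); // null signals the end of the input
        } finally {
            reader.close();
        }
    }
}
```

The test needs spring-batch-test and JUnit on the test classpath, but no Spring context.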
Another thing to think about is splitting the configuration. When writing a job configuration, it is easy to overload one configuration class with all the beans needed by that job, especially when using out-of-the-box components provided by Spring, which usually require customization to suit your needs. Although this approach keeps all configuration in one place, it can become hard to read and hard to test. If one component requires many properties to be set up, it is better to isolate it in its own unit and write a dedicated unit test for it. If you want an example, think about FlatFileItemReader, where you probably want to set up your line tokenizer, field set mapper, etc.
Testing a Batch Job
Testing a batch job is an entirely different story from testing isolated components. The goal is to start a complete batch job so that it reads the input in the same manner as it would in a real run and produces real output, which is verified at the end. To achieve this goal, there are several things to think about:
- Boot the Spring context inside the container.
- Provide input resources for the batch job, e.g. csv/xls files, database entries, etc.
- Start the batch job.
- Verify the job state and the job output.
Boot Spring context
To run the job, a Spring context is required and all required dependencies must be available. For example, the database is the most common dependency you don’t care about when writing a unit test, but it needs to be set up when doing an integration test. Before I give you a short recipe on how to set up an environment for testing, here is the image illustrating the architecture of the project that will serve as an example.
The image illustrates an application which is responsible for running more than one batch Job.
The BatchApplication class is the main class responsible for launching the application. The application starts three jobs, where each job (yellow) has its own Configuration class. The BatchConfiguration class is responsible for configuring Spring Batch domain-specific beans, such as JobExplorer, etc. It is loaded from BatchApplication, and it is used by all jobs.
Now let’s say we want to test the first job from the list, named Store Job.
The first thing to do is to figure out which annotations are required on the StoreJobTest class to boot the Spring context inside the container.
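A sketch of how the annotated test class could look, assembled from the annotations discussed in this section. It assumes JUnit 4 and Spring Boot 1.x, where @IntegrationTest and @SpringApplicationConfiguration live in org.springframework.boot.test. The @RunWith runner and the exact placement of @EnableAutoConfiguration and @Import on TestJobConfiguration are assumptions.

```java
import org.junit.runner.RunWith;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.boot.test.IntegrationTest;
import org.springframework.boot.test.SpringApplicationConfiguration;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Import;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

// StoreJobConfiguration, BatchConfiguration and EmbeddedDataSourceConfig are the
// project's own classes named in this post; import them from their real packages.

@RunWith(SpringJUnit4ClassRunner.class) // assumption: JUnit 4 runner
@IntegrationTest
@ActiveProfiles("batchtest")
@SpringApplicationConfiguration(classes = {StoreJobConfiguration.class, TestJobConfiguration.class})
public class StoreJobTest {
    // test methods go here
}

// Reusable test configuration holding everything except the job itself.
@Configuration
@EnableAutoConfiguration
@Import({BatchConfiguration.class, EmbeddedDataSourceConfig.class})
class TestJobConfiguration {
}
```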
What does every one of these annotations do?
@IntegrationTest marks the test as an integration test and tells Spring that the container should be started in the same way as in production.
@ActiveProfiles is used to activate the “test” profile and instantiate beans meant for testing only (e.g. test dataSource). In this case the “test” profile is called batchtest. This is useful if we want to prevent some components from doing their real action, like sending emails, deleting files, uploading to 3rd party storage etc. In that case, we make “fake” implementations (doing nothing) and mark them with
@SpringApplicationConfiguration tells the container which configurations to pick up when starting the ApplicationContext. This is also the most critical part. You need to give the container all required beans, otherwise it will end up with a bunch of “Could not autowire” exceptions. One tip for setting up @SpringApplicationConfiguration is to have the job configuration in one class and all other required configurations and beans packed inside another class. In this example there is StoreJobConfiguration, which configures the job, and TestJobConfiguration, which configures everything else. That way TestJobConfiguration can be reused when testing all other jobs from this app. Another option would be to pass in the main BatchApplication.class, but that would always boot the configurations of all jobs, no matter which job you are testing.
@EnableAutoConfiguration actually belongs to Spring Boot; it auto-configures beans in the same way the BatchApplication class does.
@Import BatchConfiguration to configure Spring Batch domain specific beans.
@Import EmbeddedDataSourceConfig to set up an embedded database for testing. HSQL is the most common one, but if you need a database that goes nicely with MySQL, you can try MariaDB. In this example, MariaDB is marked with @Profile("batchtest") and it will be booted only for testing.
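For illustration, a profile-bound embedded data source configuration might be sketched as follows. Note that this sketch swaps in HSQL through Spring's EmbeddedDatabaseBuilder rather than the MariaDB setup used in this project, which is not shown here, and it assumes HSQLDB is on the test classpath.

```java
import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

// Only registered when the "batchtest" profile is active, so it never
// replaces the production data source.
@Configuration
@Profile("batchtest")
public class EmbeddedDataSourceConfig {

    @Bean
    public DataSource dataSource() {
        // In-memory HSQL database (assumption: stands in for the MariaDB used in the project).
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.HSQL)
                .build();
    }
}
```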
The following picture illustrates how the Test class fits into the project structure.
One piece of advice here would be to clearly separate src/test/java @Configurations from src/main/java @Configurations. More precisely, don’t let src/test/java @Configurations be @ComponentScan-ed when really booting the application. The default behaviour of @ComponentScan is to pick up all Spring components on its path, no matter whether the package belongs to src/main/java or src/test/java. My way of doing this is to keep the source code in de.codecentric.batch packages and the test code in de.codecentric.test.batch, so they are never mixed.
Start batch job
After setting up the test class, it is time to write the actual test and launch the job.
What you want to note here is how to launch the job.
Use JobLauncherTestUtils, a class provided by Spring that simply launches the job injected into the context. As a reminder, the job was injected into the context by StoreJobConfiguration.class. In order to use JobLauncherTestUtils you need to:
- Add the spring-batch-test module to your classpath.
- Define JobLauncherTestUtils bean. This is done in
- Autowire it in a
JobLauncherTestUtils actually uses the same jobLauncher that would be used in production, but it starts the job with random job parameters, which allows the job to be run multiple times.
What to verify in a job depends on the job itself, but there are some common things that can be checked. You can verify the exit status of the job execution and the number of items read, written and skipped. When there is a complex job flow, it is useful to verify the flow configuration and especially what happens when the job gets restarted.
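Launching the job and checking the common things mentioned above could be sketched like this, assuming JUnit 4, Spring Boot 1.x and a JobLauncherTestUtils bean available in the loaded configuration. The test method name and the read/write expectation are hypothetical and depend on the job under test.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.IntegrationTest;
import org.springframework.boot.test.SpringApplicationConfiguration;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

// StoreJobConfiguration and TestJobConfiguration are the project's own classes.
@RunWith(SpringJUnit4ClassRunner.class)
@IntegrationTest
@ActiveProfiles("batchtest")
@SpringApplicationConfiguration(classes = {StoreJobConfiguration.class, TestJobConfiguration.class})
public class StoreJobTest {

    // Requires a bean such as:
    //   @Bean public JobLauncherTestUtils jobLauncherTestUtils() { return new JobLauncherTestUtils(); }
    // in one of the loaded configuration classes.
    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Test
    public void storeJobCompletesWithoutSkips() throws Exception {
        // launchJob() starts the job with unique job parameters on every call.
        JobExecution jobExecution = jobLauncherTestUtils.launchJob();

        assertEquals("COMPLETED", jobExecution.getExitStatus().getExitCode());

        for (StepExecution stepExecution : jobExecution.getStepExecutions()) {
            assertEquals(0, stepExecution.getSkipCount());
            // Hypothetical expectation: no item is filtered out by a processor,
            // so every item that was read is also written.
            assertEquals(stepExecution.getReadCount(), stepExecution.getWriteCount());
        }
    }
}
```

To exercise a single step of a more complex flow, JobLauncherTestUtils also offers launchStep(String stepName).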
Provide input resources
The last missing piece is providing input data to test. Usually, there are two types of input. Either you want to prepare external files (csv, xls) which are processed by the job, or you want to prepare the database. Or both.
Let’s say the job picks up input files from a folder configured in application.properties under the property import.folder.path. To let the job pick up files during a test run, create another file, application-batchtest.properties, and set import.folder.path to whatever suits you. Use the classpath:anyfoldername notation, and put the files inside src/test/resources/anyfoldername. Remember that application-batchtest.properties is named after @Profile("batchtest").
Filling the database for a test is a common scenario, so you can use whatever you prefer. I find Spring’s @Sql annotation extremely useful, especially when inserting a bunch of data.
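As a rough sketch, @Sql can be placed on a test method so that a script runs before the job is launched. The script path /sql/store-job-input.sql and the test names are hypothetical. The sketch assumes Spring 4.1 or later, JUnit 4, and a DataSource plus transaction manager in the test context.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.IntegrationTest;
import org.springframework.boot.test.SpringApplicationConfiguration;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.jdbc.Sql;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

// StoreJobConfiguration and TestJobConfiguration are the project's own classes.
@RunWith(SpringJUnit4ClassRunner.class)
@IntegrationTest
@ActiveProfiles("batchtest")
@SpringApplicationConfiguration(classes = {StoreJobConfiguration.class, TestJobConfiguration.class})
public class StoreJobDatabaseInputTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Test
    // Hypothetical script at src/test/resources/sql/store-job-input.sql;
    // by default it is executed before the test method runs.
    @Sql("/sql/store-job-input.sql")
    public void processesRowsInsertedByScript() throws Exception {
        JobExecution jobExecution = jobLauncherTestUtils.launchJob();

        assertEquals("COMPLETED", jobExecution.getExitStatus().getExitCode());
    }
}
```

Because the test is not transactional, the inserted rows stay in the embedded database after the test, so a cleanup script or a fresh database per test class may be needed.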
At this point you should be ready to set up tests for Spring Batch jobs and their components. A final word of advice: do not overload job configurations and do not mix different job configurations in the same file, because that can make testing more challenging. Think carefully about what the job is doing and, besides verifying the output, verify its state and its flow through steps and different states.