Configuring and Running a Job in Spring Batch

In this chapter we explain the various configuration options and runtime concerns of a Job. While the Job object may seem like a simple container for steps, there are many configuration options of which a developer must be aware. Furthermore, there are many considerations for how a Job will be run and how its metadata will be stored during that run.

Configuring a Job- 

There are multiple implementations of the Job interface; however, the namespace abstracts away the differences in configuration. A Job has only three required dependencies: a name, a JobRepository, and a list of Steps.

<job id="myEmpExpireJob">
    <step id="readEmployeeData" next="writeEmployeeData" parent="s1"></step>
    <step id="writeEmployeeData" next="employeeDataProcess" parent="s2"></step>
    <step id="employeeDataProcess" parent="s3"></step>
</job>

The namespace defaults to referencing a repository with an id of ‘jobRepository’, which is a sensible default. However, this can be overridden explicitly:

<job id="myEmpExpireJob" job-repository="specialRepository">
    <step id="readEmployeeData" next="writeEmployeeData" parent="s1"></step>
    <step id="writeEmployeeData" next="employeeDataProcess" parent="s2"></step>
    <step id="employeeDataProcess" parent="s3"></step>
</job>

1. Restartability-

One key issue when executing a batch job concerns the behavior of a Job when it is restarted. The launching of a Job is considered to be a ‘restart’ if a JobExecution already exists for the particular JobInstance. Ideally, all jobs should be able to start up where they left off, but there are scenarios where this is not possible. It is entirely up to the developer to ensure that a new JobInstance is created in this scenario. However, Spring Batch does provide some help. If a Job should never be restarted, but should always be run as part of a new JobInstance, then the restartable property may be set to ‘false’:

<job id="myEmpExpireJob" restartable="false">
    ...
</job>

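Attempting to restart a Job that is not restartable causes a JobRestartException to be thrown:

// setRestartable is defined on SimpleJob, not on the Job interface
SimpleJob job = new SimpleJob();
job.setRestartable(false);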

JobParameters jobParameters = new JobParameters();

JobExecution firstExecution = jobRepository.createJobExecution(job, jobParameters);
jobRepository.saveOrUpdate(firstExecution);

try {
    jobRepository.createJobExecution(job, jobParameters);
    fail();
}
catch (JobRestartException e) {
    // expected
}
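The restart rule described above can be sketched without any framework code. The following is a minimal toy model, not Spring Batch's implementation: the class `RestartRuleSketch`, its exception type, and the way an instance is keyed by job name plus a parameter string are all assumptions made for illustration. It only shows that a launch counts as a restart when an execution already exists for the same instance, and that a non-restartable job rejects such a launch.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the restart rule; these are NOT Spring Batch classes. */
class RestartRuleSketch {

    /** Hypothetical stand-in for JobRestartException. */
    static class RestartNotAllowedException extends RuntimeException {
        RestartNotAllowedException(String message) {
            super(message);
        }
    }

    // Keys of job instances (job name + parameters) that already have an execution.
    private final Set<String> instancesWithExecution = new HashSet<>();

    /**
     * Records a new execution for the given instance.
     * Returns true when the launch was a restart, false when it was the
     * first execution of that instance.
     */
    boolean createExecution(String jobName, String parameters, boolean restartable) {
        String instanceKey = jobName + "|" + parameters;
        boolean isRestart = instancesWithExecution.contains(instanceKey);
        if (isRestart && !restartable) {
            throw new RestartNotAllowedException("Job '" + jobName + "' is not restartable");
        }
        instancesWithExecution.add(instanceKey);
        return isRestart;
    }
}
```

In this model, launching the same non-restartable job with different parameters is still allowed, because different parameters identify a different instance.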

2. Intercepting Job Execution-

During the course of the execution of a Job, it may be useful to be notified of various events in its lifecycle so that custom code may be executed. The SimpleJob allows for this by calling a JobExecutionListener at the appropriate time:

public interface JobExecutionListener {

    void beforeJob(JobExecution jobExecution);

    void afterJob(JobExecution jobExecution);

}

JobExecutionListeners can be added to a SimpleJob via the listeners element on the job:

<job id="myEmpExpireJob">
    <step id="readEmployeeData" next="writeEmployeeData" parent="s1"></step>
    <step id="writeEmployeeData" next="employeeDataProcess" parent="s2"></step>
    <step id="employeeDataProcess" parent="s3"></step>
    <listeners>
        <listener ref="sampleListener"></listener>
    </listeners>
</job>

Note that afterJob is called regardless of whether the Job succeeds or fails. If success or failure needs to be determined, it can be obtained from the JobExecution:

public void afterJob(JobExecution jobExecution){
    if( jobExecution.getStatus() == BatchStatus.COMPLETED ){
        //job success
    }
    else if(jobExecution.getStatus() == BatchStatus.FAILED){
        //job failure
    }
}
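The callback order can also be illustrated with a small self-contained sketch. This is a toy model rather than Spring Batch code: `ListenerCallbackSketch`, its `Listener` interface, and its `Status` enum are hypothetical stand-ins for JobExecutionListener and BatchStatus. It shows the one property the paragraph above relies on: the "after" callback fires on both the success path and the failure path, and the listener reads the status to tell them apart.

```java
import java.util.List;

/** Toy model of job listener callbacks; these are NOT Spring Batch types. */
class ListenerCallbackSketch {

    enum Status { COMPLETED, FAILED }

    /** Hypothetical stand-in for JobExecutionListener. */
    interface Listener {
        void beforeJob(String jobName);
        void afterJob(String jobName, Status status);
    }

    /** Runs the job body between the two callbacks and returns the final status. */
    static Status execute(String jobName, Runnable body, List<Listener> listeners) {
        for (Listener listener : listeners) {
            listener.beforeJob(jobName);
        }
        Status status;
        try {
            body.run();
            status = Status.COMPLETED;
        } catch (RuntimeException e) {
            status = Status.FAILED;
        }
        // afterJob fires on both paths; listeners inspect the status.
        for (Listener listener : listeners) {
            listener.afterJob(jobName, status);
        }
        return status;
    }

    /** Listener that records every callback as a line of text. */
    static Listener recordingInto(List<String> events) {
        return new Listener() {
            @Override
            public void beforeJob(String jobName) {
                events.add("before " + jobName);
            }

            @Override
            public void afterJob(String jobName, Status status) {
                events.add("after " + jobName + " " + status);
            }
        };
    }
}
```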

The annotations corresponding to this interface are:

  • @BeforeJob
  • @AfterJob

3. Inheriting from a Parent Job-

If a group of Jobs share similar, but not identical, configurations, then it may be helpful to define a “parent” Job from which the concrete Jobs may inherit properties. Similar to class inheritance in Java, the “child” Job will combine its elements and attributes with the parent’s.

In the following example, “baseJob” is an abstract Job definition that defines only a list of listeners. The Job “job1” is a concrete definition that inherits the list of listeners from “baseJob” and merges it with its own list of listeners to produce a Job with two listeners and one Step, “step1”:

<job abstract="true" id="baseJob">
    <listeners>
        <listener ref="listenerOne"/>
    </listeners>
</job>

<job id="job1" parent="baseJob">
    <step id="step1" parent="standaloneStep"/>

    <listeners merge="true">
        <listener ref="listenerTwo"/>
    </listeners>
</job>

4. JobParametersValidator-

A job declared in the XML namespace or using any subclass of AbstractJob can optionally declare a validator for the job parameters at runtime. This is useful when, for instance, you need to assert that a job is started with all its mandatory parameters. There is a DefaultJobParametersValidator that can be used to constrain combinations of simple mandatory and optional parameters; for more complex constraints, you can implement the interface yourself. The configuration of a validator is supported in the XML namespace through a child element of the job, e.g.:

<job id="job1" parent="baseJob3">
    <step id="step1" parent="standaloneStep"/>
    <validator ref="paremetersValidator"/>
</job>

Configuring a JobRepository-

As described earlier, the JobRepository is used for basic CRUD operations of the various persisted domain objects within Spring Batch, such as JobExecution and StepExecution. It is required by many of the major framework features, such as the JobLauncher, Job, and Step. The batch namespace abstracts away many of the implementation details of the JobRepository implementations and their collaborators. However, there are still a few configuration options available:

<job-repository id="jobRepository"
    data-source="dataSource"
    transaction-manager="transactionManager"
    isolation-level-for-create="SERIALIZABLE"
    table-prefix="BATCH_"
    max-varchar-length="1000"
/>

Configuring a JobLauncher-

The most basic implementation of the JobLauncher interface is the SimpleJobLauncher. Its only required dependency is a JobRepository, in order to obtain an execution:

<bean class="org.springframework.batch.core.launch.support.SimpleJobLauncher" id="jobLauncher">
    <property name="jobRepository" ref="jobRepository" />
</bean>

[Figure: JobLauncher sequence, synchronous launch]

The sequence is straightforward and works well when launched from a scheduler. However, issues arise when trying to launch from an HTTP request. In this scenario, the launching needs to be done asynchronously so that the SimpleJobLauncher returns immediately to its caller. This is because it is not good practice to keep an HTTP request open for the amount of time needed by long running processes such as batch. An example sequence is below:

[Figure: JobLauncher sequence, asynchronous launch]

The SimpleJobLauncher can easily be configured to allow for this scenario by configuring a TaskExecutor:

<bean class="org.springframework.batch.core.launch.support.SimpleJobLauncher" id="jobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor">
        <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
    </property>
</bean>

Any implementation of the Spring TaskExecutor interface can be used to control how jobs are asynchronously executed.
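The difference between the two launch modes comes down to which thread runs the job body. The sketch below is a toy launcher built only on java.util.concurrent; `LauncherSketch` and its factory methods are hypothetical names, and it returns a CompletableFuture where the real launcher returns a JobExecution. It is meant only to show why swapping the executor changes when the call returns.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Toy launcher showing the effect of the executor; NOT SimpleJobLauncher. */
class LauncherSketch {

    private final Executor executor;

    LauncherSketch(Executor executor) {
        this.executor = executor;
    }

    /** Runs the job in the caller's thread, so run() returns only after the job finishes. */
    static LauncherSketch synchronous() {
        return new LauncherSketch(Runnable::run);
    }

    /** Starts one new thread per job, so run() returns while the job is still running. */
    static LauncherSketch asynchronous() {
        return new LauncherSketch(task -> new Thread(task).start());
    }

    /** Hands the job body to the executor and returns a handle to its completion. */
    CompletableFuture<Void> run(Runnable jobBody) {
        return CompletableFuture.runAsync(jobBody, executor);
    }
}
```

With the synchronous variant the returned handle is already complete when `run` returns; with the asynchronous variant the caller, for example an HTTP request handler, gets control back immediately and can check on the job later.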

Running a Job-

At a minimum, launching a batch job requires two things: the Job to be launched and a JobLauncher. Both can be contained within the same context or different contexts. For example, if launching a job from the command line, a new JVM will be instantiated for each Job, and thus every job will have its own JobLauncher. However, if running from within a web container within the scope of an HttpRequest, there will usually be one JobLauncher, configured for asynchronous job launching, that multiple requests will invoke to launch their jobs.

1. Running Jobs from the Command Line-
The CommandLineJobRunner-

Because the script launching the job must kick off a Java Virtual Machine, there needs to be a class with a main method to act as the primary entry point. Spring Batch provides an implementation that serves just this purpose: CommandLineJobRunner. It’s important to note that this is just one way to bootstrap your application, but there are many ways to launch a Java process, and this class should in no way be viewed as definitive. The CommandLineJobRunner performs four tasks:

  • Load the appropriate ApplicationContext
  • Parse command line arguments into JobParameters
  • Locate the appropriate job based on arguments
  • Use the JobLauncher provided in the application context to launch the job.
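
For example, the following command launches the endOfDay job defined in endOfDayJob.xml with a single date parameter:

bash$ java CommandLineJobRunner endOfDayJob.xml endOfDay schedule.date(date)=2013/01/08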
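The second task in the list, parsing command line arguments into typed parameters, can be sketched in plain Java. This is not the parser Spring Batch uses: `JobArgumentSketch` is a hypothetical class, and the supported type names and the yyyy/MM/dd date pattern are assumptions inferred from the argument `schedule.date(date)=2013/01/08`. The sketch also produces a java.time.LocalDate, which may differ from the type the framework produces.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for arguments of the form name(type)=value. */
class JobArgumentSketch {

    // Group 1: name, group 2: optional type in parentheses, group 3: raw value.
    private static final Pattern ARGUMENT = Pattern.compile("([^=(]+)(?:\\(([a-z]+)\\))?=(.*)");

    // Assumed date pattern, chosen to match the example argument.
    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    static Map<String, Object> parse(String... args) {
        Map<String, Object> parameters = new LinkedHashMap<>();
        for (String arg : args) {
            Matcher matcher = ARGUMENT.matcher(arg);
            if (!matcher.matches()) {
                throw new IllegalArgumentException("Expected name(type)=value but got: " + arg);
            }
            // An argument without an explicit type is treated as a string.
            String type = matcher.group(2) == null ? "string" : matcher.group(2);
            String raw = matcher.group(3);
            Object value;
            switch (type) {
                case "date":
                    value = LocalDate.parse(raw, DATE);
                    break;
                case "long":
                    value = Long.valueOf(raw);
                    break;
                case "double":
                    value = Double.valueOf(raw);
                    break;
                case "string":
                    value = raw;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown parameter type: " + type);
            }
            parameters.put(matcher.group(1), value);
        }
        return parameters;
    }
}
```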

2. Running Jobs from within a Web Container-

Historically, offline processing such as batch jobs has been launched from the command line, as described above. However, there are many cases where launching from an HttpRequest is a better option. Such use cases include reporting, ad-hoc job running, and web application support. Because a batch job by definition is long running, the most important concern is ensuring that the job is launched asynchronously:

[Figure: launching a job from an HTTP request]

The controller in this case is a Spring MVC controller.

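import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;

@Controller
public class JobLauncherController {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job job;

    // Launches the job; returns promptly only if the JobLauncher is configured with an asynchronous TaskExecutor
    @RequestMapping("/jobLauncher.html")
    public void handle() throws Exception{
        jobLauncher.run(job, new JobParameters());
    }
}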

 
