InfoQ

Spring Batch: Simplified Development of Batch and Offline Processes

Posted by Ryan Slobojan on Jun 17, 2008 07:00 AM

Community: Java
Topics: Enterprise Architecture
Tags: Spring Batch

The Spring Batch project, a lightweight and comprehensive Spring-based batch framework, released version 1.0 recently. InfoQ spoke with project lead David Syer to learn more about this release and what it provides for the Spring community.

Syer described Spring Batch as a framework that manages batch and offline processing concerns so that application developers can focus on business logic. He identified two new ideas that Spring Batch brings to the batch processing space: the ability to write lightweight application code that can be tested in isolation, and a powerful framework to execute, manage and monitor the results of offline processing.
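As a rough illustration of what "application code that can be tested in isolation" can look like, the sketch below keeps a business rule in a plain Java class with no framework types. It is not the Spring Batch API: `ClaimNormalizer` and `SimpleBatchRunner` are hypothetical names invented for this example, and the runner is only a stand-in for the looping that a batch framework would own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical business rule: plain Java, no framework dependencies,
// so it can be unit tested by calling apply() directly.
final class ClaimNormalizer implements Function<String, String> {
    @Override
    public String apply(String rawLine) {
        // Trim surrounding whitespace and upper-case the record.
        return rawLine.trim().toUpperCase(Locale.ROOT);
    }
}

// Hypothetical stand-in for the framework side: it owns the loop over
// the items, while the business logic is passed in as a function.
final class SimpleBatchRunner {
    static <I, O> List<O> run(List<I> items,
                              Function<? super I, ? extends O> businessLogic) {
        List<O> results = new ArrayList<>();
        for (I item : items) {
            results.add(businessLogic.apply(item));
        }
        return results;
    }
}
```

Because the rule is an ordinary function, a test can exercise `new ClaimNormalizer().apply(...)` without starting any batch infrastructure.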

Syer identified the major features of this release as:

  • Infrastructure - Reusable, low-level support for repeat and retry, transaction synchronization, and reading from and writing to flat files, XML and databases
  • Core - A thin API for launching and managing batch jobs, which provides all of the features needed operationally
  • Execution - An implementation of the core API, which provides a runtime/execution environment for batch processing
  • Comprehensive set of sample applications - Spring Batch 1.0 provides a samples module containing several example applications that show all of the main features of Spring Batch in operation
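To give a feel for the kind of "repeat and retry" support listed under Infrastructure, here is a minimal plain-Java sketch of a retry wrapper. It is not the Spring Batch API: `RetrySketch` and `withRetry` are hypothetical names invented for illustration, and a real framework would typically add backoff policies, exception classification and transaction handling on top of this idea.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Spring Batch: re-invokes an operation
// until it succeeds or the attempt budget is used up.
final class RetrySketch {
    static <T> T withRetry(int maxAttempts, Callable<T> operation) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Success on any attempt returns immediately.
                return operation.call();
            } catch (Exception e) {
                // Remember the failure and try again if attempts remain.
                last = e;
            }
        }
        // All attempts failed: surface the most recent exception.
        throw last;
    }
}
```

A caller might wrap a flaky database write as `RetrySketch.withRetry(3, () -> writer.write(record))`, where `writer` and `record` are again placeholders rather than names from the framework.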

Syer also indicated that a detailed explanation of all of the features in version 1.0 was available.

One of the things which sets Spring Batch apart from other projects in the Spring Portfolio is that Spring Batch is the result of a partnership between SpringSource and Accenture. Syer told InfoQ that the partnership has resulted in a significant benefit in terms of the number and depth of contributions to the codebase, and that the partnership had been "extremely successful" with Accenture having assigned some of their best resources to the project. Syer was also careful to note that Spring Batch is run in the same manner and held to the same standard as every project in the Spring Portfolio.

Spring Batch has had a relatively long development process, with dates as early as January 2007 noted in the Subversion repository. Syer explained the reasons for this:

Of course it would have been nice to get to a final release a bit quicker, but actually we always had a plan with a hard stop in March 2008 to coincide with the release of the rest of the Spring Portfolio and the availability of the SpringSource subscription products. We have to be very careful with our public API design, in particular so that we do not need to make changes where we have plans for future development. General product quality is also a major driver - we are perfectionists.

That being said, the maturity and richness of functionality can be shown by the handful of projects that have gone into production using milestone releases.

When asked to describe some of these production projects in more detail, Syer said:

  • A large European health care provider has migrated a number of their mainframe batch processes to Spring Batch as part of an overall Application Renewal project. This is quite a common pattern and request, because the skills available in today's job market make it easy to find good Spring developers and more difficult to find good COBOL programmers. This client uses Spring Batch XML streaming and mapping capabilities, along with Hibernate for business object persistence, where they are able to get some level of re-use from the work done as part of their online processing
  • A large sports organisation updates scores and statistics live as games happen for real-time tracking by users. They developed a system where file reading only required configuration, allowing for fast development. The modular approach also allowed for jobs to be run faster, launched every 5 seconds
  • A large state government in the US has an IT renewal project replacing mainframe batch jobs with Java. The objective is to process unemployment claims. Challenges here include legacy mainframe and government-specific data formats, and strict rules about partial job failures. In this case, the batch job development is only one piece of a much larger programme, which is ahead of schedule

Syer also listed the three categories that implementing applications tend to fall into:

  • Close-of-business processing such as reporting, order processing, and account reconciliation
  • Import and export handling such as form processing, inventory import, and allocation export
  • Large-scale output jobs such as email campaigns and financial statements

When asked about plans for the future of Spring Batch, Syer said:

We are providing an excellent platform for single process (possibly multi-threaded) execution in 1.0. The future is full of possibilities for moving to numerous multi-process models on a variety of platforms, and we have been very careful to anticipate those changes in the 1.0 codebase. Building on the platform we have, which already provides most of the hooks and data structures we need, we are also going to be thinking very hard about the usability and deployability of batch applications. Monitoring and managing batch applications is very important in real-life production situations, and we see a number of ways we can add additional value in this area. Along with the rest of the Spring portfolio projects, we see OSGi as a key part of our strategy for the future (1.0 will be packaged as OSGi bundles, but that is really only the start).

Syer also thanked all those who had contributed to Spring Batch through forums, bug reports, discussions and code, describing both the quality and amount of feedback as "truly impressive".


    Spring Batch Webinar

    Jun 18, 2008 2:04 PM by Adam FitzGerald

    Sorry for the shameless plug but for those people interested in Spring Batch, SpringSource will be hosting a webinar with David Syer on July 9th, 2008. The session will cover all the basics for using Spring Batch and how it has become a critical solution for enterprises in an area where Java was seldom used previously. Registration is now open.
