Orchestrating Long Running Activities with JBoss / JBPM

A common implementation problem is the requirement to orchestrate activities that extend over very long periods (hours, days, weeks). Although BPM engines are specifically designed to invoke long-running activities, thanks to their ability to hydrate/dehydrate process state, very long-running activities are typically implemented as standalone processes1, so implementing the callbacks that inform the BPM server always becomes a non-trivial design issue. In this article we will show one approach to solving this problem with JBoss jBPM.

Overall Implementation approach

The overall implementation approach (Figure 1) is fairly straightforward. A BPM server starts execution of an activity and "waits" until the activity notifies it of completion.

Figure 1 Basic implementation architecture

Anyone who has ever used a WS-BPEL based BPM implementation will immediately say: this is an easy problem, just use send/receive activities to synchronize business process and activity executions. Unfortunately, jBPM does not provide a receive activity in its set of standard nodes. Fortunately, it provides a rich set of client APIs, which can help in developing the required functionality2. These APIs provide the two essential capabilities required for our implementation: setting process variables and "forcing" process continuation.

One of the common ways of exposing existing internal APIs to the outside world is through middleware. The jBPM/ESB integration, for example, leverages ESB messaging and effectively exposes some of the existing jBPM client APIs as an ESB service. Another popular approach is the use of Web/REST services to achieve the same purpose3. In our implementation we decided to use the JBoss JAX-RS implementation, RESTEasy4, to expose the required client APIs as REST services.

The other complication is that many existing long-running activities are implemented as self-contained processes with minimal or no APIs. As a result, it is not always possible to instrument them with either listeners or "callback" implementations. In our implementation we use the most generic approach: an integration script that invokes both the long-running activity and a custom completion notifier, which communicates with the REST server (Figure 2). Introducing the integration script allows us to equip existing implementations with all of the functionality required for their orchestration (parameter passing, "callback" handler, etc.) without any modifications.

Figure 2 Overall implementation Architecture

Based on the overall architecture (Figure 2), the following components have to be developed: the jBPM REST server, the completion notifier, the integration script and the actual process. We will discuss the implementation of each of these components below.

jBPM REST Server

As mentioned above, several implementations of a jBPM REST server are available, but most of them are geared towards query operations on the engine and are used for building GUIs or simple jBPM clients.

In our case we need a REST server that can set process variables and signal the jBPM server to continue process execution. To better understand the functionality that we need to implement, let's first look at the hierarchy of jBPM execution objects (Figure 3).

A jBPM engine can have one or more process definitions deployed at any given time. Each definition can have any number of process instances running at the same time5. Finally, a process instance uses one or more tokens for its execution.

Figure 3 Hierarchy of jBPM Execution objects

While process definitions and process instances are common to virtually any business process engine, the notion of a token is specific to jBPM. A token is jBPM's abstraction of an execution thread. While the implementation itself does not support threading, it uses tokens to logically partition an execution. Every process instance starts on a single token, the root token, and can start additional tokens as required. For example, a jPDL Fork node starts multiple execution tokens, which end when execution passes the corresponding Join node (Figure 4).

Figure 4 Token's management in jBPM

jBPM uses tokens to partition not only execution but also variables. Variables in jBPM are associated not with a process instance but with a token; process instance variables are in reality variables of the instance's root token. A parent token's variables are accessible from a child token, but not vice versa, and sibling tokens cannot see each other's variables.
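The visibility rule described above can be illustrated with a small toy model. This is only a sketch of the scoping idea and not the jBPM API: the class TokenScope and its methods createLocal and getVariable are hypothetical names introduced here for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of token-scoped variables. Hypothetical; not the jBPM Token class. */
public class TokenScope {
    private final TokenScope parent;   // null for the root token
    private final Map<String, Object> variables = new HashMap<>();

    public TokenScope(TokenScope parent) {
        this.parent = parent;
    }

    /** Creates (or overwrites) a variable in the scope of this token only. */
    public void createLocal(String name, Object value) {
        variables.put(name, value);
    }

    /** Looks up a variable on this token first, then on each ancestor in turn. */
    public Object getVariable(String name) {
        for (TokenScope t = this; t != null; t = t.parent) {
            if (t.variables.containsKey(name)) {
                return t.variables.get(name);
            }
        }
        return null; // not visible from this token
    }
}
```

In this model a child token created with new TokenScope(root) sees everything defined on root, while root and any sibling token get null for a variable created on that child.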

Based on this, our REST server implementation provides the following methods (Listing 1)6:

  @GET
  @Path("instance/{id}/variables/xml")
  @Produces("application/xml")
  public ParameterRefWrapper getInstanceVariablesXML(	
	  @PathParam("id") String instanceId
  )
  ……………………
  @GET
  @Path("instance/{id}/variables/json")
  @Produces("application/json")
  @Mapped
  public ParameterRefWrapper getInstanceVariablesJSON(
	  @PathParam("id") String instanceId
  )
  ……………………


  @POST
  @Path("instance/{id}/parameters")
  public Response setInstanceParameters(
      @PathParam("id") String id,
      @QueryParam("parameter") List<ParameterRef> params)
  ……………………
  @POST
  @Path("instance/{id}/signal")
  public Response signalProcess(
      @PathParam("id") String id
  )
  ……………………
  @GET
  @Path("token/{id}/variables/xml")
  @Produces("application/xml")
  public ParameterRefWrapper getTokenParametersXML(
	  @PathParam("id")String tokenId
  )
  ……………………
  @GET
  @Path("token/{id}/variables/json")
  @Produces("application/json")
  @Mapped
  public ParameterRefWrapper getTokenParametersJSON(
	   @PathParam("id") String tokenId
  )
  ……………………
  @POST
  @Path("token/{id}/parameters")
  public Response setTokenParameters(
      @PathParam("id") String id,
      @QueryParam("parameter") List<ParameterRef> params)
  ……………………
  @POST
  @Path("token/{id}/signal")
  public Response signalToken(
      @PathParam("id")String id
  )
  ……………………

Listing 1 jBPM REST APIs

Business Process Implementation

The simplest jBPM process invoking a long-running activity is shown in Figure 5.

Figure 5 Simple jBPM process

The process contains four steps:

  • Start node
  • A node (Starting pipeline, in our case) is responsible for invoking an integration script
  • A state (Pipeline completed, in our case) is a synchronization point, invoked by a callback handler to set execution results and continue process execution.
  • End node

The most important part of this process is the action handler used by the Starting pipeline node. The action's implementation is based on the Java ProcessBuilder class, which allows invoking an external process (the integration script, in our case). There are two basic ways to invoke an external process using the ProcessBuilder class: synchronous and asynchronous. A synchronous invocation can be implemented either explicitly, by calling the process.waitFor() method, or implicitly, by reading the process execution results. In this case the invoker thread is kept alive for the duration of the external process execution. If neither of these approaches is used, the execution is asynchronous: the invoker can complete while the long-running process is still running7. In our implementation we use asynchronous invocation of the long-running process.
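The difference between the two invocation styles can be sketched as follows. This is a minimal illustration rather than the action handler used in the article; the class name ScriptLauncher and its method names are assumptions. The script output is discarded (ProcessBuilder.Redirect.DISCARD, available since Java 9) so that the invoker holds no open output pipe.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical helper contrasting synchronous and asynchronous process start. */
public class ScriptLauncher {

    /** Synchronous: blocks the calling thread until the external process exits. */
    public static int runAndWait(List<String> command)
            throws IOException, InterruptedException {
        Process process = newBuilder(command).start();
        return process.waitFor(); // exit code of the external process
    }

    /** Asynchronous: returns as soon as the external process has been started. */
    public static Process launch(List<String> command) throws IOException {
        return newBuilder(command).start();
    }

    private static ProcessBuilder newBuilder(List<String> command) {
        // Merge stderr into stdout and discard both, so nothing has to be read
        return new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD);
    }
}
```

An action handler that needs to return quickly would call launch and ignore the returned Process, leaving completion to be reported through the callback.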

As described above, the REST server requires a process instance ID and/or token ID to determine which process instance/token a request refers to. Both the process instance ID and the token ID can be passed to the integration script, which can then invoke the callback handler with these parameters.

Because our REST server implementation is based on the content of the jBPM server database, direct invocation of the integration script from an action handler can cause a race condition. If the callback handler (described later in this article) connects to the REST server before the results of the Starting pipeline node are committed to the database, the results of the REST server invocation can be out of sync. To avoid this race condition we have introduced a simple thread pool. Instead of directly invoking the integration script, the action handler builds a runnable object and submits it to the thread pool for invocation. Before invoking the integration script, this runnable object sleeps for a short time, allowing the jBPM server to commit the results of the node execution to the database.
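A delayed hand-off of this kind might look like the sketch below. Note that it swaps the explicit sleep described above for a ScheduledExecutorService, which delays the task without occupying a pool thread while waiting; the class name DelayedLauncher and the choice of delay are assumptions, and a fixed delay only makes the race unlikely rather than impossible.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical pool that starts integration scripts after a short delay. */
public class DelayedLauncher {
    private final ScheduledExecutorService pool;
    private final long delayMillis;

    public DelayedLauncher(int threads, long delayMillis) {
        this.pool = Executors.newScheduledThreadPool(threads);
        this.delayMillis = delayMillis;
    }

    /** Queues the task and returns immediately, so the action handler can finish. */
    public ScheduledFuture<?> submit(Runnable startScript) {
        return pool.schedule(startScript, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops accepting new tasks; already scheduled tasks still run. */
    public void shutdown() {
        pool.shutdown();
    }
}
```

The action handler would call submit with a runnable that starts the integration script and then leave the node, giving the engine time to commit before the script (and later the callback) runs.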

Once Starting pipeline node completes its execution, a process transitions to a Pipeline completed state, where it waits for callback handler to signal that a long running activity is completed and a process can continue.

Callback handler

The callback class implementation uses the jBPM REST APIs to set parameters and continue execution8. A basic implementation can be based on the code snippet presented in Listing 2.

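// Parameters travel in the query string, not in the request body
URLString = baseURL + "token/" + tokenId + "/parameters?parameter=" +
……………..
postURL = new URL(URLString);
connection = (HttpURLConnection) postURL.openConnection();
connection.setRequestMethod("POST");
// HttpURLConnection sends the request lazily; reading the status code triggers it
int responseCode = connection.getResponseCode();
connection.disconnect();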

Listing 2 Posting a request to a REST server

In this implementation the HTTP POST is implemented slightly differently from a normal HTTP POST. Instead of defining the content as URL encoded and writing the parameters to the output stream, we use a URL containing a true query string. This works because the RESTEasy servlet implements the service method rather than the doPost and doGet methods. As a result, all requests (POST/GET/PUT/DELETE) are delivered to this method and then processed by the RESTEasy implementation, which parses the request URL correctly.
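As a rough sketch of this style of request, the helper below assembles a query-string URL and posts it without a body. The class and method names are hypothetical, and each parameter is passed in already serialized, because the string format expected by the server is not shown in this article. URLEncoder.encode with a Charset argument requires Java 10 or later.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical callback helper: POST with parameters in the query string. */
public class CallbackUrls {

    /** Builds baseUrl + "token/{id}/parameters?parameter=..&parameter=..". */
    public static String tokenParametersUrl(String baseUrl, long tokenId,
                                            List<String> serializedParams) {
        String query = serializedParams.stream()
                .map(p -> "parameter=" + URLEncoder.encode(p, StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return baseUrl + "token/" + tokenId + "/parameters?" + query;
    }

    /** Sends an empty-bodied POST and returns the HTTP status code. */
    public static int post(String url) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            connection.setRequestMethod("POST");
            return connection.getResponseCode(); // this call sends the request
        } finally {
            connection.disconnect();
        }
    }
}
```

Encoding each value matters here because everything is carried in the URL: an unescaped space, "&" or "=" inside a parameter would otherwise change how the query string is parsed.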

Error handling

If everything works correctly, the simple process shown in Figure 5 is sufficient. Unfortunately, in real life things can go wrong, so this simple implementation has to be enhanced with error handling.

Invocation of long running activities can introduce two additional types of errors:

  • Runaway scripts - situations in which abnormal execution of a script prevents the callback handler from executing. As a result, the process invoking the long-running activity will get stuck.
  • Errors in execution of the long running activity. Because these errors occur outside of the process execution, they have to be explicitly reported to a process.

Runaway scripts

An effective mechanism for dealing with runaway scripts is timeouts. jBPM provides timer support, which makes it possible to alter the normal course of a process if some execution exceeds a predefined time interval. When a timer fires, it can either invoke an appropriate action or transition to a specified node. Our error handling for runaway scripts is based on timer transitions. In general, you would typically transition to a human task, allowing a person to decide on the exact corrective action in each specific case. In the simplest case (Figure 6) such a task can be modeled with a simple node that just loops back to the Starting pipeline node. This node, Timing out, is invoked by a timer attached to the Running pipeline state. If the timer fires, it uses the timeOut transition to invoke the Timing out node.

Figure 6 jBPM process with timeout

Implementing the process in Figure 6 requires some changes to the base implementation described so far. First, the Timing out node can decide either to continue waiting or to take a corrective action. In both cases control typically returns to the Starting pipeline node, which must be able to keep track of whether a script is already running (continue waiting) or not (start the script). The easiest way to implement this logic is through process variables (Listing 3):

String state = (String)executionContext.getVariable(processName);

if(state != null){
	if("completed".equals(state)){
		executionContext.leaveNode(completeTransition);
		return;
	}
	else{
		executionContext.leaveNode(waitTransition);
		return;
	}
}
executionContext.setVariable(processName, "started");
…………

Listing 3 Selecting execution path

Second - "blind" signaling of the process instance/token to complete execution will not work in this case. The reason for this is twofold:

  • Because the Running Pipeline state can, in this case, have multiple transitions, the REST API should support the ability to specify which transition has to be taken as a result of the process/token signaling.
  • When the callback handler is invoked for the process in Figure 6, there is no guarantee that the process is in the Running Pipeline state. It could instead be in, for example, the Timing out node. This means that the REST API should support signaling process execution only if the process is in a given state.

This can be achieved by modifying the REST APIs (Listing 1) to support two additional parameters for process instance/token signaling (Listing 4):

  @POST
  @Path("instance/{id}/signal")
  public Response signalProcess(
      @PathParam("id") String id,
      @QueryParam("transition") String transition,
      @QueryParam("state") String state
  )
  ……………………
  @POST
  @Path("token/{id}/signal")
  public Response signalToken(
      @PathParam("id")String id,
      @QueryParam("transition") String transition,
      @QueryParam("state") String state
  )
  ……………………

Listing 4 Extended signaling REST API

Because jBPM provides APIs for getting the current node of a given process instance/token and for signaling with a desired transition, the extended implementation is fairly straightforward.
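The intended behavior of the extended signal call can be sketched with an in-memory stand-in for a token. GuardedToken and its methods are hypothetical and do not use the jBPM classes; a real implementation would presumably delegate the "current node" check and the named-transition signal to the jBPM APIs mentioned above.

```java
import java.util.Map;

/** In-memory stand-in for a token. Hypothetical; not the jBPM API. */
public class GuardedToken {
    private String currentNode;
    // node name -> (transition name -> destination node name)
    private final Map<String, Map<String, String>> transitions;

    public GuardedToken(String startNode, Map<String, Map<String, String>> transitions) {
        this.currentNode = startNode;
        this.transitions = transitions;
    }

    public synchronized String getCurrentNode() {
        return currentNode;
    }

    /**
     * Takes the named transition only if the token is waiting in expectedState.
     * Returns false and leaves the token untouched otherwise.
     */
    public synchronized boolean signal(String expectedState, String transition) {
        if (!currentNode.equals(expectedState)) {
            return false; // e.g. the process is in the Timing out node instead
        }
        String target = transitions.getOrDefault(currentNode, Map.of()).get(transition);
        if (target == null) {
            return false; // no such transition from this node
        }
        currentNode = target;
        return true;
    }
}
```

With this guard, a late callback that arrives while the process is in the Timing out node is rejected instead of pushing the process along an unintended path; the REST method could translate a false result into an error response.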

Script execution results

Supporting script execution results is fairly straightforward. This can be done by simply introducing additional process variables, which can be set using callback handler.

Process Componentization

One important case of a long-running activity is a jBPM process itself. The current version, jPDL-3, does not provide explicit support for coordinating the execution of multiple jBPM processes. As a result, process designers often try to implement the required functionality as one monolithic process. Such an approach has the same drawbacks as creating very large Java classes: readability and maintainability issues, reuse limitations and so on. One way to alleviate these issues is to decompose the process into multiple processes9 and coordinate them at runtime.

The jBPM/ESB integration provided by the JBoss SOA platform allows the execution of multiple processes to be coordinated10 by wrapping these processes in JBoss ESB services and using the jBPM ESB node to invoke them. Although this approach works well, it requires introducing JBoss ESB into the overall solution. If ESB usage is limited to coordinating process invocations, that is typically hard to justify. A more lightweight approach is to start the subordinate process programmatically (from one of the process nodes) and then use the jBPM REST server and callback handler (described above) to coordinate process execution.

Conclusion

The simplicity and extensibility of JBoss jBPM make it fairly simple to implement additional functionality that is not part of the jBPM distribution. This significantly increases the reach of jBPM-based solutions.

Acknowledgements

I am thankful to my NAVTEQ colleagues, especially Catalin Capota, for discussing implementation approaches and helping to prototype the solution.


1 In operating system terms

2 The existing jBPM/ESB integration, provided as part of the JBoss SOA platform, is based on this approach.

3 See Edgar Ankiewsky's article or this article on http://www.mastertheboss.com for examples of exposing jBPM client APIs using REST

4 http://www.jboss.org/resteasy/

5 Compare to an ordinary Java program, that can have one or more class definitions (process definitions), each of which can have multiple objects (process instances).

6 Interfaces in Listing 1 are expressed as JAX-RS annotations.

7 When the script is invoked using the ProcessBuilder class, the output of the script is redirected to the invoker class. As a result, if the invoker class completes while the script still has an "output pipe" open, the script will be terminated. To avoid this situation in a Windows script, for example, make sure that you have "@ECHO OFF" at the beginning of the script.

8 Additional support can be implemented for accessing process/token parameters during script execution.

9 Here I am intentionally trying to avoid the word subprocess, which often has the connotation of being a part of the main process. A typical usage of subprocesses is to “… allow the end-to-end process to be described at multiple levels of detail”. See article. I am, on the other hand, talking here about independently developed processes (process services) that could be reused by different higher-level processes.

10 http://www.infoq.com/articles/jboss-esb-jbpm.
