Growing EAI with Apache Camel

Requirements in IT projects are prone to change, and that includes requirements on integrating with other systems. Being able to quickly respond to such changes can be critical to project success, so the software and development process must enable this. Fortunately, Enterprise Application Integration (EAI) provides us with all the knowledge, technology and best practices to build extensible, maintainable and capable integration solutions in a productive fashion.

However, most integration solutions place us in a dilemma: while they are full of features and can be quite productive for large projects and a demanding environment, they also require big investments up front when it comes to learning the system, deploying it and maintaining it.

For this reason, when faced with simple integration requirements, ad hoc solutions seem very appealing. But they become hard to maintain and counterproductive should integration needs grow. Applying EAI best practices would cure this, but implementing them yourself requires effort and the knowledge to do it correctly. What seems like the path of least resistance at first can later become a dead end.

How then can we be productive when faced with simple as well as complex integration tasks, while avoiding big investments early on? In this article I will argue that Apache Camel offers a solution. I will aim to demonstrate that Camel can meet complex integration challenges by enabling you to leverage EAI best practices, while being easy to pick up and easy to master. All the while, Camel lets you concentrate on what provides business value instead of dealing with the complexities imposed by some frameworks.

I will show this by looking at practical examples of typical integration challenges and seeing how Camel helps us meet them. These examples are presented in the context of an integration solution that starts simply but grows over time as new integration needs arise. Each time I will investigate how Camel is able to meet these demands, primarily from the point of view of managing complexity and staying productive.

I have chosen Apache Camel because, in my opinion, it offers an excellent, lighter-weight alternative to full ESB products such as ServiceMix, Mule ESB, OpenESB and JBossESB. Its closest rival is probably Spring Integration, which is a good option to consider, particularly if your project is already using SpringSource technologies. As you will see, you can also use Camel and Spring together. Gunnar Hillert offers further discussion of the alternatives.

Humble beginnings

Integration often starts simple. For instance, fetch some file from an FTP server and put it in a local file. At this stage the do-it-yourself solution seems very appealing. But let’s look a bit more closely.

The do-it-yourself solution might look something like this:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;

public class FTPFetch {

    public static void main(String[] args) {
        FTPClient ftp = new FTPClient();
        try {
            ftp.connect("host"); // try to connect

            // check that the server accepted the connection
            if (!FTPReply.isPositiveCompletion(ftp.getReplyCode())) {
                ftp.disconnect();
                return;
            }

            if (!ftp.login("camel", "apache")) { // login to server
                ftp.disconnect();
                return;
            }

            ftp.changeWorkingDirectory("folder");

            // open the destination file; try-with-resources closes it even on failure
            try (OutputStream output = new FileOutputStream("data/outbox/file.xml")) {
                ftp.retrieveFile("file.xml", output); // transfer the file
            }

            ftp.logout();
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            // always release the connection, whether or not the transfer succeeded
            if (ftp.isConnected()) {
                try {
                    ftp.disconnect();
                } catch (IOException ioException) {
                    ioException.printStackTrace();
                }
            }
        }
    }
}

This solution uses the FTPClient class from Apache Commons Net. As it is just a client and nothing more, we need to set up an FTP connection and do error handling ourselves. But what if the file on the FTP server changes later? We would then also have to schedule this to run periodically.

Now let’s look at Apache Camel. Camel is an integration framework designed to solve this kind of problem by following EAI best practices. Camel should be viewed as both a toolbox of ready-made integration components and a runtime which can be customized for specific needs by combining them. With Camel, this is how we would solve the problem above:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.main.Main;

public class CamelRunner {
    public static void main(String[] args) throws Exception {
        Main camelMain = new Main();
        camelMain.enableHangupSupport(); // ctrl-c shutdown
        camelMain.addRouteBuilder(new RouteBuilder() {
            @Override
            public void configure() {
                // poll the FTP server for file.xml, waiting 360000 ms between polls
                from("ftp://host/folder?username=camel&password=apache&fileName=file.xml&delay=360000")
                    .to("file:data/outbox");
            }
        });

        camelMain.run(); // Camel will keep running indefinitely
    }
}

Please note the from and to methods. Camel calls this a ‘route’: the path that is traversed by the data from source to destination. Moreover, data is not exchanged in raw form but rather it is wrapped in Messages: containers for the actual data. This is similar to a SOAP envelope, which has sections for a body, attachments and headers.

Message sources and destinations are called ‘endpoints’, and it is through them that Camel receives and sends data. Endpoints are specified with a URI formatted string as seen in the arguments for the from and to methods. Therefore the way we tell Camel what to do is by declaratively creating routes between endpoints, and then registering these routes with Camel.
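To make the anatomy of such an endpoint URI more tangible, here is a minimal sketch that takes the FTP URI from the route above apart using only the JDK. It is an illustration, not Camel's own parser: the class name EndpointUriSketch and its options method are invented for this example. The scheme in front of the colon selects the component, the part after it addresses the resource, and everything after the question mark is a list of options.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: dissects an endpoint URI with the JDK, not with Camel's own parser. */
class EndpointUriSketch {

    /** Returns the options after the '?' as an ordered name-to-value map. */
    static Map<String, String> options(String endpointUri) {
        Map<String, String> options = new LinkedHashMap<>();
        String query = URI.create(endpointUri).getRawQuery();
        if (query == null) {
            return options; // the URI carries no options
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            options.put(name, value);
        }
        return options;
    }

    public static void main(String[] args) {
        String uri = "ftp://host/folder?username=camel&password=apache&fileName=file.xml&delay=360000";
        URI parsed = URI.create(uri);
        System.out.println("component: " + parsed.getScheme());
        System.out.println("host:      " + parsed.getHost());
        System.out.println("path:      " + parsed.getPath());
        System.out.println("options:   " + options(uri));
    }
}
```

Seen this way, the delay option is simply one more name-value pair: it asks the FTP consumer to wait 360000 milliseconds, or six minutes, between polls.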

The rest is just boilerplate which gets reused as more routes are added and is a great deal simpler than talking to an FTP server. Camel will take care of the awkward FTP details and will even poll the server periodically in case the file changes, as it has been set up to keep running indefinitely.

The compactness and clarity of the code comes from the Camel DSL, a Domain Specific Language where the ‘domain’ is EAI. That means that, unlike with other solutions, there is no translation to be made from the EAI problem domain to the Camel application domain: the two are virtually the same. This helps to keep the learning curve gentle and the entry point low in comparison: once you understand your EAI problem, it’s a small step to implement it with Camel.

But the code you write is not the only thing that’s simple: all that is needed to get this running is camel-core.jar and camel-ftp.jar and their dependencies, together just a few MB. This main class can then be run from the command line. No need for an application server with added complexity. In fact, since Camel is so lightweight, it can be embedded just about anywhere. Choosing a do-it-yourself solution on the sole basis that frameworks add a lot of complexity is not valid: Camel is simple to understand, simple to use and simple to run.

Growing complexity

Now suppose more and more systems need to be integrated. We want not only to add integrations, but also to keep the whole maintainable. How would Camel cope with this?

As more connections need to be made, we just add more routes to Camel. These new routes might need to connect via other endpoints such as HTTP, JMS, SMTP, etc. Fortunately Camel’s list of supported endpoints is extensive. What’s great is that each of these represents reusable code that you don’t have to write.

Of course, sooner or later you will need something which is not on the list. The question then becomes: how easily can I plug my own code into Camel? In this case we can use what Camel calls ‘Components’. Components define a contract which when implemented will make your code available as just another endpoint to be called from the DSL.

So now we know we can add more and more routes, connecting with just about any type of protocol whether Camel provides for it out of the box or not. But at some point routes start to get quite numerous and you find you are repeating yourself. We would like to reuse bits of routes, maybe even split the whole solution into separate, coarse-grained parts.

Camel’s strategy for reuse is based on some special, internal endpoints which only Camel can see. Should you need to reuse part of an existing route, it is possible to refactor that route into two, linked by an internal endpoint. Please see below:

Original:

//original
from("ftp://server/path").
     to("xslt:transform.xsl").
         to("http://server2/path");

Refactored:

//receiving from internal endpoint d1
from("direct:d1").
     to("xslt:transform.xsl").
         to("http://server2/path");

//sending to d1
from("ftp://server/path").
     to("direct:d1");

//also sending to d1
from("file://path").
     to("xslt:other-transformation.xsl").
         to("direct:d1");

The connecting endpoint is the one of type ‘direct’. Endpoints of this type are only addressable from within the same Camel context. Another interesting endpoint type is VM. VM endpoints are addressable from another Camel context, provided both contexts run on the same JVM instance.

A Camel context is like a container for your routes. Each time you run Camel, it instantiates a context and looks for routes inside it. So when we run Camel, we are actually running a context instance.

Being able to address routes in other Camel context instances via VM is quite useful. It opens the possibility to break your entire solution into interconnected modules in a more lightweight fashion than, for instance, via JMS.

The picture below shows the various routes now spread between different Camel instances, each separately running on the same JVM instance and addressing each other with a VM endpoint:

[Figure: routes spread across ‘Producer Context1’, ‘Producer Context2’ and ‘Consumer Context’, connected through VM endpoints]

We have decomposed our solution into modules. Now we can develop, deploy and run any other module which also sends to ‘Consumer Context’, independently of ‘Producer Context1’ or ‘Producer Context2’. This is key in order to keep even the largest solution manageable.

At this point it might make sense to use an application server, as it is able to fully exploit modularity. Or maybe you are already using one. A very common approach then is packaging Camel into a WAR file and deploying to Tomcat. But you could also deploy it to a full-blown Java EE application server like JBoss, WebLogic or WebSphere. Other options include an OSGi container or even Google App Engine.

Mastering complexity

Sheer volume is not the only way in which applications can grow. Routes can also grow in complexity: messages may undergo various amounts and types of transformation, filtering, enrichment, routing, etc., in any number of combinations. In order to discuss how Camel can help in that regard, let us consider how we can deal with complex problems in the first place.

Complex problems arise in any field, but the general strategy for solving them is usually the same: divide and conquer. We try to decompose the problem into subproblems that are simpler to solve. These solutions are then combined by reversing the decomposition to yield the total solution.

Through observation one then notices that certain problems keep recurring; through experience one identifies the best solution. What I am talking about are patterns. The EAI patterns have been catalogued by Gregor Hohpe and Bobby Woolf and summarized online.

EAI patterns can be very simple in nature, often representing basic operations like some transformation or filtering. Most importantly, they can be combined to form complex solutions. These could well be patterns themselves. This ability stems from the fact that all EAI patterns have the same ‘interface’: messages can get in and out of a pattern. Patterns can then be linked together by taking the output of one pattern and using it as the input of another.

That implies that, broadly speaking, EAI problems are in fact just a combination of patterns. Which means solving an EAI problem, even a complex one, is reduced to finding that combination that meets your requirements. Implementing individual patterns can still hold plenty of complexity of course, but that has been isolated and is manageable.
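The idea that patterns share one ‘interface’ and can therefore be chained can be sketched in a few lines of plain Java. This is a deliberately reduced model, not Camel code: the Pattern interface, its then method and the two sample patterns are hypothetical names chosen for this illustration, and a message is represented by nothing more than a String.

```java
import java.util.Locale;
import java.util.function.Function;

class PatternCompositionSketch {

    /** A pattern reduced to its essence: a message goes in, a message comes out. */
    interface Pattern extends Function<String, String> {

        /** Links two patterns: the output of this one becomes the input of the next. */
        default Pattern then(Pattern next) {
            return message -> next.apply(this.apply(message));
        }
    }

    /** A simple translator: trims the message and lower-cases it. */
    static final Pattern NORMALIZE = message -> message.trim().toLowerCase(Locale.ROOT);

    /** A simple enricher: wraps the message in an envelope. */
    static final Pattern WRAP = message -> "<order>" + message + "</order>";

    /** The combination is itself a pattern and can be combined further. */
    static final Pattern NORMALIZE_AND_WRAP = NORMALIZE.then(WRAP);

    public static void main(String[] args) {
        System.out.println(NORMALIZE_AND_WRAP.apply("  WIDGET  "));
    }
}
```

Because the combined pattern has the same shape as its parts, it can be linked to further patterns in exactly the same way, which is what makes larger solutions a matter of composition.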

Let’s consider an actual pattern as an example. This pattern is called ‘Composed Message Processor’ and is in fact a combination of more basic patterns. It is used when parts of the same message need to be processed by different components. This pattern is not directly implemented by Camel, but its subpatterns are. So this is a good example of how patterns can be combined by the Camel DSL.

Below is the pattern diagram. ‘Splitter’ will split the incoming message into parts, while ‘Router’ will decide which system to send them to: either ‘Widget Inventory’ or ‘Gadget Inventory’. These systems can be thought of as doing some business related processing, then returning the processed messages. ‘Aggregator’ will then combine the results into one outgoing message again.

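[Figure: the ‘Composed Message Processor’ pattern, with ‘Splitter’, ‘Router’, ‘Widget Inventory’, ‘Gadget Inventory’ and ‘Aggregator’]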

Here is the Camel implementation:

from("some:input")
    .setHeader("msgId") // give each message an id based on the full timestamp
        .simple("${date:now:yyyyMMddHHmmssSSS}")
    .split(xpath("//item")) // split the message into parts (msgId is preserved)
        .choice() // let each part be processed by the appropriate bean
            .when(xpath("/item[@type='widget']"))
                .to("bean:WidgetInventory")
            .otherwise()
                .to("bean:GadgetInventory")
        .end()
    .aggregate(new MyAggregationStrategy()) // collect the parts and reassemble
        .header("msgId") // msgId tells us which parts belong together
        .completionTimeout(1000L)
    .to("some:output"); // send the result along

In this implementation, the ‘beans’ are actually POJOs registered under the bean name, for example via JNDI. In this way we can do custom logic in the route. MyAggregationStrategy is also custom code; it specifies how to reassemble the processed message parts.
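MyAggregationStrategy itself is not shown in this article. To give an impression of the kind of logic such a class contains, the sketch below models the pairwise merge in plain Java. It is an assumption-laden stand-in rather than the Camel interface: Merger, CONCATENATE and aggregate are hypothetical names, and message parts are plain Strings. The one detail it borrows from Camel is that there is no previous result when the first part arrives, which is modelled here as null.

```java
import java.util.Arrays;
import java.util.List;

class AggregationSketch {

    /** Hypothetical stand-in for an aggregation strategy: merges one new part into the result so far. */
    interface Merger {
        /** resultSoFar is null when the first part arrives. */
        String merge(String resultSoFar, String newPart);
    }

    /** Reassembles the processed parts by appending them in order of arrival. */
    static final Merger CONCATENATE = (resultSoFar, newPart) ->
            resultSoFar == null ? newPart : resultSoFar + newPart;

    /** Feeds the parts to the merger one at a time, the way an aggregator would as they arrive. */
    static String aggregate(List<String> parts, Merger merger) {
        String result = null;
        for (String part : parts) {
            result = merger.merge(result, part);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> processedParts = Arrays.asList("<item>widget</item>", "<item>gadget</item>");
        System.out.println(aggregate(processedParts, CONCATENATE));
    }
}
```

A real strategy would work on Camel's message objects instead of Strings, but the shape of the logic is the same: handle the first part, then fold every later part into the growing result.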

Note the split, choice, and aggregate methods, which directly correspond to the ‘Splitter’, ‘Router’ and ‘Aggregator’ patterns. The Camel implementation of ‘Composed Message Processor’ is essentially a textual representation of the diagram above. So mostly there is no need to think in terms of ‘Camel’, just in terms of EAI. The result is that Camel stays relatively out of the way, and more emphasis can be placed on understanding the problem and identifying the appropriate patterns. That helps improve the overall quality of the solution.

However, it’s not all goodness. Camel does have its own ‘way of doing things’, its own behind-the-scenes logic. And there will be moments when the unexpected happens and you will be left clueless. But such setbacks should be viewed in light of the time actually saved by using Camel: other frameworks have a steeper learning curve and quirks of their own, while do-it-yourself means you keep reinventing the wheel and don’t get to reuse all the great features Camel has to offer.

No argument about managing complexity and evolving software would be complete without talking about unit tests. Camel can be run embedded in any other class, so it will also run inside a unit test.

Camel also solves one of the most cumbersome things about integration testing: having to set up an FTP or HTTP server in order to be able to run tests. Basically it avoids this because it is possible to alter existing routes at runtime. Here is an example:

import org.apache.camel.builder.AdviceWithRouteBuilder;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.mock.MockEndpoint;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class BasicTest extends CamelTestSupport {

    // This is the route we want to test. Set up with an anonymous class for
    // educational purposes; normally this would be a separate class.
    @Override
    protected RouteBuilder createRouteBuilder() throws Exception {
        return new RouteBuilder() {
            @Override
            public void configure() throws Exception {
                from("ftp://host/data/inbox")
                    .routeId("main")
                    .to("file:data/outbox");
            }
        };
    }

    @Override
    public boolean isUseAdviceWith() {
        // Indicates we are using advice with, which allows us to advise the route
        // before Camel is started
        return true;
    }

    @Test
    public void testMe() throws Exception {
        // alter the original route
        context.getRouteDefinition("main").adviceWith(context,
            new AdviceWithRouteBuilder() {
                @Override
                public void configure() throws Exception {
                    replaceFromWith("direct:input");
                    interceptSendToEndpoint("file:data/outbox")
                        .skipSendToOriginalEndpoint()
                        .to("mock:done");
                }
            });
        context.start();

        // write unit test following AAA (Arrange, Act, Assert)
        String bodyContents = "Hello world";
        MockEndpoint endpoint = getMockEndpoint("mock:done");
        endpoint.expectedMessageCount(1);
        endpoint.expectedBodiesReceived(bodyContents);

        template.sendBody("direct:input", bodyContents);

        assertMockEndpointsSatisfied();
    }
}

AdviceWithRouteBuilder allows for programmatically changing an existing route in its configure method without altering the original code. In this case we have replaced the original source endpoint with one of type ‘direct’, and made sure the original destination gets bypassed in favor of the mock endpoint. In this way, we do not need to have an actual FTP server running in order to test our route, even though it is programmed to pull messages from FTP. The MockEndpoint class then provides a convenient API for setting unit tests up in a declarative way, similar to jMock. Another great feature is the template we use in order to easily send messages to our route under test.

Relying on Camel

One important characteristic of integration solutions is that, as the intermediary through which all other systems are connected, they are by their very nature a single point of failure. As more and more systems get connected or the data gets more important, system failure, data loss and performance degradation become less tolerable, even as the volume increases.

Even though this article is about Camel, a solution that addresses all these challenges is beyond the scope of Camel alone. However, Camel is a central part of such a solution because it contains all the logic for moving the data around. So it is important to know that it can fulfill its duties even in these demanding conditions.

Let’s consider an example to see how these requirements are typically met. In this example there is an incoming JMS queue where messages are placed by external systems. Camel’s job will be to take the messages, do some processing, then deliver them to an outgoing JMS queue. JMS queues can be made persistent as well as highly available separately, so we will focus on Camel, and assume that external systems can ‘always’ put messages on the incoming queue. That is until it fills up, which will happen if Camel cannot pick up and process messages fast enough.

Our aim then is to make Camel resilient to system failures and increase its performance, and we do this by deploying it on more servers, each running a Camel instance connected to the same endpoints. See also the picture below:

[Figure: several servers, each running a Camel instance, connected to the same incoming and outgoing JMS queues]

This is in fact an implementation of another EAI pattern called ‘Competing Consumers’. This pattern has two benefits: first, messages are taken from the queue from multiple instances and get processed in parallel, which improves performance. Second, should one server go down, others are already running and taking messages, so message processing continues automatically and without any intervention, which improves failure resilience.

When one Camel instance takes a message, it is no longer available to others. This ensures messages are processed once. And the workload gets distributed across servers as each server takes messages: faster servers can take messages at a faster rate and automatically take on more of the burden than slower servers. In this way we can achieve the necessary coordination and workload distribution between Camel instances.
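The mechanics of ‘Competing Consumers’ can be tried out without any messaging infrastructure. The following sketch is a single-JVM simulation under stated assumptions: a java.util.concurrent queue stands in for the JMS queue, threads stand in for Camel instances on separate servers, and the class and method names are made up for this example. It shows the property described above: every message is handed to exactly one consumer, however many consumers compete for it.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicIntegerArray;

class CompetingConsumersSketch {

    /** Lets several consumers drain one shared queue; returns, per message, how often it was handled. */
    static int[] run(int messageCount, int consumerCount) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int id = 0; id < messageCount; id++) {
            queue.add(id);
        }
        AtomicIntegerArray handled = new AtomicIntegerArray(messageCount);

        ExecutorService consumers = Executors.newFixedThreadPool(consumerCount);
        for (int c = 0; c < consumerCount; c++) {
            consumers.submit(() -> {
                Integer id;
                // poll() removes the head atomically, so each message goes to exactly one consumer
                while ((id = queue.poll()) != null) {
                    handled.incrementAndGet(id);
                }
            });
        }
        consumers.shutdown();
        try {
            if (!consumers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        int[] result = new int[messageCount];
        for (int id = 0; id < messageCount; id++) {
            result[id] = handled.get(id);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] counts = run(1000, 4);
        boolean exactlyOnce = Arrays.stream(counts).allMatch(n -> n == 1);
        System.out.println("every message handled exactly once: " + exactlyOnce);
    }
}
```

Note that no consumer is assigned a share of the work up front. Each one simply takes the next message when it is ready, which is why faster consumers end up doing more of the work.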

However, there is one element missing: should one server go down while processing a message, another must take up its work; otherwise the message is lost. Similarly, if all nodes go down, messages that are in the middle of processing should not be lost.

For that to happen, we need transactions. With transactions, the JMS queue will wait for an acknowledgement from the instance that took the message before really discarding it. If the server that took the message fails during processing, that acknowledgement will never come, and eventually a rollback will kick in and the message will reappear on the queue and become available again to the instances that are left running. If none are running, the message just stays there until a server eventually gets back online.
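The commit and rollback behaviour described here can be modelled in a few lines. The sketch below is a toy, in-memory illustration and not JMS: TransactedReceiveSketch and its methods are hypothetical, and a server failure is modelled as an exception thrown by the processing code, whereas a real broker would detect the lost session instead. What it does show is the essential rule that a message only disappears for good once processing has been confirmed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

class TransactedReceiveSketch {

    private final Deque<String> queue = new ArrayDeque<>();

    void send(String message) {
        queue.addLast(message);
    }

    int size() {
        return queue.size();
    }

    /**
     * Hands the next message to the processor inside a "transaction".
     * Returns true on commit, false on rollback or when the queue is empty.
     */
    boolean receive(Consumer<String> processor) {
        String message = queue.pollFirst();
        if (message == null) {
            return false; // nothing to process
        }
        try {
            processor.accept(message);
            return true; // commit: the message is discarded for good
        } catch (RuntimeException failure) {
            queue.addFirst(message); // rollback: the message reappears on the queue
            return false;
        }
    }

    public static void main(String[] args) {
        TransactedReceiveSketch incoming = new TransactedReceiveSketch();
        incoming.send("order-1");

        // first consumer fails half-way: the message must not be lost
        boolean first = incoming.receive(m -> { throw new IllegalStateException("server went down"); });
        System.out.println("first attempt committed: " + first + ", messages waiting: " + incoming.size());

        // another consumer picks the same message up and succeeds
        boolean second = incoming.receive(m -> System.out.println("processed " + m));
        System.out.println("second attempt committed: " + second + ", messages waiting: " + incoming.size());
    }
}
```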

For Camel this means that the routes must be made transactional. Camel does not by itself provide for transactions, but instead makes use of 3rd party solutions. That keeps Camel simple, while enabling reuse of proven technology and making it possible to easily switch implementations.

As an example we will configure a Camel context with transactions inside a Spring container. Note that as we are running inside Spring, it’s more practical to use the Spring XML version of the Camel DSL instead of the Java one, even though the latter is great for starting out.

Of course, changing DSLs mid-project means rework, so it’s important to migrate wisely and at an appropriate time. Fortunately, the Spring DSL also runs from a unit test, so unit tests can help to safely make the transition since they will work on routes regardless of which DSL type was used.

<beans> <!-- namespace declarations omitted -->

    <!-- set up connection to the JMS server -->
    <jee:jndi-lookup id="jmsConnectionFactory" jndi-name="ConnectionFactory">
        <jee:environment>
            java.naming.factory.initial=org.jnp.interfaces.NamingContextFactory
            java.naming.factory.url.pkgs=org.jboss.naming.client
            java.naming.provider.url=jnp://localhost:1099
        </jee:environment>
    </jee:jndi-lookup>

    <!-- configuration for the JMS client, including transaction behavior -->
    <bean id="jmsConfig" class="org.apache.camel.component.jms.JmsConfiguration">
        <property name="connectionFactory" ref="jmsConnectionFactory"/>
        <property name="transactionManager" ref="jmsTransactionManager"/>
        <property name="transacted" value="true"/>
        <property name="acknowledgementModeName" value="TRANSACTED"/>
        <property name="cacheLevelName" value="CACHE_NONE"/>
        <property name="transactionTimeout" value="5"/>
    </bean>

    <!-- register the Camel JMS component bean -->
    <bean id="jboss" class="org.apache.camel.component.jms.JmsComponent">
        <property name="configuration" ref="jmsConfig"/>
    </bean>

    <!-- register the Spring transaction manager bean -->
    <bean id="jmsTransactionManager"
          class="org.springframework.jms.connection.JmsTransactionManager">
        <property name="connectionFactory" ref="jmsConnectionFactory"/>
    </bean>

    <camelContext xmlns="http://camel.apache.org/schema/spring">
        <route>
            <from uri="jboss:queue:incoming"/>
            <transacted/>
            <log loggingLevel="INFO" message="processing started."/>
            <!-- complex processing -->
            <to uri="jboss:queue:outgoing?exchangePattern=InOnly"/>
        </route>
    </camelContext>
</beans>

With the <transacted/> tag the route is marked as transactional, so Camel will enlist resources in the transaction through the transaction manager for that route. In case of failure during processing, the transaction manager will make sure the transaction is rolled back and the message reappears in the incoming queue.

However, not every route can be marked transactional because some endpoints, FTP for instance, do not support transactions. Fortunately, Camel has error handling that works even without transactions. Of particular interest is the DeadLetterChannel, an error handler which implements the Dead Letter Channel pattern. This pattern states that messages that could not, or should not, be delivered to their intended destination must be moved to a separate location, so as not to clutter the system. The messaging system then decides what to do with such messages.

For instance, suppose that delivery to an endpoint such as an FTP location fails. If configured on that route, the DeadLetterChannel will first attempt to redeliver the message a few times. If the failure persists then the message is called ‘poison’, meaning nothing useful can be done with it and it should be taken out of the system. By default Camel will then log the error and drop the message. Naturally, this mechanism can be customized: for instance you could specify that Camel should perform at most 3 redelivery attempts, and store the message in a JMS queue if they are exhausted. And yes, the DeadLetterChannel can be combined with transactions, bringing the best of both.
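The redelivery logic behind the Dead Letter Channel pattern is easy to picture in plain Java. The sketch below is only a model of the behaviour just described, not Camel's DeadLetterChannel: the class, its constructor argument and the in-memory dead letter list are assumptions made for this illustration, with the list standing in for something like a JMS queue. One initial delivery is attempted, followed by at most the configured number of redeliveries, after which the message is set aside.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class DeadLetterChannelSketch {

    private final int maximumRedeliveries;
    private final List<String> deadLetters = new ArrayList<>();

    DeadLetterChannelSketch(int maximumRedeliveries) {
        this.maximumRedeliveries = maximumRedeliveries;
    }

    /** Tries to deliver the message and returns the number of attempts that were made. */
    int deliver(String message, Consumer<String> endpoint) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                endpoint.accept(message);
                return attempts; // delivered
            } catch (RuntimeException failure) {
                // one initial attempt plus maximumRedeliveries retries have failed
                if (attempts > maximumRedeliveries) {
                    deadLetters.add(message); // take the poison message out of the flow
                    return attempts;
                }
            }
        }
    }

    List<String> deadLetters() {
        return deadLetters;
    }

    public static void main(String[] args) {
        DeadLetterChannelSketch channel = new DeadLetterChannelSketch(3);

        // an endpoint that never recovers: the message ends up as a dead letter
        int attempts = channel.deliver("report.xml", m -> { throw new IllegalStateException("FTP down"); });
        System.out.println("attempts: " + attempts + ", dead letters: " + channel.deadLetters());

        // an endpoint that fails twice and then recovers: redelivery saves the message
        int[] failuresLeft = {2};
        attempts = channel.deliver("invoice.xml", m -> {
            if (failuresLeft[0]-- > 0) {
                throw new IllegalStateException("temporary glitch");
            }
        });
        System.out.println("attempts: " + attempts + ", dead letters: " + channel.deadLetters());
    }
}
```

A production setup would normally also wait between attempts, which this sketch leaves out to stay short.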

Conclusion

Unmaintainable integration usually begins with simple integration needs which are met in an ad hoc fashion. Such approaches do not scale to more rigorous demands, and making them do so is a considerable investment in itself. Early investment in specialized EAI middleware carries a great risk due to the complexity it often brings, and has a high probability of not paying off.

In this article I have investigated a third option: using Camel in order to keep things simple in the beginning while still being able to meet higher demands later. In this regard I believe Camel has shown itself quite capable: it has an easy learning curve and is lightweight in use and in deployment, so early investments are small. Even in simple cases, learning Camel can actually be a faster path to integration than do-it-yourself solutions. Camel is therefore great as a low-threshold entry to EAI.

I also think that Camel is a good choice for the greater demands that can be placed on an integration solution. With regard to productivity it offers extensibility and reuse, and an amazing integration DSL. Because of the DSL there is almost no complexity overhead in using Camel, so you can focus on the actual problem. When you reach the limits of what can be done with out-of-the-box Camel, it has a plugin infrastructure for Components and POJO invocation, empowering you to take matters into your own hands.

Unit test support with Camel is invaluable. Camel also proved itself as part of a High Availability solution.

On the whole, Camel is a great option for integrations of virtually any size and complexity: you can start out small and simple with minimal upfront investment, confident in the knowledge that should integration needs get more complex Camel can still deliver. In the meantime you can stay productive while reaping the benefits of a mature and complete integration framework.

About the Author

Frans van der Lek is a software engineer with experience in web, mobile and EAI solutions. He is currently employed by Capgemini in the Netherlands where he has worked as a designer, developer and specifier on a number of projects. When not writing or thinking about software he enjoys a good book, a fine cup of coffee and spending time with his family.

 

 
