Parallel Processing Framework JPPF offers Load Balancing, Failover and J2EE Integration

The Java Parallel Processing Framework (JPPF) project team announced the first Release Candidate (RC1) of Version 1.0 of their product last week. JPPF is an open source grid computing framework for running multiple Java applications in parallel in a distributed execution environment.

The JPPF architecture consists of three main components: clients, servers and nodes. The framework takes in a number of tasks, distributes their execution over several nodes and, once all of them have executed, recomposes the results and sends them back to the client.

JPPF also provides services such as load balancing, failover and recovery. A JMX-based administration console allows monitoring of the nodes as well as the executed tasks. Tasks can be cancelled and restarted remotely, or configured to time out at a given date or after a given elapsed time.
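To make the elapsed-time timeout concrete, here is a minimal single-JVM sketch built only on the JDK's java.util.concurrent classes. It is not the JPPF API: the class name TimedTask, the method runWithTimeout and the fallback parameter are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration using plain JDK classes; this is not JPPF code.
final class TimedTask {

    // Runs the task on a worker thread. Returns its result, or the fallback
    // value if the task has not finished within timeoutMillis.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Callable<Integer> slow = () -> { Thread.sleep(5_000); return 1; };
        // The slow task exceeds the 50 ms budget, so the fallback is returned.
        System.out.println(runWithTimeout(slow, 50, -1));
    }
}
```

A grid framework would have to enforce such a limit across the network rather than inside one JVM; the sketch only shows the cancel-on-timeout idea.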

The framework integrates with J2EE application servers through a JCA 1.5 compliant Resource Adapter, which gives the servers access to native grid services. The Resource Adapter implements asynchronous task submission to eliminate any risk of JTA transaction timeouts. JPPF supports the following application servers:

InfoQ caught up with Laurent Cohen, founder of the framework, to discuss JPPF's parallel processing capabilities and the project's roadmap. Laurent said his team is planning the Version 1.0 GA release for next month.

Asked how JPPF compares with the java.util.concurrent API introduced in JDK 1.5, Laurent said that on a single computer with many processors, JPPF will be measurably slower than using the java.util.concurrent classes directly. On a network of machines, however, JPPF is a good solution compared to the JDK concurrency classes. JPPF uses the java.util.concurrent APIs internally in every component of its architecture, and its nodes are configured to process tasks on multiple threads through the ExecutorService interface.
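For readers unfamiliar with ExecutorService, the following sketch shows the single-machine pattern Laurent refers to: submit a batch of tasks to a thread pool and collect the results in order. It uses only JDK classes and does not reflect how JPPF nodes are actually implemented; LocalTaskRunner and runAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration using plain JDK classes; this is not JPPF code.
final class LocalTaskRunner {

    // Runs every task on a fixed-size thread pool and returns the results
    // in the same order as the submitted tasks.
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until all tasks have completed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            final int n = i;
            tasks.add(() -> n * n);
        }
        System.out.println(runAll(tasks, 2)); // prints [1, 4, 9, 16]
    }
}
```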

A key feature to be introduced in Java 7 as part of the Java concurrency API is the fork-join framework, intended for fine-grained parallel processing. InfoQ asked whether JPPF offers any similar features. Here is his response:

This is what JPPF has been all about from the start. JPPF is designed to take in a number of tasks, distribute their execution over the compute nodes, then recompose the results into the appropriate format. In this regard, JPPF can be viewed as an extended, distributed fork-join framework.
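As a point of comparison, a local fork-join computation with the JDK classes looks roughly like the sketch below, which splits an array sum into halves until the pieces are small enough to add directly. It is not taken from the interview and does not use JPPF; the class name SumTask and the THRESHOLD value are assumptions made for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical illustration of the JDK fork-join framework; not JPPF code.
final class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cut-off for splitting
    private final long[] data;
    private final int from;
    private final int to; // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: sum the slice sequentially.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise split in two, fork one half and compute the other here.
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();
        long rightSum = right.compute();
        return left.join() + rightSum; // recompose the partial results
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 50005000
    }
}
```

The split, execute and recompose steps are the same ones the quote describes; the difference is that here every piece runs on threads of one JVM rather than on remote nodes.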

Responding to a question on how JPPF compares with other open source parallel computing frameworks such as GigaSpaces, Terracotta and GridGain, Laurent said:

GridGain is the closest open source framework to JPPF in terms of scope and functionalities. What differentiates them is their implementation architectures: GridGain uses a peer-to-peer topology whereas JPPF uses a multi-tiered architecture to achieve the distributed processing.

Terracotta has a very different philosophy. Their implementation of a distributed JVM is an extraordinary achievement; however, that doesn't make it a grid computing framework per se. Terracotta is great at clustering and provides vital features such as distributed caching, transaction management, replication etc.

JPPF fits well into these frameworks, where JPPF would implement the local topology within single organizations while GigaSpaces or Globus Toolkit manage the larger picture.

Regarding the implementation details of load balancing in JPPF, he explained that:

The applications submit tasks to a centralized JPPF server, grouped as a "task bundle". Based on the nodes' performance profile - computed dynamically - the bundle is then partitioned into several sub-bundles that are sent to each node. The size of each sub-bundle is computed as a function of the past performance of the node. The performance profiles are constantly re-evaluated, causing the framework to automatically adapt to new and changing conditions, including the type and number of tasks sent by the applications, the number of nodes actually registered with the server, etc.
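The quote says only that each sub-bundle's size is "a function of the past performance of the node". One simple function of that kind is a proportional split, sketched below. The proportional rule, the throughput figures and the names BundlePartitioner and partition are all assumptions for illustration, not JPPF's actual algorithm.

```java
import java.util.Arrays;

// Hypothetical illustration; the proportional rule is an assumption,
// not JPPF's documented load-balancing algorithm.
final class BundlePartitioner {

    // Splits bundleSize tasks across nodes in proportion to each node's
    // measured throughput (e.g. tasks per second, all values > 0).
    static int[] partition(int bundleSize, double[] throughput) {
        double total = 0;
        for (double t : throughput) {
            total += t;
        }
        if (throughput.length == 0 || total <= 0) {
            throw new IllegalArgumentException("need at least one node with positive throughput");
        }
        int[] sizes = new int[throughput.length];
        int assigned = 0;
        for (int i = 0; i < sizes.length; i++) {
            sizes[i] = (int) Math.floor(bundleSize * throughput[i] / total);
            assigned += sizes[i];
        }
        // Hand out the tasks lost to rounding, one per node, in node order.
        for (int i = 0; assigned < bundleSize; i = (i + 1) % sizes.length) {
            sizes[i]++;
            assigned++;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Three nodes, the third six times faster than the first.
        System.out.println(Arrays.toString(partition(100, new double[] {10, 30, 60})));
        // prints [10, 30, 60]
    }
}
```

Re-running partition with fresh throughput measurements before each bundle would give the adaptive behavior described above, since a node that slows down automatically receives a smaller share.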

And regarding the failover capability:

Failover is implemented in all JPPF architecture components and relies on three major mechanisms: dynamic topology, failure detection and automatic resubmission. JPPF components can be added to, or removed from, the network at any time and in any order.

Laurent described two typical failure cases and how JPPF fails over in each scenario.

In the first case, a node crashes or suddenly gets disconnected from the server. The server will detect the failure and automatically submit the incomplete work to another node.
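The detect-and-resubmit behavior for a failed node can be sketched in a few lines. In this toy model a node is just a function that may throw to simulate a crash, and the dispatcher hands the work to the next node when that happens. The names ResubmittingDispatcher and dispatch are hypothetical, and real failure detection over a network is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of automatic resubmission; not JPPF code.
final class ResubmittingDispatcher {

    // Tries each node in turn. A node that throws is treated as failed
    // and the same work is resubmitted to the next node in the list.
    static <W, R> R dispatch(W work, List<Function<W, R>> nodes) {
        RuntimeException lastFailure = null;
        for (Function<W, R> node : nodes) {
            try {
                return node.apply(work);
            } catch (RuntimeException failure) {
                lastFailure = failure; // failure detected, fall through to the next node
            }
        }
        throw new IllegalStateException("no node completed the work", lastFailure);
    }

    public static void main(String[] args) {
        List<Function<String, String>> nodes = new ArrayList<>();
        nodes.add(w -> { throw new IllegalStateException("node crashed"); });
        nodes.add(w -> w.toUpperCase());
        // The first node fails, so the second one processes the work.
        System.out.println(dispatch("job", nodes)); // prints JOB
    }
}
```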

In the second case, the client is disconnected from the server. The client will automatically attempt to reconnect to the server, until it succeeds or an optionally specified timeout expires. In the meantime, a client can be configured to connect to many servers, organized in a hierarchy of server connection pools that defines an effective failover strategy. The client will then resubmit the work to the next server in the hierarchy.

Asked how JPPF supports application-level security, he said JPPF will transparently use any security framework that the application already uses. In addition, JPPF nodes have a configurable security policy that defines what client code can and cannot do on the node's host (such as reading from or writing to the file system, opening connections to other servers etc).

Finally, speaking of what's coming in future releases of JPPF, Laurent said that there will be integration with Business Rules Engines (such as ILOG Inc. and JBoss Rules) and Web Services. It will also integrate with tools in the areas of ETL, Business Intelligence (BI) and Data Mining, where distributed processing plays a critical role in querying the large data sets stored in a data warehouse.

The JPPF project was started two years ago on SourceForge.net. It is currently licensed under the Apache License, Version 2.0, and the latest version of JPPF can be downloaded from the SourceForge project website.

Community comments

  • Correction on GigaSpaces and JPPF

    by Nati Shalom,


    JPPF fits well into these frameworks, where JPPF would implement the local topology within single organizations while GigaSpaces or Globus Toolkit manage the larger picture.


    Corrections:

    GigaSpaces provides:
    1. EDG - Enterprise Data Grid and Distributed Caching framework - (same category as Terracotta but a very different implementation approach, with strong emphasis on seamless integration with existing databases).
    2. XAP - eXtreme Application Platform for scaling out stateful applications. It is built on top of our EDG and extends the Spring Framework. It basically makes scaling-out of the entire application as simple as running it on a single server through the virtualization of the entire middleware stack (an SLA-Driven Container, Data, Messaging, and Processing). It also provides a set of abstractions (JMS/MDB,JDBC, Remoting, Declarative Transactions) that enable a seamless transition of tier-based applications to a scale-out model.


    We partner with other Grid solutions such as GridGain, DataSynapse, Platform for handling compute intensive applications. JPPF could fit nicely into those application scenarios.

    You can find more details and case studies on how GigaSpaces works with other Grid solutions in one of my recent posts: Bringing Data Awareness to the Grid (and Amazon EC2)

    As you realize by now we have almost nothing in common with Globus - actually that is the first time that I've seen anyone comparing us.

  • Re: Correction on GigaSpaces and JPPF

    by Nikita Ivanov,


    As you realize by now we have almost nothing in common with Globus - actually that is the first time that I've seen anyone comparing us.

    I would second that. Comparing Globus or its derivatives with GigaSpaces/GridGain/Terracotta is like comparing Cobol to Java...

    Best of luck to JPPF,
    Nikita Ivanov.
    GridGain - Grid Computing Made Simple

  • Re: Correction on GigaSpaces and JPPF

    by Laurent Cohen,

    Hello Nati,

    Thank you for this clarification.
    My intent was never to compare GigaSpaces and Globus, and I appreciate that you've now made it clear.
    The goal was rather to emphasize the scale and scope of usage when compared to that of JPPF.

    Laurent Cohen
    Visit JPPF at www.jppf.org

  • CommonJ?

    by Cameron Purdy,

    How does this compare to the CommonJ specification, which is the basis of the Java EE Executor framework from JSR 236/237?

    Peace,

    Cameron Purdy
    Oracle Coherence: The Java Data Grid

  • Open source grid and cluster computing frameworks

    by Srini Penchikala,

    I found this link on the manageability.org web site about open source grid and cluster computing frameworks written in Java.

    www.manageability.org/blog/stuff/open-source-gr...

    The list includes JPPF and GridGain among other open source grid computing frameworks.
