GridGain 2.0 Supports Load Balancing, Work Stealing and Data Partitioning

The latest version of GridGain, a Java-based open source grid computing framework, supports load balancing and data partitioning. GridGain Systems recently released version 2.0 of the framework, which also includes a "work stealing" feature: scheduled jobs queued on overloaded nodes in the grid are "stolen" to run on underloaded nodes.

The load balancing feature supports several policies: Round Robin (the default), Weighted Random, Adaptive, and Affinity (Sticky) load balancing. There is also a custom Affinity load balancing policy for the Oracle Coherence product.
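As a rough illustration of how the first two policies differ, the following is a minimal sketch in plain Java. It does not use the GridGain load balancing SPI; `NodePicker`, `RoundRobinPicker`, and `WeightedRandomPicker` are hypothetical names, and nodes are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface for illustration; not part of GridGain.
interface NodePicker {
    String pick(List<String> nodes);
}

// Round robin: hand out nodes in order, wrapping around at the end.
class RoundRobinPicker implements NodePicker {
    private final AtomicInteger next = new AtomicInteger();

    @Override
    public String pick(List<String> nodes) {
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }
}

// Weighted random: node i is chosen with probability weight(i) / totalWeight.
class WeightedRandomPicker implements NodePicker {
    private final List<Integer> weights; // weights.get(i) belongs to nodes.get(i)
    private final Random random;
    private final int total;

    WeightedRandomPicker(List<Integer> weights, Random random) {
        int sum = 0;
        for (int w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("negative weight");
            }
            sum += w;
        }
        if (sum <= 0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        this.weights = new ArrayList<>(weights);
        this.random = random;
        this.total = sum;
    }

    @Override
    public String pick(List<String> nodes) {
        if (nodes.size() != weights.size()) {
            throw new IllegalArgumentException("one weight per node is required");
        }
        // Draw a point in [0, total) and find the node whose weight band contains it.
        int r = random.nextInt(total);
        for (int i = 0; i < nodes.size(); i++) {
            r -= weights.get(i);
            if (r < 0) {
                return nodes.get(i);
            }
        }
        throw new IllegalStateException("unreachable when weights are consistent");
    }
}
```

Round robin ignores node capacity entirely, which is why a weighted or adaptive policy tends to be preferable when the nodes in a grid differ in speed.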

The work stealing SPI dynamically balances scheduled jobs across the grid by offloading them from an overloaded node to an underloaded one. This feature supports job scheduling as well as the fail-over of jobs executed in the grid. The design of the job stealing feature is based on the Java Fork/Join Framework by Doug Lea, which is planned for Java SE 7. It prevents jobs from getting stuck on a slower node, since a faster node can steal them. The failover SPI makes sure that a stolen job is re-routed to the node that sent the request to steal it.
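The fork/join design mentioned here later shipped in Java SE 7 as `java.util.concurrent.ForkJoinPool`, whose worker threads steal queued subtasks from one another. The sketch below shows that idea at thread level inside a single JVM; it is not GridGain's node-level work stealing SPI, and the `RangeSum` class and its threshold are assumptions made for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums a slice of an array by splitting it into subtasks.
// Idle pool threads steal forked subtasks from the queues of busy threads.
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cut-off for this sketch

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    RangeSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: compute directly.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        RangeSum left = new RangeSum(data, from, mid);
        RangeSum right = new RangeSum(data, mid, to);
        left.fork();                       // queue the left half; another thread may steal it
        long rightSum = right.compute();   // work on the right half in this thread
        return left.join() + rightSum;     // wait for the left half and combine
    }

    // Convenience entry point that owns the pool for the duration of the call.
    static long sum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new RangeSum(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `RangeSum.sum(values)` on a `long[]` returns the same total as a sequential loop; the pool only changes which thread ends up executing each slice.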

When working with large data sets, one must be aware of the amount of data passed over the network between the nodes in the grid. GridGain comes with the following features to optimize working with large data sets:

  • Data Partitioning & Affinity Load Balancing: This feature works by co-locating the computations with the data by partitioning it across grid nodes and sending grid jobs exactly to the nodes where the data is located.
  • Node Segmentation: The grid nodes are segmented into separate groups where each group works on its own designated data set. This is useful in a scenario where some nodes in the grid only submit jobs to the grid (masters), and other groups of nodes only execute these jobs (workers).
  • Intermediate Checkpoints: When dealing with long-running jobs, it is often useful to periodically save intermediate job state. The Checkpoint SPI provides this feature so that jobs don't have to be restarted from scratch if they fail over to another node.
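The co-location idea behind affinity load balancing can be sketched as a deterministic mapping from a data key to a node, so that a job for a given key is always sent to the node holding that key's partition. The example below is a minimal sketch assuming simple hash-modulo partitioning; `AffinityRouter` is a hypothetical name and this is not how GridGain necessarily assigns partitions.

```java
import java.util.ArrayList;
import java.util.List;

// Maps each data key to one node, so the job and its data end up on the same node.
class AffinityRouter {
    private final List<String> nodes;

    AffinityRouter(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    // The same key always yields the same node while the node list is unchanged.
    String nodeFor(Object key) {
        int partition = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(partition);
    }
}
```

A caller would load the data for a key onto `nodeFor(key)` and later route every job for that key to the same node. Plain modulo partitioning remaps most keys when a node joins or leaves, so a production system would likely need a more stable scheme.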

Other new features in GridGain 2.0 release include:

  • Monitoring: Developers can get runtime metrics about all grid nodes using the GridNodeMetrics interface. These metrics include CPU and heap memory information as well as details about active, idle, and rejected jobs in the grid.
  • Resource Injection: GridGain now supports the injection of any resource (such as a JDBC connection) into tasks and jobs using the @GridUserResource annotation. The Spring application context can also be injected using the @GridSpringApplicationContextResource annotation.
  • Grid Job Context: This feature allows developers to attach attributes to a specific job (using the GridJobContext interface) or to all jobs participating in a task (using GridTaskSession).
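Annotation-driven resource injection of this kind generally works by scanning a job's fields via reflection and assigning a resource to each annotated field. The following is a minimal sketch of that mechanism only; `@InjectedResource`, `ResourceInjector`, and `ReportJob` are hypothetical stand-ins, not the GridGain annotations or their actual processing logic.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical marker annotation standing in for something like @GridUserResource.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InjectedResource {
}

class ResourceInjector {
    // Assigns the resource to every annotated field whose type can hold it.
    static void inject(Object job, Object resource) {
        for (Field field : job.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(InjectedResource.class)
                    && field.getType().isInstance(resource)) {
                try {
                    field.setAccessible(true);
                    field.set(job, resource);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("cannot inject into " + field, e);
                }
            }
        }
    }
}

// Example job; the StringBuilder stands in for a real resource such as a JDBC connection.
class ReportJob {
    @InjectedResource
    private StringBuilder log;

    String run() {
        log.append("ran");
        return log.toString();
    }
}
```

The job never constructs or looks up its own resource, so the same job class can run on any node that is able to supply one.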

The GridGain framework integrates with several open source and commercial frameworks and application servers, including JUnit, AspectJ, Spring, JBoss & JGroups, GlassFish, WebLogic, WebSphere, Coherence, Mule, JXInsight, and GigaSpaces.

There is also a distributed JUnit4 GridGain task that can run JUnit4 tests or test suites across the grid to speed up overall test execution. Distributed JUnit testing is configured using the @GridifyTest annotation. This is helpful in integrated development server environments, where an application's entire test suite runs as part of the nightly build and the many unit, integration, and functional tests typically take a long time to complete.

GridGain supports asynchronous notifications, which can be set up using the GridTaskSessionAttributeListener interface. It also handles dependencies between jobs through the GridTaskSession interface, which has a checkpoint feature for managing tasks.
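The listener pattern described here can be sketched as a shared session object that notifies registered listeners whenever an attribute is set, which lets one job react to progress reported by another. This is an in-process sketch under stated assumptions: `AttributeListener` and `TaskSessionSketch` are hypothetical names, and the real GridTaskSession and GridTaskSessionAttributeListener interfaces may have different method signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback; invoked after an attribute has been stored.
interface AttributeListener {
    void onAttributeSet(String key, Object value);
}

// Hypothetical session shared by the jobs of one task.
class TaskSessionSketch {
    private final Map<String, Object> attributes = new HashMap<>();
    private final List<AttributeListener> listeners = new ArrayList<>();

    synchronized void addListener(AttributeListener listener) {
        listeners.add(listener);
    }

    void setAttribute(String key, Object value) {
        List<AttributeListener> snapshot;
        synchronized (this) {
            attributes.put(key, value);
            // Copy the listeners so callbacks run outside the lock.
            snapshot = new ArrayList<>(listeners);
        }
        for (AttributeListener listener : snapshot) {
            listener.onAttributeSet(key, value);
        }
    }

    synchronized Object getAttribute(String key) {
        return attributes.get(key);
    }
}
```

A dependent job could register a listener and start its own work only once another job sets an attribute such as a hypothetical `"stage1.done"` key.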
