InfoQ

Deploying a 1 Terabyte Cache using EhCache Server

Posted by Gavin Terrill on Aug 28, 2008 04:49 PM

Community: Architecture, Java
Topics: Clustering & Caching, REST
Tags: JSR 311, Caching, EHcache

Greg Luck, of the EhCache team, announced in early August the availability of SOAP and RESTful APIs for caching. As described in the documentation:

Ehcache now comes with a Cache Server, available as a WAR for most web containers, or as a standalone server. The Cache Server has two APIs: RESTful resource oriented, and SOAP. Both support clients in any programming language.

In a follow-up post, Greg outlines his thoughts on deployment options for a theoretical 1 terabyte cache:

The largest ehcache single instances run at around 20GB in memory. The largest disk stores run at 100GB each. Add nodes together, with cache data partitioned across them, to get larger sizes. 50 nodes at 20GB gets you to 1 Terabyte.

The first, and simplest, approach involves setting up several nodes running ehcache server and having the client determine which server to use based on an object's hashcode:

String[] cacheservers = new String[]{
    "cacheserver0.company.com", "cacheserver1.company.com", "cacheserver2.company.com",
    "cacheserver3.company.com", "cacheserver4.company.com", "cacheserver5.company.com"};
Object key = "123231";
int hash = Math.abs(key.hashCode());
int cacheserverIndex = hash % cacheservers.length;
String cacheserver = cacheservers[cacheserverIndex];
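One caveat with the snippet above: Math.abs(Integer.MIN_VALUE) returns Integer.MIN_VALUE, so for a key whose hash code is exactly that value the computed index is negative (-2 with six servers) and the array lookup fails. The following is a minimal sketch of one way to avoid this, using Math.floorMod (Java 8 and later). The class and method names are hypothetical and are not part of Greg's post or of Ehcache.

```java
public class CacheServerPicker {
    // Same illustrative host names as in the snippet above.
    public static final String[] CACHE_SERVERS = {
        "cacheserver0.company.com", "cacheserver1.company.com", "cacheserver2.company.com",
        "cacheserver3.company.com", "cacheserver4.company.com", "cacheserver5.company.com"};

    // floorMod returns a value in [0, serverCount) for any int hash,
    // including Integer.MIN_VALUE, when serverCount is positive.
    public static int indexFor(int hash, int serverCount) {
        return Math.floorMod(hash, serverCount);
    }

    // Picks the server responsible for a key from the key's hash code.
    public static String serverFor(Object key) {
        return CACHE_SERVERS[indexFor(key.hashCode(), CACHE_SERVERS.length)];
    }

    public static void main(String[] args) {
        System.out.println(serverFor("123231"));
        System.out.println(serverFor(Integer.valueOf(Integer.MIN_VALUE)));
    }
}
```

As with the original approach, changing the number of servers remaps most keys, so this only suits a fixed set of nodes.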

To support redundancy, a load balancer is introduced, and each node runs two ehcache server instances, with replication between them enabled using the existing distributed caching options (RMI or JGroups). In this approach, clients would still determine their servers using the hashcode, but now failures are handled transparently behind the virtual IP assigned by the load balancer.

The third option Greg describes involves moving the responsibility for routing requests to the load balancer.

The RESTful version of the EhCache Server is based on Jersey, the JSR 311 reference implementation. Paul Sandoz, one of the Jersey developers, discussed how the Jersey client API could be used to access the cache, creating and retrieving a sample XML document:

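import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.WebResource;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Node;

// creating a node
String xmlDocument = "...";
Client c = Client.create();
WebResource r = c.resource("http://localhost:8080/ehcache/rest/sampleCache2/2");
r.type("application/xml").put(xmlDocument);
// retrieving a node
Node n = r.accept("application/xml").get(DOMSource.class).getNode();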
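The sample address above appears to follow the pattern /ehcache/rest/{cacheName}/{key}. Assuming that layout holds (it is inferred here from the single example URL, not taken from the Ehcache documentation), a client could build the resource address for any cache element with the JDK alone. This is a rough sketch; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CacheResourceUri {
    // Assumed layout, inferred from the sample URL: /ehcache/rest/{cacheName}/{key}
    public static URI elementUri(String host, int port, String cacheName, String key) {
        String path = "/ehcache/rest/" + cacheName + "/" + key;
        try {
            // The multi-argument URI constructor quotes characters that are
            // illegal in a path, such as spaces. A '/' inside the key is not
            // escaped, so such keys would need separate handling.
            return new URI("http", null, host, port, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URI for key: " + key, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementUri("localhost", 8080, "sampleCache2", "2"));
    }
}
```

The resulting URI can be handed to any HTTP client, which is the point of the RESTful interface: the caller only needs to speak HTTP.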

So, in what scenarios would a RESTful cache be useful? James Webster reports on seeing an increase in adoption of this architectural style in large enterprises:

An architectural pattern that I have observed a few investment banks implement is a distributed memory cache accessed via a RESTful front-end over HTTP for providing access to market data (e.g. stock prices, interest rate curves, or derived values like volatility surfaces & correlations) and static data (e.g. counterparty details, settlement defaults). The distributed cache can be ‘easily’ scaled to hold massive data sets and the front-end allows the data to be accessed in a technology agnostic fashion, as long as the client can speak HTTP.

As James points out, it will be interesting to see how long it will take commercial vendors (such as Oracle and Gigaspaces) to support RESTful interfaces in their products.

bug by Greg Allen Posted Aug 28, 2008 6:09 PM
Who would want to expose caching as a service? by Richard L. Burton III Posted Aug 29, 2008 10:20 AM
Re: Who would want to expose caching as a service? by Carlos Zuniga Posted Aug 29, 2008 11:12 AM
Re: Who would want to expose caching as a service? by Gavin Terrill Posted Sep 1, 2008 7:30 PM
Re: Who would want to expose caching as a service? by Zubin Wadia Posted Sep 2, 2008 3:28 PM
REST as opensource project for Gigaspaces by Mathias Kluba Posted Sep 2, 2008 4:01 PM
  1. bug

    Aug 28, 2008 6:09 PM by Greg Allen

    int hash = Math.abs(key.hashCode()); is not always positive. https://www.blogger.com/comment.g?blogID=33967480&postID=115843269502969349

  2. Who would want to expose caching as a service?

    Aug 29, 2008 10:20 AM by Richard L. Burton III

    Why would vendors like Oracle or Gigaspaces want to expose caching via SOAP or REST? Caching is not a service; exposing such a 'service' enterprise wide would only lead to problems down the road. Best Regards, Richard L. Burton III

  3. Re: Who would want to expose caching as a service?

    Aug 29, 2008 11:12 AM by Carlos Zuniga

    They are not saying that caching would be provided as a service by Oracle or Gigaspaces. They are merely pointing out that the services that are provided by these vendors might have RESTful APIs for their clients, in the future. RTFA!
    Carlos

  4. Re: Who would want to expose caching as a service?

    Sep 1, 2008 7:30 PM by Gavin Terrill

    "Caching is not a service"
    Mark Nottingham's "Leveraging the Web for Services at Yahoo!" talk (QCon London 2007) challenged my idea of what caching could be used for. He explains how HTTP-based caching using Squid has been used to integrate the many Yahoo! properties.
    "exposing such a 'service' enterprise wide would only lead to problems down the road"
    I'd be interested in hearing about the problems you foresee.

  5. Re: Who would want to expose caching as a service?

    Sep 2, 2008 3:28 PM by Zubin Wadia

    I second Gavin's request - what's wrong with caching as a service?

  6. REST as opensource project for Gigaspaces

    Sep 2, 2008 4:01 PM by Mathias Kluba

    http://www.openspaces.org/display/RES/REST+API
