JGroups Implementation of Memcached Supports Failover and JMX
Memcached is a distributed memory object caching system used in dynamic web applications to alleviate database load. It is used to speed up dynamic database-driven websites by caching data and objects in memory to reduce the number of times the database must be read. Memcached is based on a hashmap storing key/value pairs. The daemon is written in C, but clients can be written in any language and talk to the daemon via the memcached protocol. But it does not provide any redundancy (e.g. via replication of its hashmap entries); when a server S is stopped or crashes, all key/value pairs hosted by S are lost.
Bela Ban, JGroups and Clustering Team Lead at JBoss, recently wrote a JGroups-based implementation of memcached which allows Java clients to access memcached directly. The implementation is written completely in Java and has few advantages over memcached framework:
- Java clients and PartitionedHashMap (org.jgroups.blocks.PartitionedHashMap) can run in the same address space and therefore don't need to use the memcached protocol to communicate. This allows servlets to access the cache directly without serialization overhead.
- All PartitionedHashMap processes know about each other, and can make decisions as to what to do when a cluster membership change occurs. For example, a server to be stopped can migrate all of the keys it manages to the next server. With memcached, the entries hosted by a server S are lost when S goes down.
- When a cluster membership change occurs (e.g. a new server S is started), then all servers check whether an entry hosted by them should actually be hosted by S. They will move all entries to S. This has the advantage that entries don't have to be re-read from the DB (for example) and inserted into the cache (as in memcached's case), but the cache rebalances itself automatically.
- PartitionedHashMap has a level 1 cache (L1 cache). This allows for caching of data near to where it is really needed. For example, if we have servers A, B, C, D and E and a client adds a (to be heavily accessed) news article to C, then memcached would always redirect every single request for the article to C. So, a client accessing D would always trigger a GET request from D to C and then return the article. JGroups caches the article in D's L1 cache on the first access, so all other clients accessing the article from D would get the cached article, and we can avoid a round trip to C. Note that each entry has an expiration time, which will cause the entry to be removed from the L1 cache on expiry, and the next access would have to fetch it again from C and place it in D's L1 cache. The expiration time is defined by the submitter of the article.
- Since the RPCs for GETs, SETs and REMOVEs use JGroups as transport, the type of transport and the quality of service can be controlled and customized through the underlying XML file defining the transport. For example, we could add compression, or decide to encrypt all RPC traffic. It also allows for use of either UDP (IP multicasting and/or UDP datagrams) or TCP.
- The connector (org.jgroups.blocks.MemcachedConnector) which is responsible for parsing the memcached protocol and invoking requests on PartitionedHashMap, PartitionedHashMap which represents the memcached implementation, the server (org.jgroups.demos.MemcachedServer) and the L1 and L2 caches (org.jgroups.blocks.Cache) can be assembled or replaced at will. Therefore it is simple to customize the JGroups memcached implementation; for example to use a different MemcachedConnector which parses a binary protocol (requiring matching client code of course).
- All management information and operations are exposed via JMX.
The main class to start the JGroups memcached implementation is org.jgroups.demos.MemcachedServer. It creates an L1 cache (if configured), a L2 cache (that's the default hashmap storing all entries), and a MemcachedConnector. The API is very simple and includes the following caching methods:
- public void put(K key, V val): puts a key / value pair into the cache with the default caching time
- public void put(K key, V val, long caching_time): same as above, but here we can explicitly define the caching time. 0 means cache forever, -1 means don't cache and any positive value is the number of milliseconds to cache the entry
- public V get(K key): gets a value for key K
- public void remove(K key): remove a key / value pair from the cache (L2 and L1, if enabled)
InfoQ spoke with Bela Ban about the motivation behind JGroups implementation of memcached. He said that JGroups implementation of memcached allows them to experiment with a distributed cache and see how the various caching strategies fit into JBoss Clustering. He also explained how this new memcached implementation compare with JBossCache caching framework:
We see caching as a continuum between distributing data across multiple nodes (hosts) in a cluster (without redundancy) and fully replicating data (total replication of every data item to every cluster node). Between distribution and total replication, we have buddy replication, which replicates data to a few selected backup nodes. This can be compared to RAID, where RAID 0 has no redundancy (distribution), RAID 0+1 has fully redundancy and RAID 5 has partial redundancy.
Currently, the PartitionedHashMap in JGroups provides distribution, and JBossCache provides total replication and partial replication (with Buddy Replication). The idea is to let the user define K *per data item* they place into the cluster, so K=0 means distribution, but if a node which hosts one or more stripes, crashes then the data is gone, to K=X (where X < N) which is RAID 5, to K=N which is total replication.
The memcached implementation in JGroups is a first step to experiment with K=0, which is pure data distribution without redundancy. This will eventually make it into JBossCache.
Where does memcached implementation fit in JBoss Application Server modules?
It will be part of the Clustering subsystem, provided by JBossCache. Note that our implementation is really written with "Java" clients in mind, so we don't have to use that terribly inefficient memcached protocol, with the marshalling/unmarshalling/copying overhead.
Talking about the typical use cases for using JGroups implementation of memcached, Bela said:
The server side code (e.g. servlets) running in a JBoss or Tomcat cluster, which accesses a DB and needs a cache to speed up things and remove a DB bottleneck. The other use case is similar, but instead of accessing a DB, access is to the file system. For example, an HTML page caching server (Squid comes to mind).
Are there any plans to introduce memcached into JBoss Application Server in the future.
Absolutely. The Data Partitioning feature will allow users to configure caching according to their needs. So having something like a distributed cache is not a new feature in itself, but a matter of JBossCache configuration. The cool thing is that this can be dynamic, so developers can decide which redundancy features (none=distribution, full=total replication or partial) they want per data item they put into JBossCache.
Regarding the future direction of the project in terms of new features, Bela listed the things are on the todo list:
- Provide an eviction policy based on number of bytes rather than number of elements in the cache.
- Store elements received from a remote server as byte buffers rather than objects. On first access, unmarshal the byte buffer into the object. This is used in JBoss' HTTP session replication code and has served us well: no hit on performance due to unneeded unmarshalling.
- Implement the full memcached protocol: currently I only provide GET, GET-MULTI, SET and DELETE. The other stuff (APPEND, PREPEND, CAS) is easy to implement, but I've not done it (yet) because the main use case is for Java clients to reside in the same JVM as our memcached implementation, and therefore the memcached protocol is not needed.
- Provide a better implementation of consistent hashing.
JGroups implementation of memcached and its library dependencies can be downloaded on their sourceforge website. Below is the command to launch the program:
java -jar memcached-jgroups.jar
Bela is looking for the feedback from the community. He said this is an experimental feature, but will become a supported feature of JBossCache, and community input will have a great influence on the direction of this feature.
terribly inefficient memcached protocolthat the rest of the world seems to like so much. If the ascii protocol is so terrible and inefficient, than maybe the new binary protocol? Then you could open this thing up to non-java clients. Then you've got some real comparison to memcached and its versatility. That; would be interesting.
Large Heap Handling
One of the real selling points of the C memcached is the simple and robust slab memory allocator. Does anybody have any info on how Java's GC compares. I could imagine some cases in which it could do better, since it doesn't have to actually grab and free, but in general, I worry about triggering stop the worlds constantly due to cache churn.
Re: Interesting ...
Re: Large Heap Handling
Correct, but that's a feature of Java versus C in general, and not
PartitionedHashMap in particular.
memcached uses something similar to a buddy memory allocation scheme
(), which is great, but they need to make sure they don't waste
memory. For instance, if you always allocate pages of 500 bytes, then
this mechanism is not the best, because the smaller pages won't get
used, and the larger pages are wasted, unless they get fragmented into
I'll take the stance that, unless you know exactly what the avg size of
your app's memory requirements is, the OS does a better job at
allocating memory and in addition you'll benefit from future
improvements in the mem allocator code of your OS.
memcached probably shines when you know exactly what the memory pages
sizes are and you change the src to accommodate that.
I'd also claim that even with GC, this is very useful, because the few
GC cycles are a good tradeoff against having to go to the DB.
Note that we could implement something like memcached's memory
allocator, too: grab direct memory (ByteBuffer.allocateDirect() /
MappedByteBuffer), divide it up into lists of fixed sizes (buddy pools)
and then use those buffers to store data. Direct memory is allocated
outside of the Java heap, so it will never get garbage collected, but
TBH I'm not sure this is a good idea. I mean, we're coming back to Java
versus C here. There's a reason I switched and a big part is garbage
collection and the avoidance of dangling pointers.
I've attached the doc describing memcached's memory allocation strategy.
Hanson Char wrote:
> One of the major benefits of using the native memcached is that unlike
> a JVM, GC (full GC in particular) can be entirely avoided. Wouldn't
> that benefit be lost if a memcached impl is done entirely in Java ?
Re: Large Heap Handling
A few months ago, I had written a small blog entry related to allocators - javaforu.blogspot.com/2008/05/memory-dont-forge...
You might also find this interesting - blogs.sun.com/jonthecollector/entry/our_collectors.
JVMs also have Thread level allocation buffers (TLAB) - conceptually similar to Arena allocators found in Google's TCMalloc and other such Malloc alternatives.