InfoQ

Infinispan Interview

Posted by Mark Little on May 20, 2009

Sections
Operations & Infrastructure,
Enterprise Architecture,
Development,
Architecture & Design
Topics
REST ,
SOA ,
Clustering & Caching ,
Cloud Computing ,
Java
Tags
JBoss Cache ,
JSR 107

In this interview we talk with Manik Surtani (MS below), the lead of the JBoss Cache and Infinispan projects.

InfoQ: In a nutshell, what is Infinispan?

MS: Infinispan is an open source data grid platform. It exposes a simple data structure - a Cache - in which you can store objects. While Infinispan can be run in local mode, its real value is in distributed mode, where caches cluster together and expose a large memory heap. This is more powerful than simple replication in that it distributes a fixed number of replicas of each entry, providing resilience to server failure as well as scalability, since the work done to store each entry stays fixed regardless of cluster size.
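To make the "fixed number of replicas" idea concrete, here is a minimal sketch of one common way to pick owners for a key: nodes are placed on a hash ring and each entry is stored on the next few nodes clockwise from the key's hash. The class name `ReplicaPlacementSketch`, the `spread` mixing function, and the ring layout are all assumptions made for illustration; this is not Infinispan's actual hashing or placement code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; not Infinispan's placement algorithm.
class ReplicaPlacementSketch {
    // Ring position -> node name, kept sorted so we can walk clockwise.
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int numOwners;

    ReplicaPlacementSketch(List<String> nodes, int requestedOwners) {
        for (String node : nodes) {
            ring.put(spread(node.hashCode()), node);
        }
        // Never ask for more owners than there are nodes on the ring.
        this.numOwners = Math.min(requestedOwners, ring.size());
    }

    // Simple bit mixer so similar names do not sit next to each other.
    static int spread(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h;
    }

    // Returns the nodes that hold a copy of the key. The list has the same
    // length whether the cluster has 3 nodes or 300, so the work per write
    // does not grow with cluster size.
    List<String> ownersOf(Object key) {
        List<String> owners = new ArrayList<>();
        if (ring.isEmpty()) {
            return owners;
        }
        Integer point = ring.ceilingKey(spread(key.hashCode()));
        if (point == null) {
            point = ring.firstKey(); // wrap around the ring
        }
        while (owners.size() < numOwners) {
            owners.add(ring.get(point));
            point = ring.higherKey(point);
            if (point == null) {
                point = ring.firstKey();
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            nodes.add("node-" + i);
        }
        ReplicaPlacementSketch placement = new ReplicaPlacementSketch(nodes, 2);
        System.out.println(placement.ownersOf("some-key"));
    }
}
```

With simple replication every node would store every entry; here each write touches only `numOwners` nodes, which is the scalability property described in the answer above.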

InfoQ: What does this offer developers?

MS: An easy mechanism to address a very large memory heap. If you have distribution tuned to maintain 1 backup copy of each entry (two copies in total), run a 100-node cluster, and allocate each node a 2GB heap, you can collectively address 100GB from any instance in the grid: 200GB of raw heap, halved by the second copy. And this is all in-memory, and very fast. Infinispan is also JTA compliant, so it plays nicely with ongoing transactions. We also have a powerful new asynchronous API, which gives you all of the guarantees of synchronous network calls along with the parallelism and scalability of asynchronous ones. For example,

 Future<V> f = cache.putAsync(k, v);

allows your thread to block - by calling f.get() - to ensure the network calls succeed, or to carry on and ignore f altogether. More importantly, your thread can go and do something useful in the meantime, then come back later and check whether the network call succeeded by calling f.get(). Think of it as what NIO is to traditional, blocking IO.
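The pattern described above can be tried without a cluster. The sketch below uses a hypothetical `ToyCache` whose `putAsync` runs the write on another thread via `CompletableFuture`; it stands in for the network call and is not Infinispan's implementation. Only the calling pattern (start the put, do other work, then call `get()` on the future) mirrors what the interview describes.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class AsyncPutSketch {

    // Hypothetical stand-in for a clustered cache; NOT Infinispan's Cache.
    static class ToyCache<K, V> {
        private final Map<K, V> store = new ConcurrentHashMap<>();

        // Returns immediately; the simulated "network call" runs elsewhere.
        Future<V> putAsync(K key, V value) {
            return CompletableFuture.supplyAsync(() -> store.put(key, value));
        }

        V get(K key) {
            return store.get(key);
        }
    }

    static int demo() {
        ToyCache<String, Integer> cache = new ToyCache<>();

        // Kick off the write without blocking the calling thread.
        Future<Integer> f = cache.putAsync("k", 1);

        // Do something useful while the write is in flight.
        int useful = 0;
        for (int i = 1; i <= 10; i++) {
            useful += i;
        }

        // Come back later and confirm the write succeeded.
        try {
            f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("put failed", e.getCause());
        }

        // Safe to read now: get() returned, so the write has completed.
        return useful + cache.get("k");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the caller never needs confirmation, it can simply drop `f`; if it needs the synchronous guarantee, it calls `f.get()` straight away. The useful middle ground is the one shown: overlap the write with other work and check the result afterwards.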

InfoQ: What about persistence?

MS: Infinispan exposes a CacheStore interface and several high-performance implementations, including JDBC, filesystem-based, and Amazon S3 CacheStores. CacheStores can be used for “warm starts”, to ensure data in the grid survives complete grid restarts, or simply to overflow to disk if you really do run out of memory.
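As a rough illustration of how a store behind a cache enables warm starts and restart survival, here is a small write-through sketch. The `Store` interface, `MapStore`, and `Cache` classes are hypothetical names invented for this example; the real CacheStore interface is not reproduced here, and an in-memory map stands in for the JDBC, filesystem, or S3 backend.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StoreBackedCacheSketch {

    // Hypothetical persistence contract; NOT Infinispan's CacheStore interface.
    interface Store<K, V> {
        void store(K key, V value);
        V load(K key);
        Map<K, V> loadAll();
    }

    // Stand-in for a durable backend: a map that outlives any one cache instance.
    static class MapStore<K, V> implements Store<K, V> {
        private final Map<K, V> persisted = new ConcurrentHashMap<>();

        public void store(K key, V value) { persisted.put(key, value); }
        public V load(K key) { return persisted.get(key); }
        public Map<K, V> loadAll() { return new HashMap<>(persisted); }
    }

    static class Cache<K, V> {
        private final Map<K, V> memory = new ConcurrentHashMap<>();
        private final Store<K, V> store;

        Cache(Store<K, V> store, boolean warmStart) {
            this.store = store;
            if (warmStart) {
                // Warm start: preload everything the store already holds.
                memory.putAll(store.loadAll());
            }
        }

        void put(K key, V value) {
            memory.put(key, value);
            store.store(key, value); // write-through so the entry survives a restart
        }

        V get(K key) {
            V value = memory.get(key);
            if (value == null) {
                // Cold path: fall back to the store and cache the result.
                value = store.load(key);
                if (value != null) {
                    memory.put(key, value);
                }
            }
            return value;
        }

        int inMemorySize() { return memory.size(); }
    }

    static String demo() {
        Store<String, String> store = new MapStore<>();
        Cache<String, String> before = new Cache<>(store, false);
        before.put("user:1", "alice");

        // Simulate a complete restart: a brand new cache over the same store.
        Cache<String, String> after = new Cache<>(store, true);
        return after.get("user:1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Overflow to disk, the third use mentioned above, would add an eviction step that drops entries from `memory` while leaving them in the store; that part is omitted to keep the sketch short.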

InfoQ: Why do you think this is different to other efforts?

MS: Most open source offerings are far more limited in scope: they are limited to smaller clusters, do not offer data distribution, or do not offer a complete platform. And there is the obvious difference between Infinispan and proprietary offerings. Also, as far as I am aware, the asynchronous API is unique.

InfoQ: And what are your motivations behind the project?

MS: I've been the project lead for JBoss Cache for a few years now. During that time I have seen a lot of demand for an open source data grid platform. The complaints I have always heard are that the commercial ones are too expensive and not, well, open source, and that the open source offerings have always fallen short - whether in API, usability, performance, stability, or scalability - JBoss Cache included. Hence the efforts in building a spiritual successor to JBoss Cache, but with a much wider scope and greater goals.

InfoQ: Is this something that will only be usable within other JBoss projects, or can I use it elsewhere?

MS: All you need is a Java 5 compatible JVM. And being LGPL licensed, it is business and OEM friendly.

InfoQ: What is a data grid?  How does it differ from a cloud, if at all?

MS: From my experience, clouds tend to refer to the on-demand provisioning of computing resources. This would include storage, processors, operating systems, and memory. The current fashion is to use virtualization for this. Data grids are more of a service: a uniform sea of memory, spanning several servers. Typically, data grids would be deployed on top of a cloud.

InfoQ: So when's the right time to use a grid?  And the wrong times?

MS: Any time you find that a database is becoming an unbearable bottleneck - and it usually becomes one pretty quickly as you scale out - use a data grid. :-) Data grids scale very well. In addition, if you use a compute grid to process tasks in parallel, you usually want a data grid superimposed as well, to provision the state for the compute grid to work off. I have seen data grids used for message passing, though, and this is a definite no-no. It can put a lot of unnecessary pressure on the nodes that keys get mapped to. If you need to use a distributed tool for message passing, use JMS. That's what JMS is optimized for.

InfoQ: How does Infinispan relate to JSR-107, and JBoss Cache?

MS: Infinispan's Cache interface tracks the ongoing developments in JSR-107 and is, as such, compliant with the current snapshot of the specification. Infinispan implements all optional parts of JSR-107, including JTA compliance and clustering. Infinispan bears no relationship to JBoss Cache - except in some design features and perhaps a few reusable classes that were copied over. Fundamentally, though, Infinispan is all-new.

InfoQ: So does Infinispan need to run in a cluster?

MS: No. It is a perfectly viable and very high-performance local-mode cache as well. We've implemented state-of-the-art concurrent container algorithms as our core, with minimal use of mutexes such as locks and synchronized blocks. Infinispan performs very well on multi-CPU and multi-core servers under high concurrency. The eviction algorithms are designed to perform well under high concurrency as well.

InfoQ: What else is new and cool on Infinispan's roadmap?

MS: There is a lot of cool stuff coming up, in addition to what I have mentioned above. People should follow the Infinispan roadmap on the project page for more details, but I'll mention two features I think are most exciting.

 

  • We have an NIO-based server module on the roadmap. This will speak 2 protocols - a memcached-compliant RESTful one, and a custom binary one. The first protocol will allow any existing memcached client - in any language or platform - to work with Infinispan, widening Infinispan's appeal beyond just Java. The second, binary protocol will contain additional information such as server cluster topology and the consistent hash function, to allow for “smart clients” which could handle load balancing and failover. We'd provide a Java client for this, and I expect to see more clients come up for other platforms.
  • We also have a powerful Query API on the roadmap. Cached state can optionally be indexed, allowing the entire grid to be searched. This would typically happen in parallel, as each node receives the query, runs it against its locally cached state, and returns results. Yes, it does look like Map/Reduce. :-)
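The scatter-and-gather shape of such a grid-wide query can be sketched in a few lines. Everything here is an assumption made for illustration: each "node" is just a local map, the `query` helper is hypothetical, and a parallel stream stands in for sending the query to every node at once. It says nothing about what the planned Query API will actually look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class GridQuerySketch {

    // "Map" step: every node filters its own local state, in parallel.
    // "Reduce" step: the per-node hits are merged into one result list.
    static <V> List<V> query(List<Map<String, V>> nodes, Predicate<V> filter) {
        return nodes.parallelStream()
                .flatMap(local -> local.values().stream().filter(filter))
                .collect(Collectors.toList());
    }

    static List<Integer> demo() {
        // Three hypothetical nodes, each holding part of the data.
        Map<String, Integer> nodeA = new HashMap<>();
        nodeA.put("a", 5);
        nodeA.put("b", 42);
        Map<String, Integer> nodeB = new HashMap<>();
        nodeB.put("c", 7);
        Map<String, Integer> nodeC = new HashMap<>();
        nodeC.put("d", 77);
        nodeC.put("e", 3);

        List<Integer> hits = new ArrayList<>(
                query(Arrays.asList(nodeA, nodeB, nodeC), v -> v > 10));
        Collections.sort(hits); // map iteration order is unspecified, so sort for a stable result
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the shape is that no node ever ships its full data set anywhere: only matching entries travel back to the caller.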

InfoQ: Thanks for taking the time to talk with us, Manik. More information can be found on the Infinispan project page.

  • This article is part of a featured topic series on SOA
