Java In-Memory Data Grid Hazelcast 3.0 Supports Continuous Queries and Entry Processing
The latest version of the open source Java in-memory data grid Hazelcast supports entry processing, multi-threaded execution, continuous queries and lazy indexing. Hazelcast 3.0, released during the JavaOne conference two weeks ago, represents the largest change to the product since it was created in 2008; the effort involved rewriting 70-80% of the code. The Hazelcast team has also re-implemented all of the existing distributed objects, such as map, queue and executor service, using the new Service Provider Interface (SPI).
With the new multi-threaded execution, operations are now executed by multiple threads (the thread count is a multiple of the number of processor cores), which helps with scaling up on multicore machines. The new SPI allows developers to build new partitioned services and data structures; Hazelcast's own data structures, such as Map and Queue, are built on top of it.
Other technical features of Hazelcast 3 include:
Entry Processing: A new feature in Hazelcast 3, called Entry Processing, enables fast in-memory operations on a Map without having to worry about locks or concurrency issues. The EntryProcessor is a function that can modify or replace the value of a map entry. It can be applied to a single map entry or to all map entries. Future releases will also add support for selecting target entries using a Predicate, like "search and replace". EntryProcessor can be combined with another new feature in Hazelcast: the in-memory-format setting. By default, the entry value is stored as a byte array (binary format), but when it is stored as an object (object format), the entry processor is applied directly to the object. Another feature of the EntryProcessor is that it automatically gains exclusive access to the map entry, so no extra synchronization is needed to prevent lost updates.
Serialization: As an alternative to the existing serialization methods, Hazelcast offers Portable serialization, which supports multiple versions of the same object type and allows querying and indexing without de-serialization or reflection. There is also IdentifiedDataSerializable, a slightly optimized version of DataSerializable that doesn't use the class name and reflection for de-serialization. Hazelcast also lets developers plug in a custom serializer for their objects.
Continuous Query: This feature enables programmers to register queries that fire whenever matching data is involved in an add, update, remove or evict event. This is done with listeners that register with a query and are notified when there is a change on the Map that matches the query. This would be useful for supporting use cases like Complex Event Processing (CEP) that typically require separate products.
Lazy Indexing: With the Lazy Indexing feature, developers no longer need to define indexes at the very beginning. Indexes can be added to the entries at any point.
Distributed Transactions: Hazelcast 3 supports distributed transactions with two-phase commit. The new transaction API supports both one-phase (local) and two-phase transactions.
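To make the Entry Processing idea from the list above more concrete, here is a minimal single-JVM sketch of it. It does not use Hazelcast's API: the class `EntryProcessingSketch` and its methods `executeOnKey` and `executeOnEntries` are hypothetical names, and the JDK's `ConcurrentHashMap.computeIfPresent` stands in for the exclusive, per-entry access that the EntryProcessor is described as providing.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; this is NOT the Hazelcast API.
public class EntryProcessingSketch<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();

    public void put(K key, V value) { store.put(key, value); }

    public V get(K key) { return store.get(key); }

    // Applies the processor to one existing entry. computeIfPresent runs the
    // function atomically for that key, so concurrent callers cannot lose updates.
    // If the processor returns null, the entry is removed.
    public V executeOnKey(K key, UnaryOperator<V> processor) {
        return store.computeIfPresent(key, (k, v) -> processor.apply(v));
    }

    // Applies the processor to every entry, one key at a time.
    public void executeOnEntries(UnaryOperator<V> processor) {
        for (K key : store.keySet()) {
            store.computeIfPresent(key, (k, v) -> processor.apply(v));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        EntryProcessingSketch<String, Integer> map = new EntryProcessingSketch<>();
        map.put("counter", 0);

        // Four threads increment the same entry without any explicit locking.
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    map.executeOnKey("counter", v -> v + 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(map.get("counter")); // prints 4000
    }
}
```

In a real data grid the processor would be shipped to the node that owns the entry rather than run in the caller's JVM; the sketch only shows why callers need no locks of their own.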
InfoQ spoke with Fuad Malikov, co-founder of Hazelcast, about the features in the new release.
InfoQ: What was the motivation to add the support for distributed transactions in the new version? How does this work in in-memory data grids compared to 2PC transactions in relational databases? Are there any limitations to this feature?
Fuad: We listen to the community a lot. Hazelcast users have been asking for a two-phase commit (2PC) transaction capability. They want to be able to consume an item from one distributed queue, process it and save an entry into another distributed Map, for example. The whole process needs to be transactional so that no unprocessed data can be lost in a node failure.
Hazelcast is an all in-memory solution and by default it relies on multi-node memory replication for durability. This is also true for the 2PC implementation. In the prepare state we replicate the transaction state on multiple nodes.
In the next release Hazelcast will also be able to participate in XA transactions via JCA with other resources such as JMS and JDBC.
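The queue-to-map scenario Fuad describes can be sketched with a bare-bones two-phase commit coordinator. This is an illustrative, single-JVM sketch and not Hazelcast's transaction API: the `Participant` interface and the `QueueTake` and `MapPut` classes are hypothetical, the `healthy` flag exists only to simulate a failed prepare, and the replication of transaction state across nodes mentioned above is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT the Hazelcast transaction API.
public class TwoPhaseCommitSketch {
    /** A resource that votes on, then finalizes or undoes, its staged work. */
    public interface Participant {
        boolean prepare();
        void commit();
        void rollback();
    }

    /** Phase 1 collects votes; phase 2 commits only if every vote was yes. */
    public static boolean execute(List<Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {
                for (Participant q : participants) {
                    q.rollback(); // rollback must be safe to call on anyone
                }
                return false;
            }
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    /** Takes the head of a queue; the item is put back if the transaction aborts. */
    public static final class QueueTake implements Participant {
        private final Deque<String> queue;
        private String taken;

        public QueueTake(Deque<String> queue) {
            this.queue = queue;
            this.taken = queue.pollFirst();
        }

        public String item() { return taken; }
        public boolean prepare() { return taken != null; }
        public void commit() { taken = null; }
        public void rollback() {
            if (taken != null) {
                queue.addFirst(taken);
                taken = null;
            }
        }
    }

    /** Stages a map write that only becomes visible on commit. */
    public static final class MapPut implements Participant {
        private final Map<String, String> map;
        private final String key;
        private final String value;
        private final boolean healthy; // simulation switch: false forces a "no" vote

        public MapPut(Map<String, String> map, String key, String value, boolean healthy) {
            this.map = map;
            this.key = key;
            this.value = value;
            this.healthy = healthy;
        }

        public boolean prepare() { return healthy && key != null; }
        public void commit() { map.put(key, value); }
        public void rollback() { /* nothing was written before commit */ }
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("order-1", "order-2"));
        Map<String, String> results = new HashMap<>();

        QueueTake first = new QueueTake(queue);
        boolean ok = execute(Arrays.asList(first, new MapPut(results, first.item(), "processed", true)));
        System.out.println(ok + " " + queue + " " + results);

        // The map participant refuses to prepare, so the item returns to the queue.
        QueueTake second = new QueueTake(queue);
        boolean ok2 = execute(Arrays.asList(second, new MapPut(results, second.item(), "processed", false)));
        System.out.println(ok2 + " " + queue + " " + results);
    }
}
```

The second run shows the property the users asked for: when one participant cannot prepare, the consumed item is restored and no entry is written, so unprocessed data is not lost.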
InfoQ: Can you explain how the new continuous query feature in Hazelcast v3 works?
Fuad: This is another one of those features for people to realize that an in-memory data grid is much more than a cache. Continuous Query provides the convenience and distributed processing power of the age-old idea of "stored procedures" but it's much more like a cousin of the modern Complex Event Processing (CEP) paradigm. Unlike database stored procedures, continuous query keeps application logic cleanly in the application tier in Java, but it has the positive qualities of being extremely scalable and ensuring that the processing is happening where the data is, making it very fast and efficient.
Our Continuous Query implementation combines Events and our Predicate API. Hazelcast supports EntryListeners that listen for when items are ADDED, UPDATED, REMOVED or EVICTED from a Map. In previous versions it was possible to listen to all entries of the Map or to a particular key. In addition to that, with the Continuous Query feature, you can define a query (Predicate) and the events will be fired only if the updated entry matches the predicate. This way a listener receives a continuous stream of events based on a query.
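The combination of entry events and a predicate can be illustrated with a small, self-contained sketch. It is written against the plain JDK rather than Hazelcast, so every name in it (`ContinuousQuerySketch`, `Listener`, `addEntryListener`) is an assumption made for illustration, eviction is left out, and values are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

// Hypothetical illustration only; this is NOT the Hazelcast API.
public class ContinuousQuerySketch<K, V> {
    public enum EventType { ADDED, UPDATED, REMOVED }

    /** Callback invoked for every change whose value matches the registered query. */
    public interface Listener<K, V> {
        void onEvent(EventType type, K key, V value);
    }

    private static final class Registration<K, V> {
        final Listener<K, V> listener;
        final Predicate<V> query;

        Registration(Listener<K, V> listener, Predicate<V> query) {
            this.listener = listener;
            this.query = query;
        }
    }

    private final Map<K, V> data = new HashMap<>();
    private final List<Registration<K, V>> registrations = new ArrayList<>();

    /** Registers a listener together with the query that filters its events. */
    public synchronized void addEntryListener(Listener<K, V> listener, Predicate<V> query) {
        registrations.add(new Registration<>(listener, query));
    }

    public synchronized void put(K key, V value) {
        Objects.requireNonNull(value, "null values are not supported in this sketch");
        V previous = data.put(key, value);
        fire(previous == null ? EventType.ADDED : EventType.UPDATED, key, value);
    }

    public synchronized void remove(K key) {
        V previous = data.remove(key);
        if (previous != null) {
            fire(EventType.REMOVED, key, previous);
        }
    }

    // Delivers the event only to listeners whose query matches the affected value.
    private void fire(EventType type, K key, V value) {
        for (Registration<K, V> r : registrations) {
            if (r.query.test(value)) {
                r.listener.onEvent(type, key, value);
            }
        }
    }

    public static void main(String[] args) {
        ContinuousQuerySketch<String, Integer> prices = new ContinuousQuerySketch<>();
        prices.addEntryListener(
                (type, key, value) -> System.out.println(type + " " + key + "=" + value),
                value -> value > 100);

        prices.put("a", 50);   // no event: 50 does not match the query
        prices.put("b", 150);  // prints: ADDED b=150
        prices.put("b", 200);  // prints: UPDATED b=200
        prices.remove("b");    // prints: REMOVED b=200
    }
}
```

The listener never sees the entry for "a", which is the point of the feature: filtering happens where the data changes, and the subscriber receives only the stream of events that matches its query.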
InfoQ: What are the new features and enhancements that the developers can expect to see in the next release of Hazelcast?
Fuad: 70-80% of Hazelcast was rewritten to create Hazelcast 3, which enabled us to support some major architectural modularization. One such feature that is really exciting in the next release is the Service Provider Interface (SPI). We modularized the internals of Hazelcast into Networking, Clustering, Partitioning and Services. And the internals are exposed as an SPI so that the community can extend Hazelcast and develop custom distributed data structures and processing services. Another feature will be a portable client protocol. This protocol enables anyone to implement a client in any language. We will be releasing a C++ client and expect the community to implement other clients such as Python and Ruby.
Hazelcast 3 can be downloaded from the product website under the Apache 2 license.