Emmanuel Bernard, the developer behind Hibernate Validator and Hibernate Search, among others, recently announced the birth of Hibernate OGM. The new framework's goal is to provide a common interface for NoSQL datastores using JPA constructs. OGM stands for Object Grid Mapping.
InfoQ caught up with Emmanuel Bernard to talk to him about Hibernate OGM and which NoSQL backing stores the team plans to support. Emmanuel said they started with Infinispan because the two teams were able to collaborate readily, but they plan to support others. Infinispan and Hibernate OGM are both JBoss initiatives. Infinispan was a good first choice because its transactional model is close enough to that of relational databases to bridge to JPA readily.
Hibernate OGM is nascent, but they plan on supporting other NoSQL implementations. For example, the EhCache team plans on being a Hibernate OGM provider, and is working with the Hibernate OGM team to provide enhancements and abstractions to Hibernate OGM as needed. Emmanuel mentioned that there is interest around contributions for MongoDB, CouchDB and Redis. He hopes support for these will get under way soon. He hopes other projects and individuals will pitch in to support key/value stores, document oriented databases, column oriented databases and graph oriented databases.
Like Infinispan, Cassandra and Voldemort let you store Apache Lucene indexes in the datastore. The Hibernate OGM JP-QL engine implementation relies on Lucene and Hibernate Search, so those stores might be a natural choice for early Hibernate OGM adoption as well. NoSQL stores that do not have such support will need a different strategy to implement querying.
InfoQ: How hard will it be for a developer who is familiar with JPA and MySQL to use Hibernate OGM and Infinispan? What is the learning curve going to be?
It is very easy! And that was our goal.
The programmatic model and the semantics are literally the same. We are not talking about a JPA-like API. Hibernate OGM is a full-blown JPA engine. We already support most of the mapping constructs and CRUD operations (including entity hierarchies, associations, etc.). If you need to convert a JPA application using Hibernate Core into one using Hibernate OGM, you need to do the following steps:
- add Hibernate OGM jar and its dependencies in your application
- edit your persistence.xml and change the provider to Hibernate OGM, remove JDBC-like properties (like the JDBC driver, the dialect, the schema generation) and add a pointer to the Infinispan configuration file
- run your app
(Here is an) example of persistence.xml before / after.
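The configuration step can also be sketched in plain Java as a transformation over the persistence-unit properties. This is only an illustration under stated assumptions: the provider class name `org.hibernate.ogm.jpa.HibernateOgmPersistence` and the property key `hibernate.ogm.infinispan.configuration_resourcename` are not given in the interview and may differ between Hibernate OGM versions, and the `OgmPropertiesSketch` class and its methods are hypothetical helpers, not part of any library. In persistence.xml the provider is normally set with the `<provider>` element; the sketch uses the equivalent `javax.persistence.provider` property instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: derives OGM-style properties from JDBC-style ones. */
final class OgmPropertiesSketch {

    // Standard JPA property that selects the persistence provider.
    static final String PROVIDER_KEY = "javax.persistence.provider";

    // ASSUMPTION: provider class name used by Hibernate OGM.
    static final String OGM_PROVIDER = "org.hibernate.ogm.jpa.HibernateOgmPersistence";

    // ASSUMPTION: key that points at the Infinispan configuration file.
    static final String INFINISPAN_CONFIG_KEY =
            "hibernate.ogm.infinispan.configuration_resourcename";

    /** True for the JDBC-like settings the interview says to remove. */
    static boolean isJdbcRelated(String key) {
        return key.startsWith("hibernate.connection.")
                || key.startsWith("javax.persistence.jdbc.")
                || key.equals("hibernate.dialect")
                || key.equals("hibernate.hbm2ddl.auto");
    }

    /** Drops JDBC-like keys, switches the provider, adds the Infinispan pointer. */
    static Map<String, String> toOgm(Map<String, String> original, String infinispanConfig) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : original.entrySet()) {
            if (!isJdbcRelated(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        result.put(PROVIDER_KEY, OGM_PROVIDER);
        result.put(INFINISPAN_CONFIG_KEY, infinispanConfig);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> before = new LinkedHashMap<>();
        before.put(PROVIDER_KEY, "org.hibernate.ejb.HibernatePersistence");
        before.put("hibernate.connection.driver_class", "com.mysql.jdbc.Driver");
        before.put("hibernate.connection.url", "jdbc:mysql://localhost/test");
        before.put("hibernate.dialect", "org.hibernate.dialect.MySQLDialect");
        before.put("hibernate.hbm2ddl.auto", "update");

        // Print the "after" view of the properties.
        toOgm(before, "infinispan.xml")
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The entity classes and the application code that uses the `EntityManager` stay untouched; only the configuration changes.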
That's it. The limiting factor today is JP-QL. Alpha 2 does not have JP-QL support yet and the next version (say alpha3) will have support for simple JP-QL queries. However, if you are familiar with Hibernate Search, you can already use full-text queries (all of it).
The cool part of the project is that we reuse most of Hibernate Core for JPA's CRUD support and this gives us a huge engine maturity. Don't expect weird JPA related bugs or at least expect them in Hibernate Core as well.
InfoQ: When should you use JPA/RDBMS and when should you use JPA/NoSQL? Is there a cookbook or a particular set of use cases where one makes more sense than the other?
I won't pretend to know the answer to this. Frankly, the industry at large is trying to figure this out. That being said, let me try and make a fool of myself. First of all, if a relational database gives you satisfaction in your project(s), stick with it by all means. NoSQL is a set of very different tools that cover situations where current relational database engines fall short. Graph-oriented DBs shine for any graph-related queries (give me the friends of my friends of my friends who live in Paris). Data grids, for example, are used when low latency and transactionality are key and when data size is not humongous. BigTable clones are good when size matters (i.e. a lot of entries amounting to a lot of data).
Emmanuel added that using the JPA programmatic model will be orthogonal to the choice of NoSQL solution. JPA does not fit all NoSQL use cases, but applications built around domain models will work nicely with Hibernate OGM. An obvious use case for JPA/NoSQL is to take load off relational databases. Hibernate OGM is a success if it encourages developers to try and explore NoSQL solutions.
InfoQ: What are some simple queries that might be supported in the short term? Given the mismatch of the relational model to most NoSQL stores how much of JPA can you support?
The mismatch between traditional relational databases and NoSQL stores will show up I think in two big areas:
- in the transaction and recovery model
- in how associated data is stored (and subsequently accessed)
For the transaction differences, I think Hibernate OGM should not try and mask these differences but rather embrace the underlying transactional model and make the user aware of it. Doing otherwise would be a mistake as it would denature each NoSQL solution.
He went on to say that dehydrated entities and associations will probably be different depending on the underlying datastore family. He thinks JPA's relationship model will fit many NoSQL stores quite naturally. The mismatch alluded to in the question is not really a problem in JPA, because JPA can model two associated entities with independent lifecycles, and it can also model embeddable objects or even collections of embedded objects, which resembles a document-oriented model. In Hibernate OGM, the schema is hosted by the domain model and can be decorrelated from the actual object structure so as to fit schemaless datastores.
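To make the idea of "dehydrated" entities and associations more concrete, here is a minimal sketch of one possible layout for a key/value grid, written in plain Java with in-memory maps standing in for the datastore. Everything in it is hypothetical: the `DehydrationSketch` class, the key formats, and the column names are invented for illustration and are not Hibernate OGM's actual storage format, which, as noted above, is expected to vary by datastore family. The sketch shows an embedded object flattened into its owner's tuple and an association between two independent entities stored under its own key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of dehydrating entities into a key/value grid. */
final class DehydrationSketch {

    // Stand-ins for the grid: one map for entity tuples, one for associations.
    private final Map<String, Map<String, Object>> tuples = new HashMap<>();
    private final Map<String, List<String>> associations = new HashMap<>();

    // Invented key format: table name plus primary key.
    static String entityKey(String table, String id) {
        return table + "#" + id;
    }

    // Invented key format: table name plus foreign key column and value.
    static String associationKey(String table, String fkColumn, String fkValue) {
        return table + "#" + fkColumn + "=" + fkValue;
    }

    /** A User with an embedded Address: the address columns live in the user's tuple. */
    void storeUser(String id, String name, String city) {
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("id", id);
        tuple.put("name", name);
        tuple.put("address.city", city); // embeddable flattened into the owner
        tuples.put(entityKey("User", id), tuple);
    }

    /** An Order is an independent entity; its link to the User is stored separately. */
    void storeOrder(String orderId, String userId) {
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("id", orderId);
        tuple.put("user_id", userId);
        tuples.put(entityKey("Order", orderId), tuple);

        // Duplicate the navigation information so that "orders of a user"
        // is a single key lookup instead of a scan over all orders.
        String key = associationKey("Order", "user_id", userId);
        List<String> orderIds = associations.get(key);
        if (orderIds == null) {
            orderIds = new ArrayList<>();
            associations.put(key, orderIds);
        }
        orderIds.add(orderId);
    }

    Map<String, Object> loadEntity(String table, String id) {
        return tuples.get(entityKey(table, id));
    }

    List<String> orderIdsOf(String userId) {
        List<String> ids = associations.get(associationKey("Order", "user_id", userId));
        return ids == null ? Collections.<String>emptyList() : new ArrayList<>(ids);
    }
}
```

The association entry duplicates information already present in each order's `user_id` column, so a real engine would have to keep the two copies consistent on every write; the sketch only covers inserts.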
InfoQ: A common complaint against Google App Engine's JPA support is that it felt a bit forced compared to JPA on RDBMS systems, which is similar in that it is JPA talking to a NoSQL store. How natural will Hibernate OGM feel to developers familiar with JPA and RDBMS? And have you looked at some of the complaints and problems that developers have run into with Google App Engine's JPA support?
I think the JPA pains in Google App Engine are a mix of BigTable storage limitations, query engine limitations and time constraints on the GAE/J team. The guys behind GAE/J are pretty smart and quite a small team for what they have accomplished, and they cannot be on all fronts.
In Hibernate OGM (so far), you will see performance limitations (i.e. excessive key lookups in some situations) more than unsupported features. Of course, our initial support for JP-QL is far from complete and people will have to bear with us for a little while. Our goal is to be natural to JPA developers. That being said, we won't be able to run contrary to the underlying NoSQL engine's abilities and strengths.
InfoQ: How does Hibernate OGM compare to SQLFire?
I won't go into details because I don't know the product well but if I had to summarize:
- it's only for GemFire thus not NoSQL (or even data grid) agnostic
- it's not open source ( end of the story ;) )
When we worked on Hibernate OGM and Infinispan we played with the idea of supporting JDBC as well. We might do it down the road, but I see JPA as a more interesting level of abstraction (associated entities vs embeddable objects, etc). You can see Hibernate OGM as a denormalization engine that keeps duplicated pieces of data consistent for you. That's a huge win to best tailor and optimize your data access patterns. We will offer some declarative ways to denormalize your data. This is something you cannot achieve that naturally at the relational layer.
Since a lot of developers use Hibernate and JPA, NoSQL support seems like a natural addition to the Hibernate framework. Hibernate OGM could further drive the adoption of NoSQL by unifying the interface to NoSQL implementations, but the question will be how well object mapping and JP-QL translate to the various NoSQL implementations.