
MongoDB, Java and Object Relational Mapping

Posted by Brian C. Dilley on Apr 30, 2012

MongoDB Introduction

Today's NoSQL landscape includes a number of capable contenders tackling big data problems in many different ways. One of these contenders is MongoDB, a document-oriented, schema-less storage solution that uses JSON-style documents to represent, query and modify data.

MongoDB is well documented, easy to install and set up, and just as easy to scale. It supports familiar concepts like replication, sharding, indexing and map/reduce. The MongoDB open source community is large and active. MongoDB boasts many large, high-traffic production deployments, including Disney, Craigslist, Foursquare, GitHub and SourceForge. MongoDB is an open source project created and maintained by 10gen.com, a company founded by former DoubleClick executives. In addition to the superb community support (in which 10gen participates), 10gen offers commercial support.


MongoDB and NoSQL: Pitfalls and Strengths

MongoDB has the advantage of being a very approachable NoSQL solution. When I first delved into the NoSQL database world I sampled a number of Java-based solutions and found myself taking a lot of time figuring out what column families were, what Hadoop's relationship to HBase is, and what exactly a ZooKeeper is. While I eventually figured it all out, and found that offerings like Cassandra and HBase are very solid and proven solutions to the NoSQL conundrum, MongoDB was easier to grasp, with fewer concepts to overcome before I could start writing code.

Like any software, MongoDB is obviously not without its flaws. During my time spent with MongoDB I've come across a few things that I would consider "gotchas":

  • Don't treat it like an RDBMS. This may seem obvious, but MongoDB makes it so easy to create and execute complex queries that you may find yourself going overboard and running into performance issues when trying to use it for real-time queries (like I did).
  • MongoDB's indexes are B-trees. If you aren't familiar with what a B-tree is, you should probably look it up. Your query criteria need to cover the indexed fields in the same order in which you created the index; a compound index can only serve queries on a leading prefix of its fields.
  • Design your indexes carefully. This ties into the B-tree bullet point above. My first few indexes contained many fields from the document, "just in case" I needed to query on them. Don't make the same mistake. One of my indexes on a pretty small collection (~10 million records) grew to over 17GB, larger than the collection itself. You probably don't want to index an array field if it's going to contain hundreds or thousands of entries.
  • MongoDB takes a very interesting approach to NoSQL: it uses BSON for storage, JSON for representation, and JavaScript for administration and map/reduce. As a result, odd little issues like this one (a broken equals operator for NumberLong) are bound to crop up now and again while MongoDB matures alongside the more established big data solutions.
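
The B-tree point above is worth dwelling on. Below is a rough, pure-Java sketch of why field order matters in a compound index: such an index behaves like a single sorted structure keyed by its fields concatenated in order, so only queries on a leading prefix of those fields can use a cheap range scan. The `TreeMap` here is only an analogy, not how MongoDB actually stores indexes, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// A compound index on (lastName, firstName) behaves like one sorted
// structure keyed by the concatenation of its fields. A query on the
// leading field (lastName) can use a cheap range scan; a query on a
// trailing field alone (firstName) cannot, and degrades to a full scan.
public class CompoundIndexSketch {

    // entries sorted by lastName first, then firstName
    static final TreeMap<String, String> index = new TreeMap<>();

    static String key(String lastName, String firstName) {
        return lastName + "\u0000" + firstName;
    }

    static void add(String lastName, String firstName, String docId) {
        index.put(key(lastName, firstName), docId);
    }

    // efficient: a range scan over the keys that start with lastName
    static List<String> byLastName(String lastName) {
        return new ArrayList<>(
            index.subMap(lastName + "\u0000", lastName + "\u0001").values());
    }

    // inefficient: the sort order gives no locality for firstName,
    // so every entry must be examined
    static List<String> byFirstNameFullScan(String firstName) {
        List<String> out = new ArrayList<>();
        for (String k : index.keySet()) {
            if (k.endsWith("\u0000" + firstName)) {
                out.add(index.get(k));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        add("Dilley", "Brian", "doc1");
        add("Dilley", "Kate", "doc3");
        add("Smith", "Anna", "doc2");
        System.out.println(byLastName("Dilley"));        // range scan
        System.out.println(byFirstNameFullScan("Anna")); // full scan
    }
}
```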

MongoDB, Console and Drivers

Administration of MongoDB is typically done through a JavaScript console client, which makes complex tasks like data migration and manipulation a breeze; administrative scripts are written entirely in JavaScript. In this article we will show examples of using this console. There is a myriad of production-quality MongoDB clients available today, which the MongoDB community refers to as drivers. Typically there is a driver per programming language, and all of the popular programming languages are covered, as well as some of the not-so-popular. This article shows how to use the Java driver for MongoDB and compares it to using an ORM library (MJORM).

Introducing MJORM: an ORM solution for MongoDB

Among the many interesting problems that the recent trend in NoSQL data stores has brought to the lives of application programmers is Object Relational Mapping. Object Relational Mapping (ORM) refers to the mapping of persisted data, traditionally stored in an RDBMS, to objects used by the application. This makes working with the data more fluid and natural to the language that the application is written in.
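
The mapping itself is simple to picture. A stored document is essentially a map of field names to values, and a mapper's job is to shuttle those values in and out of a typed object. The following hand-rolled sketch illustrates the idea; the names are invented for illustration and this is not MJORM's API.

```java
import java.util.Map;

// A stored document is essentially a map of field names to values; an
// object mapper shuttles values between that map and a typed object.
// This minimal sketch maps one document to one POJO and back.
public class MappingSketch {

    public static class Author {
        String firstName;
        String lastName;
    }

    // "un-marshal": document -> object
    static Author toAuthor(Map<String, Object> doc) {
        Author a = new Author();
        a.firstName = (String) doc.get("firstName");
        a.lastName = (String) doc.get("lastName");
        return a;
    }

    // "marshal": object -> document
    static Map<String, Object> toDocument(Author a) {
        return Map.of("firstName", a.firstName, "lastName", a.lastName);
    }

    public static void main(String[] args) {
        Author a = toAuthor(Map.of("firstName", "Brian", "lastName", "Dilley"));
        System.out.println(a.firstName + " " + a.lastName);
    }
}
```

A real ORM library generalizes exactly this shuttling, driven by mapping metadata (XML files in MJORM's case) instead of hand-written code.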

MongoDB's document-oriented architecture lends itself very well to ORM, as the documents it stores are essentially objects themselves. Unfortunately there aren't many Java ORM libraries available for MongoDB, but there are a few, like morphia (a type-safe Java library for MongoDB) and spring-data (the MongoDB implementation of the Spring Data umbrella project).

These ORM libraries make heavy use of annotations, something that is not an option for me for a number of reasons, the most important being the portability of the annotated objects across many projects. This led me to start the mongo-java-orm, or "MJORM" (pronounced me-yorm), project: a Java ORM for MongoDB. MJORM is MIT licensed and available as a Google Code project. The project is built with Maven, and the Maven artifact repository is currently hosted by the Google Code Subversion server. As of this writing MJORM's latest stable release is version 0.15, and it is being used by a few projects in a production environment.

Getting started with MJORM

Add the MJORM library to your project

Maven users will first add the MJORM maven repository to their pom.xml file to make the MJORM artifacts available to their projects:

<repository>
	<id>mjorm-webdav-maven-repo</id>
	<name>mjorm maven repository</name>
	<url>http://mongo-java-orm.googlecode.com/svn/maven/repo/</url>
	<layout>default</layout>
</repository>

And then the dependency itself:

<dependency>
	<groupId>com.googlecode</groupId>
	<artifactId>mongo-java-orm</artifactId>
	<version>0.15</version>
</dependency>

This will enable you to import and use the MJORM classes in your application. If you're not using Maven, you will need to download the MJORM library manually, along with the dependencies listed in the MJORM pom.xml.

Create your POJOs

Now that the dependencies are in place it's time to start writing code. We'll start with our Java POJOs:


class Author {
	private String firstName;
	private String lastName;
	// ... setters and getters ...
}

class Book {
	private String id;
	private String isbn;
	private String title;
	private String description;
	private Author author;
	// ... setters and getters ...
}

What we've described with this object model is that authors have a first name and a last name, while books have an id, an ISBN, a title, a description and an author.

You may have noticed that the book's id property is a String; this is to accommodate MongoDB's ObjectId type, which is a 12-byte binary value represented as a hex string. While MongoDB requires that every document in a collection have a unique id, it doesn't require that the id be of type ObjectId. Currently MJORM only supports ids of type ObjectId and represents them as Strings.
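
To make that representation concrete, here is a small sketch of rendering a 12-byte value as the familiar 24-character hex string. The bytes below are illustrative (chosen to match the example id used later in this article), not a real generated ObjectId.

```java
// An ObjectId is a 12-byte binary value; its familiar form is simply
// those bytes rendered as a 24-character hex string.
public class ObjectIdHexSketch {

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b)); // two hex digits per byte
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] id = new byte[] {
            0x4f, (byte) 0x96, 0x30, (byte) 0x9f, 0x76, 0x2d,
            (byte) 0xd7, 0x6e, (byte) 0xce, 0x5a, (byte) 0x95, (byte) 0x95
        };
        System.out.println(toHex(id)); // 4f96309f762dd76ece5a9595
    }
}
```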

You also may have noticed that the Author object doesn't have an id. This is because it will be a sub-document of the Book document and is therefore not required to have an id. Remember, MongoDB only requires ids on the root-level documents within a collection.

Create XML mapping files

The next step is creating the XML mapping files that MJORM will use to map MongoDB documents to these objects. We'll create a mapping file per object for this demonstration, but it is perfectly reasonable to put all of your mappings into a single XML file or separate them as you see fit.

Here's Author.mjorm.xml:

<?xml version="1.0"?>
<descriptors>
	<object class="Author">
		<property name="firstName" />
		<property name="lastName" />
	</object>
</descriptors>

And Book.mjorm.xml:

<?xml version="1.0"?>
<descriptors>
	<object class="Book">
		<property name="id" id="true" auto="true" />
		<property name="isbn" />
		<property name="title" />
		<property name="description" />
		<property name="author" />
	</object>
</descriptors>


The mapping files are fairly self-explanatory. The descriptors element is the root element and must be present in every mapping file. Beneath it are object elements that define each class being mapped to a MongoDB document. object elements in turn contain property elements that describe all of the properties on the POJO and how they map to properties on the MongoDB document. At a bare minimum, a property must have a name attribute; this is the name of the property on the POJO and, by default, the name of the property on the MongoDB document. Optionally, a column attribute can be added to specify an alternate property name on the MongoDB document.

A property element with the id attribute is considered the unique identifier for the object; an object may only contain one property element with an id attribute. The auto attribute tells MJORM to auto-generate a value for this property when persisting it.
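
For example, a hypothetical mapping using the optional column attribute to store the POJO's title property under a different document property name might look like this (following the schema shown above; the bookTitle name is invented for illustration):

```xml
<?xml version="1.0"?>
<descriptors>
	<object class="Book">
		<property name="id" id="true" auto="true" />
		<!-- the POJO property "title" is stored as "bookTitle" in MongoDB -->
		<property name="title" column="bookTitle" />
	</object>
</descriptors>
```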

Head over to the MJORM project website on Google Code for a more detailed description of the XML mapping file.

Putting it all together

Now that we've created our data model and our mapping files to tell MJORM how to marshal and un-marshal our POJOs in and out of MongoDB, we can start with the fun stuff. First we open our connection to MongoDB:

Mongo mongo = new Mongo(
	new MongoURI("mongodb://localhost/mjormIsFun")); // 10gen driver

The Mongo object comes from the Java driver written by the guys over at 10gen. This example opens a connection to a local MongoDB instance and uses the mjormIsFun database. Next we create our MJORM ObjectMapper. Currently the only implementation of the ObjectMapper interface available in MJORM is XmlDescriptorObjectMapper, which uses the XML schema described above; future implementations of MJORM may include support for annotations or other configuration mechanisms.

XmlDescriptorObjectMapper objectMapper = new XmlDescriptorObjectMapper();
objectMapper.addXmlObjectDescriptor(new File("Book.mjorm.xml"));
objectMapper.addXmlObjectDescriptor(new File("Author.mjorm.xml"));

We've created our XmlDescriptorObjectMapper and added our mapping files to it. Next we create an instance of the MongoDao object provided by MJORM:

DB db = mongo.getDB("mjormIsFun"); // 10gen driver
MongoDao dao = new MongoDaoImpl(db, objectMapper);

First we obtain a reference to the 10gen driver's DB object. After that we create our MongoDao, providing it the DB object and the ObjectMapper we created earlier. We're ready to start persisting data; let's create a Book and save it to MongoDB:

Book book = new Book();
book.setIsbn("1594743061");
book.setTitle("MongoDB is fun");
book.setDescription("...");

book = dao.createObject("books", book);
System.out.println(book.getId()); // 4f96309f762dd76ece5a9595

First we created the Book object and populated it; after that we called the createObject method on the MongoDao, passing it the collection name "books" and our Book object. MJORM then turns the Book into a DBObject (the underlying object type that 10gen's Java driver uses) using the XML mapping files we created earlier and persists the new document into our "books" collection. MJORM then returns your instance of the Book object, but now with its id property populated. It is important to note that by default MongoDB doesn't require you to create databases or collections before using them; it creates them when needed, which can sometimes lead to confusion. A look at this new Book in the MongoDB console may look similar to this:

> db.books.find({_id:ObjectId("4f96309f762dd76ece5a9595")}).pretty()
{
	"_id":          ObjectId("4f96309f762dd76ece5a9595"),
	"isbn":         "1594743061",
	"title":        "MongoDB is fun",
	"description":  "..."
}


Let's take a look at what that createObject call would look like if we were not using MJORM and instead using 10gen's Java driver directly:

Book book = new Book();
book.setIsbn("1594743061");
book.setTitle("MongoDB is fun");
book.setDescription("...");

DBObject bookObj = BasicDBObjectBuilder.start()
	.add("isbn", 		book.getIsbn())
	.add("title",		book.getTitle())
	.add("description",	book.getDescription())
	.get();

// 'db' is our DB object from earlier
DBCollection col = db.getCollection("books");
col.insert(bookObj);

ObjectId id = ObjectId.class.cast(bookObj.get("_id"));
System.out.println(id.toStringMongod()); // 4f96309f762dd76ece5a9595


We can now query for the object:

Book book = dao.readObject("books", "4f96309f762dd76ece5a9595", Book.class);
System.out.println(book.getTitle()); // "MongoDB is fun"

The readObject method reads a document by its id from the given collection, turns it into the appropriate class (again, using our mapping files from earlier) and returns it.

An astute reader will have noticed that our Book doesn't have an Author, yet it was still persisted. That is due to MongoDB's schema-less nature: documents in a collection aren't required to contain any particular properties (other than the _id property), so creating a Book without an Author is perfectly OK by MongoDB. Let's add an Author to our book and update it:

Author author = new Author();
author.setFirstName("Brian");
author.setLastName("Dilley");

book.setAuthor(author);

dao.updateObject("books", "4f96309f762dd76ece5a9595", book);

And now our Book contains an Author and is persisted in MongoDB. Now let's have a look at our Book in the MongoDB console:

> db.books.find({_id:ObjectId("4f96309f762dd76ece5a9595")}).pretty()
{
	"_id":          ObjectId("4f96309f762dd76ece5a9595"),
	"isbn":         "1594743061",
	"title":        "MongoDB is fun",
	"description":  "..."
	"author": {
	    "firstName": "Brian",
	    "lastName": "Dilley"
	}
}

As you can see, our persisted Book now contains an author. Here's the same thing again without MJORM:

Author author = new Author();
author.setFirstName("Brian");
author.setLastName("Dilley");

book.setAuthor(author);

DBObject bookObj = BasicDBObjectBuilder.start()
	.add("isbn", 		book.getIsbn())
	.add("title",		book.getTitle())
	.add("description",	book.getDescription())
	.push("author")
		.add("firstName", 	author.getFirstName())
		.add("lastName", 	author.getLastName())
		.pop()
	.get();

DBCollection col = db.getCollection("books");
col.update(new BasicDBObject("_id", bookObj.get("_id")), bookObj);


An in-depth description of all the methods provided by the MongoDao is beyond the scope of this article. Anyone interested in using MJORM in their own projects is urged to take a look at the documentation provided by the MJORM project, or at the MongoDao interface it provides.

Conclusion

Hopefully this article has sparked some interest in MongoDB and MJORM. MongoDB is an excellent NoSQL data store with a huge number of awesome features and is sure to be around for a very long time. If you end up using it in a Java project, then you may also consider using the MJORM library for your ORM needs, and if so, any feature requests, bug reports, documentation or patches to the source code would be greatly appreciated!

Author Bio

Brian Dilley is an experienced senior engineer and team leader with over thirteen years of experience, specializing in Java/Java EE, the Spring Framework, and Linux internals and administration. Brian has a lot of experience with ground-level (0+ employee) internet startup companies, getting them to market and building and maintaining their product. He is an expert in IaaS, cloud, PHP, and Linux administration, from procurement through installation and configuration of production and corporate hardware and software infrastructure, including load balancing, database, web, etc.


Great article by Richard Hightower

Brian, great article. I really like how you brought your own personal experience with MongoDB and NoSQL into the mix. It is nice to see that the honeymoon phase of NoSQL is over, and we can start honestly talking about the pitfalls and engineering tradeoffs. Also, as I suspected, MongoDB is a great on-ramp to NoSQL: it is easy enough to get going with and powerful enough for a lot of applications.

I think they hit a nice sweet spot that really resonates with a lot of developers: www.indeed.com/jobtrends/MongoDB.html

Re: Great article by Shi Kafune

The Spring Framework also provides a client for working with MongoDB; it's a good option if you are already using Spring in your project.

It also provides an easy way to mix JPA and NoSQL (document DBs, graph DBs, and so on).

Should we call it ORM? by Roopesh Shenoy

Obviously the framework is providing a sweeter API to deal with objects in MongoDB than the original driver, but the term ORM to describe it seems wrong - after all there is no relational data here!

Maybe use some other name? Database mapper?

Re: Should we call it ORM? by Jean-Baptiste DUSSEAUT

I started a similar project : www.mongolink.org
We called it an Object Document Mapper

By the way, why use XML for mapping declarations? It seems to me that a good API is far more expressive, and does not break under refactoring.

Re: Great article by sreenath venkataramanappa

The Spring API for MongoDB was changing too fast and was not stable enough. We moved on to Morphia, which is a decent API. The sad part about Morphia is that the guy who was heading the framework joined MongoDB.

Re: Should we call it ORM? by brian dilley

I'm actually in the process of adding Annotations support to the library as well. The way MJORM is built it makes it easy to create alternate implementations.

I'm also in the process of introducing a new query API that should prove useful.

Advantages of MJORM by Adams Thomas

Hi,

I'm using morphia, but from the article I still have no idea what the difference is, especially compared to morphia. I would also like to know where it is already being used?

c. Thomas

I made peace with Morphia annotations by Oyku Gencay

I see your point on not wanting to use annotations. I do have the same reservations. On the other hand, the benefits of using Morphia are huge in terms of ease of development and querying, so I'd rather include a couple of annotations like @Id and @Entity and a dependency on Morphia than not use it. Also, if I wanted portability in persistence I'd rather go with a persistence framework like Spring Data, where other abstractions are required at several different layers. It cuts a whole layer of bureaucracy to use both Morphia and JAXB annotations on the same domain objects. This is just like pre-JPA Hibernate vs. annotated JPA.

We are live with MongoDB by sreenath venkataramanappa

Really interesting by Terence Alderson

I've been hearing a lot about these NoSQL repositories lately. How does having a reverse mapping work? Such as having an author with a collection of books... or am I thinking about this too relationally? <-- my word, I know it's wrong. Would you need a separate record in authors?
thanks for the great article.

Re: I made peace with Morphia annotations by Sidharth Gopalan

I see your point on not wanting to use annotations. I do have the same reservations. On the other hand, the benefits of using Morphia are huge in terms of ease of development and querying, so I'd rather include a couple of annotations like @Id and @Entity and a dependency on Morphia than not use it. Also, if I wanted portability in persistence I'd rather go with a persistence framework like Spring Data, where other abstractions are required at several different layers. It cuts a whole layer of bureaucracy to use both Morphia and JAXB annotations on the same domain objects. This is just like pre-JPA Hibernate vs. annotated JPA.


It is very clear that we want to avoid the use of annotations. With Morphia it is not possible. I personally hate using annotations. I somehow feel that when I annotate my domain, it is kind of confined to the DAO layer. I want my domain object to be used all the way up to my GUI layer. For that reason, I'm better off configuring my domain objects with XML.


InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.