DataMapper Reaches 1.0 Milestone
DataMapper, an Object Relational Mapper (ORM) for Ruby, recently reached a milestone: 1.0 status. The release was announced during RailsConf 2010, coinciding with Dirkjan Bussink's presentation on the subject at the conference.
DataMapper (DM) is a community project that has been maturing for the past couple of years. DM offers an ORM which is fast, thread-safe and feature-rich. InfoQ had the opportunity to speak with Dan Kubb, lead developer on the DataMapper project.
DataMapper was released as version 1.0, a significant leap from the previous release of 0.10. Dan offered some insight about this milestone:
The choice to go with 1.0 was pretty difficult. On one hand you want 1.0 to be perfect and done, but on the other hand you realize no software is ever finished nor perfect, and you use other criteria to make the decision. In our case, we decided that 1.0 meant that the API was stable, meaning that there aren't any planned changes for the public API that everyone uses, and the "semipublic" API that plugin authors use. There may be additions, but the main thing is that code written to work with the 1.0 API should work with only minor changes until 2.0 is released. When DM was new (just over 2 years ago), the API was in constant flux, but over the last 6-12 months things have settled down and we are really happy with how the API works now.
In reality DataMapper 1.0 is probably where most Ruby open source projects are when they hit version 2.0 or 3.0. We have a tremendously active community behind DM, with about 150 DM related gems and over a thousand developers who have committed to one or more of those gems. There are about 40 adapters that allow DM to use different storage engines to persist the data. We've got the usual RDBMS adapters you'd expect, NoSQL systems and even some web services like Salesforce.
DataMapper's design allows us to decouple a lot of the storage concerns from the front-end API that you interact with. It provides some basic predicates for matching objects, but adapter authors are free to implement all or some of those that make sense to the storage engine. Most of the official DM plugins make no particular assumptions about the storage engine and should work across the board with all the adapters.
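To make the decoupling Dan describes more concrete, here is a minimal pure-Ruby sketch of a front end that hands a list of predicates to a storage adapter, where each adapter declares which predicates it can evaluate. This is not DataMapper's adapter API: `Condition`, `ToyAdapter`, `InMemoryAdapter` and `supported_predicates` are hypothetical names invented for illustration.

```ruby
# Hypothetical sketch; none of these names come from DataMapper itself.

# A query is a list of conditions: field, predicate, value.
Condition = Struct.new(:field, :predicate, :value)

# Base adapter: the front end only ever calls #read.
class ToyAdapter
  # Predicates this storage engine can evaluate (none by default).
  def supported_predicates
    []
  end

  # Reject queries that use predicates the engine does not implement,
  # then delegate to the engine-specific #fetch.
  def read(conditions)
    unsupported = conditions.map(&:predicate) - supported_predicates
    raise NotImplementedError, "unsupported: #{unsupported.inspect}" unless unsupported.empty?
    fetch(conditions)
  end
end

# An in-memory engine that implements only equality and greater-than.
class InMemoryAdapter < ToyAdapter
  def initialize(records)
    @records = records
  end

  def supported_predicates
    [:eql, :gt]
  end

  private

  # Keep the records for which every condition holds.
  def fetch(conditions)
    @records.select do |record|
      conditions.all? do |c|
        case c.predicate
        when :eql then record[c.field] == c.value
        when :gt  then record[c.field] > c.value
        end
      end
    end
  end
end

adapter = InMemoryAdapter.new([
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
  { id: 3, name: "Ada" }
])

matches = adapter.read([
  Condition.new(:name, :eql, "Ada"),
  Condition.new(:id, :gt, 1)
])
```

Here `matches` contains only the record with id 3, and a query using a predicate the engine does not list (for example a hypothetical `:like`) raises `NotImplementedError` rather than returning wrong results. The calling code never needs to know which storage engine is behind the adapter.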
DataMapper takes a different implementation approach than other ORMs such as ActiveRecord, which is what Rails uses by default. When asked what features a Ruby developer might find compelling compared with other ORMs, Dan explained:
I think there are a lot of features developers will find compelling, but I'll start with the thing most people who have only worked with ActiveRecord notice fairly quickly and that is how DM is declarative by nature. In DM every property and relationship is declared in the model, and that is used as the authoritative source rather than the database. You specify the constraints and behavior in one place and everything is reflected from the model and used by migrations, validations and other things. So for example take this simple case:
property :id, Serial
property :name, String, :length => 1..30, :required => true, :unique => true
Assuming you're using an RDBMS, when you call DataMapper.auto_migrate! a table called "contacts" will be created with two columns. The "id" column will be set as the primary key and will be auto-incrementing. The "name" column will be a CHAR(30), set to NOT NULL, and have a unique index on it. In addition, there will be validations set up that make sure id and name are present, id is an Integer, name is a String, the name length is between 1 and 30 characters long, that name is not nil, and that the name is unique.
In ActiveRecord this requires you to move between multiple files, remembering to keep the DB and the validations in sync as you develop. What happens in a lot of cases is that developers just end up not bothering with all that and specify only the validations, treating their DB as a "dumb" data store. DM makes keeping the two in sync fairly painless. We encourage people to use the strengths of the underlying data store and not treat it as a dumping ground for data.
When I used to develop with AR before coming to DM, I would always forget what fields were in my DB, so I would run plugins like annotate_models to add comments to the top of my models showing the current DB schema. In effect I was duplicating exactly what you declare with DM, except with none of the benefits. I still had to specify all my validations and initial migrations.
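The idea of one declaration feeding both the schema and the validations can be illustrated with a small pure-Ruby toy. It imitates the concept only and is not how DataMapper is implemented: `ToyResource`, `columns` and `errors` are hypothetical names, and the uniqueness check is left out of the validation step because it would need a real data store to query.

```ruby
# Hypothetical toy DSL; it imitates the idea, not DataMapper's internals.
module ToyResource
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def properties
      @properties ||= {}
    end

    # Record each declaration once; schema and validations both read from it.
    def property(name, type, options = {})
      properties[name] = { type: type, options: options }
      attr_accessor name
    end

    # Derive a column description that a schema builder could consume.
    def columns
      properties.map do |name, spec|
        length = spec[:options][:length]
        {
          name: name.to_s,
          type: spec[:type].name,
          max_length: length && length.max,
          not_null: spec[:options].fetch(:required, false),
          unique_index: spec[:options].fetch(:unique, false)
        }
      end
    end
  end

  # Derive validation errors from the same declarations.
  # (Uniqueness is not checked here; that would require a data store.)
  def errors
    self.class.properties.each_with_object([]) do |(name, spec), errs|
      value = public_send(name)
      opts = spec[:options]
      if value.nil?
        errs << "#{name} must not be blank" if opts[:required]
        next
      end
      errs << "#{name} must be a #{spec[:type]}" unless value.is_a?(spec[:type])
      if opts[:length] && value.respond_to?(:length) && !opts[:length].cover?(value.length)
        errs << "#{name} length must be in #{opts[:length]}"
      end
    end
  end
end

class Contact
  include ToyResource
  property :name, String, length: 1..30, required: true, unique: true
end

contact = Contact.new
contact.name = "x" * 31
```

With this toy, `Contact.columns` describes a "name" column with a maximum length of 30, NOT NULL and a unique index, while `contact.errors` reports that the 31-character name is outside `1..30`. Both results come from the single `property` line, which is the point Dan is making about the model being the authoritative source.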
Given the popularity of Ruby on Rails 2.x, and with the completely rewritten Rails 3 on the near horizon, the subject of integrating DM with Rails projects came up. Any developer considering steering away from the “normal” way of doing things needs to weigh the possible benefits of changing from ActiveRecord against how easy the implementation will be. We discussed using DM with Rails, and it turns out to be a very active development effort:
There are quite a few people working with DM and Rails 2 and 3 and it works quite well. The Rails 2.x gem is called rails_datamapper, and the Rails 3 gem is dm-rails. We decided to split the plugins in two rather than supporting 2 and 3 in a single gem because of the rapid development happening in Rails 3, and making sure we stay in sync with it. Several DM developers track Rails 3 edge and ensure dm-rails works perfectly with it.
Our primary focus at the moment is definitely getting DM 1.0 to work with Rails 3 as well as it does with ActiveRecord. The rails-core team has gone to great lengths to make Rails 3 ORM agnostic, and worked with the DataMapper core team to ensure we had the hooks we needed to make everything work properly. In the past, with Rails 2, we had to do a lot of hacks to get things to integrate, but that's no longer the case.
We're really happy with how dm-rails has been shaping up, and in the future some of its "guts" will be factored out and put into a gem which we can use across all web frameworks to provide consistent rake tasks, model (re)loading, etc. This allows the glue to make DM work with each web framework to be simplified even further.
The idea of using DM instead of ActiveRecord in a Rails project raises the question of how easily a developer can adapt to the conventions of an alternative ORM. We wanted to confirm that DM completely replaces ActiveRecord in a Rails project rather than just complementing it:
Yes, that's correct. With DM, you include a module into your existing classes. Most of the AR finders have equivalent finders in DM, and in some cases the equivalent is simpler than with AR. We support all the same validations, and there are over 150 DM related gems available. While the syntax is different in some cases, I'm pretty confident DM can do all of what AR does out of the box with or without plugins.
One of the nice features of DM is how every property and relationship is declared in the model. This seemed to imply that Rails migrations would no longer be needed, but Dan was quick to clarify that migrations are still needed:
This is partly true. Migrations are still required once you have your app in production, because you have to tell the DB how to change itself in a way that preserves information. Things like auto-migrations are destructive in that they drop all the tables and recreate them to match your currently declared models, and auto-upgrading adds new tables, columns and indexes but it's not smart enough to do things like renaming, deleting columns or decomposing a column into more than one column. I'm not entirely convinced it would ever be possible to handle those things in a smart way without being explicit about what you want in a migration. However, I do think migrations can be simplified if you use auto-upgrading as part of your migration process to handle additive changes, while using explicit migrations to handle destructive and ambiguous cases.
With that said, we are considering a way to *generate* migrations that could simplify things even more. It should be possible to diff the models against the database and generate migrations that perform the change. A developer would have to review the migrations to make sure they do what was intended, and deploy them, but writing migrations by hand is probably not something that you'd have to do anymore. So we'd cleanly handle the additive changes, destructive and ambiguous changes all with minimal effort from the developer.
DM introduces the notion of auto-migrations, which give developers some advantages over typical Rails migrations:
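As a rough illustration of the diffing idea, and not a description of anything DataMapper ships, the following sketch compares the columns a model declares with the columns a table currently has. Additive changes become statements that could be applied automatically, while anything that might be a drop or a rename is only flagged for a developer to review. The `plan_migration` helper, the column hashes and the SQL type names are all assumptions made for this example.

```ruby
# Hypothetical sketch: compare declared model columns with existing table columns.
# Both hashes map a column name to a type name.
def plan_migration(table, model_columns, db_columns)
  # Columns present in the model but missing from the table are safe to add.
  additive = (model_columns.keys - db_columns.keys).map do |col|
    "ALTER TABLE #{table} ADD COLUMN #{col} #{model_columns[col]}"
  end

  # A column missing from the model may be a drop or one half of a rename.
  # Only a developer can tell, so flag it instead of acting on it.
  needs_review = (db_columns.keys - model_columns.keys).map do |col|
    "#{table}.#{col} is in the database but not in the model (drop or rename?)"
  end

  { additive: additive, needs_review: needs_review }
end

plan = plan_migration(
  "contacts",
  { "id" => "INTEGER", "full_name" => "VARCHAR(30)", "email" => "VARCHAR(50)" },
  { "id" => "INTEGER", "name" => "VARCHAR(30)" }
)
```

In this example `plan[:additive]` holds two `ADD COLUMN` statements (for `full_name` and `email`), and `plan[:needs_review]` flags `contacts.name`, since the tool cannot know whether `name` was renamed to `full_name` or simply removed. That ambiguity is why a review step remains even when migrations are generated.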
For rapid development it's definitely a huge win. Being able to do TDD and just flipping between your model and specs to add new properties is so much easier than moving between models, migrations and specs, and then breaking your flow to migrate a table down then up to apply each change to the DB.
One of the most attractive aspects of this project for developers should be the abundance of plugins which can be used with DM to add functionality. These plugins provide access to various data stores, including CouchDB, REST storage and Google App Engine. The ability to use DM to access the Google App Engine data store is particularly interesting, allowing developers to create applications running with JRuby on the Google infrastructure. When asked about this, Dan said:
I've been involved sort of on the periphery of the dm-appengine-adapter so I probably can't provide a lot of detail on it. I have spoken with John Woodell, who works at Google on the App Engine project, at a few conferences about the adapter and his challenges which has influenced the development of the DM adapter API.
From what I've heard it works quite well, and supports most of the DM query system. From Google's point of view it's a win over developing their own object mapper for App Engine, which I know was started with a project called Bumble, but as far as I know it never really got any traction. With DM there's a large development community, and lots of plugins and the only challenge is just developing the initial adapter which from what I heard wasn't too difficult. I've certainly developed DM adapters for specific storage engines in a fraction of the time it would take to write an entire object mapper from scratch.
This sort of leads me to one benefit of DM that no one really talks about. As each storage engine is added to DM, feedback is made to the DM core team which is eventually reflected in the next release of the adapter API. So with each adapter that is created, it becomes easier to add more adapters with more capabilities. For example there is currently work on a MongoDB adapter, and it has a few concepts that aren't supported by DM, but will be affecting DM in the future. The developer of this plugin is helping update DM to add support. Things like Embedded Resources in Mongo can be implemented using something called an Embedded Value, and should work just fine with all existing adapters.
DataMapper was not created specifically for Rails developers, so we wanted to know what type of developer DM targets:
DM targets a developer who realizes that in the future most apps will be a combination of relational databases for structured data, and NoSQL for unstructured/variable data, or other use cases RDBMS are not as well suited for. The whole debate pitting RDBMS vs. NoSQL is just silly. There are use cases where one is better than the other and most nontrivial apps are a combination of both; it's not an either/or proposition.