MoSQL: Synchronizing MongoDB and PostgreSQL Made Easy

by Jonathan Allen on Feb 06, 2013

San Francisco-based Stripe has announced MoSQL, a tool that makes reporting on MongoDB data much easier by live-replicating it into a PostgreSQL database. MoSQL is based on MongoRiver, Stripe's companion library for monitoring MongoDB data updates in near real time.

The purpose of MoSQL is to simulate a traditional RDBMS design wherein reporting and ad hoc queries are performed on a read-only copy of the production data. It is not unusual for this read-only copy to undergo several transformations before being offered to the business analysts, so we aren’t really in uncharted territory.


MongoRiver is a general library for tailing the MongoDB oplog. Written in Ruby, it lets developers watch a MongoDB instance for update operations. There isn’t much documentation on it yet; the GitHub repository is limited to just the source code. MongoRiver is offered under the MIT License.


Built on MongoRiver, MoSQL performs the actual data transformation. It requires a YAML-style mapping file that Stripe refers to as the “Collection Map file”. Creating this file is the only preparation that developers need to make. MoSQL will automatically create the necessary target tables in PostgreSQL.
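
To make the idea concrete, here is a minimal Python sketch of how a mapping of this kind could drive table creation. The dictionary layout, the `create_table_sql` helper, and the `blog_posts` example are assumptions made for illustration; they are not MoSQL's actual file format or code, which reads a YAML file.

```python
# Hypothetical, simplified collection map: one MongoDB collection mapped
# to one SQL table. The layout is an assumption for illustration only.
COLLECTION_MAP = {
    "blog_posts": {
        "table": "blog_posts",
        "columns": [
            # (SQL column, source MongoDB field, SQL type)
            ("id", "_id", "TEXT"),
            ("title", "title", "TEXT"),
            ("created", "created", "DOUBLE PRECISION"),
        ],
    }
}


def create_table_sql(spec):
    """Build a CREATE TABLE statement from one collection's mapping."""
    cols = ",\n".join(
        f'  "{name}" {sql_type}' for name, _source, sql_type in spec["columns"]
    )
    return f'CREATE TABLE "{spec["table"]}" (\n{cols}\n);'


print(create_table_sql(COLLECTION_MAP["blog_posts"]))
```

The point of the sketch is that the mapping carries enough information (column name, source field, SQL type) for the target table to be generated rather than written by hand.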

MoSQL can run in one-time or tailing mode. In one-time mode, enabled by the “skip-tail” flag, it will simply perform an import. In tailing mode it will monitor the aforementioned oplog so that it can keep PostgreSQL in sync. When starting MoSQL you can also force a fresh import, an operation that will drop the current tables and recreate them.
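
The difference between the two modes can be pictured with a small in-memory simulation. This is only a sketch under stated assumptions: the `sync`, `initial_import`, and `apply_oplog_entry` names are hypothetical, updates are simplified to whole-document replacements, and nothing here reflects MoSQL's actual code. The `op`, `o`, and `o2` keys mirror the general shape of MongoDB oplog entries.

```python
def initial_import(source_docs):
    """One-time import: copy every document into a fresh table keyed by _id."""
    return {doc["_id"]: dict(doc) for doc in source_docs}


def apply_oplog_entry(table, entry):
    """Apply one oplog-style entry ("i" insert, "u" update, "d" delete).

    Simplification: an update replaces the whole stored document.
    """
    op = entry["op"]
    if op == "i":
        table[entry["o"]["_id"]] = dict(entry["o"])
    elif op == "u":
        # "o2" identifies the target document; "o" carries the new contents.
        doc_id = entry["o2"]["_id"]
        table[doc_id] = {"_id": doc_id, **entry["o"]}
    elif op == "d":
        table.pop(entry["o"]["_id"], None)


def sync(source_docs, oplog, skip_tail=False):
    """Import once, then (unless skip_tail is set) replay the oplog."""
    table = initial_import(source_docs)
    if not skip_tail:
        # A real tailer would block here, waiting for new entries to arrive.
        for entry in oplog:
            apply_oplog_entry(table, entry)
    return table


docs = [{"_id": 1, "title": "first"}]
oplog = [
    {"op": "i", "o": {"_id": 2, "title": "second"}},
    {"op": "u", "o2": {"_id": 1}, "o": {"title": "first (edited)"}},
    {"op": "d", "o": {"_id": 2}},
]
print(sync(docs, oplog, skip_tail=True))  # snapshot only
print(sync(docs, oplog))                  # snapshot plus replayed changes
```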

If MoSQL encounters values in the MongoDB database that don't fit within the stated schema (e.g. a floating-point value in an INTEGER field), it will log a warning, ignore the entire object, and continue.
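
The "warn, skip the object, continue" behavior described above can be sketched in a few lines of Python. The `fits` and `to_rows` helpers, the schema dictionary, and the type rules are hypothetical and deliberately simplistic; MoSQL's real checks may differ.

```python
import logging

log = logging.getLogger("sync_sketch")


def fits(value, sql_type):
    """Return True if a Python value is acceptable for the given SQL type."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it for numeric/text types.
        return False
    if sql_type == "INTEGER":
        return isinstance(value, int)
    if sql_type == "DOUBLE PRECISION":
        return isinstance(value, (int, float))
    if sql_type == "TEXT":
        return isinstance(value, str)
    return True  # types this sketch does not know about are not checked


def to_rows(docs, schema):
    """Convert documents to row tuples, skipping any that violate the schema."""
    rows = []
    for doc in docs:
        bad = [f for f, t in schema.items() if f in doc and not fits(doc[f], t)]
        if bad:
            log.warning("skipping %r: fields %s do not fit the schema",
                        doc.get("_id"), bad)
            continue  # ignore the entire object, keep going
        rows.append(tuple(doc.get(f) for f in schema))
    return rows


schema = {"_id": "TEXT", "views": "INTEGER"}
docs = [
    {"_id": "a", "views": 3},
    {"_id": "b", "views": 3.5},  # float in an INTEGER column: skipped
    {"_id": "c", "views": 7},
]
print(to_rows(docs, schema))
```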

If it encounters a MongoDB object with fields not listed in the collection map, it will discard the extra fields, unless :extra_props is set in the :meta hash. If it is, it will collect the unmapped fields, JSON-encode them as a hash, and store the resulting text in the _extra_props column in SQL. It's up to you to do something useful with the JSON. One option is to use plv8 to parse it inside PostgreSQL, or you can just pull the JSON out whole and parse it in application code.
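
As a rough illustration of the extra-properties idea and of the "parse it in application code" option, the following Python sketch splits a document into mapped columns plus a JSON-encoded remainder. The `split_extra_props` helper and the sample document are hypothetical; only the `_extra_props` column name comes from the description above.

```python
import json


def split_extra_props(doc, mapped_fields):
    """Return a row dict: mapped fields as columns, the rest as JSON text."""
    row = {field: doc.get(field) for field in mapped_fields}
    extras = {k: v for k, v in doc.items() if k not in mapped_fields}
    row["_extra_props"] = json.dumps(extras, sort_keys=True)
    return row


doc = {"_id": "a1", "title": "Hello", "tags": ["x", "y"], "draft": True}
row = split_extra_props(doc, ["_id", "title"])

# Application side: pull the JSON out whole and parse it.
extras = json.loads(row["_extra_props"])
print(row["title"], extras["tags"])
```

Storing the leftovers as text keeps the relational schema stable while still preserving fields that were not anticipated when the map was written.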

MoSQL is also offered under the MIT License.
