BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage News Git-for-Data, Version-Controlled Database Dolt Gets PostgreSQL-Flavor

Git-for-Data, Version-Controlled Database Dolt Gets PostgreSQL-Flavor

This item in japanese

DoltgreSQL builds on top of the version-controlled database Dolt to provide Git-like log, diff, branch, and merge functionality for your Postgres database schema and data.

Dolt was born as an SQL database you can clone, fork, branch, and merge just like a Git repository. Using Dolt, application developers can build branch- and merge-workflows for their customers, for example, by sending pull requests to fix mistakes in the data. Similarly, Dolt enables a simple model to change a production database by branching it, applying the changes, then testing it in your staging setup, and eventually deploying back to production.

Since its beginnings, Dolt adopted MySQL's syntax and a command line-oriented paradigm that is surely familiar to Git users.

DoltgreSQL focuses instead on the database server experience, providing a customizable, easy-to-deploy server. Furthermore, the company does not provide command-line support to better align with the general PostgreSQL audience:

DoltgreSQL works by emulating a PostgreSQL server, and converting received commands into an AST that is given to an underlying Dolt server. This enabled us to get something up and running quickly, while leveraging the capabilities and functionality that Dolt already provides.

This approach has the advantage of building these new features on top of Dolt's foundation, leveraging the latter's stability and reliability and reducing development scope and effort.

DoltHub says they investigated distinct approaches, including writing a foreign data wrapper, building a new PostgreSQL storage backend, and even forking PostgreSQL itself. Some of those approaches turned out to be too limited, while others, such as forking PostgreSQL, would have required years of development.

On the negative side, one shortcoming of the emulation-based approach is that you are not running the actual PostgreSQL binary. Instead, as mentioned, DoltgreSQL converts PostgreSQL syntax into its AST representation and runs it within the Dolt layer.

Once you have DoltgreSQL installed, you can connect to it using the psql command line client. To see the status of the database, you can run the query:

select * from dolt_status;

This will list all existing tables and specify whether they are new, staged, and so on. To add a table to the staging area, you run:

call dolt_add('my_table_name');

And commit changes with:

call dolt_commit('-m', 'updated schema');.

The equivalent to git log is select * from dolt_log;.

Doltgres is still experimental and has several limitations, including lack of support on DoltHub and DoltLab, no authentication or user management, limited support for SSL connections, no support for replication, clustering, etc., and more.

While Dolt's "Git for data" value proposition may sound compelling, database expert J. Andrew Rogers observed on Hacker News this goal is not so unlike what multi-version concurrency control (MVCC) has attempted for decades with several major drawbacks. Dolt CEO Tim Sehn highlighted running Dolt against native MySQL shows it is only marginally slower for the sysbench benchmark.

About the Author

Rate this Article

Adoption
Style

BT