The Database as a Value
At QCon New York 2013, Rich Hickey gave a talk on functional databases. Hickey is well known as the creator of the Clojure programming language and is currently developing Datomic, a functional database. In his talk, Hickey argued that the useful properties of functional languages (data as values and pure functions) are just as useful in the context of databases. Programming with objects, he argued, is very different. An object combines data and logic and is like a machine: it does sequential processing and constantly changes its state. Calling the same method on an object may give a different result every time. This state may change in unpredictable ways, for instance because a different thread also holds a reference to the object and is modifying it. This makes programs built from objects difficult to reason about. The databases that we use today are very much like objects: a database server is a single entity (possibly replicated) that combines constantly changing state with query logic and mutations, and uses transactions to compensate for the lack of immutable values.
Hickey asked the audience: wouldn't it be nice to have all these functional properties for databases as well? In such a world, the entire database would be represented as an immutable value, and queries would simply be functions that take one or more databases as arguments. A database value could be safely handed over between threads, and the same query on the same database value would always produce the same result. Naturally, since a database that can never change is not very useful, a transactor would be used to make changes, or more precisely, to produce new database values based on previous ones. A transactor, given a database value and a number of mutations, returns a new database value. Programs can ask the database for its current value at any time. Datomic is a database with these properties.
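The idea can be illustrated with a minimal Python sketch. This is a toy model, not Datomic's API: the `Fact` and `Database` types and the functions `transact`, `names_of_adults` and `newly_adult` are hypothetical names invented for this example. A database value is modelled as an immutable set of (entity, attribute, value) facts, a transaction is a function that returns a new value, and queries are pure functions over one or two database values.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical toy model: a fact is (entity, attribute, value).
Fact = Tuple[str, str, object]
# A database value is an immutable set of facts.
Database = FrozenSet[Fact]


def transact(db: Database,
             additions: Iterable[Fact] = (),
             retractions: Iterable[Fact] = ()) -> Database:
    """Return a new database value; the input value is left untouched."""
    return (db - frozenset(retractions)) | frozenset(additions)


def names_of_adults(db: Database) -> FrozenSet[str]:
    """A query is a pure function of a single database value."""
    adults = {e for (e, a, v) in db
              if a == "age" and isinstance(v, int) and v >= 18}
    return frozenset(v for (e, a, v) in db
                     if a == "name" and e in adults and isinstance(v, str))


def newly_adult(before: Database, after: Database) -> FrozenSet[str]:
    """A query may also take several database values as arguments."""
    return names_of_adults(after) - names_of_adults(before)


db0: Database = frozenset()
db1 = transact(db0, additions=[
    ("p1", "name", "Ada"), ("p1", "age", 36),
    ("p2", "name", "Tim"), ("p2", "age", 9),
])
# Replacing Tim's age yields a new value; db1 still describes the old state.
db2 = transact(db1,
               additions=[("p2", "age", 19)],
               retractions=[("p2", "age", 9)])
```

Because `db1` is never modified, any thread holding it keeps seeing the same answer from `names_of_adults(db1)`, while `newly_adult(db1, db2)` compares two points in time without any locking.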
This functional approach applies a core functional programming idea, the strict separation of data from logic, to the database. Data is persisted in a dumb data store, which can be a simple key-value store like Amazon's DynamoDB. A transactor process coordinates transactions and is primarily useful when multiple "peers" (Datomic lingo for clients) are performing transactions at the same time. A peer is typically integrated into the application itself as a Java library. A peer has a connection to a transactor, through which it performs mutations transactionally. Queries are performed by the peer itself, which lazily loads data from the data store as required. To make querying easier, Datomic supports Datalog as a query language. Datalog is a subset of Prolog that is well suited to querying databases declaratively.
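The division of labour between data store, transactor and peer can be sketched as follows. This is a simplified illustration under stated assumptions, not Datomic's implementation or API: the classes `Storage`, `Transactor` and `Peer`, the append-only "segment" layout and the integer `basis` are all hypothetical, and retractions are left out for brevity.

```python
import threading
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Fact = Tuple[str, str, str]  # hypothetical (entity, attribute, value)


class Storage:
    """Dumb key-value store: segment number -> immutable block of facts."""

    def __init__(self) -> None:
        self._segments: Dict[int, FrozenSet[Fact]] = {}
        self.reads = 0  # counts fetches, to make lazy loading visible

    def put(self, key: int, segment: FrozenSet[Fact]) -> None:
        self._segments[key] = segment

    def get(self, key: int) -> FrozenSet[Fact]:
        self.reads += 1
        return self._segments[key]


class Transactor:
    """Serializes writes from all peers; each transaction adds one segment."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage
        self._lock = threading.Lock()
        self._next = 0

    def transact(self, facts: Iterable[Fact]) -> int:
        with self._lock:
            self._storage.put(self._next, frozenset(facts))
            self._next += 1
            # The returned basis identifies a database value:
            # all segments with a number below it.
            return self._next


class Peer:
    """Lives inside the application; writes via the transactor, reads locally."""

    def __init__(self, storage: Storage, transactor: Transactor) -> None:
        self._storage = storage
        self._transactor = transactor
        self._cache: Dict[int, FrozenSet[Fact]] = {}

    def transact(self, facts: Iterable[Fact]) -> int:
        return self._transactor.transact(facts)

    def _segment(self, key: int) -> FrozenSet[Fact]:
        # Segments are immutable, so a cached copy never goes stale.
        if key not in self._cache:
            self._cache[key] = self._storage.get(key)
        return self._cache[key]

    def query(self, basis: int, attribute: str) -> Set[Tuple[str, str]]:
        """Evaluate in the peer over the database value named by basis."""
        return {(e, v)
                for key in range(basis)
                for (e, a, v) in self._segment(key)
                if a == attribute}


storage = Storage()
transactor = Transactor(storage)
peer_a = Peer(storage, transactor)
peer_b = Peer(storage, transactor)

basis1 = peer_a.transact([("p1", "name", "Ada")])
basis2 = peer_b.transact([("p2", "name", "Tim")])

old_view = peer_a.query(basis1, "name")  # fetches segment 0 only
new_view = peer_a.query(basis2, "name")  # reuses segment 0, fetches segment 1
```

In this sketch the transactor is the only component that takes a lock. Reads never contact it: a peer pulls immutable segments from storage on demand and caches them, so a query against an older basis keeps returning the same result after later transactions.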