Rich Hickey's Datomic embraces Cloud, intelligent Applications and Consistency

by Michael Hunger on Apr 03, 2012

In early March, the Relevance team around Rich Hickey and Stuart Halloway announced Datomic, a new database platform that they have been working on since 2010.

Datomic builds on recent developments in distributed computing, in particular storage as a service, the availability of arbitrary numbers of application servers, and the need to scale reads more than writes. Other important aspects of Datomic are:

  • separation of read and write concerns
  • strong transactional guarantees on writes
  • the notion of immutable, append-only databases
  • database snapshots in time as queryable values
  • time (transactions) as part of the core data structure: the Datom is a fact of the form (Entity, Attribute, Value, Transaction)
  • Datalog as a logic-based, structural query language that allows complex queries, including inferred joins
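
The datom model and the idea of snapshots as values can be illustrated with a minimal Python sketch. This is not Datomic's API: `Datom`, `as_of` and `attribute_value` are hypothetical names chosen for illustration, Python is used only to keep the example self-contained, and the sketch leaves out how facts are withdrawn. It only shows how an append-only log of (Entity, Attribute, Value, Transaction) facts lets any past state of the database be recovered as a value.

```python
from typing import Any, List, NamedTuple, Optional


class Datom(NamedTuple):
    """One immutable fact: (Entity, Attribute, Value, Transaction)."""
    entity: int
    attribute: str
    value: Any
    tx: int


def as_of(log: List[Datom], tx: int) -> List[Datom]:
    """Return the database value as it was at transaction `tx` (hypothetical helper)."""
    return [d for d in log if d.tx <= tx]


def attribute_value(db: List[Datom], entity: int, attribute: str) -> Optional[Any]:
    """Most recently asserted value of `attribute` for `entity`, or None."""
    matches = [d for d in db if d.entity == entity and d.attribute == attribute]
    if not matches:
        return None
    return max(matches, key=lambda d: d.tx).value


# Facts are only ever appended; a price change is a new datom, not an update.
log = [
    Datom(1, "product/price", 10, tx=100),
    Datom(1, "product/weight", 3, tx=100),
    Datom(1, "product/price", 12, tx=101),
]

old_db = as_of(log, 100)  # snapshot before the price change
new_db = as_of(log, 101)  # snapshot after the price change
print(attribute_value(old_db, 1, "product/price"))
print(attribute_value(new_db, 1, "product/price"))
```

Because the log is never modified, `old_db` stays valid after later writes and can be queried like any other value.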

 

	Datalog Example (shipping costs exceeding product price)
	[:find ?customer ?product
	 :where [?customer :shipAddress ?addr]
	        [?addr :zip ?zip]
	        [?product :product/weight ?weight]
	        [?product :product/price ?price]
	        ;; function call into application code
	        [(Shipping/estimate ?zip ?weight) ?shipCost]
	        [(<= ?price ?shipCost)]]
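
To make the joins in this query explicit, the following Python sketch evaluates the same logic by hand over a tiny in-memory fact list. The fact data, the pricing formula in `estimate` and the helper names are assumptions for illustration only; the real `Shipping/estimate` is application code that the article does not show, and a real query engine would not use nested loops like this.

```python
from typing import List, Set, Tuple

# Hypothetical sample facts as (entity, attribute, value) triples.
facts = [
    ("alice", "shipAddress", "addr1"),
    ("addr1", "zip", "12345"),
    ("widget", "product/weight", 1.0),
    ("widget", "product/price", 20.0),
    ("anvil", "product/weight", 50.0),
    ("anvil", "product/price", 30.0),
]


def estimate(zip_code: str, weight: float) -> float:
    """Assumed stand-in for Shipping/estimate: flat fee plus a per-weight charge."""
    return 5.0 + 2.0 * weight


def values(attribute: str) -> List[Tuple[str, object]]:
    """All (entity, value) pairs recorded for one attribute."""
    return [(e, v) for (e, a, v) in facts if a == attribute]


def find_costly_shipments() -> Set[Tuple[str, str]]:
    """Pairs (customer, product) where the price does not exceed the shipping cost."""
    results = set()
    for customer, addr in values("shipAddress"):
        for addr_entity, zip_code in values("zip"):
            if addr_entity != addr:  # join on ?addr
                continue
            for product, weight in values("product/weight"):
                for price_entity, price in values("product/price"):
                    if price_entity != product:  # join on ?product
                        continue
                    ship_cost = estimate(zip_code, weight)
                    if price <= ship_cost:  # the (<= ?price ?shipCost) clause
                        results.add((customer, product))
    return results


print(find_costly_shipments())
```

As in the Datalog query, nothing ties a customer to a particular product, so every customer is paired with every product. The repeated variables `?addr` and `?product` are what express the joins.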

The Datomic architecture consists of these integral building blocks:

  • a distributed, fast storage service (Amazon DynamoDB on SSD)
  • a single transactor service, only responsible for serializing writes into a consistent data stream
  • a peer library which is part of the application and handles querying and index/data fetching
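
How these three parts divide the work can be sketched in a few lines of Python. This is a single-process toy under stated assumptions, not Datomic's implementation: the classes `Storage`, `Transactor` and `Peer` and all their methods are hypothetical names, and the lock merely stands in for whatever mechanism the real transactor uses to serialize writes.

```python
import threading
from typing import List, Tuple

Fact = Tuple[int, str, object]        # (entity, attribute, value)
Datom = Tuple[int, str, object, int]  # fact plus transaction id


class Storage:
    """Stand-in for the storage service: an append-only list of datoms."""

    def __init__(self) -> None:
        self._datoms: List[Datom] = []

    def append(self, datoms: List[Datom]) -> None:
        self._datoms.extend(datoms)

    def snapshot(self) -> Tuple[Datom, ...]:
        return tuple(self._datoms)  # immutable copy of the current state


class Transactor:
    """The single writer: orders all writes into one stream of transactions."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage
        self._lock = threading.Lock()
        self._next_tx = 1

    def transact(self, facts: List[Fact]) -> int:
        with self._lock:  # one transaction at a time
            tx = self._next_tx
            self._next_tx += 1
            self._storage.append([(e, a, v, tx) for (e, a, v) in facts])
            return tx


class Peer:
    """Lives inside the application: sends writes to the transactor, queries locally."""

    def __init__(self, storage: Storage, transactor: Transactor) -> None:
        self._storage = storage
        self._transactor = transactor

    def transact(self, facts: List[Fact]) -> int:
        return self._transactor.transact(facts)

    def db(self) -> Tuple[Datom, ...]:
        return self._storage.snapshot()

    @staticmethod
    def query(db: Tuple[Datom, ...], attribute: str) -> List[Tuple[int, object]]:
        return [(e, v) for (e, a, v, _tx) in db if a == attribute]


storage = Storage()
transactor = Transactor(storage)
writer = Peer(storage, transactor)
reader = Peer(storage, transactor)

tx1 = writer.transact([(1, "product/price", 10)])
before = reader.db()
tx2 = writer.transact([(2, "product/price", 25)])
after = reader.db()
print(Peer.query(before, "product/price"), Peer.query(after, "product/price"))
```

Adding more `Peer` instances adds query capacity without touching the transactor, which is the read-scaling argument the architecture rests on.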

[Figure: Datomic Architecture]

Rich Hickey also discusses Datomic in an upcoming interview with InfoQ's Werner Schuster and spoke about it in his keynote at Clojure/West.

Since the announcement, which was accompanied by video presentations about the Datomic architecture and Datalog, there have been a number of interesting discussions.

Sergio Bossa and Daniel Spiewak questioned some of the design decisions of Datomic.

One concern is the reduced write throughput and the choice of a single transactor, which becomes both a single point of failure and the main bottleneck.

Another is the decision to move massive amounts of data to the code (the applications executing queries) instead of moving code to the data, as many other approaches (such as map-reduce) currently do.

Rich Hickey responded to these concerns on Alexandru Popescu's and Michael Fogus' blogs.

He pointed out that the transactor can be built as a highly available component and that it is possible to create multiple, parallel "sharded" Datomic databases which can be cross-queried. He also explained that Datomic's sweet spot is not extremely high write throughput but scaling of reads, rich querying and a consistent transactional system.

His answer on moving the data to the application starts from the strain on current database servers, which have to take care of too many concerns: querying, writing, sharding, optimizing, logging, monitoring and many more. Datomic tries to separate these concerns. Applications are much easier to scale out than database servers. Different application instances can also take on different query and use-case responsibilities, each catering for its own query characteristics and data needs (its hot dataset).

Another interesting point was that Datomic can be seen as a globally distributed index. The main index is updated regularly in the storage service. In addition, index deltas are constantly computed in the transactor and in each application, and are virtually merged with the main index. The cached, immutable index segments allow the query engine to retrieve just the targeted pieces of the database value, without transporting large parts of the database to the client.
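
The idea of virtually merging a delta with the main index can be sketched as follows. This is a loose illustration under assumptions, not a description of Datomic's actual index structures: the segment layout, the tuple ordering and the names `merged_view` and `datoms_for` are all hypothetical. The point it shows is that a sorted, immutable segment and a small set of recent datoms can be read as one sorted sequence without rewriting the segment.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Datom = Tuple[int, str, str, int]  # (entity, attribute, value, tx)


def merged_view(index_segment: List[Datom], delta: Iterable[Datom]) -> Iterator[Datom]:
    """Lazily merge a sorted index segment with recent datoms; inputs are not modified."""
    return heapq.merge(index_segment, sorted(delta))


def datoms_for(entity: int, index_segment: List[Datom], delta: Iterable[Datom]) -> List[Datom]:
    """All datoms about one entity, drawn from the merged view."""
    return [d for d in merged_view(index_segment, delta) if d[0] == entity]


# Assumed: a cached, immutable segment previously fetched from storage.
index_segment = [
    (1, "name", "Ada", 100),
    (2, "name", "Bob", 100),
]
# Assumed: facts transacted since that segment was written.
delta = [(1, "email", "ada@example.com", 101)]

print(datoms_for(1, index_segment, delta))
```

Since the segment itself never changes, it can be cached by any number of readers; only the small delta differs from one moment to the next.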

Datomic's current offering covers:

  • the peer library, which also comes with an in-memory implementation of the transactor and storage service (for development)
  • a virtual box appliance that contains a transactor instance and a persistent data storage service (for testing and small apps)
  • a public, commercial offering on AWS with a free tier (1000 hours) using Amazon's DynamoDB

 
