Posted by Sebastien Auvray on Nov 26, 2007 09:30 AM
While relational databases fit a client-server model, a world of services needs new solutions. RDBMSs are subject to scalability issues: how do you create redundancy and parallelism? [Relational databases] become a single point of failure. In particular, replication is not trivial. To understand why, consider the problem of having two database servers that need to have identical data. Having both servers for reading and writing data makes it difficult to synchronize changes. Having one master server and another slave is bad too, because the master has to take all the heat when users are writing information. In addition, Assaf Arkin believes that write consistency is the reason RDBMSs are imploding under their own weight.
Features like referential integrity, constraints and atomic updates are really important in the client-server world, but irrelevant in a world of services. These are typical issues that document-oriented distributed databases are trying to address.
Inspired by CouchDB and the notion that you insert documents into the database and then define views for querying, Anthony Eden started to write his own Document-Oriented Database: RDDB. An exhaustive review is already available.
What CouchDB is
- A document database server, accessible via a RESTful JSON API.
- Ad-hoc and schema-free with a flat address space.
- Distributed, featuring robust, incremental replication with bi-directional conflict detection and management.
- Query-able and index-able, featuring a table-oriented reporting engine that uses JavaScript as a query language.
What CouchDB is not
- A relational database.
- A replacement for relational databases.
- An object-oriented database. More specifically, CouchDB is not meant to function as a seamless persistence layer for an OO programming language.
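To make the "bi-directional conflict detection" point more concrete, here is a minimal Python sketch of one way two replicas could detect that the same document was edited independently on both sides. It is not CouchDB's actual replication protocol. The `Replica` class, the `replicate` function, and the idea of deriving revision ids by hashing the parent revision plus the body are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, List


class Replica:
    """Hypothetical replica: keeps each document's body and revision history."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.bodies: Dict[str, dict] = {}
        # doc_id -> list of revision ids, oldest first (an assumed scheme)
        self.history: Dict[str, List[str]] = {}

    def write(self, doc_id: str, body: dict) -> None:
        past = self.history.setdefault(doc_id, [])
        parent = past[-1] if past else ""
        # Revision id depends on the parent revision and the new content.
        payload = parent + json.dumps(body, sort_keys=True)
        past.append(hashlib.sha1(payload.encode("utf-8")).hexdigest())
        self.bodies[doc_id] = dict(body)


def is_prefix(shorter: List[str], longer: List[str]) -> bool:
    """True if `shorter` is an ancestor history of `longer`."""
    return len(shorter) <= len(longer) and longer[: len(shorter)] == shorter


def replicate(source: Replica, target: Replica) -> List[str]:
    """Push source's documents to target; return ids of conflicting documents."""
    conflicts = []
    for doc_id, src_hist in source.history.items():
        tgt_hist = target.history.get(doc_id, [])
        if is_prefix(tgt_hist, src_hist):
            # Target is behind (or equal): fast-forward it.
            target.history[doc_id] = list(src_hist)
            target.bodies[doc_id] = dict(source.bodies[doc_id])
        elif is_prefix(src_hist, tgt_hist):
            continue  # Target is already ahead; nothing to push.
        else:
            # Histories diverged: both sides edited the same document.
            conflicts.append(doc_id)
    return conflicts


a, b = Replica("a"), Replica("b")
a.write("doc1", {"title": "draft"})
first_sync = replicate(a, b)      # b had nothing, so it just catches up

a.write("doc1", {"title": "edited on a"})
b.write("doc1", {"title": "edited on b"})
second_sync = replicate(a, b)     # divergent edits are reported
reverse_sync = replicate(b, a)    # and are detected in the other direction too
```

The sketch only detects conflicts and leaves both versions in place. How a conflict is then managed (picking a winner, merging, asking the user) is a separate design decision that this example does not attempt.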
InfoQ had the chance to catch up with Anthony and talk about RDDB, CouchDB and RDBMS.
- Documents are simply collections of name/value pairs.
- Views can be defined with Ruby code.
- A reduce block can be defined to reduce the initial mapped data from a view.
- Views can be materialized to improve query performance.
- Datastores/Viewstores/Materialization stores are pluggable. Current implementations are RAM, partitioned files/file system and Amazon S3.
- Distributed materialization may work, but it's going to be rewritten.
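The feature list above can be illustrated with a small sketch. The Python below is not RDDB's API (RDDB views are defined in Ruby). It is a hypothetical in-memory model that assumes documents are plain dictionaries of name/value pairs, a view is a map function with an optional reduce function, and materialization simply caches the mapped rows. The names `View`, `query` and `materialize` are invented for this example.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Document = Dict[str, Any]  # a document is just a collection of name/value pairs


class View:
    """Hypothetical view: a map function plus an optional reduce function."""

    def __init__(
        self,
        map_fn: Callable[[Document], Any],
        reduce_fn: Optional[Callable[[List[Any]], Any]] = None,
    ) -> None:
        self.map_fn = map_fn
        self.reduce_fn = reduce_fn
        self._materialized: Optional[List[Any]] = None

    def _map(self, documents: Iterable[Document]) -> List[Any]:
        # Apply the map function to every document; None means "skip".
        rows = []
        for doc in documents:
            row = self.map_fn(doc)
            if row is not None:
                rows.append(row)
        return rows

    def materialize(self, documents: Iterable[Document]) -> None:
        # Cache the mapped rows so later queries skip the map step.
        self._materialized = self._map(documents)

    def query(self, documents: Iterable[Document]) -> Any:
        if self._materialized is not None:
            rows = self._materialized
        else:
            rows = self._map(documents)
        return self.reduce_fn(rows) if self.reduce_fn else rows


documents: List[Document] = [
    {"type": "post", "title": "Hello", "comments": 3},
    {"type": "post", "title": "Views", "comments": 5},
    {"type": "user", "name": "anthony"},
]

# Map each post to its comment count, then reduce the mapped rows with sum.
comment_counts = View(
    map_fn=lambda doc: doc["comments"] if doc.get("type") == "post" else None,
    reduce_fn=sum,
)
# A view without a reduce step returns the mapped rows directly.
titles = View(map_fn=lambda doc: doc.get("title"))

total_comments = comment_counts.query(documents)
post_titles = titles.query(documents)
```

In this sketch a materialized view trades freshness for speed: once `materialize` has been called, `query` returns the cached rows until the view is materialized again, which is one reason a real system has to decide when to refresh them.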
Heya, nice article & interview! I'd like to point out that the image above and the linked slides are subject to the Creative Commons license (specifically http://creativecommons.org/licenses/by-nc-nd/2.0/). See http://jan.prima.de/~jan/plok/archives/105-Slides-From-the-Latest-CouchDB-Talk-at-PHP-UG-FFM.html for the original publication and extensive commentary. As for Anthony's take on CouchDB: although he doesn't state it explicitly, I'd like to point out (and I think he implies it) that CouchDB is not as Ruby-hacker-friendly as RDDB, but it might have advantages when scaling up and out over multiple machines and locations. Cheers, Jan --
@Jan: Thanks for the link to the slides and blog entry. I removed the image - it's better to see it in the context of your slides anyway.
Oh, a copyright note would have been enough, you can sure use the image! :-) Still, thanks for being quick about it.
Jan, CouchDB might indeed have advantages when scaling up and out over multiple machines and locations at this point in time, especially given the support for interprocess communication, both locally and over the net, that is built into Erlang. However, I would argue that the same thing can be accomplished in Ruby through the use of libraries now. The approach in RDDB will be to use EC2 and S3, or possibly Rinda, or maybe some other sort of messaging system. I also think it will be interesting to see what Matz does in the next couple of years, since he brought up his interest in this subject at RubyConf this year. -Anthony
Heya Anthony, thanks for chiming in! There are advantages to both systems. Just as CouchDB does not want to replace the RDBMS, there's room for both here. I'll definitely have a closer look at RDDB now ;-) Cheers, Jan --
IMO what Document Oriented Distributed Databases are evangelizing is just a set of services that can work with the current RDBMSs. For example, in the Java world most of these services have been standardized through a JSR (JSR-170) on top of physical storage.
See: Data should be available for easy retrieval, integrate simple reporting methods and provide a (fulltext) search.
I would say that the relational model is a very good fit for reporting. And most current RDBMSs already provide support for full-text indexing.
Secure: Compartmentalization of data [...]
I assume this means vertical/horizontal partitioning, which is another feature that current RDBMSs are providing or at least starting to consider.
In conclusion, I wouldn't say that RDBMSs are having problems in some of these directions. Indeed, as the requirements of today's apps are very high, we are waiting for the RDBMS providers to try to catch up with the latest requirements on their side.
./alex
--
.w( the_mindstorm )p.
Alexandru Popescu
Senior Software Eng.
InfoQ Techlead/Co-founder
> I assume this means vertical/horizontal partitioning, which is another feature that current RDBMS are providing or at least starting to consider.
No, that means that data you put in in the name of user X is not readable by requests from user Y, for example. Also, I am by no means saying that traditional RDBMSs are not good at what's mentioned in the four pillars. The pillars are, in fact, a guide to help compare different data storage systems. When it comes to sharing though, most traditional systems fall flat on their face, sorry :-) Again, CouchDB is not here to replace anything; it is just another tool worth considering for certain types of problems. Cheers, Jan --
Thanks for the clarification Jan. So, "data compartmentization" is in fact value-level ACL. That's an interesting concept. Until now I was thinking about column-level ACL, and I am wondering whether such a fine-grained ACL wouldn't result (if extensively used) in weird reporting results. Let's say you are querying for some data (in the name of user X, who might have hidden values): how do you tell the difference between a NULL value and a forbidden value? (Will you introduce a new NULL-like value?) On a different level, this feature kind of ruins the possibility of caching things. ... Well, I think it is too fine-grained for me :-). ./alex
You should check out XIC from Xcalia (http://www.xcalia.com). It's another solution to the problem of heterogeneous data access, only it uses transparent persistence APIs (JDO, JPA, SDO) to solve the problem. Of course, that does mean that you're talking about a Java (or Groovy) solution, but that serves a lot of applications out there. You can think of XIC as an extension of object-relational mapping to object-service mapping, only with XIC, there is no explicit mapping. Check out the slides at http://www.xcalia.com/technology/circumvent-SOA-design-trade-offs-with-Object-to-Services-mapping.jsp for more info. Very much worth a look-see. -matthew
> Thanks for the clarification Jan. So, "data compartmentization" is in fact value-level ACL.
Well, not quite! :-) It is document-level ACL if you will. A document is roughly a data-record. They will have ownership and permissions and all that.
Cheers,
Jan
--