Martin Fowler Sees a Thaw in Frozen Thinking about Data Storage
In a recent blog post, Martin Fowler, a renowned software thought leader, wrote that at last week's QCon he saw the deep freeze in thinking about databases in application architectures begin to thaw. The world has been stuck using an RDBMS for every application use case, but the time has come to also consider a RISC RDBMS or distributed document-oriented databases. QCon had a keynote by Tim Bray about the changing storage spectrum and how it affects application architectures, as well as a whole track on distributed document-oriented databases.
After noting the failure of ODBMSs, Martin gave his opinion on why the RDBMS succeeded: “their [RDBMS] dominance is due less to their role in data management than their role in integration”. Continuing on:
For many organizations today, the primary pattern for integration is Shared Database Integration - where multiple applications are integrated by all using a common database. When you have these IntegrationDatabases, it's important that all these applications can easily get at this shared data - hence the all important role of SQL. The role of SQL as mostly-standard query language has been central to the dominance of databases.
The Internet is changing the landscape by offering new integration solutions:
The heating of the database space comes from the presence of alternatives to integration - in particular the rise of web services. Under various banners there's a growing movement for applications to talk to each other by passing text (mostly XML) documents over HTTP. The web, both in internet and intranet forms, has made this integration mode even more prevalent than SQL. This is a good thing, I've never liked the approach of multiple applications tightly coupled through a common database - you can't get bigger breach of encapsulation than that.
HTTP will affect the way databases are used, according to Martin:
If you switch your integration protocol from SQL to HTTP, it now means you can change databases from being IntegrationDatabases to ApplicationDatabases. This change is profound. In the first step it supports a much simpler approach to object-relational mapping - such as the approach taken by Ruby on Rails. But furthermore it breaks the vice-like grip of the relational data model. If you integrate through HTTP it no longer matters how an application stores its own data, which in turn means an application can choose a data model that makes sense for its own needs.
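The shift from IntegrationDatabase to ApplicationDatabase described above can be illustrated with a minimal sketch. Everything in it is hypothetical: `InventoryApp`, `handle_get`, the `/items/` path, and the JSON field names are invented for illustration. The HTTP transport itself is left out, so `handle_get` merely stands in for a GET handler. The point is only that the consumer depends on the document, not on how the owning application stores its data.

```python
import json


class InventoryApp:
    """Hypothetical application that owns its data; storage is private."""

    def __init__(self):
        # Private storage: a plain dict keyed by SKU. It could be replaced by
        # relational tables or a document store without affecting callers.
        self._stock = {"sku-1": {"name": "widget", "qty": 12}}

    def handle_get(self, path):
        """Stand-in for an HTTP GET handler: returns (status, JSON body)."""
        prefix = "/items/"
        if not path.startswith(prefix):
            return 404, json.dumps({"error": "not found"})
        sku = path[len(prefix):]
        item = self._stock.get(sku)
        if item is None:
            return 404, json.dumps({"error": "not found"})
        # The published document is the integration contract.
        body = {"sku": sku, "name": item["name"], "qty": item["qty"]}
        return 200, json.dumps(body)


def quantity_available(app, sku):
    """A consuming application: it sees only the document, never the storage."""
    status, body = app.handle_get("/items/" + sku)
    if status != 200:
        return 0
    return json.loads(body)["qty"]
```

With a shared integration database, the consumer would instead query the owning application's tables directly, so any change to those tables could break it.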
While Martin does not think the RDBMS will disappear any time soon, he points out a number of possible alternatives that Tim Bray had mentioned:
- Drizzle is a form of relational database, but one that eschews much of the machinery of modern relational products. I think of it as a RISC RDBMS - supporting only the bare bones of the relational feature set.
- Couch DB is one of many forays into a distributed key-value pair model. Although a sharply simple data-model (nothing more than a hashmap really) this kind of approach has become quite popular in high-volume websites.
- Gemstone was one of the object database crowd, and I found the Gemstone-Smalltalk combination a very powerful development environment (superior to most of its successors). Gemstone is still around as a niche player, but may gain more traction through Maglev - a project to bring its approach (essentially a fusion of database and virtual machine) to the Ruby world.
Martin is careful to conclude that RDBMS are not going away and are "the right choice for many situations." His blog does suggest however that given the increase in options these days, "application developers should think about what the right option is for their needs. As non-relational projects grow in popularity and maturity, more and more will go for other options." What do you think?
Depends on the situation
by
Peter Veentjer
For example: If an RDBMS is configured correctly, it is great for doing batch processing. The concurrency mechanisms are well documented in most cases. I don't see the need for a different database mechanism in these cases.
But distributed memory (often ACID with the D) is a lot easier to scale than a database. So if you have key/value-based searches, I would have a look at Terracotta/Coherence and use the database purely as a backup mechanism (the D).
And what about TRDBMS
by
Francisco Jose Peredo Noguez
Can non-relational databases avoid the problem of Shared Database Integration?
by
Zhang Joey
data semantics
by
Techno Modus
HTTP cannot replace SQL
by
Frank Silbermann
We will not be able to replace integration databases with application databases until someone invents and implements an equally flexible and efficient application query language. HTTP that provides access to a small, canned application API will not suffice.
HTTP is not analogous to SQL; rather, it is analogous to the code that implements a networked database driver; the driver would be useless without the ability to run arbitrary SQL commands when the request arrives at the database machine.
Using the RDBMS as the integration point results in a star-shaped topology, with the RDBMS at the center. Too much reliance on web services for application integration can easily result in a spaghetti-shaped topology.
Re: Depends on the situation
by
Mark N
"For example: If an RDBMS is configured correctly, it is great for doing batch processing. The concurrency mechanisms are well documented in most cases. I don't see the need for a different database mechanism in these cases."
Actually, IMS is probably better for batch processing. Either way, getting rid of or minimizing the batch processing should be the primary objective.
Re: HTTP cannot replace SQL
by
Mark N
Re: HTTP cannot replace SQL
by
Francisco Jose Peredo Noguez
Re: Depends on the situation
by
Francisco Jose Peredo Noguez
Re: HTTP cannot replace SQL
by
Francisco Jose Peredo Noguez
I agree with you that HTTP providing access to a small, canned application API will not suffice. You really understand my point: HTTP is only a transport. It has no ability to deal with queries, and you need them to actually manipulate data.
The spaghetti-shaped topology of web services can be avoided with an ESB, but that still does not solve the need for a language based on the relational model to manipulate data.
What about XML databases?
by
Miguel Vitorino
Databases like eXist and Mark Logic support both structured and semi-structured data and minimize the typically necessary data transformations (relational <-> object <-> XML/JSON...).
They can store documents, hierarchical data and very strongly typed data.
We can more easily support schema versioning.
The query languages (XQuery and XPath) and schema languages (XSD/DTD) are standard.
Replication and clustering are performed more naturally.
They discourage monolithic database designs.
They scale better to a web world because of their native integration with HTTP and greater record granularity...
Any thoughts?
XML Databases
by
Miguel Vitorino
Also, I would like to make clear that I have no affiliation with either eXist or MarkLogic. I'm merely interested in following their progress.
Re: What about XML databases?
by
Francisco Jose Peredo Noguez
Data transformations can easily be built as a thin layer that exposes the output of your relational queries as XML/JSON... I do not see the big deal here...
What is "very" strongly typed data? What is the difference from "plain" strongly typed data?
Why do they more easily support schema versioning?
And are XQuery and XPath immune to SQL's flaws?
Why are replication and clustering performed more naturally? What does XML have to do with it?
Why do they discourage monolithic database designs?
I don't see how their native integration with HTTP and greater record granularity make them scale better... can you explain?
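The thin layer mentioned at the start of this comment could look roughly like the sketch below. It uses Python's standard `sqlite3` and `json` modules. The `book` table, its columns, and the helper names `rows_as_json` and `demo` are invented for illustration. It covers only a flat result set rendered as JSON, not nested or hierarchical output.

```python
import json
import sqlite3


def rows_as_json(conn, sql, params=()):
    """Run a query and render each row as a JSON object keyed by column name."""
    cur = conn.execute(sql, params)
    columns = [d[0] for d in cur.description]
    return json.dumps([dict(zip(columns, row)) for row in cur.fetchall()])


def demo():
    # In-memory database with a hypothetical table, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO book (title) VALUES (?)",
        [("Refactoring",), ("PoEAA",)],
    )
    try:
        return rows_as_json(conn, "SELECT id, title FROM book ORDER BY id")
    finally:
        conn.close()
```

A layer like this handles format conversion for tabular results. Reshaping joined rows into nested documents would need extra mapping code on top.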
Re: What about XML databases?
by
Miguel Vitorino
Schema versioning is easier for the simple reason that a schema is not required at all in an XML database. You can choose to use a schema, and have several options for that, or you can store raw data and still be able to query that data. Can you do that in a relational database?
Replication and clustering are done more easily because you store related data logically and physically closer. With XML you work at an aggregate level, not at the record/tuple level.
If access to XML databases is designed to be inherently RESTful, it will naturally scale better.
The easiest way to do this is to embrace the benefits of HTTP, which many of them already do. Of course this will only work with greater record granularity (you don't wanna repeat the same mistakes from CORBA...).
That "thin" layer you refer to only performs format conversions. I believe that is different from data transformations. And no matter how "thin" that layer is, it always has to be there, and usually done by hand, if you have an underlying relational data source that does not quite have the same data representation capabilities your destination formats do.