InfoQ

Martin Fowler Sees a Thaw in Frozen Thinking about Data Storage

Posted by Abel Avram on Nov 25, 2008

Community: Java, .NET, Architecture, Ruby
Topics: Data Access
Tags: Object Databases, Relational Databases, GemStone, MagLev, CouchDB

In a recent blog post written after last week's QCon, Martin Fowler, a renowned software thought leader, observed that the deep freeze in thinking about databases in application architectures is thawing. The world has been stuck using relational databases (RDBMS) for every application use case, but the time has come to also consider alternatives such as a RISC RDBMS or distributed document-oriented databases. QCon had a keynote by Tim Bray about the changing storage spectrum and how it affects application architectures, as well as a whole track on distributed document-oriented databases.

After noting the failure of object databases (ODBMS), Martin gave his opinion on why RDBMS succeeded: “their [RDBMS] dominance is due less to their role in data management than their role in integration”. He continues:

For many organizations today, the primary pattern for integration is Shared Database Integration - where multiple applications are integrated by all using a common database. When you have these IntegrationDatabases, it's important that all these applications can easily get at this shared data - hence the all important role of SQL. The role of SQL as mostly-standard query language has been central to the dominance of databases.

The Internet is changing the landscape by offering new integration solutions:

The heating of the database space comes from the presence of alternatives to integration - in particular the rise of web services. Under various banners there's a growing movement for applications to talk to each other by passing text (mostly XML) documents over HTTP. The web, both in internet and intranet forms, has made this integration mode even more prevalent than SQL. This is a good thing, I've never liked the approach of multiple applications tightly coupled through a common database - you can't get bigger breach of encapsulation than that.

HTTP will affect the way databases are used, according to Martin:

If you switch your integration protocol from SQL to HTTP, it now means you can change databases from being IntegrationDatabases to ApplicationDatabases. This change is profound. In the first step it supports a much simpler approach to object-relational mapping - such as the approach taken by Ruby on Rails. But furthermore it breaks the vice-like grip of the relational data model. If you integrate through HTTP it no longer matters how an application stores its own data, which in turn means an application can choose a data model that makes sense for its own needs.
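The distinction Martin draws can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `DictStore` and `SqlStore` classes, the `/customers/<id>` path, and the `handle_get` function are hypothetical names, and the HTTP transport itself is left out (`handle_get` only stands in for a request handler). The point is that other applications see just the returned document, so the owning application is free to swap its storage.

```python
import json
import sqlite3


class DictStore:
    """Hypothetical storage backend: a plain in-memory dict."""

    def __init__(self):
        self._rows = {}

    def put(self, customer_id, name):
        self._rows[customer_id] = name

    def get(self, customer_id):
        return self._rows.get(customer_id)


class SqlStore:
    """Hypothetical storage backend: a relational table in SQLite."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT)")

    def put(self, customer_id, name):
        self._db.execute(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
            (customer_id, name),
        )

    def get(self, customer_id):
        row = self._db.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def handle_get(store, path):
    """Answer a GET-style request for /customers/<id> with (status, JSON body).

    Callers only ever receive this document; they never touch the storage.
    """
    prefix = "/customers/"
    if not path.startswith(prefix):
        return 404, json.dumps({"error": "not found"})
    customer_id = path[len(prefix):]
    name = store.get(customer_id)
    if name is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps({"id": customer_id, "name": name})
```

Both backends produce the same response for the same request, which is the sense in which the storage becomes an application-private choice in this sketch.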

While Martin does not think RDBMS will disappear any time soon, he points out a number of possible alternatives that Tim Bray had mentioned:

  • Drizzle is a form of relational database, but one that eschews much of the machinery of modern relational products. I think of it as a RISC RDBMS - supporting only the bare bones of the relational feature set.
  • Couch DB is one of many forays into a distributed key-value pair model. Although a sharply simple data-model (nothing more than a hashmap really) this kind of approach has become quite popular in high-volume websites.
  • Gemstone was one of the object database crowd, and I found the Gemstone-Smalltalk combination a very powerful development environment (superior to most of its successors). Gemstone is still around as a niche player, but may gain more traction through Maglev - a project to bring its approach (essentially a fusion of database and virtual machine) to the Ruby world.
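The "nothing more than a hashmap" description of a distributed key-value model can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the API or the partitioning scheme of CouchDB or any other product named above: the class `ShardedKeyValueStore` and its methods are hypothetical, each "node" is just an in-process dict, and keys are spread over nodes with a simple stable hash.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store in which each 'node' is a plain dict (assumption)."""

    def __init__(self, node_count):
        self._nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Use a stable hash so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._nodes)
        return self._nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

    def node_sizes(self):
        # Number of keys held by each node, in node order.
        return [len(node) for node in self._nodes]
```

Because a lookup only needs the key, each request touches a single node in this sketch. There are no joins or cross-node queries to coordinate, which is one plausible reading of why such a simple model appeals to high-volume sites.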

Martin is careful to conclude that RDBMS are not going away and are "the right choice for many situations." His post does suggest, however, that given the increase in options these days, "application developers should think about what the right option is for their needs. As non-relational projects grow in popularity and maturity, more and more will go for other options." What do you think?

15 comments

Depends on the situation by Peter Veentjer Posted Nov 25, 2008 6:04 AM
Re: Depends on the situation by Mark Nuttall Posted Nov 26, 2008 9:34 PM
Re: Depends on the situation by Francisco Jose Peredo Noguez Posted Nov 26, 2008 10:11 PM
jMaglev by ARI ZILKA Posted Nov 25, 2008 10:17 AM
And what about TRDBMS by Francisco Jose Peredo Noguez Posted Nov 25, 2008 10:38 AM
Can non-relation database avoid the problem of Shared Database Integration? by Joey Zhang Posted Nov 26, 2008 2:59 AM
data semantics by Techno Modus Posted Nov 26, 2008 5:12 AM
HTTP cannot replace SQL by Frank Silbermann Posted Nov 26, 2008 8:30 AM
Re: HTTP cannot replace SQL by Mark Nuttall Posted Nov 26, 2008 9:38 PM
Re: HTTP cannot replace SQL by Francisco Jose Peredo Noguez Posted Nov 26, 2008 10:10 PM
Re: HTTP cannot replace SQL by Francisco Jose Peredo Noguez Posted Nov 26, 2008 10:17 PM
What about XML databases? by Miguel Vitorino Posted Nov 28, 2008 5:11 AM
Re: What about XML databases? by Francisco Jose Peredo Noguez Posted Dec 3, 2008 3:08 PM
Re: What about XML databases? by Miguel Vitorino Posted Mar 22, 2009 9:20 AM
XML Databases by Miguel Vitorino Posted Nov 28, 2008 8:40 AM
  1. Depends on the situation

    Nov 25, 2008 6:04 AM by Peter Veentjer

    I think it really depends on the situation.

    For example: If a RDBMS is configured correctly, it is great for doing batch processing. The concurrency mechanisms are well documented in most cases. I don't see the need for a different database mechanism in these cases.

    But distributed memory (often ACID with the D) is a lot easier to scale than a database. So if you have key/value based searches, I would have a look at Terracotta/Coherence and use the database purely as a backup mechanism (the D).

  2. jMaglev

    Nov 25, 2008 10:17 AM by ARI ZILKA

    Check out jMaglev before Maglev, IMO:

    fabiokung.com/2008/11/22/play-with-jmaglev-your...

  3. And what about TRDBMS

    Nov 25, 2008 10:38 AM by Francisco Jose Peredo Noguez

    I am really disappointed to read this. I do not think that RDBMS are going to be used less. On the contrary, someday people are going to realize that we have not even started to use them. That day we will drop SQL and its many flaws and use a really relational language, an industrial D, as proposed in the Third Manifesto.

  4. Can non-relation database avoid the problem of Shared Database Integration?

    Nov 26, 2008 2:59 AM by Joey Zhang

    Integrating applications directly through a common database may break the encapsulation of each application, but I don't think this is a problem of relational databases only. Even with a new non-relational database, if we adopt a database integration architecture, we still face the problem that the way one app stores its data may affect the other integrated apps.

  5. data semantics

    Nov 26, 2008 5:12 AM by Techno Modus

    I think managing data semantics is currently one of the most important issues in data modeling. It is as important as semantics in the Semantic Web. Recently I have found interesting emerging approaches that could solve some problems in data semantics, such as the associative model of data and the concept-oriented model: Informal Introduction into the Concept-Oriented Data Model, Informal Introduction into the Concept-Oriented Programming. There is also an interesting paper by Michael Stonebraker on this topic: One Size Fits All: An Idea Whose Time Has Come and Gone.

  6. HTTP cannot replace SQL

    Nov 26, 2008 8:30 AM by Frank Silbermann

    When another application uses your data, it probably uses it in a manner you did not foresee. It's not necessary to foresee all the ways your data will be used, because an RDBMS provides a flexible query language, SQL. One reason for the failure of object-oriented DBMS was the lack of a standard, flexible and efficient ODBMS query language.

    We will not be able to replace integration databases with application databases until someone invents and implements an equally flexible and efficient application query language. HTTP that provides access to a small, canned application API will not suffice.

    HTTP is not analogous to SQL; rather, it is analogous to the code that implements a networked database driver; the driver would be useless without the ability to run arbitrary SQL commands when the request arrives at the database machine.

    Using the RDBMS as the integration point results in a star-shaped topology -- with the RDBMS at the center. Too much reliance on web services for application integration can easily result in a spaghetti-shaped topology.

  7. Re: Depends on the situation

    Nov 26, 2008 9:34 PM by Mark Nuttall

    "For example: If a RDBMS is configured correctly, it is great for doing batch processing. The concurrency mechanisms are well documented in most cases. I don't see the need for a different database mechanism in these cases."
    Actually, IMS is probably better for batch processing. Either way, getting rid of or minimizing the batch processing should be the primary objective.

  8. Re: HTTP cannot replace SQL

    Nov 26, 2008 9:38 PM by Mark Nuttall

    Using the RDBMS at the center creates bottlenecks and dependence on db vendors. There are solutions other than web services, such as transmitting the required information to other systems (aka Loosely Coupled systems).

  9. Re: HTTP cannot replace SQL

    Nov 26, 2008 10:10 PM by Francisco Jose Peredo Noguez

    And what language will you use to specify what information you want to be transmitted? Most likely something based on relational algebra... The relational model is the best tool for the job, but SQL is a bad implementation of it, with lots of flaws. What we need is a D.

  10. Re: Depends on the situation

    Nov 26, 2008 10:11 PM by Francisco Jose Peredo Noguez

    And how would you do that? (How would you get rid of or minimize the batch processing?)

  11. Re: HTTP cannot replace SQL

    Nov 26, 2008 10:17 PM by Francisco Jose Peredo Noguez

    A more flexible and efficient query language has already been invented, and it is of course also based on the relational model. You can read about it in The Third Manifesto. It is called D, and it is what SQL should have been.

    I agree with you that HTTP that provides access to a small, canned application API will not suffice, you really understand my point, HTTP is only transport, it has no ability to deal with queries, and you need them to actually manipulate data.

    The spaghetti-shaped topology of web services can be avoided with an ESB, but that still does not solve the need for a language based on the relational model to manipulate data.

  12. What about XML databases?

    Nov 28, 2008 5:11 AM by Miguel Vitorino

    We see more and more data being transmitted over the wire in XML formats (and, yeah, the protocol is mostly HTTP...but we can't query anything with HTTP alone).

    Databases like eXist and Mark Logic support both structured and semi-structured data and minimize the typically necessary data transformations (relational <-> object <-> XML/JSON...).
    They can store documents, hierarchical data and very strong typed data.
    We can more easily support schema versioning.
    The query languages (XQuery and XPath) and schema languages (XSD/DTD) are standard.
    Replication and clustering are performed more naturally.
    They discourage monolithic database designs.
    They scale better to a web world because of their native integration with HTTP and greater record granularity...

    Any thoughts?

  13. XML Databases

    Nov 28, 2008 8:40 AM by Miguel Vitorino

    For those of you who may be curious, check out an instance of the Mark Logic database at markmail.org.

    Also, I would like to make clear that I have no affiliation with either eXist or MarkLogic. I'm merely interested in following their progress.

  14. Re: What about XML databases?

    Dec 3, 2008 3:08 PM by Francisco Jose Peredo Noguez

    And we can't query anything with XML alone either. You need a program, written in something else. You need a query language, preferably one with strong relational theory supporting it.

    Data transformations can be built easily as a thin layer that exposes the output of your relational queries as XML/JSON... I do not see the big deal here...

    What is "very" strong typed data? What is the difference with "plain" strong typed data?

    Why do they more easily support schema versioning?

    And XQuery and XPath are immune to SQL flaws?

    Why are replication and clustering performed more naturally? What does XML have to do with it?

    Why do they discourage monolithic database designs?

    I don't see how their native integration with HTTP and greater record granularity make them scale better... can you explain?

  15. Re: What about XML databases?

    Mar 22, 2009 9:20 AM by Miguel Vitorino

    Do we absolutely need a strong relational theory behind a database? Is that the _only_ way to have nice performance? Or do you admit there may be other options? Do you see any relational theory behind Google's BigTable or Amazon's Simple DB?

    Schema versioning is easier for the simple reason that a schema is not required at all in an XML database. You can choose to use a schema, and have several options for that, or you can store raw data and still be able to query that data. Can you do that in a relational database?

    Replication and clustering are done more easily because you store related data logically and physically closer. With XML you work at an aggregate level, not at the record/tuple level.

    If access to XML databases is designed to be inherently RESTful, it will naturally scale better.
    The easiest way to do this is to embrace the benefits of HTTP, which many of them already do. Of course this will only work with greater record granularity (you don't wanna repeat the same mistakes from CORBA...).

    That "thin" layer you refer to only performs format conversions. I believe that is different from data transformations. And no matter how "thin" that layer is, it always has to be there, and usually done by hand, if you have an underlying relational data source that does not quite have the same data representation capabilities your destination formats do.
