Is the NoSQL Meeting Announcing the End of the RDBMS Era?

The NoSQL meeting sought to raise awareness of non-relational databases, which promise to be cheaper, simpler to administer and maintain, and more scalable. Michael Stonebraker, co-creator of Ingres and Postgres, thinks that the end of the RDBMS era is close, while others think that we are not there yet.

The first NoSQL meeting, held in San Francisco in June, was attended by architects and developers from companies such as LinkedIn, Facebook, SpringSource, and Google, and centered on a number of presentations about non-RDBMS data stores. The organizer, Johan Oskarsson, a developer at Last.fm and a committer for Apache Hadoop and Hive, posted the slides and videos:

Voldemort - Jay Kreps, LinkedIn (slides pdf ppt, video1, video2)
Cassandra - Avinash Lakshman, Facebook (slides pdf ppt, video)
Dynomite - Cliff Moon, Powerset (slides, video)
HBase - Ryan Rawson, StumbleUpon (slides, video)
Hypertable - Doug Judd, Zvents (slides pdf ppt, video1, video2)
CouchDB - Chris Anderson, couch.io (slides, video1, video2)
VPork - Jon Travis, SpringSource (slides, video)
MongoDB - Dwight Merriman, 10gen (slides, video)
Infinite Scalability - Jonas S Karlsson, Google (slides, video)

ComputerWorld reported that Jon Travis, principal engineer at SpringSource, said during the meeting:

Relational databases give you too much. They force you to twist your object data to fit a RDBMS… NoSQL-based alternatives "just give you what you need".

Oskarsson said:

Many had even dumped the open-source MySQL database, a long-time Web 2.0 favorite, for a NoSQL alternative, because the advantages were too compelling to ignore.

But he admitted that even his own employer does not yet use a NoSQL database in production, and that the technology is not there yet for mainstream use:

It's true that [NoSQL] aren't relevant right now to mainstream enterprises, but that might change one to two years down the line.

Michael Stonebraker, co-creator of Ingres in the 1970s and Postgres in the 1980s, has predicted the demise of RDBMSes for many years. He recently reiterated his position, giving several market-based reasons why we are approaching the end of the relational DBMS era:

In the data warehouse market, a column store beats a row store by approximately a factor of 50 on typical business intelligence queries. The reason is because column stores read only the columns of interest to the query and not all of them. In addition, compression is more effective in a column store. Since the legacy systems are all row stores, they are vulnerable to competition from the newer column stores.

… In the online transaction processing (OLTP) market, a lightweight main memory DBMS beats a row store by a factor of 50. Leveraging main memory and the fact that no DBMS application will send a message to a human user in the middle of a transaction, allows an OLTP DBMS to run transactions to completion with no resource contention or locking overhead.

… In the science DBMS market, users have never liked relational DBMSs and want a non-relational model and query facility.

… Text applications have never used relational DBMSs. This was pointed out to me most clearly by Eric Brewer nearly 15 years ago in the early days of Inktomi. He wanted to use a relational DBMS to store the results of Web crawling, but found RDBMS to be two orders of magnitude slower than a home-brew system. All the major Web-search engines use home-brew text software to serve us search results. None use relational DBMSs.
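The column-store argument in the first point above can be illustrated with a toy sketch. It is only an illustration under stated assumptions: the table, the column names, and the helper functions are hypothetical, and it makes no attempt to reproduce the factor of 50 that Stonebraker cites. It shows only why a query over one column reads less data in a columnar layout, and why a single column tends to compress well.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# The table contents and all function names are hypothetical.

rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0, "note": "gift wrap"},
    {"order_id": 2, "region": "US", "amount": 80.0, "note": ""},
    {"order_id": 3, "region": "EU", "amount": 50.0, "note": "express"},
    {"order_id": 4, "region": "EU", "amount": 30.0, "note": ""},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    return {name: [r[name] for r in rows] for name in rows[0]}


def sum_amount_row_store(rows):
    """Sum one column in a row layout; every field of every row is fetched."""
    fields_read = 0
    total = 0.0
    for row in rows:
        fields_read += len(row)  # the whole row is read to reach one field
        total += row["amount"]
    return total, fields_read


def sum_amount_column_store(columns):
    """Sum one column in a columnar layout; only that column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


def run_length_encode(values):
    """Compress runs of equal adjacent values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)
row_total, row_fields = sum_amount_row_store(rows)
col_total, col_fields = sum_amount_column_store(columns)
region_runs = run_length_encode(sorted(columns["region"]))
```

Both layouts return the same total, but the row layout touches 16 fields where the columnar one touches 4. Once sorted, the low-cardinality `region` column collapses to two runs, which hints at why compression is more effective when values of one column are stored together.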

Stonebraker suggests the following paths to superior performance:

Use a non-relational data model:

If the user’s data is naturally something other than tables and if simulating his natural data model on top of tables is awkward, then chances are that a native implementation of the natural data model will significantly outperform a conventional RDBMS. This is certainly true in scientific data.

Use a different implementation of tables:

If something other than a row store accelerates the user’s queries, then a direct implementation of the relational model using non-row store technology will run circles around a conventional RDBMS. This is true in the data warehouse marketplace.

Use a different implementation of transactions:

Current row stores give you a “one size fits all” implementation of transactions. This can be radically beaten if a user has lesser requirements or if the system can take advantage of workload specific features. This is true in the OLTP marketplace.
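The idea of running transactions to completion without locking, mentioned in the OLTP point earlier and again in the last suggestion above, can be sketched roughly as follows. This is a minimal illustration, not Stonebraker's design or any product's implementation: `SerialStore`, `transfer`, and the account data are hypothetical names, and copying the whole dataset as an undo snapshot is a simplification that only makes sense for a toy.

```python
from collections import deque


class SerialStore:
    """In-memory store that runs queued transactions one at a time.

    Because only one transaction runs at any moment and none of them waits
    for user input, no locks are needed to keep the data consistent.
    """

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.queue = deque()

    def submit(self, txn):
        """Queue a transaction: a callable that mutates the data dict."""
        self.queue.append(txn)

    def run(self):
        """Run every queued transaction to completion, in order."""
        results = []
        while self.queue:
            txn = self.queue.popleft()
            snapshot = dict(self.data)  # toy undo copy, assumed small
            try:
                txn(self.data)
                results.append("committed")
            except Exception:
                self.data = snapshot  # roll back a failed transaction
                results.append("aborted")
        return results


def transfer(src, dst, amount):
    """Build a transaction that moves `amount` from `src` to `dst`."""
    def txn(data):
        if data[src] < amount:
            raise ValueError("insufficient funds")
        data[src] -= amount
        data[dst] += amount
    return txn


store = SerialStore({"alice": 100, "bob": 20})
store.submit(transfer("alice", "bob", 70))
store.submit(transfer("alice", "bob", 70))  # aborts: alice has only 30 left
outcomes = store.run()
```

The second transfer aborts and leaves the balances untouched. The trade-off is that throughput is bounded by how fast a single thread can execute transactions, which is why this approach assumes the working set fits in main memory and transactions are short.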

BJ Clark, a developer for Grasshopper and a consultant, considers it far too early to declare the SQL game over. The main advantage claimed for column databases, key/value stores, and document databases is scalability, but he has reviewed several of them and found that not all scale as promised. In his evaluation, Tokyo (a key/value store with full-text search), Redis (another key/value store), and MongoDB (a document database) do not scale, at least not yet, while Amazon S3 and Voldemort scale well. He presents no data to back up these claims, only the conclusions his team reached while looking for the best data store for their project. His conclusion is:

So, does RDBMS scale? I would say the answer is: not any worse than lots of other things. Most of what doesn’t scale in a RDBMS is stuff people don’t use that often anyway. And does NoSQL scale: a couple solutions do, most don’t. You might even argue that it’s just as easy to scale mysql (with sharding via mysql proxy) as it is to shard some of these NoSQL dbs. And I think it’s a pretty far leap to declare the RDBMS dead.
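Sharding, which Clark mentions as a way to scale MySQL, amounts to routing each key to one of several independent database instances. The sketch below shows only that routing step and rests on several assumptions: `ShardRouter` is a hypothetical name, plain dictionaries stand in for the database instances, and the hash-modulo scheme is one simple choice among many, not what MySQL Proxy or any NoSQL store actually does.

```python
import zlib


class ShardRouter:
    """Route keys to a fixed number of shards by a stable hash of the key.

    Each shard is a plain dict standing in for a separate database server.
    """

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, key):
        """Return the index of the shard responsible for `key`."""
        # crc32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


router = ShardRouter(4)
for i in range(100):
    router.put(f"user:{i}", i)
```

Lookups by key stay cheap because each one touches a single shard. The costs show up elsewhere: queries that span shards must be assembled by the application, and with a plain modulo scheme changing the shard count remaps most keys.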

It is obvious that RDBMSes are no longer the main keepers of the data, especially at some of the large companies that have risen during the Internet era: Amazon, Google, Facebook, LinkedIn, and others. But it is also true that many organizations have invested heavily in Oracle, DB2, or MS SQL, and those databases are still serving their needs. It is highly unlikely that relational DBs will disappear any time soon, but we may see a gradual move towards open source non-SQL data stores for reasons of cost, simplicity, and scalability.

Community comments

  • Hyperbole

    by Jim Leonardo,

    I read this as "It isn't the right tool for our job, so it is dead". When are we going to realize we can't have a toolbox full of hammers?

  • can rdbms scale

    by dwight merriman,

    I disagree with the "RDBMS can scale just as well" quote in the article. To take a couple of mature examples, it is very clear that Google Bigtable and Amazon Dynamo scale better than RDBMSes. Of course, they also do less -- that is why. Right tool for the right job...
    Dwight/mongodb.org

  • Hyperbole

    by Geoffrey Wiseman,

    It's obvious that RDBMSes are definitely the main keepers of our data after 'filesystems', and are likely to be for many years to come, although it's certainly true that the NoSQL approaches are gaining some momentum, particularly for the extra-large web applications.

  • NoSQL I agree, no RDBMS... well, I still have to see a True RDBMS

    by Francisco Jose Peredo Noguez,

    NoSQL I agree, SQL is full of flaws, but I do not think that the end of RDBMS is near, in fact, they are just beginning.


    I guess one of the problems is that since most current pseudo-relational databases use SQL, people just believe that SQL = Relational, and that is just not true.


    All these non-relational databases sound nice (COBOL sounded nice too, a while ago) and they sure seem to be good at storing a huge amount of data, but what about actually working with that data in a consistent, predictable way? (As in: if this business process is not done accurately, the government/customers will sue us into bankruptcy/jail.)


    I mean, for heuristic stuff, these non-relational databases are fine, but if you really need control of the data, you need an RDBMS.

  • Not so fast

    by Billy Newport,

    This is kind of silly. RDBMSs will be around for a long time to come. This technology is interesting in that it's being used in places where the databases don't work for cost, scale, or availability reasons, and also to front existing databases for customers with smaller requirements. I think the growth area for this scale-out DBMS stuff is warehousing. I can see Hadoop-type systems being used as the primary store for warehouse datasets, pushing snapshots or processed sets to conventional warehouses where this makes sense.

  • I think you mis-understood Stonebraker?

    by Andy Ellicott,

    He's not saying that Relational databases (RDBMS) are dead/dying. I think he was saying that the era of using one-size-fits-all RDBMS (like Oracle) for all database applications was ending and that a new era of specialized DBMS usage (for data warehousing, stream processing, text search, scientific workloads) would emerge.


    In fact, he's still inventing SQL RDBMSes. Vertica is a specialized MPP/columnar SQL RDBMS for handling analytic/data warehousing workloads. VoltDB is a new SQL OLTP RDBMS with a new way of guaranteeing transactional ACIDity for large-scale, clustered, high-transaction volume workloads (versus the slow/non-scaling ACIDity approaches of the 30-year-old OLTP RDBMSs). Disclaimer: I work for VoltDB.


    Off the top of my head, I know less about the scientific DBMS he's invented, SciDB. I don't believe it's relational though. It's open source and worth reading about; lots of cool design concepts in it.


    Thanks,

    Andy

  • Re: Not so fast

    by Abel Avram,

    Hi Billy,
    Most of the article contains what others think on the issue.
    My closing statement was: "It is highly unlikely that relational DBs will disappear any time soon, but we may see a gradual move towards open source non-SQL data stores for reasons of cost, simplicity, and scalability."

  • Re: I think you mis-understood Stonebraker?

    by Abel Avram,

    Hi Andy,
    it may be that I misunderstood M.S. After reading his article on the ACM blog, entitled The End of a DBMS Era (Might be Upon Us), my impression was that he predicts relational DBMSes will be replaced by other DBMSes: column, document, etc. And that is what my article is about. The only twist is that NoSQL people do not like to use the term database; they prefer data store or something similar, because most people take database to mean an RDBMS: Oracle, DB2, MS SQL.

  • Re: can rdbms scale

    by Rhys Parsons,

    Google and Amazon are hardly typical. They require a level of scalability that is extremely rare.

    For most applications, RDBMSs can be made to scale sufficiently for the job at hand.

    It will be interesting to see what Microsoft and Oracle come up with to try to keep their products competitive as different technologies emerge.

  • Re: can rdbms scale

    by Francisco Jose Peredo Noguez,

    Exactly, Google and Amazon use data in a way that is not very common. They handle huge amounts of information in a heuristic way and do not care about control: if the exact same query produces two different results, they do not care, as long as it is "more/less" relevant for the user. Inside an enterprise, things are pretty different: if you are processing the payroll of your company, you want it to be perfectly predictable and consistent up to the last cent, and if you write a query asking for all the people that have earned USD 10,000 in the last 3 months, you want the result to be exactly accurate, not "more/less" relevant.

    So one big question here is: how many companies are going to need this kind of "fuzzy logic/ heuristic" stuff to handle their data in a scalable "eventually consistent, smart but uncontrollable" way...

  • Re: Not so fast

    by Tiberiu Fustos,

    Agree with "not so fast". One important aspect that is easily forgotten is that "data outlives the applications". Some of the given examples of major users of non-relational DBs are also known as pioneers of the "perpetual beta". I can imagine, however, a world where the "core" of the enterprise data is managed as before in RDBMSs, along with non-RDBMS representations of it fine-tuned for special purposes. Already today companies spend a lot of money just keeping data in sync between the different business domains. At every domain boundary there are already huge amounts of transformations "SQL" <-> Document (be it in EAI or SOA architecture styles).

    I can imagine an enterprise with an "ACID" core for each major business object and some "eventually consistent" technologies on the edges. An authoritative source of truth is however a must. Otherwise we will end up building home-made distributed transaction management to keep things up to date, not a nice prospect :-)

  • Re: Re: can rdbms scale

    by Daniel Ribeiro,

    Using a NoSQL database does not mean you are using eventual consistency. In fact, Google's BigTable enforces consistency. The details are a bit complicated (I wrote a short article about it), but, in summary, the trade-offs that distributed databases have to make (whether they use SQL or not is in fact irrelevant) are constrained by the same laws. Not all choose the same trade-offs.

  • Hype Cycle

    by Wanderson Santos,

    I think NoSQL is a new approach to use (which still needs a heavy cleaning of its own scenarios), but it's definitely not the approach to override RDBMS. It's just another tool approach.

    As in every IT topic: context is the key!

    By the way, I think NoSQL is just reaching the "trough of disillusionment" moment of its own hype cycle:

    www.gartner.com/technology/research/methodologi...
