Is the NoSQL Meeting Announcing the End of the RDBMS Era?
The NoSQL meeting sought to raise awareness of non-relational databases, which promise to be cheaper, simpler to administer and maintain, and more scalable. Michael Stonebraker, co-creator of Ingres and Postgres, thinks that the end of the RDBMS era is close, while others think that we are not there yet.
The first NoSQL meeting, held in San Francisco, US, in June, drew architects and developers from companies such as LinkedIn, Facebook, SpringSource and Google, and centered on a number of presentations on non-RDBMS data stores. The organizer, Johan Oskarsson, a developer for Last.fm and committer for Apache Hadoop and Hive, posted the slides and videos:
Voldemort - Jay Kreps, LinkedIn (slides pdf ppt, video1, video2)
Cassandra - Avinash Lakshman, Facebook (slides pdf ppt, video)
Dynomite - Cliff Moon, Powerset (slides, video)
HBase - Ryan Rawson, StumbleUpon (slides, video)
Hypertable - Doug Judd, Zvents (slides pdf ppt, video1, video2)
CouchDB - Chris Anderson, couch.io (slides, video1, video2)
VPork - Jon Travis, SpringSource (slides, video)
MongoDB - Dwight Merriman, 10gen (slides, video)
Infinite Scalability - Jonas S Karlsson, Google (slides, video)
ComputerWorld reported that Jon Travis, principal engineer at SpringSource, said during the meeting:
Relational databases give you too much. They force you to twist your object data to fit a RDBMS… NoSQL-based alternatives "just give you what you need".
Many had even dumped the open-source MySQL database, a long-time Web 2.0 favorite, for a NoSQL alternative, because the advantages were too compelling to ignore.
But he admitted that even his own company does not yet use a NoSQL database in production, and that we are not there yet:
It's true that [NoSQL] aren't relevant right now to mainstream enterprises, but that might change one to two years down the line.
Michael Stonebraker, co-creator of Ingres in the 1970s and later of Postgres, has predicted the demise of RDBMSes for many years. He recently reiterated his position, giving some market reasons why we are approaching the end of the relational DBMS era:
In the data warehouse market, a column store beats a row store by approximately a factor of 50 on typical business intelligence queries. The reason is because column stores read only the columns of interest to the query and not all of them. In addition, compression is more effective in a column store. Since the legacy systems are all row stores, they are vulnerable to competition from the newer column stores.
… In the online transaction processing (OLTP) market, a lightweight main memory DBMS beats a row store by a factor of 50. Leveraging main memory and the fact that no DBMS application will send a message to a human user in the middle of a transaction, allows an OLTP DBMS to run transactions to completion with no resource contention or locking overhead.
… In the science DBMS market, users have never liked relational DBMSs and want a non-relational model and query facility.
… Text applications have never used relational DBMSs. This was pointed out to me most clearly by Eric Brewer nearly 15 years ago in the early days of Inktomi. He wanted to use a relational DBMS to store the results of Web crawling, but found RDBMS to be two orders of magnitude slower than a home-brew system. All the major Web-search engines use home-brew text software to serve us search results. None use relational DBMSs.
Stonebraker suggests the following paths to superior performance:
Use a non-relational data model:
If the user’s data is naturally something other than tables and if simulating his natural data model on top of tables is awkward, then chances are that a native implementation of the natural data model will significantly outperform a conventional RDBMS. This is certainly true in scientific data.
Use a different implementation of tables:
If something other than a row store accelerates the user’s queries, then a direct implementation of the relational model using non-row store technology will run circles around a conventional RDBMS. This is true in the data warehouse marketplace.
Use a different implementation of transactions:
Current row stores give you a “one size fits all” implementation of transactions. This can be radically beaten if a user has lesser requirements or if the system can take advantage of workload specific features. This is true in the OLTP marketplace.
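The column-store argument quoted above can be illustrated with a toy sketch. It is not how any real engine is built, and it does not reproduce the factor of 50 Stonebraker cites; it only counts how many stored values a single-column aggregate has to touch under each layout. The table, the column names and the "values read" counter are all assumptions made for illustration.

```python
# Hypothetical toy data: three rows of a sales table.
ROWS = [
    {"order_id": 1, "customer": "alice", "region": "EU", "amount": 120.0},
    {"order_id": 2, "customer": "bob", "region": "US", "amount": 75.5},
    {"order_id": 3, "customer": "carol", "region": "EU", "amount": 210.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(rows, column):
    """Row layout: each whole row is fetched to reach a single field.

    The counter stands in for I/O: it adds every field of every row.
    """
    total, values_read = 0.0, 0
    for row in rows:
        values_read += len(row)
        total += row[column]
    return total, values_read


def sum_column_store(columns, column):
    """Column layout: only the values of the requested column are read."""
    values = columns[column]
    return sum(values), len(values)


if __name__ == "__main__":
    print(sum_row_store(ROWS, "amount"))
    print(sum_column_store(to_columns(ROWS), "amount"))
```

With four fields per row, the row layout touches four times as many values as the column layout for the same sum; wider tables widen the gap, which is the effect the quote describes for business intelligence queries.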
BJ Clark, a developer for Grasshopper and a consultant, considers it far too premature to declare the SQL game over. Moreover, the main advantage claimed for column databases, key/value stores and document databases is scalability, but he has reviewed several of these and not all of them scale as promised. For example, in his evaluation, Tokyo, a key/value store with full text search; Redis, another key/value store; and MongoDB, a document DB, do not scale, at least not yet. Amazon S3 and Voldemort scale well according to his findings, though he presents no data to back up his claims, only the conclusions his team reached while looking for the best data store for their project. His conclusion is:
So, does RDBMS scale? I would say the answer is: not any worse than lots of other things. Most of what doesn’t scale in a RDBMS is stuff people don’t use that often anyway. And does NoSQL scale: a couple solutions do, most don’t. You might even argue that it’s just as easy to scale mysql (with sharding via mysql proxy) as it is to shard some of these NoSQL dbs. And I think it’s a pretty far leap to declare the RDBMS dead.
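For readers unfamiliar with the sharding Clark mentions, here is a minimal sketch of hash-based routing, assuming four hypothetical database servers. It is not MySQL Proxy's interface and not Clark's setup; the shard names and helper functions are invented for illustration, and it ignores rebalancing, replication and cross-shard queries.

```python
import hashlib

# Hypothetical shard names; in practice each would be a separate database server.
SHARDS = ["db0", "db1", "db2", "db3"]


def shard_for(key, shards=SHARDS):
    """Map a key to a shard with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and would route the same key differently across runs.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def route(keys, shards=SHARDS):
    """Group keys by the shard that would store them."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement


if __name__ == "__main__":
    for name, keys in route(f"user:{i}" for i in range(20)).items():
        print(name, keys)
```

The same routing idea applies whether the shards are MySQL instances or NoSQL nodes, which is the point of Clark's comparison: the hard parts are the ones this sketch leaves out.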
It is obvious that RDBMSes are no longer the main keepers of the data, and that is especially true of some of the large companies that have risen during the Internet era: Amazon, Google, Facebook, LinkedIn, and others. But it is also true that many have invested heavily in Oracle, DB2 or MS SQL, and those databases are still serving their needs. It is highly unlikely that relational DBs will disappear any time soon, but it is possible to see a gradual move towards open source non-SQL data stores for cost, simplicity and scalability reasons.
can rdbms scale
NoSQL I agree, no RDBMS... well, I still have to see a True RDBMS
Francisco Jose Peredo Noguez
I guess one of the problems is that since most current pseudo-relational databases use SQL, people just believe that SQL = Relational, and that is just not true.
All these non-relational databases sound nice (COBOL sounded nice too, a while ago) and they sure seem to be good at storing a huge amount of data, but what about actually working with that data in a consistent, predictable way? (As in: if this business process is not done accurately, the government/customers will sue us to bankruptcy/jail.)
I mean, for heuristic stuff, these non-relational databases are fine, but if you really need control of the data, you need an RDBMS.
Not so fast
I think you mis-understood Stonebraker?
In fact, he's still inventing SQL RDBMSes. Vertica is a specialized MPP/columnar SQL RDBMS for handling analytic/data warehousing workloads. VoltDB is a new SQL OLTP RDBMS with a new way of guaranteeing transactional ACIDity for large-scale, clustered, high-transaction volume workloads (versus the slow/non-scaling ACIDity approaches of the 30-year-old OLTP RDBMSs). Disclaimer: I work for VoltDB.
Off the top of my head, I know less about SciDB, the scientific DBMS he's invented. I don't believe it's relational though. It's open source and worth reading about; lots of cool design concepts in it.
Re: Not so fast
Most of the article contains what others think on the issue.
My closing statement was: "It is highly unlikely that relational DBs will disappear any time soon, but it is possible to see a gradual move towards open source non-SQL data stores for cost, simplicity and scalability reasons."
Re: I think you mis-understood Stonebraker?
It may be possible that I misunderstood M.S. After reading his article on the ACM blog, entitled The End of a DBMS Era (Might be Upon Us), my impression was that he predicts relational DBMSes will be replaced by other DBMSes: column, document, etc. And that is what my article is about. The only twist is that NoSQL people do not like to use the term database; they prefer data store or something like that, because most people take databases to mean RDBMSes: Oracle, DB2, MS SQL.
Re: can rdbms scale
For most applications, RDBMSs can be made to scale sufficiently for the job at hand.
It will be interesting to see what Microsoft and Oracle come up with to try to keep their products competitive as different technologies emerge.
Re: can rdbms scale
Francisco Jose Peredo Noguez
So one big question here is: how many companies are going to need this kind of "fuzzy logic/ heuristic" stuff to handle their data in a scalable "eventually consistent, smart but uncontrollable" way...
Re: Not so fast
I can imagine an enterprise with an "ACID" core for each major business object and some "eventually consistent" technologies on the edges. An authoritative source of truth is, however, a must. Otherwise we will end up building home-made distributed transaction management to keep things up-to-date, not a nice prospect :-)
Re: Re: can rdbms scale
As in every IT topic: context is the key!
By the way, I think NoSQL is just reaching the "trough of disillusionment" moment of its own hype cycle: