Hasso Plattner Touts Highly Parallel Computing and Column-Oriented Databases
Hasso Plattner, co-founder of SAP, explained at the Sapphire keynote last week that:
"any business query in a company the size of SAP [should] be returned and presented in an industry standard format, such as Microsoft Excel, in less than a second."
Plattner suggests that a combination of in-memory databases, multi-core processors, and vertical (column-oriented) storage could make this possible. According to him:
column databases have considerable advantages over conventional storage methods:
- No redundant data, so less data administration
- No redundant software code, so easier upgrades
- Data feeds directly into algorithms
- Greater flexibility
- Easy to add new fields in the customer database
- Column databases have no free or blank cells, bringing data content down to a twentieth of its original size, compared with conventional storage formats.
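One reason a single column can shrink so much is that its values tend to repeat. The following is a minimal sketch of dictionary encoding, a common column-store compression technique; it is an illustration only, not necessarily the method Plattner had in mind, and the `country` column and function names are hypothetical.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = []   # code -> value
    index = {}        # value -> code
    codes = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical column with many repeated values.
country = ["DE", "DE", "US", "DE", "US", "US", "FR", "DE"]
dictionary, codes = dictionary_encode(country)
restored = dictionary_decode(dictionary, codes)
```

Each distinct value is stored once and every cell becomes a small integer, so the saving grows with the number of repeats in the column.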
Column-oriented databases are not new; in fact, they have been around for 15 years or so:
- Column databases store data column by column rather than row by row
- Because similar data is close together, column databases minimize disk read time for many types of queries (e.g. data warehouse queries)
- Google's Bigtable, which organizes data into column families, powers many Google applications (e.g. Google Maps and Google Reader)
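The difference between the two layouts can be sketched in a few lines. This is a toy in-memory illustration, assuming hypothetical sales records and field names; a real column store works on disk pages or memory blocks, but the access pattern is the same.

```python
# Hypothetical sales records; the field names are illustrative only.
rows = [
    {"customer": "A", "region": "EU", "amount": 120},
    {"customer": "B", "region": "US", "amount": 80},
    {"customer": "C", "region": "EU", "amount": 200},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited to total a single field.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: only the contiguous "amount" list is read.
total_from_columns = sum(columns["amount"])
```

An aggregate over one field, typical of data warehouse queries, touches only that field's values in the column layout, which is where the reduced read time comes from.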
Columnar storage is much faster than record-based storage, according to Plattner, because it allows 10 times more compression and because database updates are done as insert-only operations, i.e., simply adding fields to strings.
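One way to read "insert-only" is that an update never overwrites a stored value; it appends a new version, and reads pick the latest one. The sketch below assumes that interpretation. The `InsertOnlyColumn` class and its methods are hypothetical and do not describe SAP's implementation.

```python
class InsertOnlyColumn:
    """A column where updates are appended rather than applied in place."""

    def __init__(self):
        # Append-only log of (row_id, value); existing entries are never changed.
        self._entries = []

    def write(self, row_id, value):
        """Record a new value for row_id; older values stay in the log."""
        self._entries.append((row_id, value))

    def current(self, row_id):
        """Return the most recently written value for row_id."""
        for rid, value in reversed(self._entries):
            if rid == row_id:
                return value
        raise KeyError(row_id)

    def history(self, row_id):
        """Return every value ever written for row_id, oldest first."""
        return [value for rid, value in self._entries if rid == row_id]


city = InsertOnlyColumn()
city.write(1, "Walldorf")
city.write(2, "Berlin")
city.write(1, "Potsdam")   # an "update" to row 1 is just another insert
```

Because nothing is rewritten, writes are simple appends and the full change history of a row stays available, at the cost of extra work to find the current value.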
"We will soon have, on one board, eight CPUs, and each of those CPUs will have 16 cores," he predicted. "This will mean 128 computing units on one board, called a blade. Today a blade can hold 144 GB of memory and costs about $6,300. Now, if you take 100 blades, you'll have 12,800 computing units for less than $1 million. That is not much money, relatively speaking, for all that computing power."
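The arithmetic in the quote can be checked directly. This sketch uses only the figures quoted above and assumes, as the quote implies, that a future 128-core blade would cost and hold about what today's blade does; the variable names are illustrative.

```python
# Figures taken from Plattner's quote.
cpus_per_blade = 8
cores_per_cpu = 16
blades = 100
memory_per_blade_gb = 144
cost_per_blade_usd = 6300   # assumption: today's price carries over

cores_per_blade = cpus_per_blade * cores_per_cpu      # computing units per blade
total_cores = cores_per_blade * blades                # computing units in 100 blades
total_memory_gb = memory_per_blade_gb * blades        # aggregate memory
total_cost_usd = cost_per_blade_usd * blades          # aggregate hardware cost

print(cores_per_blade, total_cores, total_memory_gb, total_cost_usd)
```

Under these assumptions, 100 blades come to 12,800 cores and 14,400 GB of memory for $630,000, which is consistent with the "less than $1 million" claim.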
These technologies seem to be ideal for Cloud Computing. Will this kind of power and performance drive Cloud Computing adoption and accelerate the conversion of large packaged applications to a multi-tenant pay-per-use model? What's your take on it?