High-end big iron storage arrays have long owned the transaction processing market. The big relational database systems need all the I/O and availability you can give them.

But what if we didn’t need big relational databases? What then?

RDBMS – RIP?
On his ACM blog, Michael Stonebraker, a database guru, says that relational databases may be nearing the end of the line. He says that their one-size-fits-all philosophy and 1980s code base are at the end of their useful life.

Why? Quoting Stonebraker:

If we examine the non-trivial sized DBMS markets, it turns out that the current relational DBMSs can be beaten by approximately a factor of 50 in most any market I can think of.

In the data warehouse market, a column store beats a row store by approximately a factor of 50 on typical business intelligence queries. . . .

In the online transaction processing (OLTP) market, a lightweight main memory DBMS beats a row store by a factor of 50. . . .

In the science DBMS market, users have never liked relational DBMSs and want a non-relational model and query facility. . . .

Text applications have never used relational DBMSs. This was pointed out to me most clearly by Eric Brewer nearly 15 years ago in the early days of Inktomi. He wanted to use a relational DBMS to store the results of web crawling, but found relational DBMSs to be two orders of magnitude slower than a homebrew system. . . .

Even in XML, where the current major vendors have spent a great deal of energy extending their engines, it is claimed that specialized engines, such as Mark Logic or Tamino, run circles around the major vendors, according to a private communication by Dave Kellogg.

In summary, one can leverage at least the following ideas to get superior performance:

A non-relational data model. . . .

A different implementation of tables. . . .

A different implementation of transactions. . . .

Mr. Stonebraker’s comments have interesting storage implications. First, big iron storage arrays may not have the relational database management market to rely on much longer.

Second, what happens to storage system engineering when we no longer have one basic data management model to design for? And that is without considering the effect of a 50 times faster database on applications.

In the hardware world, a 50 times speedup has two major effects: existing problems increase their resolution to absorb the additional compute cycles, and new applications – both low and high end – become economically feasible.

Is there 50 times more data we would collect from existing applications if we had a 50 times faster database? Or will we be running enterprise data management applications on hardware with the power of a netbook? Great power savings. Not so great for hardware vendors.

Mr. Stonebraker theorizes that the DBMS replacement will be a collection of vertical market specific engines. Each, no doubt, with its own storage I/O profile.
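To make the "own storage I/O profile" point concrete, here is a back-of-the-envelope sketch of how much data a full scan has to pull off disk under two table layouts. Everything in it is an assumption for illustration: the `Table` class, the function names, and the table shape (1 million rows, 20 columns of 8 bytes each) are hypothetical, and the model ignores compression, indexes, caching, and block alignment. It is not Mr. Stonebraker's explanation of the factor of 50, only a rough picture of why the layouts stress storage differently.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Table:
    """Hypothetical table shape: row count plus the byte width of each column."""
    rows: int
    column_widths: Tuple[int, ...]


def row_store_bytes_read(table: Table, needed_columns: Sequence[int]) -> int:
    # A row store keeps each record contiguous, so a full scan reads
    # every column of every row, whether or not the query uses it.
    return table.rows * sum(table.column_widths)


def column_store_bytes_read(table: Table, needed_columns: Sequence[int]) -> int:
    # A column store keeps each column contiguous, so a full scan reads
    # only the columns the query actually references.
    return table.rows * sum(table.column_widths[i] for i in needed_columns)


# Assumed example: a BI-style query that aggregates 2 of 20 columns.
table = Table(rows=1_000_000, column_widths=(8,) * 20)
needed = [0, 1]

row_bytes = row_store_bytes_read(table, needed)
col_bytes = column_store_bytes_read(table, needed)

print(f"row store scan:    {row_bytes:,} bytes")
print(f"column store scan: {col_bytes:,} bytes")
print(f"ratio:             {row_bytes / col_bytes:.0f}x")
```

In this toy model the column layout reads a tenth of the bytes, and the gap widens as tables get wider and queries touch fewer columns. An OLTP workload that fetches whole records by key would look very different, which is the point: each specialized engine hands the storage system a different access pattern.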

The StorageMojo take
Just as the ground has shifted under storage vendors in the last decade, it may be that DBMS vendors face the same ~~problem~~ opportunity in the coming decade.

If past experience is any guide, the storage industry will face multiple challenges supporting these new data management models, even as their high performance and lower (relative) costs drive new waves of application invention and adoption.

Only one thing is certain: much more data will be collected and, therefore, stored. The opportunities keep on coming, whether we are ready for them or not.

Courteous comments welcome, of course. For an interesting dissent, check out Daniel Lemire’s blog.

3 Comments

Wes Felter on Tuesday, 15 September, 2009 at 1:17 pm

Of course, if Vertica and VoltDB are 50x the price of MySQL then you haven’t saved much money.

Daniel Lemire on Wednesday, 16 September, 2009 at 11:14 am

My latest blog post is an answer to your post, you may like it:

http://www.daniel-lemire.com/blog/archives/2009/09/16/relational-databases-are-they-obselete/

Joe Kraska on Thursday, 17 September, 2009 at 3:51 pm

I think that the idea seems to be that since big highly visible companies (e.g., Google) are popularizing distributed keystores (etc.), RDBMSes will go away. This is obviously untrue. But like many hysterically stated things, there are kernels of truth lurking. I think you may see an uptake in keystores and distributed keystores; basically designers who realize that they have problems that are amenable to distributed keystores are going to be more likely to use them as a consequence of the “distributed keystore revolution”. So to speak.

Joe Kraska
San Diego CA
USA
