High-end big iron storage arrays have long owned the transaction processing market. The big relational database systems need all the I/O and availability you can give them.
But what if we didn’t need big relational databases? What then?
RDBMS – RIP?
On his ACM blog, database guru Michael Stonebraker says that relational databases may be nearing the end of the line. He argues that their one-size-fits-all philosophy and 1980s code base are at the end of their useful life.
Why? Quoting Stonebraker:
If we examine the non-trivial sized DBMS markets, it turns out that the current relational DBMSs can be beaten by approximately a factor of 50 in most any market I can think of.
In the data warehouse market, a column store beats a row store by approximately a factor of 50 on typical business intelligence queries. . . .
In the online transaction processing (OLTP) market, a lightweight main memory DBMS beats a row store by a factor of 50. . . .
In the science DBMS market, users have never liked relational DBMSs and want a non-relational model and query facility. . . .
Text applications have never used relational DBMSs. This was pointed out to me most clearly by Eric Brewer nearly 15 years ago in the early days of Inktomi. He wanted to use a relational DBMS to store the results of web crawling, but found relational DBMSs to be two orders of magnitude slower than a homebrew system. . . .
Even in XML, where the current major vendors have spent a great deal of energy extending their engines, it is claimed that specialized engines, such as Mark Logic or Tamino, run circles around the major vendors according to a private communication by Dave Kellogg.
In summary, one can leverage at least the following ideas to get superior performance:
A non-relational data model. . . .
A different implementation of tables. . . .
A different implementation of transactions. . . .
Mr. Stonebraker’s comments have interesting storage implications. First, big iron storage arrays may not have the relational database management market to rely on much longer.
Second, what happens to storage system engineering when we no longer have one basic data management model to design for? And that is without considering the effect of a 50 times faster database on applications.
In the hardware world, a 50-fold speedup has two major effects: existing problems increase their resolution to absorb the additional compute cycles, and new applications, at both the low and high end, become economically feasible.
Is there 50 times more data we would collect from existing applications if we had a 50 times faster database? Or will we be running enterprise data management applications on hardware with the power of a netbook? Great power savings. Not so great for hardware vendors.
Mr. Stonebraker theorizes that the DBMS replacement will be a collection of vertical-market-specific engines. Each, no doubt, with its own storage I/O profile.
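To make the I/O profile point concrete, here is a back-of-the-envelope sketch of how much data a single-column aggregate has to scan under a row layout versus a column layout. Everything in it is an assumption chosen for illustration: the table size, the 20 columns, the 8 bytes per value, and the function names. It ignores compression, indexes and caching, and it is not meant to reproduce Stonebraker's factor of 50.

```python
# Hypothetical table, invented for illustration only.
ROWS = 1_000_000
COLUMNS = 20
BYTES_PER_VALUE = 8


def bytes_scanned_row_store(columns_needed: int) -> int:
    """A row store reads whole rows, so every column is fetched
    no matter how many columns the query actually needs."""
    return ROWS * COLUMNS * BYTES_PER_VALUE


def bytes_scanned_column_store(columns_needed: int) -> int:
    """A column store reads only the columns the query touches."""
    return ROWS * columns_needed * BYTES_PER_VALUE


# A typical business intelligence query: aggregate one column.
row_bytes = bytes_scanned_row_store(columns_needed=1)
col_bytes = bytes_scanned_column_store(columns_needed=1)

print(f"row store scans    {row_bytes:,} bytes")
print(f"column store scans {col_bytes:,} bytes")
print(f"ratio: {row_bytes // col_bytes}x")
```

Under these made-up numbers the row layout scans 20 times as many bytes as the column layout for the same answer. The storage system underneath sees a very different workload in each case, which is the point: change the engine and you change what the array is asked to do.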
The StorageMojo take
Just as the ground has shifted under storage vendors in the last decade, it may be that DBMS vendors face the same problem (or rather, opportunity) in the coming decade.
If past experience is any guide, the storage industry will face multiple challenges supporting these new data management models, even as their high performance and lower (relative) costs drive new waves of application invention and adoption.
Only one thing is certain: much more data will be collected and, therefore, stored. The opportunities keep on coming, whether we are ready for them or not.
Courteous comments welcome, of course. For an interesting dissent, check out Daniel Lemire’s blog.