The performance increase in individual CPUs is slowing to a crawl. All the easy wins – higher clock speeds, wider datapaths, more DRAM, larger registers and caches, 2-4 cores – have been exploited.
Doctor, is there any hope?
In the recent PCAST report on Federal technological initiatives (see Fed funding for our digital future [pdf]), one sidebar suggested “Progress in Algorithms Beats Moore’s Law.”
. . . in many areas, performance gains due to improvements in algorithms have vastly exceeded even the dramatic performance gains due to increased processor speed.
The algorithms that we use today for speech recognition, for natural language translation, for chess playing, for logistics planning, have evolved remarkably in the past decade. It’s difficult to quantify the improvement, though, because it is as much in the realm of quality as of execution time.
In the field of numerical algorithms, however, the improvement can be quantified. Here is just one example . . . a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later – in 2003 – this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms!
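A quick back-of-the-envelope check of how those quoted factors combine (my arithmetic, not the report's):

    # Rough check of the quoted LP speedup factors (illustrative only).
    hardware_factor  = 1_000     # from faster processors, per the report
    algorithm_factor = 43_000    # from better LP algorithms, per the report

    total = hardware_factor * algorithm_factor
    print(f"combined speedup: ~{total:,}")        # ~43,000,000

    # 82 years of 1988-era solve time, divided by that factor:
    minutes_1988 = 82 * 365 * 24 * 60
    print(f"2003 solve time: ~{minutes_1988 / total:.1f} minutes")  # roughly 1 minute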
The StorageMojo take
Let’s file this one under “Wishful thinking” along with “US housing prices will never decline.” Piecemeal enhancements of specific application areas cannot replace the generalized performance improvements we’ve seen for decades.
No doubt there are important algorithmic improvements to be made. And that in certain problem spaces those speedups will far exceed Moore’s Law – even though the Law is about transistor count, not performance.
That doesn’t change the fact of computation today: the era of predictable and rapid performance improvement is over. Like a vein of rich ore that thins out, our computers will still improve, but the effort needed to do so is rising fast.
Cheap(er) SSDs, larger memories and caches are helping mask the performance plateau by increasing system performance, but reduced I/O latency and increased bandwidth will only take us so far. The way forward is a game of wringing out single-digit percent improvements, not the 2-3 year doubling of the last 60 years.
Courteous comments welcome, of course. The professor whose work the PCAST quote refers to is Martin Grötschel of Konrad-Zuse-Zentrum in Berlin. He’s been doing brilliant work on optimization problems – including the traveling salesman problem and data network design – for decades.
Web applications do not benefit at all from algorithmic improvements. Database algorithms have not changed significantly. This prediction was surely made by a PhD who knows nothing about the software that actually gets written in 99% of cases.
Well, it looks like Intel is indeed having issues with its 32nm process, as I speculated before; they are even willing to run a story about some stupid chipset “error” to hide the 32nm Sandy Bridge production yield issue. That means Westmere-EX, aka Xeon E7, will also have yield issues. And to speculate further: if Intel is having 32nm issues, GlobalFoundries will face them too. I wonder if that has something to do with Dirk’s departure. Moore’s wall is here TODAY! Algorithms can’t do jack. Speculative algorithms might speed things up, but not in terms of performance per watt per dollar. Google is right that we are not there yet in optimizing our algorithms for performance per watt per dollar; in fact we are a long way off.
As tobi describes, algorithm improvements (strictly speaking) will be limited. However, *design* improvements, including algorithms, are enabling huge performance increases for more problems.
For example, contrast one computer serially searching the web for “Britney Spears” versus thousands of Google computers doing the same search. Splitting search across thousands of computers is a design improvement, including algorithm improvements to make that design work.
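A toy sketch of that split in Python (hypothetical shard data; obviously nothing like Google's real system):

    from concurrent.futures import ProcessPoolExecutor

    # Hypothetical shards: in a real system each shard lives on its own machine.
    SHARDS = [
        ["britney spears tour dates", "storage industry news"],
        ["britney spears new album", "moore's law explained"],
        ["fortran optimization tips", "britney spears biography"],
    ]

    def search_shard(shard, query):
        # Each worker serially scans only its own slice of the index.
        return [doc for doc in shard if query in doc]

    def parallel_search(query):
        # Fan the query out to every shard at once, then merge the partial hits.
        with ProcessPoolExecutor() as pool:
            partials = pool.map(search_shard, SHARDS, [query] * len(SHARDS))
        return [hit for partial in partials for hit in partial]

    if __name__ == "__main__":
        print(parallel_search("britney spears"))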
Robin’s basic point is that the no-effort performance improvements of the past 50 years are gone. FORTRAN programs I wrote in 1979 run much faster now, without any changes. But their performance has barely changed for the last 5 years.
Big performance improvements will not come from generalized schemes that speed up all work. Instead, particular problems will speed up due to design changes. Making those design changes will take a lot of work, and won’t necessarily help other problems.
Rewriting 32-year-old FORTRAN programs to run on multiple cores or widely distributed systems takes a lot of work. Some source-code analysis tools and advanced compilers help, but not by much.
And some problems will always run into Amdahl’s Law.
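For reference, a minimal sketch of what Amdahl's Law implies, with an assumed 95% parallel fraction (purely illustrative numbers):

    def amdahl_speedup(parallel_fraction, cores):
        # Amdahl's Law: speedup = 1 / ((1 - p) + p / n)
        serial = 1.0 - parallel_fraction
        return 1.0 / (serial + parallel_fraction / cores)

    # With 95% of the work parallelizable, 16 cores gives only ~9x,
    # and even unlimited cores can never beat 1 / 0.05 = 20x.
    for cores in (4, 16, 1024):
        print(cores, round(amdahl_speedup(0.95, cores), 1))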
This makes me think flippantly of the obscene modern question about how you dump some HTML out of a database: “What stack are you using?”
Ah, right, so you want me to use PHP, plus a PHP engine, plus a bunch of OS dependencies, plus some plugins, plus some additional language to talk to the PHP engine, and a whole web server and object broker, so you can talk to a half-wit, non-compliant SQL interpreter not worthy of any name, to call a store which is an amateur rehash of ideas anyone should have had down pat for 40 years, to call a few tables, but it still messes it up? Hmmm.
If that is what “getting better algorithms” means for performance, Sure Thing Fella, I’m sure we’ll see a 40x improvement . . .
Geez.
I think the problem domain needs better definition!
Isn’t the algorithm the implementation of the design *intent*?
cheers,
– john
I see this as a temporary issue. Most problems are inherently parallel… at least when you allow some error margin.
The human brain is a cluster of 10^11 computers, each with a 300 Hz CPU, and everything the human brain can solve is inherently parallel.
The challenge is how to make computers work this way… but I’m sure we’ll figure it out when the need becomes pressing enough.
All my problems are parallelizable. I simply need more staff!
The corollary to Amdahl is that all problems expand to fit only the third centile of human bodies available. As a race, we seem very good at keeping those first 30% occupied, and wasting the remainder.
This is my gripe with a lot of open source code. Equally my gripe with the big technological hegemonies. Probably everyone on this planet understands how to make a wheel or will rather quickly get it. That’s useful. They wear out. A decent RDBMS? Want to see growth? Trust bust that. I suppose this is why some economies choose to plagiarize / reverse engineer. Scary for the balance of payments, but utterly human behaviour. “God gave you eyes, plagiarize.”
Academia has a comparable habit, because there are only so many subjects where the language is so specific you can’t get a normal translation. Einstein didn’t put any math in his famous paper; he just explained what he meant. That was what was powerful. But academics (I’m just ratting out my big bro, for as long as I can remember!) like to couch it all in ways which have huge impedance mismatches even within their field. No! You may not use my thoughts!
As you can possibly tell from my calling out open source, I believe they’ve built another cathedral. I fear that what was once just the geek kids has become cultism. There is a VC crowd priesting over their baby converts right now. Or maybe I grew up, and my Uncle told me to get into computers the way Plastics might have been advised. Though Uncle was ahead of himself, as you’d expect a government physicist to be.
However it is, and I am trying hard not to come across as a McCarthy target, the most efficient way to get things done is to explain oneself clearly and parcel out the work. To whoever. If you do that right, you can also retain decent amounts of security as to your specific aim.
This, of course, is at odds with the doctrine of ownership in most companies. But as I bleat, as and when I get the chance, if you’re going for the big time, then this matters. If you really don’t want to be the next Larry, don’t fret it. Funnily enough, whilst we all dream, most of us don’t want to be Larry. But I fear it’s a cursed spell on youth, led by wandering minstrels. “Fear My Wrath! Thou Shalt Not Get An 8-Digit Series C. And So I Smote Mine VC Enemies, Because I Have Slaughtered Their Youth.”
I shall probably now get arrested for fly posting Codd’s papers on the metro! 🙂
All meant in serious humor. I’ll read that paper again. I was basically born because of, and bred on, that kind of big initiative, and I want it to work. Before we have the rest of the world realizing it’s a passable diplomatic initiative to scream “But your SQL 92 just ain’t, mate, it doesn’t work!”
thanks,
– john
Maybe PCAST is right about “speech recognition, for natural language translation, for chess playing, for logistics planning … and LP” improving with better algorithms (and maybe it’s just the availability of more RAM that really made those faster), but in the workaday world of software, new algorithms have had close to zero impact over the past 20 years.
However, what has made a large performance impact is hash-key lookups rather than array traversal. All the major scripting languages have had this feature for 15 years or more, and it’s trivial to implement in other languages like C/C++.
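A quick illustration of that difference in Python (arbitrary sizes; exact timings will vary by machine):

    import timeit

    N = 100_000
    keys = [f"user{i}" for i in range(N)]
    as_list = list(keys)                  # array traversal: O(n) per membership test
    as_dict = {k: True for k in keys}     # hash lookup: ~O(1) per membership test

    needle = "user99999"                  # worst case for the list scan
    print(timeit.timeit(lambda: needle in as_list, number=100))
    print(timeit.timeit(lambda: needle in as_dict, number=100))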
Standardized libraries (CPAN, PEAR, Boost, etc.) have also greatly improved the reliability of code, and the adoption of the latest algorithms when available.
Having said that, any bioinformatics scientist will tell you there’s 500 years of computer science problems waiting to be solved in genomics.
There’s a limit to how much any technology or methodology can be tweaked before it plateaus and the next generation comes from a fresh direction. Algorithms could be that fresh approach (linkages between synapses/hemispheres vs. the raw synapse count), but I tend to think they’re just another tweak. Clever coding has been around since paper tape and vacuum tubes, but between increasingly powerful systems and pre-fab app modules it’s been too easy for inexperienced coders to ship sloppy work – or even for expert ones to have to accept bloat vs. R&D hours.
There was a paper out a year or so back.
The ultimate physical limit for Moore’s Law would be reached in 75 years, given the traditional 18-month doubling period.
This is based on quantum limits of information processing, regardless of what technology is used.
Still, if we could get even 50 years, we are talking about serious performance – like your PC having the same processing power as all human brains, that kind of thing.
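The rough arithmetic behind those horizons, as I read them (my own illustrative figures, not from the paper):

    # Doublings at the traditional 18-month pace (rough, illustrative figures).
    def growth(years, doubling_months=18):
        return 2 ** (years * 12 / doubling_months)

    print(f"75 years: ~{growth(75):.0e}x")   # ~1e+15: the quoted quantum-limit horizon
    print(f"50 years: ~{growth(50):.0e}x")   # ~1e+10: still a staggering multiplier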
The next possibility seems to be neuromorphic hardware, doing things like the brain opens up a lot of possibilities. I’m cautiously optimistic that will keep the ball rolling.
– analog/digital hybrid
– massively parallel
– ultra-low power
– slower clock
Or maybe some other stuff in the lab like plasmonics, memristors, …
In your “Moore’s Wall” article, you write: “[…]right now many of the brightest minds in computer science are struggling with the problem of getting usable work out of 8, 12 or 16 core CPUs.”
Aren’t these the “algorithmic improvements” that can keep us in double-digit territory? Maybe we need to look at the kind of work whose performance we want to push… not desktop interfaces — they’re fast enough already. What, then? Building large volumes of software, predicting weather, looking for oil (well, that’s on *my* mind this week ;-), searching large volumes of data, that kind of stuff?
I’ll grant that forward progress may not be easy, but this whole class of problems seems as though it could be attacked with concurrency.
Where does this thought process break down?
I am so very happy with today’s computers that I believe it would be no practical loss if no further hardware improvements came down the pike. I see 8 fast processors within a single desktop PC and a worldwide failure of imagination for how to put them to good use.