As the world’s largest consumer of information technology, the US government has driven IT R&D for over a century. And they’ve done a pretty good job. What’s next?
In Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology (pdf), Eric Schmidt (Google), Craig Mundie (Microsoft) and assorted science, engineering and academic heavies lay out a proposal.
There’s a lot more than can be quickly summarized. Among the big picture items these 3 caught my eye:
- A science of social computing. Can the few successful examples of crowdsourcing be analyzed to create engineering principles for designing large-scale, human-driven, information production systems?
- Automatic extraction of information. As our computer-mediated world produces more data, the need to extract actionable information without human latency grows.
- Autonomous robots. Reduce the need for fine-grained control of automated devices by people through better visual recognition, more flexible orientation, navigation, manipulation, and interaction, and new learning algorithms for intelligent behavior.
Storage issues
The report spends a fair amount of time on storage – as well it should. Once crowd-sourcing, auto info extraction and sensor networks get rolling, our current rates of data production will look pathetic.
This list covers most of the storage issues in the report:
- Data collection, storage, and management. Standards that allow different organizations to create software tools that generate, manipulate, and analyze societally important data.
- Data quality. Detecting and correcting errors or inaccuracies in the data – automatically.
- Data privacy. Access limitations, retention requirements, reducing the risk of data loss or damage.
- Data stewardship. Tracking how, where, and when data are created and modified.
- Data integrity. Ensuring that data are not corrupted either accidentally or maliciously.
- Data storage engineering. Reliability, power consumption.
- Data management. Management across multiple storage technologies and multiple hierarchies, and with replication across multiple geographic locations.
- Change. Adapting to new technology (e.g., nonvolatile RAM), performance requirements, and the need to provide consistent views of data worldwide.
- Data preservation. Long term data access and preservation beyond the durations of research grants.
- Scale-out systems. Internet-scale systems could provide powerful capabilities for scientific research, for making government data available to citizens, and for national security.
- Machine learning. People don’t scale. We’ll be working with information generated by machines and we need machines to connect the dots.
- Cross-media information extraction. Understanding speech, images, video, and unstructured data; translating speech and text to other languages. New data-driven (i.e. more storage) approaches promise to be effective.
- Data presentation and visualization. Humans process images much faster than words or numbers – so let’s get better at presenting information that way.
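To make the data integrity item concrete, here is a minimal Python sketch. It is not from the report. It assumes the simplest possible scheme: keep a SHA-256 digest next to each block when it is written, then recompute and compare when it is read. The `store` and `verify` helpers and the record layout are hypothetical names chosen for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of a block as a hex string."""
    return hashlib.sha256(data).hexdigest()


def store(block: bytes) -> dict:
    """Hypothetical write path: keep the data alongside its digest."""
    return {"data": block, "sha256": checksum(block)}


def verify(record: dict) -> bool:
    """Hypothetical read path: recompute the digest and compare."""
    return checksum(record["data"]) == record["sha256"]


record = store(b"census results, 1890")
intact = verify(record)                    # block is unchanged, check passes

record["data"] = b"census results, 1980"   # simulate silent corruption
corruption_detected = not verify(record)   # digest no longer matches
```

A plain digest like this only catches accidental corruption. Someone who can change the data can usually change the stored digest too, so the malicious case needs a keyed MAC or a signature whose key lives somewhere else.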
But wait, there’s more
Educational initiatives for K-12; specific verticals that need focus; and improving budgeting and accounting. For the latter it turns out that much of the money called “R&D” goes to fund application maintenance and agency planning – nothing that looks like R&D from an engineering perspective.
The StorageMojo take
The Feds’ record on technology is good – is there another country that’s done better? – and for IT it’s extraordinary. A few highlights:
- 1st big customer for Herman Hollerith – who later founded a forerunner company to IBM – and his punchcard accounting machines for the 1890 census.
- Funded the world’s first interactive real-time computer system (Project Whirlwind) back when all computing was batch oriented.
- Big customer for early transistors and ICs for military use.
- ARPAnet, the prototype Internet.
- Supercomputer funding that has enabled many of the Internet-scale technologies.
America’s unique system of public & private tech investment has driven the world’s most dynamic IT industry. But America’s current enthusiasm for spending cuts and the Republican anti-science bias threaten the investments that will power the next century of IT innovation.
The report lays out a strong rationale for investment, but will anyone on Capitol Hill listen? Watch your representative and senators to see that they do.
Even better, write a letter – yes, on paper! – to tell them of your concern. Especially if you have a Republican rep or senator.
Comments welcome, of course. I’m in San Diego for a few days. Happy New Year!
I’d like to thank you for this article. I’ve been following your blog for more than a year and got a lot of thoughts from it. But they were not really deep ones, ones that made me rethink the role of a group of people I’m somewhat identifying myself with. Scientists.
“Machine learning. People don’t scale. We’ll be working with information generated by machines and we need machines to connect the dots.”
There will be many deep thoughts enabled by pointing me to these few words. Thank you.
Today all large corporations’ R&D is called “acquisition”. They let smaller corporations develop their new ideas and capture some market share. Once they reach a level where they become a threat they simply acquire them. Less risk and impact on revenue day one.
Much simpler for stock market analysts who just understand numbers and not technologies.
Wow, a lot of that looks like AI Grand Challenge all over again.
Far too much of what i see as recent progress is to my mind optimized heuristics.
Some items however: privacy, stewardship, integrity e.g., are IMO educational / cultural problems.
NYT has a story about how an MD was restricted to 1K char for a critical patient assessment form. See what I mean about cultural education, it works two ways, and in the middle we have a vast societal fog.
If I may, the TL;DR version is “Geez, we need automated description object file systems which can multiple-inherit in flow context (and then we’d also know what to darn well archive properly too)”
I had a half waking dream the other night, having gone to bed after a holiday diving into societal pressures on information, (*cough* trying to understand what the youngsters were up to!) that what we consider the internet, and even IT, is on its way out now, fast. That it was a blip (so far a generational blip, depending when you were born) in terms of brief ascendancy of fairly deeply compute savvy people. I mean in terms of the successive revolutions we’ve had since the 40s, each of which made distinct marks on how people live. My thought was that now this is a ubiquitous facility, artificial narrows (as in Rockefeller / Standard Oil’s “control the narrows” in a rich field) will multiply. Facebook is just the obvious example. Exactly how does all this information discovery tech boosting work on what are proprietary pools of private, indentured, information?
Yes, i was sweating a little when i came to.
Equally, I came across a cool game for the kids, called Minecraft, and checking into it first, found out there was a whole bunch of fans decompiling and hacking the Java binaries in really sophisticated ways (OpenGL, even I/O mods, e.g.). Guess at average age, from board chat: 15.
Sorry for the rambling, trying to make up for being pretty off with the “NFS problem” thread the other day, and feeling a little bad / philosophical.
On the “collecting information” thing: I’ve lost touch, but a tech lead I knew with a three letter who did search projects v. early 90s considered it “weaponized” in the terms of “must be in the right hands”.
There’s a lot in that which i don’t think the generation in their teens now are going to find easy to suss, because they’ll be asking their grandpa for “What was it like to live in the Cold War?”, which is crucial to understanding why there’s a big initiative like this. Again.
My concern is the commercial incentive to divide and rule data, and this is now “social” data, providing streams of Id reinforcing loops, making it harder to think out of the “group box”.
Damned Fine Thing, this project is being done, though. But Boy, does it have Scope . . .
– j
Speaking of AI, I now have to find a television* to watch or someone to tape Jeopardy from Feb 14th on.
Looks like IBM have a machine in the game (sorry if this is old news):
http://www.forbes.com/forbes/2011/0117/technology-ken-jennings-watson-ibm-plays-jeopardy.html?feed=rss_technology
Interesting bit: this machine apparently fits in 6 cabinets. (plus cooling reflow / exchangers / power conditioning, presumably) OK, IBM use bigger, non-standard cabs. But just 6 for compute? This is small lab capacity.
I doubt, from 3 days of questions, obviously not all buzzed in, we’ll be able to catch any algo “cheats” in the paths selected, but it might be instructive, from a speculative point of view. Primetime Turing Test!
Or (based on the config size and an unashamed ton of conjecture) IBM are showing off Power7’s latest vector-scalar stuff. Showoffs! But very cool thinking. Whilst i think only NEC (in consortium IIRC) perseveres with straight vector monsters, Big Blue is chucking the same into a hyper clocked core within a general purpose MCM. Maybe they’re playing already with Stride-N Prefetch from the 2.06 spec. Shipping Power 7 is 2.05. I am way behind on all of this so E&EO, but reading the 2.06 docs, a little part of me keeps yelling “stack machine”. The matrices look all too familiar. Forgive me, i tingle a little bit when i think i might see something good but neglected come back.
– j
(*Haven’t had withdrawal symptoms in two decades :-))
Jean,
too much govvy funding has gone under the banner of “R&D” without defined, even non-specific goals.
That’s a truism, but you get buy-in for things like job creation – at the wrong level. Instead of doing it top down, goal oriented, things like local tax breaks happen, and a bit of research gloss often seals the deal.
Corps just love to play State vs. State.
The absolute classic is how Walt Wriston cobbled together the modern Citibank. Big tome, but “Wriston” is a gripping account of how Citi circumvented every Fed law, until the Feds gave in. Also explains petrodollars, from POV of who what where why.
On the other side, particularly since late Web 1.0, the tax breaks have been eaten by VCs.
I agree, i think, with your general idea, that competition != research.
This is why you need a whip hand.
Where i disagree with you is the idea that small companies are great innovators. Having founded one with a supposedly neat idea, the admin and compliance burden is horrific, and, isolated for commercial privity, you get out of loop and start to reinvent. Moreover, any useful IP you generate, which is not patented (and much is not, the best example I can think of is expanded polystyrene manufacture, check that out) disappears in insolvency. (not my story, phew) Patent law encourages process reinvention: look at the pharma lot, they reinvent ways of making old drugs, to beat generics, consuming vast percentages of R&D just there. I guess i dislike VC broadly (never needed it, mind, so it’s not bitterness) because they want to take punts on a research *area*, go for execution (again, problematic depending on maturity of target problem) and expect to roll it up whilst making out like bandits. Sure, they take risks, but along the way they eat tax dollars in every conceivable offset. No Bleeding Way would VC exist without these offsets. So why do they exist? Because their history is intertwined with early Valley (see Steve Blank) which was, err, spinning off a metric unspeakable load of big scale govvy sponsored war effort.
I’m ranting, but i think my drift is, i think, coherent, by way of rebuttal. (Larry might give me a gold star on my homework, eh?) Point being, we work too small, in terms of with whom we rub shoulders and share, truly share. Old ideas are being regurgitated too often. Sure, COTS was a revelation to the defence boys. What did they get? Insecure mail systems, that’s what. (geez, when was X.400 / X.500 done now?) I hope all this information gathering isn’t plugging a leak created by poor end to end messaging design.
gotta go,
all best,
– john