Update
Back in 2002 I met several times with the CTO of a large defense contractor to discuss how my company could help them build a “network intrusion detection system”. He described a system that would take in about 5 TB of data daily from about 500 network monitoring points and load it into a flat file. That file would be massaged down to about 1 TB per day as unneeded datapoints were discarded. The SI (systems integrator) on the job, Hicks and Associates, whom I also met with later, is the same one that did all the work on the discontinued TIA project and most of the other domestic spying jobs. So I’ve concluded that the “detection system” was in fact the NSA project, and that with 500 network monitoring points it is much larger than anyone in Congress knows about.

Also, it turns out that they aren’t limiting themselves to the scenario that Bruce Schneier so ably demolishes below: they are investigating people based on what they do, rather than who they know. Learn more at DefenseTech.org.

This is getting weirder and weirder. Good for the storage industry, though, until it stops.
End Update

A recent article in Popular Science talks about the redoubtable Gordon Bell’s proposal for MyLifeBits, a machine that in just 15 years will allow you to capture, analyze, classify, store and search your entire life. Like ILM, only it will actually work. It means capturing video of every waking moment, listening to all your phone calls, archiving all the IM, mail, TV, calendars and meeting notes: every single boring, banal moment caught in HD video and DTS 6.1 surround sound, transcribed, indexed, tagged and searchable. Bell, a famously forgetful and brilliant engineer who ran DEC’s R&D for 23 years, is quoted as saying:

Having a surrogate memory creates a freeing and secure feeling. It’s similar to having an assistant with perfect memory.

I’m not sure it would be a total panacea, if, like Bell, you habitually become so entranced with a brilliant new thought halfway through a sentence that you forget what you began to say. Instant replay, perhaps?

The relevance to the storage industry is, of course, that such devices would require terabytes of capacity every month. We could each have a Symmetrix mounted on a Segway tethered to our personal SenseCam following us about.

Or we could simply blast all the data wirelessly to a large, secure data facility dedicated to keeping the data forever. Such as the National Security Agency, or NSA.

In an amazing bit of technical serendipity, while Bell is developing MyLifeBits to record all of our life’s data, the busy gnomes at the Intelligence Community’s Advanced Research and Development Activity (ARDA) have a program called Novel Intelligence from Massive Data (NIMD). Novel intelligence refers to “actionable information not previously known”, a deliciously suggestive phrase for patriotic Americans proud of a knowing, action-oriented government. As opposed to the ignorant and slothful government that mishandled Katrina, bungled the occupation of Iraq, and can’t balance the budget. You know, the one that exists outside the Intelligence Community.

NIMD is a data-mining program. And do they have data to mine. ARDA’s no-longer-available website stated:

some intelligence data sources grow at the rate of four petabytes per month now, and the rate of growth is increasing.

Only four petabytes a month? Obviously they need more data. You can’t keep America safe with just 4 PB of data a month. And they need to store it. Forever.
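
For a sense of scale, here is a back-of-the-envelope conversion of that four petabytes a month into daily and yearly figures. It is only a sketch: it assumes decimal units (1 PB = 10^15 bytes), a 30-day month, and a flat rate, even though the ARDA quote says the rate is growing. The function name `monthly_to_rates` is made up for this illustration.

```python
# Decimal storage units (an assumption; the quote does not say which it uses).
TB = 10**12
PB = 10**15


def monthly_to_rates(bytes_per_month, days_per_month=30):
    """Convert a monthly data volume into per-day and per-year volumes.

    Assumes a flat rate and a 30-day month, purely for illustration.
    """
    per_day = bytes_per_month / days_per_month
    per_year = bytes_per_month * 12
    return per_day, per_year


per_day, per_year = monthly_to_rates(4 * PB)
print(f"{per_day / TB:.0f} TB per day")
print(f"{per_year / PB:.0f} PB per year")
```

Under those assumptions that is on the order of 130 TB a day, or 48 PB a year, before any growth.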

Like Bell’s MyLifeBits, the NIMD data mining program is tackling the tough problems of data analysis and classification, looking at unstructured text, spoken text, audio, video, graphs, diagrams, images, maps, equations, chemical formulas, tables and so on.

Are they succeeding? They aren’t about to say.

Yet if you listen to the respected security expert Bruce Schneier, CTO of Counterpane Internet Security, it is an open question whether data mining can possibly succeed. As he noted in a Wired commentary:

When it comes to terrorism, however, trillions of connections exist between people and events — things that the data-mining system will have to “look at” — and very few plots. This rarity makes even accurate identification systems useless.

Let’s look at some numbers. We’ll be optimistic — we’ll assume the system has a one in 100 false-positive rate (99 percent accurate), and a one in 1,000 false-negative rate (99.9 percent accurate). Assume 1 trillion possible indicators to sift through: that’s about 10 events — e-mails, phone calls, purchases, web destinations, whatever — per person in the United States per day. Also assume that 10 of them are actually terrorists plotting.

This unrealistically accurate system will generate 1 billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999 percent and you’re still chasing 2,750 false alarms per day — but that will inevitably raise your false negatives, and you’re going to miss some of those 10 real plots.
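
Schneier’s arithmetic is easy to check. The sketch below reproduces it under two assumptions of mine: that the 1 trillion indicators are a yearly total (which is what 10 events per person per day works out to), and that the false-negative rate stays at one in 1,000 in the stricter case, which Schneier himself says is optimistic. The function and variable names are hypothetical, chosen for this illustration.

```python
def alarm_arithmetic(indicators, plots, fp_rate, fn_rate, days=365):
    """Base-rate arithmetic for a detector scanning `indicators` events.

    indicators: total events examined per year (assumed yearly total)
    plots:      how many of those events are real plots
    fp_rate:    chance an innocent event is flagged
    fn_rate:    chance a real plot is missed
    """
    innocent = indicators - plots
    false_alarms = innocent * fp_rate       # innocent events flagged per year
    plots_caught = plots * (1 - fn_rate)    # real plots flagged per year
    return {
        "false_alarms_per_day": false_alarms / days,
        "false_alarms_per_plot_caught": false_alarms / plots_caught,
        "plots_caught": plots_caught,
    }


# Schneier's optimistic case: 1 in 100 false positives, 1 in 1,000 false negatives.
baseline = alarm_arithmetic(10**12, 10, fp_rate=0.01, fn_rate=0.001)

# His "absurd" case: 99.9999 percent accurate on false positives.
stricter = alarm_arithmetic(10**12, 10, fp_rate=0.000001, fn_rate=0.001)

for name, result in (("baseline", baseline), ("stricter", stricter)):
    print(name)
    for key, value in result.items():
        print(f"  {key}: {value:,.1f}")
```

The baseline comes out at roughly 27 million false alarms a day and about a billion per plot caught. The stricter case still leaves about 2,740 a day, in line with Schneier’s rounded 2,750.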

So maybe it isn’t about data mining. Yet this expensively acquired technology can still help build a better, more secure America.

MyLifeBits, NIMD and MIT’s new Speechome project, where a professor is having his home wired for audio and video to create a 1 PB+ data store recording his child’s first nine months, are certainly suggestive of a brave new world of full employment for storage professionals.

Mr. Bell, of course, thinks of MyLifeBits as a personal memory bank, a PDA for the ages. Yet once everyone is wearing one we can create an America where no crime goes unnoted — or unpunished.

We Can Still Monetize Our National Security Investment
Link all the surveillance cameras, implant RFID so people can be positively identified, capture the MyLifeBits data and run it all through the Son of NIMD, and we can have an America where everyone is responsible for their behavior 24 hours a day. No more security through obscurity or anonymity. Add GPS to automobiles and every incident of speeding can be tagged and fined.

Most crimes are misdemeanors, which means that mildly criminal behavior is easily monetized through fines. With lots of behavioral data available we will be able to identify early warning signs of criminal behavior or psychiatric disorder. Perhaps we’ll even be able to implement the “pre-crime” units envisioned in the movie “Minority Report”, with cops arriving just in time to stop the drug buy, the political bribe, or the underage actress drinking at a bar.

We’ll drastically reduce crime and reduce taxes as well. Isn’t that what we all want?

If you have nothing to hide, why would you object? Don’t you trust the government?

The Founding Fathers didn’t. Maybe we know something they didn’t. Or not.

Update
Fine post over at the CIO.com blogs by Ben Worthen about a poor sapsucker whose name matches the name of some sleazoid criminal in another state. He can never get on an airplane without the third degree; the DMV hassles him all the time. It really isn’t about trust. It’s about competence. As the biggest organization in our society, the government is only as good as we are. Sorry.