Making data Vanish

by Robin Harris | Friday, July 9, 2010 | Cloud computing & storage, Clusters, Future Tech, Security & Public Policy | 6 comments

Given how hard it is to keep data you want to save (see The Universe hates your data), losing data on the web should be easy. It isn’t, because it gets stored in so many places in its travels.

Problem
But the power of the web means that silliness can now be stored and found with the speed of a Google search. You don’t want sexy love notes – or pictures – to a former flame posted after infatuation ends.

Or maybe you want to discuss relationship, health or work problems with a friend over email – and don’t want your musings to be later shared with others. Wouldn’t it be nice to know that such messages will become unreadable even if your friend is unreliable?

Researchers built a prototype service – Vanish – that seeks to:

. . . ensure that all copies of certain data become unreadable after a user-specified time, without any specific action on the part of a user, without needing to trust any single third party to perform the deletion, and even if an attacker obtains both a cached copy of that data and the user’s cryptographic keys and passwords.

That’s a tall order. Their first proof-of-concept failed. But they are continuing the fight.

Vanish
In Vanish: Increasing Data Privacy with Self-Destructing Data, Roxana Geambasu, Tadayoshi Kohno, Amit A. Levy and Henry M. Levy of the University of Washington computer science department present an architecture and a prototype to do just that.

Ironically, the project uses the same kind of P2P infrastructure that preserves and distributes data: the distributed hash table (DHT) of the Vuze BitTorrent client.

The basic idea is this: Vanish encrypts your data with a random key, sprinkles pieces of the key across random nodes of the DHT, and then destroys its local copy of the key. You tell the system when to destroy the key and your data goes poof!

They developed a data structure called a Vanishing Data Object (VDO) that encapsulates user data and prevents the content from persisting. And the data becomes unreadable even if the attacker gets a pristine copy of the VDO from before its expiration and all the associated keys and passwords.

Here’s a timeline for that attack:

DHT overview

A DHT is a distributed, peer-to-peer (P2P) storage network. . . . DHTs like Vuze generally exhibit a put/get interface for reading and storing data, which is implemented internally by three operations: lookup, get, and store. The data itself consists of an (index, value) pair. Each node in the DHT manages a part of an astronomically large index name space (e.g., 2¹⁶⁰ values for Vuze).

DHTs are available, scalable, broadly distributed and decentralized with rapid node churn. All these properties are ideal for an infrastructure that has to withstand a wide variety of attacks.
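To make the put/get interface concrete, here is a minimal in-memory sketch in Python. It is not the Vuze implementation: the class name `ToyDHT`, the helper `index_for`, the XOR-distance rule for choosing the responsible node, and the `ttl` expiry parameter are all assumptions made for illustration. It only shows how put and get can be built from lookup, store and a node-level get over a 2¹⁶⁰ index space, and how a departing node takes its data with it.

```python
import hashlib
import secrets

INDEX_BITS = 160  # same size as the Vuze index name space (2**160 values)


def index_for(name: str) -> int:
    """Map a name to a 160-bit index (SHA-1 is used only for its output size)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class ToyDHT:
    """Hypothetical stand-in for a DHT; every node owns the indices nearest its ID."""

    def __init__(self, num_nodes: int, ttl: float):
        self.ttl = ttl  # assumed lifetime of a stored value, in arbitrary time units
        # node_id -> {index: (value, time_stored)}
        self.nodes = {secrets.randbits(INDEX_BITS): {} for _ in range(num_nodes)}

    def lookup(self, index: int) -> int:
        """Find the node responsible for an index (smallest XOR distance)."""
        return min(self.nodes, key=lambda node_id: node_id ^ index)

    def store(self, node_id: int, index: int, value: bytes, now: float) -> None:
        """Node-level write."""
        self.nodes[node_id][index] = (value, now)

    def put(self, index: int, value: bytes, now: float) -> None:
        """Public write: lookup, then store."""
        self.store(self.lookup(index), index, value, now)

    def get(self, index: int, now: float):
        """Public read: lookup, then fetch; expired or missing values return None."""
        entry = self.nodes[self.lookup(index)].get(index)
        if entry is None or now - entry[1] > self.ttl:
            return None
        return entry[0]

    def churn(self, node_id: int) -> None:
        """A node leaves the network and its stored values go with it."""
        self.nodes.pop(node_id)
```

A value written with `put` can be read back with `get` until either the assumed `ttl` passes or the responsible node is removed with `churn`. After that the lookup lands on a different node that never saw the value.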

Vanish architecture

Data (D) is encrypted (E) with key (K) to deliver ciphertext (C). Then K is split into N shares – K₁,…,K_N – and distributed across the DHT at indices derived from a random access key (L) by a secure pseudo-random number generator. The K split uses threshold secret sharing, which works like a redundant erasure code, so that any user-definable threshold number of the N shares can reconstruct the key.

The redundancy is needed because DHTs lose data due to node churn. That loss is a bug that is also a feature for secure destruction of data.
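The split-and-scatter step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's code: it uses Shamir-style threshold secret sharing over a prime field as the splitting scheme, and the prime, the function names (`split_key`, `recover_key`, `derive_indices`) and the SHA-256 counter construction standing in for the secure pseudo-random number generator are all hypothetical choices. Encrypting D itself is left out.

```python
import hashlib
import secrets

# A Mersenne prime comfortably larger than any 128- or 256-bit key (assumed choice).
PRIME = 2**521 - 1


def split_key(key: bytes, n: int, threshold: int) -> list:
    """Split key K into n shares; any `threshold` of them can rebuild it."""
    secret = int.from_bytes(key, "big")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_key(shares: list, key_len: int) -> bytes:
    """Lagrange-interpolate the polynomial at x = 0 to get K back.

    With fewer shares than the threshold the result is an unrelated field
    element, and to_bytes will almost certainly raise OverflowError.
    """
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret.to_bytes(key_len, "big")


def derive_indices(access_key: bytes, n: int) -> list:
    """Derive n 160-bit DHT indices from access key L (hash-counter stand-in)."""
    indices = []
    for counter in range(n):
        digest = hashlib.sha256(access_key + counter.to_bytes(4, "big")).digest()
        indices.append(int.from_bytes(digest[:20], "big"))
    return indices
```

In this sketch, share i would be stored at index i, and only L, C, N and the threshold need to travel with the message. Anyone holding L can recompute the indices and fetch shares, but only while enough of them still exist in the DHT.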

Prototype
They built a Firefox plug-in for Gmail to create self-destructing emails and another – FireVanish – for making any text in a web input box self-destructing. They also built a file app, so you can make any file self-destructing. Handy for Word backup files that you don’t want to keep around.

The major change to the Vuze BitTorrent client was less than 50 lines of code to prevent lookup sniffing attacks. Those changes only affect the client, not the DHT.

The Vanish prototype was cracked by a group of researchers at UT Austin, Princeton, and the University of Michigan. They found that an eavesdropper could collect the key shards from the DHT and reassemble the “vanished” content.
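A back-of-the-envelope model shows why collecting shards is so damaging. The Python sketch below assumes, purely for illustration, that an eavesdropper independently sees each stored shard with the same probability; the function name `capture_probability` and the sample parameters are hypothetical, and the published attack is not modeled in any detail.

```python
from math import comb


def capture_probability(n_shares: int, threshold: int, observed_fraction: float) -> float:
    """Chance that an eavesdropper ends up with at least `threshold` shares.

    Assumes each of the n shares is observed independently with probability
    `observed_fraction` (a simplification, not a description of the real attack).
    """
    f = observed_fraction
    return sum(
        comb(n_shares, k) * f**k * (1 - f) ** (n_shares - k)
        for k in range(threshold, n_shares + 1)
    )
```

With 10 shares, a threshold of 7 and an eavesdropper who sees half of all stores, the key is captured 176 times out of 1,024, or about 17% of the time. As the observed fraction approaches 1, so does the capture probability, which is the situation the attackers were able to approximate.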

Who is going to collect all the shard-like pieces on DHTs? Other than the NSA and other major intelligence services, probably no one. For extra security the data can be encrypted before VDO encapsulation.

The StorageMojo take
The Internet is paid for with our loss of privacy. Young people may think it no great loss; check back in 20 years and we’ll see what they think then.

It is slowly dawning on the public that their lives are an open book on the Internet. Expect a growing market for private communication and storage if ease-of-use and trust issues can be resolved.

You don’t have to be Tiger Woods to want to keep your private life private. I hope the Vanish team succeeds.

Courteous comments welcome, of course. Figures courtesy of the Vanish team.

6 Comments

Taylor on Friday, 9 July, 2010 at 8:06 pm

It’s an interesting idea. There is nothing, however, that prevents an agent from making copies of the files (assembling them before expiration, then simply copying the assembled bits to a regular file) *before* the expiration.

Visiotech on Saturday, 10 July, 2010 at 9:24 am

Data lives much longer today than it did in the mainframe era. Private data is in high demand among firms in marketing, finance, medicine, insurance and security. They all understand what private data can do for them.

Social network privacy has opened a few people’s eyes. Looking at other cases, like Google Street View grabbing private information such as your Wi-Fi SSID and location, several security firms and agencies are better armed than they used to be. If you use Wi-Fi at home, they now have a way to pinpoint your IP location on Google Maps.

Consider another Google feature, Picasa, which can do facial recognition across all your photos: who knows what they do once you upload them to Google Picasa? Are they used by these firms and security agencies? Looking at their Terms & Conditions, they can share everything we place on these social networks and search engines.

With no secure way to erase your shared data on these social networks and a few others, it becomes important to share wisely.

I do not want to be pulled into a criminal investigation or any similar data warehouse search because one member of my social network got involved in a security problem, or to see my shared photos and emails exposed through these “free” services. Like you said: nothing is free on the Internet.

Anonymouse on Saturday, 10 July, 2010 at 8:01 pm

Complete snake oil.

First, you can’t have security if you rely on untrusted parties doing something (even if it is deleting data).

Second, it’s simpler to use OTR. It ensures “This conversation didn’t happen” even if your computer and your private key get compromised later.
http://en.wikipedia.org/wiki/Off-the-Record_Messaging

Third, if “your friend is unreliable”, perhaps you shouldn’t tell him your thoughts? After all, he can always tell others what you said. Vanish and OTR make it so nobody (even your friend) can prove you said it. But OTR does it much more effectively, and much, much faster. (Keys are rendered useless instantly after every message exchange, not days or hours later.)

Visiotech on Sunday, 11 July, 2010 at 8:29 am

Here is another good one this weekend on this subject.
http://www.pcworld.com/article/200868/dont_look_now_but_googles_following_you.html?tk=rss_news

Free browsers and other software “might” capture and/or allow sharing of your private data without much warning… Nothing new… that has been true for as long as freeware and shareware have existed. It predates the Internet and goes back to BBS times. The Internet just made data sharing and exchange happen faster and more silently.

Athanasios Douitsis on Tuesday, 13 July, 2010 at 6:10 am

Reading this post I remembered an occasion some years ago where I had the rare luck to watch this presentation by Radia Perlman. The title was “Data: How to keep it when you want it and lose it when you want it gone”. With a little search I also located this paper which describes this concept.

Darren McBride on Wednesday, 14 July, 2010 at 5:57 pm

I have a copy of a nice little piece of software from TechSmith called Snagit. I use it to do screen grabs of all kinds of things quickly. I use another of their products (Camtasia) to grab YouTube videos that can’t be downloaded easily. In general, if it can come up on my screen and be read, I can snapshot it to a graphic file. Admittedly, that is not as useful a format as a stored email or text file, but if I want to save a message that displays on my screen with the most sophisticated encryption known to man (and reprint or resend it later), I don’t have to be a security guru to do it. Or forget the fancy software… Hitting the Print Screen button and pasting (CTRL-V) the results into MS Paint should do the trick.