Put it in your pipe and store it – thoughts on data storage

Roelof Temmingh introduced a fascinating idea in his talk on tea at Zacon II yesterday, and I woke up this morning with some free time and an iPad handy, so I decided to explore the concept of using a “series of tubes” as a storage medium a little more closely. For now it’s just a gathering of thoughts, but hopefully I’ll take the time later to throw some code behind the idea too.

So all around us we have this connectivity between digital devices. These “pipes” take a number of different forms: the interwebs run mainly on copper and fibre, cell phones use allocated GSM spectrum, WiFi and WiMax similarly use “air” as their means of transport, and then there’s satellite, fibre-attached storage, USB cables, IP over power lines; the list goes on. The important thing here is that when data enters one of these transport media it is effectively stored within that medium while in transit. It no longer needs to exist at the source or destination, but out of tradition and habit it usually does. A great example of this transitory data storage is cellular voice traffic: it is effectively split into packets which are sent over the “air” as fast as possible, never remaining stationary until they reach the destination, where they are effectively “deleted” as they are consumed.

But what if the data being transferred is never allowed to reach a termination point? It could be looped, and for as long as the signal stays alive in the medium, that data will be stored there. Could it be possible to send a file into GSM spectrum (the air) and keep it there? And what exactly are the data storage limits of “air”?
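One rough way to put a number on that last question: the amount of data physically in flight on a path is the transmission rate multiplied by the one-way propagation delay (the bandwidth-delay product). The sketch below works that out in Python. The link speeds and distances are made up purely for illustration, and the 0.67 velocity factor for fibre is an approximation (light in glass travels at roughly two thirds of its vacuum speed).

```python
# Rough capacity of a link used as a delay line:
# bits in flight = transmission rate x one-way propagation delay.

SPEED_OF_LIGHT_M_S = 299_792_458  # in vacuum; radio through air is very close to this


def bits_in_flight(rate_bps: float, distance_m: float, velocity_factor: float = 1.0) -> float:
    """Bits physically in transit on a one-way path of the given length."""
    delay_s = distance_m / (SPEED_OF_LIGHT_M_S * velocity_factor)
    return rate_bps * delay_s


# Hypothetical links, chosen only for illustration.
radio = bits_in_flight(rate_bps=1e6, distance_m=10_000)  # 1 Mbit/s across 10 km of air
fibre = bits_in_flight(rate_bps=10e9, distance_m=6_000_000, velocity_factor=0.67)  # 10 Gbit/s, 6000 km

print(f"radio: {radio:.0f} bits in flight")
print(f"fibre: {fibre / 8 / 1e6:.1f} MB in flight")
```

Under these assumptions the short radio hop holds only a few dozen bits at any instant, while the long fibre run holds tens of megabytes. Capacity grows with both distance and speed, so a single short hop of “air” stores very little on its own.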

Considering this idea was pitched at a security conference, we need to examine how the security and integrity of the data can be maintained in a rather intangible medium. Presumably encryption of the data before transmission solves part of the problem, with hash sums helping out too. But do we need to firewall this? After all, data in transit is ripe for interception – Bug #0 in IT security is the fact that only immovable data can be 100% secure, but data is useless until it is moved. And how do you firewall a copper cable? A GSM signal? Something to think about…
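As a minimal sketch of the “hash sums” half of that, Alice could keep a keyed hash (an HMAC) of the file before letting it go, and check it when the file comes back. The function names `seal` and `verify` are invented for this example, and encryption is deliberately left out because Python's standard library does not ship a symmetric cipher. This only detects tampering; it does nothing to stop someone reading the data in transit.

```python
import hashlib
import hmac


def seal(key: bytes, data: bytes) -> str:
    """Return a keyed SHA-256 tag that Alice keeps locally before sending data away."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key: bytes, data: bytes, tag: str) -> bool:
    """True if the data that came back off the wire still matches the stored tag."""
    return hmac.compare_digest(seal(key, data), tag)


# Hypothetical usage: the key and payload are placeholders.
key = b"alice-secret-key"
tag = seal(key, b"my file contents")

print(verify(key, b"my file contents", tag))   # unchanged payload
print(verify(key, b"my file c0ntents", tag))   # payload altered in transit
```

Note that the tag and key still live on Alice's side, so a little bit of state has to stay at rest somewhere.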

Another good analogy is to treat data more like electricity than we traditionally have. Electricity doesn’t get SYN/ACKed to your plug when you switch it on; it’s already there, stored in the medium.

The simplest implementation I can think of is a peer-to-peer setup, where Alice drops her file onto the wire, with Bob as the “destination” (sorry, I’m pissing all over the crypto guys here). But all Bob has is a reflector running which bounces that data straight back at Alice, who in turn has a reflector which pushes it back to Bob again. Introduce more reflector peers as you desire. A lot of worms succeed by working in a similar fashion. Of course Alice needs a way to pull her file back. Some sort of signal can be pushed onto the wire, chasing the file and telling the reflectors to pass it through Alice’s node for a read/write transaction. Each node, whether it be a router or a reflector, of course introduces a security risk that needs to be mitigated. (insert previous paragraph on security here)
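Here is a toy simulation of that reflector loop in Python, just to make the mechanics concrete. It uses no real networking: the ring of peers is a list of names, each “hop” is one trip across the wire, and the recall signal is crudely modelled as a hop count (`recall_at_hop`) after which every reflector knows Alice wants her file back. All of the names and that recall model are assumptions made for the sketch, not a proposed protocol.

```python
from __future__ import annotations


def circulate(payload: bytes, peers: list[str], owner: str, recall_at_hop: int) -> tuple[bytes, int]:
    """Bounce payload around the ring of peers until the owner recalls it.

    Returns the payload and the number of hops it made before landing
    back at the owner after the recall signal went out.
    """
    if owner not in peers or len(peers) < 2:
        raise ValueError("need at least two peers, one of which is the owner")

    position = peers.index(owner)  # the owner injects the payload
    hops = 0
    while True:
        # The payload is on the wire to the next peer in the ring.
        position = (position + 1) % len(peers)
        hops += 1
        recalled = hops >= recall_at_hop
        if peers[position] == owner and recalled:
            return payload, hops  # owner keeps it for a read/write transaction
        # Otherwise this peer is a pure reflector: it forwards without keeping a copy.


data, hops = circulate(b"alice's file", ["alice", "bob"], owner="alice", recall_at_hop=5)
print(data, hops)
```

With two peers and a recall at hop 5, the file is at Bob when the recall lands, so it makes one more trip and arrives at Alice on hop 6. Adding peers to the list lengthens the loop without changing the logic.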

The problem with the scenario above is that it becomes very easy to saturate a link. Ask any BitTorrent user about their experience of destroying someone’s network. We still need the links to be links; we’re not trying to build copper hard drives here… Contrary to everything previously thought about bandwidth and speed, we could actually have the data moving relatively slowly between nodes, which should reduce the risk of link saturation.
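A back-of-the-envelope way to see the effect of slowing things down: if a file crosses a given link once every so many seconds, the average share of that link it eats is the file size in bits divided by the interval, divided by the link capacity. The numbers below are hypothetical, and `link_utilisation` is just a helper name for this sketch.

```python
def link_utilisation(file_bytes: int, hop_interval_s: float, link_bps: float) -> float:
    """Average fraction of a link's capacity used when the file crosses it once per interval."""
    average_rate_bps = file_bytes * 8 / hop_interval_s
    return average_rate_bps / link_bps


# Hypothetical case: a 10 MB file looping over a 100 Mbit/s link.
fast = link_utilisation(10_000_000, hop_interval_s=1.0, link_bps=100e6)   # once per second
slow = link_utilisation(10_000_000, hop_interval_s=60.0, link_bps=100e6)  # once per minute

print(f"fast loop: {fast:.1%} of the link")
print(f"slow loop: {slow:.1%} of the link")
```

This only counts average load on the wire; it says nothing about where the bits wait between hops.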

The principle of data stored in the pipe can also be pretty robust. It aligns itself well with the original premise of a “nuclear-event proof” Internet.

This is also a pretty good covert channel. If I store nothing sensitive on my local machine, but rather push it onto the grid, I can effectively detach myself and a sound infrastructure would guarantee my ability to retrieve that data anywhere in the net. No data would be retrievable even from forensic analysis of my machine (assuming secure practices on my side).

There’s loads more to this concept to explore, but the summary I have so far is:
1. It can be done – it’s actually kind of happening right now.
2. There is a very large, though not infinite, capacity.
3. It can be secured.
4. It can be robust.
5. It’s way more fun than hard drives.

I hope to explore this idea much further. Even if just for shits and giggles.

Your thoughts?
