For All The Hype, Cloud Storage May Be Mostly Vapor

News Posted: Mon, Oct 12 2009 6:26 PM
It's impossible, these days, to swing a dead cat more than about six inches without running into "the cloud" in one form or another. Whether we're discussing cloud computing as a storage concept, server provisioning, or consumer appliance, moving data into or out of the cloud has been a major topic over the past year. For all the traction cloud computing has found as an idea, however, it may end up being practically useful to a relatively small subsection of the total storage market.

In theory, cloud computing could free data centers from needing to keep quite so much storage around, provided they've got the time and bandwidth to upload the data in the first place.

A quick glance at the broadband speeds typically available to both commercial businesses and residential consumers illustrates the problem. ADSL service in my area tops out at 512Kbps upstream and, in my own experience, delivers about 60 percent of that at best. Cable is theoretically stronger on this point, but even if InsightBB is capable of delivering 100 percent of its advertised 2Mbps upload speed to business customers, that relatively wide pipe could pale in comparison to even modest needs. At top speed, that's about 21GB a day, which still works out to roughly 49 days per TB. With consumer 1TB drives now hovering just above eight cents per gigabyte, the need to back up data in such volumes is no longer an enterprise-only consideration.
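For readers who want to check the arithmetic, here is a minimal sketch. The function name `upload_days` and the `efficiency` parameter are illustrative, and it assumes a decimal terabyte (10^12 bytes) and a link that stays saturated around the clock. With binary units the 2Mbps figure comes out closer to 51 days, so the roughly 49 days quoted above sits inside that range.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte; an assumption, the article does not say which unit it uses


def upload_days(data_bytes: float, upstream_bits_per_sec: float,
                efficiency: float = 1.0) -> float:
    """Days needed to upload data_bytes over a link that runs flat out.

    efficiency is the fraction of the advertised rate actually delivered.
    """
    effective_rate = upstream_bits_per_sec * efficiency
    return data_bytes * 8 / effective_rate / SECONDS_PER_DAY


# 2 Mbps cable upstream at 100 percent of the advertised rate
print(round(upload_days(TB, 2_000_000), 1))

# 512 Kbps ADSL upstream delivering 60 percent of the advertised rate
print(round(upload_days(TB, 512_000, efficiency=0.6), 1))
```

On the ADSL numbers above, the same terabyte takes the better part of a year.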

Short of a sea change, available storage capacity at any given price point is going to continue to grow markedly faster than available upload bandwidth, which could put practical constraints on just how large the "cloud" can get. As a storage facility for essential files or a few shared photo albums, the cloud works well. Whether or not it will ever evolve into the "one-size-fits-all" business solution it's been touted as in certain circles is very much an open question.
3vi1 replied on Wed, Oct 14 2009 9:42 AM

I've got an idea that I think fixes the upstream bandwidth problem and makes this workable.

First: On the client side, my app would let you define directories and/or file types as either common/public or private/sensitive. By default almost everything would be common, except for known extensions for tax data, address books, etc. For security purposes, files classified as private/sensitive would not use the hashing scheme I'm about to go into. Instead, private files would be encrypted and stored individually using the standard methods employed today.

Now, here's the good part: The client app installs a service that computes SHA-2 hashes for all your files. Then, when you start your backup, the software sends only those hashes to the server, which uses them to build a list of the files it doesn't already know. Because the lookup is by hash, the filename doesn't even matter. If a file is already known, you never have to transfer it to the server. The server just adds a pointer to the existing known file in its "backup".
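Roughly, the exchange could look like the sketch below. Everything in it is made up for illustration: `DedupServer` is a toy in-memory stand-in for the service, `backup` takes file contents as a dict instead of walking a disk, and SHA-256 is picked as the SHA-2 variant. It skips the private-file path, encryption, and the network entirely.

```python
import hashlib


class DedupServer:
    """Toy in-memory stand-in for the backup service (hypothetical)."""

    def __init__(self):
        self.blobs = {}      # hex digest -> file contents, stored once
        self.manifests = {}  # user -> {path: hex digest} pointers

    def unknown_hashes(self, hashes):
        """Return the subset of hashes the server has never seen."""
        return {h for h in hashes if h not in self.blobs}

    def store(self, digest, data):
        """Accept an upload, re-hashing it so a client cannot lie about content."""
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("digest does not match uploaded data")
        self.blobs[digest] = data

    def record(self, user, manifest):
        """Save the user's path -> hash pointers; no file data involved."""
        self.manifests[user] = dict(manifest)


def backup(server, user, files):
    """Back up files (path -> bytes); return the number of bytes uploaded."""
    manifest = {path: hashlib.sha256(data).hexdigest()
                for path, data in files.items()}
    missing = server.unknown_hashes(manifest.values())
    uploaded = 0
    for path, data in files.items():
        digest = manifest[path]
        if digest in missing:
            server.store(digest, data)
            missing.discard(digest)  # identical local copies upload only once
            uploaded += len(data)
    server.record(user, manifest)
    return uploaded


server = DedupServer()
# Alice goes first and pays the full upload cost.
print(backup(server, "alice", {"C:/win/kernel.dll": b"x" * 1000,
                               "notes.txt": b"alice"}))
# Bob has the same system file under a different name; only his own file moves.
print(backup(server, "bob", {"D:/os/kernel.dll": b"x" * 1000,
                             "todo.txt": b"bob"}))
```

Note the server re-hashes whatever it receives. Without that check, one client could register bad data under a hash that everyone else then points to.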

This means that most users' backups would occur lightning fast. One person backs up Windows 7 = all other users can back up Win 7 in a minute, even on dial-up. As more and more users join the service, everything except the truly unique and private files is instantly known and marked as backed up. It should even be possible to implement the hashing at a block level and make use of it to some extent for partially unique files.
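A quick sketch of what block-level hashing might look like, assuming fixed-size 4096-byte blocks and SHA-256. The block size and the names `block_hashes` and `blocks_to_upload` are arbitrary choices for the example, not part of any existing tool.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of data and return the hex digests in order."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def blocks_to_upload(data, known, block_size=BLOCK_SIZE):
    """Return (index, block) pairs for blocks whose hash is not in known."""
    pending = []
    for index, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        if hashlib.sha256(block).hexdigest() not in known:
            pending.append((index, block))
    return pending


original = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
known = set(block_hashes(original))  # pretend the server already holds these

# Same file with only the middle block changed
modified = b"A" * 4096 + b"X" * 4096 + b"C" * 4096
print([index for index, _ in blocks_to_upload(modified, known)])
```

Fixed-size blocks only help when a file changes in place. Insert a single byte near the start and every later block shifts, so all of their hashes change; a content-defined chunking scheme would be needed to handle that case.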

Also, from that point on, all backups are incremental.  Even your unique/private files will have hashes that the cloud service deems private for your account, so they can be skipped when they don't change.  Running every day, your backups would take just a few seconds.
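The incremental part boils down to comparing yesterday's hash list with today's. A minimal sketch, assuming each backup run keeps a manifest mapping path to hash (the name `changed_paths` and the manifest layout are hypothetical):

```python
def changed_paths(previous, current):
    """Return paths that are new or whose hash differs from the last run."""
    return sorted(path for path, digest in current.items()
                  if previous.get(path) != digest)


yesterday = {"taxes.xls": "aa11", "photo.jpg": "bb22"}
today = {"taxes.xls": "aa11", "photo.jpg": "cc33", "new.doc": "dd44"}

# Only the edited photo and the new document need any attention.
print(changed_paths(yesterday, today))
```

Computing the hashes still means reading every file locally, so the few-seconds figure presumably relies on the background service keeping hashes current as files change.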

This is somewhat similar to what BackupPC for Linux lets you do on your own network with MD5 hashes. The differences are the block-level hashing and the public/private split, which I added to guard against future collision attacks (none are known for SHA-2 at this time).

Anyone got a ton of money and need a Senior Technologist for their startup?  :)

What part of "Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn" don't you understand?

