

Via: Enterprise Storage Forum | News Archive
Tags: Yahoo, Cloud computing, Storage, Google, Gmail, Cloud Storage
I've got an idea that I think fixes the upstream bandwidth problem and makes this workable.

First: on the client side, my app would let you define directories and/or file types as either common/public or private/sensitive. By default almost everything would be common, except for known extensions for tax data, address books, etc. For security purposes, files classified as private/sensitive would not use the hashing scheme I'm about to go into. Instead, private files would be encrypted and stored individually using the standard methods employed today.

Now, here's the good part: the client app installs a service that computes SHA-2 hashes for all your files. Then, when you start your backup, the software sends only those hashes to the server, which replies with a list of the hashes it doesn't already know. Because files are matched by hash, the filename doesn't even matter. If a file is known, you never have to transfer it to the server. The server just adds a pointer to the existing known file in its "backup".

This means that most users' backups would occur lightning fast. One person backs up Windows 7 = all other users can back up Win 7 in a minute, even on dial-up. As more and more users join the service, everything except the truly unique and private files is instantly known and marked as backed up. It should even be possible to implement the hashing at a block level and make use of it to some extent for partially unique files.

Also, from that point on, all backups are incremental. Even your unique/private files will have hashes that the cloud service keeps private to your account, so they can be skipped when they don't change. Running every day, your backups would take just a few seconds.

This is somewhat similar to what BackupPC for Linux allows you to do in your own network with MD5 hashes, except that BackupPC has neither the block-level hashing nor the public/private split I added to guard against future collision attacks (none are known for SHA-2 at this time).
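To make the hash exchange concrete, here's a rough Python sketch of the common-file path only (private files would go through the separate encrypted path and are left out). Everything named here is made up for illustration: `DedupServer` is an in-memory stand-in for the real service, and `backup`, `unknown_hashes`, `upload` and `commit` are hypothetical names. I've picked SHA-256 as the SHA-2 variant.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class DedupServer:
    """Hypothetical in-memory stand-in for the backup service."""

    def __init__(self):
        self.blobs = {}      # hash -> file contents, stored once for all users
        self.manifests = {}  # user -> {relative path: hash}

    def unknown_hashes(self, hashes):
        """Return the subset of hashes the server has never seen."""
        return {h for h in hashes if h not in self.blobs}

    def upload(self, file_hash, data):
        # Re-hash on the server so a client cannot register wrong content.
        if hashlib.sha256(data).hexdigest() != file_hash:
            raise ValueError("content does not match claimed hash")
        self.blobs[file_hash] = data

    def commit(self, user, manifest):
        """Record the user's backup as pointers to stored blobs."""
        missing = [h for h in manifest.values() if h not in self.blobs]
        if missing:
            raise ValueError("manifest references blobs that were never uploaded")
        self.manifests[user] = dict(manifest)


def backup(user: str, root: Path, server: DedupServer) -> int:
    """Back up every file under root; return how many files were transferred."""
    manifest = {
        str(p.relative_to(root)): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    # Only hashes go over the wire first; the server answers with what it lacks.
    needed = server.unknown_hashes(set(manifest.values()))
    uploaded = 0
    for rel, file_hash in manifest.items():
        if file_hash in needed:
            server.upload(file_hash, (root / rel).read_bytes())
            needed.discard(file_hash)  # identical local copies upload once
            uploaded += 1
    server.commit(user, manifest)
    return uploaded
```

If a second user backs up a directory holding the same bytes under a different filename, `backup` should transfer nothing for that file, and re-running a backup with no changes should transfer nothing at all. A real service would also have to decide whether knowing a hash is enough to claim a file, which this sketch doesn't address.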
Anyone got a ton of money and need a Senior Technologist for their startup? :)