Wednesday, November 4, 2015

Partly Cloudy -- ramblings on cloud storage

I've been using SugarSync as a cloud storage service for a few years.  It works well enough, but with so many cloud storage providers these days, native support for it (i.e., built-in support in iOS apps or other third-party clients) isn't there.  It doesn't offer WebDAV or third-party client support (more on these later), and it's expensive.

So I've been looking at alternatives, made a fairly knee-jerk decision, and bought 1 TB of storage from Dropbox.  They're supported nearly everywhere there's a cloud storage option, and I think 1 TB was about $99 for a year, which only buys 250 GB from SugarSync.  I probably should have considered Amazon cloud storage and its unlimited plan for $69, but I didn't find out about it until later.  I doubt it will stay unlimited, and I've been reading that they are throttling some clients because API use is overwhelming the platform.  It might be useful in the future.

The feature I will kind of miss from SugarSync is the ability to share arbitrary folders (i.e., they didn't all have to live under one folder, as with Dropbox).  I'm kind of simulating this with Dropbox by using Windows symbolic links in my Dropbox folder.  I've read there are limits, but it remains to be seen whether these will be annoying (e.g., changes not being picked up without restarting the Dropbox client).  Dropbox does now have selective sync per client, so I can do partial syncs on small storage devices without clobbering all of them with 1 TB of junk.
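For what it's worth, the symbolic link trick can be scripted.  This is only a minimal sketch: the helper name `link_into_sync_folder` and the folder names are made up for illustration, and it says nothing about how the Dropbox client treats the link once it exists.  On Windows, creating symbolic links normally needs an elevated prompt or Developer Mode.

```python
import os
import tempfile
from pathlib import Path


def link_into_sync_folder(source: Path, sync_root: Path) -> Path:
    """Create sync_root/<source name> as a symbolic link pointing at source.

    Hypothetical helper: it only creates the link and refuses to overwrite
    anything that already exists at the link location.
    """
    link = sync_root / source.name
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    # target_is_directory only matters on Windows; it is ignored elsewhere.
    os.symlink(source, link, target_is_directory=source.is_dir())
    return link


if __name__ == "__main__":
    # Demo with throwaway directories standing in for a real data folder
    # and a real sync folder.
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "projects"
        data.mkdir()
        (data / "notes.txt").write_text("hello")
        sync_root = Path(tmp) / "Dropbox"
        sync_root.mkdir()
        link = link_into_sync_folder(data, sync_root)
        print(link, "->", os.readlink(link))
```

The data stays where it already lives and only a pointer goes into the sync folder, which is the whole appeal over copying things around.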

One alternative I looked at was OwnCloud.  It works pretty well -- although I did download someone else's prebuilt VMware VM for it.  They did a slick job of it (I have only a couple of minor criticisms).  It speaks WebDAV, so most third-party apps can be clients, it seems to be fairly feature rich, and it's obviously very private (no worries about leaking data).

The downside is, well, hosting my own data defeats some of the purpose.  I'm burning double the storage if I'm actually syncing my true source data (i.e., one copy on my workstation and a second copy synced to OwnCloud).

Like many other useful applications, it follows the open source clusterfuck of modules model -- you need a web server, you need PHP, you need a database, you need a supported host OS.  A lot of complex moving parts to maintain and secure.  The third-party preconfigured VM is great and fairly simple to get going, but suffers from the not-really-an-appliance problem where, no matter how slickly packaged, at its core it's not an appliance by intent, unlike pfsense or nas4free or other similar "appliance" installs.  I have kept it running, however, because it does seem to work pretty well and it has the advantage of near zero upload time from my workstation.

My gripes about the prepackaged install are small -- it's a vmx/vmdk, not an OVA.  The creator helpfully has steps for expanding the storage, but didn't consider that the smartest way to do this would be to keep the data storage on a separate virtual disk, so you could expand that and leave the system disk alone.

My latest foray is into third-party cloud storage *clients*.  Normally (and probably always on some of my systems) I would just run the vendor's native sync client.  Free, built to work with their service, and generally stable.  But I've discovered third-party clients that let you run less software overall and link up to multiple cloud storage services.

Netdrive seems like the best choice from what I've seen, although I have a hard time differentiating it from ExpanDrive.  I think neither one does actual sync, which is a feature for using them on systems with small local storage (or where you don't want sync at all).  They both cache, but neither seems to give you total control of caching (e.g., pinning some files so they always stay cached, or other behavior).  Netdrive is better because it lets you control cache placement and size, but ExpanDrive is supposed to have some fancy caching algorithm that's more aggressive.
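To make the pinning idea concrete, here's a rough sketch of the bookkeeping I have in mind: a size-capped cache that evicts the least recently used files but never evicts pinned ones.  The `PinnableCache` class and its methods are invented for illustration; this is not how Netdrive or ExpanDrive actually work, and it only tracks names and sizes rather than real file contents.

```python
from collections import OrderedDict


class PinnableCache:
    """Track cached files under a byte budget, skipping pinned files on eviction."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self._entries = OrderedDict()  # path -> size, least recently used first
        self._pinned = set()
        self.used = 0

    def pin(self, path: str) -> None:
        self._pinned.add(path)

    def unpin(self, path: str) -> None:
        self._pinned.discard(path)
        self._evict()

    def touch(self, path: str, size: int) -> None:
        """Record that path (size bytes) was just fetched or read."""
        if path in self._entries:
            self.used -= self._entries.pop(path)
        self._entries[path] = size  # re-inserting moves it to the "recent" end
        self.used += size
        self._evict()

    def _evict(self) -> None:
        # Walk from least to most recently used, dropping unpinned entries
        # until the cache fits the budget again.
        for path in list(self._entries):
            if self.used <= self.max_bytes:
                break
            if path in self._pinned:
                continue
            self.used -= self._entries.pop(path)

    def cached(self) -> list:
        return list(self._entries)


if __name__ == "__main__":
    cache = PinnableCache(max_bytes=100)
    cache.pin("taxes.pdf")
    cache.touch("taxes.pdf", 60)
    cache.touch("movie.mkv", 30)
    cache.touch("photo.jpg", 30)  # over budget: the oldest unpinned file goes
    print(cache.cached())  # ['taxes.pdf', 'photo.jpg']
```

One design wrinkle: if the pinned files alone exceed the budget, this sketch simply runs over the limit rather than breaking the pin, which is probably what I'd want from a real client too.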

StableBit CloudDrive is an interesting one because it does encryption, but really what it does is create a file-backed virtual disk on your cloud storage.  Its caching is block level, which is more sophisticated, and it works with non-cloud-storage providers, like local disks and SMB shares.  Downside is no WebDAV access.  And in some ways, the purpose of cloud storage is easy access to files from multiple locations.  This won't work well for that because of the encryption, and it probably won't handle simultaneous access from multiple systems.

All are kind of expensive.  I think Netdrive is locked to a specific computer, and at $45 per system that's too expensive.  They need a different model, preferably one that lets me use it anywhere for less money, since as far as I can tell there's little penalty using multiple native sync clients simultaneously and they're all free with the service.

StableBit's CloudDrive is the most innovative and can be used with another product of theirs, DrivePool, a storage aggregation product.  The combination is intriguing, and I'd like to see those features combined in a future multi-cloud client.

The kind of features I would like to see in a multi-cloud client:


  • Better caching control and support
    • Pin files or directories to cache
    • Selective synchronization of directories or files
    • Cache placement and size control
  • Selective syncing between cloud storage accounts
    • Use a large-volume paid account for main storage, but selectively sync files/directories to a secondary account.  My use case is I have some updates/free tools/whatever I want to be able to give access to other people without worrying about compromising my primary storage.  This would keep "key patches/tools" in sync with a disposable account.
  • Selective encryption -- I like StableBit's encryption, but encrypting everything limits the multi-point access utility of typical cloud storage.  I'd like to be able to encrypt at the folder level (which could be the file-based blob storage).
  • RAID-like storage among cloud storage accounts
    • Complete mirroring would make cloud storage more highly available if there was a loss of connectivity or problems with any one provider
    • Possible performance benefits (if any single provider was rate limiting)
    • Parity style RAID among providers would provide both redundancy and a measure of security since access to any one provider wouldn't be enough to use any data.  A dedicated parity store could be kept local, improving performance and securing the actual data further.
  • LAN sync -- Dropbox does this now, but they all should do it, *and* they should allow LAN only sync folders.
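The parity-style RAID idea from the list above can be sketched in a few lines.  This is a toy under stated assumptions: two data stripes plus one XOR parity stripe, with the function names (`split_with_parity`, `rebuild`) invented for illustration and the "providers" reduced to plain byte strings.  Note that a single data stripe is still a readable fragment of the file, so this gives redundancy and some obscurity, not real encryption.

```python
from typing import Optional, Tuple


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes) -> Tuple[bytes, bytes, bytes, int]:
    """Split data into two stripes plus an XOR parity stripe.

    Returns (stripe_a, stripe_b, parity, original_length).  stripe_b is
    zero-padded so all three pieces have the same length.
    """
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\0")
    return a, b, xor_blocks(a, b), len(data)


def rebuild(a: Optional[bytes], b: Optional[bytes],
            parity: Optional[bytes], length: int) -> bytes:
    """Recover the original data when at most one of the three pieces is missing."""
    if [a, b, parity].count(None) > 1:
        raise ValueError("need at least two of the three pieces")
    if a is None:
        a = xor_blocks(b, parity)
    elif b is None:
        b = xor_blocks(a, parity)
    return (a + b)[:length]


if __name__ == "__main__":
    original = b"key patches and tools"
    a, b, parity, n = split_with_parity(original)
    # Pretend the provider holding stripe A is unreachable.
    print(rebuild(None, b, parity, n) == original)
```

In the scheme I described, stripe A and stripe B would each go to a different provider and the parity stripe could stay on a local disk, so losing any one of the three still leaves the data recoverable.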