Previously on StorSimple deepdive
So we’ve talked about the underlying hardware solution that StorSimple provides in my first deep dive here, then we moved on to the storage efficiencies, life of a block and cloud integration here. In this third and, for now, final deep dive post I’m going to touch on how StorSimple provides the mechanism to efficiently back up, restore and even provide a DR solution without the need for secondary or tertiary sites and data centres.
Fingerprints, chunks and SnapShots…we know where your blocks live
StorSimple fingerprints, chunks and tracks all blocks that are written to the appliance. This allows it to take very efficient local snapshots that take up no space and have no performance impact. It doesn’t have to go out and read through all the metadata to work out which blocks or files have changed, like a traditional backup does. Reading through file information is one of the worst enemies of backing up unstructured data: if you have millions of files (which is common) it can take hours just to read through the data to work out which files have changed before you back up a single file. So StorSimple can efficiently give you local points in time that are near instant to take and to restore from.
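To make the idea concrete, here is a minimal sketch of a fingerprinted block store, assuming fixed-size chunks and SHA-256 fingerprints. The class name `BlockStore`, the tiny `CHUNK_SIZE` and the method names are all made up for illustration; this is not how the appliance is actually implemented, just a toy showing why a snapshot can be a copy of pointers rather than a copy of data.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to follow


class BlockStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by fingerprint."""

    def __init__(self):
        self.chunks = {}     # fingerprint -> chunk bytes (stored once)
        self.volume = []     # ordered fingerprints making up the live volume
        self.snapshots = {}  # snapshot name -> frozen tuple of fingerprints

    def write(self, data: bytes) -> None:
        """Replace the volume contents, storing only chunks not seen before."""
        self.volume = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)
            self.volume.append(fp)

    def snapshot(self, name: str) -> None:
        """A snapshot copies the list of fingerprints only, never the data."""
        self.snapshots[name] = tuple(self.volume)

    def read_snapshot(self, name: str) -> bytes:
        """Rebuild the point-in-time contents from the stored chunks."""
        return b"".join(self.chunks[fp] for fp in self.snapshots[name])
```

Taking the snapshot never walks any files: it just freezes the current list of fingerprints, which is why it is quick however many files sit on the volume.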
I disagree! A snapshot doesn’t count!
However it is my opinion that a snapshot is not a backup…so why the heck is my blog title about backup, restore and DR?! It is because I also believe that a snapshot is a backup if it is replicated. StorSimple provides another option for snapshots called “Cloud Snapshots”. This takes a copy of all the data on a single volume, or multiple volumes, up to Windows Azure, including all the metadata. Obviously the first cloud snapshot is the whole data set, but we make this easier as all the data is deduplicated, compressed and protected with AES 256-bit encryption. After this first baseline only unique block changes, which are optimized with dedupe and compression and then encrypted, are taken up to Windows Azure. These cloud snapshots are policy based and can be kept for hours, days, weeks, months or years as required.
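The baseline-then-changes behaviour can be sketched roughly like this, assuming the cloud is just a dictionary keyed by block fingerprint. The function name `cloud_snapshot` is hypothetical, and the encryption step is only marked with a comment because the Python standard library has no AES; treat it as an illustration of the idea rather than the real upload path.

```python
import hashlib
import zlib


def cloud_snapshot(chunks, cloud):
    """Upload only chunks the cloud does not already hold.

    chunks: list of bytes objects making up the volume
    cloud:  dict standing in for cloud storage (fingerprint -> compressed bytes)
    Returns (block_map, uploaded_count).
    """
    block_map = []
    uploaded = 0
    for chunk in chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in cloud:
            # Deduplicated by fingerprint, then compressed.
            # A real system would also encrypt the block before it leaves the site.
            cloud[fp] = zlib.compress(chunk)
            uploaded += 1
        block_map.append(fp)  # the snapshot metadata references every block
    return block_map, uploaded
```

The first call against an empty `cloud` uploads every unique chunk. Later calls only upload chunks whose fingerprints are new, while `block_map` still describes the whole volume at that point in time.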
Data is offsite, multiple points in time are available and backup windows are generally reduced. Once data gets into Azure we further protect your information. Azure storage, by default, is configured with geo-replication turned on. This means that three copies of any block of data are kept in the primary data centre and three copies are also kept in the partner data centre. Even if you turn it off you still have three copies of your data sitting in the primary data centre. This means at least three, but generally six, copies of all data reside in Azure.
So we have simple, efficient and policy-driven snapshots and all snapshot data replicated six times, across different geographies…I think I can safely call this a backup and probably with more resiliency than most legacy tape or local disk-based backup systems customers are using now.
And now how do I restore my data?
So the scenario is someone requires some files back from months ago, or even a year ago. It is maybe a few GB at most but we still want to get it back quickly and easily, and the user also wants to search the directory structure too for some relevant information.
StorSimple offers the ability to connect to any of the cloud snapshots, create a clone and mount it to a server. This clone will not pull down any data, apart from any metadata which is not already on the StorSimple solution, so it is extremely efficient. All data however will appear local and you can browse the directory structure and only copy back the files that are required…and all the blocks that constitute these files are deduplicated and compressed, AND only the blocks which are unique and not already located on the StorSimple solution need to be copied back.
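As a rough illustration of why a restore from a clone moves so little data, here is a sketch that works out which blocks would actually need to come back over the wire. The function name `blocks_to_download` and the short string fingerprints are hypothetical.

```python
def blocks_to_download(file_fingerprints, local_fingerprints):
    """Return the fingerprints that must be fetched to restore a file.

    file_fingerprints:  ordered fingerprints of the blocks that make up the file
    local_fingerprints: fingerprints of blocks already held on the appliance
    Each missing block is listed once, even if the file references it many times.
    """
    have = set(local_fingerprints)
    needed = []
    for fp in file_fingerprints:
        if fp not in have:
            needed.append(fp)
            have.add(fp)  # after the first fetch the block is local
    return needed
```

For a file made of blocks `["a", "b", "a", "c", "d", "d"]` on an appliance that already holds `{"a", "c"}`, only `["b", "d"]` would have to be downloaded.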
The process is as simple as going to your cloud snapshots in the management console, selecting the point in time you wish to recover and selecting “clone”. You will then be prompted for a mount point or drive letter and within seconds the drive is mounted up. Couldn’t be simpler!
How does this provide a DR solution?
Cloud Snapshots can be set up with an RPO as low as 15 minutes (rate of change and bandwidth dependent). In the event of a DR where your primary data centre is a smoking hole in the ground, or washed away in a cataclysmic tidal wave/tsunami, another StorSimple appliance can then connect up to any one of those cloud snapshots and mount it up. All it needs is an internet connection, the Azure storage blob credentials and the encryption key that was used to encrypt the data.
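The "rate of change and bandwidth dependent" caveat is easy to sanity check with some back-of-the-envelope arithmetic. The sketch below assumes you know roughly how much data changes per interval and what overall reduction you get from dedupe and compression. The function names and every number used with them are made up for illustration and are not StorSimple figures.

```python
def upload_seconds(changed_bytes, reduction_ratio, bandwidth_bits_per_s):
    """Seconds needed to upload one interval's worth of changed data.

    reduction_ratio: combined dedupe and compression ratio, e.g. 2.0 means 2:1
    """
    bytes_to_send = changed_bytes / reduction_ratio
    return bytes_to_send * 8 / bandwidth_bits_per_s


def rpo_is_sustainable(changed_bytes, reduction_ratio, bandwidth_bits_per_s,
                       interval_s=15 * 60):
    """True if each snapshot can finish uploading before the next one is due."""
    return upload_seconds(changed_bytes, reduction_ratio,
                          bandwidth_bits_per_s) <= interval_s
```

With 2 GB of changes per interval, a 2:1 reduction and a 100 Mbit/s link, `upload_seconds(2e9, 2.0, 100e6)` comes to 80 seconds, comfortably inside a 15 minute window. Drop the link to 5 Mbit/s and the same changes take 1600 seconds, so the 15 minute RPO could not be held.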
The StorSimple solution then only pulls down the metadata, which is a very small subset and very quick to download, and bingo all your data and files can be presented and appear to be local. Then as the users start opening their memes of their cats and other images to create YOLO memes the optimised blocks are then downloaded and cached locally on the StorSimple appliance. In this fashion the StorSimple appliance starts re-caching all the hot data which is requested and doesn’t have to pull down data which is cold as well.
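Here is a minimal sketch of that "metadata first, data on demand" behaviour, assuming the snapshot metadata is a mapping of file name to block fingerprints and the cloud is a plain dictionary. The class `RecoveredVolume` and its attributes are hypothetical names for illustration only.

```python
class RecoveredVolume:
    """Toy DR volume: files are visible immediately, blocks are fetched on first read."""

    def __init__(self, metadata, cloud):
        self.metadata = metadata  # file name -> list of block fingerprints
        self.cloud = cloud        # fingerprint -> block bytes (stands in for Azure)
        self.cache = {}           # blocks pulled down so far (the hot data)
        self.downloads = 0        # how many blocks have actually been fetched

    def list_files(self):
        """Browsing needs metadata only, so no blocks are downloaded."""
        return sorted(self.metadata)

    def read(self, name):
        """Fetch any missing blocks for this file, cache them, return the contents."""
        parts = []
        for fp in self.metadata[name]:
            if fp not in self.cache:
                self.cache[fp] = self.cloud[fp]
                self.downloads += 1
            parts.append(self.cache[fp])
        return b"".join(parts)
```

Listing the files costs nothing in block downloads, a first read pulls down just that file's blocks, and a second read of the same file is served entirely from the local cache. Files nobody opens stay in the cloud.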
My personal opinion is that we will only see enhancements to this solution. Imagine being able to run this DR scenario entirely out of Windows Azure: suddenly having a physical DR site and hardware no longer matters…now that would be cool.
Extra benefit…faster tiering to the cloud!
In the previous deep dive here I talk about how StorSimple tiers data to the cloud based on its weighted storage layout algorithm but tries to keep as much data as possible locally so it provides optimal performance for hot and warm data. In the event that you want to copy a large amount of data to a StorSimple appliance, more than the available space left on the appliance, you won’t have to wait for data to be tiered to the cloud if you have been taking cloud snapshots.
Where there is already a copy of a block of data in the cloud, from a cloud snapshot, and it has to be tiered up, only the metadata will change, to point to the block in the cloud, and no data will have to be uploaded, letting you have your cake and eat it too. You get the efficiencies of tiering cold data to the cloud but the ability to still copy large amounts of data to the appliance without large data transfers immediately following the process.
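The metadata-only tiering step can be sketched like this, assuming local and cloud storage are both dictionaries keyed by block fingerprint. The function name `tier_block` and the `"local"`/`"cloud"` location labels are hypothetical and only there to illustrate the idea.

```python
def tier_block(fp, local, cloud, location):
    """Move one block to the cloud tier and return the number of bytes uploaded.

    fp:       fingerprint of the block being tiered
    local:    dict of blocks held on the appliance (fingerprint -> bytes)
    cloud:    dict standing in for cloud storage (fingerprint -> bytes)
    location: dict of metadata pointers (fingerprint -> "local" or "cloud")
    """
    data = local.pop(fp)  # free the local space
    uploaded = 0
    if fp not in cloud:
        # No cloud snapshot has captured this block yet, so it must be sent.
        cloud[fp] = data
        uploaded = len(data)
    # Either way the metadata now points at the cloud copy.
    location[fp] = "cloud"
    return uploaded
```

If a cloud snapshot has already uploaded the block, the function returns 0: the local space is released straight away and only the pointer changes.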
Have your say
Agree or disagree with me about a snapshot being a backup? Don’t like me using stupid sayings? Give your opinion below.