It's been nearly six years since Microsoft released Windows Server 2003 and its accompanying Volume Shadow Copy Service (VSS). Most people think of VSS as Microsoft’s storage volume snapshot. Yes, it includes snapshots, but it is much more than that: it is a snapshot framework with a rich set of APIs for third-party software integration. This lets storage array vendors such as EMC or Network Appliance integrate their array-based snapshot solutions with VSS. It also lets application vendors build their applications to participate in the snapshot process, so that the volume snapshot of an application’s data is a consistent image. Here’s an architectural diagram of VSS (courtesy Microsoft):
The framework is the most important aspect of VSS. It forms the basis for data protection using server-less backup. In the architectural diagram, applications are the “Writers,” and server-less backup software or snapshot management consoles are the “Requestors.” Microsoft includes providers for its own storage subsystem, and storage vendors can plug their own providers into the framework.
The nice thing about VSS is that applications, especially databases, are given a signal to make their data in storage consistent. They can flush buffers and complete in-flight transactions as if they were shutting down cleanly, and then signal back that they are ready. The snapshot then captures a clean image of the application’s data, one that the application knows it can open without lengthy repairs or replaying redo logs.
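The signal-flush-ready handshake described above can be sketched in a few lines of Python. This is only an illustration of the flow, not the VSS API: the `Writer`, `Provider`, and `request_snapshot` names are hypothetical, and the "disk" is just an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Writer:
    """An application that takes part in the snapshot (hypothetical interface)."""
    name: str
    dirty_buffers: list = field(default_factory=list)
    on_disk: list = field(default_factory=list)
    frozen: bool = False

    def write(self, record):
        if self.frozen:
            raise RuntimeError(f"{self.name} is frozen for a snapshot")
        self.dirty_buffers.append(record)

    def freeze(self):
        # Flush buffered data so the on-disk state is consistent, then hold writes.
        self.on_disk.extend(self.dirty_buffers)
        self.dirty_buffers.clear()
        self.frozen = True
        return True  # the "ready" signal back to the requester

    def thaw(self):
        self.frozen = False


class Provider:
    """Stands in for the component that captures the volume image."""

    def snapshot(self, writers):
        # Only data that has reached "disk" is captured; unflushed buffers are not.
        return {w.name: list(w.on_disk) for w in writers}


def request_snapshot(writers, provider):
    """Requester logic: freeze every writer, take the snapshot, always thaw."""
    try:
        if not all(w.freeze() for w in writers):
            raise RuntimeError("a writer was not ready")
        return provider.snapshot(writers)
    finally:
        for w in writers:
            w.thaw()
```

The point of the sketch is the ordering: the snapshot is taken only after every writer has flushed and reported ready, and writers are thawed even if the snapshot fails.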
So now let’s go back to the beginning of this post. It’s been six years since Microsoft released this feature. Storage demands have kept growing and show no sign of stopping. The number of server-less backup products available for Windows Server continues to grow, and enterprises have been deploying these products over the past couple of years to shorten the dreaded “backup window.”
But where’s Linux?
Linux distributions don’t have any equivalent feature. Yes, modern Linux distributions include the dm-snapshot module, but that only takes the snapshot. There is no framework for application developers, storage vendors, or backup vendors to integrate with. The result has been a few one-off solutions focused on a particular application stack, but for the most part applications on Linux just suffer from inconsistent snapshots. These are what we call crash-consistent snapshots, meaning that the application is responsible for figuring out the state of its data within the snapshot and getting it to a state from which it can begin running. Most databases can take upwards of twenty minutes to mount a crash-consistent snapshot.
So this is a call to the Linux community: Please band together and architect a snapshot framework for all Linux application developers, storage developers, and users to benefit from.
[Posted by: Richard Jones]


See: LVM snapshots, which have been around since forever. VSS is just a snapshot manager, and Linux already has that facility, with a real volume manager built in.
And Linux had this ability about 4 years before VSS in Win2003.
Not sure what else you're waiting for. BTRFS maybe, which is now in Linux?
Posted by: Han Solo | February 03, 2009 at 10:58 AM
As I mention near the end of the post, Linux has had dm-snapshot and the ability to manage snapshots and restore from them. However, it lacks any framework for application integration that signals apps to "freeze" their data for a snapshot and then to "thaw" after the snapshot has occurred.
A discussion with Red Hat engineers on this subject prompted this post. We are not aware of any such framework for Linux in open source. Veritas has its closed-source Application Quiescence product that offers such a framework, but that's it.
Posted by: Richard Jones | February 03, 2009 at 11:50 AM
Hey Richard, you may remember that we wrote up a list of requirements for Data Center Linux (DCL), and published them by working with the Open Source Development Lab (OSDL). We included the need for a Freeze/Thaw framework (like the one NetWare added due to SAN integration experience). The Linux kernel community, in keeping with the philosophy of mechanism, not policy, has now added a freeze/thaw API in 2.6.29 [1]. With a frozen file system, one can then snapshot the backing storage, as the previous poster noted, using either Linux host-based dm suspend/resume and snapshot [2] or the external SAN's snapshot feature.
This all has to be coordinated, of course, with open APIs that allow integration with third-party components in the whole stack. That is the policy part of an overall solution, and it builds on the mechanisms present in the Linux kernel. I guess your point is that Linux should provide that part of the solution too. With respect to API support, you might be interested to know that Novell's SLES11 provides some of the SMI host profiles that are necessary to Orchestrate an end-to-end solution. Freezing the higher-layer applications, whether they run in the same address space as the host OS or in their own address space, is a distributed management problem that requires coordination of multiple layers, including the Linux kernel (again, in either the hosting or the hosted OS when running apps in VMs).
So one might argue that the Linux kernel APIs are present, and that what's actually missing is policy-based Orchestration of the entire storage stack, driven by knowledge of the distributed components in that stack. Some vendors are working on this [3] ;-)
Cheers,
Robert
[1] http://www.kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.29-rc1
[2] http://linux.die.net/man/8/dmsetup
[3] http://debaer.org/blog/wp-content/uploads/2008/02/osm.pdf
Posted by: Robert Wipfel | February 10, 2009 at 08:41 PM
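For readers who want to see the kernel mechanism mentioned in the comment above, here is a minimal Python sketch of the file system freeze/thaw ioctls (`FIFREEZE` and `FITHAW` from `linux/fs.h`). It assumes the generic ioctl number encoding used on x86 and ARM, and the `frozen_filesystem` helper and its `ioctl` parameter are illustrative names, not part of any library. Actually freezing a file system needs root and a file system that supports it.

```python
import contextlib
import fcntl
import os


def _iowr(type_char, nr, size):
    """Encode a read/write ioctl number using the generic Linux layout.

    Assumption: the asm-generic encoding used on x86 and ARM; a few
    architectures (for example PowerPC and MIPS) lay the bits out differently.
    """
    direction = 3  # _IOC_READ | _IOC_WRITE
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


FIFREEZE = _iowr("X", 119, 4)  # linux/fs.h: _IOWR('X', 119, int)
FITHAW = _iowr("X", 120, 4)    # linux/fs.h: _IOWR('X', 120, int)


@contextlib.contextmanager
def frozen_filesystem(mount_point, ioctl=fcntl.ioctl):
    """Freeze the file system at mount_point for the duration of the block.

    The ioctl parameter exists only so the flow can be exercised without root.
    """
    fd = os.open(mount_point, os.O_RDONLY)
    try:
        ioctl(fd, FIFREEZE, 0)
        try:
            yield
        finally:
            ioctl(fd, FITHAW, 0)  # always thaw, even if the snapshot failed
    finally:
        os.close(fd)
```

A caller would wrap whatever takes the snapshot (dm-snapshot, LVM, or an array-side command) in `with frozen_filesystem("/some/mount"):`. This covers only the file system layer; signaling applications above it is the orchestration piece the thread is discussing.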
Thanks Robert! Looks like progress is being made. A framework to orchestrate the entire stack is what I'm referring to. Sounds like any distribution built on 2.6.29 or later will have the foundation on which to build the framework for the rest of the stack. In my post, I was outlining that this is what Microsoft did with VSS in Win 2003. VSS includes the foundation plus the framework for whole-stack integration. Microsoft (of course) integrated their own applications, and within about 3 years third-party vendors began to integrate their apps. I remember talking to a number of customers that were looking for similar support in Linux back in 2006, and it just wasn't there. They went down the Windows path simply because Linux at the time lacked the third-party solution integration needed for an orchestrated whole-stack snapshot.
Posted by: Richard Jones | February 11, 2009 at 05:49 AM
(I know this is an old entry, but):
This is something where D-Bus can be very useful, as it provides another piece of the puzzle in the form of inter-application and application-to-system signaling. The trouble is that, as with most things in the Linux-related world, some kind of agreement needs to be reached by interested parties.
All it really takes for app consistency is two D-Bus events:
1) "Prepare for snapshot - pause activity, fsync, etc."
2) "Snapshot taken/failed/cancelled, safe to resume"
... though there needs to be some aliveness-checking and time-limiting in place so that a broken application can't cause indefinite outages in other apps by not responding to a snapshot message.
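As a rough sketch of that two-message protocol with a time limit, the following Python stands in for the bus with plain method calls and a thread pool. Nothing here is a real D-Bus API: `DemoApp`, `coordinate_snapshot`, and the timeout value are hypothetical, and the choice to go ahead with the snapshot when an app fails to answer (rather than abort) is an assumption.

```python
import concurrent.futures as cf
import time


class DemoApp:
    """Hypothetical participant; a real one would react to bus messages."""

    def __init__(self, prepare_delay_s=0.0):
        self.prepare_delay_s = prepare_delay_s
        self.events = []

    def prepare(self):
        time.sleep(self.prepare_delay_s)  # stands in for pausing and fsync
        self.events.append("prepared")

    def resume(self, snapshot_ok):
        self.events.append(("resume", snapshot_ok))


def coordinate_snapshot(apps, take_snapshot, timeout_s=1.0):
    """Send 'prepare' to all apps, wait up to timeout_s, snapshot, then 'resume'.

    Returns (snapshot_ok, names_of_unresponsive_apps).
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(apps)))
    pending = {name: pool.submit(app.prepare) for name, app in apps.items()}
    deadline = time.monotonic() + timeout_s
    unresponsive = []
    for name, future in pending.items():
        try:
            future.result(timeout=max(0.0, deadline - time.monotonic()))
        except Exception:  # timed out or the app raised: do not wait further
            unresponsive.append(name)
    snapshot_ok = False
    try:
        snapshot_ok = bool(take_snapshot())
    finally:
        for app in apps.values():
            app.resume(snapshot_ok)  # message 2 goes out whatever happened
        pool.shutdown(wait=False)  # do not block on a hung participant
    return snapshot_ok, unresponsive
```

Well-behaved apps are held for at most `timeout_s` plus the snapshot time, and the caller learns which apps did not acknowledge, so their data in the snapshot can be treated as merely crash consistent.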
Off the top of my head, products that'd benefit from this include Cyrus IMAPd, PostgreSQL, MySQL, any other database you care to name...
I've recently fallen "in envy of" VSS. Other than the unfortunate naming clash with Visual SourceSafe, it's a perfect piece of sysadmin heaven. By contrast, on Linux:
- Snapshots require you to use LVM. LVM is badly broken with respect to write barriers, and can trash your data if your app/file system expects write barriers to be honoured.
- Snapshots are block-level not file-system-level, so the file system isn't aware of the snapshot being taken.
- Because snapshots are block-level and not filesystem-aware, the snapshot must track even low-level file system activity like file defragmentation, erasing free space, etc. This means they grow fast and have a higher impact on system I/O load.
- Accessing a snapshot requires mounting it manually to another location and faffing around with the changed paths. There's no way for an app to simply request that it sees a consistent point-in-time view of the FS instead of the current state, as in VSS.
- The file system being mounted has to be able to cope with being mounted read-only in a crashed state - it can't do journal replay/recovery etc. LVM doesn't even give the *filesystem* a chance to make itself consistent before the snapshot.
- Snapshots don't age out particularly gracefully. You have to guess how much space you'll need to store changes, as LVM isn't smart enough to just use free space in the VG until it runs out. The snapshot's change tracking space is reserved and can't be used for anything else (even another snapshot) until the snapshot is freed. If you over-estimate you can have fewer snapshots and need to keep more free space around. If you under-estimate your snapshot may die before you finish using it. Argh!
So: even with the ability to pause app activity, there's unfortunately a lot more to be done at the LVM level before anything even half as good as VSS is possible. LVM is more than a bit half-baked.
Posted by: Craig Ringer | April 15, 2010 at 10:28 PM
Thanks Craig! Excellent comment!
Posted by: Richard Jones | April 16, 2010 at 06:23 AM