WinFS Backup/Restore at VSS Plugfest

On Tuesday morning, I presented at VSS Plugfest, an event hosted by Microsoft where around 100 backup/restore application developers come together at our Redmond campus. The goal of my presentation was to describe the infrastructure that we are building for backing up and restoring WinFS stores and items, and to get some feedback on the platform. Of course, before I could do this, I needed to explain WinFS: the vision, the scenarios, the data model, etc. Ordinarily, presenting all this information in a short period of time is a tad challenging but achievable (as we saw from our successful PDC sessions). However, I had only one hour to present everything!

I created a slide deck by borrowing from our PDC presentations and then adding my WinFS backup/restore slides. I started the talk by showing the IWish video from the PDC and then summarized the recent WinFS announcements: Beta 1 shipped, we’ll RTM as an out-of-band Windows component (as part of WinFX), we’ll have more betas, we’ll align with the rest of the Microsoft data stack, etc.

From there we went into an overview of the value props (unify, organize, explore, and innovate, oh my!) and how the WinFS architecture provides them. Of course, since this was a presentation from the backup app developer point of view, I had to pick and choose which parts of WinFS to drill down into and what to fly past. At one point, I think I said something along the lines of, “We also have these really cool and important Metadata Handler services, but I won’t talk about them today.” :-)

Understanding the difference between a file and an item is fairly important for a backup app developer, so I spent some time explaining how items will become the new center of user operations, and how associations relate items with either links or common value relationships. The next topic that I drilled into was how WinFS sits on NTFS, by introducing the idea of stores and shares, and how file backed items and their streams are stored on the user’s disk.

The last topic that I drilled into as part of the overview was the WinFS type system. I explained how a schema definition is used to generate code for the developer to program against and how this is packaged and deployed into a user store. I also discussed how schema associations can organize data and briefly touched on why a developer would want to extend a schema.

Tons of information already, but we’re only halfway through. :-) Most of the audience seemed really excited by WinFS (and maybe a little overwhelmed by the fire hose of information). Since my talk was immediately before lunch, there were maybe one or two folks who seemed distracted by the aroma of lunch that came wafting into the room. I can’t really blame them; it did smell pretty appetizing.

From here, we moved into the meat of the presentation: the backup and restore platform. In addition to the platform we’re building, we are planning to integrate with the available in-the-box OS backup applications. Store level backup will be available on both Vista and XP, and item level backup will be available on Vista. For store level backup, WinFS is providing a VSS writer that will expose WinFS stores as components. The implication, of course, is that the backup granularity is the entire store. Users can’t directly restore individual items from a VSS snapshot. When a store is restored from backup, WinFS will detect this and perform any necessary processing.

Because item level granularity is important, we are building extensive item level backup and restore support. The item level backup allows a user to revert a change to an item (“Oops, I didn’t mean to save that!”). The store level backup, on the other hand, is more for the scenario of a broken hard drive and a user needing all his files back. At RTM, the item level backup platform will be exposed as managed client side APIs. For Beta 1, we have stored procedures you can prototype with. However, by RTM, we may deprecate these procedures in favor of the managed API.

When we discuss item backup, we need to describe the item boundary. In other words, if all our data is related and linked, where does one item end and another item start? A WinFS item serialization contains the core item, outgoing links (just the link entity, not the linked item itself), extensions, embedded items, and incoming link IDs. We use the outgoing links and incoming link IDs to fix up the links between items at restore time. Backing up a folder works the same way, because a folder item is just like any other item; it is important to note that serializing a folder does not serialize its contained items.
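
To make the item boundary concrete, here is a minimal Python sketch. It is not the WinFS API: the `Item`, `Link`, `Store`, and `serialize_item` names are hypothetical, the in-memory store stands in for a real WinFS store, and it assumes that folder containment is modeled as a link from the folder to each contained item. The point is only that a serialized item carries its own data plus link entities and link IDs, never the items at the other end of those links.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    # Hypothetical link entity: connects a source item to a target item.
    link_id: str
    source_id: str
    target_id: str


@dataclass
class Item:
    # Hypothetical item: core data plus its extensions and embedded items.
    item_id: str
    core: dict
    extensions: list = field(default_factory=list)
    embedded_items: list = field(default_factory=list)


@dataclass
class Store:
    # Toy in-memory stand-in for a store (an assumption for illustration).
    items: dict = field(default_factory=dict)
    links: dict = field(default_factory=dict)

    def add_item(self, item):
        self.items[item.item_id] = item

    def add_link(self, link):
        self.links[link.link_id] = link


def serialize_item(store, item_id):
    """Serialize one item up to its boundary, without following any link."""
    item = store.items[item_id]
    outgoing = [l for l in store.links.values() if l.source_id == item_id]
    incoming = [l for l in store.links.values() if l.target_id == item_id]
    return {
        "item_id": item.item_id,
        "core": dict(item.core),
        "extensions": list(item.extensions),
        "embedded_items": list(item.embedded_items),
        # Only the link entity is kept, not the item it points to.
        "outgoing_links": [
            {"link_id": l.link_id, "target_id": l.target_id} for l in outgoing
        ],
        # Incoming links are recorded by ID only, for fix-up at restore time.
        "incoming_link_ids": [l.link_id for l in incoming],
    }


store = Store()
store.add_item(Item("folder", {"name": "Trip"}))
store.add_item(Item("photo", {"name": "beach.jpg"}))
store.add_item(Item("note", {"name": "packing list"}))
store.add_link(Link("L1", "folder", "photo"))  # folder contains photo
store.add_link(Link("L2", "folder", "note"))   # folder contains note
store.add_link(Link("L3", "note", "photo"))    # note refers to photo

folder_blob = serialize_item(store, "folder")
photo_blob = serialize_item(store, "photo")
```

In this sketch, `folder_blob` holds two outgoing link entities but none of the photo's or note's data, so backing up the contents of a folder means serializing each contained item separately.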

The last thing I touched on was how WinFS provides the infrastructure for incremental backup of items. Essentially, a developer can easily get an enumeration of items that have changed since a given watermark and use this list to fetch items for backing up.
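
The watermark idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the WinFS platform: `ChangeTrackingStore`, `changed_since`, and `incremental_backup` are hypothetical names, and the watermark is modeled as a simple counter that increases with every save. Deletions are not handled here.

```python
class ChangeTrackingStore:
    """Toy store that stamps every save with an increasing change number."""

    def __init__(self):
        self._clock = 0
        self._items = {}  # item_id -> (change_stamp, payload)

    def save(self, item_id, payload):
        self._clock += 1
        self._items[item_id] = (self._clock, payload)

    def current_watermark(self):
        return self._clock

    def changed_since(self, watermark):
        # Enumerate IDs of items whose last change is newer than the watermark.
        return sorted(
            item_id
            for item_id, (stamp, _) in self._items.items()
            if stamp > watermark
        )

    def fetch(self, item_id):
        return self._items[item_id][1]


def incremental_backup(store, last_watermark):
    """Back up only the items changed since last_watermark.

    Returns the backed-up items and the watermark to use next time.
    """
    # Read the new watermark first so nothing saved later is skipped.
    new_watermark = store.current_watermark()
    changed = store.changed_since(last_watermark)
    backup = {item_id: store.fetch(item_id) for item_id in changed}
    return backup, new_watermark


store = ChangeTrackingStore()
store.save("a", "v1")
store.save("b", "v1")
full, watermark = incremental_backup(store, 0)  # first run picks up everything

store.save("a", "v2")
store.save("c", "v1")
delta, next_watermark = incremental_backup(store, watermark)  # only a and c
```

The backup application only has to persist the returned watermark between runs; the second call above skips item "b" because it has not changed since the first backup.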

There were a bunch of good questions throughout my talk and during the small group discussion at the end. A lot of folks asked, “So are you a file system or a database?” In reality we’re both: “a relational file system”. Related to this question, some people asked, “Can a WinFS drive be converted back to an NTFS drive?” There is no such thing as a “WinFS drive”. Since WinFS is built on top of and dependent on NTFS, we can’t replace it. One way to think of this is that WinFS is a subsystem on top of NTFS; you can have your WinFS store on the same drive as normal NTFS files.

I’ll be producing a whitepaper on WinFS Backup/Restore that will have all the details about the VSS writer and the item level backup APIs. You can expect to see a blog post with a link to it. In the meantime, if you have any questions, please let me know.

Author: Vijay Bangaru