Thoughts about WinFS and related technologies

There’s been a lot of discussion about the recent decision not to ship WinFS as a distinct product, but instead to incorporate its technologies into ADO.NET and SQL Server. I don’t have much to contribute to the discussion about WinFS itself since I didn’t have anything to do with it. Some of the commentary does raise thought-provoking questions about larger XML-related issues, however. OK, I guess I do feel the need to get a couple of things off my chest that I haven’t seen beaten to death in other posts.

  • I liked Stephen O’Grady’s dissection of the announcement itself: “There’s very little contrition in the post, very little apology or reflection. … When you’re announcing the death of a project that has consumed thousands of man hours of time, and affected the future of other Microsoft projects, I would have expected the post to be more retrospective…”

  • Since WinFS was not a “filesystem” (I think “FS” has stood for Future Store for a while now), comparison with the new Solaris ZFS is a bit pointless. ZFS does have some promising-sounding features (64-bit checksums, orders of magnitude more capacity, a transactional approach that makes the order of low-level I/O operations irrelevant), but I don’t think that even the most fervent hype for WinFS ever promised that kind of stuff.

  • The biggest reason for cancellation was probably simply that after all these years, no MS products had taken a dependency on WinFS. All those wonderful things that it might be able to do … if only someone would sign up to be the guinea pig. Bootstrapping that adoption was presumably why Project Orange was started.

This leads to the main point I’d argue as far as the relevance of this announcement for what I actually do for a living: No matter how great your technology, it’s not going to be a success until it solves more problems than it creates. It’s hardly an original observation to compare WinFS and the Semantic Web, but proponents of the latter should now be pondering the lessons of the former. The main reason that WinFS became “the biggest example of scope creep” is, AFAIK, that simpler text indexing / search technologies based on statistics rather than semantics have been plucking the low-hanging fruit faster than WinFS could mature; it had to get ever more ambitious to be differentiated from Google desktop search, Apple Spotlight, etc. This is probably equally true of the Semantic Web, which had much of its vision implemented by Google and responded by raising its ambitions. So far, people trying to solve the problems that WinFS set out to solve have managed without having to build the conceptual infrastructure that it demanded.

But as much as I like to rub salt in their wounds, the people who ask us to start with an ontology or an entity-relationship model may well be able to keep going after the simpler search models hit the wall. That remains to be proven, of course, but it’s interesting how the WinFS data model lives on (in an evolved form) in the ADO.NET Entity Framework. It still takes some thought and work to develop the entity data models that power this approach (or the ontologies that power the Semantic Web), but that up-front work allows more flexibility in object mapping, database evolution, etc. than the simpler currently dominant approaches do.

What I like best about the recent announcement is that it should offer many of the technologies developed for WinFS as evolutionary features on top of the existing ADO.NET and SQL Server products. What’s more, since LINQ to Entities puts the Entity Framework within a continuum that starts from LINQ itself for objects and BLinq for simple websites, developers can go as far down the path to abstraction and conceptual complexity as they need to solve their problems. If the simplest thing that could possibly work is all that is needed, stick with LINQ or LINQ to XML and the filesystem; add support for ADO.NET web servers, LINQ to SQL, or LINQ to Entities as the problem demands … or don’t if they turn out to create more problems for an application than they solve.

I think this is the basic reason why Microsoft is offering multiple LINQ-based technologies that do overlap somewhat, and may fit into a single pigeonhole in other people’s product categorizations. One can see them as different ways to do object-relational mapping, but I prefer to think of them as different stops along the continuum from direct mapping of fairly concrete classes and tables to a flexible and abstract mapping. That additional complexity will offer some developers more pain than gain, but it will allow others to abstract away irrelevant details for the relatively small price of some up-front modeling. Customers should choose whichever meets their needs, and we’ll all just have to see what actually works best for which scenarios.
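
To make the two ends of that continuum a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not the API of LINQ to SQL or the Entity Framework: the table contents, the PersonRow and Contact classes, and the CONTACT_MAPPING dictionary are all hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Hypothetical logical schema: rows as they might sit in two physical tables.
PEOPLE = [{"pid": 1, "fname": "Ada", "lname": "Lovelace"}]
PHONES = [{"pid": 1, "num": "555-0100"}, {"pid": 1, "num": "555-0199"}]


# Stop 1: direct mapping. The class mirrors one table, column for column.
@dataclass
class PersonRow:
    pid: int
    fname: str
    lname: str


def load_rows():
    """Build one object per row; the class shape is dictated by the table."""
    return [PersonRow(**row) for row in PEOPLE]


# Stop 2: conceptual mapping. The entity is shaped for the application,
# and a separate (hypothetical) mapping says where each property comes from.
@dataclass
class Contact:
    id: int
    full_name: str
    phone_numbers: list


CONTACT_MAPPING = {
    "id": lambda person, phones: person["pid"],
    "full_name": lambda person, phones: f'{person["fname"]} {person["lname"]}',
    "phone_numbers": lambda person, phones: [p["num"] for p in phones],
}


def load_contacts():
    """Build entities by applying the mapping rules to the underlying rows."""
    contacts = []
    for person in PEOPLE:
        phones = [p for p in PHONES if p["pid"] == person["pid"]]
        fields = {name: rule(person, phones) for name, rule in CONTACT_MAPPING.items()}
        contacts.append(Contact(**fields))
    return contacts
```

In the first style, a schema change ripples straight into PersonRow and every caller. In the second, the idea is that only CONTACT_MAPPING has to change while Contact stays put, which is the kind of flexibility the up-front modeling is supposed to buy.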

Comments (8)

  1. Quentin Clark has posted an update to the WinFS update and answers some of the questions that came since…

  2. MKane91301 says:

    However, there’s a key part of O/R mapping (or O/XML mapping) that isn’t clearly addressed by any of the LINQ To * technologies (it may be addressed by them, but not clearly addressed): how do you easily go from a graph of objects to a hierarchical or relational representation and back to a graph again? When multiple objects reference a single object, it can get goofy. This is something that has been handled for a very long time when marshalling an object graph across the wire, but how does LINQ handle it?

  3. MCChampion says:

    LINQ per se doesn’t handle going from graphs to relations and back to graphs, but the Entity Framework does. See  "Now, if this new conceptual schema is different from the logical schema in the actual database, how does the system know how to go back and forth between schemas? The answer is “mapping”."

    Does that answer your question?

  4. MKane91301 says:

    The ADO.NET Entity Framework adds a big piece of the puzzle for going between objects and relational tables, but I don’t see any way this fits into LINQ to XML. If I have a graph of objects, in which multiple objects reference the same object, I don’t see any easy way of going to and from XML without having to write code to manage object identity. The old .NET Remoting SOAP serializer did this for me, but I’m sure you’re well aware of all the reasons we wouldn’t want to use that any more. This seems like a big open hole waiting for some framework piece to plug it.

  5. MCChampion says:

    Right, there’s no ADO.NET XML story for Orcas that I know of. There is some interesting work being done for the future, but I don’t think we’re ready to talk about it. ["Underpromise and Overperform" is the post-WinFS motto around here :-)] Sorry if I missed that point in your question.

  6. Soumitra Sengupta says:

    I think he is referring to serializing an E-R graph and not modeling XML in EDM.