Performance, Performance, Performance

You guessed it right from the title.  This posting is all about WinFS performance and will answer the following questions:  What does it mean for WinFS to perform?  How do we measure performance?  And most importantly how can you help improve WinFS performance?  Before we dive into these questions let me introduce myself.  My name is Tony Voellm and I am the lead of the WinFS performance team.  In one way or another I have been working on making code perform since writing my first program at an early age.  Have you ever wondered how to make “10 Print “Tony”, 20 goto 10” run faster? I have.

Now let’s look into the first question.  What does it mean for WinFS to perform?  Performance means different things to different people.  Some people look at benchmarks to decide whether a piece of code is performing.  The better the score, the better they think the code performs.  Others define performance in terms of consumption of resources such as I/O, CPU, and memory.  In other words, a performant piece of code uses fewer resources.  Both of these definitions can be useful.  I prefer to look at performance from a user expectation viewpoint.  Human perception is around 20ms (30 frames / sec for TV = 33ms), so any computer response that happens in less than the human perception time could be perceived as performing.  However, there is a catch.  The operation might complete in less than 20ms but leave the machine in such a state that subsequent operations that used to return in less than 20ms now take longer or use more resources.  This means the operation needs not only to return in less than 20ms, but also to leave the machine in such a state that it does not adversely affect other operations.  How does this apply to WinFS?  When we look at WinFS performance we are driving it not only to minimize resource consumption and impact on the machine but also to meet user expectations.
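To make the two-part criterion concrete, here is a minimal Python sketch.  It is not how the WinFS team measures anything; the names `timed`, `judge`, and the `slack` factor are hypothetical, and the 20ms budget is simply the perception figure quoted above.  The idea is to time the operation itself and also to time a small "probe" operation before and after it, so you can see whether the machine was left in a worse state.

```python
import time

PERCEPTION_BUDGET_S = 0.020  # the ~20ms perception figure from the text


def timed(op):
    """Run op() once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    op()
    return time.perf_counter() - start


def judge(op_s, probe_before_s, probe_after_s,
          budget_s=PERCEPTION_BUDGET_S, slack=1.5):
    """Apply both halves of the criterion to already-measured timings.

    responsive:  the operation itself finished inside the budget.
    neighbourly: the probe did not get noticeably slower afterwards
                 (slack is an assumed tolerance, not a WinFS number).
    """
    return {
        "responsive": op_s < budget_s,
        "neighbourly": probe_after_s <= probe_before_s * slack,
    }
```

In practice you would call `timed` on the probe, then on the operation, then on the probe again (ideally several times each) and hand the three numbers to `judge`.  Keeping the decision separate from the timing makes it easy to change the budget or the tolerance without touching the measurement code.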

Now it’s time for the second question: How do we measure WinFS performance?  Some have asked how WinFS performs against NTFS on WinBench.  This is a good question, but it might be equally valid to ask how WinFS performs on a common database benchmark like TPC-C.  In my opinion neither of these benchmarks is suitable for WinFS because they don’t look at enough of what WinFS can do.  An example query that neither benchmark captures is “Show me all email from people I am meeting with this week.” In today’s systems email is typically part of the mail application and not the file system, so WinBench is not sufficient.  It’s not that WinBench couldn’t be extended to include query tests but rather that file systems like NTFS can’t even execute the query. At the same time TPC-C is an order processing benchmark and not really focused on rich query.  Given that today’s benchmarks don’t really measure what WinFS can do, how do we measure performance?  The answer is to drive performance based on customer scenarios like “copy 100 photos from a digital camera into WinFS” and look at how each one impacts the machine and compares to user expectations.  Another scenario focused on query is the one mentioned above, “Show me all email from people I am meeting with this week.”
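A scenario-style measurement can be sketched in a few lines of Python.  This is only an illustration under loud assumptions: it copies dummy files between two ordinary temporary folders rather than into a WinFS store, the function name `run_copy_scenario` and the default file size are made up, and it reports only elapsed time, not the I/O, CPU, or memory impact you would also want to capture.

```python
import shutil
import tempfile
import time
from pathlib import Path


def run_copy_scenario(file_count=100, file_size=64 * 1024):
    """Copy file_count dummy 'photos' between temp folders and time it."""
    with tempfile.TemporaryDirectory() as src, \
            tempfile.TemporaryDirectory() as dst:
        # Set up the scenario: stand-in files for the camera's photos.
        payload = b"\0" * file_size
        for i in range(file_count):
            Path(src, f"photo_{i:03d}.jpg").write_bytes(payload)

        # Time only the copy itself, not the setup above.
        start = time.perf_counter()
        for photo in sorted(Path(src).iterdir()):
            shutil.copy2(photo, dst)
        elapsed = time.perf_counter() - start

        copied = len(list(Path(dst).iterdir()))

    return {
        "copied": copied,
        "elapsed_s": elapsed,
        "per_item_ms": 1000 * elapsed / file_count,
    }
```

The point of measuring a whole scenario rather than a single call is that the number maps directly onto something a user would notice, such as how long the camera import takes.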

This leads us to the last question.  How can you help improve WinFS performance?  Simple as 1, 2, 3.
1) Download WinFS Beta 1 from (MSDN Subscribers & PDC attendees only). 
2) Upload your examples to
3) Post your comments here on what your experience was or how you might use it in the future. 
I’m not asking for your company secrets so please don’t post any confidential information!  Rather we want to know what types of operations you are trying to perform with WinFS. Are they more insert based like inserting contacts or query based like finding email?  Are you building a viewer app where notifications might be important, or how about peer-to-peer sharing apps that take advantage of the WinFS multi-master sync infrastructure?  The more we find out about what you plan to do and what you expect in terms of throughput and impact on the machine, the better we can make WinFS fit your expectations.  We have done a lot of work in this area to deliver a usable Beta 1.  Now we need your help to take it to the next level.

Have fun with the bits and in the future you’ll see posts looking at all types of performance numbers and topics.  I love the numbers…

Author: Tony Voellm

Comments (7)

  1. Arian Kulp says:

    I have a performance question: In one of the PDC sessions it was mentioned that underlying files placed in WinFS are actually stored in NTFS streams. I was wondering if this was mostly a simple method of obfuscation (since most users will still access their non-WinFS folders directly), or if there was a performance benefit to storing the data this way.


  2. Tony Voellm says:


    WinFS file streams are stored in NTFS to reduce the storage and retrieval overhead. Imagine trying to store a 2GB movie in a conventional database – there is a lot of data page encoding and transformation that occurs. Once a file in WinFS is opened the Read / Write path is essentially the same as NTFS which means there is almost no perf hit.

    Some questions that come up are should icons be file backed items or native WinFS items? How about documents? There are definitely trade-offs and it will depend on how the user will use your application.

    The advantage of putting files in WinFS is the ability to leverage our Search, Sync and Query APIs.


  3. Arian Kulp says:

    Thanks for your response! Actually, it didn’t completely answer my question (though it was insightful nonetheless!). I can understand why file-backed items are not stored in the relational DB itself, but I’m wondering why in streams vs. NTFS files. Are streams more performant, or is it just to effectively shield those files from users to prevent problems?


  4. Tony Voellm says:


    The term NTFS Streams was used but the file data is actually stored in NTFS files. NTFS does support multiple "streams" per file but few applications use them. Generally the default stream (0) is the only one used. In WinFS we don’t do anything special here other than storing the data in a file that has special ACLs to keep most users from finding it and being able to modify it.


  5. Ian Thomas says:

    NTFS filestreams are of great interest to me, since I’ve used the "alternate data stream" to attach metadata descriptions to some files (used in GIS – geographic info systems) so that the metadata don’t get "lost". Of course, they will get lost if the user copies the ‘parent’ file to a non-NTFS volume or to a CD or DVD, but NTBackup handles them appropriately.

    My concern has been the tendency for some anti-virus software to search out ADS (they call them "hidden streams") because they’ve been used occasionally by Trojan / virus / malware writers.

    So, my questions:

    What is the detail of the special ACLs used by WinFS (and where can I find a good technical description)?

    And what guarantees that anti-virus / anti-spyware software doesn’t interfere with the WinFS streams that we’re talking about here?

  6. Hello,

    I’m developing a custom Windows Shell Namespace Extension for WinFS. It appears in the My Computer folder under the name "Contacts". The goal is to give me a GUI and API for working more easily with the contacts stored in my WinFS store. The problem is that the otherwise excellent Project method, which retrieves custom data from a selected WinFS container, works very slowly. I work with about 300 sample contact records, and Windows Explorer stalls for about 1.5 min while loading them all into the folder view. How can I improve the performance of this call?

    I’m retrieving only DisplayName, ContactCards.EAddresses, ItemIds and that is all. What shall I do to get them faster?


    Daniel A. Kornev,

    Microsoft Student Partner.