Great WPF Applications #15: Microsoft Indexing Service

[Update: Happy April Fool's Day, everyone! For the avoidance of doubt, this was a hoax...]

At the start of a new month, it's time for me to unveil a stunning example of how WPF can be used not just for dynamic, immersive user experiences, but even for mission-critical services like the indexing engine that sits at the heart of Windows Vista.

As you know, the role of this engine is to watch files in common directories like the Documents folder and store a keyword cache that can be quickly accessed for search purposes. It turns out that several of the lead developers on the indexing team were interns on the Avalon team in its early prototyping days. Since they had spent their entire professional careers up to that point with WPF, it's not surprising, in a way, that they saw the data binding innovations in our UI technology as applying equally well to the file system infrastructure.

Under the covers, the indexing service makes heavy use of the FileSystemWatcher class in the .NET Framework. To avoid undue complexity in the code, there's a single file system watcher, pointing at the root of %SystemDrive%, with an event handler set up to fire each time a file is touched; other drives are silently mapped to a hidden folder in the root directory (a bit like the old MS-DOS subst command-line utility).
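For anyone who wants to picture the "one watcher over the whole tree" idea, here's a minimal sketch. It's in Python rather than .NET, and it swaps in plain polling (compare two snapshots of last-modified times) for FileSystemWatcher's event callbacks, so treat it as an illustration of the concept only. The names `snapshot` and `diff` are made up for this example.

```python
import os
from typing import Dict, List, Tuple


def snapshot(root: str) -> Dict[str, float]:
    """Map every file path under root to its last-modified time."""
    times: Dict[str, float] = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                times[path] = os.stat(path).st_mtime
            except OSError:
                pass  # the file vanished between listing and stat
    return times


def diff(old: Dict[str, float], new: Dict[str, float]) -> List[Tuple[str, str]]:
    """Compare two snapshots and return sorted (event, path) pairs."""
    events = [("created", p) for p in new if p not in old]
    events += [("changed", p) for p in new if p in old and new[p] != old[p]]
    events += [("deleted", p) for p in old if p not in new]
    return sorted(events)
```

Calling `snapshot` on a folder twice, a few seconds apart, and passing both results to `diff` gives roughly the stream of events a real watcher would raise, at the cost of rescanning the whole tree each time.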

The moment an event fires, the value of WPF is immediately apparent. By using an ObservableCollection of File objects with dependency properties set for all 200+ major metadata attributes, it's possible for other applications to immediately be notified via the data binding engine of any changes detected in the underlying file system. (Obviously, in practice, we create a private subclass of File because of all the undocumented properties we need to store that aren't exposed to casual developers outside of Microsoft.)
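If you haven't met change notification before, the pattern is easy to sketch. The toy below is Python, not WPF, and it only mimics the idea behind an observable collection: subscribers register a callback and get told whenever an entry is added, changed, or removed. `ObservableFiles`, `subscribe`, `upsert`, and the event names are all hypothetical, and none of this is the .NET ObservableCollection API.

```python
from typing import Callable, Dict, List

# A listener receives (action, path, metadata snapshot).
Listener = Callable[[str, str, Dict[str, str]], None]


class ObservableFiles:
    """Toy collection of file metadata that notifies subscribers on change."""

    def __init__(self) -> None:
        self._files: Dict[str, Dict[str, str]] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def upsert(self, path: str, metadata: Dict[str, str]) -> None:
        # "added" for a new path, "changed" for one we already track.
        action = "changed" if path in self._files else "added"
        self._files[path] = dict(metadata)
        self._notify(action, path)

    def remove(self, path: str) -> None:
        # Only notify if the path was actually being tracked.
        if self._files.pop(path, None) is not None:
            self._notify("removed", path)

    def _notify(self, action: str, path: str) -> None:
        # Hand out a copy so listeners cannot mutate internal state.
        snapshot = dict(self._files.get(path, {}))
        for listener in self._listeners:
            listener(action, path, snapshot)
```

A subscriber such as `files.subscribe(lambda action, path, meta: print(action, path))` then hears about every later `upsert` or `remove`, which is the behavior the data binding engine is supposedly providing here.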

The real fun comes, however, with the indexing algorithm itself. It turns out by some coincidence that there's great similarity between the way that keywords are weighted and the way that polygons are sorted in 3D space. You may therefore be extremely surprised to find out that the indexing service is actually backed by one extremely large Viewport3D object, which contains an internal representation of all the files as individual GeometryModel3D objects. By simply re-sorting the geometry models by z-depth as a new query comes in, the indexing engine is able to quickly bubble the most relevant results up to the top. A change in the sort order can easily be accomplished by setting the camera location to a different point in 3D space and re-sorting the geometry models (a nice benefit that nobody was expecting when the architecture was constructed).
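The depth-sorting trick is simple enough to show in a few lines. This is a toy in Python, not the Viewport3D machinery: each "file" is just a name with a position in 3D space, and ranking means sorting by straight-line distance from the camera, nearest first. The function name `sort_by_depth` and the sample coordinates are invented for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def sort_by_depth(models: List[Tuple[str, Point]], camera: Point) -> List[str]:
    """Return model names ordered nearest-to-camera first."""
    ranked = sorted(models, key=lambda model: math.dist(model[1], camera))
    return [name for name, _position in ranked]


models = [
    ("a.doc", (0.0, 0.0, 5.0)),
    ("b.doc", (0.0, 0.0, 1.0)),
    ("c.doc", (0.0, 0.0, 9.0)),
]

# Moving the camera changes the ranking without touching the models.
from_origin = sort_by_depth(models, (0.0, 0.0, 0.0))
from_far_side = sort_by_depth(models, (0.0, 0.0, 10.0))
```

With the camera at the origin, `b.doc` comes out on top; move the camera to the far side and `c.doc` takes first place instead, which is the "change the sort order by moving the camera" benefit in miniature.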

One caveat about the use of WPF for the indexing service is that it can be resource-intensive due to the heavy usage of the GPU. For that reason, we disable indexing when a laptop is running on battery power. You'll notice that upgrading your graphics card to a newer model improves indexing performance: VRAM actually turns out to be the largest bottleneck, as a result of all the geometry models.

Anyway, I'm delighted to be the first one to break this exclusive. At last, the reason why we dropped WinFS can be revealed: it turns out WPF was doing a more-than-adequate job without the overhead of creating a new file system. In the next release of Windows, we'll work on surfacing the internal 3D representation as a new data visualization, but for now, rest assured that WPF is taking care of your documents.