Why is the FAT driver called FASTFAT? Why would anybody ever write SLOWFAT?


Anon is interested in why the FAT driver is called FASTFAT.SYS. "Was there an earlier slower FAT driver? What could you possibly get so wrong with a FAT implementation that it needed to be chucked out?"

The old FAT driver probably had a boring name like, um, FAT.SYS. At some point, somebody decided to write a newer, faster one, so they called it FASTFAT. And the name stuck.

As for what you could possibly get so wrong with a FAT implementation that it needed to be improved: Remember that circumstances change over time. A design that works well under one set of conditions may start to buckle when placed under alternate conditions. It's not that the old implementation was wrong; it's just that conditions have changed, and the new implementation is better for the new conditions.

For example, back in the old days, there were three versions of FAT: FAT8, FAT12, and FAT16. For such small disks, simple algorithms work just fine. In fact, they're preferable because a simple algorithm is easy to get right and is easier to debug. It also typically takes up a lot less space, and memory was at a premium in the old days. An O(n) algorithm is not a big deal if n never gets very large and the constants are small. Since FAT16 capped out at 65535 clusters per drive, there was a built-in limit on how big n could get. If a typical directory has only a few dozen files in it, then a linear scan is just fine.

It's natural to choose algorithms that map directly to the on-disk data structures. (Remember, data structures determine algorithms.) FAT directories are just unsorted arrays of file names, so a simple directory searching function would just read through the directory one entry at a time until it found the matching file. Finding a free cluster is just a memory scan looking for a 0 in the allocation table. Memory management was simple: Don't try. Let the disk cache do it.
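A minimal sketch of what such direct-mapped algorithms look like (Python, with purely illustrative names and structures — this is a toy model, not the actual driver):

```python
# Toy model of the simple FAT algorithms described above: the in-memory
# algorithm maps directly onto the on-disk data structure.

def find_file(directory_entries, name):
    """Linear scan of an unsorted directory array: O(n) per lookup."""
    for entry in directory_entries:
        if entry["name"] == name:
            return entry
    return None

def find_free_cluster(fat):
    """Scan the allocation table for the first 0 (free) entry."""
    for cluster, value in enumerate(fat):
        if value == 0:
            return cluster
    return None  # disk full

directory = [{"name": "AUTOEXEC.BAT"}, {"name": "CONFIG.SYS"}]
fat = [0xFFFF, 0xFFFF, 3, 0xFFFF, 0, 0]

print(find_file(directory, "CONFIG.SYS"))  # found by walking the array
print(find_free_cluster(fat))              # first free cluster: 4
```

With a few dozen directory entries and at most 65535 clusters, these O(n) loops are cheap, and the code is small enough to debug easily — exactly the trade-off described above.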

These simple algorithms worked fine until FAT32 showed up and bumped n sky high. But fortunately, by the time that happened, computers were also faster and had more memory available, so you had more room to be ambitious.

The big gains in FASTFAT came from algorithmic changes. For example, the on-disk data structures are transformed into more efficient in-memory data structures and cached. The first time you look in a directory, you need to do a linear search to collect all the file names, but if you cache them in a faster data structure (say, a hash table), subsequent accesses to the directory become much faster. And since computers now have more memory available, you can afford to keep a cache of directory entries around, as opposed to the old days where memory was tighter and large caches were a big no-no.

(I wonder if any non-Microsoft FAT drivers do this sort of optimization, or whether they just do the obvious thing and use the disk data structures as memory data structures.)

The original FAT driver was very good at solving the problem it was given, while staying within the limitations it was forced to operate under. It's just that over time, the problem changed, and the old solutions didn't hold up well any more. I guess it's a matter of interpretation whether this means that the old driver was "so wrong." If your child outgrows his toddler bed, does that mean the toddler bed was a horrible mistake?

Comments (33)
  1. Joshua says:

    [I wonder if any non-Microsoft FAT drivers do this sort of optimization]

    Linux caches above the driver level.

    [This isn't caching. This is data transformation. Are you saying that the answer is "No, Linux drivers do linear scanning"? -Raymond]
  2. Gabe says:

    I don't remember there ever being a FAT.SYS. Does anybody know when FASTFAT.SYS first shipped?

  3. Brian_EE says:

    Plus, don't forget, FAT was written on an airplane! blogs.msdn.com/…/10454920.aspx

  4. Cesar says:

    As @Joshua said: Linux has caches at the VFS level, in particular the dentry (directory entry) cache and the inode cache. They use highly efficient data structures; it's one of the most optimized areas of the kernel (no joke – they have things like lockless tree structures and highly optimized word-at-a-time filename comparison). So, while I haven't looked in detail, the Linux FAT driver probably doesn't have as many optimizations as it would have otherwise.

    I don't know how it is on Windows; does it have caching in the common (VFS-equivalent) layer, or is it all the responsibility of the filesystem driver?

    [Ah, I see. The data transformation happens at a higher level: The low-level component parses file system data structures, and the high-level component organizes them for fast access. -Raymond]
  5. Karellen says:

    For anyone actually interested in looking at what the Linux fat driver does:

    git.kernel.org/…/fat

  6. marcan says:

    Linux has caching at, above, and below the filesystem driver level. There's the disk cache (of course), there's the VFS cache (mentioned above, filesystem independent, and highly optimized), and then the FAT driver also maintains its own cache for the FAT (blockmap) itself that uses very different in-memory data structures (linux/fs/fat/cache.c).

    From a quick skim of the code, I don't think there's FAT-specific directory caching going on, particularly for writes – for example, when you delete a file, the FAT driver does do a linear scan through the on-disk-format target directory to find the right entry to remove. My guess is there are rather few people using Linux to deal with FAT filesystems requiring high performance directory mutations on very large directories (at that point, using a modern filesystem is a much better idea). The driver does take care of caching the FAT for fast allocation and lookup, and VFS takes care of caching directory entries and inodes, which probably covers the vast majority of common use cases.

  7. Azarien says:

    I've never heard of FAT8. Is it still supported? If not, what was the last Windows or MS-DOS version that did?

  8. j b says:

    @Azarien,

    Check Wikipedia; en.wikipedia.org/…/File_Allocation_Table has a section on FAT8. Looks like it was a precursor to MS-DOS FAT12/16, but was never supported by either MS-DOS or Windows (at least if I understand the Wikipedia article correctly).

  9. Myria says:

    Now if only ExFAT weren't confidential and patented…

  10. Ken Hagan says:

    @Myria: ExFAT can't be both confidential and patented. To get the patent, you have to publish in sufficient detail to let someone else reproduce the invention.

  11. E says:

    FreeBSD presumes that disk access is slow and caches transformed data at a higher level in order to facilitate even better optimizations.  In particular, it caches at the vfs layer.

    code for FAT: svnweb.freebsd.org/…/msdosfs

    While I don't know the code that well it appears that svnweb.freebsd.org/…/msdosfs_fat.c is perhaps what you are discussing ?

  12. E says:

    I meant to say that "presumes that disk access is slow and caches transformed…"

  13. jonwil says:

    @Ken Hagan: The patents on ExFAT are freely available, but the exact implementation details (e.g. the exact layout of the on-disk data structures) are kept secret.

    Or at least those implementation details WOULD be secret if it weren't for Samsung, who used parts of the Linux kernel FAT driver to build its ExFAT driver and has subsequently released said driver under the GPL (after some legal pressure from the free software community).

    Microsoft even acknowledges the "open source" stuff by saying on its IP licensing page for ExFAT that "the open source stuff doesn't give you any rights to the ExFat IP"

  14. David Crowell says:

    There is an ExFAT FUSE driver for Linux.  Seems to work.

    SLOWFAT would describe me.  Just sayin'.

  15. I don't really think this is the appropriate venue to complain about IP or patenting issues.  Raymond's not in a position to change these policies, and there are far better places to complain about issues to people who can't do anything about it (slashdot, reddit, etc.).  Or write to Microsoft about it.  Who knows?  They have been more cautiously open to free licensing lately…

  16. Jaime says:

    Novell NetWare had an algorithm called Turbo FAT that indexed directory entries. I learned about it in the mid 1990s, so it's been around since at least then.

  17. Jaime says:

    Also, not exactly the same optimization, but NetWare also did disk request reordering. They called it "elevator seeking technology" because it basically worked like an elevator does. It organized the current request queue from the inside of the disk to the outside and fetched those blocks. Then it organized the next batch of requests from outside to inside and fetched those. It also moved deleted blocks to the outside of the drive to make the active area smaller. (NetWare always carefully planned which blocks to delete – the entire drive was always full of either real files or deleted files that were complete and ready to be undeleted, none of this "I found these file fragments, maybe you can make sense of it" stuff on undelete.) These were some of the reasons NetWare file servers were so fast twenty years ago.
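    The reordering described above can be sketched in a few lines (an illustrative Python toy, not NetWare's actual code): service each pending batch sorted in one direction, then sweep back the other way.

```python
# Toy "elevator" ordering: alternate the sweep direction per batch so
# the disk head never jumps back across the platter mid-sweep.

def elevator_batches(batches):
    """Yield each batch of block numbers, alternating sweep direction."""
    ascending = True
    for batch in batches:
        yield sorted(batch, reverse=not ascending)
        ascending = not ascending

pending = [[7, 2, 9, 4], [1, 8, 3]]
for ordered in elevator_batches(pending):
    print(ordered)
# First batch is swept low-to-high ([2, 4, 7, 9]),
# the next high-to-low ([8, 3, 1]).
```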

  18. Yuhong Bao says:

    @j b: Yea, I am surprised they even mentioned FAT8.

  19. Joshua says:

    Good guess, Raymond: it was indeed FAT.SYS. Too many NT4 installs. It's the second-to-last driver loaded before "setup is starting Windows".

  20. foo says:

    Nothing is ever marketed as slow when it comes out. That's why any new USB spec 10 years from now will probably be called 'super mega turbo xTreme ultra' speed, and another 10 years after that 'we tried to come up with a suitable term but the word was so xTreme-allistic it collapsed into a black hole' speed.

  21. David Goebel says:

    In 1989 Gary Kimura wrote a simple version of the FAT file system for NT that was used in the earliest stages of bootstrapping.  In early 1990 Tom Miller was creating the NT cache manager and the first consumer of the cache manager was a port of the HPFS file system from OS/2 named Pinball.  As this was still prototype work Gary’s FAT implementation was more stable and used for bootstrapping.

    When I started at MS in the fall of 1990 one of my first jobs was adding cache manager support to FAT based on the work Tom was doing with Pinball.  The word “adding” here doesn’t do the process justice.  In NT the vast majority of file system complexity comes from the interaction between, and the recursive nature of, the memory, cache, and I/O managers.  While not starting from scratch, it was definitely putting the meat on the bones.  As we needed to maintain a stable implementation of FAT for bootstrapping we forked FAT and thus was born FastFAT.  It would be “fast” because it used the new NT cache manager.

    There was never a shipping version of FAT.sys in NT4.  In fact there was never even a FAT.sys ever built, but only FAT.lib, as in those early days we were statically linking FAT into the kernel.  In a previous comment someone said they saw FAT.sys, but to double check I just went through the history of all changes from day one to Win2k to all "sources*" files looking for anything that built fat.sys, and nothing did.  The original NT Setup loader was a scaled down OS unto itself (otherwise known as Bill & Ted’s Excellent Operating System); as it needed to boot from floppy it had to be small.  It’s possible this mentioned something about loading FAT, but no binary named FAT.sys has ever been built in the NT tree.

    A few more notes.  NT never supported FAT8.  Just  FAT12 & FAT16 from NT 3.1 and later FAT32 (a misnomer – it should have been called FAT28).  If you’re interested you can look at the fastfat source here:

    code.msdn.microsoft.com/…/SourceCode

    Fastfat source has been publicly available, and updated, ever since the first NT IFS was released in April 1997.  The public Fastfat source is nearly identical to what’s actually shipped with NT.

    David

  22. Danny says:

    "Why is the FAT driver called FASTFAT? Why would anybody ever write SLOWFAT?"

    Over the centuries there was only FAT. No matter what nation, you were always FAT and you became FAT at the same rate. Now, in modern days, the Americans became FASTFAT: they gain FAT FAST because of the corn syrup and all that junk food. And now we have SLOWFAT, a.k.a. Europeans, who started to go after American junk food, so they become FAT too but at a slower pace, hence SLOWFAT :P:P.

  23. Yuhong Bao says:

    "The original NT Setup loader was a scaled down OS unto itself (otherwise known as Bill & Ted’s Excellent Operating System) as it needed to boot from floppy so had to be small.  It’s possible this mentioned something about loading FAT, but no binary named FAT.sys has even been built in the NT tree."

    Yea, I think it just happens that NT setup referred to the loading of fastfat.sys as loading of the "FAT File System" (if I remember correctly).

  24. 12BitSlab says:

    @ Jaime – There was a version of OS/360 that did reordering of reads for the purpose of reducing arm movement.  It was fairly successful since in an average system, there are far more reads than writes to disk.  A later version also did reordering of reads to a single cylinder in order to avoid allowing the disk to rotate multiple times for several reads.  That was less successful since the compute time needed exceeded the time needed for a single rotation of the disk.

  25. Brian_EE says:

    @Yuhong: I'm not surprised. I've never known Wikipedia to limit itself to only documenting things specific to DOS/Windows. So if there was a FAT that pre-dated MS DOS, it should be documented.

  26. smf says:

    @Brian_EE There is a FAT file system that pre-dated MS DOS, it was used in various forms in Microsoft products that needed a disk file system (for example Microsoft Disk Basic for 8080 and their OS for z80/8080 called MIDAS). I'm not sure they were compatible with each other, which back then wouldn't have been such a big deal.

  27. Brian_EE says:

    @smf – You have to read my comment in context to Yuhong's comment. I know there existed FAT prior to DOS, and it IS documented on the Wikipedia page. That was my point.

  28. 640k says:

    Why hard-code the entry size of an allocation table? It should of course be arbitrarily big.

  29. Jaime says:

    So you know how big a pointer should be. FAT file allocation tables store the location of the next allocation unit in a chain as content. If you make it arbitrarily big, then you need some way to designate the size, which takes space. Essentially, that "some way to designate the size" is the FAT sub type (FAT12, FAT16, FAT32).
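    The chain-as-content idea can be sketched like this (a simplified, purely illustrative Python model — real FAT12 packs 12-bit entries across byte boundaries, and FAT32 actually uses only 28 of its 32 bits):

```python
# Sketch: following a cluster chain stored in the allocation table
# itself, where the entry width is fixed by the FAT subtype.

# End-of-chain marker thresholds for two subtypes (simplified).
END_OF_CHAIN = {16: 0xFFF8, 32: 0x0FFFFFF8}

def read_chain(fat, first_cluster, bits):
    """Walk the linked list whose 'next' pointers live in the table."""
    chain = []
    cluster = first_cluster
    while cluster < END_OF_CHAIN[bits]:
        chain.append(cluster)
        cluster = fat[cluster]  # each entry stores the next cluster number
    return chain

# A file occupying clusters 2 -> 5 -> 6, then an end-of-chain marker.
fat16 = [0, 0, 5, 0, 0, 6, 0xFFFF]
print(read_chain(fat16, 2, 16))  # [2, 5, 6]
```

    Because every table entry must be able to hold any cluster number, the entry width caps the cluster count — which is exactly why the subtype (FAT12/16/32) has to be fixed up front.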

  30. 640k says:

    First sector of each partition stores the fat entry size.

  31. Jim says:

    @David Goebel: Fascinating information. Thanks for sharing it.

  32. cheong00 says:

    Btw, once upon a time the biggest supported hard disk size in CMOS setup was only somewhere around 20 MB. (I remember there were 46 predefined disk parameter types in the CMOS setup page, and type 47, "user defined", was added in a later release of the BIOS.) There was no provision for bigger storage media at all.

    And when low memory usage is a tight requirement, you'll want to make things as simple as possible. Hardcoding the value seemed to be the way to go in those days.

  33. cheong00 says:

    Not to mention that when CPUs were only 4/8 bits, anything beyond 8 bits needed the big-integer techniques we use today to work correctly, and it was so SLOWWWWWWWWW that it was a somewhat useless requirement at that time.

Comments are closed.
