Misinformation and the Prefetch Flag

   Hello!  I haven't updated this blog in a
while; work and other events have conspired to keep me from
writing.  Also, blogs.msdn.com moved internally from .Text to
Telligent Community Server, and my CSS markup was an unfortunate
casualty of the move, so I'm working on redesigning the blog's visual
appearance.  More entries will be coming eventually.  :)

   In the meantime, I want to defuse a long-standing controversy -- the /prefetch flag.

 

   With modern computing, the absolute worst thing
you can do for performance is touch the hard drive -- or
any non-memory storage for that matter.  The fastest hard drives
on earth are still horridly slow compared to a PC's main memory; even
with solid state drives, in order to access the drive, one has to
jump into system code and drivers, and this will push your own
program's code out of the CPU's L2 cache.  (This is called a
locality loss.)  There are two typical reasons one has to touch the
disk -- the first is when the application requests it explicitly
(Word asks the OS to load blog.doc into memory), and the other is a
"hard fault" -- when the application tries to use memory that has been
paged out to disk via "virtual memory" and needs to be paged back in.

   Now, imagine that a DVD player program always
starts playback by loading a DLL to decode MPEG-2 video.  Wouldn't
it be nice if we could attempt to pre-load the MPEG-2 DLL whenever we
loaded the DVD player's EXE?  That way, when it tries to run code
in that DLL, it doesn't have to hard fault and go to disk for
it!  This is what a prefetcher does: it tracks what
code pages are used by an application, and the next time that
application loads, it loads those pages in advance as soon as it's got
some idle time.  A prefetcher was added to Windows in XP, and is
vastly improved in Windows Longhorn.

   XP systems have a Prefetch directory underneath
the windows root directory, full of .pf files -- these are lists
of pages to load.  The file names are generated from hashing the
EXE to load -- whenever you load the EXE, we hash, see if there's a
matching (exename)-(hash).pf
file in the prefetch directory, and if so we load those pages. 
(If it doesn't exist, we track what pages it loads, create that file,
and pick a handful of them to save to it.)  So, first off, it is a bad
idea to periodically clean out that folder as some tech sites suggest.
For one thing, XP will just re-create that data anyway; for another, it
already trims the files if there are ever more than 128 of them, so
that they don't needlessly consume space.  So not only is deleting the directory
totally unnecessary, but you're also putting a temporary dent in your
PC's performance.
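
   As a rough illustration of that lookup-or-create flow, here is a toy
Python model.  It is only a sketch under loud assumptions: toy_hash is
a stand-in (a CRC32 of the path), not the hash XP really uses; the
ToyPrefetcher class and its in-memory dictionary are hypothetical
substitutes for the Prefetch directory; and it saves every page it
sees, whereas the real prefetcher picks which pages to keep.

```python
import zlib


def toy_hash(exe_path: str) -> int:
    """Stand-in hash of the EXE path. NOT the real XP algorithm."""
    return zlib.crc32(exe_path.upper().encode("utf-8"))


def trace_name(exe_path: str) -> str:
    """Build a name shaped like (exename)-(hash).pf for this EXE."""
    exe_name = exe_path.replace("/", "\\").rsplit("\\", 1)[-1].upper()
    return f"{exe_name}-{toy_hash(exe_path):08X}.pf"


class ToyPrefetcher:
    """Hypothetical model: a dict plays the role of the Prefetch directory."""

    def __init__(self):
        self.traces = {}  # trace file name -> list of recorded pages

    def launch(self, exe_path, pages_touched):
        """Return the pages to load ahead of time for this launch.

        First launch: nothing is known yet, so record what was touched.
        Later launches: return the pages recorded earlier.
        """
        name = trace_name(exe_path)
        if name in self.traces:
            return list(self.traces[name])
        # Assumption: keep every page; the real prefetcher is pickier.
        self.traces[name] = list(pages_touched)
        return []
```

   In this model, deleting the dictionary's contents just sends the next
launch back through the "record" branch, which is the toy equivalent of
XP re-creating the files after someone empties the folder.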

   Secondly, one can specify a /prefetch:#
flag when launching an app.  Many people have noticed that
auto-generated shortcuts to Windows Media Player do this, and the
number varies depending on what it does.  For example, the
shortcut used by the shell when you double-click a WMV file to play it
has one prefetch number; the auto-run shortcuts to play or rip music
that appear when you insert a music CD have other numbers.  Some
sites have guessed that this switch turns on prefetching, and suggest
that you add that to every executable you care about -- this has
appeared on so many sites that it has become an urban legend.  Other sites
write this off as garbage and guess that it's a switch specific to
Media Player, based on references to prefetching in the Windows
driver subsystem.  Both guesses are incorrect.

   The /prefetch:# flag is looked at
by the OS when we create the process -- however, it has one (and only
one) purpose.  We add the passed number to the hash. 
Why?  WMP is a multipurpose application and may do many different
things.  The DLLs and code that it touches will be very different
when playing a WMV than when playing a DVD, or when ripping a CD,
or when listening to a Shoutcast stream, or any of the other things
that WMP can do.  If we only had one hash for WMP, then the
prefetch would only be correct for one such use.  Having incorrect
prefetch data would not be a fatal error -- it'd just load pages into
memory that'd never get used, and then get swapped back out to disk as
soon as possible.  Still, it's counterproductive.  By
specifying a /prefetch:# flag with a different number
for each "mode" that WMP can run in, each mode gets its own separate hash
file, and thus we properly prefetch.  (This behavior isn't specific to WMP -- the OS does the same for any app.)
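
   The effect of adding the number to the hash can be sketched in a few
lines of Python.  Again, this is an illustration and not the real
thing: the CRC32 is a stand-in for whatever hash XP uses, the path and
the mode numbers below are made up, and the helper names
(toy_hash_with_flag, trace_name_for_mode) are hypothetical.

```python
import zlib


def toy_hash_with_flag(exe_path: str, prefetch_number: int = 0) -> int:
    """Stand-in hash with the /prefetch number added in (not XP's real hash)."""
    base = zlib.crc32(exe_path.upper().encode("utf-8"))
    return (base + prefetch_number) & 0xFFFFFFFF


def trace_name_for_mode(exe_name: str, exe_path: str, prefetch_number: int = 0) -> str:
    """Same EXE, different number -> different (exename)-(hash).pf name."""
    value = toy_hash_with_flag(exe_path, prefetch_number)
    return f"{exe_name.upper()}-{value:08X}.pf"


# Hypothetical path and mode numbers, chosen only for the demonstration.
EXE_PATH = "C:\\Apps\\wmplayer.exe"
MODES = {"play a WMV": 1, "play a DVD": 2, "rip a CD": 3}

for mode, number in MODES.items():
    print(mode, "->", trace_name_for_mode("wmplayer.exe", EXE_PATH, number))
```

   Each mode prints a different file name, so each mode would accumulate
its own page list.  Nothing about the flag "turns on" anything; it only
changes which file is read and written.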

   This flag is looked at when we create the first thread in the process, but it is not
removed by CreateProcess from the command line, so any app that chokes
on unrecognized command line parameters will not work with it. 
This is why so many people notice that Kazaa and other apps crash or
otherwise refuse to start when it's added.  Of course, WMP knows
that it may be there, and just silently ignores its existence.
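
   For an app that wants to tolerate the flag, one approach is to drop
it before doing any other argument parsing.  The Python sketch below
assumes the token looks exactly like /prefetch:N with a decimal number,
and treats it case-insensitively; both are assumptions, and
strip_prefetch_args is a hypothetical helper, not anything WMP is known
to contain.

```python
import re

# Assumed shape of the token: "/prefetch:" followed by decimal digits.
_PREFETCH_TOKEN = re.compile(r"/prefetch:\d+", re.IGNORECASE)


def strip_prefetch_args(argv):
    """Return a copy of argv with any /prefetch:N tokens removed."""
    return [arg for arg in argv if not _PREFETCH_TOKEN.fullmatch(arg)]
```

   Calling strip_prefetch_args(sys.argv) at startup and handing the
result to the normal parser keeps the rest of the command line intact.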

   I suspect that the "add /prefetch:1 to make rocket
go now" urban legend will never die, though.  I know that at least
one major company ships products with it in their shortcuts, without
ever asking us... just for good measure, I guess.  :-P  All
it does is change your hash number -- the OS is doing exactly the same
thing it did before, and just saving the prefetch pages to a different
file.

    (ATTENTION: This is merely an informative
article; this information is completely unsupported, and the
functionality may change or disappear entirely in future versions of
Windows or service packs. Furthermore, it is merely a hint for
the XP prefetcher, and it may choose to ignore it if it wishes.)