DLL Preloading Attacks

A DLL preloading attack is something that can get you on a lot of different platforms. One of the first variants I heard about was in an ancient telnet daemon on certain versions of UNIX where you could specify environment variables, and one of the things you could specify was where to look for libraries. Obviously, if you could get the telnet daemon running as root to load your library, it was then your system.

One difference between UNIX-ish systems and DOS-based systems is that the current directory "." is not on the search path for UNIX-ish systems, but it is on DOS-based ones. DOS didn't have different users, so there was no need to worry about some of these things. Originally, a Windows system would look for DLLs using the same ordering it used to look for an executable, as documented for the SearchPath API:

  1. The directory from which the application loaded.
  2. The current directory.
  3. The system directory. Use the GetSystemDirectory function to get the path of this directory.
  4. The 16-bit system directory. There is no function that retrieves the path of this directory, but it is searched.
  5. The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
  6. The directories that are listed in the PATH environment variable.

The attack goes like this: you find some DLL an app needs, make an evil twin, put it in the same directory as a document, and then lure someone you'd like to have running your code into opening the document. Opening the document typically makes its directory the current directory, so the evil twin gets found ahead of any copy in the system directories. This is obviously a problem, and the advice we gave in Writing Secure Code (both editions) was to give LoadLibrary the full path of the library you wanted to load. This advice isn't always the best: if you weren't sure where the library was installed, you might use SearchPath to go find it, and SearchPath looks in the current directory, so now you have the problem again.

What we did to fix it was to add a setting (SafeDllSearchMode) that moves the current directory later in the search order, so that it is searched after everything else and immediately before the PATH. This took effect by default in XP SP2, Windows Server 2003 and later, and was available in Windows 2000 SP4. For the most part, this did get rid of the problem: if the DLL was part of the operating system, it was found well before the current directory was searched, and all was good.
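To make the reordering concrete, here is a small illustrative model of the two search orders. It is only a sketch: it does not call the loader, and the function names (dll_search_order, search_position) and the location labels are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEARCH_STEPS 6

/* Fills order[] with the six search locations in the order they are tried.
   safe_mode == 0 models the original ordering; nonzero models the ordering
   with the current directory moved to just before the PATH. */
void dll_search_order(int safe_mode, const char *order[SEARCH_STEPS])
{
    assert(order != NULL);
    size_t i = 0;
    order[i++] = "application directory";
    if (!safe_mode)
        order[i++] = "current directory";
    order[i++] = "system directory";
    order[i++] = "16-bit system directory";
    order[i++] = "Windows directory";
    if (safe_mode)
        order[i++] = "current directory";
    order[i++] = "PATH directories";
}

/* Returns the zero-based position of a location in the order, or -1. */
int search_position(const char *order[SEARCH_STEPS], const char *location)
{
    for (int i = 0; i < SEARCH_STEPS; i++)
        if (strcmp(order[i], location) == 0)
            return i;
    return -1;
}
```

In this model the current directory sits at position 1 in the original order, ahead of the system directory, and at position 4 in the safe order, behind the system and Windows directories. An evil twin of an operating system DLL planted next to a document therefore loses to the real one.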

Unfortunately, this isn't a complete fix in all cases. There are times when we'd like to test whether a DLL is present, and then do something special if it is. Even with the current directory moved to the end of the search order, if the DLL isn't found in any of the earlier locations, the loader will still look in the current directory. So code that looks like this:

hMod = LoadLibrary("Foo.dll"); // check to see if Foo is present

will be dangerous. You have a couple of good options for dealing with this. If you never need to load a DLL from the current directory, just call SetDllDirectory with an argument of "". I discovered this by playing with the API, then looked at the code and found that it was an intended use of the function, logged a bug, and now it's documented behavior you can depend on. If you can do this, it's best: you don't have to put a lot of overhead around every LoadLibrary call, and you're safe. The API is available in XP SP1 and later, which is pretty safe as a minimum platform these days. A second approach, which involves a bit more work on your part, is to implement only the bits of SearchPath that you need. Here's what I'd do:

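  1. Search your app's directory. You can find this by calling GetModuleFileName with NULL as the first parameter and stripping the file name from the result.
  2. Look in the system directory, which GetSystemDirectory returns.
  3. Look in the Windows directory, which GetWindowsDirectory returns.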
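Both options might look roughly like the sketch below. The helper names (join_trusted_path, remove_current_dir_from_dll_search, load_from_trusted_dirs) are hypothetical, the ANSI "A" variants of the APIs are used only to keep it short, and the Windows-specific part is compiled only when _WIN32 is defined. Treat it as a starting point under those assumptions rather than a finished implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Joins dir and a bare file name into out. Returns 1 on success, or 0 if
   the name carries any path component or the result would not fit. */
int join_trusted_path(const char *dir, const char *name,
                      char *out, size_t cap)
{
    assert(dir != NULL && name != NULL && out != NULL);
    if (name[0] == '\0' || strpbrk(name, "\\/:") != NULL)
        return 0;                       /* accept only names like "Foo.dll" */
    size_t len = strlen(dir);
    if (len == 0)
        return 0;                       /* directory lookup failed earlier */
    const char *sep = (dir[len - 1] == '\\') ? "" : "\\";
    int n = snprintf(out, cap, "%s%s%s", dir, sep, name);
    return n > 0 && (size_t)n < cap;
}

#ifdef _WIN32
#include <windows.h>

/* Option 1: call once at startup so the current directory is never searched. */
BOOL remove_current_dir_from_dll_search(void)
{
    return SetDllDirectoryA("");
}

/* Option 2: try only the app, system, and Windows directories, always
   handing LoadLibrary a full path. Returns NULL if the DLL is in none. */
HMODULE load_from_trusted_dirs(const char *name)
{
    char dirs[3][MAX_PATH] = { "", "", "" };
    char full[MAX_PATH];

    /* App directory: full path of the executable minus its file name. */
    DWORD n = GetModuleFileNameA(NULL, dirs[0], MAX_PATH);
    char *slash = (n > 0 && n < MAX_PATH) ? strrchr(dirs[0], '\\') : NULL;
    if (slash != NULL)
        *slash = '\0';
    else
        dirs[0][0] = '\0';

    UINT s = GetSystemDirectoryA(dirs[1], MAX_PATH);
    if (s == 0 || s >= MAX_PATH)
        dirs[1][0] = '\0';

    UINT w = GetWindowsDirectoryA(dirs[2], MAX_PATH);
    if (w == 0 || w >= MAX_PATH)
        dirs[2][0] = '\0';

    for (int i = 0; i < 3; i++) {
        if (join_trusted_path(dirs[i], name, full, sizeof full)) {
            HMODULE h = LoadLibraryA(full);
            if (h != NULL)
                return h;
        }
    }
    return NULL;
}
#endif
```

With something like this in place, the presence check becomes hMod = load_from_trusted_dirs("Foo.dll"), and a NULL result simply means Foo isn't installed in any of the three locations. Rejecting names that contain a slash, backslash, or colon keeps a caller from steering the lookup outside those directories.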

There could be some wrinkles around side-by-side DLLs, and I haven't looked closely at this aspect of the problem; perhaps someone who has could comment. An option that I'd tend to discourage would be to load the library with the LOAD_LIBRARY_AS_DATAFILE flag, then use GetModuleFileName to see if it is the one you wanted, or check that it is the one you wanted using some form of checksum. The first problem is that this is a lot of overhead. The second is that if there's a path to parse, odds are you'll do something wrong and break when the format of the returned path changes, or you'll foul up and make the wrong decision, since making decisions based on names is hard. Checksums are easily defeated, and real cryptography is computationally expensive.

I've been meaning to write this for a while, as it's one of the portions of Writing Secure Code that I'd like to update.