The path-searching algorithm is not a backtracking algorithm


Suppose your PATH environment variable looks like this:

C:\dir1;\\server\share;C:\dir2

Suppose that you call LoadLibrary("foo.dll") intending to load the library at C:\dir2\foo.dll. If the network server is down, the LoadLibrary call will fail. Why doesn’t it just skip the bad directory in the PATH and continue searching?

Suppose the LoadLibrary function skipped the bad network directory and kept searching. Suppose that the code which called LoadLibrary("foo.dll") was really after the file \\server\share\foo.dll. By taking the server down, you have tricked the LoadLibrary function into loading C:\dir2\foo.dll instead. (And maybe that was your DLL planting attack: If you can convince the system to reject all the versions on the PATH by some means, you can then get LoadLibrary to look in the current directory, which is where you put your attack version of foo.dll.)

This can manifest itself in very strange ways if the two copies of foo.dll are not identical, because the program is now running with a version of foo.dll it was not designed to use. “My program works okay during the day, but it starts returning bad data when I try to run between midnight and 3am.” Reason: The server is taken down for maintenance every night, so the program ends up running with the version in c:\dir2\foo.dll, which happens to be an incompatible version of the file.

When the LoadLibrary function is unable to reach \\server\share\foo.dll, it doesn’t know whether it’s in the “don’t worry, I wasn’t expecting the file to be there anyway” case or in the “I was hoping to get that version of the file, don’t substitute any bogus ones” case. So it plays it safe, assumes it’s in the “don’t substitute any bogus ones” case, and fails the call. The program can then perform whatever recovery it deems appropriate when it cannot load its precious foo.dll file.
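The distinction between "definitely not there" and "cannot tell" can be modeled in a few lines. This is only an illustrative sketch in Python, not the actual loader code: the names `Probe`, `SearchFailed`, `search_path`, and the `probe` callback are all hypothetical, and the real search involves more locations than the PATH.

```python
from enum import Enum


class Probe(Enum):
    NOT_FOUND = "not found"      # directory was reachable; the file is definitely absent
    UNREACHABLE = "unreachable"  # could not determine whether the file exists
    FOUND = "found"


class SearchFailed(Exception):
    """Raised when the search stops without producing a file (hypothetical)."""


def search_path(path_dirs, probe):
    """Return the first directory whose probe reports FOUND.

    `probe` is an assumed callback that maps a directory to a Probe value.
    """
    for directory in path_dirs:
        outcome = probe(directory)
        if outcome is Probe.FOUND:
            return directory
        if outcome is Probe.UNREACHABLE:
            # Play it safe: a later copy might be a bogus substitute.
            raise SearchFailed(f"cannot reach {directory}")
        # NOT_FOUND: the file is known to be absent here, so keep looking.
    raise SearchFailed("file is not on the path")


PATH_DIRS = [r"C:\dir1", r"\\server\share", r"C:\dir2"]


def server_down(directory):
    """Model of the scenario above: the share is down, the file lives in C:\\dir2."""
    return {
        r"C:\dir1": Probe.NOT_FOUND,
        r"\\server\share": Probe.UNREACHABLE,
        r"C:\dir2": Probe.FOUND,
    }[directory]
```

In this model, `search_path(PATH_DIRS, server_down)` raises `SearchFailed` rather than returning `C:\dir2`, which mirrors the failure described above.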

Now consider the case where there is also a c:\dir1\foo.dll file, but it’s corrupted. If you do a LoadLibrary("foo.dll"), the call will fail with the error ERROR_BAD_EXE_FORMAT because it found the C:\dir1\foo.dll file, determined that it was corrupted, and gave up. It doesn’t continue searching the path for a better version. The path-searching algorithm is not a backtracking algorithm. Once a file is found, the algorithm commits to trying to load that file (a “cut” in logic programming parlance), and if it fails, it doesn’t backtrack and return to a previous state to try something else.
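The "cut" can be illustrated with a similar toy model. Again, this is a sketch under assumptions rather than the real implementation: `load_library`, `BadImageError` (standing in for ERROR_BAD_EXE_FORMAT), and the `files` dictionary that fakes the file system are all made up for the example.

```python
class BadImageError(Exception):
    """Stands in for ERROR_BAD_EXE_FORMAT in this model (hypothetical)."""


def load_library(path_dirs, files, name):
    """Search path_dirs in order and commit to the first file found.

    `files` is an assumed stand-in for the file system: it maps a full
    path to "ok" or "corrupt". A path that is not a key does not exist.
    """
    for directory in path_dirs:
        candidate = directory + "\\" + name
        state = files.get(candidate)
        if state is None:
            continue  # not in this directory; keep searching
        # Found a file: this is the cut. No later directory is tried,
        # even if this file turns out to be unusable.
        if state != "ok":
            raise BadImageError(candidate)
        return candidate
    raise FileNotFoundError(name)


dirs = [r"C:\dir1", r"C:\dir2"]
files = {r"C:\dir1\foo.dll": "corrupt", r"C:\dir2\foo.dll": "ok"}
```

With these inputs, `load_library(dirs, files, "foo.dll")` raises `BadImageError` for `C:\dir1\foo.dll` and never looks at the good copy in `C:\dir2`. A backtracking version would catch the failure and resume the loop, which is exactly what the search does not do.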

Discussion: Why does the LoadLibrary search algorithm continue if an invalid directory or drive letter is put on the PATH?

Vaguely related chatter: No backtracking, Part One

Comments (36)
  1. Ivo says:

    The real reason the server goes down at night is because the cleaning crew unplugs it to plug in the vacuum cleaner.

  2. Larry Hosken says:

    "Why does the LoadLibrary search algorithm continue if an invalid directory or drive letter is put on the PATH?"

    If the user's aiming the gun footward, engage the safety.

  3. Adam Rosenfield says:

    Putting a server share in your PATH is just asking for trouble.  Aside from the security issues, do you really want to be hitting the network for every call to LoadLibrary or CreateProcess?

  4. "Why does the LoadLibrary search algorithm continue if an invalid directory or drive letter is put on the PATH?"

    All the software uninstallers that would break your system because they forget to clean up PATH.  And at first you'd think that this is safe – if it's a local directory that doesn't exist, you should still get 100% reproducible behavior, right?  I.e. if C:\dirF\foo.dll does not exist, it will never exist today or tomorrow, so keep searching.

    Specifically, LoadLibrary thinks it has established with complete certainty that the file does NOT exist, instead of an error determining whether the file is usable, or even if it is present or missing in the first place.

    But by the logic in Raymond's article, this is a bug if LoadLibrary does this.  Because a missing local file does NOT establish with certainty that a file does not exist.  What about mapped network drives?  What about removable USB devices?  What about removable drives mounted into NTFS paths on the main volume?  The local file system can be very intermittent, too!  Servers go up and down and so do local drives.

  5. Jon says:

    If you're intending to load "C:\dir2\foo.dll" then you call LoadLibrary("C:\dir2\foo.dll").

    If you call LoadLibrary("foo.dll") then you don't care where it comes from and loading C:\dir2\foo.dll instead of \\server\share\foo.dll is perfectly acceptable.

    [What if you don't control the call to LoadLibrary? (Implicit dependency.) -Raymond]
  6. Dan Bugglin says:

    I note that if the network share is UP but the attacker can remove foo.dll from it, the file in C:\dir2 will be loaded.  A different sort of attack, but this time LoadLibrary has no way of knowing anything is wrong.  Or the attacker could always take down the server as in Raymond's example, and then stick their own server up containing whatever files they want the user's computer to load on demand.

    Of course, you shouldn't be setting up permissions/env variables to allow non-admins to plant or remove DLLs from folders on the PATH anyway, I would think.  Or even putting network shares in PATH to begin with.

  7. Mark says:

    "And maybe that was your DLL planting attack: If you can convince the system to reject all the versions on the PATH by some means, you can then get LoadLibrary to look in the current directory, which is where you put your attack version of foo.dll."

    Isn't that just a 'other side of the airtight hatch' problem though? If you have the permission to sabotage the PATH locations, you have permission to inject your DLL high in the PATH order already (either through the filesystem or by changing the environment variable). I can agree with the rest of your article about predictability, but I can't see the security hole.

  8. J says:

    @Mark: If LoadLibrary kept going when the network server was unreachable, then you could force a failure there (which just requires some control of part of the network in between) to cause the application to load from the current directory instead. You wouldn't need any control over the path or any folders higher in the path than the network share.

  9. Cesar says:

    @Mark:

    PATH=C:\dir1;\\server1\share;\\server2\share;C:\dir2

    Suppose your DLL is on server1. An attacker on the local network (but outside your machine and with no write access to server1, thus outside your security context) could take down server1 (by flooding it, for instance), and thus make LoadLibrary read from server2 (which the attacker might have write access to).

  10. Joshua says:

    GAK! Don't put network shares in your path.

  11. Paul says:

    I see a lot of people with the overriding mentality "errors are bad, so stop them happening". Let it look in other paths, suppress exceptions, try your best despite bad input, etc… It is important to think about the consequences of this! That is why I love your blog so much, Raymond, you're making these people think. That is, if they read it.

  12. blah says:

    +100 Joshua.

    This should not even be supported. From the pitiful speed of the network redirector to the Hoover Dam-sized security hole this creates… facepalm.

  13. John says:

    "Why does the LoadLibrary search algorithm continue if an invalid directory or drive letter is put on the PATH?"

    So that Windows 8 will still run Lotus 1-2-3?

  14. Windows should stop supporting PATH for new software. Want new features – stop using hacks. Otherwise this stuff will go on. The ISVs are retarded.

  15. Phil W says:

    @alegr1: Completely agree, PATH is a terrible non-deterministic way to find files.

    @mark: If you are a limited user who can copy a Dll to a location that's in the path of a program that an administrator runs, then you can get your Dll code running as administrator.

  16. Ray says:

    But, a user can change their own copy of the path:

    cd path\to\bad\dll

    path=%cd%;%path%

    Doesn't that mean

    program_to_run.exe

    will load foo.dll from path\to\bad\dll\foo.dll if it exists?

  17. Joshua says:

    @alegr1: PATH is important to software development utilities.

    @PhilW: The non-determinism is important here. It's the only way that scripting languages can deal with utilities being installed in different locations on different machines. This is not UNIX nor Plan9. We can't put everything in /bin even if we wanted to.

  18. "Putting a server share in your PATH is just asking for trouble.  Aside from the security issues, do you really want to be hitting the network for every call to LoadLibrary or CreateProcess?"

    When I started as a WinNT sysadmin (holiday job while in high school), the existing machines were all diskless (Win3.1 booted from NetWare). Life in that scenario without running software from the network is rather boring, not to mention unproductive.

    Even now, we have centrally-stored applications run from the servers. The obvious things like Office and Firefox are local, but do you really want AutoCAD and Visual Studio on the Modern Languages machines' hard drives? It varies per process of course, rather than system-wide, but yes, even now quite a few of our programs run that way.

    Ray: Now you're on the same side of the security barrier. Of course, the ability to disable/disconnect "server" (from Raymond's machine's perspective at least) is pretty close to the ability to impersonate it instead: if I can connect, I can name my machine server and create a public share with my malicious DLL on. I recall one compromised machine, around 2000/2001, where the attacker configured it to hijack every single IP address on the subnet for nefarious purposes!

  19. anton says:

    @Joshua

    This is a mistake. Take for example path=c:\python

    What if I set up Python only to support a specific web server, and I have two or three such web servers on the same machine? I definitely don't want any of the three Pythons to appear on the path: there is no way to tell which is which.

    It is always better to set up a relative path if possible.

  20. Don Reba says:

    This just in: Kernel32 is implemented in Prolog!

  21. Evan says:

    I'm surprised at the number of people who view "don't put network paths in %PATH%" as a no-brainer. I'm in a different environment (mostly Linux-based, using AFS which has fairly aggressive client-side caching, and working from the shell means the shell keeps a hash of the locations of common commands, so it can be incorrect), but probably half the software I run is over the network. I have 18 directories in $PATH, and 6 of those are network drives. (One doesn't exist, and one is '.'.) That includes all of the 3rd through 6th items in the path, all of which actually appear before /usr/bin.

  22. Adam Rosenfield says:

    @Evan: But do you also set $LD_LIBRARY_PATH to include non-local AFS directories?  DLLs/shared libraries loading from a network share are a much bigger danger, especially on Windows.  %PATH% on Windows serves double-duty for locating executables and DLLs, while Linux separates those out into $PATH and $LD_LIBRARY_PATH.

  23. Pal says:

    Why does LoadLibrary search the PATH in the first place? PATH is modified by a lot of installers – loading DLLs using PATH (and the current directory) is very brittle and a huge security hole.

  24. J says:

    The argument here is that an attacker could take down a network share and thus punt LoadLibrary() down to the next entry in PATH, where the evil DLL is located. The thing is, this problem is just as true if there are no network paths in PATH. Maybe it's slightly easier to cause the network to go down and punt searches down PATH, but it might also be fairly easy to dump a DLL in one of the non-network entries of PATH. This type of "security feature" does not stop those. The fact of the matter is that searching around essentially random directories for libraries is dangerous regardless of whether they are on a network drive or not.

    [You've shown that it's a bad idea to put a world-writable directory on the PATH. -Raymond]
  25. Joshua says:

    @anton: Do you think PATH is an environment variable for nothing? This allows configuration of sub-contexts that use other variants of such things. In a sane environment, you'd have 3 Pythons and each web server (or app-pool) would have its own idea of PATH. This is arranged in the site-specific configuration to launch the app-pool with a different path than the system one.

  26. Avi says:

    @Ray:

    That doesn't matter.  The user could have just loaded their evil DLL directly using their own stub program.  Or they could have put the evil code directly into an EXE and run it.

    The issues others point out are ways to get *other* users to use the evil DLL.

  27. xpclient says:

    Is there some kind of utility or tool which notifies the user in a balloon tip or some other style when an app modifies the system path?

  28. Ray says:

    That's what I love about the comments here – the core topic is the LoadLibrary function, but all the focus is on the inconsequential tangential implementation detail used in the example (executable code on a network share).

    I can imagine a focus group put together by Ford to discuss the driving ergonomics of their new concept car, and everyone spends all day shouting about the fact that it has been fitted with summer tyres, but it's the middle of December (and then even ignoring the fact that they're in the Southern Hemisphere).

    Maybe the DLL is on the network. Maybe the whole application is network deployed, but the connection goes down while LoadLibrary is called. Maybe it's on a removable drive, or a drive mapped into C:\mnt\Disk1 which is disconnected, or a directory which was deleted by your cat walking over your keyboard… it doesn't matter, that's not the point!

  29. Cesar says:

    @Evan: never put '.' (or an empty value) in $PATH. It is a well-known security risk. One typo while the current directory is writeable by someone else, and you could be running an attacker's code.

  30. Peter says:

    It's not backtracking, but linear search. Backtracking is a recursive algorithm that traverses the search tree; it's exponential in the worst case (like in the article by Eric Lippert). Here you have a linear list of directories that are checked in a loop.

    [We're still exploring a tree. In this case, the tree is fishbone-shaped. If you explore a rib and hit a dead end, the algorithm does not backtrack back to the spine. -Raymond]
  31. 640k says:

    [We're still exploring a tree. In this case, the tree is fishbone-shaped. If you explore a rib and hit a dead end, the algorithm does not backtrack back to the spine. -Raymond]

    You could also define a single directory as a tree of directories. Not useful in any way though.

  32. alexcohn says:

    I can't understand how a disconnected K: network share is different, from a security POV, from a disconnected \\srv\k network share in my PATH.

  33. alexcohn says:

    @Cesar that's Windows for you: . is always waiting for you on the rear end of the PATH!

  34. Myria says:

    There seems to be an exception to this rule with ERROR_BAD_EXE_FORMAT: if the first .dll it finds is a valid format but doesn't match the architecture, it will keep trying.

    This is unfortunately very important to those of us who have both 32-bit and 64-bit Oracle client applications on our machine – Oracle named the 32-bit and 64-bit DLLs the same thing, and they both need to be on the PATH.

  35. Worf says:

    @Raymond: Why does the LoadLibrary search algorithm continue if an invalid directory or drive letter is put on the PATH?

    This one is obvious – because LoadLibrary can't tell if the error returned by trying to open the DLL was caused by:

    1) The path is invalid (or drive invalid)

    2) The path is valid, but the DLL doesn't exist.

    I'd presume the logic would be that LoadLibrary tries the first directory in PATH, appends the DLL name, then tries to CreateFile() it or similar. CreateFile() would then return the file doesn't exist.

    In the server case, the CreateFile() would return a different error (like unable to connect to server) which LoadLibrary can interpret as "I can't determine if the file exists or not".

  36. Evan says:

    @Adam: blah blah LD_LIBRARY_PATH blah blah

    Yes, I also have AFS directories in LD_LIBRARY_PATH. I also fail to see how libraries are more dangerous, at least if someone uses a lot of command line tools. There are several network places where a malicious program could slip an 'ls' and have it get picked up before the system one. Some are even writable by me — e.g. I have a ~/bin and ~/lib, both of which appear before the system directories.

    (To be honest I could probably drop a couple of the paths from my LD_LIBRARY_PATH now, but they were necessary at some point.)

    @Cesar: "never put '.' (or an empty value) in $PATH. It is a well-known security risk."

    I know that's good practice. For me though, not having . there is a well-known PITA. The slightly-increased security risk from doing so is worth the convenience. (I *do* follow that when I'm root, though.)
