Debugging a mysterious application crash


Today I was asked to help investigate why an application mysteriously crashes.


The symptom is very interesting. First, navigate to a directory whose path is close to 128 characters long. In that directory, any managed WinForm application fails to launch, while all other kinds of applications run fine (including managed console applications and unmanaged console and GUI applications). What is more interesting is that when you launch the WinForm application from a debugger, it works fine.


My first guess is that CreateProcess failed. I attach a debugger to cmd.exe and break on Kernel32!CreateProcessW. Unfortunately, CreateProcessW returns TRUE, which means the process was created successfully. I then realize that the problem does not repro under a debugger, which means anything I do under a debugger is not going to help.


I then launch FileMon.exe (downloaded from http://www.sysinternals.com) to monitor the files the application accesses. Sure enough, there are some invalid file names the application tries to access. Those invalid file names probably cause the crash. But I still have no idea why the application tries to access those files.


The next thing I do is run the application under a debugger again (where the application works fine) and look at all the DLLs the application loads. For each DLL, I check where it is loaded from. There is one DLL loaded from the temp directory. This is very suspicious. This is a very basic WinForm application (an empty Form that does nothing). Why would it load anything from the temp directory? This looks very much like spyware or malware.
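The location check can be automated once the module list has been copied out of the debugger. The following is only a minimal sketch, not the procedure I actually used: it assumes the list is available as plain path strings, and the function name `modules_outside` and the sample paths are hypothetical. It uses Python's `ntpath` module so that Windows paths are parsed the same way on any platform.

```python
import ntpath


def modules_outside(module_paths, trusted_dirs):
    """Return the module paths that are not under any trusted directory.

    Both arguments are lists of Windows-style path strings. The comparison
    is case-insensitive, as Windows paths normally are.
    """
    trusted = [ntpath.normcase(ntpath.normpath(d)) for d in trusted_dirs]
    flagged = []
    for path in module_paths:
        folder = ntpath.normcase(ntpath.normpath(ntpath.dirname(path)))
        # A module is trusted if its folder is a trusted directory
        # or a subdirectory of one.
        if not any(folder == t or folder.startswith(t + "\\") for t in trusted):
            flagged.append(path)
    return flagged


# Hypothetical module list, for illustration only.
sample_modules = [
    r"C:\Windows\System32\kernel32.dll",
    r"C:\Users\me\AppData\Local\Temp\helper.dll",
    r"C:\Apps\Demo\Demo.exe",
]
print(modules_outside(sample_modules, [r"C:\Windows", r"C:\Apps\Demo"]))
```

A check like this is only a first filter. A module sitting in a trusted directory can still come from a third party, so the version resource is worth reading as well.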


That DLL has a version resource, and there is a company name in the description. A web search shows the company has been accused of distributing spyware, while other sites claim that its software has legitimate uses. A further search shows that the DLL ships with a photo processing program, which was bundled with the digital camera the user purchased and was installed on the machine.


Anyway, this looks promising enough. I rename the DLL and run the WinForm application again. Nothing. The application still can’t be launched.


In frustration, I examine the list of DLLs the application loads again. All of them are loaded from system32. I carefully check the version resource of every DLL. One DLL catches my eye. From the description, it belongs to a mouse utility and was installed with the driver software that came on the user’s mouse setup CD. I run tlist on it. Apparently all the GUI applications have it loaded, and they all work fine.


In desperation, I rename the mouse DLL and relaunch the WinForm application. Boom! A form shows up on the screen. The application works correctly this time! Apparently, the mouse DLL is causing the application to crash.


After uninstalling the mouse software, everything works fine.


Two takeaways from this analysis:

  1. Handling long paths is hard. Win32 has a file name length limit of MAX_PATH (which is defined as 260). People are looking for ways to extend the limit, as the file system supports file names up to 32767 characters long. Apparently this has huge AppCompat implications. As we can see from this example, current applications may already have difficulty handling file paths within the MAX_PATH limit, let alone file names 32767 characters long.
  2. Injecting code into other people’s processes is dangerous. In our example, the mouse application vendor apparently has very good QA, as all the unmanaged applications work fine. Still, it missed other cases and caused huge user pain.
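
To make the two limits in the first takeaway concrete, here is a small sketch that sorts a path into one of three buckets by length. It is an illustration under stated assumptions, not a Win32 API: the function name `classify_path_length` and the bucket labels are hypothetical, the length is counted in UTF-16 code units plus one terminating null, and 32767 is treated as an approximate upper bound for the extended form.

```python
MAX_PATH = 260          # Win32 limit, counting the terminating null
EXTENDED_LIMIT = 32767  # approximate upper bound for extended-length paths


def classify_path_length(path):
    """Classify a path by the number of UTF-16 code units it needs."""
    # Windows stores paths as UTF-16, so count code units rather than
    # Python code points, then add one for the terminating null.
    needed = len(path.encode("utf-16-le")) // 2 + 1
    if needed <= MAX_PATH:
        return "fits MAX_PATH"
    if needed <= EXTENDED_LIMIT:
        return "needs extended-length form"
    return "too long"


print(classify_path_length("C:\\" + "a" * 256))  # 259 characters plus null
print(classify_path_length("C:\\" + "a" * 257))  # 260 characters plus null
```

Note that the directory in this story was only about 128 characters long, well inside the first bucket, and the application still failed. Staying under MAX_PATH does not guarantee that every component in the process handles the path correctly.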

Comments (10)

  1. Norman Diamond says:

    > Win32 has a file name length limit of

    > MAX_PATH

    Not exactly.  Some ANSI APIs have a limit of MAX_PATH bytes (where one byte = one char datatype in C = one TCHAR in ANSI).  However, one of your colleagues had a long consultation with another one of your colleagues and then assured me that some ANSI APIs are less predictable.  It seems that some ANSI APIs have a limit of MAX_PATH characters (where one character = one or more bytes = one or more char datatypes in C = one or more TCHARs in ANSI), so depending on the individual characters it is possible to exceed MAX_PATH.

    In some Unicode APIs the limit is around 32767, isn’t it?  Surely you can’t get to 32565?  But in some Unicode APIs the limit is still MAX_PATH.

    File systems are a different story.  In NTFS, the maximum length of a single component of a pathname is approximately equal to MAX_PATH, but that might be a coincidence.  There doesn’t seem to be any limit to the total length of a pathname.

  2. junfeng says:

    Norman,

    I think you are confused. The MAX_PATH limit on Win32 layer is all over the places. Look at the shlwapi.dll APIs and Shell32 APIs. Many of them are documented to only accept paths not longer than MAX_PATH. It does not matter whether the API is ANSI or unicode.

    The NTFS layer has a hard limit on the length of the pathname (32767), due to the data structure used internally (UNICODE_STRING).

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/secauthn/security/unicode_string.asp

  3. Norman Diamond says:

    > I think you are confused.  The MAX_PATH

    > limit on Win32 layer is all over the places.

    Two of your colleagues outnumber you.

    Well actually I can’t be completely sure.  One of your colleagues wrote that she had conferred at some length with another one of your colleagues.  I think there is some chance that she was telling the truth, so I think that two of your colleagues outnumber you.

    The MSDN page for CreateFile says that the limit in the ANSI version is MAX_PATH characters.  I first sent e-mail saying the limit in the ANSI version is MAX_PATH bytes, since one TCHAR is one byte in ANSI, and each character can be one or more TCHARs i.e. one or more bytes.  Your colleague replied that in ANSI one character is always one byte.  Now that was false.  After more discussion, I think your colleague understood that in ANSI one character is one or more bytes.  But still the consultation between two of your colleagues yielded a result that the ANSI version of CreateFile can accept more than MAX_PATH TCHARs because it converts to Unicode and has a Unicode buffer for MAX_PATH characters.

    Now maybe I’m confused for not knowing which Microsoft employees to believe.  Of course readers of newspapers and readers of too many MSDN pages wouldn’t believe any Microsoft employees.  Blogs have taught me that some Microsoft employees can be believed, at least sometimes.  Anyway, please see if you and your colleagues can reach some consensus.

  4. junfeng says:

    MSDN is correct.

    CreateFileA has a limit of MAX_PATH because internally it uses a pre-allocated buffer to do the ANSI->Unicode conversion, and the pre-allocated buffer has a MAX_PATH limit.

    This limit can be removed. But apparently there is not enough demand to warrant that change.

  5. Norman Diamond says:

    I didn’t mean that there’s demand to remove the limit in CreateFileA.  I just meant that (having been informed by your colleagues) for some functions the limit isn’t always MAX_PATH.

    Anyway thank you for confirming the situation.

    By the way if we would speak of removing limits, we really need a version 2 of FindFirstFile and its relatives.  Their data structure really is limited to MAX_PATH, and programs can’t even use them to retrieve all existing filenames.

  6. junfeng says:

    I am not proposing any change. Sorry Norman:)

  7. Jay says:

    Norman, about FindFirstFile, it is only returning directory entry names, not full paths. MAX_PATH is ok there. (I’d call them "leaf" names but that’s not right.)

    Junfeng, you have typos on the value 32767 in your blog.

    As well, this is the limit for full path opens.

    The limit for directory relative opens may be longer.

    Handling long paths is not hard as you say Junfeng. People are simply lazy. The need to put the \\?\ thing in the paths is overly complicated. The APIs ought to just accept the long paths in their more usual form.

  8. Jay says:

    Also, this sounds familiar.

    I debugged something recently where a program was hanging.

    The program runs for a very short amount of time, multiple times.

    It’s a little utility to compare two files and if they differ, copy one over the other. It normalizes out some data like timestamps.

    It reproed readily under a debugger and I could clearly see that it was exiting and there were two threads.

    But the app was single threaded.

    What was happening is that some third party software was calling CreateRemoteThread to inject itself. That has problems, at least when the app is already busy exiting. It was mouse or webcam software the user had installed recently.

  9. junfeng says:

    Thanks Jay. Corrected.