Is there a problem with Create­Remote­Thread on 64-bit systems?

Back in the days when it was still fashionable to talk about the Itanium, a customer reported that the Create­Remote­Thread function didn't work. The customer explained that any attempt to call the Create­Remote­Thread function resulted in the target process being terminated. When they attempted to create a remote thread in Explorer, the Explorer process crashed. When they attempted to create a remote thread in lsass.exe, the lsass.exe process crashed, and the system restarted. They included a sample program that demonstrated the problem.

// Code in italics is wrong.  In fact, this is so wrong
// I've intentionally introduced compiler errors so you
// can't possibly use it in production.
struct UsefulInfo {
    int thing1;
    int thing2;

DWORD RemoteThreadProc(void* lpParameter)
  UsefulInfo* info =

  blah blah blah
  try {
    blah blah blah
  } catch (...) {
    blah blah blah
  return 0;

// This symbol lets us find the end of the RemoteThread function.
static void EndOfRemoteThreadProc() { }

// Error checking removed for simplicity of exposition.
void InjectTheThread(
    UsefulInfo* info,
    HANDLE targetProcess)
  // Calculate the size of the function.
  DWORD codeSize = (DWORD)EndOfRemoteThreadProc - (DWORD)RemoteThreadProc;

  // Allocate an executable buffer in the target process.
  BYTE* codeBuffer = VirtualAllocEx(targetProcess,
                     codeSize + sizeof(*info),

 // Copy the useful information to the target process
 WriteProcessMemory(targetProcess, codeBuffer,
                    info, sizeof(*info));

 // Followed by the code
 WriteProcessMemory(targetProcess, codeBuffer + sizeof(*info),
                    (void*)RemoteThreadProc, codeSize);
 // Execute it and pass a pointer to the useful information.
                    codeBuffer + sizeof(*info), codeBuffer);

There is so much wrong with this code it's hard to say where to start.

There's no guarantee that all the code in the Remote­Thread­Proc function is contiguous. The compiler might choose to spread it out into multiple chunks, possibly based on Profile-Guided Optimizations.

Similarly, there is no guarantee that the End­Of­Remote­Thread­Proc function will be placed immediately after Remote­Thread­Proc function in memory.

There is no guarantee that the code in the Remote­Thread­Proc function is position-independent.

There is no guarantee that the code in the Remote­Thread­Proc function is self-contained. There may be supporting data in the read-only data segment, such as jump tables for switch statements.

The Remote­Thread­Proc function uses C++ exception handling, but the code didn't inject the C runtime support library or fix up the references to the runtime library.

The code didn't register any exception tables for the dynamically-generated code. x86 is the only architecture that does not require explicit exception vector registration. Everybody else uses table-based exception handling.

Now some ia64-specific remarks.

Function pointers on ia64 don't point to the first byte of code, so subtracting function pointers doesn't give you any information about the size of the function (whatever that means), and copying data starting at the function pointer does not actually copy any code.

Conversely, when you take a pointer to a block of memory that contains code and treat it as a function pointer, you are actually causing the first two 8-byte values at that address to be interpreted as a global pointer and a code address. This results in a garbage global pointer, and code executing from a random location.
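To make the descriptor problem concrete, here is a minimal portable sketch that models a function descriptor as two 8-byte values and shows what you get when raw bytes are read as one. The `FunctionDescriptor` type, its field order, and the `InterpretAsDescriptor` helper are illustrative assumptions, not the actual ia64 ABI definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified model of a function descriptor:
// two 8-byte values. The field order here is an assumption.
struct FunctionDescriptor {
    std::uint64_t entryPoint;     // address of the first instruction
    std::uint64_t globalPointer;  // module's global pointer
};

static_assert(sizeof(FunctionDescriptor) == 16,
              "descriptor is modeled as two 8-byte values");

// Read the first 16 bytes at `bytes` as if they were a descriptor.
// This models what happens when a pointer to raw code bytes is
// handed to something that expects a function pointer.
inline FunctionDescriptor InterpretAsDescriptor(const unsigned char* bytes)
{
    FunctionDescriptor d;
    std::memcpy(&d, bytes, sizeof(d));
    return d;
}
```

If the buffer holds instruction bytes rather than a real descriptor, both fields come back as whatever those bytes happen to spell, which is the garbage global pointer and random code address described above.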

The copied code doesn't start at a multiple of 16. Code on ia64 must be 16-byte-aligned.
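The alignment problem, at least, is simple arithmetic. The sketch below rounds the offset of the code up to a multiple of 16 so that it no longer starts at `sizeof(UsefulInfo)`. The `AlignUp` and `CodeOffsetAfterInfo` helpers are hypothetical names, and the sketch assumes the base of the remote allocation is itself at least 16-byte aligned. It addresses only this one issue and none of the others listed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round `offset` up to the next multiple of
// `alignment`. `alignment` must be a power of two.
constexpr std::size_t AlignUp(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Same shape as the structure in the customer's sample.
struct UsefulInfo {
    int thing1;
    int thing2;
};

// Required code alignment on ia64, per the text above.
constexpr std::size_t kCodeAlignment = 16;

// Offset from the start of the buffer at which code could be placed
// so that it starts on a 16-byte boundary (assuming an aligned base).
constexpr std::size_t CodeOffsetAfterInfo()
{
    return AlignUp(sizeof(UsefulInfo), kCodeAlignment);
}

static_assert(CodeOffsetAfterInfo() % kCodeAlignment == 0,
              "code offset must be a multiple of 16");
```

With this, the total allocation size would be `CodeOffsetAfterInfo()` plus the code size rather than `codeSize + sizeof(*info)`.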

In general, the Create­Remote­Thread function requires deep knowledge of the machine architecture. Its intended audience was debuggers, which are already well-versed in the details of the machine architecture.

We encouraged the customer to avoid the Create­Remote­Thread function entirely. In particular, using it with critical system processes like lsass.exe is a serious issue for system reliability. Faults in that process can bring the whole system down (as the customer observed), or cause other strange behavior like damaging parts of the security infrastructure, which will lead to hard-to-debug authentication problems at best and full-fledged security vulnerabilities at worst. And the system may in the future take stronger steps to prevent code injection and data tampering in critical system processes, so a design based on Create­Remote­Thread is living on borrowed time. It's not clear what the customer is trying to do, but they should investigate whether there are supported extensibility mechanisms that give them what they want.

The customer replied that their product contains important functionality that they have constructed out of the Create­Remote­Thread function, and they cannot afford to abandon it at this point.

Customers like this scare me.

(The customer liaison never revealed the name of the customer, but I did learn that they develop anti-malware software. So now I'm even more scared. Fortunately, fixing this code to work on Itanium became a moot issue, but I still worry about their x64 version, because many of the issues here also apply to x64.)

Comments (35)
  1. Brian_EE says:

    It might be more accurate to characterize them as “they develop malware software” if this is how they are developing their application.

    1. kantos says:

      I’m fairly convinced that most AV could and probably should be characterized as Malware based on how they act. Many did things like kernel patching or proxying every connection on the system via MITM techniques that actually leave the user less secure because they don’t support or properly handle things like TLSv2 + or certificate validation or pinning etc. It’s bad enough that a major browser has sent nastygrams to several of the manufacturers asking them to please stop immediately.

      1. Joshua says:

        If I had my way the browser would check its update site, and make sure the update site was configured correctly. If it wasn’t, bring up a MITM page rather than the home page.

      2. Mason Wheeler says:

        Let he who hunts malware beware, for when you gaze into a void, the void gazes also into you…

    2. Dave says:

      Ugh, don’t talk to me about “AV” software and its malware-like behavior. Our product does network whatevering, and the situation with AV is so bad that if we get one of a set of totally inexplicable network errors like the initial handshake succeeding but all subsequent data transfers failing we scan for the presence of network drivers from two major A/V vendors (Raymond, I assume naming names isn’t allowed?), and report an error along the lines of “You’re running AV Product X aren’t you? You’ll need to disable it in order to get network access that works”. In, oh, about 100% of cases this ends up being the problem.

      1. Yukkuri says:

        Yeah AV software is a constant opponent. What I really hate though are the “IT professionals” that refuse to disable their snake oil of choice long enough for me to demonstrate that the impossible failures that don’t happen for other users are because of their AV software, not because we couldn’t figure out a basic thing like connecting to SQL server over a LAN…

  2. Martin Bonner says:

    OK, I can believe that people writing anti-malware software might have cause to use CreateRemoteThread – they sometimes have to get down and dirty with the machine. What I find scary is that people writing anti-malware software didn’t *immediately* spot the problem with (at very least) the calls to the run time libraries supporting try/catch. Really, they ought to have known all of that themselves.

  3. Koro says:

    The proper way to do this is to copy hand-crafted assembly code to the target process. It might be possible to have the C compiler “help with generation”, but everything must be reviewed by hand.

    That code should just LoadLibrary some DLL of yours, GetProcAddress an entry point, and jump to it, passing its own address to it. The entry point should free that, and end with a FreeLibraryAndExitThread, for zero leaks.

    Luckily you can assume that KERNEL32 is loaded at the same address in every process, and just pass along the addresses of LoadLibrary and GetProcAddress baked in your thread stub (along with the path of your DLL too).

    Also why do my comments take so much time to show up? I know I changed email addresses a few months ago, but I am definitely not a “first-time commenter”.

    1. I think the system switched over so that all comments are marked as spam unless they come from somebody with an MSDN profile. I have to manually approve 90% of the comments now.

    2. Killer{R} says:

      There is lazy poor man’s way. CreateRemoteThread(…&LoadLibrary, pRemotelyAllocatedPathToLibrary). And to be really poor: CreateThread from its DllMain. But DONT wait it inside DllMain, just remember self hModule in global var before starting thread to use with FreeLibraryAndExitThread later.
      There is obvious question: how pass information to it. Remembering that we’re lazy and poor – forget about named mappings and all that egghead stuff. Encode info in library’s name, and get it with GetModuleFileName.
      No asm, no undocumenteds. Hehehe (evilly)…

    3. henke37 says:

      Nah, just start scraping the loader data from the PEB. No need to send the pointers over. No need to make the assumption. And this is publicly documented, without stability warnings.

  4. Joshua says:

    How to make it less scary:

    1) Get rid of try/catch.

    2) Write the code to embed in assembly.

    3) Check target process architecture (currently needed only on x64)

    4) Don’t inject csrss, lsass, smss/wininit, or services.

    5) Don’t inject processes from the wrong subsystem.

    6) Have a feature for user-configuration of excluding specific processes /that actually excludes the processes from code injection/. I’ve seen too many that claim to but rather just set some configuration bits that makes the code injection pass-through rather than not done.

    6a) Corollary: If your software requires code injection to all process to function, your software should use a kernel mode driver instead. There is such a thing as a filesystem filter driver.

    7) It’s a lot safer to CreateRemoteThread a process that’s already started and trying to do something than it is trying to CreateRemoteThread a process that is starting up. The process’s dlls might be linked to load right after the .exe and be not relocatable, and if you VirtualAlloc too soon that memory won’t be allocated yet, early dll load or not.

    I found that code injection by CreateRemoteThread, once done properly, was a lot less scary than AppInit dlls. AppInit dlls start too soon, don’t respect memory layout, and are deadlocky. Since VirtualAllocEx takes the lowest available address, memory layout is predictable and failures are essentially predetermined rather than random and easily tracked down.

  5. mikeb says:

    Anti-malware has become a solution that is almost as bad as the problem it’s trying to solve. It actually might be worse.

    1. viila says:

      The only time in the past 20 years that I have been infected with malware was during the couple years around 2008 when I had a paid-for resident AV running (one of the major reputable AV brands). The AV was absolutely no help in detecting, identifying or purging the malware. I noticed it myself via suspicious behaviour (namely the classic Task Manager is autokilled immediately when started) and manually identified it.

      But many times the AV “protected” me against false positives… even up to and including my own executable once that I had just compiled!

      1. Matthew W. Jackson says:

        Why are you compiling viruses? *grin*

  6. Piotr says:

    Can you even inject anything into lsass without tripping the watchdog and causing a bluescreen? Isn’t that the guy who has access to certificate private keys?

    1. Joshua says:

      Unfortunately you are asking the wrong questions. ReadProcessMemory will suffice. But as for the question you should have asked, only csrss is protected against such shenanigans. (csrss is uniquely poised to clobber kernel memory).

  7. matus says:

    But other than that, the code is fine, right?

  8. This blog post reminds me of an xkcd panel. It showed a satirical chart that demonstrated future artificial intelligence development. In that chart, a futuristic government created centrally controlled unmanned weapon systems. In the next stage, these weapon systems became self-aware and started a robot apocalypse. The panel marked that final stage as “the stage everyone is worried about” while marking the prior stage as “the stage I am worried about”. So, Raymond, you are scared that an AV company wants to use CreateRemoteThread but weren’t scared when Microsoft created CreateRemoteThread. This raises the question: Did a customer ever contact you about CreateRemoteThread with a non-scary question about it?

    1. Antonio Rodríguez says:

      Raymond explains in the article that there are legit uses for CreateRemoteThread, such as debugging a process. It’s not like it’s a useless and harmful function created just to undermine the OS reliability and torment the end user. In other words: if I’m driving and run over someone, don’t blame the inventor of the car or its maker.

    2. Aged .Net Guy says:

      This is the xkcd you’re referring to. Prescient indeed. He must think about this topic a bunch since he also published about a month earlier.

  9. James Sutherland says:

    Ten years or so ago, I came across a laptop running Windows XP with some “interesting” behaviour. (Specifically, services.exe was trying to send email. Lots of it.) Someone had built a fairly crafty bit of malware – as I recall, it loaded itself as a Winlogon notification DLL, which in turn spawned a copy of services.exe and dynamically injected itself into that process’s memory space. The exe file’s signature was valid, it didn’t have any strange/suspicious DLLs loaded itself – quite crafty, I thought, for the time.

    Around the same time, I was baffled by some userspace code of mine somehow triggering a BugCheck – via CloseHandle on a file. Eventually, I traced the culprit; the AV product I was using at the time was trying to free some sort of data structure it assumed it had created during the file opening process, but in this particular case that structure was never allocated, so it was freeing an uninitialised pointer in kernel mode, leading rather rapidly to Bad Stuff(TM). Disabling the on-access AV scan avoided the issue, as did switching to an alternative AV product, but I decided to shelve that particular project rather than risk blue-screening other users of that product.

    Quite disturbing that the customer couldn’t/wouldn’t switch to a more robust/sane approach to implementing whatever this functionality was though. Something akin to a call to LoadLibrary would seem like a big improvement, as Koro suggests, avoiding most of the pain shown here?

  10. There are a lot of gotchas with CreateRemoteThread. I’ve only ever used it to do slightly dubious stuff and inject a dll into the process. ASLR made things a lot more complicated.

  11. Quite generally, when customers do incomprehensible things like this, and like several other of your recent posts, Occam’s Razor suggests that they are trying to (A) do some kind of licensing-compliance enforcement, (B) in a manner that is as difficult as possible to reverse-engineer.

  12. Danny says:

    And here it is lads and gents, the number ONE (capital 1 that’s it) reason why Itanium never lived. That is the reason why Apple MacOS still hasn’t caught up with Windows on desktop. Here it is the reason why Linux, in its desktop format, never caught on either. And same reason why Nokia’s Symbian died, Blackberry too and so on and so forth so many systems.
    Because you need to be freaking developer friendly!!!
    You are nothing without developers. Hence why Android is number 1, even if it came out after iOS.
    iOS, I am first to admit, is the exception to the rule, is still not developer friendly but it has a healthy developers community.
    Oh, and Micro$$$oft cell business died too (sorry Ray, RIP), yet still I hope that one will make a comeback…after all your company has plenty of cash to burn to get it right…eventually :D

  13. Scarlet Manuka says:

    “Function pointers on ia64 don’t point to the first byte of code, so subtracting function pointers doesn’t give you any information about the size of the function (whatever that means), and copying data starting at the function pointer does not actually copy any code.

    “Conversely, when you take a pointer to a block of memory that contains code and treat it as a function pointer, you are actually causing the first two 8-byte values at that address to be interpreted as a global pointer and a code address. This results in a garbage global pointer, and code executing from a random location.”

    Don’t these two issues cancel each other out, in this case? When you copy a block of data starting from the function pointer, it sounds like the first things that should be copied are the global pointer and the code address, so when you treat the copied block as a function pointer it should read the same global pointer and code address as the original, shouldn’t it?

    Not that I’m defending any part of this practice, you understand.

    1. The function descriptor is nowhere near the code bytes.

      1. Scarlet Manuka says:

        Granted – but shouldn’t you still be copying the correct global pointer and code address this way (followed by a chunk of random data)? Or am I still missing something obvious? As far as I can understand what you’re saying, (void*)RemoteThreadProc points to the function descriptor, so the code copies the function descriptor into the remote process followed by a chunk of garbage data. So when we CreateRemoteThread, doesn’t it see the correct function descriptor there? If this is not the case, I don’t understand your sentence “Conversely, when you take a pointer to a block of memory that contains code and treat it as a function pointer, you are actually causing the first two 8-byte values at that address to be interpreted as a global pointer and a code address”, or at least its relevance here. (Obviously in general this is a bad thing.)

        1. If you copy the function descriptor, then yes, you get a copy of the function descriptor. Of course, you didn’t copy any other data or code, so the pointers you copied are pointer to nowhere. If you take a pointer to code and treat it as a function pointer, then you also get garbage. Just saying various ways of generating garbage.

        2. AndyCadley says:

          The subtlety here is that source and destination are in different address spaces.

          1. Scarlet Manuka says:

            Ah – so the copied function descriptor points to the wrong address space, and we do indeed wind up executing garbage. (Actually, I suppose it’s technically the right address space, it’s just not the address space in which the code address part points to the intended function.) Thanks, I think the mention of a global pointer here threw me off.

  14. David Ching says:

    This technique was presented on CodeProject. I used it in 2006 as a simple way to overcome SendMessage’s refusal to proxy wParam/lParam for user messages (i.e. messages with ID’s greater than WM_USER, e.g. common control messages), nothing to do with AV or malware.

    1. Ben Voigt says:

      You don’t need remote code execution to marshal data for `WM_USER`. It’s enough to do the VirtualAllocEx/WriteProcessMemory/ReadProcessMemory steps, without the final CreateRemoteThread.

    2. Joshua says:

      Dude! Use WM_COPYDATA.

      1. David Ching says:

        I created a function called SendMessageRemote(), which people have used over the years especially to query Windows common controls for their states:!search/sendmessageremote/

