What happens if you simply return from the thread callback passed to _beginthread and _beginthreadex?

Medinoc asks, "What happens when one simply returns from the thread callback? I'd suspect the code gluing between _beginthread() and its callback calls _endthread() upon return, while the code between _beginthreadex() and its callback calls _endthreadex() instead?"

Yup, that's exactly it. If your thread callback function returns, then _beginthread calls _endthread on your behalf, and _beginthreadex calls _endthreadex on your behalf. The value passed to _endthreadex is the return value of your thread callback function.
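To make the shape of that glue concrete, here is a small model in portable C. It is only a sketch under stated assumptions, not the CRT source: the names struct model_thread, model_endthread, model_endthreadex, model_glue, model_glue_ex, sample_worker, and sample_worker_ex are all invented for illustration, the calling conventions of the real callbacks are omitted, and no actual thread is created. The only point is who calls the "end" function and what value it receives.

```c
/* Hypothetical callback shapes, minus the real calling conventions. */
typedef void (*thread_callback)(void *);        /* _beginthread-style: no result */
typedef unsigned (*thread_callback_ex)(void *); /* _beginthreadex-style: returns a value */

/* Hypothetical stand-in for "what the end function was told". */
struct model_thread {
    int ended;          /* nonzero once the end step has run */
    unsigned exit_code; /* value handed to the -ex end step */
};

/* Models _endthread: takes no exit value from the callback. */
void model_endthread(struct model_thread *t)
{
    t->ended = 1;
}

/* Models _endthreadex: receives the value to use as the exit code. */
void model_endthreadex(struct model_thread *t, unsigned code)
{
    t->exit_code = code;
    t->ended = 1;
}

/* Glue for the non-ex flavor: run the callback, then end the thread. */
void model_glue(struct model_thread *t, thread_callback cb, void *arg)
{
    cb(arg);
    model_endthread(t);
}

/* Glue for the -ex flavor: the callback's return value is what gets
   passed to the end function. */
void model_glue_ex(struct model_thread *t, thread_callback_ex cb, void *arg)
{
    unsigned result = cb(arg);
    model_endthreadex(t, result);
}

/* Sample callbacks that simply return instead of ending the thread themselves. */
void sample_worker(void *arg)
{
    *(int *)arg += 1;
}

unsigned sample_worker_ex(void *arg)
{
    return *(unsigned *)arg + 1u;
}
```

In this model, running model_glue_ex with sample_worker_ex and an input of 41 leaves exit_code set to 42, which mirrors the statement that the callback's return value is the value passed to _endthreadex.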

In response to the remark "beginthread() initializes the CRT," Cesar asked, "Which CRT? A process can have more than one CRT, and the thread function can even call functions from several different C runtimes."

The _beginthread function initializes the CRT it belongs to. What other choice does it have? It's not like msvcr80!_beginthread knows how to initialize the data used by msvcr90.dll. If you call msvcr80!_beginthread, then the new thread is initialized for the msvcr80 runtime, since that's the only one it knows about.

If the thread function calls into multiple C runtimes, then that's its decision. If it calls into a C runtime that hasn't been initialized for that thread, then what happens next depends on the behavior of that C runtime. For quite some time now, Microsoft's C runtimes have been self-initializing, meaning that the first time you call into them on a thread, they will initialize themselves on the spot. And they will also auto-uninitialize themselves when the thread exits.

Wait, if the C runtime initializes itself on demand and auto-uninitializes, then why bother with _beginthread at all?

Well, the functions are still around because they predated the initialize-on-demand and auto-uninitialize behavior. And they do guarantee that the C runtime will be initialized for the new thread. (If not, then the functions return failure.) If you go for the initialize-on-demand case, and the C runtime cannot initialize itself, then something interesting happens.

  • Some functions will handle the case where the C runtime failed to initialize in some way. For example, _tempnam and strerror will return NULL to report a failure. (Sometimes this failure mode is documented; sometimes it isn't.) Other functions will fall back to a static buffer instead of a per-thread buffer.
  • Other functions will exit the process with the error message "R6016 - not enough space for thread data."

But as Harry Johnston noted, "In practice very few applications will survive running out of memory anyway."
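The initialize-on-demand pattern and the first two failure styles from the list above can be sketched in portable C11. This is a rough model under loud assumptions, not how any Microsoft runtime is actually written: struct per_thread_data, model_get_thread_data, model_describe_or_null, model_describe_or_static, model_release_thread_data, and the model_simulate_out_of_memory switch are all hypothetical names, and the "exit the process with R6016" style is not modeled.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-thread block; a real runtime keeps far more state. */
struct per_thread_data {
    char message_buffer[64];
};

/* Hypothetical switch: when nonzero, the model pretends allocation fails. */
int model_simulate_out_of_memory = 0;

/* One pointer per thread, filled in the first time the thread needs it. */
static _Thread_local struct per_thread_data *tls_data = NULL;

/* Single buffer shared by every thread, used only as a fallback. */
static struct per_thread_data shared_fallback;

/* Initialize on demand. Returns NULL if initialization failed. */
struct per_thread_data *model_get_thread_data(void)
{
    if (tls_data == NULL && !model_simulate_out_of_memory) {
        tls_data = calloc(1, sizeof *tls_data);
    }
    return tls_data;
}

/* Failure style 1: report the problem by returning NULL. */
const char *model_describe_or_null(int code)
{
    struct per_thread_data *d = model_get_thread_data();
    if (d == NULL) {
        return NULL;
    }
    snprintf(d->message_buffer, sizeof d->message_buffer, "error %d", code);
    return d->message_buffer;
}

/* Failure style 2: fall back to a static buffer instead of a per-thread one.
   The result is no longer thread-safe, but the call still works. */
const char *model_describe_or_static(int code)
{
    struct per_thread_data *d = model_get_thread_data();
    char *buf = d ? d->message_buffer : shared_fallback.message_buffer;
    snprintf(buf, sizeof shared_fallback.message_buffer, "error %d", code);
    return buf;
}

/* Stand-in for the uninitialize step; in this model the caller must
   invoke it by hand before the thread exits. */
void model_release_thread_data(void)
{
    free(tls_data);
    tls_data = NULL;
}
```

With model_simulate_out_of_memory set, model_describe_or_null returns NULL while model_describe_or_static still produces a string from the shared buffer. Once the switch is cleared, the next call allocates the per-thread block and later calls on that thread reuse it.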

Joshua Shaeffer asks, "Instead of automatically closing the handle, how about automatically never opening the handle?"

Not sure what Joshua is trying to say here, because the C runtime didn't open the handle. The handle was created by the operating system and returned by the CreateThread function. So the C runtime really doesn't have a choice. The handle gets opened as part of the thread-creation process. All it can do is decide what to do with the handle once it is given one.

Comments (21)

  1. Koro says:

    I’m assuming that the functions are required to properly initialize and uninitialize the CRT should you link statically with it inside an EXE, because then there is no DllMain to hook into for thread attach/detach notifications.

    1. Harry Johnston says:

      Initialization isn’t a problem for static runtimes, because it doesn’t have to occur until the new thread calls a CRT function, so there’s no need to detect thread startup. In fact if the thread never calls a CRT function it would be wasteful for the CRT to initialize.

      Uninitialization is only a problem for static runtimes if you’re running on Windows XP or earlier. From Vista onwards, the CRT uses a FLS callback to detect when a thread exits.

      1. Jan Ringoš says:

        You know, this is peculiar. Everyone operates under the (well-documented) assumption that FlsAlloc is available only since Vista, but even on a fresh XP SP3 installation, FlsAlloc is present. If I set Minimal Version to 5.1 in a Visual Studio 2017 project, the exe will run, even with the v141 toolset (not v141_xp). I would like to know if it’s just me… or if I have just betrayed some kind of secret everyone else was in on.

        1. Joshua says:

          The ability to use FlsAlloc, FlsGetValue, FlsSetValue, etc. from a thread is not documented.

        2. skSdnW says:

          MSDN documents Win2003 as the minimum version. There has been some cross contamination between XP and 2003 service packs but my XP SP3 does not have FlsAlloc.

  2. Joshua says:

    I interpreted Joshua Shaeffer’s question as “Why not close the handle immediately after CreateThread() returns?”. Then I realized I don’t care about the actual answer.

    1. Alex Guteniev says:

      With such interpretation, there is still a case when such handle could be useful.
      If at some point the thread is waiting for an event, that is fired only after the handle is used, then it’s safe to use such handle.
      Sure you won’t be using it for waiting on a thread, but you can use it otherwise, like set thread priority, obtain thread id, or duplicate that handle (and then wait on a duplicated handle).
      Kinda artificial case, though.

    2. Ben Voigt says:

      “Closing the handle immediately” still leaves a non-zero interval where the handle is valid. Depending on what other threads are doing, this can be a problem. In particular, if another thread is at exactly the right phase of starting a child process, the handle you wanted closed immediately could be duplicated into the child. Inheritable handles are problematic beasts.

      1. Joshua says:

        Wait what? I’m certain handles from CreateThread aren’t inheritable.

        1. Harry Johnston says:

          Not by default, but (according to the documentation) you can set inheritability via the lpThreadAttributes argument. _beginthread doesn’t do so.

  3. mikeb says:

    These functions could immediately close the handle they get from `CreateThread()`. But the user of `_beginthread()`/`_beginthreadex()` might want to use the handle (it’s how you can know that a thread has exited). So instead of closing the handle, they pass it back to the caller. Of course, `_beginthread()`/`_endthread()` is a little broken in how they handle the handle – that’s one of the reasons for the existence of `_beginthreadex()`/`_endthreadex()`.

    1. IInspectable says:

      _beginthread() isn’t broken. It’s a design decision, that the handle returned by it is owned by the CRT. Callers must not ever use the return value for anything else but checking for success/failure. In contrast, ownership of the handle returned by _beginthreadex() is transferred back to the caller. The difference is one of design, not brokenness.

      1. mikeb says:

        _beginthread() is documented to return a handle to the thread. As far as I know it’s been documented that way since the beginning (it is documented that way in the VS6 docs). I’m pretty sure that the intent was for clients of the API to be able to use that handle to “join” the thread. The VS6 docs for _beginthread() do mention that the caller shouldn’t close the handle because it gets closed by the runtime. There’s no mention about the possibility of the handle becoming invalid (or worse, being reused for another thread), which is mentioned since _beginthreadex().

        Retrospectively, the return from _beginthread() can only be used for a success/fail test. But I’m pretty sure that wasn’t the original intent.

        1. IInspectable says:

          The documentation doesn’t leave much room for interpretation. If the handle is closed by the runtime, there is no conceivable way for a caller to do anything meaningful with it, because the handle can be closed at any time. It’s moot whether the documented contract matched the design goal at one point or didn’t. The important piece of information is, that the return value of _beginthread() is useless beyond comparing it against -1L (see https://blogs.msdn.microsoft.com/oldnewthing/20170929-00/?p=97115).

  4. Stephen Hewitt says:

    Is there a runtime cost to this automatic initialisation?

    1. Richard says:

      Lazy init is (in general) so close to being free as to not be worth bothering about.
      if (!the_pointer_this_function_needs)
          Init();

      If the cache gets missed, it was going to be missed anyway as the function was about to dereference that point.

      I guess the only exception is when you can’t afford for your first call into the CRT to be significantly slower than later calls – and the solution there is obvious :)

      1. Stephen Hewitt says:

        If binary rewriting was used it could be free (although the initialisation would take slightly longer as the trampolines were removed) but I doubt this approach is used.

        1. Stephen Hewitt says:

          Actually, on further thought, this is not doable.

          1. IInspectable says:

            I’m not sure what specific counterexample you have in mind. I believe this would indeed be doable, e.g. by using the same technique that is used for hotpatching (see https://blogs.msdn.microsoft.com/oldnewthing/20130102-00/?p=5663). Although replacing the short jumps with a 2-byte NOP upon initialization wouldn’t lead to zero overhead. Executing the 2-byte NOP still takes time, however little.

          2. IInspectable says:

            Oh, huh, I see now why this is not possible. The CRT needs to be initialized per thread. Patching the binary would prevent all future threads using the same copy of the CRT from initializing.

    2. kantos says:

      you can see the initialization code yourself in the windows kit that ships with VS2017, for me it’s at C:\Program Files (x86)\Windows Kits\10\Source\10.0.10586.0\ucrt\internal\initialization.cpp it’s pretty straightforward actually.
