Threads, fibers, stacks and address space


Every so often, someone tries to navigate from a managed System.Threading.Thread
object to the corresponding ThreadId used by the operating system.


System.Diagnostics.ProcessThread exposes the Windows notion of threads. In other
words, the OS threads active in the OS process.


System.Threading.Thread exposes the CLR’s notion of threads. These are logical
managed threads, which may not have a strict correspondence to the OS threads.
For example, if you create a new managed thread but don’t start it, there is no
OS thread corresponding to it. The same is true if the thread stops running –
the managed object might be GC-reachable, but the OS thread is long gone. Along
the same lines, an OS thread might not have executed any managed code yet. When
this is the case, there is no corresponding managed Thread object.
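
The CLR API itself is not shown here. As a loose analogy only, Python's
threading.Thread is also a logical thread object whose lifetime differs from
that of the OS thread behind it. The sketch below (Python 3.8 or later, with a
hypothetical helper named lifecycle) shows the object existing before any OS
thread has been created, and remaining reachable after that OS thread has exited.

```python
import threading

def lifecycle():
    """Return (native id before start, native id while running, alive after join)."""
    seen = {}

    def body():
        # Inside the running thread, an OS-level thread id is available.
        seen["running"] = threading.get_native_id()

    t = threading.Thread(target=body)
    before = t.native_id          # None: the object exists, but no OS thread yet
    t.start()
    t.join()
    # The Thread object is still reachable here, but its OS thread has exited.
    return before, seen["running"], t.is_alive()

if __name__ == "__main__":
    before, running, alive = lifecycle()
    print(before, running, alive)
```

The point carries over in spirit rather than in detail: holding a thread object
does not guarantee that an OS thread currently stands behind it.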


A more serious mismatch between OS threads and managed threads occurs when the
CLR is driven by a host which handles threading explicitly. Even in V1 of the
CLR, our hosting interfaces reveal primitive support for fiber scheduling.
Specifically, look at ICorRuntimeHost’s LogicalThreadState methods. But please
don’t use those APIs – it turns out that they are inadequate for
industrial-strength fiber support. We’re working to get them where they need to
be.


In a future CLR, a host will be able to drive us to map managed threads to host
fibers, rather than to OS threads. The CLR cooperates with the host’s fiber
scheduler in such a way that many managed threads are multiplexed to a single OS
thread, and so that the OS thread chosen for a particular managed thread may
change over time.


When your managed code executes in such an environment, you will be glad that
you didn’t confuse the notions of managed thread and OS thread.


When you are running on Windows, one key to good performance is to minimize the
number of OS threads. Ideally, the number of OS threads is the same as the number
of CPUs – or a small multiple thereof. But you may have to turn your application
design on its head to achieve this. It’s so much more convenient to have a large
number of (logical) threads, so you can keep the state associated with each task
on a stack.


When faced with this dilemma, developers sometimes pick fibers as the solution.
They can keep a large number of cooperatively scheduled light-weight fibers
around, matching the number of server requests in flight. But at any one time
only a small number of these fibers are actively scheduled on OS threads, so
Windows can still perform well.
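
To make cooperative scheduling concrete, here is a minimal sketch in Python. It
is not the Windows fiber API and not anything the CLR does: generators stand in
for fibers, and a hypothetical run_round_robin scheduler multiplexes all of them
onto the one OS thread that calls it. Each yield is the point where a "fiber"
voluntarily gives up the thread.

```python
from collections import deque
import threading

def request(name, steps, log):
    """A stand-in for one in-flight server request, written as a generator."""
    for i in range(steps):
        log.append((name, i, threading.get_ident()))
        yield  # cooperative switch point: hand the thread back to the scheduler

def run_round_robin(tasks):
    """Run every task to completion on the calling thread, one step at a time."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished; drop it
        ready.append(task)      # still has work; requeue behind the others

if __name__ == "__main__":
    log = []
    run_round_robin([request(f"req{n}", 3, log) for n in range(4)])
    print(len(log), "steps on", len({tid for _, _, tid in log}), "OS thread")
```

Every task keeps its own state between steps, yet only one OS thread is ever
involved. The cost is that a task which never yields starves all the others.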


SQL Server supports fibers for this very reason.


However, it’s hard to imagine that fibers are worth the incredible pain in any
but the most extreme cases. If you already have a fiber-based system that wants
to run managed code, or if you’re like SQL Server and must squeeze that last 10%
from a machine with lots of CPUs, then the hosting interfaces will give you a way
to do this. But if you are thinking of switching to fibers because you want lots
of threads in your process, the work involved is enormous and the gain is slight.


Instead, consider techniques where you might keep most of your threads blocked.
You can release some of those threads based on CPU utilization dropping, and then
use various application-specific techniques to get them to re-block if you find
you have released too many. This kind of approach avoids the rocket science of
non-preemptive scheduling, while still allowing you to have a larger number of
threads than could otherwise be efficiently scheduled by the OS.


Of course, the very best approach is to just have fewer threads. If you schedule
your work against the thread pool, we’ll try to achieve this on your behalf. Our
threadpool will pay attention to CPU utilization, managed blocking, garbage
collections, queue lengths and other factors – then make sensible dynamic
decisions about how many work items to execute concurrently. If that’s what you
need, stay away from fibers.
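
The CLR thread pool's heuristics are not reproduced here. As a rough sketch of
the underlying idea (many work items, few threads), the Python example below
queues 50 work items to a pool of 4 workers. The hypothetical run_on_pool helper
reports how many distinct threads actually ran them. Unlike the pool described
above, this one uses a fixed worker count rather than adapting to CPU
utilization.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def work(item):
    """A stand-in work item: return the result and the thread that ran it."""
    return item * item, threading.get_ident()

def run_on_pool(items, max_workers=4):
    """Queue every item to a small pool; return results and distinct thread count."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(work, items))
    results = [value for value, _ in outcomes]
    threads_used = len({tid for _, tid in outcomes})
    return results, threads_used

if __name__ == "__main__":
    results, threads_used = run_on_pool(range(50))
    print(f"{len(results)} work items ran on {threads_used} thread(s)")
```

The caller thinks in terms of work items and never creates a thread itself,
which is what lets the pool decide how much to run concurrently.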


If you have lots of threads or fibers, you may have to reduce your default stack
size. On 32-bit Windows, applications get 2 GB of user address space by default.
With a default stack size of 1 MB, you will run out of user address space just
before 2000 threads. Clearly that’s an absurd number of threads. But it’s still
the case that with a high number of threads, address space can quickly become a
scarce resource.
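
The arithmetic behind that limit is simple enough to sketch. The helper below (a
hypothetical name, written in Python) divides the user address space by the
per-thread stack reservation. It gives an upper bound only: code, heaps and other
mappings also consume address space, which is why the practical limit falls
short of the computed figure. The 256 KB reservation is an illustrative value,
not a recommendation.

```python
GB = 1024 ** 3
MB = 1024 ** 2
KB = 1024

def max_threads(address_space_bytes, stack_reserve_bytes):
    """Upper bound on thread count if stacks were the only use of address space."""
    return address_space_bytes // stack_reserve_bytes

if __name__ == "__main__":
    print(max_threads(2 * GB, 1 * MB))    # default 1 MB reservation
    print(max_threads(2 * GB, 256 * KB))  # a smaller, illustrative reservation
```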


On old versions of Windows, you controlled the stack sizes of all the threads in
a process by bashing a value in the executable image. Starting with Windows XP
and Windows Server 2003, you can control it on a per-thread basis. However, this
isn’t exposed directly because:


1) It is a recent addition to Windows.
2) It’s not a high priority for non-EXE’s to control their stack reservation,
   since there are generally few threads and lots of address space.
3) There is a work-around.


The work-around is to PInvoke to CreateThread, passing a Delegate to a managed
method as your LPTHREAD_START_ROUTINE. Be sure to specify
STACK_SIZE_PARAM_IS_A_RESERVATION in the CreationFlags. This is clumsy compared
to calling Thread.Start(), but it works.
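
The PInvoke declaration itself is not shown here. For comparison only, some
runtimes expose a per-thread stack size directly: in Python, threading.stack_size
sets the size used for threads created afterwards. The sketch below (with a
hypothetical run_with_stack helper) starts one thread with a 256 KB stack and
then restores the earlier setting. The 256 KB figure is an illustrative
assumption, and a platform may reject sizes it does not support.

```python
import threading

def run_with_stack(target, stack_bytes):
    """Run target on a new thread created with the requested stack size."""
    result = []
    previous = threading.stack_size(stack_bytes)  # applies to threads created next
    try:
        t = threading.Thread(target=lambda: result.append(target()))
        t.start()
        t.join()
    finally:
        threading.stack_size(previous)            # restore the earlier setting
    return result[0]

if __name__ == "__main__":
    print(run_with_stack(lambda: sum(range(1000)), 256 * 1024))
```

This is an analogy to the per-thread control described above, not a substitute
for the CreateThread work-around in managed code.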


Incidentally, there’s another way to deal with the scarce resource of 2 GB of
user address space per process. You can boot the operating system with the /3GB
switch and – starting with the version of the CLR we just released – any managed
processes marked with IMAGE_FILE_LARGE_ADDRESS_AWARE can now take advantage of
the increased user address space. Be aware that stealing all that address space
from the kernel carries some real costs. You shouldn’t be running your process
with 3 GB of user space unless you really need to.


The one piece of guidance from all of the above is to reduce the number of
threads in your process by leveraging the threadpool. Even client applications
should consider this, so they can work well in Terminal Server scenarios where a
single machine supports many attached clients.


Comments (15)

  1. Philip Canarsky says:

    You mentioned using the threadpool provided with the framework. I have run into what I found to be a serious limitation with this thread pool. I needed to be able to call Thread.Abort() on all threads currently executing in a process’ thread pool.

    My dilemma is that I would prefer to use the provided thread pool because, as you mentioned, it takes much more into account than I would in my own implementation of a thread pool (CPU utilization, managed blocking, garbage collections, queue lengths and other factors). However, I need a way to cleanly exit all threads running in the pool (preferably using Abort() and the ThreadAbortException).

    Is there a way to do this with the thread pool provided in the framework?

  2. Chris Brumme says:

    One of the topics I want to address involves Thread.Abort, AppDomain.Unload, enumeration of threads & appdomains, hosting & reliability. But it’s going to take me a week or so before I can do justice to that topic.

  3. Henk de Koning says:

    In the context of threads and appdomains, could you also cover the security implications of starting your own thread versus using one from the thread pool (wrt x-thread marshaling CAS markers)?

  4. JD says:

    Did you ever get that article on enumeration of threads & appdomains complete? If so, I’d like to read it.

  5. Chris Brumme says:

    I’ve written on parts of it, like Abort, Unload, hosting and reliability. But I’ve never talked about enumeration of threads and AppDomains. Thanks for the reminder (but no promises).

  6. sandeep bhatia says:

    I just need to ask: if a managed thread is still running (e.g. an asynchronous operation is in progress on it), will it be garbage collected if its creator has gone?

  7. Yan-Yu says:

    If I am not mistaken, a lock (e.g. a critical section) stays with a thread, not a fiber; i.e. if a fiber (while running "on" a given native thread) acquired a lock, the next fiber on this thread would become the owner, etc.

    If that’s the case, and the CLR host’s scheduling of managed threads is just fiber-to-native-thread assignment underneath, I wonder about lock ownership here:

    If a managed thread acquired a lock (say via the CLR Mutex class), and then this managed thread got switched to a different native thread, which thread owns the lock (and should we talk native or managed)?

    Thanks!

  8. Chris Brumme says:

    If you are talking about an OS critical section, it does indeed have affinity to the OS thread. If you are talking about a Monitor (e.g. lock in C# and SyncLock in VB.NET), then it has affinity to the logical managed thread. Since logical managed threads are properly coordinated with fibers through the hosting APIs, this means that managed locks correctly stay with the logical managed thread.

    Mutex is an interesting case. If you acquire a Mutex by PInvoke’ing to WaitForSingleObject, then you are at risk. If the CLR is running in fiber mode, a different logical managed thread might get scheduled on the OS thread. Since the OS Mutex has affinity with the OS thread, this will cause application bugs (both deadlocks and data corruption).

    However, if you use the managed Mutex class (System.Threading.Mutex), we can properly handle this case. The managed Mutex class is a wrapper for the same underlying OS Mutex functionality. But the CLR is now aware of whether a logical managed thread holds an OS mutex. The CLR disassociates this logical managed thread from the host’s fiber scheduler when the mutex is acquired and it re-associates the thread with the host’s scheduler when the mutex is released. This disassociation / re-association is performed via the hosting interfaces.

  9. LBG says:

    You mentioned that ‘On old versions of Windows, you controlled the stack sizes of all the threads in a process by bashing a value in the executable image’.

    How can it be done in Windows 2000, for example?

  10. Chris Brumme says:

    LBG,

    If you didn’t set the stack size of the EXE when you built it (via the linker command line), you can change it after the fact by bashing the value in the EXE header. One tool that can do this is EDITBIN.EXE, which you will find in the BIN directory of your Visual Studio install. It has a /STACK option that allows you to manipulate the Reserve & Commit values in the EXE header.

  11. See below link for better understanding about managed thread vs OS thread.

  12. This has been discussed fairly frequently on the Web. Chris Brumme discusses this here: http://blogs.msdn.com/cbrumme/archive/2003/04/15/51351.aspx

  13. In the previous day we saw what would happen underneath when we create a managed thread and how it maps