Identifying a deadlock through a window handle (hwnd)

 

 

Several good articles already discuss deadlock detection in practice:

 

Advanced Techniques To Avoid And Detect Deadlocks In .NET Apps

https://msdn.microsoft.com/zh-cn/magazine/cc163618(en-us).aspx

 

contextSwitchDeadlock

https://msdn.microsoft.com/en-us/library/ms172233.aspx

 

Automated tools can also help detect deadlocks from postmortem dump files, most notably DebugDiag 1.1:

 

https://www.microsoft.com/downloadS/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en

 

However, because resources can be locked in so many different ways, there is still no single method that identifies every deadlock pattern. Here is an interesting case in which a deadlock occurs between a window handle (hwnd) and a thread handle.

 

Symptom

==========

A customer creates a COM object on a new thread, but the application always stops responding during object creation. I collected a memory dump while the issue was occurring. No critical section was locked. Several threads appeared to be working on tasks, but all of them were in a wait state.

 

Analysis

=========

The call stack of thread 13 is interesting:

 

0:013> kbnL

 # ChildEBP RetAddr Args to Child

00 02b8f0d8 7739d1ec 77391908 003a0428 00000405 ntdll!KiFastSystemCallRet

01 02b8f114 7739c337 00b11728 00000405 0000babe user32!NtUserMessageCall+0xc

02 02b8f134 776f58ee 003a0428 00000405 0000babe user32!SendMessageW+0x7f

03 02b8f168 77733822 001918c0 02b8f218 02b8f188 ole32!CDllHost::GetApartmentToken+0x203

04 02b8f178 776f67b8 02b8f218 02b8f390 02b8f19c ole32!DoSTApartmentCreate+0x12

05 02b8f188 776acd3c 00000000 00000002 02b8f218 ole32!CClassCache::GetActivatorFromDllHost+0xa3

06 02b8f19c 776accf1 02b8f1b8 02b8f390 02b8f218 ole32!CClassCache::GetOrCreateApartment+0x20

07 02b8f1f0 776acc78 0019dd5c 00000000 02b8f390 ole32!FindOrCreateApartment+0x46

08 02b8f22c 776ad907 77794960 02b8f5b0 02b8f244 ole32!CProcessActivator::GetApartmentActivator+0xc7

09 02b8f248 776acb27 77794960 00000001 00000000 ole32!CProcessActivator::CCICallback+0x17

0a 02b8f268 776acad8 77794960 02b8f5b0 00000000 ole32!CProcessActivator::AttemptActivation+0x2c

0b 02b8f2a4 776ada17 77794960 02b8f5b0 00000000 ole32!CProcessActivator::ActivateByContext+0x4f

0c 02b8f2cc 776aaf7e 77794960 00000000 02b8f754 ole32!CProcessActivator::CreateInstance+0x49

0d 02b8f30c 776aaf19 02b8f754 00000000 02b8fc9c ole32!ActivationPropertiesIn::DelegateCreateInstance+0xf7

0e 02b8f33c 776aaf7e 7779487c 00000000 02b8f754 ole32!CClientContextActivator::CreateInstance+0x8f

0f 02b8f37c 776ab10f 02b8f754 00000000 02b8fc9c ole32!ActivationPropertiesIn::DelegateCreateInstance+0xf7

10 02b8fd50 776a679a 02b8fe20 00000000 00000017 ole32!ICoCreateInstanceEx+0x3f8

11 02b8fd84 776a6762 02b8fe20 00000000 00000000 ole32!CComActivator::DoCreateInstance+0x6a

12 02b8fda8 776a6963 02b8fe20 00000000 00000017 ole32!CoCreateInstanceEx+0x23

13 02b8fdd8 7825037e 02b8fe20 00000000 00000017 ole32!CoCreateInstance+0x3c

 

Look at frame 2: SendMessageW is trying to send a message to a window. According to the function definition, the first parameter is the hwnd: 003a0428.
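Reading the arguments off a frame can be scripted when many stacks have to be checked. The following is a minimal Python sketch, assuming the 32-bit `kb` line layout seen above (optional frame number, ChildEBP, RetAddr, three argument dwords, symbol). The helper names `parse_frame` and `send_message_target` are hypothetical, and the three dwords are only the real arguments when the function uses a standard EBP frame.

```python
import re

# Assumed layout of a 32-bit `kb` / `kbn` frame line (an assumption about the
# text format shown in this article, not a guaranteed debugger contract):
#   [frame#] ChildEBP RetAddr Arg1 Arg2 Arg3 module!symbol+offset
FRAME_RE = re.compile(
    r"^(?:(?P<num>[0-9a-f]{1,3})\s+)?"
    r"(?P<ebp>[0-9a-f]{8})\s+(?P<ret>[0-9a-f]{8})\s+"
    r"(?P<a1>[0-9a-f]{8})\s+(?P<a2>[0-9a-f]{8})\s+(?P<a3>[0-9a-f]{8})\s+"
    r"(?P<symbol>\S+)$",
    re.IGNORECASE,
)


def parse_frame(line):
    """Return the symbol and the first three argument dwords, or None."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "args": [int(m.group(name), 16) for name in ("a1", "a2", "a3")],
    }


def send_message_target(line):
    """For a SendMessage frame, map the dwords to hwnd, msg and wParam."""
    frame = parse_frame(line)
    if frame is None or "!SendMessage" not in frame["symbol"]:
        return None
    hwnd, msg, wparam = frame["args"]
    return {"hwnd": hwnd, "msg": msg, "wparam": wparam}


# Frame 2 of thread 13, copied from the stack above.
frame2 = "02 02b8f134 776f58ee 003a0428 00000405 0000babe user32!SendMessageW+0x7f"
target = send_message_target(frame2)
print("hwnd=%08x msg=%#x wparam=%#x" % (target["hwnd"], target["msg"], target["wparam"]))
```

The message number 0x405 is above WM_USER (0x0400), so it falls in the range reserved for private window messages.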

 

I then went through the other threads to see which one owns hwnd 003a0428 by checking the OLE data stored for each COM thread. It turned out to be thread 9:

 

$t0=00000009

   +0x074 hwndSTA : 0x003a0428 HWND__

 

0:009> kbL

ChildEBP RetAddr Args to Child

0207fe88 7c827cfb 77e6202c 00000001 0207fed8 ntdll!KiFastSystemCallRet

0207fe8c 77e6202c 00000001 0207fed8 00000000 ntdll!ZwWaitForMultipleObjects+0xc

0207ff34 77e62fbe 00000001 02601920 00000001 kernel32!WaitForMultipleObjectsEx+0x11a

0207ff50 00423f08 00000001 02601920 00000001 kernel32!WaitForMultipleObjects+0x18

0207ffb0 0042413e 00000000 77e64829 015dcf20 MyModule!Run+0x3f8

0207ffb8 77e64829 015dcf20 00000000 00000000 MyModule!ThreadFunc+0x2e

0207ffec 00000000 00424110 015dcf20 00000000 kernel32!BaseThreadStart+0x34

 

Looking at thread 9 in detail, WaitForMultipleObjectsEx is waiting on only one handle, 0x00000314:

 

0:009> dc 0207fed8 L1

0207fed8 00000314

 

Handle 314 is actually a thread handle for thread 13:

 

0:009> !handle 0x00000314 f

Handle 00000314

  Type Thread

….

    Thread Id d80.b04

    Priority 9

    Base Priority 0

 

0:009> ~13

  13 Id: d80.b04 Suspend: 1 Teb: 7ffad000 Unfrozen

 

Now the deadlock graph is clear. Thread 13 is waiting for thread 9 to pick up a message on the window whose handle is 003a0428, while thread 9 is waiting for thread 13 to complete its task.
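The same reasoning can be written as a small wait-for graph: record what each thread is blocked on, record which thread owns each resource, and look for a cycle. The following Python sketch is only an illustration under those assumptions. The two dictionaries are filled in by hand from the debugger output above, and `find_wait_cycle` is a hypothetical helper, not part of any debugger API.

```python
def find_wait_cycle(waits_for, start):
    """Follow waits-for edges from `start`.

    Returns the cycle as a list of thread numbers (first == last) when the
    walk returns to a thread it has already visited, otherwise None.
    """
    path = []
    position = {}          # thread -> index in path
    thread = start
    while thread in waits_for:
        if thread in position:
            return path[position[thread]:] + [thread]
        position[thread] = len(path)
        path.append(thread)
        thread = waits_for[thread]
    return None            # reached a thread that is not waiting on anyone


# What each thread is blocked on, taken from the two call stacks.
blocked_on = {
    13: ("hwnd", 0x003A0428),    # SendMessageW to the STA window
    9: ("handle", 0x314),        # WaitForMultipleObjects on one handle
}

# Which thread owns (or is represented by) each resource.
owner = {
    ("hwnd", 0x003A0428): 9,     # hwndSTA found in thread 9's OLE data
    ("handle", 0x314): 13,       # !handle reported thread d80.b04, i.e. ~13
}

waits_for = {t: owner[r] for t, r in blocked_on.items() if r in owner}
cycle = find_wait_cycle(waits_for, 13)
print(cycle)
```

Because each blocked thread waits on exactly one resource in this dump, a plain dictionary walk is enough. A dump where threads wait on several resources at once would need a general graph search instead.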

 

Solution

==========

This deadlock is unusual. What caused it, and how can it be resolved?

 

Looking at thread 13 again, frame 4 shows that this function was called:

 

ole32!DoSTApartmentCreate

This function is called only when a main STA object is created. That is why thread 13 is stuck waiting on thread 9, which is the main STA thread.

 

I asked the customer to change the threading model for this object in the registry from main STA to apartment (STA). Open Registry Editor and check the key for the IND_COMMON.clsExecProc component:

 

HKCR\CLSID\{<Object class ID>}\InprocServer32

Make sure the "ThreadingModel" value is "Apartment".
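To see why this registry value matters, it helps to recall where COM places an in-process object for each ThreadingModel setting. The following Python sketch only encodes that lookup as a table. The function name `target_apartment` and the result strings are hypothetical, and reading the actual value from the registry is deliberately left out.

```python
MAIN_STA = "main STA (first STA initialized in the process)"
ANY_STA = "any STA (the caller's own STA if the caller is an STA thread)"
MTA = "multithreaded apartment (MTA)"
CALLER = "caller's apartment"
NTA = "neutral apartment (NTA)"

# ThreadingModel value under InprocServer32 -> apartment that hosts the object.
_APARTMENT_BY_MODEL = {
    "apartment": ANY_STA,
    "free": MTA,
    "both": CALLER,
    "neutral": NTA,
}


def target_apartment(threading_model):
    """Return a description of the apartment for a ThreadingModel value.

    Pass None (or an empty string) when the value is missing from the
    registry; such objects are created on the main STA.
    """
    if threading_model is None or threading_model.strip() == "":
        return MAIN_STA
    key = threading_model.strip().lower()
    if key not in _APARTMENT_BY_MODEL:
        raise ValueError("unrecognized ThreadingModel: %r" % threading_model)
    return _APARTMENT_BY_MODEL[key]


# Before the fix: no ThreadingModel value, so every creation request is
# marshaled to the main STA thread (thread 9 in this dump).
print(target_apartment(None))

# After the fix: the object no longer has to live on the main STA.
print(target_apartment("Apartment"))
```

With the value missing, every creation request from another thread has to be sent to the main STA window, which is the SendMessageW call seen in thread 13.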

 

Summary

===========

In summary, we learned:

 

1. Deadlock patterns vary widely and can involve different types of resources. In this sample, one resource is an STA window and the other is a thread handle. During debugging, we must carefully identify which resource each thread holds and which one it is waiting for.

2. Avoid main STA objects in a COM/COM+ environment, because they can easily cause performance or locking issues.

 

More information

==================

About COM threading models:

 

150777 INFO: Descriptions and Workings of OLE Threading Models

https://support.microsoft.com/default.aspx?scid=kb;EN-US;150777

 

 Regards,

 

 By Freist Li