Finding the AX user and the X++ call stack from a memory dump the easy way


******** Latest update 11-July-2011 ******** Scripts now allow class and table name resolution ********

This post explains how to find the AX user and the X++ call stack that caused an AOS crash, using special scripts for WinDbg. Before reaching this stage you first need to have captured a memory dump and then set up WinDbg ready for analysis; we have posts which explain both of those steps:

Capturing memory dumps:
http://blogs.msdn.com/b/emeadaxsupport/archive/2010/05/12/possibilities-to-create-memory-dumps-from-crashing-processes.aspx

Setting up WinDbg:
http://blogs.msdn.com/b/emeadaxsupport/archive/2011/04/10/setting-up-windbg-and-using-symbols.aspx

Once you have your memory dumps and have set up WinDbg, just open WinDbg, go to File > Open Crash Dump, and open the *.dmp file you created.

Note: these scripts are currently only compatible with AX2009 x64. For AX4 32-bit or AX2009 32-bit you will need to follow the manual method described in this post:
http://blogs.msdn.com/b/emeadaxsupport/archive/2011/04/10/finding-the-x-call-stack-that-caused-a-crash.aspx

Attached to this post are three *.txt script files (in a zip file at the end of the post). These need to be saved to the same folder on your machine as WinDbg.exe (typically something like C:\Program Files (x86)\Debugging Tools for Windows (x86)). They will enable you to retrieve some information from the dump, even if you don't find any symbols.

The three scripts are:
– AxUser.txt – this will find the AX user, their current company and their client machine name.
– AxStack.txt – this will find the X++ call stack.
– ThreadUserStack.txt – this will run both AxUser.txt and AxStack.txt together and gives nicer output when running across all threads.

To run a script for the current thread, enter $$><"script name", for example:

$$><axuser.txt

Run the script as above when you're analysing a dump taken from a crash. When you load a crash dump in WinDbg it will automatically switch to the thread which caused the crash, so just run the script and it will return information relevant to the crash.

To run a script for all threads, you can run:

~*e$$><ThreadUserStack.txt

Run the script as above when you're analysing a dump taken from a hang, or any manual dump that had nothing to do with a crash. When you load a manual dump in WinDbg it will default to the first thread, zero, so to see what the AOS was doing you will need to run the scripts across every thread and see which users and which pieces of X++ were active.

The output you see from the AxStack.txt script will look like this:

0:009> $$><axstack.txt
00 00000000`1fe9b6b8 000007fe`f6ff5abc ntdll!ZwWaitForSingleObject+0xa
01 00000000`1fe9b6c0 000007fe`fd4910ac vfbasics!AVrfpNtWaitForSingleObject+0x38
02 00000000`1fe9b6f0 000007fe`f6ff5785 KERNELBASE!WaitForSingleObjectEx+0x79
03 00000000`1fe9b790 000007fe`fecb0afd vfbasics!AVrfpWaitForSingleObject+0xa9
04 00000000`1fe9b7c0 000007fe`fed712c8 rpcrt4!EVENT::Wait+0xd
05 00000000`1fe9b7f0 000007fe`fed80c1b rpcrt4!OSF_SCALL::SendReceive+0x98
06 00000000`1fe9b8a0 000007fe`fed80c0d rpcrt4!NdrpClientCall2+0xa38
07 00000000`1fe9c010 00000000`008d2f47 rpcrt4!NdrClientCall2+0x1d
08 00000000`1fe9c040 00000000`009e68b0 Ax32Serv!ClientRunDebugger+0x97
09 00000000`1fe9c0c0 00000000`009dcfea Ax32Serv!cqlDebuggerNew::RunDebugger+0x1b0
0a 00000000`1fe9c140 00000000`0061ebfa Ax32Serv!cqlDebugger::xaldb_brk+0x34a
0b 00000000`1fe9c210 00000000`0061ee38 Ax32Serv!interpret::evalLoop+0x11a
0c 00000000`1fe9c290 00000000`005facfe Ax32Serv!interpret::eval+0x58
0d 00000000`1fe9c2c0 00000000`00616843 Ax32Serv!interpret::CQLEvalProc+0x48e
0e 00000000`1fe9c830 00000000`00616c62 Ax32Serv!interpret::doEval+0x4c3
0f 00000000`1fe9ca00 00000000`006178a2 Ax32Serv!interpret::evalFunc+0x322
———————- X++ Stack frame: 0x213 :: printJournal ()
10 00000000`1fe9cac0 00000000`00617fe4 Ax32Serv!interpret::xal_eval_func+0xa62
11 00000000`1fe9cbd0 00000000`0061ecb5 Ax32Serv!interpret::xal_eval_id+0x94
12 00000000`1fe9cc10 00000000`0061ee38 Ax32Serv!interpret::evalLoop+0x1d5
13 00000000`1fe9cc90 00000000`005facfe Ax32Serv!interpret::eval+0x58
14 00000000`1fe9ccc0 00000000`00616843 Ax32Serv!interpret::CQLEvalProc+0x48e
15 00000000`1fe9d4f0 00000000`00616c62 Ax32Serv!interpret::doEval+0x4c3
16 00000000`1fe9d6c0 00000000`006178a2 Ax32Serv!interpret::evalFunc+0x322
———————- X++ Stack frame: 0x20c :: run ()
17 00000000`1fe9d780 00000000`00617fe4 Ax32Serv!interpret::xal_eval_func+0xa62
18 00000000`1fe9d890 00000000`0061ecb5 Ax32Serv!interpret::xal_eval_id+0x94
19 00000000`1fe9d8d0 00000000`0061ee38 Ax32Serv!interpret::evalLoop+0x1d5
1a 00000000`1fe9d950 00000000`005facfe Ax32Serv!interpret::eval+0x58
1b 00000000`1fe9d980 00000000`00616843 Ax32Serv!interpret::CQLEvalProc+0x48e
1c 00000000`1fe9dec0 00000000`00616c62 Ax32Serv!interpret::doEval+0x4c3
1d 00000000`1fe9e090 00000000`006178a2 Ax32Serv!interpret::evalFunc+0x322
———————- X++ Stack frame: 0x213 :: run ()
1e 00000000`1fe9e150 00000000`00617fe4 Ax32Serv!interpret::xal_eval_func+0xa62
1f 00000000`1fe9e260 00000000`0061ecb5 Ax32Serv!interpret::xal_eval_id+0x94
20 00000000`1fe9e2a0 00000000`0061ee38 Ax32Serv!interpret::evalLoop+0x1d5
21 00000000`1fe9e320 00000000`005facfe Ax32Serv!interpret::eval+0x58
22 00000000`1fe9e350 00000000`00616843 Ax32Serv!interpret::CQLEvalProc+0x48e
23 00000000`1fe9eae0 00000000`00616c62 Ax32Serv!interpret::doEval+0x4c3
24 00000000`1fe9ecb0 00000000`00618c64 Ax32Serv!interpret::evalFunc+0x322
———————- X++ Stack frame: 0x20c :: mainOnServer ()
25 00000000`1fe9ed70 000007fe`feccc7f5 Ax32Serv!ServerEvalFunc+0x944
26 00000000`1fe9efc0 000007fe`feccb0b2 rpcrt4!Invoke+0x65
27 00000000`1fe9f0b0 000007fe`fecc809d rpcrt4!NdrStubCall2+0x32a
28 00000000`1fe9f6d0 000007fe`fecc9c24 rpcrt4!NdrServerCall2+0x1d
29 00000000`1fe9f700 000007fe`fed711a2 rpcrt4!DispatchToStubInCNoAvrf+0x14
2a 00000000`1fe9f730 000007fe`fecc9d86 rpcrt4!DispatchToStubInCAvrf+0x12
2b 00000000`1fe9f760 000007fe`fecc23f9 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146
2c 00000000`1fe9f880 000007fe`fecc225e rpcrt4!OSF_SCALL::DispatchHelper+0x159
2d 00000000`1fe9f9a0 000007fe`fecc1ad9 rpcrt4!OSF_SCALL::ProcessReceivedPDU+0x18e
2e 00000000`1fe9fa10 000007fe`fecc1653 rpcrt4!OSF_SCONNECTION::ProcessReceiveComplete+0x3e9
2f 00000000`1fe9fac0 000007fe`fd498f6f rpcrt4!CO_ConnectionThreadPoolCallback+0x123
30 00000000`1fe9fb70 00000000`76ccef7a KERNELBASE!BasepTpIoCallback+0x4b
31 00000000`1fe9fbb0 00000000`76cd906f ntdll!TppIopExecuteCallback+0x1ff
32 00000000`1fe9fc60 00000000`76a6f56d ntdll!TppWorkerThread+0x3f8
33 00000000`1fe9ff60 00000000`76cf2cc1 kernel32!BaseThreadInitThunk+0xd
34 00000000`1fe9ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

What you are seeing here is the kernel C++ call stack; the script detects where an X++ call is being made and extracts the details. The lines like "———————- X++ Stack frame: 0x20c :: mainOnServer ()" are the X++ frames. The "0x20c" part is the class or table ID in hex; the quickest way to convert it to decimal is to enter "? 0x20c" in WinDbg.

0:009> ? 0x20c
Evaluate expression: 524 = 00000000`0000020c

Once you have the decimal value of the class/table ID you can run “info(classId2Name(524));” in an X++ job to find out the name.
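If you save the WinDbg output to a log file, extracting the X++ frames and converting the IDs can also be scripted outside WinDbg. Below is a minimal sketch in Python; the regex and the helper name `xpp_frames` are my own for illustration, not part of the attached scripts:

```python
import re

# Matches the frame lines the AxStack.txt script emits, e.g.
# "———————- X++ Stack frame: 0x20c :: mainOnServer ()"
FRAME_RE = re.compile(r"X\+\+ Stack frame:\s*(0x[0-9a-fA-F]+)\s*::\s*(\w+)")

def xpp_frames(windbg_log: str):
    """Return a (decimal_id, method_name) pair for each X++ frame line."""
    frames = []
    for line in windbg_log.splitlines():
        m = FRAME_RE.search(line)
        if m:
            # int(..., 16) does the same hex-to-decimal conversion as "? 0x20c"
            frames.append((int(m.group(1), 16), m.group(2)))
    return frames
```

For the stack shown above this would yield (531, 'printJournal'), (524, 'run') and so on, matching the values WinDbg's "? 0x213" and "? 0x20c" commands give.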

If you have any questions or problems with the scripts please post a comment to this post and we will reply.

–author: Tariq Bell
–editor: Tariq Bell
–date: 10/04/2011

15-April-2011 – Attached scripts updated – added ability to pick up X++ stack frames relating to X++ kernel classes. /Tariq

26-June-2011 – Attached scripts updated – class IDs now converted to decimal instead of hex. /Tariq

11-July-2011 – Attached scripts updated – class and table IDs now converted to show names when public symbols are available (will still show only IDs when no symbols available).

WinDbgScripts.zip

Comments (6)

  1. Michael Troelsen says:

    Thank you very much for providing the symbol files.

    This is a fantastic new tool in the toolbox. I did some small modifications to the script so it would work with AX32.EXE. Another great tool for creating dump files is procdump from the Sysinternals team, which was left out of the "create memory dumps" blog post. There is also a beta version of DebugDiag 1.2 available at http://www.viisual.net/Tools/

    It would be great if you could provide more information on how to extract even more information from dump files using WinDbg.

  2. Thanks for the feedback Michael. Yes, I will update the create memory dumps blog post to include DebugDiag 1.2; we should have a full RTM release of that coming out very soon, so I was hoping to wait until it RTM'ed. I can mention procdump too.

    We do plan to release more information in this area over the coming months. Is there anything specific you'd like to be able to extract?

    /Tariq

  3. Vadim says:

    Tariq,

    thank you for providing the information and the scripts. I am having problems loading the symbols (I can't see the X++ functions). I entered "msdl.microsoft.com/…/symbols" in the symbols search path.

    Do you have an idea what might be causing the problem here?

    Thank you!

  4. Hi Vadim, symbols aren't available for most builds yet, but the scripts should work regardless of whether symbols are available. It may be that there are no X++ frames in the thread that you're looking at. What I recommend is to first run…

    ~*e$$><threadUserStack.txt

    …to return the X++ stack for every thread. If you see that some threads show X++ frames, then you know the script is working with your dump file. In that case it might not be possible for you to self-diagnose this one, and you may need to raise a support case with Microsoft for us to take a look into what is happening in the AX kernel on the problem thread; if you mention my full name (Tariq Bell) in the case then I'll help you.

    Alternatively, if when you run the command above you find that no threads show any X++ frames, then there could be an issue with the scripts, in which case let me know and we can sync up offline. If I can get a copy of your dump files then I'll tell you what the stacks are and update the scripts to take account of this situation.

    /Tariq

  5. Michael Troelsen says:

    Hi Tariq,

    I have 3 cases which describe problems we have where dump files could be used to find the call stack.

    Case 1. Long running query.

    We have cases where a customer calls in with the message that the system seems slow. When we look at the SQL Server we detect a long-running query or a blocking issue. It's easy to extract the SPID and the offending SQL statement using SQL Server Management Studio, but how do we correlate the SPID and SQL statement to the X++ call stack and the user?

    I could use procdump.exe to generate a dump file of the running Ax32Serv.exe service.

    Case 2. Temp file 2 GB limit.

    We have a special case where a customer invoice print uses a temp table to generate the invoice lines. In some instances the temp file keeps growing until the file system returns an error.

    Error Code: 50110 = Maximum file size exceeded.

    Error code: 50109 = Check error.  The check digit read in the data is illegal.  If this is regarding an entire table cache, the cache might have been flushed.  Restart your job if this is the case.

    Would it be possible to use the crash dump file to identify which temp table is currently in use?

    I was able to make a copy of the temp table file in the file system, but the ISAM file format is unknown. Is it possible to identify the temp table from the file?

    Case 3. Business Connector / IIS Web service.

    We have a case where we get a "System.OutOfMemoryException" when running a web service using the Business Connector on DAX 4.0. We have built the web service, so we can generate the symbol files for this part.

    I am unsure if we can get any X++ call stack information from the Business Connector running on IIS.

    Could we use procdump.exe on w3wp.exe to generate the dump file, or should we use debugdiag.exe to generate it?

    /Michael

  6. Hi Michael,

    For the long-running query: you can correlate the SPID back to the user by looking at the Online Users form in AX. Here you can make the link between SPID and AX user, which should give you what you need to join up what you see in SQL to what you see in the dump. Note that Online Users will only show SPIDs for the currently connected AOS; if there are multiple AOSes in the cluster then you need to log onto the same one as the user to see the SPID, otherwise it'll show no SPID.

    For the temp file 2 GB limit: identifying the temp table directly yourselves might not be possible. If you have a dump, can you see which X++ the user was running at the time? From that you might be able to work out which temp tables are in use. There is more we can do on that internally; if you raise a support case with the dump we can give some feedback about what was going on at the time.

    For the BC/web service: yes, the BC is hosted in w3wp in this scenario and it is technically possible to extract information about the BC from the w3wp dump. You won't be able to do this with the scripts from these blog posts as they only cover AX2009, so you may need to collect the dump and raise a support case with us. What I can recommend first is to use the Sysinternals tool Process Explorer: when the problem is happening, or when the web service is active and busy, open procexp and check what .NET objects are loaded in the relevant w3wp process. It would be most common for something like this to be caused by leaked .NET objects called from X++ in the BC, because of the limitations of X++ .NET garbage collection.

    You can use either procdump or DebugDiag. Personally I prefer DebugDiag as I find it more user friendly, and it's nice to be able to leave it running constantly as a service; then if anything ever crashes you have a dump, and it doesn't matter if the machine is turned on or off as the service will manage itself.

    /Tariq