This article explains a real-world approach to dealing with high memory situations in Dynamics AX. Being able to identify the root cause of memory usage is obviously useful in support scenarios when there's a memory problem in a production system, but it's also useful in test - catch those things and identify the root causes before they have a chance to get into production.
The key to my suggested approach here is simply to apply practical steps, not reliant on some extreme knowledge of underlying memory structures (although that can be fun too!).
Notice that I was careful above not to say "memory leak". It's common for people to describe high memory situations as leaks, but I'm making the distinction here because in my experience the root cause of a high memory situation is rarely a leak. A leak would be a process that, every time it runs, leaves behind a little bit of memory that cannot be reused by the next process. Most of the time there is a runaway process, or some other heavy load, causing the issue, but not actually a leak.
What shape is the issue?
The first thing to consider is what shape the high memory takes. Is it:
- Constantly growing steadily over a long period - could be a leak - you're looking for something that runs a lot over and over.
- Suddenly spiking up really high - unlikely to be a leak, is probably a runaway process.
Performance counters will tell us - within the counter set called "Process", use the Private Bytes and Virtual Bytes counters and select the ax32serv instance to monitor the AOS process. These counters give you a graph in perfmon which shows the shape of the memory issue.
If your issue is memory slowly building up and up, then perhaps you have a real memory leak. It's important to note that for it to be a leak the memory must keep growing and growing forever. If it just grows up to a point and then stays around that level, even if that level is quite high, then it's not a leak.
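To make the distinction between the shapes concrete, here is a minimal sketch in Python that labels a series of Private Bytes samples. It assumes you have exported the counter values yourself as evenly spaced readings in GB. The function name, the labels and both thresholds are illustrative assumptions, not anything perfmon provides, and the thresholds would need tuning to your own sampling interval.

```python
from statistics import median

def classify_memory_shape(samples_gb, jump_fraction=0.2, growth_tolerance=0.05):
    """Roughly label evenly spaced Private Bytes samples (in GB).

    jump_fraction and growth_tolerance are assumed, illustrative thresholds.
    """
    if len(samples_gb) < 6:
        raise ValueError("need at least 6 samples to compare thirds")

    baseline = median(samples_gb)
    # Sudden spike: one step between consecutive samples is large
    # compared with the typical (median) level.
    biggest_jump = max(b - a for a, b in zip(samples_gb, samples_gb[1:]))
    if biggest_jump > baseline * jump_fraction:
        return "spike"

    # Otherwise compare the average of the first, middle and last thirds.
    third = len(samples_gb) // 3
    first = sum(samples_gb[:third]) / third
    middle = sum(samples_gb[third:2 * third]) / third
    last = sum(samples_gb[-third:]) / third

    growing_early = middle > first * (1 + growth_tolerance)
    growing_late = last > middle * (1 + growth_tolerance)

    if growing_late:
        return "growing"   # still climbing at the end: possible leak
    if growing_early:
        return "plateau"   # grew, then levelled off: not a leak
    return "stable"

print(classify_memory_shape([4.0, 4.2, 4.4, 4.6, 4.8, 5.0, 5.2, 5.4, 5.6]))
print(classify_memory_shape([3.0, 3.3, 3.6, 3.9, 4.0, 4.0, 4.0, 4.0, 4.0]))
print(classify_memory_shape([4.0, 4.1, 4.0, 4.1, 7.5, 7.8, 4.1, 4.0, 4.1]))
```

A "growing" result over a short window only means memory was still climbing when the samples ended. You would need a much longer capture before calling it a leak.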
Sudden spiking memory:
For sudden spiking memory - I like to use a performance rule in the tool DebugDiag. This lets me create a rule based on a performance counter, so that when a certain counter value is hit it'll make a memory dump of my target process. So I tell it to look at the Private Bytes counter (mentioned earlier) and create a dump of ax32serv when the counter goes over the normal running threshold of the AOS - for example, if it normally runs at 4 GB, I'll get a dump when it goes past 5 GB. You can tell it to generate a few dumps (say 3) a few minutes apart.
Once you have the memory dumps, just look at what was running at the time each dump was taken. It should be easy to see which process was running across all 3 dumps - not many things run for several minutes, so typically you can expect it to be the only thing running across all 3. You can find out how to check a dump file for AX here.
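The logic of such a rule can be sketched in a few lines of Python. This is not DebugDiag's API or configuration format, just an illustration of the behaviour described above. The function name, the sample data, and the 5 GB / 3 dumps / 2 minute values are assumptions chosen to match the example.

```python
def plan_dumps(samples, threshold_gb=5.0, max_dumps=3, min_gap_minutes=2):
    """Return the minutes at which a threshold rule would take a dump.

    samples: iterable of (minute, private_bytes_gb) readings in time order.
    All parameter values are illustrative assumptions.
    """
    dump_times = []
    for minute, gb in samples:
        if len(dump_times) >= max_dumps:
            break                      # rule has collected enough dumps
        if gb <= threshold_gb:
            continue                   # still within normal running range
        if dump_times and minute - dump_times[-1] < min_gap_minutes:
            continue                   # too soon after the previous dump
        dump_times.append(minute)
    return dump_times

# Hypothetical readings: normal at ~4 GB, then a runaway process.
readings = [(0, 4.0), (1, 4.1), (2, 5.3), (3, 5.6),
            (4, 5.9), (5, 6.2), (6, 6.4), (7, 4.2)]
print(plan_dumps(readings))
```

Spacing the dumps out matters because the point is to see what is still running several minutes later, not to capture the same instant three times.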
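Comparing the dumps is really just a set intersection. As a minimal sketch, assume you have already noted down by hand what was running in each dump. The class and batch names below are hypothetical, and the function is not part of any AX or DebugDiag tooling.

```python
def common_across_dumps(dumps):
    """Return the activities that appear in every dump.

    dumps: list of collections, one per dump, each holding the names of
    whatever was found running in that dump.
    """
    if not dumps:
        return set()
    common = set(dumps[0])
    for running in dumps[1:]:
        common &= set(running)   # keep only what is also in this dump
    return common

# Hypothetical notes taken from three dumps a few minutes apart.
dump1 = ["InventRecalcBatch", "UserSession:alice", "AifInboundService"]
dump2 = ["InventRecalcBatch", "UserSession:bob"]
dump3 = ["InventRecalcBatch", "AifInboundService", "UserSession:carol"]

print(common_across_dumps([dump1, dump2, dump3]))
```

Whatever survives the intersection is the first candidate for the runaway process.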
.NET process using memory:
If you suspect the memory is being used by a .NET process on your AOS (anything running as IL - batches, services/AIF) then it's pretty easy to identify - there is a .NET memory analysis script included in DebugDiag. Just collect a dump while memory is high, and then run it through the DebugDiag analysis (second tab) using the .NET memory script listed there.
This will give you a nice HTML output which flags any large objects in memory. As far as an AOS is concerned, don't get too carried away reading every line of this report - at the top there will be headlines if it has noticed something it thinks is wrong, so look at those first. You're pretty much looking for it to report that there is a large data table in memory, or something along those lines. The output is pretty human readable, so expect it to be quite easy to decipher what it's trying to tell you.
If you've written your own XppIL process and its memory usage seems much higher in IL than it is when you run it as normal X++ then read this article.
A real "LEAK"
The hardest type of memory issue to investigate is constant growth. These are rare in newer versions of Dynamics AX, since the AOS became a 64-bit process - prior to that, the 32-bit resource limit could make it seem like there was a leak, when actually if it could have used more memory it would have been OK.
In a suspected leak situation the first thing to do is take a practical approach - look at when it started, and what code changed or what new processes were introduced in that period. Then test those changes/processes to see if you can reproduce the memory issue. If you can catch it like this then finding the root cause will be easier.
If the practical approach yields nothing then you're likely to need to talk to Microsoft - contact Support (or come through your Partner if you don't have a Support contract with us directly). Expect us to run over the kind of things I've explained here, and then we'll collect a different kind of memory dump which we can analyse at Microsoft to explain what is happening.