X++ Pedal To The Metal

This is a post I’ve been waiting to write for quite a while – but it had to wait until R3 became available.

[Image: "Fast X++" T-shirt]

The SysExtension framework offers some great capabilities, but unfortunately it also comes with a performance penalty (in R2). A while ago I set out to understand where the time was spent, hoping to optimize it. For an engineer this is a typical task with an expected outcome: a few days spent, a 10-20 percent optimization found. This time the task completely consumed me for the better part of a week, I learned some important lessons, and the outcome exceeded my wildest imagination. This post shares my findings.

Starting point

To be able to measure the impact of any changes, I built a very simple test harness to exercise the SysExtension framework: a two-level-deep class hierarchy and one attribute decorating the sub-class. I then compiled everything to IL and wrote a small job to measure how many class instances I could spin up per second. This velocity measurement came in at around 3,400 classes/second.
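
A minimal sketch of what such a harness could look like is shown below. The class, attribute, and member names are placeholders of my own, not necessarily what is in the project attached at the bottom of this post, and to time pure IL execution the loop would have to be run through runClassMethodIL (or a batch job) rather than directly as a p-code job.

    // classDeclaration of a hypothetical attribute used only for this benchmark
    class MyBenchmarkAttribute extends SysAttribute
    {
        str flavor;
    }

    // MyBenchmarkAttribute.new
    public void new(str _flavor)
    {
        super();
        flavor = _flavor;
    }

    // A two-level hierarchy: a base class plus one decorated sub-class
    class MyBenchmarkBase
    {
    }

    [MyBenchmarkAttribute('Standard')]
    class MyBenchmarkSub extends MyBenchmarkBase
    {
    }

    // Job measuring how many instances the factory can produce per second
    static void MeasureExtensionVelocity(Args _args)
    {
        System.Diagnostics.Stopwatch    sw;
        MyBenchmarkBase                 instance;
        int64                           elapsedMs;
        int                             i;
        int                             iterations = 10000;

        sw = System.Diagnostics.Stopwatch::StartNew();

        for (i = 1; i <= iterations; i++)
        {
            instance = SysExtensionAppClassFactory::getClassFromSysAttribute(
                classStr(MyBenchmarkBase),
                new MyBenchmarkAttribute('Standard')) as MyBenchmarkBase;
        }

        elapsedMs = sw.get_ElapsedMilliseconds();
        info(strFmt("%1 classes/second", iterations * 1000.0 / elapsedMs));
    }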

First success

Debugging through the code, I quickly learned that a lot was going on. This included building a key for the attribute for various caches, and that key was constructed via reflection on the attribute class. I avoided the reflection by introducing a new interface (SysExtensionIAttribute), and I fixed a number of other minor issues. Now the velocity jumped to 40,000 classes/second.
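
The idea behind the interface is that the attribute itself supplies the cache key, so the framework no longer has to reflect over the attribute's members. A rough sketch of what an implementing attribute could look like follows; the member name is a placeholder and the interface method names are quoted from memory, so verify them against the R3 AOT.

    // classDeclaration - the same hypothetical attribute, now implementing the interface
    class MyBenchmarkAttribute extends SysAttribute implements SysExtensionIAttribute
    {
        str flavor;
    }

    // MyBenchmarkAttribute.parmCacheKey
    // Returns a deterministic key identifying this attribute instance, so the
    // framework can build its caches without reflecting over the class.
    public str parmCacheKey()
    {
        return classStr(MyBenchmarkAttribute) + ';' + flavor;
    }

    // MyBenchmarkAttribute.useSingleton
    // Tells the factory whether one shared instance per key may be handed out.
    public boolean useSingleton()
    {
        return true;
    }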

First physical limit

Is this an acceptable velocity? Well, how fast can it possibly be? The logic is in essence just creating a class via reflection, so I measured DictClass.makeObject(). This gave me 84,000 classes/second – slightly more than double my current implementation. After some investigation I found two expensive calls: "new DictClass()" and "dictClass.makeObject()". Can you spot what they have in common? They both require a call into the native AOS libraries – in other words, an interop call. I tried various other calls into the AOS, such as "TTSBegin" (only the first one hits the DB) and "CustParameters::Find()" (again, only the first one hits the DB). To my surprise, the velocity of these calls was comparable to DictClass.makeObject(). The interop overhead outweighs what the method is actually doing. In other words, there is a limit to how many native AOS methods you can call per second. Let us call this velocity: Speed-of-sound.
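
For reference, the measurement loop for the DictClass variant could look like the sketch below; MyClass is a placeholder, and the numbers quoted above are of course specific to my hardware.

    // Sketch: instantiation through the reflection API. Every iteration pays
    // for two calls into native AOS code: new DictClass() and makeObject().
    static void MeasureDictClassVelocity(Args _args)
    {
        System.Diagnostics.Stopwatch    sw;
        DictClass                       dictClass;
        Object                          instance;
        int64                           elapsedMs;
        int                             i;
        int                             iterations = 100000;

        sw = System.Diagnostics.Stopwatch::StartNew();

        for (i = 1; i <= iterations; i++)
        {
            dictClass = new DictClass(classNum(MyClass));   // native AOS call
            instance  = dictClass.makeObject();             // native AOS call
        }

        elapsedMs = sw.get_ElapsedMilliseconds();
        info(strFmt("%1 classes/second", iterations * 1000.0 / elapsedMs));
    }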

Ultimate physical limit

Being a bit intrigued, I measured the fastest and rawest possible implementation: "new MyClass()". This runs strictly in IL, with no overhead of any kind, and the result was a whopping 23,800,000 classes/second. Let us call this velocity: Speed-of-light. In the words of Barney Stinson: "Challenge accepted!"
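
The corresponding loop is trivial; it is sketched below under the same placeholder assumptions, and the number is only meaningful when the code actually executes as IL.

    // Sketch: direct instantiation. When compiled to and executed as IL, the
    // loop stays entirely inside the CLR - no native AOS calls at all.
    static void MeasureDirectNewVelocity(Args _args)
    {
        System.Diagnostics.Stopwatch    sw;
        MyClass                         instance;
        int64                           elapsedMs;
        int                             i;
        int                             iterations = 10000000;

        sw = System.Diagnostics.Stopwatch::StartNew();

        for (i = 1; i <= iterations; i++)
        {
            instance = new MyClass();   // no interop
        }

        elapsedMs = sw.get_ElapsedMilliseconds();
        info(strFmt("%1 classes/second", iterations * 1000.0 / elapsedMs));
    }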

Final success

To achieve this kind of velocity the code must run 100% as IL. No calls into native AOS code. Period. Naturally there are APIs in .NET that allow dynamic creation of class instances – they are slower than a direct instantiation, but still much faster than calling native AOS code. Another challenge was that SysExtension can also execute as p-code, in which case a call into IL would cause a similarly slow interop – just in the opposite direction. After a few iterations I had an implementation that causes no interop calls, regardless of whether the code runs as IL or p-code. Take a look at SysExtensionAppClassFactory.getClassFromSysExtAttribute in R3 for the details. I was pleased with the velocity: 661,000 classes/second. Or about 200 times faster than R2. Or about 15 times faster than a call to CustParameters::Find().
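
Purely as an illustration of the underlying idea (and emphatically not the shipped R3 code), dynamic creation can stay inside the CLR by going through System.Activator. The assembly-qualified name below assumes the AX 2012 convention of IL-compiled X++ classes living in the Dynamics.Ax.Application assembly.

    // Illustration only - not the shipped SysExtension implementation. Creates
    // an instance of an IL-compiled X++ class without calling into native AOS
    // code, provided this method itself is executing as IL.
    public static Object createInstanceInIL(str _className)
    {
        System.Type type;

        // Assumption: X++ classes compiled to IL live in Dynamics.Ax.Application.
        type = System.Type::GetType(
            'Dynamics.Ax.Application.' + _className + ', Dynamics.Ax.Application');

        // Activator.CreateInstance is slower than a direct 'new', but far
        // faster than a round trip into native AOS code.
        return System.Activator::CreateInstance(type);
    }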

Problem solved: The SysExtension framework no longer has performance issues.



For a long time we have been hunting for SQL and RPC calls when looking for performance issues. RPC calls are expensive because communication between two components (client and server) occurs, just as SQL calls are expensive because the server communicates with SQL (and waits for the reply). We still need to hunt for unnecessary RPC and SQL calls! Nothing has changed, except that a third culprit has been identified: native AOS calls. Relatively speaking, the cost of calls into native AOS code is insignificant compared to RPC or SQL calls – but in the absence of those, the impact is measurable and significant.

This is an ERP system, so there will always be SQL calls. So why be concerned with the performance of X++ code? Well, if you can minimize the time between SQL calls, then you also limit the time SQL holds locks, and you will experience better overall performance and scalability. After all, do you want your code to run at the speed-of-sound or the speed-of-light?

Update 11-05-2014:
Here is the test harness I used: PrivateProject_SysExpProject.xpo

Comments (8)

  1. Logger says:


    >> Solving all this – and more is in the works for the next major release of AX – code named "Rainier".

    And what about AX 7 now? 🙂

    Did you remove all this "interop overhead" in AX 7?

    Did you rewrite the AX kernel in .NET?

    If I try to call an external DLL, like it was done in the WinAPI class in AX 2009/AX 2012, will I get the same problem?

  2. msmfp says:

    Thanks for the feedback, alibertism. Your comment nicely demonstrates the core problem this post is trying to qualify. In my examples above, all X++ code is running in the CLR – the problem is not with the language.

    Even if X++ were a full-blown CLR language, you would still face these problems. What we need to fix is not "just" the language and compiler – but all the libraries you use when writing X++ code. In AX 2012 the majority of these are written in C++. To avoid the interop overhead we need to replace them with CLR implementations. This includes everything under System Documentation in the AOT, i.e. >100 built-in functions, >500 classes, and the entire data access stack.

    Solving all this – and more is in the works for the next major release of AX – code named "Rainier".

  3. alibertism says:

    The T-Shirt lies. There is no such thing as fast X++ code. Please make X++ a full blown CLR language ASAP. Otherwise we are all doomed (a partner).

  4. msmfp says:

    I added a link to the test harness above.

  5. chrisjones81 says:

    Very interesting post.  Is it possible for you to share your test harness?

  6. KimKopowski says:

    Great post! Do you mind sharing your job that counts the number of class instances per second?

  7. msmfp says:

    Thanks Tommy.

    The changes can relatively easily be back-ported to R2 or even AX 2012 RTM. It's all X++ in a few classes. However, we only do this on demand, so for that to happen a customer support case must exist.

    Perhaps I have a future in designing t-shirts – if demand is high enough, I might get some produced 🙂

  8. Tommy Skaue says:

    Great post, Michael!

    I guess the question now is: will these changes be back-ported to R2? If not, can we at least get a T-shirt?