Some Thoughts About Inlining


There are a few things I’d like to say about inlining in the managed world.  I guess I should start with the basics of why it’s important, but before I do, remember that like all my advice this is to be taken in moderation, with due consideration to your own factors and the circumstances of your problem.  Which is a fancy way of saying don’t forget that you should ignore me sometimes, and please do.


So, why/when is inlining important?  There are basically two cases, which are sort of variations on the same case. Both are about saving work.


Direct Savings from Inlining


If you have a relatively small function that does a modest amount of computing it may be the case that the cost of calling and returning from the function is actually comparable to, or greater than, the actual work the function will do.  Inlining the body of the function in that case can be a huge win.


In functions that are called very often this can be very significant. In the best case a virtual function call can be reduced to a direct function call because the exact type of the object is known (e.g. calling a virtual method on a freshly-newed value type, or on a sealed type), and the (now direct) function call can then be inlined.  Modern processors can have a great deal of difficulty predicting and prefetching through the sort of computed indirect jumps that are required by virtual calls.  Removing those paths entirely and replacing them with potentially straight-line code (see below) can net you big wins.
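
To make the sealed-type case concrete, here is a minimal sketch. The types Shape and Square and the class DevirtDemo are hypothetical names invented for illustration, and whether the call really gets bound directly and inlined depends on the JIT version, so treat the comments as what could happen, not what will.

```csharp
using System;

public abstract class Shape
{
    public abstract double Area();
}

// 'sealed' guarantees that no further override of Area() can exist.
public sealed class Square : Shape
{
    private readonly double side;

    public Square(double side) { this.side = side; }

    public override double Area() { return side * side; }
}

public static class DevirtDemo
{
    // Static type is Shape: the exact type is unknown here, so this is
    // generally a virtual (indirect) call.
    public static double AreaViaBase(Shape s) { return s.Area(); }

    // Static type is the sealed Square: the target is known exactly, so the
    // JIT may bind the call directly and then consider inlining it.
    public static double AreaViaSealed(Square s) { return s.Area(); }

    public static void Main()
    {
        Square sq = new Square(3.0);
        Console.WriteLine(AreaViaBase(sq));
        Console.WriteLine(AreaViaSealed(sq));
    }
}
```

Both calls return the same value; the only difference is how much the JIT knows about the call target at the call site.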


Indirect Savings from Inlining


When inlining is being considered, the code to be inlined can be considered not in its full generality but in the context of the particular invocation that is to be inlined; which in turn means that extra optimizations might be possible.  Here are some examples:



  • if the function is being called with some constant arguments those arguments might rule out a variety of code flow branches in if/switch statements thus avoiding unnecessary testing, or they might be combinable with other constants in the inlined code allowing certain computations to be completely resolved at compile time
  • the inlined code might have some sub-expressions which also appear in the calling code; those might not need to be re-evaluated, again saving work
  • certain bits of code in the caller might be provably unnecessary because it turned out that the inlined code didn’t need those arguments given certain other arguments for this particular call
  • simplifications in one level of inlining may then make it possible to consider an additional level of inlining, where the inlined function in turn inlines something else
  • combinations of the above
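
As a sketch of the first bullet (constant arguments ruling out branches), consider the following. Scaling, Scale, ScaleSlow, and Caller are hypothetical names invented for this example, and the simplification described in the comment is something an inliner could do, not something the JIT is guaranteed to do.

```csharp
using System;

public static class Scaling
{
    // Small wrapper: one test, an easy case, and a call for the hard case.
    public static int Scale(int value, int mode)
    {
        if (mode == 0)
            return value;
        else
            return ScaleSlow(value, mode);
    }

    // The hard case doubles the value 'mode' times.
    private static int ScaleSlow(int value, int mode)
    {
        int result = value;
        for (int i = 0; i < mode; i++)
            result *= 2;
        return result;
    }

    public static int Caller(int x)
    {
        // 'mode' is the constant 0 at this call site. If Scale is inlined
        // here, the test "mode == 0" can be resolved at compile time and the
        // ScaleSlow branch dropped, so this could reduce to "return x + 1".
        return Scale(x, 0) + 1;
    }
}
```

The out-of-line version of Scale still has to test mode on every call; only the inlined copy gets to take advantage of what this particular caller knows.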

Compiler Considerations


Again, I can’t give you the complete story but I can say a few things:



  • inlining happens in the JIT, not in the language compiler so all languages get basically the same level of inlining support
  • inlining too much is probably worse than inlining too little due to bloat, so compiler inlining decisions need to be carefully considered
  • inlining one thing can deeply affect the utility of inlining something else (e.g. it could be that inlining “A” and “B” is a good idea but inlining just “A” or just “B” is a bad idea)
  • unlike regular compilers, the JIT compiler is time-constrained, so it can’t afford to “think as hard” as say the native C++ compiler would about inlining decisions
  • it would be cool if the JIT would “think harder” about inlining if used in the context of ngen but it doesn’t (at this time)

As a result of all this you get substantially more conservative inlining from the JIT than you would from say the C++ compiler — much to my chagrin sometimes.  However sad this may be, it is what it is, at least for now. :-)


What can be Inlined: Rule of Thumb


About the most complicated thing you can reasonably expect to be inlined (and remember you really have to see for yourself to be sure) is a function like this:


public int GoodChanceOfInliningThisFunction(…args…)
{
    if (simple-expression-involving-args)
        return simple-expression-for-easy-case;
    else
        return FunctionThatDoesHardCase(…args…);       
}


Basically you get one “if” and some simple expressions. Remember, sneaky if’s like “&&” and “?:” count too, so the simple expression needs to be pretty simple. 


So if you’re writing simple field-accessors for (non-virtual) property getters and setters, you can reasonably expect those to be inlined but if you have looping and so forth, that’s not likely to get inlined.  This is a pretty ripe area for improvements so I’m hopeful things will get better over time, but that’s a decent yardstick.
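
If you want to see for yourself, one rough approach is to time a trivial accessor against the same code with inlining forbidden. The sketch below uses the real MethodImplOptions.NoInlining flag and the Stopwatch class; Point, GetXNoInline, InlineTiming, and the iteration count are hypothetical choices made for illustration. Run it as an optimized (release) build without a debugger attached, and treat the numbers as a hint, not a proper benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public sealed class Point
{
    private int x;

    public Point(int x) { this.x = x; }

    // Trivial non-virtual field accessor: a likely inlining candidate.
    public int X { get { return x; } }

    // Same work, but the attribute tells the JIT never to inline it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public int GetXNoInline() { return x; }
}

public static class InlineTiming
{
    public static void Main()
    {
        const int N = 100000000;
        Point p = new Point(1);
        long sum = 0;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++) sum += p.X;
        sw.Stop();
        Console.WriteLine("property:   " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++) sum += p.GetXNoInline();
        sw.Stop();
        Console.WriteLine("NoInlining: " + sw.ElapsedMilliseconds + " ms");

        // Print the sum so the results of both loops are actually used.
        Console.WriteLine(sum);
    }
}
```

If the two timings come out about the same, that suggests the accessor was not being inlined in the first place; looking at the generated disassembly is the more reliable check.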


As always I invite comments, questions, and clarifications. 


Thanks for reading.

Comments (9)

  1. juan felipe says:

    Thanks for your articlets. There is so little documentation about performance around….. It’s good to see someone is really trying to give us (performance-thirsty developers) one or two things to think about.

    I hope you write a little more often….

    What about Jan Gray?? it seems he stops blogging… (tooooo bad) :(

  2. Rico Mariani says:

    Jan sits in the office next to mine. I think I can relay your message :)

  3. Rico Mariani says:

    In the self-education department, I just learned from a colleague that some of the language compilers will be doing compile-time inlining (naturally at the IL level) to help shore up the JIT’s inlining in Whidbey.

    Sounds good to me, who knows what goodness that may bring!

  4. Jim Argeropoulos says:

    How do you do inlining in C#?

    I would like some of my get/set property operations to be inline, but I see no way to such a thing. The best I can accomplish is to tell the debugger not to step into them via an attribute.

    It appears that this is just one of those "simple first" things that they did.

  5. Rico Mariani says:

    Inlining is automatic, so you don’t have to do anything. The JIT decides whether it can inline, and if it can and it thinks doing so would make the code better, then it does. Thing is, as I wrote above, it has a very simple inliner, so only the simplest functions tend to get inlined.

    Note, when debugging things tend to not get inlined at all to make it clear what’s going on. To see if you’re getting the benefit the best thing to do is to start your application normally, then attach a debugger (I think "msdev -p process-id" does it) and then look at the disassembly that was generated. Looking at the IL doesn’t usually tell you anything because the JIT does the inlining.

  6. Ian Marteens says:

    This is a trick I learned the hard way: inlining doesn’t happen when calling a method from a class derived from MarshalByRefObject… as System.Windows.Forms.Form is. Remember this when measuring performance from a WinForm application.

  7. Ian is right on the money. Of course, this is just another good reason to separate the GUI code from the business logic code. Still, most properties are declared on components, and being MBRO, those property setters and getters will not be inlined. I can’t help but think that deriving Component from MBRO was a mistake. Most WinForm applications do not need the remoting capability of MBRO, and killing inlining for 99% to make it easier to do remoting for the 1% that needs it might not have been the best solution?

    Maybe it should be possible to turn off the remoting support of MBROs by using an attribute on the application level? Then the jitter could enable inlining of MBROs in that AppDomain and just raise an exception if remoting is attempted?

  8. Channel 9 says:

    .Net inlines everything it can