There are a few things I’d like to say about inlining in the managed world. I guess I should start with the basics of why it’s important, but before I do, remember that, like all my advice, this is to be taken in moderation, with due consideration for the circumstances of your own problem. Which is a fancy way of saying: don’t forget that you should ignore me sometimes, and please do.
So, why/when is inlining important? There are basically two cases, which are really variations on the same theme. Both are about saving work.
Direct Savings from Inlining
If you have a relatively small function that does a modest amount of computing it may be the case that the cost of calling and returning from the function is actually comparable to, or greater than, the actual work the function will do. Inlining the body of the function in that case can be a huge win.
In functions that are called very often this can be very significant. In the best case, a virtual function call can be reduced to a direct function call because the exact type of the object is known (e.g. calling a virtual method on a freshly-newed value type, or on a sealed type), and the (now direct) function call can then be inlined. Modern processors can have a great deal of difficulty predicting and prefetching through the sort of computed indirect jumps that virtual calls require. Removing those paths entirely and replacing them with potentially straight-line code (see below) can net you big wins.
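To make the sealed-type case concrete, here is a minimal sketch. The names Shape, Triangle and DirectSavingsDemo are made up for illustration, and whether a given JIT actually turns the call into a direct one and inlines it depends on the runtime version, so treat the comments as "may", not "will".

```csharp
public abstract class Shape
{
    public abstract int Sides { get; }
}

// Sealed: nothing can override Sides further, so when the static type
// is Triangle the target of the call is known exactly.
public sealed class Triangle : Shape
{
    public override int Sides { get { return 3; } }
}

public static class DirectSavingsDemo
{
    public static int CountSides(Triangle[] triangles)
    {
        int total = 0;
        foreach (Triangle t in triangles)
        {
            // Tiny body, hot loop: the call overhead can rival the work done.
            // A candidate for a direct call and then inlining (not guaranteed).
            total += t.Sides;
        }
        return total;
    }
}
```

Calling DirectSavingsDemo.CountSides on an array of two triangles returns 6 whether or not anything gets inlined; only the cost per iteration changes.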
Indirect Savings from Inlining
When inlining is being considered, the code to be inlined can be examined not in its full generality but in the context of the particular call that is to be inlined, which in turn means that extra optimizations might be possible. Here are some examples:
- if the function is being called with some constant arguments, those arguments might rule out a variety of code flow branches in if/switch statements, thus avoiding unnecessary testing, or they might be combinable with other constants in the inlined code, allowing certain computations to be completely resolved at compile time
- the inlined code might have some sub-expressions which also appear in the calling code; those might not need to be re-evaluated, again saving work
- certain bits of code in the caller might be provably unnecessary because it turned out that the inlined code didn’t need those arguments given certain other arguments for this particular call
- simplifications in one level of inlining may then make it possible to consider an additional level of inlining, where the inlined function in turn inlines something else
- combinations of the above
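The constant-argument case above can be sketched like this. Scale, Caller and CallerAfterInlining are hypothetical names, and the last method is hand-written to show what the call site could reduce to in principle; it is not a claim about what any particular JIT emits.

```csharp
public static class IndirectSavingsDemo
{
    // General form: must handle every mode.
    public static int Scale(int value, int mode)
    {
        switch (mode)
        {
            case 0: return value;
            case 1: return value * 2;
            default: return value * value;
        }
    }

    public static int Caller(int x)
    {
        // mode is the constant 1 at this call site. If Scale were inlined
        // here, the switch could be resolved at compile time and only the
        // "value * 2" arm would need to survive.
        return Scale(x, 1) + 4;
    }

    // Hand-written equivalent of Caller after inlining and folding.
    public static int CallerAfterInlining(int x)
    {
        return x * 2 + 4;
    }
}
```

Both Caller(5) and CallerAfterInlining(5) return 14. The point is that the second form has no call and no switch, which is work the first form only sheds if the inliner sees the constant.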
Again, I can’t give you the complete story of how the JIT makes its inlining decisions, but I can say a few things:
- inlining happens in the JIT, not in the language compiler so all languages get basically the same level of inlining support
- inlining too much is probably worse than inlining too little due to bloat, so compiler inlining decisions need to be carefully considered
- inlining one thing can deeply affect the utility of inlining something else (e.g. it could be that inlining “A” and “B” is a good idea but inlining just “A” or just “B” is a bad idea)
- unlike regular compilers, the JIT compiler is time-constrained, so it can’t afford to “think as hard” about inlining decisions as, say, the native C++ compiler would
- it would be cool if the JIT would “think harder” about inlining when used in the context of ngen, but it doesn’t (at this time)
As a result of all this you get substantially more conservative inlining from the JIT than you would from, say, the C++ compiler, much to my chagrin sometimes. However sad this may be, it is what it is, at least for now. 🙂
What can be Inlined: Rule of Thumb
About the most complicated thing you can reasonably expect to be inlined (and remember you really have to see for yourself to be sure) is a function like this:
public int GoodChanceOfInliningThisFunction(…args…)
Basically you get one “if” and some simple expressions. Remember, sneaky ifs like “&&” and “?:” count too, so the simple expression needs to be pretty simple.
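The listing above gives only a signature, so here is a hypothetical function of roughly the shape described: one “if” and a couple of simple expressions. The Budget class and its members are invented for illustration, and whether either method really gets inlined is something you would have to check on your own runtime.

```csharp
public sealed class Budget
{
    private readonly int limit;
    private readonly int used;

    public Budget(int limit, int used)
    {
        this.limit = limit;
        this.used = used;
    }

    // One "if", two simple expressions: about the upper end of the yardstick.
    public int Remaining()
    {
        if (used < limit)
            return limit - used;
        return 0;
    }

    // Same logic written with "?:". There is no visible "if", but the
    // conditional operator is still a branch and counts as one.
    public int RemainingViaConditional()
    {
        return used < limit ? limit - used : 0;
    }
}
```

For example, new Budget(10, 4).Remaining() returns 6 and new Budget(10, 12).Remaining() returns 0, and the “?:” version gives the same answers.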
So if you’re writing simple field-accessors for (non-virtual) property getters and setters, you can reasonably expect those to be inlined, but if you have looping and so forth, that’s not likely to get inlined. This is a pretty ripe area for improvements, so I’m hopeful things will get better over time, but that’s a decent yardstick.
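As a rough picture of the two ends of that yardstick, consider the sketch below. Samples, Cursor and Sum are hypothetical names, and the comments describe expectations under the rule of thumb, not guarantees.

```csharp
public sealed class Samples
{
    private readonly int[] values;
    private int cursor;

    public Samples(int[] values)
    {
        this.values = values;
    }

    // Non-virtual property that just reads and writes a field:
    // the kind of accessor you can reasonably expect to be inlined.
    public int Cursor
    {
        get { return cursor; }
        set { cursor = value; }
    }

    // Contains a loop: by the rule of thumb, not likely to be inlined.
    public int Sum()
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
            total += values[i];
        return total;
    }
}
```

The practical upshot is that wrapping a field in a simple property like Cursor should cost you little or nothing, while a call to something like Sum will probably remain a real call.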
As always I invite comments, questions, and clarifications.
Thanks for reading.