JIT ETW Tail Call Event Fail Reasons

This is a follow-up post on JIT ETW tracing in .NET Framework 4. These are some of the strings that might show up in the FailReason field of the MethodJitTailCallFailed event. The first group comes from, or is checked for by, the VM (as opposed to the JIT), and is listed in no particular order:

  • "Caller is ComImport .cctor" - This means the caller is a static class constructor for a type which has a base type somewhere in the class hierarchy marked with the ComImportAttribute. This is caused by an implementation choice within the runtime for managed objects that effectively derive from native COM objects. You must remove the attribute if you want to perform a tail call.
  • "Caller has declarative security" - This means the caller has a declarative security attribute applied to it (usually an Assert or a Demand, but Deny and PermitOnly also prevent tail calls). The current implementation relies on the caller remaining on the stack to enforce the security attribute. You must remove the attribute from the caller if you want to perform a tail call.
  • "Different security" - The caller and callee have different permissions, and the one with the 'lower' permissions must remain on the stack. Since comparing permissions is expensive, we simplify it to Full Trust and non-Full Trust. Full Trust code can do anything, including tail calls. One other special case is homogeneous appdomains, where everything in the appdomain has the same permissions, so even if the callee is unknown (due to virtual or indirect calls), the callee must have the same permissions as the caller. If you want to do a tail call, use a homogeneous appdomain, grant the caller Full Trust, or put the caller and the callee in the same assembly and make sure it is a direct call.
  • "Caller is the entry point" - If there is no "tail." instruction prefix, the JIT is not allowed to generate a tail call from a method marked as the entry point for a module. The idea is that programmers like to always see their Main method at the bottom of the stack. There is no way around this restriction.
  • "Caller is marked as no inline" - If there is no "tail." instruction prefix and the caller is explicitly marked with MethodImplOptions.NoInlining, then the VM assumes the programmer really wants that method frame to remain on the stack and not get elided via inlining or tail calls, and so it prevents tail calls from that method. If you want to do a tail call, either explicitly add the "tail." prefix to the call or remove the NoInlining flag.
  • "Callee might have a StackCrawlMark.LookForMyCaller" - Certain methods in mscorlib rely on a stack walk to determine their caller. They are marked to prevent inlining and also to prevent direct tail calls. This will only happen if the callee is known and is inside mscorlib. There is no way to generate tail calls directly to these methods.
  • "Caller is a CER root" - See the Constrained Execution Regions topic on MSDN.
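
As a rough illustration of two of the VM-level reasons above, here is a minimal C# sketch. The type and method names are hypothetical, made up for this example, and it only shows the source-level shape of a caller marked NoInlining and of an entry point. Whether a MethodJitTailCallFailed event is actually raised for these methods depends on which JIT is in use and on whether the IL carries the "tail." prefix.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical example type; none of these names come from the runtime.
static class VmTailCallShapes
{
    static int Callee(int x)
    {
        return x + 1;
    }

    // "Caller is marked as no inline": the call below is in tail position,
    // but without a "tail." prefix the VM keeps this frame on the stack.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int PinnedCaller(int x)
    {
        return Callee(x);
    }

    // Same body without the attribute, so this reason does not apply.
    static int PlainCaller(int x)
    {
        return Callee(x);
    }

    // "Caller is the entry point": the last call in Main is in tail
    // position, but Main is expected to stay at the bottom of the stack.
    static void Main()
    {
        Console.WriteLine(PinnedCaller(1));
        Console.WriteLine(PlainCaller(1));
    }
}
```

Both callers return the same value; the only difference between them is the NoInlining flag, which is what the VM looks at when it reports this reason.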

From the x86 JIT, we get this list of failure reasons (again in no particular order):

  • "Caller is synchronized" - The caller is marked with MethodImplOptions.Synchronized.  The JIT needs to leave the caller's frame on the stack until after the callee finishes in order to know when to release the runtime-implemented locking.
  • "Caller is varargs" - This is just an implementation limitation of the x86 JIT. For more information about varargs in C# (not the params keyword), search for __arglist.
  • "Caller requires a security check." - The caller is marked with mdRequireSecObject for imperative security.  With our current implementation, such methods need their own call frame so the corresponding Assert or Deny will end at the return of the method.  If you want to do a tail call, remove the imperative security calls.
  • "Needs security check" - Same as above.
  • "Callee is native" - We currently cannot tail call from managed code to native code.
  • "PInvoke calli" - Same as above.
  • "Return types don't match" - The caller and callee must have the exact same return type.  If you want to do a tail call, change the return types to match.
  • "Localloc used" - This is just an implementation limitation of the x86 JIT. In C#, if you use stackalloc, the JIT cannot be sure of the intended lifetime, so it plays it safe and prevents the tail call. If you want to do a tail call, remove the stackalloc.
  • "Need to copy return buffer" - If the return value doesn't fit in a register, the caller needs to allocate a buffer. Normally the JIT reuses the caller's return buffer for the callee to avoid a copy, but sometimes it can't, and because it now has to do a copy after the callee returns, it can't do a tail call. If you want to do a tail call, use an out parameter rather than a return value.
  • "Changed into handle" - The C# expression 'typeof(XXX).TypeHandle' involves a call to the property method get_TypeHandle. The JIT can turn that whole expression (including the call) into a simple embedded constant (the TypeHandle as provided by the VM). We think that is faster and better than any tail call.
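
To make a couple of the x86 reasons above more concrete, here is a small C# sketch. All names are hypothetical. It shows a synchronized caller, and the "use an out parameter rather than a return value" suggestion for a struct that is assumed to be too large to return in a register. Since the x86 JIT only attempts a tail call when the IL uses the "tail." prefix, this only illustrates the source-level shapes and does not by itself trigger the events.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical 32-byte struct, assumed too large to come back in a register.
struct Big
{
    public long A, B, C, D;
}

static class X86TailCallShapes
{
    static int Callee(int x)
    {
        return x + 1;
    }

    // "Caller is synchronized": the runtime-implemented lock must be
    // released after Callee returns, so this frame has to stay.
    [MethodImpl(MethodImplOptions.Synchronized)]
    static int SynchronizedCaller(int x)
    {
        return Callee(x);
    }

    static Big MakeBig(long seed)
    {
        Big b = new Big();
        b.A = seed; b.B = seed + 1; b.C = seed + 2; b.D = seed + 3;
        return b;
    }

    // Returned by value: the result travels through a return buffer,
    // which the JIT may or may not be able to reuse for the callee.
    static Big ForwardByValue(long seed)
    {
        return MakeBig(seed);
    }

    static void FillBig(long seed, out Big result)
    {
        result = new Big();
        result.A = seed; result.B = seed + 1; result.C = seed + 2; result.D = seed + 3;
    }

    // Suggested alternative: the caller's storage is passed explicitly,
    // so there is no return value to copy after the call.
    static void ForwardByOut(long seed, out Big result)
    {
        FillBig(seed, out result);
    }

    static void Main()
    {
        Big viaOut;
        ForwardByOut(10, out viaOut);
        Console.WriteLine(SynchronizedCaller(1));
        Console.WriteLine(ForwardByValue(10).D == viaOut.D);
    }
}
```

The two forwarding methods produce the same data; the out-parameter version simply removes the return value, and with it any need for a copy after the call.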

From the 64-bit JIT, we get this list of failure reasons:

  • "function has EH" - The IA64 JIT doesn't support tail calls from methods with try/catch/finally clauses, unless the call uses the "tail." instruction prefix. If you want to do a tail call, remove the exception handling clauses or add a "tail." prefix.
  • "found symbol with address taken" - If the call doesn't use the "tail." instruction prefix and the method takes the address of a local, the JIT doesn't do enough analysis to see if it is address-escaped (meaning the callee uses the address to access the caller's local), so it just doesn't try to optimize a normal call into a tail call.
  • "local address taken" - Same as above.
  • "synchronized" - This is the same as the x86 JIT's "Caller is synchronized".
  • "caller's imperative security" - This is the same as the x86 JIT's "Caller requires a security check".
  • "caller's declarative security" - This is the same as the VM's "Caller has declarative security".
  • "not optimizing" - The JIT disabled all optimizations, and so it only performs a tail call if the "tail." prefix is present. If you want to do a tail call, either add the "tail." prefix or re-enable optimizations. Some of the reasons why JIT optimizations might be disabled include: using MethodImplOptions.NoOptimization, a method that is too big or too complex to optimize, running under a debugger, and certain compiler switches.
  • "localloc" - This is the same as the x86 JIT's "Localloc used", except the 64-bit JIT will do a tail call if the call explicitly uses the "tail." instruction prefix.
  • "GS" - The method uses local buffers (unmanaged arrays) and the JIT adds extra code to detect buffer overruns before they can be exploited. These extra checks are incompatible with tail calls in our current implementation. The name comes from the C++ compiler's /GS command-line switch, and the checks attempt to prevent many of the same issues as they appear in unsafe managed code.
  • "turned into intrinsic" - The 64-bit JIT cannot tail call certain methods that effectively turn into special code. This is similar to the x86 JIT's "Changed into handle".
  • "P/Invoke" - This is the same as the x86 JIT's "Callee is native".
  • "return type mismatch" - The caller and callee must have compatible return types (types that don't require any conversion at the hardware level). This is similar to the x86 JIT's "Return types don't match", but the 64-bit JIT is slightly more permissive.
  • "processor specific reasons" - The caller's and callee's signatures are different enough that the calling convention makes it hard (or impossible) to do a traditional optimized tail call. This is usually caused by the callee having more (or bigger) arguments than the caller. On x64, if the "tail." instruction prefix is used, the JIT will generate a HelperAssistedTailCall.
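
For two of the 64-bit reasons above, a minimal C# sketch of the source-level shapes might look like the following. The names are hypothetical, and whether the JIT actually reports these reasons for this exact code is an assumption, since it depends on the runtime version and on optimization settings.

```csharp
using System;

// Hypothetical example type; none of these names come from the runtime.
static class X64TailCallShapes
{
    static int Callee(int x)
    {
        return x + 1;
    }

    static int Wide(int a, int b, int c, int d, int e, int f)
    {
        return a + b + c + d + e + f;
    }

    // "processor specific reasons": the callee takes more arguments than
    // the caller, so the caller's incoming argument area is too small
    // to hold the callee's arguments.
    static int NarrowCaller(int a)
    {
        return Wide(a, a, a, a, a, a);
    }

    // "function has EH": the final call is in tail position and outside
    // the try block, but the method still contains a try/catch clause.
    static int CallerWithEH(string text)
    {
        int value;
        try
        {
            value = int.Parse(text);
        }
        catch (FormatException)
        {
            value = 0;
        }
        return Callee(value);
    }

    static void Main()
    {
        Console.WriteLine(NarrowCaller(7));
        Console.WriteLine(CallerWithEH("41"));
    }
}
```

Moving the try/catch into a separate helper method, or giving the caller and callee matching signatures, removes the corresponding condition from the caller.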

It is worth noting that the 64-bit JIT tries to optimize almost all calls into tail calls. The JIT also assumes a certain amount of knowledge, intent and analysis when the "tail." IL prefix is used on a call. A normal call (no prefix) is sort of like telling the JIT to make a call however it deems best. The JIT then does some quick conservative checks to see if a tail call is possible and would be as good as or better than a normal call. On the other hand, a call with the "tail." prefix is sort of like telling the JIT to try as hard as possible to make a tail call, because the programmer or the compiler did some big analysis and proved that, despite what the JIT might think, the tail call is safe and will be better than a regular call. Thus the only things the JIT has to check for are known problems (verification, security, and implementation limitations).

The x86 JIT, on the other hand, currently only attempts to do a tail call when the IL explicitly uses the "tail." prefix.  Thus the x86 JIT only checks for correctness.

It is my understanding that the C# and VB.NET compilers never emit the "tail." instruction prefix, but the C++ and F# compilers generate it automatically, so the programmer has very little control over this condition. So unless you write in IL, or use some form of IL rewriter, your ability to add or remove the "tail." prefix is limited at best.

Lastly, if you're still reading, you have probably noticed that there is a lot of redundancy. This is partly because the messages are generated by different components in the runtime - the VM, the x86 JIT, and the x64 JIT - which were developed, and have evolved, fairly independently. There is also some amount of redundancy as a safety precaution.

Grant Richins
CLR Codegen Team