Dissecting the local functions in C# 7


Local functions are a new feature in C# 7 that lets you define a function inside another function.

When to use a local function?

The main idea of local functions is very similar to anonymous methods: in some cases creating a named function is too expensive in terms of cognitive load on a reader. Sometimes the functionality is inherently local to another function and it makes no sense to pollute the "outer" scope with a separate named entity.

You may think that this feature is redundant because the same behavior can be achieved with anonymous delegates or lambda expressions. But this is not always the case. Anonymous functions have certain restrictions and their performance characteristics can be unsuitable for your scenarios.

Use Case 1: eager preconditions in iterator blocks

Here is a simple function that reads a file line by line. Do you know when the ArgumentNullException will be thrown?

public static IEnumerable<string> ReadLineByLine(string fileName)
{
    if (string.IsNullOrEmpty(fileName)) throw new ArgumentNullException(nameof(fileName));

    foreach (var line in File.ReadAllLines(fileName))
    {
        yield return line;
    }
}

// When will the error happen?
string fileName = null;
// Here?
var query = ReadLineByLine(fileName).Select(x => $"\t{x}").Where(l => l.Length > 10);
// Or here?
ProcessQuery(query);

Methods with yield return in their body are special. They are called iterator blocks and they're lazy. This means that such a method executes on demand: its first block of code runs only when the client calls MoveNext on the resulting iterator. In our case, it means that the error will happen only in the ProcessQuery method because all the LINQ operators are lazy as well.

Obviously, this behavior is not desirable because the ProcessQuery method will not have enough information about the context of the ArgumentNullException. It would be better to throw the exception eagerly: when a client calls ReadLineByLine, not when a client processes the result.

To solve this issue we need to separate the validation logic from the iterator block by moving the iterator into its own method. This is a good candidate for an anonymous function, but anonymous delegates and lambda expressions do not support iterator blocks (*). A local function does:

(*) Lambda expressions in VB.NET can have an iterator block.

public static IEnumerable<string> ReadLineByLine(string fileName)
{
    if (string.IsNullOrEmpty(fileName)) throw new ArgumentNullException(nameof(fileName));

    return ReadLineByLineImpl();

    IEnumerable<string> ReadLineByLineImpl()
    {
        foreach (var line in File.ReadAllLines(fileName))
        {
            yield return line;
        }
    }
}
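
To see the difference in timing, here is a minimal sketch that puts the two shapes side by side. It is not the code from this post: LazyLines, EagerLines and Throws are hypothetical helpers, and the sketch splits an in-memory string instead of reading a file so that it runs anywhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IteratorValidationDemo
{
    // Validation lives inside the iterator block, so it is deferred
    // until the first MoveNext call.
    public static IEnumerable<string> LazyLines(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        foreach (var line in text.Split('\n'))
        {
            yield return line;
        }
    }

    // Validation runs immediately; only the local function is an iterator.
    public static IEnumerable<string> EagerLines(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return Impl();

        IEnumerable<string> Impl()
        {
            foreach (var line in text.Split('\n'))
            {
                yield return line;
            }
        }
    }

    // Hypothetical helper: true if the action throws ArgumentNullException.
    public static bool Throws(Action action)
    {
        try { action(); return false; }
        catch (ArgumentNullException) { return true; }
    }

    public static void Main()
    {
        Console.WriteLine(Throws(() => LazyLines(null)));          // False
        Console.WriteLine(Throws(() => LazyLines(null).ToList())); // True
        Console.WriteLine(Throws(() => EagerLines(null)));         // True
    }
}
```

The lazy version only fails once something enumerates the sequence (ToList here), while the eager version fails at the call itself.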

Use Case 2: eager preconditions in async methods

Async methods have a similar issue with exception handling: any exception thrown in a method marked with the async keyword (**) manifests itself in a faulted task:

public static async Task<string> GetAllTextAsync(string fileName)
{
    if (string.IsNullOrEmpty(fileName)) throw new ArgumentNullException(nameof(fileName));

    var result = await File.ReadAllTextAsync(fileName);
    Log($"Read {result.Length} characters from '{fileName}'");
    return result;
}


string fileName = null;
// No exceptions
var task = GetAllTextAsync(fileName);
// The following line will throw
var lines = await task;

(**) Technically, async is a contextual keyword, but this doesn't change my point.

You may think that it doesn't matter much when the error surfaces. But this is far from the truth. A faulted task means that the method itself failed to do what it was supposed to do: the problem is in the method or in one of the building blocks it relies on. A violated precondition, by contrast, is the caller's mistake and should not be reported that way.

Eager precondition validation is especially important when the resulting task is passed around the system. In this case, it would be extremely hard to understand when and what went wrong. A local function can solve this issue:

public static Task<string> GetAllTextAsync(string fileName)
{
    // Eager argument validation
    if (string.IsNullOrEmpty(fileName)) throw new ArgumentNullException(nameof(fileName));

    return GetAllTextAsync();

    async Task<string> GetAllTextAsync()
    {
        var result = await File.ReadAllTextAsync(fileName);
        Log($"Read {result.Length} characters from '{fileName}'");
        return result;
    }
}
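
The same comparison can be sketched for async methods. The names LengthDeferredAsync and LengthEagerAsync are made up for this illustration, and Task.Yield stands in for real I/O so that the sketch needs no file on disk.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncValidationDemo
{
    // Validation inside an async method: the exception is stored
    // in the returned task instead of being thrown to the caller.
    public static async Task<int> LengthDeferredAsync(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        await Task.Yield();
        return text.Length;
    }

    // Validation in a non-async wrapper: the exception is thrown
    // synchronously, before any task is created.
    public static Task<int> LengthEagerAsync(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return Impl();

        async Task<int> Impl()
        {
            await Task.Yield();
            return text.Length;
        }
    }

    public static void Main()
    {
        Task<int> deferred = LengthDeferredAsync(null); // does not throw here
        Console.WriteLine(deferred.IsFaulted);          // True
        Console.WriteLine(deferred.Exception?.InnerException is ArgumentNullException); // True

        try
        {
            LengthEagerAsync(null);
            Console.WriteLine("no exception");
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("thrown at the call site");
        }
    }
}
```

In the deferred version the caller gets a task that is already faulted and only sees the exception on await; in the eager version the caller never receives a task at all.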

Use Case 3: local function with iterator blocks

I find it very annoying that you can't use iterators inside a lambda expression. Here is a simple example: if you want to get all the fields in the type hierarchy (including the private ones) you have to traverse the inheritance hierarchy manually. But the traversal logic is method-specific and should be kept as local as possible:

public static FieldInfo[] GetAllDeclaredFields(Type type)
{
    var flags = BindingFlags.Instance | BindingFlags.Public |
                BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
    return TraverseBaseTypeAndSelf(type)
        .SelectMany(t => t.GetFields(flags))
        .ToArray();

    IEnumerable<Type> TraverseBaseTypeAndSelf(Type t)
    {
        while (t != null)
        {
            yield return t;
            t = t.BaseType;
        }
    }
}

Use Case 4: recursive anonymous method

An anonymous function can't reference itself by default. To work around this restriction you have to declare a local variable of a delegate type and then capture that local variable inside the lambda expression or anonymous delegate:

public static List<Type> BaseTypesAndSelf(Type type)
{
    Action<List<Type>, Type> addBaseType = null;
    addBaseType = (lst, t) =>
    {
        lst.Add(t);
        if (t.BaseType != null)
        {
            addBaseType(lst, t.BaseType);
        }
    };

    var result = new List<Type>();
    addBaseType(result, type);
    return result;
}

This approach is not very readable, and a similar solution with a local function feels way more natural:

public static List<Type> BaseTypesAndSelf(Type type)
{
    return AddBaseType(new List<Type>(), type);

    List<Type> AddBaseType(List<Type> lst, Type t)
    {
        lst.Add(t);
        if (t.BaseType != null)
        {
            AddBaseType(lst, t.BaseType);
        }
        return lst;
    }
}

Use Case 5: when allocations matter

If you have ever worked on a performance-critical application, then you know that anonymous methods are not cheap:

  • Overhead of a delegate invocation (very small, but it does exist).
  • 2 heap allocations if a lambda captures a local variable or an argument of the enclosing method (one for the closure instance and another for the delegate itself).
  • 1 heap allocation if a lambda captures enclosing instance state (just the delegate allocation).
  • 0 heap allocations only if a lambda does not capture anything or captures only static state.
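
The difference between the capturing and non-capturing cases in the list above can be observed without a profiler. The following sketch uses hypothetical method names and relies on a compiler implementation detail (the caching of non-capturing lambdas), so treat it as an illustration rather than a guarantee.

```csharp
using System;

public static class LambdaAllocationDemo
{
    // Captures nothing: the compiler can cache a single delegate instance
    // and hand out the same object on every call.
    public static Func<int> NonCapturing() => () => 42;

    // Captures the argument: every call needs a fresh closure
    // and a fresh delegate that points at it.
    public static Func<int> Capturing(int arg) => () => arg;

    public static void Main()
    {
        // True with the compilers I am aware of (the delegate is cached).
        Console.WriteLine(ReferenceEquals(NonCapturing(), NonCapturing()));

        // False: two separate delegate instances are allocated.
        Console.WriteLine(ReferenceEquals(Capturing(1), Capturing(1)));

        // The captured value travels with the delegate.
        Console.WriteLine(Capturing(7)()); // 7
    }
}
```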

But the allocation pattern for local functions is different.

public void Foo(int arg)
{
    PrintTheArg();
    return;

    void PrintTheArg()
    {
        Console.WriteLine(arg);
    }
}

If a local function captures a local variable or an argument then the C# compiler generates a special closure struct, instantiates it and passes it by reference to a generated static method:

internal struct c__DisplayClass0_0
{
    public int arg;
}

public void Foo(int arg)
{
    // Closure instantiation
    var c__DisplayClass0_ = new c__DisplayClass0_0() { arg = arg };

    // Method invocation with a closure passed by ref
    Foo_g__PrintTheArg0_0(ref c__DisplayClass0_);
}

internal static void Foo_g__PrintTheArg0_0(ref c__DisplayClass0_0 ptr)
{
    Console.WriteLine(ptr.arg);
}

(The compiler generates names with invalid characters like < and >. To improve readability I've changed the names and simplified the code a little bit.)

A local function can capture instance state, local variables (***) or arguments. No heap allocation will happen.

(***) Local variables used in a local function should be definitely assigned at the local function declaration site.

There are a few cases when a heap allocation will occur:

  1. A local function is explicitly or implicitly converted to a delegate.

Only a delegate allocation will occur if a local function captures static/instance fields but does not capture locals/arguments.

public void Bar()
{
    // Just a delegate allocation
    Action a = EmptyFunction;
    return;

    void EmptyFunction() { }
}

Both a closure allocation and a delegate allocation will occur if a local function captures locals/arguments:

public void Baz(int arg)
{
    // Local function captures an enclosing variable.
    // The compiler will instantiate a closure and a delegate
    Action a = EmptyFunction;
    return;

    void EmptyFunction() { Console.WriteLine(arg); }
}

  2. A local function captures a local variable/argument and an anonymous function captures a variable/argument from the same scope.

This case is way more subtle.

The C# compiler generates a different closure type per lexical scope (method arguments and top-level locals reside in the same top-level scope). In the following case the compiler will generate two closure types:

public void DifferentScopes(int arg)
{
    {
        int local = 42;
        Func<int> a = () => local;
        Func<int> b = () => local;
    }

    Func<int> c = () => arg;
}

Two different lambda expressions will use the same closure type if they capture locals from the same scope. Lambdas a and b reside in the same closure:

private sealed class c__DisplayClass0_0
{
    public int local;

    internal int DifferentScopes_b__0()
    {
        // Body of the lambda 'a'
        return this.local;
    }

    internal int DifferentScopes_b__1()
    {
        // Body of the lambda 'b'
        return this.local;
    }
}

private sealed class c__DisplayClass0_1
{
    public int arg;

    internal int DifferentScopes_b__2()
    {
        // Body of the lambda 'c'
        return this.arg;
    }
}

public void DifferentScopes(int arg)
{
    var closure1 = new c__DisplayClass0_0 { local = 42 };
    var closure2 = new c__DisplayClass0_1() { arg = arg };
    var a = new Func<int>(closure1.DifferentScopes_b__0);
    var b = new Func<int>(closure1.DifferentScopes_b__1);
    var c = new Func<int>(closure2.DifferentScopes_b__2);
}

In some cases, this behavior can cause some very serious memory-related issues. Here is an example:

private Func<int> func;

public void ImplicitCapture(int arg)
{
    var o = new VeryExpensiveObject();
    Func<int> a = () => o.GetHashCode();
    Console.WriteLine(a());

    Func<int> b = () => arg;
    func = b;
}

It seems that the o variable should be eligible for garbage collection right after the delegate invocation a(). But this is not the case. Two lambda expressions share the same closure type:

private sealed class c__DisplayClass1_0
{
    public VeryExpensiveObject o;
    public int arg;

    internal int ImplicitCapture_b__0()
        => this.o.GetHashCode();

    internal int ImplicitCapture_b__1()
        => this.arg;
}

private Func<int> func;

public void ImplicitCapture(int arg)
{
    var c__DisplayClass1_ = new c__DisplayClass1_0()
    {
        arg = arg,
        o = new VeryExpensiveObject()
    };
    var a = new Func<int>(c__DisplayClass1_.ImplicitCapture_b__0);
    Console.WriteLine(a());
    var b = new Func<int>(c__DisplayClass1_.ImplicitCapture_b__1);
    this.func = b;
}

This means that the lifetime of the closure instance is bound to the lifetime of the func field: the closure stays alive as long as the delegate func is reachable from the application. This can drastically prolong the lifetime of the VeryExpensiveObject, which is effectively a memory leak.
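
The sharing can be observed at run time through Delegate.Target, which returns the closure instance a delegate is bound to. The sketch below uses hypothetical method names and a plain object instead of VeryExpensiveObject. The second method shows one possible mitigation, assuming the per-scope closure rule described above: declaring the expensive variable in a nested scope so that it gets a closure of its own.

```csharp
using System;

public static class SharedClosureDemo
{
    // Both lambdas capture variables declared in the same top-level scope,
    // so they are bound to one closure instance.
    public static bool SameScopeSharesClosure(int arg)
    {
        var o = new object();
        Func<int> a = () => o.GetHashCode();
        Func<int> b = () => arg;
        return ReferenceEquals(a.Target, b.Target);
    }

    // 'o' is declared in a nested scope here. Following the per-scope rule,
    // it is expected to end up in a separate closure that 'b' does not keep alive.
    public static bool NestedScopeSharesClosure(int arg)
    {
        Func<int> a;
        {
            var o = new object();
            a = () => o.GetHashCode();
        }
        Func<int> b = () => arg;
        return ReferenceEquals(a.Target, b.Target);
    }

    public static void Main()
    {
        Console.WriteLine(SameScopeSharesClosure(1));   // True
        Console.WriteLine(NestedScopeSharesClosure(1));
    }
}
```

Closure layout is a compiler implementation detail, so it is worth checking the second result with the compiler version you actually use before relying on it.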

A similar issue happens when a local function and a lambda expression capture variables from the same scope. Even if they capture different variables, the closure type will be shared, causing a heap allocation:

public int ImplicitAllocation(int arg)
{
    if (arg == int.MaxValue)
    {
        // This code is effectively unreachable
        Func<int> a = () => arg;
    }

    int local = 42;
    return Local();

    int Local() => local;
}

Compiles to:

private sealed class c__DisplayClass0_0
{
    public int arg;
    public int local;

    internal int ImplicitAllocation_b__0()
        => this.arg;

    internal int ImplicitAllocation_g__Local1()
        => this.local;
}

public int ImplicitAllocation(int arg)
{
    var c__DisplayClass0_ = new c__DisplayClass0_0 { arg = arg };
    if (c__DisplayClass0_.arg == int.MaxValue)
    {
        var func = new Func<int>(c__DisplayClass0_.ImplicitAllocation_b__0);
    }
    c__DisplayClass0_.local = 42;
    return c__DisplayClass0_.ImplicitAllocation_g__Local1();
}

As you can see, all the captured variables from the top-level scope now become part of the closure class, causing a closure allocation even when the local function and the lambda expression capture different variables.

Local functions 101

Here is a list of the most important aspects of local functions in C#:

  1. Local functions can define iterators.
  2. Local functions are useful for eager validation in async methods and iterator blocks.
  3. Local functions can be recursive.
  4. Local functions are allocation-free if no conversion to delegates is happening.
  5. Local functions are slightly more efficient than anonymous functions due to a lack of delegate invocation overhead (****).
  6. Local functions can be declared after return statement separating main logic from the helpers.
  7. Local functions can "hide" a function with the same name declared in the outer scope.
  8. Local functions can be async and/or unsafe; no other modifiers are allowed.
  9. Local functions can't have attributes.
  10. Local functions are not very IDE friendly: there is no "extract local function" refactoring (yet), and if code with a local function is partially broken you'll get a lot of "squiggles" in the IDE.
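
Items 6 and 7 from the list above are easy to show together. This is a small sketch with made-up names (HidingDemo, Name, Call), not code from the post:

```csharp
using System;

public static class HidingDemo
{
    // A regular method declared in the outer (class) scope.
    private static string Name() => "outer";

    public static string Call()
    {
        // Inside this method the simple name 'Name' binds to the local function,
        // even though the local function is declared after the return statement.
        return Name();

        string Name() => "local";
    }

    public static void Main()
    {
        Console.WriteLine(Call()); // local
    }
}
```

Keeping the helper below the return statement lets the main logic of the method read top to bottom without interruption.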

(****) Here is a benchmark and the results:

private static int n = 42;

[Benchmark]
public bool DelegateInvocation()
{
    Func<bool> fn = () => n == 42;
    return fn();
}

[Benchmark]
public bool LocalFunctionInvocation()
{
    return fn();

    bool fn() => n == 42;
}

                  Method |      Mean |     Error |    StdDev |
------------------------ |----------:|----------:|----------:|
      DelegateInvocation | 1.5041 ns | 0.0060 ns | 0.0053 ns |
 LocalFunctionInvocation | 0.9298 ns | 0.0063 ns | 0.0052 ns |

To get these numbers you have to manually “decompile” a local function to a regular function. The reason is simple: a function as simple as “fn” is inlined by the runtime, so the benchmark won’t show the real invocation cost. I used a static function marked with the NoInlining attribute (unfortunately, you can’t use attributes with local functions).
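
For illustration, the hand-written replacement could look roughly like the sketch below. This is an assumption about the shape of the code, not the exact benchmark source: the class name and Fn are hypothetical, and the [Benchmark] attribute is left out so the sketch compiles without BenchmarkDotNet.

```csharp
using System;
using System.Runtime.CompilerServices;

public class InvocationBenchmarkSketch
{
    private static int n = 42;

    // Hand-written stand-in for the compiler-generated local function.
    // NoInlining asks the JIT not to inline the call, so the call itself is measured.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool Fn() => n == 42;

    // In a real BenchmarkDotNet run this method would carry [Benchmark].
    public bool LocalFunctionInvocation() => Fn();

    public static void Main()
    {
        Console.WriteLine(new InvocationBenchmarkSketch().LocalFunctionInvocation()); // True
    }
}
```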


Comments (22)

  1. Very much thanks to the publication of this article is very functional

  2. Dmytro Dziuma says:

    What’s about a use case when you return a Task but still want to use `yield` for some reason? I guess it would be a bit similar to Use Case 3

    1. The following code is invalid:

      public async Task<IEnumerable> M()
      {
          yield return 42;
      }

      Or maybe I’ve missed your point, Dmytro:(

      1. Dmytro Dziuma says:

        I meant something as follows:
        public async Task<IEnumerable> IterateAsync()
        {
            return Iterate();

            IEnumerable Iterate()
            {
                yield return 42;
            }
        }

        1. Oh… I see. This is indeed a very useful use case. Not sure how often you may need it though:)

  3. Oleg says:

    Minor typo in the internal int DifferentScopes_b__1() method.
    Comment should be // body of the lambda ‘b’

  4. Maxim says:

    Thank you Sergey for the great article.
    I have two questions:
    1. What is the root cause of the behavior when the same closure type is used across multiple delegates/functions? Isn’t it clr team’s failure?
    2. Are there any reasons to use closures or situations when it’s a vital necessity? Most of the time i prefer to pass possible closure candidate as a function argument.

    1. 1. I don’t think this is a mistake from the CLR team (actually, by C# team, because closure is created by the compiler not by the runtime). Most likely it is a tradeoff between potential issue like one I’ve mentioned with VeryExpensiveObject and general cost of creating closure types and instances.
      2. I didn’t get this. Could you please explain what do you mean by “Most of the time i prefer to pass possible closure candidate as a function argument.”?

      1. Maxim says:

        I mean that it’s difficult for me to find out an example when we need closures. Probably, the next example could prevent those closure’s overhead?

        public void DifferentScopes(int arg)
        {
            Func<int, Func<int>> preventClosure = (val) => () => val;

            {
                int local = 42;

                Func<int> a = () => preventClosure(local)();
                Func<int> b = () => preventClosure(local)();
            }

            Func<int> c = () => preventClosure(arg)();
        }

        1. To avoid a closure allocation your lambda expression should not capture anything from the enclosing context. In your case `a` captures another function, which causes the allocation. In your case there are a lot of them, actually: https://sharplab.io/#v2:CYLg1APgAgDABFAjAbgLACgoGYECY4DCcA3hnOQjlACxwAiAlgGZMCmATqwHYAuAygGMA9gAdWAZwAUDXnACG7AOYBKDKUwBWADwyeAGgTbdAPmNwRnAG7ceBADZDxAV05wAvHEmW5d5e7OSfm5m3nZo6GoYunAOAj7ucNS44RhQRrxmcgmB/uZWNvaOLqySsT7KgeFpOhlwAEbZQWYWrNa8hc6cpUJxvpUYAL4Yqek8ZgKNuS1ttg6dJQoq/ehDK0A=

          > I mean that it’s difficult for me to find out an example when we need closures.
          The question should be the other way around: you don’t need closures, you need the expected behavior of a local or anonymous function that lets it use variables/arguments/instance/static fields naturally. The compiler then decides how to achieve this and what is needed for that. If your local/anonymous function doesn’t use anything from the enclosing context, then the generated code is more optimal. If it DOES use something, then the compiler has to glue together the context and the generated function in one entity – closure.

          1. Maxim says:

            Thank you, i think i got it.

  5. I found another situation where a heap allocation happens. (It doesn’t seem like you covered it in this article, but maybe it’s a special case of one of the situations you mentioned.) I blogged about it in detail at http://faithlife.codes/blog/2017/08/local-functions-and-allocations/ but in summary, a local iterator method or local async function that captures a local variable will allocate the compiler-generated class for its backing state machine, even if the outer function never invokes the local function.

    Your ReadLineByLine sample method exhibits this behaviour, which you can see by examining the IL in SharpLab: http://bit.ly/2xYxOWr

  6. Cool post!

    Some comments:
    1. Cases 1 and 3 are basically the same (iterator blocks)
    2. Case 2 can be achieved using async delegates as well
    3. Case 4 is pretty minor
    4. The first DifferentScopes_b should be DifferentScopes_a and the second “Body of the lambda ‘a'” should be “Body of the lambda ‘b'”
    5. Console.WriteLine(func()) should be Console.WriteLine(a())
    6. “stays alive until” should be “stays alive as long”
    7. ImplicitAllocation_b__0() should be ImplicitAllocation_a__0()

    BTW the shared closure behavior is a long time Resharper warning: https://stackoverflow.com/questions/13633617/why-does-resharper-tell-me-implicitly-captured-closure

  7. lindexi_gd says:

    What is the Mean and Error and StdDev means?

    1. Svick says:

      Benchmark.NET (the library used to run the benchmark) explains the terms like this:

      Mean : Arithmetic mean of all measurements
      Error : Half of 99.9% confidence interval
      StdDev : Standard deviation of all measurements

  8. Svick says:

    Unless you have a 70 GHz machine, I don’t think LocalFunctionInvocation is measuring what you think it’s measuring.

    1. Can you explain your thought, please:)
      As far as I can tell the timing for the local function is on par with a static method invocation…

      1. Svick says:

        Executing LocalFunctionInvocation once took 0.0142 ns. At 3 GHz, executing a single instruction is going to take 0.3333 ns. That tells me that the loop that was executing the benchmark has been eliminated, otherwise you couldn’t get less time per invocation than it takes to execute a single instruction. So you’re probably measuring some benchmarking overhead, not how long it actually takes to execute the invocation.

        1. Svick, you’re absolutely right. The numbers were that low because the runtime inlined the local function. I’ve updated the post with new data. The local functions are still faster than delegates, but the difference is more reasonable now.

  9. Yury Zholobov says:

    Second instance of “// Body of the lambda ‘a'” should read “// Body of the lambda ‘b'”.
    The star-reference technique is excessive and unnecessary in all the places you use it, in my humble opinion. 🙂

    Otherwise, thank you for the article.
