Immutable instances and deferred references

This will hopefully be the last post on the subject of building immutable circular references (but I make no promises).

In today's post, we'll examine the use of deferred references to build circular references for immutable objects.

First Solution
The way this is typically accomplished is to associate a name or identifier with each object, and have the objects keep track of these names rather than of each other. The references aren't resolved until they are needed, at which point all the objects have presumably been created.

For example, in the following sample, you'll see that the CircularRef class now takes its own name, and then the name of the object it references. We don't resolve these until we're in the WriteOut method, which is the only place where the reference is needed.

using System;
using System.Collections.Generic;

class C {
  public static void Main() {
    var a = new CircularRef("a", "b");
    var b = new CircularRef("b", "a");
    var refs = new Dictionary<string, CircularRef>();
    refs[a.Name] = a;
    refs[b.Name] = b;

    Console.WriteLine("Done!");
    a.WriteOut(refs, new HashSet<CircularRef>());
  }
}

public class CircularRef {
  private static int idGen;
  private readonly int id;
  private readonly string name;
  private readonly string theOtherOne;

  public CircularRef(string name, string theOtherOne) {
    this.id = idGen++;
    this.name = name;
    this.theOtherOne = theOtherOne;
  }

  public string Name { get { return this.name; } }

  public void WriteOut(
      Dictionary<string, CircularRef> refs, 
      HashSet<CircularRef> visited) {
    if (visited.Add(this)) {
      Console.WriteLine("Found " + this.id + " (" + this.name + ")");
      refs[this.theOtherOne].WriteOut(refs, visited);
    } else {
      Console.WriteLine("Breaking loop at " + this.id);
    }
  }
}

In this case, we're using a simple generic Dictionary, as it's the simplest thing that works, but sometimes this is a more complex object that can do tricks with the name or maintain scoped levels for names.
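To make the "scoped levels for names" idea a bit more concrete, here is a minimal sketch of what such a lookup object might look like. The NameScope class and its Declare and TryResolve members are names I made up for illustration; they aren't part of the sample above, and a real system would likely need more than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scoped name table: each scope falls back to its parent.
public class NameScope<T> {
  private readonly NameScope<T> parent;  // null for the outermost scope
  private readonly Dictionary<string, T> names = new Dictionary<string, T>();

  public NameScope(NameScope<T> parent) { this.parent = parent; }

  public void Declare(string name, T value) { this.names[name] = value; }

  // Looks in this scope first, then walks outward through the parents.
  public bool TryResolve(string name, out T value) {
    for (var scope = this; scope != null; scope = scope.parent) {
      if (scope.names.TryGetValue(name, out value)) return true;
    }
    value = default(T);
    return false;
  }
}

public static class ScopeDemo {
  public static void Main() {
    var outer = new NameScope<string>(null);
    var inner = new NameScope<string>(outer);
    outer.Declare("a", "outer a");
    outer.Declare("b", "outer b");
    inner.Declare("a", "inner a");  // shadows the outer "a"

    string found;
    inner.TryResolve("a", out found);
    Console.WriteLine(found);  // inner a
    inner.TryResolve("b", out found);
    Console.WriteLine(found);  // outer b
  }
}
```

The lookup is still deferred in the same way as with the plain Dictionary; the only difference is where the name is searched for when it is finally resolved.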

There is one problem with this approach, though: the caller of a specific method on CircularRef has to maintain all the references and pass them in. CircularRef can't resolve the name on its own, so if another method wants to follow the reference, all callers of that method now have to pass in the dictionary as well. It's another case of brittle code, where local changes propagate much further than they should.

Variations
An alternative is to pass in the dictionary at construction time and have CircularRef store it. The upside is that now the reference can be resolved at any point. The downside is that the object is "a bit" more mutable. The reference to the dictionary will never change, but the contents of the dictionary certainly will, as there is no way that the dictionary is fully populated by the time the constructor is called (otherwise we're in the business of direct references, which we've already looked at in the previous post).

Once the CircularRef class can resolve the reference at any time, it can be tempting to expose the reference as a property that internally does the lookup, maybe something along these lines:

public CircularRef TheOtherOne {
  get { return this.refs[this.theOtherOne]; }
}

The problem here is that now it's even more obvious that the dictionary contents change, and it's no longer a matter of making sure CircularRef waits before dereferencing the name; any user of the property also needs to wait until the dictionary is fully populated, and there's no clear indication of when that might be the case.
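The timing hazard is easy to reproduce. The following is a small sketch, assuming a hypothetical StoredRef class (a cut-down CircularRef that stores the dictionary it is given); the class and the demo are mine, written only to show what happens when the property is read too early.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of CircularRef that keeps the shared dictionary.
public class StoredRef {
  private readonly string name;
  private readonly string theOtherOne;
  private readonly Dictionary<string, StoredRef> refs;

  public StoredRef(string name, string theOtherOne,
      Dictionary<string, StoredRef> refs) {
    this.name = name;
    this.theOtherOne = theOtherOne;
    this.refs = refs;
  }

  public string Name { get { return this.name; } }

  // Looks the name up on every access; throws if it isn't registered yet.
  public StoredRef TheOtherOne {
    get { return this.refs[this.theOtherOne]; }
  }
}

public static class TimingDemo {
  public static void Main() {
    var refs = new Dictionary<string, StoredRef>();
    var a = new StoredRef("a", "b", refs);
    refs[a.Name] = a;

    try {
      Console.WriteLine(a.TheOtherOne.Name);  // "b" isn't in the dictionary yet
    } catch (KeyNotFoundException) {
      Console.WriteLine("Too early: b does not exist yet");
    }

    var b = new StoredRef("b", "a", refs);
    refs[b.Name] = b;
    Console.WriteLine(a.TheOtherOne.Name);  // b
    Console.WriteLine(object.ReferenceEquals(b.TheOtherOne, a));  // True
  }
}
```

Nothing in the type tells a caller which of the two reads of a.TheOtherOne is the safe one; that knowledge lives entirely in the code that populates the dictionary.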

For these cases, I tend to favor the design where the object isn't fully immutable, but instead is constructed in two phases: an instance is created and initialized, then all the references are resolved and the instance is made immutable before exposing it to the rest of the system. So I might have a method that does all the creation of the objects, then does a second pass wiring them up, and then makes sure the instances are all read-only before returning them or otherwise making them accessible.

This is generally a good tradeoff, as it clearly delineates a period of time where objects may change, keeps the classes immutable for the purposes of most of the codebase, and allows a single piece of code outside the class itself to handle all the bookkeeping and do all dereferences at once (if a lookup is going to fail, it's good to know when).
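Here is a rough sketch of what that two-phase construction could look like. Node, Resolve and NodeBuilder are hypothetical names chosen for this example, and the "frozen" flag is just one of several ways to make an instance read-only; treat it as an illustration of the idea rather than a finished design.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical two-phase node: mutable while being wired, read-only after.
public class Node {
  private readonly string name;
  private Node other;
  private bool frozen;

  public Node(string name) { this.name = name; }

  public string Name { get { return this.name; } }

  public Node Other {
    get {
      if (!this.frozen) {
        throw new InvalidOperationException(this.name + " isn't wired up yet.");
      }
      return this.other;
    }
  }

  // Phase two: sets the reference exactly once, then locks the instance.
  internal void Resolve(Node other) {
    if (this.frozen) {
      throw new InvalidOperationException(this.name + " is read-only.");
    }
    if (other == null) throw new ArgumentNullException("other");
    this.other = other;
    this.frozen = true;
  }
}

public static class NodeBuilder {
  // Takes a map of name -> referenced name and returns fully wired nodes.
  public static Dictionary<string, Node> Build(
      Dictionary<string, string> links) {
    var nodes = new Dictionary<string, Node>();
    // Phase one: create every instance, with no references resolved.
    foreach (var name in links.Keys) nodes[name] = new Node(name);
    // Phase two: do every lookup in one place, so a bad name fails here.
    foreach (var link in links) {
      Node target;
      if (!nodes.TryGetValue(link.Value, out target)) {
        throw new ArgumentException("Unknown name: " + link.Value);
      }
      nodes[link.Key].Resolve(target);
    }
    return nodes;
  }
}

public static class BuildDemo {
  public static void Main() {
    var links = new Dictionary<string, string> { { "a", "b" }, { "b", "a" } };
    var nodes = NodeBuilder.Build(links);
    Console.WriteLine(nodes["a"].Other.Name);  // b
    Console.WriteLine(
        object.ReferenceEquals(nodes["a"].Other.Other, nodes["a"]));  // True
  }
}
```

Because Build is the only code that ever sees an unresolved Node, the rest of the system only gets instances whose Other reference can no longer change.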

Hope the series wasn't too dense and that it provided, if not some food for thought, at least some structure for thinking about the different design tradeoffs.

Enjoy!

PS: for practical examples, you can look at most any system that can use a name before it's declared, whether it be the parser for C# or a XAML page.