The more things change


A while back we got an MSDN feedback report about a crash a customer
was experiencing.  The customer had the following code:

class A : B {
}

class B : A {
}

class C : B {
override //Type space here
}

At this point the IDE would just up and vanish.  What
happened?  Well, (as you can probably guess from the code snippet)
we wanted to pop up the “override completion list” so we attempt to
walk the entire inheritance chain to find the list of overridable
methods, infinitely recurse, and *poof* the IDE is gone.  Now,
there are a lot of features that need to walk the inheritance chain
(regular completion lists after <dot> for one), but they dont’
have this problem.  Why?  Well, in order to prevent this
problem we have a specialized search class that performs a depth first search (DFS) over the type system and thus doesn’t run into this problem. 
Unfortunately, the “override completion list” didn’t use this search
class, instead it would just explicitly call into methods like “GetSuperType()” in a while
loop, and thus didn’t detect these circularities in teh graph.  So
this was pretty simple to fix.  We moved over to using the search
class, and we also did a full code review of all GetSuperType calls to
make sure they weren’t doing anything similar that would cause problems.



However, in the whidbey time we recently discovered that this solution
was not good enough and we could still run into problems with
circularities.  How?  Well, consider the following code:

interface ISet<A> : IMap<A,A> {
}

interface IMap
<A,B> : ISet<ITuple<A,B>> {
}

interface ITuple
<A,B> {
}

class C : IMap
<int,string> { //try to invoke implement-interface here
}

In this case we have a similar, but subtly different, problem going on
here.  Implement-Interface (II) does use the DFS-class, but in
this case that class no longer sufficed.  Why?  Well, the way
II works is that it walks the interface inheritance tree to determine
the set of methods that needs to be implemented.  Any methods that
aren’t already implemented will then be stubbed out for you.  So
let’s start trying to figure out the total set of interfaces
implemented by “class C”.  We start with:

IMap<A,B> and we ask “what
interfaces do you have?” and we find out:

ISet<ITuple<A,B>> to which we ask what
interfaces do you have?” and we find out:

IMap<ITuple<A,B>,ITuple<A,B>> to which we ask to which
we ask “what interfaces do you have?”:

ISet<ITuple<ITuple<A,B>,ITuple<A,B>>> to which we ask to
which we ask “what interfaces do you have?”:

IMap<ITuple<ITuple<A,B>,ITuple<A,B>>,ITuple<ITuple<A,B>,ITuple<A,B>>>

Ad infinitum.



So the question then arises “how can i determine the complete set of
interfaces that a class implements, while ensuring that recursive
generic definitions don’t expand like above”.  It’s a rather fun
problem and so far i like the current solution we have for this.


Comments (15)

  1. loc says:

    i copied the sample code to whidbey and it went berserk… haha interesting.

    to which we already asked "what interfaces do you have?"

  2. damien morton says:

    This is all very interesting, but why is ISet<A> derived from IMap<A,A>?

    One another node, any knowlege to share on supprting variable arity generics in c# 3.0?

    i.e. ITuple<A,…>

  3. Dr Pizza says:

    "This is all very interesting, but why is ISet<A> derived from IMap<A,A>? "

    Because a set can be regarded as a map where each element is its own key.

  4. CyrusN says:

    Damien: Sets and Maps are somewhat interchangable. Although i prefer IMap being a Set of Tuples.

    The problem with a set being a map from keys’ to themselves is enforcing internal consistancy. When you say: SetAt("foo", "bar"), something needs to go in and ensure that the key and hte value are the same given some equality criteria.

    So i was modifying some of this collections stuff and i was changing this stuff around and *Boom*.

    I can’t talk about C# 3.0. However, i would say that having ITuple2/ITuple3/ITuple… really sucks.

  5. damien morton says:

    I tend to think that Sets are related to Maps, but I disagree that a Set is a Map of A->A. Your consistency issue is one example of the problems that causes.

    Id rather have Map defined in terms of a Set of ‘keyed’ objects, but then, if you’re building on top of Map, youre screwed.

    The whole tuple arity issue is stumping me, and Im at the point where Im thinking that dynamically generating Tuple<T1,T2,T3,T4….> classes on demand is the only way to go, especially if I want to avoid boxing value types. I havent actually tried this yet, though.

  6. Dr Pizza says:

    "I tend to think that Sets are related to Maps, but I disagree that a Set is a Map of A->A. Your consistency issue is one example of the problems that causes. "

    That it may cause an implementation "problems" does not change its essential nature. It *is* a map that uses values as their own keys.

    "Id rather have Map defined in terms of a Set of ‘keyed’ objects, but then, if you’re building on top of Map, youre screwed. "

    I don’t think anyone actually does that, though. Java defines its set implementations as being implemented by maps; C++ Standard Libraries tend to do this (though of course it’s not actually defined that way), and I would expect that when .NET gets a containers framework worth a damn it’ll do the same too.

  7. damien morton says:

    I might be convinced that a set is a map of type->null, assuming that there was such as a thing as a null type. I know c# doesnt support that, and I’m pretty sure that Java doesnt – not sure about c++.

  8. Keith J. Farmer says:

    Damian: Actually, when I needed a Set, I merely created a Hashtable, and did key -> null mappings. Not null-type, but simply null value, as a dummy.

    Incidentally, on your arity issue, just jump right in on the dynamic method generation; it’s actually rather fun, and more useful than people are probably aware (I used it to get around the generics vs operator overloading battle). Don’t expect Intellisense to help you out much, of course, since the types would be created at runtime, unless you carefully construct a factory.

  9. damien morton says:

    Keith: And this is the reason why map is defined in terms of set rather than the other way around. The first thing people implement is a hashmap, and then they think about implementing a set. So they grab their hashmap and use it to implement the set, even going as far as to define the set interfafce to be derived from hashmap.

    If people build a map from a set<KeyValuePair<A,B>> then they wouldnt be tempted to defined the set interface in terms of map.

    Its set deriving from map is just so wrong – its merely a product of temporal serendipity and wooly minded thinking.

  10. Xepol says:

    I would suspect that having circular references of any type should probably annoy the system. And since these are clearly first level dependancies…

    Usually a list of refered objects in the ancestory ‘ll do. Before you add an item, see if it is already there, if so, bail. In this case, it doesn’t even matter what a and b are in the generics, this just can’t ever work.

  11. Dr Pizza says:

    "Keith: And this is the reason why map is defined in terms of set rather than the other way around. The first thing people implement is a hashmap, and then they think about implementing a set. So they grab their hashmap and use it to implement the set, even going as far as to define the set interfafce to be derived from hashmap. "

    What would your initial "hash set" implementation look like? Consider that a "hash map" in the basic case is itself a map of "hash code" to "value". Similarly a red-black tree is a map from "partially orderable keys" to "nodes". Even an array is a mapping from "integral index" to "value". What data structure would you use to usefully implement your set ("usefully" means "you can’t use linked lists") that doesn’t possess some kind of a "mapping" characteristic?

    It’s easy to go from hash map or red-black tree to a kind of "degenerate" case where the hash codes or partially orderable keys are the same as the values. It’s not obvious how things should work the other way.

    "Its set deriving from map is just so wrong – its merely a product of temporal serendipity and wooly minded thinking. "

    What’s wrong about it? In what sense is a set not a map whose keys are its values?

  12. damien morton says:

    drPizza:

    Im thinking of implementations, rather than interfaces – a hashtable designed to be a map has entries of the form entry(hash, key, value). Its preferable to have entries of the form entry(hash, keyvaluepair(key,value)), so that you can easily drop the value part and use it as a set.

    This allows for a single generic hashtable implementation to be efficiently re-used as a set OR a map, with the set having 2/3 the memory usage of the map (assuming the key and value are both reference types).

  13. CyrusN says:

    Xepol: "Usually a list of refered objects in the ancestory ‘ll do. Before you add an item, see if it is already there, if so, bail. In this case, it doesn’t even matter what a and b are in the generics, this just can’t ever work"

    That’s not actually the case. It’s quite possible for a type to derive from both IList<int> and IList<string>. So even though they will bring in the same interface (ignoring generics), you cannot restrict this case.

  14. Radu Grigore says:

    I’d try a modified DFS. The nodes of the graph are incomplete specified classes like Map<A, B> (as opposed to "complete" classes like Map<int, int>). The trick is to keep track of the derivations already traveled. We can tag these with numbers:

    0. Set<A,B> -> Map<A,A>

    1. Map<A,B> -> Set<Tuple<A,B>>

    I’ll also asume we have similar integer tags for nodes, just to keep the pseudocode simple (you can always use a map instead of the implied array).

    dfs(n)

    __if seen[n] return

    __seen[n] <- T

    __process(n)

    __for each derivation d of n // i.e. derived[d] = n

    ____if dseen[d] continue

    ____dseen[d] <- T

    ____dfs(base[d])

    ____dseen[d] <- F

    __seen[n] <- F // should handle multiple inheritance

    This uses two arrays (seen and dseen) that keep track what nodes/edges were seen.