Thoughts on the right way to indicate failure in an API


I’m writing down the API for my IMap<A,B> interface. In other systems it has the name Dictionary, Associative Array, or Map. I prefer the last because it seems to be just a way of mapping a domain (A) to a range (B). The basics of the interface are three operations and two properties:


  • GetAt(A a)
  • SetAt(A a, B b)
  • RemoveAt(A a)
  • ISet<A> Domain { get; }
  • ICollection<B> Range { get; }
However, I’m a little unsure about what the signatures should look like. I was considering “B GetAt(A a)”, but then there’s the issue of what happens when you try to get using an element that isn’t in the map. I think returning default(B) is a bad idea because you can’t distinguish the failure case from the case where the key actually maps to default(B). I could throw an exception; however, based on personal experience, this has a very serious effect on performance. In one of my applications a Map was used as the core data structure, ‘Get’ing was easily 75% of the app time, and it was extremely common to try to get values that weren’t in the map. If I had to either catch the exception or do an “if (map.Domain.Contains(a))” check before every get, I’d seriously hurt performance.
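To make the default(B) ambiguity concrete, here is a minimal sketch. It uses the framework's Dictionary<TKey,TValue> as a stand-in for the map, and GetAtOrDefault is a hypothetical helper invented for illustration, not part of IMap.

```csharp
using System;
using System.Collections.Generic;

public static class DefaultValueAmbiguity
{
    // Hypothetical helper: returns default(B) when 'a' is not in the map.
    public static B GetAtOrDefault<A, B>(IDictionary<A, B> map, A a)
    {
        B b;
        return map.TryGetValue(a, out b) ? b : default(B);
    }

    // Returns true when a present key and a missing key yield the same result.
    public static bool CasesLookIdentical()
    {
        var map = new Dictionary<string, int>();
        map["zero"] = 0;

        int present = GetAtOrDefault(map, "zero");   // key exists and maps to 0
        int missing = GetAtOrDefault(map, "absent"); // key does not exist

        return present == missing;
    }
}
```

Both lookups produce 0 here, so a caller holding only the return value has no way of knowing which case occurred.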


Because of this use of a map I see it as part of the expected behavior that someone might query what a value maps to when that value isn’t in the Domain of the map. In other words, I don’t think that’s exceptional behavior, but expected behavior, and the API needs to take that into account. There are a couple of ways that could be done. One is to use C#’s ‘out’ parameters (a feature I’m not a fan of) to indicate whether the value was in the domain: “B GetAt(A a, out bool inDomain)”. If ‘a’ was in the domain then the value is returned and “true” is passed out. If ‘a’ isn’t in the domain then “false” is passed out and the return value is undefined. The other alternative is to use the Optional type to return both the value and the bool at once: “IOptional<B> GetAt(A a)”. If the map doesn’t contain the value in the domain then an IOptional whose HasValue is false is returned. If the map does contain the value in the domain then an IOptional whose HasValue is true and whose Value is the element from the range is returned.
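The two candidate signatures can be sketched side by side. Everything except the two GetAt signatures and the IOptional members is an assumption made for illustration: Optional<T>, its Some/None members, and SimpleMap<A,B> are hypothetical names, and having Value throw when nothing is present is just one possible choice.

```csharp
using System;
using System.Collections.Generic;

public interface IOptional<T>
{
    bool HasValue { get; }
    T Value { get; }
}

// Hypothetical minimal implementation of IOptional<T>.
public sealed class Optional<T> : IOptional<T>
{
    private readonly bool hasValue;
    private readonly T value;

    private Optional(bool hasValue, T value)
    {
        this.hasValue = hasValue;
        this.value = value;
    }

    public static readonly Optional<T> None = new Optional<T>(false, default(T));
    public static Optional<T> Some(T value) { return new Optional<T>(true, value); }

    public bool HasValue { get { return hasValue; } }

    public T Value
    {
        get
        {
            // Assumption: reading Value without checking HasValue is a caller bug.
            if (!hasValue) throw new InvalidOperationException("No value present.");
            return value;
        }
    }
}

// Hypothetical map exposing both candidate signatures over a single lookup.
public sealed class SimpleMap<A, B>
{
    private readonly Dictionary<A, B> items = new Dictionary<A, B>();

    public void SetAt(A a, B b) { items[a] = b; }

    // Option 1: an out parameter reports whether 'a' was in the domain.
    public B GetAt(A a, out bool inDomain)
    {
        B b;
        inDomain = items.TryGetValue(a, out b);
        return b; // default(B) when inDomain is false; callers must not rely on it
    }

    // Option 2: the value and the flag travel together.
    public IOptional<B> GetAt(A a)
    {
        B b;
        return items.TryGetValue(a, out b) ? Optional<B>.Some(b) : (IOptional<B>)Optional<B>.None;
    }
}
```

Either way the map is searched once per call; the difference is only in how the "was it there?" answer reaches the caller.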


Any other ideas for how this should be handled? Are there any styles that you think are better than this? Compelling reasons for exceptions, out params, optional types?


Another thing that came up while talking to Neil about the collections is what the interface for IList would look like. Currently it’s:

public interface IList<A> : ICollection<A>, IMap<int,A>, IRange<int> {

}
This inheritance chain made sense to me because you can think of a list as a collection of values and a mapping from the integers to those values. However, this leads to an interesting development. The methods for IList now look like this:
    IOptional<A> GetAt(int index);
    IOptional<A> SetAt(int index, A value);
    IOptional<A> RemoveAt(int index);
This is quite different than the list APIs that I am used to, which normally throw when you try to get an element using an index that isn’t in the range 0 to list.Length - 1. However, in this version you’ll instead get an IOptional back and it will never throw. I think I could get used to this; however, I’m unsure how galling others might find it. The more I use it, the more I like it. It’s more cumbersome in the case where you know you are indexing ok into the list, but it’s far less cumbersome in the case where you’re not and you’re catching IndexOutOfRangeExceptions.
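A rough sketch of the two call-site styles, written against the framework's List<T> rather than the IList described here. Optional<A>, ListAccess, and the Describe methods are hypothetical names. Note that List<T> signals a bad index with ArgumentOutOfRangeException, while arrays use IndexOutOfRangeException.

```csharp
using System;
using System.Collections.Generic;

public interface IOptional<A>
{
    bool HasValue { get; }
    A Value { get; }
}

// Hypothetical implementation; a missing value simply reports HasValue == false.
public sealed class Optional<A> : IOptional<A>
{
    public Optional() { }
    public Optional(A value) { HasValue = true; Value = value; }
    public bool HasValue { get; private set; }
    public A Value { get; private set; }
}

public static class ListAccess
{
    // Non-throwing lookup in the style of IOptional<A> GetAt(int index).
    public static IOptional<A> GetAt<A>(IList<A> list, int index)
    {
        if (index < 0 || index >= list.Count) return new Optional<A>();
        return new Optional<A>(list[index]);
    }

    // Conventional style: index directly and catch the failure.
    public static string DescribeThrowing(List<string> list, int index)
    {
        try { return list[index]; }
        catch (ArgumentOutOfRangeException) { return "(none)"; }
    }

    // Optional style: one call, then a plain bool check.
    public static string DescribeOptional(List<string> list, int index)
    {
        IOptional<string> item = GetAt(list, index);
        return item.HasValue ? item.Value : "(none)";
    }
}
```

When the index is known to be valid, the optional style costs an extra HasValue check; when it may not be, it replaces the try/catch with an ordinary branch.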

Edit: The code for IOptional<T> looks like this:


 

public interface IOptional<A> {
    bool HasValue { get; }
    A Value { get; }
}

Comments (49)

  1. BradA was mentioning a similar problem with the change from Hashtables to Dictionaries (http://blogs.msdn.com/brada/archive/2004/04/26/120438.aspx). FWIW my take is at http://blogs.geekdojo.net/pdbartlett/archive/2004/04/27/1772.aspx

  2. David Levine says:

    I’d avoid using the default value as the return value because the default value is in the range of legal values and therefore cannot always be used to indicate the absence of a value. If the goal is to use a sentinel value to indicate that a key is not contained in the collection then one way is to let the user define the sentinel value to use; perhaps set during construction or a property that can be changed, and use the nullable value as the default sentinel value.

    I’m not familiar with IOptional (I haven’t looked at the Whidbey bits) but it does sound like a useful approach. If you could write code like this…

    if ( (b=GetAt(a)).HasValue )

    { // use b.Value

    }

    then it’s compact and intuitive to use. I don’t think it saves much in perf terms because it just changes the test from before the table access to afterwards, but it does result in one fewer call to the collection object itself.

    There are also usage issues. There are times where I "know" what’s in the table, and using a key that is not contained in the map indicates a programming error. In this case I’d want it to throw an exception. Also, for perf sensitive code I’d prefer to avoid the hit of testing the return value on each access and if an invalid key is used I want it to throw.

    I think both models are valid; perhaps this argues for two classes. One that uses return values and another that throws.

  3. Aleksei Guzev says:

    In databases we have NULL indicating absence of value. Isn’t it good to return something like ‘null’?

    It would be nice if the method signature included exceptions thrown. But it would be difficult if not impossible to decide at compile time which overloaded variant to call.

  4. Well in my class we always indicate failure by returning a big fat F and a "See me after class". It’s very useful… maybe you could do something like that?

  5. James Bibby says:

    Maybe I am missing something but why not return null? It’s easy to test for, indicates that nothing was found (or that the found value was null…. which in many cases result in the same program behaviour) Is there a reason that everyone is examining only exceptions or some kind of return value? I think that the following is a fairly efficient way to indicate not found….

    B b = a.GetAt();

    if (b == null)

    {

    //deal with not found / null found case

    }

    -J

  6. damien morton says:

    Most of the C# exception advice is tautological "only throw exceptions in exceptional circumstances".

    The rule the Python people use is easier to understand: the samurai principle "return a useful result else throw an exception".

    The samurai principle would suggest that you have a GetAt() method that throws on lookup failure, and a TryGetAt() which returns a boolean indicating success or failure and potentially places an output value into an out parameter.

    Of course, one could dispense with out parameters if C# had tuples like the "nice" language: http://nice.sourceforge.net/manual.html#id2493192

  7. Dan Golick says:

    Use nullable types to solve the problem. This way your map can handle reference and value types.

    How about IMap<A, B?>

    Now you can have

    B? b = map.GetAt(a);

    if (b.HasValue)

    {

    }

  8. S N says:

    These functions should return a boolean value indicating whether the operation was performed successfully or not. If it was successful, the result should be returned as either a ref or an out parameter.

  9. Orion Adrian says:

    I think I discussed this earlier in another post, but simply the best signature I’ve found for this is.

    T value;

    if( x.Find( out value ) )

    {

    //do something with value.

    }

    x.Find returns a boolean indicating if the value was found and if it was, outputs it to value. Note somebody else mentioned that above.

    No exceptions needed.

    Remember you can’t use the return value of a method unless you can guarantee that the return value will always be valid. In this case that can only be done using exceptions or, instead of returning the found value, by returning a boolean indicating found/not found.

    Orion Adrian

  10. TuesdaysComing: Do you want to go see a movie?

  11. I’m with Mr. Bibby. Null is the correct way to go. All of these IOptional style choices smell a lot like our old friend, the union. It would seem to me if you have "null" in your hand and someone asks you for what is in your hand, you give them "null"

  12. Orion Adrian says:

    "I’m with Mr. Bibby. Null is the correct way to go. All of these IOptional style choices smell a lot like our old friend, the union. It would seem to me if you have "null" in your hand and someone asks you for what is in your hand, you give them "null" "

    Unfortunately null is a perfectly acceptable answer to a question and it is in the answer space. There are going to be circumstances where I want to be able to store null and I need to know when the key isn’t being found.

    IOptional isn’t union, it’s something else. Though I’m not a fan of it, I’m not a fan of it for a completely different reason.

    After looking at the Nice language I do like the idea of tuple return types, though that doesn’t solve this issue. As far as I know, there are only two ways to solve this issue… exceptions and out parameters.

    Honestly my preferred method is throwing exceptions. There just needs to be a better way of handling them. Though remember, you can get around all sorts of exception problems by checking first (e.g. container.Contains(x) before calling container.GetAt(x)).

    Orion Adrian

  13. Sean Griffin says:

    I assume that IOptional is a struct/ class of some type and that this would have two values a return code and the value.

    This could be something like (in a mixture of c#, java and c++)

    public class Response

    {

    bool found;

    B value;

    public bool isFound()

    {

    return found;

    }

    public B getValue()

    {

    return value;

    }

    }

    so Get would be defined as

    public Response Get(A a)

    {

    // look up a and fill in the Response

    }

    and called like

    Response r = Get(a);

    if (r.isFound())

    {

    // use r.getValue();

    }

    This gives something akin to the tuples of "nice". Also if the map API was designed to be called/used as a remote cache it also allows things like more complicated error messages to be placed in the response such as a description.

    Is it possible to use templating to specify the response object that is returned? I’m a bit hazy on the C# implementation, and I have forgotten if it’s possible in C++.

  14. Aleksei: "null" is not a reasonable return value because it won’t work for generic collections. What if you have:

    IMap<int,int>. There is no null int that you can return for the GetAt method.

  15. Dan: I can’t use nullable types because nullable is restricted to value types. So you couldn’t have: IMap<string,string>

  16. S N: What makes the "out param" bool nicer than just an object that encapsulates that functionality?

  17. Orion: As I mentioned exceptions aren’t appropriate here because it’s not exceptional behavior in many domains to ask a map what an element maps to. In that case the map just says "i don’t map that to anything".

    "Unfortunately null is a perfectly acceptable answer to a question and it in the answer space"

    I don’t see how it’s an acceptable answer. In maps of value types, there is no null, so I can’t return a null.

    It’s also unclear if null is the value to return or if it indicates "not found"

  18. James: You said:



    B b = a.GetAt();

    if (b == null)

    {

    //deal with not found / null found case

    }

    But there are cases (like the value case) where ‘b’ can never be null.

  19. David: "I don’t think it saves much in perf terms because it just changes the test from before the table access to afterwards, but it does result in one fewer call to the collection object itself. "

    It saves a _lot_ in perf. Imagine the case of a Map backed by a tree. If you test containment and then get the element you incur two O(log n) calls. If you have the system that returns the IOptional you do one O(log n) call, and then perform a near instantaneous check of a bool.

    I’ve had code where that map access call was the most expensive part. If I had to double that I would see a literal halving in the speed of my app.
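One way to see the double lookup is to count key comparisons on a tree-backed map. This is only a sketch: it uses the framework's SortedDictionary<TKey,TValue> as the tree, its TryGetValue as a stand-in for a single IOptional-style lookup, and CountingComparer and LookupCost are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Counts key comparisons so the two access patterns can be compared.
public sealed class CountingComparer : IComparer<int>
{
    public int Count;
    public int Compare(int x, int y) { Count++; return x.CompareTo(y); }
}

public static class LookupCost
{
    public static SortedDictionary<int, string> Build(CountingComparer comparer, int size)
    {
        var map = new SortedDictionary<int, string>(comparer);
        for (int i = 0; i < size; i++) map[i] = "v" + i;
        return map;
    }

    // Test containment, then fetch: two walks down the tree.
    public static int ContainsThenGet(SortedDictionary<int, string> map, CountingComparer comparer, int key)
    {
        comparer.Count = 0;
        if (map.ContainsKey(key)) { string unused = map[key]; }
        return comparer.Count;
    }

    // Single lookup that reports success and the value together.
    public static int SingleLookup(SortedDictionary<int, string> map, CountingComparer comparer, int key)
    {
        comparer.Count = 0;
        string value;
        map.TryGetValue(key, out value);
        return comparer.Count;
    }
}
```

For a key that is present, ContainsThenGet should report noticeably more comparisons than SingleLookup, since the same path through the tree is walked twice.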

  20. Damien: Many languages have tuples and I agree that they are much much nicer than out parameters. However, in C# they don’t exist and so you generally create lightweight named structs/classes to solve the issue. That’s what I’ve done with the IOptional interface. They allow you to do the same thing you want to do here, namely "return boolean saying if the value was found and the value if the bool was true".

  21. damien morton says:

    Hi Cyrus,

    IOptional<T> is functionally identical to Nullable<T>, which I’m quite ambivalent about. It would certainly make the simple act of querying a map for a value more expensive (GC _and_ CPU), and for that reason alone I don’t like it.

    I tend to think theres no global solution to this problem, and so, the answer is to provide both kinds of accessors.

    Map.Get (or throw an exception)

    Map.TryGet (and return some indicator of success/failure)

    Im not really fussed as to whether its done with an out param or not. Its fundamentally the same thing (although an out param might be more efficient)

  22. Damien: Nullable<T> only works on value types. IOptional<T> works on any type.

    I’m also confused. You say: "Im not really fussed as to whether its done with an out param or not" but then say "it would certainly make the simple act of querying a map for a value more expensive (GC _and_ CPU)"

    Also, you say:

    "I tend to think theres no global solution to this problem"

    In what way does each solution fail? I see the exception based solution failing. However, I don’t see the IOptional or out param based solution failing.

  23. Orion Adrian says:

    However the lack of null is an artificial restriction on value types. There are certainly uses for nulls for value types variables (just look at databases). So why would you limit value-type variables when you don’t have to.

    Orion Adrian

  24. Aleksei Guzev says:

    Anyway, the function should return something regardless of whether the mapping was defined or not. Since in some cases we would like to receive an exception and in others a bool, there should be two different functions with different signatures. Remember that in CLU the types of exceptions thrown are part of the method signature. I don’t know a reasonable algorithm for the compiler to decide which method to call if it allowed overloading methods by exception types. That is why defining two methods seems to be the best way.

    This positions the API closer to scripting languages where there is a lot of methods to do almost the same. Take a look at Python, Smalltalk.

  25. Orion: I’m not sure what you mean. Value types cannot be null. Nullable is restricted to value types to lift them into reference types which can be null.

    Aleksei: What benefit does throwing an exception give you? What information do you get out of that? Exceptions should be thrown in exceptional conditions. Asking whether a Map maps a value to anything doesn’t seem like an exceptional act. In my experience it’s an extremely common thing to do.

  26. Orion: To clarify, it’s _not_ an artificial restriction. it’s a restriction that exists with the CLR today. value types cannot be null. While I would like to change that for this API I cannot. So returning a Nullable is not possible with this API (unless I restrict the map to only having value types in it).

  27. damien morton says:

    I didnt realize that nullable is restricted to value types only. Seems like an odd restriction. On second thoughts, maybe not.

    I guess I did hedge my bets….

    IOptional and TryGet(k, out v) are functionally equivalent. Here’s what the syntax looks like:

    IOptional<V> v = map.TryGet(k);

    if (v.HasValue) f(v.Value);

    —- OR ————–

    V v;

    if (map.TryGet(k, out v)) f(v);

    I like the out param version better.. the out keyword is pretty clear and uncluttered.

  28. Aleksei Guzev says:

    As I understand the API should provide a mechanism to do something with the image of given object if any.

    What about implementing it as an iterator in Ruby or Smalltalk style? I.e., the action would be provided as a code block parameter to the method. The block simply will not be executed if the mapping is undefined. Does C# support this approach?

  29. Thanks for the feedback damien!

    Aleksei: Excellent suggestion. I love it. 🙂

    Yes. C# 2.0 can do this. Specifically you would write it as:

    IMap<int,string> map;

    map.GetAt(4, delegate (string s) {

    //do stuff with s

    });

    I’ll have to give this style a lot more thought, it’s very intriguing.

  30. damien: The restriction to value types exists because it doesn’t really make sense to have a nullable reference type. A reference type is already implicitly nullable. If you had something like:

    string? s = null

    string? t = new string?(null)

    what are the semantics of:

    s == null ?

    t == null ?

    s.HasValue ?

    t.HasValue ?

    In the value case the semantics are elevated to that of a reference type (which are well understood).

  31. Aleksei Guzev says:

    …or moreover one codeblock if defined, and another if undefined. Like in PostScript.

  32. Aleksei: True. Analogous to the two codepaths that would arise from the boolean return. Thanks!
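A sketch of the code-block style, including the two-block variant. CallbackMap<A,B> and CallbackDemo are hypothetical names, and the map is backed by the framework's Dictionary<TKey,TValue> purely for brevity. The blocks are C# 2.0 anonymous methods, and the demo also shows a block writing to a local variable of the calling method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical map whose lookups run a caller-supplied block.
public sealed class CallbackMap<A, B>
{
    private readonly Dictionary<A, B> items = new Dictionary<A, B>();

    public void SetAt(A a, B b) { items[a] = b; }

    // Runs 'found' only when 'a' is in the domain; otherwise does nothing.
    public void GetAt(A a, Action<B> found)
    {
        GetAt(a, found, delegate { });
    }

    // Two-block variant: exactly one of the blocks runs per call.
    public void GetAt(A a, Action<B> found, Action notFound)
    {
        B b;
        if (items.TryGetValue(a, out b)) found(b);
        else notFound();
    }
}

public static class CallbackDemo
{
    public static string Lookup(CallbackMap<int, string> map, int key)
    {
        string result = "unset"; // local captured by both blocks
        map.GetAt(key,
            delegate (string s) { result = "found " + s; },
            delegate { result = "not found"; });
        return result;
    }
}
```

The caller never sees a value that might be invalid: the value only exists inside the block that runs when the key is mapped.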

  33. I agree with Aleksei Guzev. Indeed, in the framework, the same problem arises with the Int32 Int32.Parse(string) method, which throws an exception when it fails. The bool Int32.TryParse(string, out Int32) method was then added in the Whidbey version.
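The Parse/TryParse pair mentioned above shows the same split in the framework itself. In this small sketch, ParseStyles and its two methods are hypothetical wrappers; int.Parse and int.TryParse are the real framework calls.

```csharp
using System;

public static class ParseStyles
{
    // Throwing style: a malformed string surfaces as a FormatException.
    public static string ViaParse(string text)
    {
        try { return "parsed " + int.Parse(text); }
        catch (FormatException) { return "failed"; }
    }

    // Try style: failure is an ordinary return value, no exception involved.
    public static string ViaTryParse(string text)
    {
        int n;
        return int.TryParse(text, out n) ? "parsed " + n : "failed";
    }
}
```

Both methods give the same answers for the same input; only the mechanism for reporting failure differs.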

  34. Patrick: Which part of Aleksei’s posts did you agree with? Passing along an anonymous block to operate on the mapped value, or returning a bool and an out param. If you prefer the latter do you have any reasons to like it better than returning the IOptional<A>?

  35. Thanks all! Excellent suggestions all around.

  36. Aleksei Guzev says:

    Passing an anonymous block is in theory the same as returning tuple and iterating through it with foreach statement. But the latter requires building the tuple object at runtime, while the former should build anonymous method at compile time.

    I like CLU-Smalltalk-Ruby iterator constructs for their ability to process huge sets of data with minimum buffering. This allows, for example, parallel processing: the codeblock does the work while iterator fetches the next item.

  37. Aleksei: There are problems with internal vs external iterators. There are, unfortunately, some things that are extremely difficult to do with the former and I think a good system will provide both models to the programmers to balance ease-of-use over power.

  38. AT says:

    I believe that x.Find( out value ) is nice.

    But returning IOptional<A> is also fine.

    Using IOptional you can do some kind of async and lazy evaluation. IOptional need not be simply a struct with values; it can be an IAsyncResult or some kind of cache storage that is filled only when the corresponding properties are accessed and freed under low-memory conditions.

    IOptional can also be extended to add more information about the result, for example the cost of accessing the data, the size of the data, or a query plan to get the most recent value if needed.

    A lot of this functionality will not be used in the near term and could be considered bloatware. But repeatable and non-sequential computation is the future.

  39. Orion Adrian says:

    Cyrus: I was _referring_ to the restriction the CLR put on you, not you. Essentially if you have a variable representing a set that can have a size of 0 or 1 the CLR forces you to make a 1-sized array. Which isn’t particularly nice.

    I’m not griping about your implementations with respect to the CLR, just the recent realization that there are a lot of fundamental problems with the CLR that could be addressed. It’s still, IMO, the best framework I’ve ever worked with. Don’t misunderstand my interest in improvement as not liking it.

    Orion Adrian

  40. AT: That’s very interesting. The lazy evaluation could be very powerful. I’ll have to do more reading of Okasaki’s book.

    Orion: Gripe all you want 🙂

    The more people who are upset about flaws in the current design and who see better ways out there to fix it, the better chance that it will happen!

  41. Orion: IMO that was one of those concessions that was made _purely_ for performance reasons. Based on research they felt they absolutely had to have some type that would be stack allocated.

    I’m letting people know about your opinions so they can think about these things as future advancements to the CLR

  42. Aleksei Guzev says:

    Will the code block have access to local (on stack) variables of the calling method?

    This ability is important since many loops are run to accumulate some data, like sum, maximum, or average.

  43. Aleksei: Yes, the code block will have access. It’s very similar to a closure (albeit with some caveats). Check out the C# 2.0 spec to see how they work. They’re called "anonymous delegates" or "anonymous methods".

  44. Damien: "IOptional and TryGet(k, out v) are functionally equivalent"

    I’m reminded of a relevant quote:

    Do or do not. There is no try.

    🙂