Internal and External Iterators


I’ve updated ICollection<A> in the following manner:



   public interface ICollection<A> : IEnumerable<A>


    {


        /// Previous stuff


 


        /// <summary>


        /// Iterates over the members in this collection applying ‘p’ to each element.


        /// The return value of p(element) indicates if iteration should proceed.  i.e.


        /// If p(element) returns false then iteration stops.  If it returns true then


        /// iteration continues


        /// </summary>


        /// <param name=“p”>The function to apply to the elements in this Collection.


        /// The return value says when iteration should continue</param>


        void Iterate(Predicate<A> p);


    }


It not only extends IEnumerable<A> so you can foreach over collections, but it also provides an internal iterator.  This exists because for a vast majority of cases, writing an efficient internal iterator is far easier than the equivalant external one.  I mentioned the case of an external iterator on a tree, and I recommend you try it and see what it’s like


Now: since I’m writing a library one of my core goals is to make extensibility very easy.  I.e. i would like to provide ‘almost’ implementation with almost all functionality basically implemented.  I say ‘basically’ because I mean that the implementation might be quite dumb and enneficient, however subclasses will be free to override with specialized versions for themselves.  In other words, I’d like abstract implementations with almost all the work taken care of for you.  So I started to write AbstractCollection<A> and I ended up in an interesting situation:



    public abstract class AbstractCollection<A> : ICollection<A>


    {


        public void Iterate(Predicate<A> p)


        {


            foreach (A element in this)


            {


                if (!p(element))


                {


                    break;


                }


            }


        }


 


        public IEnumerator<A> GetEnumerator()


        {


            this.Iterate(delegate(A element)


            {


                yield return element;


            });


        }


    }


As you can see, I have a pair of mutually recursive methods.  This is by design.  The intent is that subclasses only need implement one of the methods in order to get both.  I.e. if you implement GetEnumerator() then you will get Iterator(Predicate<A> p) for free, and vice-versa.  However when I tried to compile I was met with the dissapointing error:

 Error 1  The yield statement cannot be used inside anonymous method blocks 

Oh noes!  How do I work around this?  Is it possible to define GetEnumerator in terms of Iterate?  If you can think of a way how, let me know!


Comments (8)

  1. AT says:

    Framework convert your methods with yield into IEnumerator/IEnumerable delivered class with internal state of method.

    As you use arbitrary Iterate it’s impossible to save current state to restore it on next invocation.

    As an workaround we can use threading (yep, this is lame).

    Here is undebugged example:

    public virtual IEnumerator<A> GetEnumerator()

    {

    A current = A.default;

    bool finished = false;

    AutoResetEvent nextReady = new AutoResetEvent(false);

    AutoResetEvent iteratorCanGo = new AutoResetEvent(true);

    Predicate<A> pred = delegate(A element)

    {

    iteratorCanGo.WaitOne();

    current = element;

    nextReady.Set();

    return true;

    };

    ThreadStart iterator = delegate()

    {

    try

    {

    this.Iterate(pred);

    iteratorCanGo.WaitOne();

    }

    finally

    {

    finished = true;

    nextReady.Set();

    }

    };

    Thread t = new Thread(iterator);

    t.Start();

    while (!finished)

    {

    nextReady.WaitOne();

    if (finished) yield break;

    yield return current;

    iteratorCanGo.Set();

    }

    }

    I was unable to find how to better create fiberthreads-like processing in C#. Used two events to signal processing.

    Note it has at least two (but big) bugs (examples – calling IEnumerator.Reset() or not reading enumeration completely) and not recomemded for any use other that education purpose ;o) !!

  2. AT: I am very scared by your implementation (but I don’t think that comes as any suprise to you) 🙂

    Note: I’m not requiring the use of ‘yield’ and the subsequent compiler manipulations that occur when you do that. However, I would be interested to know if there was a way (without threading hacks) to create an IEnumerator<A> that worked properly on top of the Iterate method.

    BTW: IEnumerator<A> doesn’t have the reset method like IEnumerator does. Do we don’t need to implement that.

  3. SteveJS says:

    Waste memory version:

    public IEnumerator<A> GetEnumerator()

    {

    Queue<A> qa = new Queue<A>();

    this.Iterate(new Predicate<A>(delegate(A element)

    {

    qa.Enqueue(element);

    return true;

    }));

    while (qa.Count != 0)

    {

    yield return qa.Dequeue();

    }

    }

    or Waste time version:

    public IEnumerator<A> GetEnumerator()

    {

    Queue<A> qa = new Queue<A>();

    this.Iterate(new Predicate<A>(delegate(A element)

    {

    qa.Enqueue(element);

    return false;

    }));

    int skip = qa.Count;

    while (qa.Count != 0)

    {

    yield return qa.Dequeue();

    int i = 0;

    this.Iterate(new Predicate<A>(delegate(A element)

    {

    if (i < skip)

    {

    i++;

    return true;

    }

    qa.Enqueue(element);

    return false;

    }));

    skip += qa.Count;

    }

    }

    or dynamically waste a little of each:

    public IEnumerator<A> GetEnumerator()

    {

    Queue<A> qa = new Queue<A>();

    this.Iterate(new Predicate<A>(delegate(A element)

    {

    qa.Enqueue(element);

    return false;

    }));

    int skip = qa.Count;

    while (qa.Count != 0)

    {

    while (qa.Count != 0)

    {

    yield return qa.Dequeue();

    }

    int i = 0;

    int ct = skip;

    this.Iterate(new Predicate<A>(delegate(A element)

    {

    if (i < skip)

    {

    i++;

    return true;

    }

    if (ct > 0)

    {

    ct–;

    qa.Enqueue(element);

    return true;

    }

    return false;

    }));

    skip += qa.Count;

    }

    }

    Overall though I’d say you should remove this as a goal for your interface.

  4. AT says:

    Cyrus:

    I believe there is no any way to solve this problem without storing state of Iterate function between each MoveNext() call.

    So it’s not possible to implement your design. You have to push subclasses to implement GetEnumerator and keep states of their Iterate-like functions inside subclasses of IEnumerator<A>.

    There is very tricky issues with try/catch/finally processing inside Iterate.

    Consider simple fact – Microsoft was unable to allow yield inside finally and inside any of try with catch and simple catch blocks.

    Any of this problem prevent from creating reliable method state (which can include not only values of local variables – but also a statuses of exception monitors). But any of this are allowed in your generic Iterate function.

    I believe that it’s theoretically possible to create FSA for exceptions and try/finally blocks – but somebody inside Microsoft decided to not do this complex job.

    If you will come up with solution for generic Iterate – you will solve all this problem with yield and will be able to integrate them inside compiler.

    As for my code – I was simply checking if it will really work. Probably this scare you most ;o)

    Before giving advice to use threading I’ve to check it first.

    The major problem with threading is that you do not know that user expect to do with your enumeration – has it finished reading or nope yet.

    There is no Dispose inside IEnumerator – but this will be good for example for database, filesystem or thread driven enumerations ;o)

    P.S.> Pay attention to smileys.

  5. AT says:

    SteveJS:

    Iterate() is not obligated to preserve ordering.

    There is nothing that must prevent from giving the same elements in different order.

    But it’s a better tradeoff compared to threading ;o)

  6. AT: I agree. I am able to solve the problem with Continuations to transform an internal iterator into an external. But as those are unavilable in C# I don’t see how I can do this. It’s unfortunate because i can do this pretty trivially in Scheme or Ruby. (ok, the code is trivial, understanding the continuations isn’t 🙂 ). I’m going to do some work to find out what it would take to get mimic the closure functionality I would need in C#. It’s possible to do as well with some sort of freakish use of setjmp longjmp, but I don’t know enough IL to know if that possible/allowed.

  7. Thanks Justin. I like how you broke out the logic on whether to continue iterating or not.