InfoQ

InfoQ

News

My Bookmarks

Login or Register to enable bookmarks for unlimited time.

The content has been bookmarked!

There was an error bookmarking this content! Please retry.

Bill McCarthy asks “Are Iterators Fundamentally Flawed?”

Posted by Jonathan Allen on Aug 05, 2008

Sections
Architecture & Design,
Development
Topics
.NET ,
Programming ,
Multi-threading ,
Parallel Programming

Iterators are at the core of .NET programming. Only rarely do developers actually work against indexed data, preferring to use for-each loops for most tasks. But is this inherently sequential access method appropriate as we turn more to multi-threaded applications?

Bill McCarthy writes,

If you consider an iterator primary task of getting the next element, then look at how it does it, you should observe a major design flaw… the operation is not atomic. An iterator’s implementation is IEnumerator or IEnumerator(Of T), and IEnumerator require you to first call MoveNext then read the Current property. If you allow multiple threads to use the same IEnumerator, then you could get a call sequence such as MoveNext, MoveNext, Current, Current, which would skip one item and repeat the next one. So IEnumerator is not well designed for threading. This is a known design limitation, but rather than address that, languages like C# have implemented their iterators to be a different iterator per thread. That is they not only recognise the limitation but they also enforce it.

Bill continues with notes about how a new IEnumerator would have to work in order to properly support multiple threads.

the whole concept of iteration points to a deeper flaw in most OO languages by andrew mcveigh Posted
Thats a pointless discussion to begin with. by Francois Ward Posted
Re: Thats a pointless discussion to begin with. by Al Tenhundfeld Posted
Re: Thats a pointless discussion to begin with. by Francois Ward Posted
Re: Thats a pointless discussion to begin with. by Jonathan Allen Posted
Re: Thats a pointless discussion to begin with. by Francois Ward Posted
Iterators are usually too low level by Kurt Christensen Posted
Iterators are a tool by Aaron Erickson Posted
don't see the problem by Ben Murphy Posted
  1. Back to top

    the whole concept of iteration points to a deeper flaw in most OO languages

    by andrew mcveigh

  2. Back to top

    Thats a pointless discussion to begin with.

    by Francois Ward

    With the advent of deeper integration of functional programming in .NET, with LINQ and extension methods, Foreach isn't used nearly as much (since usually, a foreach is there to do something else... often easily represented by a lambda and some of the LINQ Operators such as projections through select)



    At that point, its quite easy to multi-thread: and it is being done. MS is way ahead of this discussion, in the form of PLINQ, which has been discussed a lot here already. Once its released (along with the rest of the parallel framework), this will be a non-story.

  3. Back to top

    Re: Thats a pointless discussion to begin with.

    by Al Tenhundfeld

    I completely agree, Francois.



    This is one area where I think MS is ahead of the rest of the software community and certainly ahead of most developers using the MS technology stack.

  4. Back to top

    Re: Thats a pointless discussion to begin with.

    by Francois Ward

    I don't think they're ahead of the software community as a whole. There are many languages and toolkits that do similar things already.



    Are they doing it better? Thats subjective. I think considering the limitations of OOP and imperative languages (or hybrid with functions as first class constructs), what they're doing is very, very good.



    Ahead of most developers using MS Technology though? Definately. Basic multithreading in ASP.NET is quite easy and efficient (you can automatically "fork" all of your long running tasks, and they "join" back in the event where you're supposed to do your UI operations, so its perfect..), but almost no one uses it.

  5. Back to top

    Iterators are usually too low level

    by Kurt Christensen

    I think Bill is on to something. When most programmers do foreach, what they *really* mean to do is write a mapping function. And a mapping function could of course be implemented to operate across multiple nodes. Just one of the many ways as programmers we need to raise the levels of abstraction at which we think in order to play in the multicore playgrounds of the future.

  6. Back to top

    Iterators are a tool

    by Aaron Erickson

    Iterators serve a great function - you just have to know their limitations w/regard to threading - that is, don't use them in a multithreaded environment.

    Indeed, FP and F# should make a lot of this go away.

  7. Back to top

    Re: Thats a pointless discussion to begin with.

    by Jonathan Allen

    Underneath LINQ you are still using a IEnumerator. What limitations does this place upon libraries like Parallel LINQ?

  8. Back to top

    Re: Thats a pointless discussion to begin with.

    by Francois Ward

    None. You still have a set to go through to start the jobs. The jobs themselves have to be in parallel, not the iteration.



    The limitation of foreach loops in the article is that before you go to the next iteration, you have to finish the work for the current iteration. If you take all the work, and push it as a task in another thread, you can go to the next one right away (and the iteration itself is an insignificant operation for 99.9% of cases).

    So PLINQ will just iterate normally, but do the work in separate, throttled threads (throttled because if you have 1000 elements to work on, and you start 1000 threads, you'll bottleneck elsewhere and things will be slower, not faster, in many cases), and wait for all of the signals before telling the main thread that all the 1000 jobs are done.



    Typical throttled fork/join operations, just with a whole lot less pain :)


    As a note, PLINQ doesn't actually work on IEnumerables, but really on IParallel (or something). Before using PLINQ, you have to call the AsParallel() extension method to convert the IEnumerable to a parallizable construct. Still requires a standard enumerator originally, but again, even with hundreds of thousand elements, usually looping through is an insignificant operation.

  9. Back to top

    don't see the problem

    by Ben Murphy

    even if you have a thread safe access pattern that gives you back a tuple you still need the implementation to be thread safe. it is too costly to make all enumerators thread safe and MoveNext()/Current is easier to use than one method that gives a tuple.