Iterators are at the core of .NET programming. Only rarely do developers actually work against indexed data, preferring to use for-each loops for most tasks. But is this inherently sequential access method appropriate as we turn more to multi-threaded applications?
If you consider an iterator primary task of getting the next element, then look at how it does it, you should observe a major design flaw… the operation is not atomic. An iterator’s implementation is IEnumerator or IEnumerator(Of T), and IEnumerator require you to first call MoveNext then read the Current property. If you allow multiple threads to use the same IEnumerator, then you could get a call sequence such as MoveNext, MoveNext, Current, Current, which would skip one item and repeat the next one. So IEnumerator is not well designed for threading. This is a known design limitation, but rather than address that, languages like C# have implemented their iterators to be a different iterator per thread. That is they not only recognise the limitation but they also enforce it.
Bill continues with notes about how a new IEnumerator would have to work in order to properly support multiple threads.
Community comments
the whole concept of iteration points to a deeper flaw in most OO languages
by andrew mcveigh,
Thats a pointless discussion to begin with.
by Francois Ward,
Re: Thats a pointless discussion to begin with.
by Al Tenhundfeld,
Re: Thats a pointless discussion to begin with.
by Francois Ward,
Re: Thats a pointless discussion to begin with.
by Jonathan Allen,
Re: Thats a pointless discussion to begin with.
by Francois Ward,
Iterators are usually too low level
by Kurt Christensen,
Iterators are a tool
by Aaron Erickson,
don't see the problem
by Ben Murphy,
the whole concept of iteration points to a deeper flaw in most OO languages
by andrew mcveigh,
Your message is awaiting moderation. Thank you for participating in the discussion.
citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1....
Thats a pointless discussion to begin with.
by Francois Ward,
Your message is awaiting moderation. Thank you for participating in the discussion.
With the advent of deeper integration of functional programming in .NET, with LINQ and extension methods, Foreach isn't used nearly as much (since usually, a foreach is there to do something else... often easily represented by a lambda and some of the LINQ Operators such as projections through select)
At that point, its quite easy to multi-thread: and it is being done. MS is way ahead of this discussion, in the form of PLINQ, which has been discussed a lot here already. Once its released (along with the rest of the parallel framework), this will be a non-story.
Re: Thats a pointless discussion to begin with.
by Al Tenhundfeld,
Your message is awaiting moderation. Thank you for participating in the discussion.
I completely agree, Francois.
This is one area where I think MS is ahead of the rest of the software community and certainly ahead of most developers using the MS technology stack.
Re: Thats a pointless discussion to begin with.
by Francois Ward,
Your message is awaiting moderation. Thank you for participating in the discussion.
I don't think they're ahead of the software community as a whole. There are many languages and toolkits that do similar things already.
Are they doing it better? Thats subjective. I think considering the limitations of OOP and imperative languages (or hybrid with functions as first class constructs), what they're doing is very, very good.
Ahead of most developers using MS Technology though? Definately. Basic multithreading in ASP.NET is quite easy and efficient (you can automatically "fork" all of your long running tasks, and they "join" back in the event where you're supposed to do your UI operations, so its perfect..), but almost no one uses it.
Iterators are usually too low level
by Kurt Christensen,
Your message is awaiting moderation. Thank you for participating in the discussion.
I think Bill is on to something. When most programmers do foreach, what they *really* mean to do is write a mapping function. And a mapping function could of course be implemented to operate across multiple nodes. Just one of the many ways as programmers we need to raise the levels of abstraction at which we think in order to play in the multicore playgrounds of the future.
Iterators are a tool
by Aaron Erickson,
Your message is awaiting moderation. Thank you for participating in the discussion.
Iterators serve a great function - you just have to know their limitations w/regard to threading - that is, don't use them in a multithreaded environment.
Indeed, FP and F# should make a lot of this go away.
Re: Thats a pointless discussion to begin with.
by Jonathan Allen,
Your message is awaiting moderation. Thank you for participating in the discussion.
Underneath LINQ you are still using a IEnumerator. What limitations does this place upon libraries like Parallel LINQ?
Re: Thats a pointless discussion to begin with.
by Francois Ward,
Your message is awaiting moderation. Thank you for participating in the discussion.
None. You still have a set to go through to start the jobs. The jobs themselves have to be in parallel, not the iteration.
The limitation of foreach loops in the article is that before you go to the next iteration, you have to finish the work for the current iteration. If you take all the work, and push it as a task in another thread, you can go to the next one right away (and the iteration itself is an insignificant operation for 99.9% of cases).
So PLINQ will just iterate normally, but do the work in separate, throttled threads (throttled because if you have 1000 elements to work on, and you start 1000 threads, you'll bottleneck elsewhere and things will be slower, not faster, in many cases), and wait for all of the signals before telling the main thread that all the 1000 jobs are done.
Typical throttled fork/join operations, just with a whole lot less pain :)
As a note, PLINQ doesn't actually work on IEnumerables, but really on IParallel (or something). Before using PLINQ, you have to call the AsParallel() extension method to convert the IEnumerable to a parallizable construct. Still requires a standard enumerator originally, but again, even with hundreds of thousand elements, usually looping through is an insignificant operation.
don't see the problem
by Ben Murphy,
Your message is awaiting moderation. Thank you for participating in the discussion.
even if you have a thread safe access pattern that gives you back a tuple you still need the implementation to be thread safe. it is too costly to make all enumerators thread safe and MoveNext()/Current is easier to use than one method that gives a tuple.