Bill McCarthy asks “Are Iterators Fundamentally Flawed?”

| by Jonathan Allen Follow 641 Followers on Aug 05, 2008. Estimated reading time: less than one minute |

Iterators are at the core of .NET programming. Only rarely do developers actually work against indexed data, preferring to use for-each loops for most tasks. But is this inherently sequential access method appropriate as we turn more to multi-threaded applications?

Bill McCarthy writes,

If you consider an iterator primary task of getting the next element, then look at how it does it, you should observe a major design flaw… the operation is not atomic. An iterator’s implementation is IEnumerator or IEnumerator(Of T), and IEnumerator require you to first call MoveNext then read the Current property. If you allow multiple threads to use the same IEnumerator, then you could get a call sequence such as MoveNext, MoveNext, Current, Current, which would skip one item and repeat the next one. So IEnumerator is not well designed for threading. This is a known design limitation, but rather than address that, languages like C# have implemented their iterators to be a different iterator per thread. That is they not only recognise the limitation but they also enforce it.

Bill continues with notes about how a new IEnumerator would have to work in order to properly support multiple threads.

Rate this Article

Adoption Stage

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

the whole concept of iteration points to a deeper flaw in most OO languages by andrew mcveigh

Thats a pointless discussion to begin with. by Francois Ward

With the advent of deeper integration of functional programming in .NET, with LINQ and extension methods, Foreach isn't used nearly as much (since usually, a foreach is there to do something else... often easily represented by a lambda and some of the LINQ Operators such as projections through select)

At that point, its quite easy to multi-thread: and it is being done. MS is way ahead of this discussion, in the form of PLINQ, which has been discussed a lot here already. Once its released (along with the rest of the parallel framework), this will be a non-story.

Re: Thats a pointless discussion to begin with. by Al Tenhundfeld

I completely agree, Francois.

This is one area where I think MS is ahead of the rest of the software community and certainly ahead of most developers using the MS technology stack.

Re: Thats a pointless discussion to begin with. by Francois Ward

I don't think they're ahead of the software community as a whole. There are many languages and toolkits that do similar things already.

Are they doing it better? Thats subjective. I think considering the limitations of OOP and imperative languages (or hybrid with functions as first class constructs), what they're doing is very, very good.

Ahead of most developers using MS Technology though? Definately. Basic multithreading in ASP.NET is quite easy and efficient (you can automatically "fork" all of your long running tasks, and they "join" back in the event where you're supposed to do your UI operations, so its perfect..), but almost no one uses it.

Iterators are usually too low level by Kurt Christensen

I think Bill is on to something. When most programmers do foreach, what they *really* mean to do is write a mapping function. And a mapping function could of course be implemented to operate across multiple nodes. Just one of the many ways as programmers we need to raise the levels of abstraction at which we think in order to play in the multicore playgrounds of the future.

Iterators are a tool by Aaron Erickson

Iterators serve a great function - you just have to know their limitations w/regard to threading - that is, don't use them in a multithreaded environment.

Indeed, FP and F# should make a lot of this go away.

Re: Thats a pointless discussion to begin with. by Jonathan Allen

Underneath LINQ you are still using a IEnumerator. What limitations does this place upon libraries like Parallel LINQ?

Re: Thats a pointless discussion to begin with. by Francois Ward

None. You still have a set to go through to start the jobs. The jobs themselves have to be in parallel, not the iteration.

The limitation of foreach loops in the article is that before you go to the next iteration, you have to finish the work for the current iteration. If you take all the work, and push it as a task in another thread, you can go to the next one right away (and the iteration itself is an insignificant operation for 99.9% of cases).

So PLINQ will just iterate normally, but do the work in separate, throttled threads (throttled because if you have 1000 elements to work on, and you start 1000 threads, you'll bottleneck elsewhere and things will be slower, not faster, in many cases), and wait for all of the signals before telling the main thread that all the 1000 jobs are done.

Typical throttled fork/join operations, just with a whole lot less pain :)

As a note, PLINQ doesn't actually work on IEnumerables, but really on IParallel (or something). Before using PLINQ, you have to call the AsParallel() extension method to convert the IEnumerable to a parallizable construct. Still requires a standard enumerator originally, but again, even with hundreds of thousand elements, usually looping through is an insignificant operation.

don't see the problem by Ben Murphy

even if you have a thread safe access pattern that gives you back a tuple you still need the implementation to be thread safe. it is too costly to make all enumerators thread safe and MoveNext()/Current is easier to use than one method that gives a tuple.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

9 Discuss