Iteration in .NET with IEnumerable and IEnumerator

A discussion broke out on Code Project recently about why .NET has two interfaces for iteration (what .NET calls “enumeration”).

What are the two interfaces and what do they do?

The IEnumerable interface is placed on the collection object and defines the GetEnumerator() method, which returns a (normally new) object that implements the IEnumerator interface. The foreach statement in C# and the For Each statement in VB.NET use IEnumerable to obtain the enumerator in order to loop over the elements in the collection.

The IEnumerator interface is essentially the contract placed on the object that actually does the iteration. It stores the state of the iteration and updates it as the code moves through the collection.
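To make the split between the two interfaces concrete, here is a minimal sketch. The names WeekDays, WeekDaysEnumerator and EnumerationDemo are made up for illustration and are not part of .NET; only IEnumerable, IEnumerator and their members come from the framework. The JoinAll method shows roughly what a foreach loop does on the caller's behalf.

```csharp
using System;
using System.Collections;

// Hypothetical collection: it holds the data and only hands out enumerators.
public class WeekDays : IEnumerable
{
    private readonly string[] names = { "Mon", "Tue", "Wed" };

    // Each call returns a new enumerator with its own position.
    public IEnumerator GetEnumerator()
    {
        return new WeekDaysEnumerator(names);
    }
}

// Hypothetical enumerator: it holds the state of one iteration.
public class WeekDaysEnumerator : IEnumerator
{
    private readonly string[] names;
    private int position = -1; // before the first element

    public WeekDaysEnumerator(string[] names)
    {
        this.names = names;
    }

    public object Current
    {
        get
        {
            if (position < 0 || position >= names.Length)
                throw new InvalidOperationException();
            return names[position];
        }
    }

    public bool MoveNext()
    {
        if (position < names.Length) position++;
        return position < names.Length;
    }

    public void Reset()
    {
        position = -1;
    }
}

public static class EnumerationDemo
{
    // Roughly what foreach does behind the scenes (disposal omitted).
    public static string JoinAll(IEnumerable source)
    {
        string result = "";
        IEnumerator enumerator = source.GetEnumerator();
        while (enumerator.MoveNext())
        {
            result += enumerator.Current + " ";
        }
        return result.Trim();
    }
}
```

Calling EnumerationDemo.JoinAll(new WeekDays()) walks the three names in order, and a plain foreach over a WeekDays instance visits the same elements.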

Why not just have the collection be the enumerator too? Why have two separate interfaces?

There is nothing to stop IEnumerator and IEnumerable being implemented on the same class. However, there is a penalty for doing this: it won’t be possible to have two or more loops on the same collection at the same time, because they would all share a single iteration state. If it can be absolutely guaranteed that there won’t ever be a need to loop on the collection twice at the same time then that’s fine. But in the majority of circumstances that guarantee isn’t possible.

When would someone iterate over a collection more than once at a time?

Here are two examples.

The first example is when there are two loops nested inside each other on the same collection. If the collection were also the enumerator then it wouldn’t be possible to support nested loops on the same collection: when the code gets to the inner loop, it is going to collide with the outer loop.
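The collision can be seen in a small sketch. SelfEnumeratingLetters is a hypothetical class written only to show the problem: it implements both interfaces and returns itself from GetEnumerator(), so every loop shares one position field. NestedLoopDemo is likewise an illustrative helper.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical class that is both the collection and its own enumerator.
public class SelfEnumeratingLetters : IEnumerable, IEnumerator
{
    private readonly string[] letters = { "a", "b", "c" };
    private int position = -1;

    // Returns itself, so the inner loop rewinds and reuses the outer loop's state.
    public IEnumerator GetEnumerator()
    {
        position = -1;
        return this;
    }

    public object Current
    {
        get { return letters[position]; }
    }

    public bool MoveNext()
    {
        position++;
        return position < letters.Length;
    }

    public void Reset()
    {
        position = -1;
    }
}

public static class NestedLoopDemo
{
    // Builds every (outer, inner) pair using two nested loops on one collection.
    public static List<string> Pairs(IEnumerable source)
    {
        List<string> pairs = new List<string>();
        foreach (string outer in source)
        {
            foreach (string inner in source)
            {
                pairs.Add(outer + inner);
            }
        }
        return pairs;
    }
}
```

With a string array of three letters, Pairs returns all 9 combinations, because each loop gets its own enumerator. With SelfEnumeratingLetters it returns only 3 ("aa", "ab", "ac"): the inner loop runs the shared position off the end, so the outer loop stops after its first element.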

The second example is when there are two or more threads accessing the same collection. Again, if the collection were also the enumerator then it wouldn’t be possible to support safe multithreaded iteration over the same collection. When the second thread attempts to loop over the elements in the collection, the state of the two enumerations will collide.
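As a sketch of the multithreaded case, the hypothetical helper below runs two read-only loops over the same List on separate threads. Each foreach asks the list for its own enumerator, so neither thread disturbs the other's position. This assumes that nothing modifies the list while the threads are running.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class TwoThreadDemo
{
    // Sums the same list on two threads; each loop has its own enumerator.
    public static long[] SumOnTwoThreads(List<int> numbers)
    {
        long[] sums = new long[2];

        Thread first = new Thread(() =>
        {
            foreach (int n in numbers) sums[0] += n;
        });
        Thread second = new Thread(() =>
        {
            foreach (int n in numbers) sums[1] += n;
        });

        first.Start();
        second.Start();
        first.Join();
        second.Join();

        return sums;
    }
}
```

Both entries of the returned array hold the full total of the list, since each thread visited every element independently.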

Also, because the iteration model used in .NET does not permit alterations to a collection during enumeration, these operations (nested loops and loops on several threads, each with its own enumerator) are otherwise completely safe.
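The rule against altering a collection while looping over it can be illustrated with List from System.Collections.Generic, whose enumerator throws InvalidOperationException once it notices a change. The helper name below is made up for the example, and other collection types may react differently.

```csharp
using System;
using System.Collections.Generic;

public static class ModifyDuringLoopDemo
{
    // Returns true if changing the list inside the loop was rejected.
    public static bool AddWhileLooping(List<int> numbers)
    {
        try
        {
            foreach (int n in numbers)
            {
                numbers.Add(n); // alters the collection mid-enumeration
            }
            return false; // only reached when the list was empty
        }
        catch (InvalidOperationException)
        {
            // The enumerator detected the change on its next MoveNext() call.
            return true;
        }
    }
}
```

For any non-empty list the method returns true: the first Add succeeds, and the loop is then stopped when it tries to move to the next element.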

These names are confusing. Why didn’t Microsoft just have an IEnumerator and an ISafeEnumerator and get rid of IEnumerable? These would convey a much better meaning to the developer, as the lack of distinction in the terminology will always make it more difficult to remember which was which.

IEnumerator and ISafeEnumerator would have broadly the same implementation without any real performance gain. It is already stated in the MSDN documentation that code in a loop is not permitted to change the contents of the collection that is being looped over, so in reality all enumerators are safe so long as the instances of enumerator objects are not shared between different loops at the same time.

And as for the lack of distinction in terminology, the suffixes make the distinction. Words in English that end in -able denote the ability to do something. In this case enumerable means the ability to enumerate. Words ending in -or, called agent nouns, denote someone or something that will perform some work. In this case enumerator means something that enumerates.

NOTE: This was rescued from the Google Cache. The original date was Saturday 11th September, 2004.
