Saturday, July 29, 2006

Why closure methods on collections matter

I like the new closure methods (like ForEach and ConvertAll) on .NET 2.0 classes such as List<T> and Array, because from an OO standpoint, it really should be the responsibility of a collection to know how to traverse its own elements. However, I've always had a tough time justifying why you'd want to do...

employees.ForEach(delegate(Employee e) 
{
    Console.WriteLine(e);
});


... instead of...

foreach (Employee e in employees)
{
    Console.WriteLine(e);
}


In this case, it doesn't look like you gain anything.

Recently, I came across a situation which made me see the difference more clearly. Let's say you have a list of employees, and now, you want to remove all managers. Happily, we write...

foreach (Employee e in employees) 
{ 
    if (e.Manager)
    {
        employees.Remove(e);
    }
}


This throws an InvalidOperationException, because you can't modify a List<T> while you're enumerating it with foreach. OK, how about...

for (int i = 0; i < employees.Count; i++) 
{
    if (employees[i].Manager)
    {
        employees.RemoveAt(i);
    }
}


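This doesn't throw an exception, but it doesn't work properly either: each RemoveAt(i) shifts the following elements down by one, and then i++ steps right past the element that just moved into position i. With two managers in a row, the second one survives. OK, so the solution is...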

for (int i = employees.Count - 1; i >= 0; i--) 
{
    if (employees[i].Manager)
    {
        employees.RemoveAt(i);
    }
}


That's great! However, we now have to remember that a forward loop that conditionally removes elements from a collection is silently wrong. If you forget, no exception is thrown, and only unit tests or actually running the program will reveal the bug.
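
To make the silent failure concrete, here's a small sketch that runs both loops over the same data. The Employee class, its Name field, and the sample names are my own assumptions for illustration; the only thing taken from the snippets above is the boolean Manager member.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Employee type for illustration; only Manager appears in the post.
class Employee
{
    public string Name;
    public bool Manager;

    public Employee(string name, bool manager)
    {
        Name = name;
        Manager = manager;
    }
}

static class RemovalDemo
{
    // Two managers sit next to each other at indexes 1 and 2.
    public static List<Employee> Sample()
    {
        List<Employee> list = new List<Employee>();
        list.Add(new Employee("Ann", false));
        list.Add(new Employee("Bob", true));
        list.Add(new Employee("Cat", true));
        list.Add(new Employee("Dan", false));
        return list;
    }

    // Forward loop: removing index i shifts the next element into slot i,
    // and i++ then steps past it without ever checking it.
    public static List<Employee> ForwardRemove(List<Employee> employees)
    {
        for (int i = 0; i < employees.Count; i++)
        {
            if (employees[i].Manager)
            {
                employees.RemoveAt(i);
            }
        }
        return employees;
    }

    // Reverse loop: a removal only shifts elements that were already visited.
    public static List<Employee> ReverseRemove(List<Employee> employees)
    {
        for (int i = employees.Count - 1; i >= 0; i--)
        {
            if (employees[i].Manager)
            {
                employees.RemoveAt(i);
            }
        }
        return employees;
    }
}
```

Calling RemovalDemo.ForwardRemove(RemovalDemo.Sample()) leaves three employees, because Cat slides into Bob's old slot and is never tested. RemovalDemo.ReverseRemove(RemovalDemo.Sample()) leaves the expected two.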

On the other hand, we wouldn't have to remember this little gotcha if we simply relied on the RemoveAll method...

employees.RemoveAll(delegate(Employee e) { return e.Manager; });


In a way, this is similar to why we don't hesitate to use the Sort(IComparer<T>) or Sort(Comparison<T>) methods instead of pulling the elements out and writing our own sorting algorithm. Do we have to care how the collection's sort works? Generally, no, since that's the responsibility of the collection.
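
As a quick illustration of the same division of labor, here's a minimal sketch of Sort(Comparison<T>) with an anonymous delegate. The SortDemo class and the SortByLength name are made up for this example; we only say what "less than" means, and List<T> handles the rest.

```csharp
using System;
using System.Collections.Generic;

static class SortDemo
{
    // Hypothetical helper: orders names by length. We supply the comparison;
    // the list decides how the sort itself is carried out.
    public static List<string> SortByLength(List<string> names)
    {
        names.Sort(delegate(string a, string b)
        {
            return a.Length.CompareTo(b.Length);
        });
        return names;
    }
}
```

One caveat worth knowing: List<T>.Sort is documented as an unstable sort, so names of equal length may come back in either order.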

FYI, check out the List<T>.RemoveAll method in .NET Reflector. It's not using a reverse for-loop.
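
If you're curious how a single forward pass can remove elements safely, one common approach is compaction: copy each element you want to keep down over the gaps, then chop off the leftover tail. The sketch below is my own illustration of that general technique, using hypothetical names (ListSketch, RemoveAllByCompaction); it is not the framework's source code.

```csharp
using System;
using System.Collections.Generic;

static class ListSketch
{
    // Hypothetical helper, not the framework source: removes every match in
    // one forward pass by copying each kept element down over the gap.
    public static int RemoveAllByCompaction<T>(List<T> list, Predicate<T> match)
    {
        int write = 0; // next slot for an element we are keeping
        for (int read = 0; read < list.Count; read++)
        {
            if (!match(list[read]))
            {
                list[write] = list[read];
                write++;
            }
        }

        int removed = list.Count - write;
        list.RemoveRange(write, removed); // drop the leftover tail in one call
        return removed;
    }
}
```

Each element is moved at most once, whereas every RemoveAt call in the hand-written loops shifts all the elements after it. That's one more detail the collection can take care of for us.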