New C# Features in the .NET Framework 4: Covariance and Contravariance

Covariance and Contravariance

Covariance and contravariance are best introduced with an example, and the best one is in the framework itself. In System.Collections.Generic, IEnumerable<T> and IEnumerator<T> represent, respectively, an object that is a sequence of T values and the enumerator (or iterator) that does the work of iterating over the sequence. These interfaces have done a lot of heavy lifting for a long time, because they support the implementation of the foreach loop construct. In C# 3.0, they became even more prominent because of their central role in LINQ and LINQ to Objects: they are the .NET interfaces that represent sequences.

So if you have a class hierarchy with, say, an Employee type and a Manager type that derives from it (managers are employees, after all), then what would you expect the following code to do?

IEnumerable<Manager> ms = GetManagers();

IEnumerable<Employee> es = ms;

It seems as though one ought to be able to treat a sequence of Managers as though it were a sequence of Employees. But in C# 3.0, the assignment will fail; the compiler will tell you there’s no conversion. After all, it has no idea what the semantics of IEnumerable<T> are. This could be any interface, so for any arbitrary interface IFoo<T>, why would an IFoo<Manager> be more or less substitutable for an IFoo<Employee>?

In C# 4.0, though, the assignment works because IEnumerable<T>, along with a few other interfaces, has changed, an alteration enabled by new support in C# for covariance of type parameters.

IEnumerable<T> is eligible to be more special than the arbitrary IFoo<T> because, though it’s not obvious at first glance, members that use the type parameter T (GetEnumerator in IEnumerable<T> and the Current property in IEnumerator<T>) actually use T only in the position of a return value. So you only get a Manager out of the sequence, and you never put one in.
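The assignment above can be tried with a small self-contained sketch. The Employee and Manager classes, the Name field, and the body of GetManagers are placeholders invented for illustration; only the conversion on the marked line comes from the discussion above.

```csharp
using System;
using System.Collections.Generic;

class Employee { public string Name = ""; }
class Manager : Employee { }

static class CovarianceDemo
{
    // Hypothetical stand-in for the article's GetManagers().
    public static IEnumerable<Manager> GetManagers()
    {
        return new List<Manager>
        {
            new Manager { Name = "Ada" },
            new Manager { Name = "Grace" }
        };
    }

    // Only reads from the sequence, so any IEnumerable<Employee> will do.
    public static string JoinNames(IEnumerable<Employee> employees)
    {
        var names = new List<string>();
        foreach (Employee e in employees)
            names.Add(e.Name);
        return string.Join(", ", names);
    }

    static void Main()
    {
        IEnumerable<Manager> ms = GetManagers();
        IEnumerable<Employee> es = ms;      // compiles in C# 4.0 and later
        Console.WriteLine(JoinNames(es));   // prints "Ada, Grace"
    }
}
```

Because JoinNames only takes items out of the sequence, it never needs to know that the underlying objects are managers.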

In contrast, think of List<T>. Making a List<Manager> substitutable for a List<Employee> would be a disaster, because of the following:

List<Manager> ms = GetManagers();

List<Employee> es = ms; // Suppose this were possible

es.Add(new EmployeeWhoIsNotAManager()); // Uh oh

As this shows, once you think you’re looking at a List<Employee>, you can insert any employee. But the list in question is actually a List<Manager>, so inserting a non-Manager must fail. You’ve lost type safety if you allow this. List<T> cannot be covariant in T.
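When a List<Employee> really is needed, one safe route is to copy the elements instead of converting the list. The sketch below assumes placeholder Employee and Manager classes; it relies on the List<T> constructor that accepts an IEnumerable<T>, which a List<Manager> can supply through the covariant IEnumerable<T>.

```csharp
using System;
using System.Collections.Generic;

class Employee { }
class Manager : Employee { }

static class ListInvarianceDemo
{
    public static int CopyAndAdd(List<Manager> ms)
    {
        // List<Employee> es = ms;     // does not compile: List<T> is invariant in T

        // Copying is safe: the constructor takes IEnumerable<Employee>,
        // and List<Manager> converts to that type by covariance.
        var es = new List<Employee>(ms);
        es.Add(new Employee());        // fine, es is a separate list of employees
        return es.Count;
    }

    static void Main()
    {
        var ms = new List<Manager> { new Manager(), new Manager() };
        Console.WriteLine(CopyAndAdd(ms));   // prints 3
        Console.WriteLine(ms.Count);         // prints 2, the original list is untouched
    }
}
```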

The new language feature in C# 4.0, then, is the ability to define types, such as the new IEnumerable<T>, that admit conversions among themselves when the type parameters in question bear some relationship to one another. This is what the .NET Framework developers who wrote IEnumerable<T> used, and this is what their code looks like (simplified, of course):

public interface IEnumerable<out T> { /* ... */ }

Notice the out keyword modifying the definition of the type parameter, T. When the compiler sees this, it marks T as covariant and checks that every use of T in the definition of the interface is in an output position (that is why this keyword was picked). If T appears in an input position, such as a method parameter, the interface declaration fails to compile.
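The same annotation can be applied to an interface of your own. The following is a minimal sketch: IProducer<T> and ManagerFactory are hypothetical names made up for this example, not framework types.

```csharp
using System;

class Employee { }
class Manager : Employee { }

// Hypothetical covariant interface: T appears only as a return type.
interface IProducer<out T>
{
    T Produce();
    // void Consume(T item);   // compile error if uncommented: T used as an input
}

class ManagerFactory : IProducer<Manager>
{
    public Manager Produce() { return new Manager(); }
}

static class OutKeywordDemo
{
    public static Employee ProduceAsEmployee()
    {
        IProducer<Manager> pm = new ManagerFactory();
        IProducer<Employee> pe = pm;   // allowed because T is declared out
        return pe.Produce();
    }

    static void Main()
    {
        Console.WriteLine(ProduceAsEmployee() is Manager);   // prints True
    }
}
```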

Why is this called covariance? Well, it’s easiest to see when you start to draw arrows. To be concrete, let’s use the Manager and Employee types. Because there’s an inheritance relationship between these classes, there’s an implicit reference conversion from Manager to Employee:

Manager → Employee

And now, because of the annotation of T in IEnumerable<out T>, there’s also an implicit reference conversion from IEnumerable<Manager> to IEnumerable<Employee>. That’s what the annotation provides for:

IEnumerable<Manager> → IEnumerable<Employee>

This is called covariance, because the arrows in each of the two examples point in the same direction. We started with two types, Manager and Employee. We made new types out of them, IEnumerable<Manager> and IEnumerable<Employee>. The new types convert the same way as the old ones.

Contravariance is when this happens backward. You might anticipate that this could happen when the type parameter, T, is used only as input, and you’d be right. For example, the System namespace contains an interface called IComparable<T>, which has a single method called CompareTo:

public interface IComparable<in T> {

  int CompareTo(T other);

}

If you have an IComparable<Employee>, you should be able to treat it as though it were an IComparable<Manager>, because the only thing you can do is put Employees into the interface. Because a manager is an employee, putting a manager in should work, and it does. The in keyword modifies T in this case, and this scenario functions correctly:

IComparable<Employee> ec = GetEmployeeComparer();

IComparable<Manager> mc = ec;

This is called contravariance because the arrow got reversed this time:

Manager → Employee
IComparable<Manager> ← IComparable<Employee>
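A runnable version of the contravariant conversion might look like the sketch below. The Salary field and the comparison rule are assumptions made for illustration; the article's GetEmployeeComparer is replaced here by an Employee that implements IComparable<Employee> itself.

```csharp
using System;

class Employee : IComparable<Employee>
{
    public int Salary;   // hypothetical field, used only to have something to compare

    public int CompareTo(Employee other)
    {
        if (other == null) return 1;             // by convention, any instance sorts after null
        return Salary.CompareTo(other.Salary);
    }
}

class Manager : Employee { }

static class ContravarianceDemo
{
    public static int CompareManagers(Manager a, Manager b)
    {
        IComparable<Employee> ec = a;   // a Manager is an Employee
        IComparable<Manager> mc = ec;   // contravariant conversion, allowed by "in T"
        return mc.CompareTo(b);         // b is passed in where an Employee is expected
    }

    static void Main()
    {
        var low = new Manager { Salary = 100 };
        var high = new Manager { Salary = 200 };
        Console.WriteLine(CompareManagers(low, high) < 0);   // prints True
    }
}
```

The object behind mc still compares plain employees; the conversion only narrows what callers are allowed to pass in.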

So the language feature here is pretty simple to summarize: You can add the keyword in or out whenever you define a type parameter, and doing so gives you free extra conversions. There are some limitations, though.

First, this works with generic interfaces and delegates only. You can’t declare a generic type parameter on a class or struct in this manner. An easy way to rationalize this is that delegates are very much like interfaces that have just one method, and in any case, classes would often be ineligible for this treatment because of fields. You can think of any field on the generic class as being both an input and an output, depending on whether you write to it or read from it. If those fields involve type parameters, the parameters can be neither covariant nor contravariant.
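Delegates take the same annotations as interfaces. The sketch below uses the framework's Func<TResult> and Action<T>, plus a hypothetical Transformer delegate declared only to show both keywords on one type; the Employee and Manager classes are placeholders.

```csharp
using System;

class Employee { }
class Manager : Employee { }

// Hypothetical delegate: contravariant in its input, covariant in its result.
delegate TResult Transformer<in T, out TResult>(T input);

static class DelegateVarianceDemo
{
    public static string TypeNameOf(Employee e) { return e.GetType().Name; }

    public static string RunFunc()
    {
        Func<Manager> makeManager = () => new Manager();
        Func<Employee> makeEmployee = makeManager;     // covariant result
        return TypeNameOf(makeEmployee());
    }

    public static string RunAction()
    {
        string seen = "";
        Action<Employee> record = e => seen = TypeNameOf(e);
        Action<Manager> recordManager = record;        // contravariant parameter
        recordManager(new Manager());
        return seen;
    }

    public static string RunCustom()
    {
        Transformer<Employee, Manager> promote = e => new Manager();
        Transformer<Manager, Employee> narrowed = promote;   // both conversions at once
        return TypeNameOf(narrowed(new Manager()));
    }

    static void Main()
    {
        Console.WriteLine(RunFunc());     // prints "Manager"
        Console.WriteLine(RunAction());   // prints "Manager"
        Console.WriteLine(RunCustom());   // prints "Manager"
    }
}
```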

Second, whenever you have an interface or delegate with a covariant or contravariant type parameter, you’re granted new conversions on that type only when the type arguments, in the usage of the interface (not its definition), are reference types. For instance, because int is a value type, the IEnumerator<int> doesn’t convert to IEnumerator<object>, even though it looks like it should:

IEnumerator<int> ↛ IEnumerator<object>

The reason for this behavior is that the conversion must preserve the type representation. If the int-to-object conversion were allowed, calling the Current property on the result would be impossible, because the value type int has a different representation on the stack than an object reference does. All reference types have the same representation on the stack, however, so only type arguments that are reference types yield these extra conversions.
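The difference between reference-type and value-type arguments can be seen in a short sketch. CountObjects is a hypothetical helper; the explicit workaround uses the LINQ Cast<TResult> extension method, which boxes each element as it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ValueTypeVarianceDemo
{
    // Hypothetical helper that accepts any sequence of object references.
    public static int CountObjects(IEnumerable<object> items)
    {
        int n = 0;
        foreach (object o in items) n++;
        return n;
    }

    static void Main()
    {
        IEnumerable<string> strings = new List<string> { "a", "b" };
        IEnumerable<int> ints = new List<int> { 1, 2, 3 };

        // string is a reference type, so the covariant conversion applies.
        Console.WriteLine(CountObjects(strings));               // prints 2

        // Console.WriteLine(CountObjects(ints));               // does not compile

        // Boxing has to be requested explicitly for value types.
        Console.WriteLine(CountObjects(ints.Cast<object>()));   // prints 3
    }
}
```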

Very likely, most C# developers will happily use this new language feature—they’ll get more conversions of framework types and fewer compiler errors when using some types from the .NET Framework (IEnumerable<T>, IComparable<T>, Func<T>, Action<T>, among others). And, in fact, anyone designing a library with generic interfaces and delegates is free to use the new in and out type parameters when appropriate to make life easier for their users.

By the way, this feature does require support from the runtime, but the support has always been there. It lay dormant for several releases, however, because no language made use of it. Also, previous versions of C# allowed some limited conversions that were covariant or contravariant. Specifically, they let you make delegates out of methods that had compatible return types (covariance) or compatible parameter types (contravariance). In addition, array types have always been covariant. These existing features are distinct from the new ones in C# 4.0, which actually let you define your own types that are covariant and contravariant in some of their type parameters.
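The older features mentioned here can be sketched as follows. The delegate types and method names are hypothetical. The first method shows a method group converting to a delegate with a less derived return type, and also the matching conversion for a more derived parameter type, which C# has allowed since version 2.0. The second shows why array covariance is not fully type safe: the bad store compiles and is rejected only at run time.

```csharp
using System;

class Employee { }
class Manager : Employee { }

delegate Employee EmployeeFactory();
delegate void ManagerHandler(Manager m);

static class OlderVarianceDemo
{
    static Manager MakeManager() { return new Manager(); }
    static void HandleEmployee(Employee e) { }

    public static bool MethodGroupConversionsWork()
    {
        EmployeeFactory factory = MakeManager;     // return type: a Manager is an Employee
        ManagerHandler handler = HandleEmployee;   // parameter type: accepts any Employee
        handler(new Manager());
        return factory() is Manager;
    }

    public static bool ArrayStoreFails()
    {
        Employee[] es = new Manager[1];   // array covariance: this compiles
        try
        {
            es[0] = new Employee();       // not a Manager, so the runtime rejects the store
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(MethodGroupConversionsWork());   // prints True
        Console.WriteLine(ArrayStoreFails());              // prints True
    }
}
```

Generic variance in C# 4.0 avoids the run-time check because the compiler verifies the in and out positions when the interface or delegate is declared.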

Source: MSDN Magazine
