Tip of the day: Using the null-coalescing operator over the conditional operator

I’ve recently been refactoring a lot of code that used the conditional operator and looked something like this:

int someValue = myEntity.SomeNullableValue.HasValue
                    ? myEntity.SomeNullableValue.Value
                    : 0;

That might seem less verbose than the traditional alternative, which looks like this:

int someValue = 0;
if (myEntity.SomeNullableValue.HasValue)
    someValue = myEntity.SomeNullableValue.Value;

…or other variations on that theme.

However, there is a better way of expressing this. That is to use the null-coalescing operator.

Essentially, what is says is that the value on the left of the operator will be used, unless it is null in which case the value ont the right is used. You can also chain them together which effectively returns the first non-null value.

So now the code above looks a lot more manageable and understandable:

int someValue = myEntity.SomeNullableValue ?? 0;

Why should you be returning an IEnumerable

I’ve seen in many places where a method returns a List<T> (or IList<T>) when it appears that it may not actually really be required, or even desirable when all things are considered.

A List is mutable, you can change the state of the List. You can add things to the List, you can remove things from the List, you can change the items the List contains. This means that everything that has a reference to the List instantly sees the changes, either because an element has changed or elements have been added or removed. If you are working in a multi-threaded environment, which will be increasingly common as time goes on, you will get issues with thread safety if the List is used inside other threads and one or more threads starts changing the List.

Return values should, unless you have a specific use case in mind already, be returning an IEnumerable<T> which is not mutable. If the underlying type is still a List (or Array or any of a myriad of other types that implement IEnumerable<T>) you can still cast it. Also, some LINQ expressions will self optimise if the underlying type is one which better supports what LINQ is doing. (Remember that LINQ expressions always take an IEnumerable<T> or IQueryable<T> anyway so you can do what you like regardless of what the underlying type is).

If you ensure that your return values are IEnumerable<T> to begin with yet further down the line you realise you need to return an Array or List<T> from the method it is easy to start doing that. This is because everything accepting the return value from the method will still be expecting an IEnumerable<T> which List<T> and Array implement. If, however, you started with a List<T> and move to returning an IEnumerable<T> then because so much code will have the expectation of a List<T> without actually needing it you will have a lot of refactoring to do just to update the reference types.

Have I convinced you yet? If not, think about this. How often are you inserting items into a collection of objects after the initial creation routine? How often do you remove items from a collection after the initial creation routine? How often do you need to access a specific item by index within a collection after the initial creation routine? My guess is almost never. There are some occasions, but not actually that many.

It took me a while to get my head around always using an IEnumerable<T>, until I realised that I almost never require to do the things in the above paragraph. I almost always just need to loop over a collection of objects, or filter a collection of objects to produce a smaller set. Both of those things can be done with just an IEnumerable<T> and LINQ.

But, what if I need a count of the objects in the List<T>, that would be inefficient with an IEnumerable<T> and LINQ? Well, do you really need a count? Oftentimes I just need to know if there are any objects at all in the collection, I don’t care how many object there actually are, in which case the LINQ extension method Any() can be used. If you do need a count LINQ is clever enough to work out that the underlying type may expose a Count property and it calls that (anything that implements ICollection<T> such as arrays, lists, dictionaries, sets, and so on) so it is not iterating over all the objects counting them up each time.

Remember, there is nothing necessarily wrong with putting a ToArray() to ToList() before returning as a reference to an IEnumerable<T> something to which a LINQ expression has been applied. That removes the issues that deferred execution can bring (e.g. unexpected surprises when it suddenly evaluates during the first iteration but breaks in the process) or repeatedly applying the filter in the Where() method or the transformation in the Select() method.

Just because an object is of a specific type, doesn’t mean you have to return that specific type.

For example, consider the services you actually need on the collection that you are returning, remembering how much LINQ gives you. The following diagram shows what each of the interfaces expect to be implemented what a number of the common collection types implement themselves.

Incidentally, the reason some of the interfaces on the Array class are in a different colour is that these interfaces are added by the runtime. So if you have a string[] it will expose IEnumerable<string>.

I’d suggest that as a general rule IEnumerable<T> should be the return type when you have anything that implements it as the return type from the method, unless something from an ICollection<T> or IList<T> (or any other type of collection) as absolutely desperately in needed and not just because some existing code expects, say, an IList<T> (even although it is using no more services from it that it would had it been an IEnumerable<T>).

The mutability that implementations of ICollection<T> and IList<T> give will prove problematic in the long term. If you have a large team with members that don’t fully understand what is going on (and this is quite common given the general level developer documentation) they are likely to change the contents of the collection without understanding its implications. In some situations this may fine, in others it may be disastrous.

Finally, if you absolutely do need to return a more specific collection type then instead of returning a reference to the concrete class, return a reference to the lowest interface that you need. For example, if you have a List<T> and you need to add further items to it, but not at specific locations in the list, then ICollection<T> will be the most suitable return type.

Type Inference

Type Inference is a neat feature where the C# compiler can work out itself what type a reference to an object is. While this can be used for the developer to be lazy it is most useful when you are dealing with the exceptionally long type names that LINQ expressions can generate. It is also a requirement when dealing with anonymous types, because there simply is no type name that you can use.

For example:

var aString = “ABC”;
Type st = aString.GetType(); // System.String;

var aNumber = 123.45M;
Type nt = aNumber.GetType(); // System.Decimal

var aDate = DateTime.Now;
Type dt = aDate.GetType(); // System.DateTime

In each of the above examples the compiler works out from what is on the right side of the equals sign the type of the reference (or value) on the left hand side.

What is important is that the compiler has something it can work with. In other words, you must assign something.

var aString; // Illegal – Can’t infer type

// Computer says "no":
// Implicitly-typed local variables must be initialized

Also, because the variable is strongly typed from the get-go each time you assign something to it, it must be of the same type.

var aDate = DateTime.Now;
aDate = 123; // Illegal – Can’t change type

// Computer says "no":
// Cannot implicitly convert type 'int' to 'System.DateTime'

Lambda cheat sheet

A while ago I wrote about all the new language features of C# 3.0 and it occurred to me that I’d left out a chunk about Lambdas. With LINQ being such an important part of C#3.0 that seems like a terrible omission, so I’m going to make up for it now.

The easiest way to think about a lambda is that it is a short form of anonymous methods that were introduced in C#2. However, you can also use lambdas to create expression trees (I’ll come to those in more detail in another post, for the moment, I’ll be concentrate on using lambdas for creating anonymous methods).

Basic Structure

A lambda that looks like this

(i) => 123 + i
^^^ ^^ ^^^^^^^
(1) (2)  (3)

is compiled to an anonymous method

delegate (int i) { return 123+i; }
         ^^^^^^^   ^^^^^^^^^^^^^
           (1)          (3)

The first part of the lambda (1), in the brackets, declares the parameters. The brackets are, incidentally, optional if there is only one parameter. The type is inferred so you don’t have to explicitly declare it as you would have to do with an anonymous method. However, if the compiler cannot infer the parameter type then you will have to declare it explicitly:

(int i) => 123 + i

The second part (2) is pronounced “goes to” and does not really have an equivalent with an anonymous method, although you could argue that it is the equivalent of the delegate keyword.

The third part (3) is the expression or statement. In a lambda expression the return is implicit so it does not need to be declared. It can also contain a number of statements enclosed in separated by semi-colon, but in that case it cannot be used to create expression trees and you must explicitly have a return statement if there is something being returned.

delegate void DisplayAdditionDelegate(int i, int j);

DisplayAdditionDelegate add = (i, j) =>
    { Console.WriteLine("{0} + {1} = {2}", i, j, i+j); };
add(2, 5);
// Output is: 2 + 5 = 7

Framework assistance

The .NET Framework provides a number of predefined generic delegate types that can be used with lambdas in order that is is easy to refer to them and pass them around.

Each of these contains a generic list of parameter types and finally the return type, or if there are no parameters just the return type.

The delegates with a return type are defined as:

public delegate TResult Func<TResult>()
public delegate TResult Func<T, TResult>(T arg)
public delegate TResult Func<T1, T2, TResult>(T1 arg1, T2 arg2)
public delegate TResult Func<T1, T2, T3, TResult>(T1 arg1, T2 arg2, T3 arg3)
public delegate TResult Func<T1, T2, T3, T4, TResult>(T1 arg1, T2 arg2, T3 arg3, T4 arg4)

The delegates without a return type are defined as:

public delegate void Action<T>(T obj)
public delegate void Action<T1, T2>(T arg1, T2 arg2)
public delegate void Action<T1, T2, T3>(T arg1, T2 arg2, T3 arg3)
public delegate void Action<T1, T2, T3, T4>(T arg1, T2 arg2, T3 arg3, T4 arg4)

You can then uses these delegates to represent an appropriate method without having to create a custom delegate type – for most purposes they are quite sufficient.

Outer variables

Outer variables are variables that the lambda can use that are defined in the method that defines the lambda.

int j = 25;
Func<int, int> function = i => 100 + i + j;
j = 75;
int result = function(50);
// result == 225

As you can see in the above example the value of j is not evaluated at the point the lambda is declared, but at the point it is invoked. The lambda is able to keep a reference to j even although it was declared in a different scope. The same is true if the lambda is eventually invoked from a different scope block altogether. For example, if it has been passed out of the method in which it is declared.

The lambda can also change the value of the outer variable. For example:

int j = 6;
Func<int, int> function = i => { j = j * j; return 100 + i + j; };
int result = function(50);
// result == 186
// j == 36

I guess I've never needed to do this before…

I guess I’ve never created a struct in a while (at least in Visual Studio 2008 using C# 3.0) because I’ve just discovered that the Automatic Properties don’t work in structs.

I’ve just created this struct:

public struct CapacityUnit
{
    public string Name { get; private set; }
    public long Multiplier { get; private set; }

    public CapacityUnit(string name, long multiplier)
    {
        Name = name;
        Multiplier = multiplier;
    }
}

Which at first glance looks okay except that I get a compiler error in the constructor on Name. The reason for this is that structs need to have the fields initialised before the this object can be used. Name uses this implicitly as in this.Name. So, it would seem that there is no way, at least as far as I can see, to initialise these properties when using Automatic Properties as I would need to use an implicit this.

Monitoring change in XML data (LINQ to XML series – Part 5)

This is the 5th part in a series on LINQ to XML. In this instalment we will look at monitoring changes in XML data in the XML classes added to .NET 3.5.

The XObject class (from which XElement and XAttribute, among others) contains two events that are of interest to anyone wanting to know about changes to the XML data: Changing and Changed

The Changing event is triggered prior to a change being applied to the XML data. The Changed event is triggered after the change has been applied.

An example of adding the event handler would be something like this:

XElement root = new XElement("root");
root.Changed += new EventHandler<XObjectChangeEventArgs>(root_Changed);

The above example will trigger for any change that happens in the node the event handler is applied to and any node downstream of it. As the example is applied to the root node this means the event will trigger for any change in the XML data.

The event handler is supplied an XObjectChangeEventArgs object which contains an ObjectChange property. This is an XObjectChange enum and it lets the code know what type of change happened.

The sender contains the actual object in the XML data that has changed.

Adding an element

Take the following example where an element is added to the XML data.

XElement child = new XElement("ChildElement", "Original Value");
root.Add(child);

In this case the ObjectChanged is Add and the sender is the XElement: <ChildElement>Original Value</ChildElement>

A similar scenario happens when adding an attribute. However, instead of the sender being an XElement it will be an XAttribute.

child.Add(new XAttribute("TheAttribute", "Some Value"));

Changing an element value

If the value of the element is changed (the bit that currently says “Original Value”) then we don’t get one event fired. We get two events fired. For example:

child.Value = "New Value";

The first event with ObjectChanged set to Remove and the sender set to “Orginal Value” (which is actually an XText object) and the second event with the ObjectChanged set to Add and the sender set to “New Value” (again, this is actually an XText object).

Changing an element name

If the name of the element is changed then the ObjectChanged property will be set to Name and the sender will be the XElement that has changed.

child.Name = "JustTheChild";

Changing an attribute name

Unlike changing an element value, when the value of an attribute changes the ObjectChanged property will be Value and the sender will be the XAttribute.

child.Attribute("TheAttribute").Value = "New Attribute Value";

Technorati Tags: ,,,

Navigating XML (LINQ to XML series – part 4)

In my last few posts on LINQ to XML (part 1, part 2 and part 3) I’ve shown you a starter on navigating around XML data. In this post I’ll continue to show you how to navigate through XML data by showing you how to navigate around sibling elements.

First consider this code:

XElement root = new XElement("root",
    new XElement("FirstChild"),
    new XElement("SecondChild"),
    new XElement("ThirdChild"),
    new XElement("FouthChild"),
    new XElement("FifthChild"));

Which produces the following XML structure:

<root>

<FirstChild />

<SecondChild />

<ThirdChild />

<FouthChild />

<FifthChild />

</root>

We can access the ThirdChild with this code:

XElement child = root.Element("ThirdChild");

From that point, we can also get access to its siblings.

To access the siblings that occur before the element we have a reference to then we can use ElementsBeforeSelf. As with Elements this returns an IEnumerable<XElement> object which allows us to iterate over the result, like this:

IEnumerable<XElement> elements = child.ElementsBeforeSelf();

foreach (XElement element in elements)
    Console.WriteLine(element);

The result is:

<FirstChild />

<SecondChild />

Conversely, we can get the siblings that come after the element we have a reference to with ElementsAfterSelf. Like this:

IEnumerable<XElement> elements = child.ElementsAfterSelf();

The result in this case will be:

<FouthChild />

<FifthChild />

Technorati Tags: ,,,

Navigating XML (LINQ to XML series – Part 3)

In my last two posts (part 1 and part 2) I’ve been introducing you to the the new XML classes in .NET 3.5. In this post I’ll continue that and show you some of the ways to navigate through XML.

First of all, lets start with a simple hierarchy of XML elements:

XElement root = new XElement("FirstGeneration",
                    new XElement("SecondGeneration",
                        new XElement("ThirdGeneration",
                            new XElement("FourthGeneration"))));

Which looks like this when rendered as XML:

<FirstGeneration>

<SecondGeneration>

<ThirdGeneration>

<FourthGeneration />

</ThirdGeneration>

</SecondGeneration>

</FirstGeneration>

Also in the last post I used Element to get a specific element from the current element. For example:

XElement child = root.Element("SecondGeneration");

Elements

If root (or FirstGeneration) only had one child element called “SecondGeneration” then everything is fine, you get what you asked for. However, if it contains multiple children all called “SecondGeneration” then you will only get the first element called “SecondGeneration”.

For example, if you add the following to the code above:

root.Add(new XElement("SecondGeneration","2"));
root.Add(new XElement("SecondGeneration","3"));

You will get a piece of XML that looks like this:

<FirstGeneration>

<SecondGeneration>

<ThirdGeneration>

<FourthGeneration />

</ThirdGeneration>

</SecondGeneration>

<SecondGeneration>2</SecondGeneration>

<SecondGeneration>3</SecondGeneration>

</FirstGeneration>

If you want to get all those additional children called “SecondGeneration” you will need to use the Elements (note the plural) method. For example:

IEnumerable<XElement> children = root.Elements("SecondGeneration");

You’ll also note that we don’t get a collection returned but an enumerable. This give us the opportunity to exploit many of the new extension methods. But I’ll leave them for another post. For the moment, we just need to know that it make it easy for us to enumerate over the data using a foreach loop. For example:

foreach (XElement child in children)
{
    Console.WriteLine(child);
    Console.WriteLine(new string('-', 50));
}

This will write out:

<SecondGeneration>

<ThirdGeneration>

<FourthGeneration />

</ThirdGeneration>

</SecondGeneration>

————————————————–

<SecondGeneration>2</SecondGeneration>

————————————————–

<SecondGeneration>3</SecondGeneration>

————————————————–

Parent

Using the same root object as above, we can see how to navigate back up the XML tree using Parent.

XElement grandchild = root.Element("SecondGeneration").Element("ThirdGeneration");
Console.WriteLine(grandchild.Parent);

The result of the code will be that the SecondGeneration element is printed.

 

Technorati Tags: ,,,

Crazy Extension Method

Here is an example of a crazy extension method that alters the semantics of method calling.

First the extension method:

public static class MyExtensions
{
    public static bool IsNullOrEmpty(this string target)
    {
        return string.IsNullOrEmpty(target);
    }
}

Instead of calling the static method IsNullOrEmpty() on string, we are turning it around to allow it to be called on a string type like an instance method. However, as you can probably tell, it may be called when the reference to the string is null. Normally this would result in an exception to say that you are attempting to call a method on a null value. However, this is an extension method and it actually works with nulls! This is probably not the best idea in the world, to be diplomatic about it.

Here is some calling code:

string a = null;
Console.WriteLine(a.IsNullOrEmpty());

Normally, an exception will be thrown if IsNullOrEmpty() is a real method. However, it isn’t in this case and the application happily writes “True” to the console.

 

Fizz-Buzz in LINQ

This just occurred to me. It is somewhat pointless, but I thought it was interesting:

static void Main(string[] args)
{
    var result = from say in Enumerable.Range(1, 100)
                 select (say % 15 == 0) ? "BuzzFizz" :
                    (say % 5 == 0) ? "Buzz" :
                        (say % 3 == 0) ? "Fizz" : say.ToString();
    foreach (string say in result)
        Console.WriteLine(say);
}
Technorati Tags: ,,