Navigating XML (LINQ to XML series – Part 3)

In my last two posts (part 1 and part 2) I’ve been introducing you to the new XML classes in .NET 3.5. In this post I’ll continue that and show you some of the ways to navigate through XML.

First of all, let’s start with a simple hierarchy of XML elements:

XElement root = new XElement("FirstGeneration",
                    new XElement("SecondGeneration",
                        new XElement("ThirdGeneration",
                            new XElement("FourthGeneration"))));

Which looks like this when rendered as XML:

<FirstGeneration>
  <SecondGeneration>
    <ThirdGeneration>
      <FourthGeneration />
    </ThirdGeneration>
  </SecondGeneration>
</FirstGeneration>

Also in the last post I used Element to get a specific element from the current element. For example:

XElement child = root.Element("SecondGeneration");

Elements

If root (that is, FirstGeneration) has only one child element called “SecondGeneration” then everything is fine: you get what you asked for. However, if it contains several children all called “SecondGeneration” then Element returns only the first of them.

For example, if you add the following to the code above:

root.Add(new XElement("SecondGeneration","2"));
root.Add(new XElement("SecondGeneration","3"));

You will get a piece of XML that looks like this:

<FirstGeneration>
  <SecondGeneration>
    <ThirdGeneration>
      <FourthGeneration />
    </ThirdGeneration>
  </SecondGeneration>
  <SecondGeneration>2</SecondGeneration>
  <SecondGeneration>3</SecondGeneration>
</FirstGeneration>

If you want to get all those additional children called “SecondGeneration” you will need to use the Elements (note the plural) method. For example:

IEnumerable<XElement> children = root.Elements("SecondGeneration");

You’ll also note that we don’t get a collection returned but an enumerable. This gives us the opportunity to exploit many of the new extension methods, but I’ll leave them for another post. For the moment, we just need to know that it makes it easy for us to enumerate over the data using a foreach loop. For example:

foreach (XElement child in children)
{
    Console.WriteLine(child);
    Console.WriteLine(new string('-', 50));
}

This will write out:

<SecondGeneration>
  <ThirdGeneration>
    <FourthGeneration />
  </ThirdGeneration>
</SecondGeneration>
--------------------------------------------------
<SecondGeneration>2</SecondGeneration>
--------------------------------------------------
<SecondGeneration>3</SecondGeneration>
--------------------------------------------------
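
The difference between Element and Elements can be checked with a small sketch like the one below. It is only an illustration built on the same tree as above: the class name ElementsSketch and the element name "NoSuchElement" are made up for the example, and Count() and Any() are two of the extension methods mentioned earlier.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ElementsSketch
{
    // Builds the same tree as the post: one nested SecondGeneration plus two simple ones.
    public static XElement BuildRoot()
    {
        XElement root = new XElement("FirstGeneration",
            new XElement("SecondGeneration",
                new XElement("ThirdGeneration",
                    new XElement("FourthGeneration"))));
        root.Add(new XElement("SecondGeneration", "2"));
        root.Add(new XElement("SecondGeneration", "3"));
        return root;
    }

    static void Main()
    {
        XElement root = BuildRoot();

        // Elements(name) yields every matching direct child, in document order.
        Console.WriteLine(root.Elements("SecondGeneration").Count()); // 3

        // Only direct children are searched, so a name from deeper in the tree finds nothing.
        Console.WriteLine(root.Elements("ThirdGeneration").Any()); // False

        // Element(name) returns null when there is no matching child.
        Console.WriteLine(root.Element("NoSuchElement") == null); // True
    }
}
```

Note that Elements never returns null: when nothing matches you get an empty sequence, so the foreach loop simply does not run.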

Parent

Using the same root object as above, we can see how to navigate back up the XML tree using Parent.

XElement grandchild = root.Element("SecondGeneration").Element("ThirdGeneration");
Console.WriteLine(grandchild.Parent);

The code prints the first SecondGeneration element, including the ThirdGeneration and FourthGeneration elements nested inside it.
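
As a hedged aside, Parent can be followed all the way up the tree. The sketch below is an illustration rather than part of the original walkthrough: it uses Descendants and Ancestors, two other XElement methods not covered above, and the class and helper names are invented for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ParentSketch
{
    // Names of all ancestors of an element, nearest ancestor first.
    public static string[] AncestorNames(XElement element)
    {
        return element.Ancestors().Select(a => a.Name.LocalName).ToArray();
    }

    static void Main()
    {
        XElement root = new XElement("FirstGeneration",
            new XElement("SecondGeneration",
                new XElement("ThirdGeneration",
                    new XElement("FourthGeneration"))));

        // Find the deepest element, then walk back up from it.
        XElement fourth = root.Descendants("FourthGeneration").First();

        Console.WriteLine(fourth.Parent.Name); // ThirdGeneration
        Console.WriteLine(string.Join(" > ", AncestorNames(fourth)));
        // ThirdGeneration > SecondGeneration > FirstGeneration

        // The root element has no parent, so Parent is null there.
        Console.WriteLine(root.Parent == null); // True
    }
}
```

The null check matters in practice: calling a member on root.Parent would throw a NullReferenceException.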

 


Fizz-Buzz in LINQ

This just occurred to me. It is somewhat pointless, but I thought it was interesting:

static void Main(string[] args)
{
    var result = from say in Enumerable.Range(1, 100)
                 select (say % 15 == 0) ? "FizzBuzz" :
                    (say % 5 == 0) ? "Buzz" :
                        (say % 3 == 0) ? "Fizz" : say.ToString();
    foreach (string say in result)
        Console.WriteLine(say);
}

Getting values out of XML in .NET 3.5 (LINQ to XML series part 2)

In my last post I gave a brief introduction to some of the new XML classes available in .NET 3.5. In this post I’ll continue that introduction by explaining how to get information out of the XML.

First off, let’s assume we have some XML that looks like this:

XElement root = new XElement("root",
    new XAttribute("Attribute", "TheValue"),
    new XElement("FirstChild"),
    new XElement("SecondChild", new XElement("Grandchild", "The content of the grandchild")));

or, if you prefer in XML format, like this:

<root Attribute="TheValue">
  <FirstChild />
  <SecondChild>
    <Grandchild>The content of the grandchild</Grandchild>
  </SecondChild>
</root>

There are a number of ways to get the content of the grandchild element. For example:

Console.WriteLine(root.Element("SecondChild").Element("Grandchild").Value);

Value returns a string which contains the content of the element specified. In the above case it will output:

The content of the grandchild

However, you need to watch out when the element whose value you want has child elements, because their content is included in the value. For example, if the above XML is extended so that it looks like this:

<root Attribute="TheValue">
  <FirstChild />
  <SecondChild>
    <Grandchild>The content of the grandchild
      <Great-grandchild>GGC content</Great-grandchild>
    </Grandchild>
  </SecondChild>
</root>

And the above line of C# is executed again, the result is now:

The content of the grandchildGGC content

As you can see, the content of the element you want and the content of its child elements are returned together. This may not necessarily be what you want.
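
If only the element’s own text is wanted, one possible approach is to join just the text nodes that sit directly inside it. This is a minimal sketch, not something from the original walkthrough: OwnText is a hypothetical helper name, and the XML is built in code so that no extra whitespace is involved.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class OwnTextSketch
{
    // Joins only the text nodes directly inside the element,
    // skipping the content of any child elements.
    public static string OwnText(XElement element)
    {
        return string.Join("", element.Nodes()
                                      .OfType<XText>()
                                      .Select(t => t.Value)
                                      .ToArray());
    }

    static void Main()
    {
        XElement grandchild = new XElement("Grandchild",
            "The content of the grandchild",
            new XElement("Great-grandchild", "GGC content"));

        Console.WriteLine(grandchild.Value);    // The content of the grandchildGGC content
        Console.WriteLine(OwnText(grandchild)); // The content of the grandchild
    }
}
```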

There is a second way to get the content from an element: use a casting operator. You can cast the element to a number of types, in this case to a string. For example:

Console.WriteLine((string)root.Element("SecondChild").Element("Grandchild"));

The result is the same as calling Value on the element.

Be careful here, because casting an element to a string does not give the same result as calling ToString() on it. You can see that if you simply pass the element itself to Console.WriteLine (which will then call ToString() for you). For example:

Console.WriteLine(root.Element("SecondChild").Element("Grandchild"));

The result is:

<Grandchild>The content of the grandchild
  <Great-grandchild>GGC content</Great-grandchild>
</Grandchild>
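
There is one more practical difference between Value and the cast, which shows up when the element does not exist. Element returns null in that case, so Value throws, whereas the cast quietly returns null. A minimal sketch (the ChildText helper and the "NoSuchChild" name are made up for the illustration):

```csharp
using System;
using System.Xml.Linq;

static class CastSketch
{
    // Returns the child's content, or null when the child does not exist.
    public static string ChildText(XElement parent, string childName)
    {
        // The explicit conversion accepts a null XElement and returns null.
        return (string)parent.Element(childName);
    }

    static void Main()
    {
        XElement root = new XElement("root", new XElement("FirstChild"));

        Console.WriteLine(ChildText(root, "FirstChild") == "");    // True: empty element, empty string
        Console.WriteLine(ChildText(root, "NoSuchChild") == null); // True: no exception

        try
        {
            // Element returns null here, so reading Value fails.
            Console.WriteLine(root.Element("NoSuchChild").Value);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("Value threw because Element returned null");
        }
    }
}
```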

The process is similar if you are dealing with attributes. Using the above XML as an example, an attribute value can be retrieved like this:

Console.WriteLine(root.Attribute("Attribute").Value);

Or, using the cast operator to a string like this:

Console.WriteLine((string)root.Attribute("Attribute"));

Both of the above pieces of code output the same thing: TheValue

If you were to call WriteLine on an XAttribute object you’ll see that the ToString() method returns something slightly different. It returns this: Attribute="TheValue"

Let’s say, for instance, that the attribute had a value of 123.456, which is a valid representation of a number. I mentioned casting operators on XElement and XAttribute earlier. Well, you can cast the attribute to a double if you prefer to get the value as that type. There is no tedious conversion in your own code, as the framework handles that for you. For example:

(double)root.Attribute("Attribute")
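
Here is that cast in a complete, hedged sketch. It assumes the attribute holds "123.456" as described above. ReadDouble is a hypothetical helper that uses the nullable form of the cast, which yields null for a missing attribute, where the plain (double) cast would throw.

```csharp
using System;
using System.Xml.Linq;

static class NumericCastSketch
{
    // Reads an attribute as a nullable double: null when the attribute is absent.
    public static double? ReadDouble(XElement element, string attributeName)
    {
        return (double?)element.Attribute(attributeName);
    }

    static void Main()
    {
        XElement root = new XElement("root",
            new XAttribute("Attribute", "123.456"));

        // The framework parses the attribute text for us.
        double value = (double)root.Attribute("Attribute");
        Console.WriteLine(Math.Abs(value - 123.456) < 1e-9); // True

        // No such attribute: the nullable cast gives null instead of throwing.
        Console.WriteLine(ReadDouble(root, "Missing").HasValue); // False
    }
}
```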

That’s it for this post. There will be more on XML and LINQ soon.

Introduction to LINQ to XML

Last year I wrote about the new language features available in C# 3.0 (Anonymous Types, Extension Methods, Automatic Properties, A start on LINQ, Object Initialisers I, Object Initialisers II, & Object Initialisers III) and since then I’ve really got into LINQ, especially LINQ to XML. The reason for that is that I hate XPath and I see LINQ to XML as a much easier way of querying XML files without faffing about with terse XPath strings. I would much rather have the ability to easily see what is going on with the query than have to figure out why my XPath isn’t working for me.

However, LINQ to XML is more than just new funky querying mechanisms. There is a whole new set of classes to deal with XML that are much easier and more intuitive than the classes that were provided back with .NET 1.0, in my opinion.

The main two classes in the new way of doing XML are XElement and XAttribute. For example, to create a new element:

XElement root = new XElement("root");

And to add an attribute to that element:

root.Add(new XAttribute("AttributeName", "TheValue"));

Which produces the result: <root AttributeName="TheValue" />

If you look at the IntelliSense for the XElement constructor you’ll see that none of the 5 overloads takes a string. The nearest is an XName. This is because there is an implicit conversion happening between a string and an XName so that creating XElements does not have to be so arduous. It would be quite irritating to have to declare XElement objects like this:

XElement root = new XElement(XName.Get("root"));

At this point you’ll find that all the VB developers will be gloating because VB9 contains a feature called XML Literals whereby the developer can just write XML directly into the source code file and VB will parse and compile it correctly. An incredibly handy feature I’m sure you’ll agree. But, since I’m a C# developer that’s what I’ll stick with – especially considering that the majority of demos of LINQ to XML I’ve seen are VB based.

If you look closely at XName’s Get method you’ll see that there are two overloads, one for an expanded name, and the other for a local name and a namespace name. The expanded name is just a string of the name with the namespace embedded in the string inside curly braces, like this:

XName.Get("{mynamespace}root");

If you prefer you can use the other overloaded version and provide two strings. The equivalent XName in that case would be created like this:

XName.Get("root", "mynamespace");

Now, you are probably wondering why a static method is being used rather than a constructor. This is because the XML classes are clever enough to reuse existing XName objects. If you create a second XName object with the same characteristics as an existing XName object it will just reuse the existing XName. For example, the following code will output “True” to the console:

XName name1 = XName.Get("{ns}MyName");
XName name2 = XName.Get("MyName", "ns");
Console.WriteLine(object.ReferenceEquals(name1, name2));

XName is immutable (it cannot change) so this is a perfectly acceptable thing to do.

The expanded name notation also works if you are using strings while constructing your XElement. For example:

XElement root = new XElement("{mynamespace}root");

However, there is another way of applying namespaces in an XElement. You can use an XNamespace object and add it to the string. Like this:

XNamespace ns = XNamespace.Get("mynamespace");
XElement root = new XElement(ns + "root");

As you can probably tell the + operator has been overloaded so it can be used to add a namespace to a string to produce an XName.
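
To pull these together, the following sketch builds the same name four different ways and checks that they all compare equal. It is an illustration only (the class name and variable names are invented), but every call in it is one of those described above.

```csharp
using System;
using System.Xml.Linq;

static class NameSketch
{
    // Builds a name from a namespace and a local name using the overloaded + operator.
    public static XName Combine(string namespaceName, string localName)
    {
        XNamespace ns = XNamespace.Get(namespaceName);
        return ns + localName;
    }

    static void Main()
    {
        XName fromExpanded = XName.Get("{mynamespace}root");
        XName fromParts = XName.Get("root", "mynamespace");
        XName fromOperator = Combine("mynamespace", "root");
        XName fromString = "{mynamespace}root"; // implicit conversion from string

        // All four refer to the same name.
        Console.WriteLine(fromExpanded == fromParts
            && fromParts == fromOperator
            && fromOperator == fromString); // True

        Console.WriteLine(fromOperator.LocalName);     // root
        Console.WriteLine(fromOperator.NamespaceName); // mynamespace

        // The namespace is rendered as a default xmlns declaration.
        Console.WriteLine(new XElement(fromOperator)); // <root xmlns="mynamespace" />
    }
}
```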


A start on LINQ

I was at the Microsoft MSDN Roadshow today and I got to see some of the latest technologies being demonstrated for the first time and I’m impressed.

Daniel Moth’s presentation on the Language Enhancements and LINQ was exceptional – It really made me want to be able to use that technology now.

What was interesting was that the new enhancements don’t require a new version of the CLR to be installed. It still uses the Version 2.0 CLR. It works by adding additional stuff to the .NET Framework (this is .NET Framework 3.5) and through a new compiler (the C# 3.0 compiler). The C# 3.0 compiler produces IL that runs against CLR 2.0. In essence, the new language enhancements are compiler tricks, which is why the CLR doesn’t need to be upgraded. Confused yet?

Now that the version numbers of the various components are diverging it is going to make things slightly more complex. So here is a handy cheat sheet:

               2002         2003         2005     2006                 2007ish
Tool           VS.NET 2002  VS.NET 2003  VS 2005  VS 2005 + Extension  “Orcas”
Language (C#)  v1.0         v1.1         v2.0     v2.0                 v3.0
Framework      v1.0         v1.1         v2.0     v3.0                 v3.5
Engine (CLR)   v1.0         v1.1         v2.0     v2.0                 v2.0

The rest of Daniel’s talk was incredibly densely packed with information. Suffice to say, for the moment, that LINQ is going to provide some excellent and powerful features. However, it will also make it very easy to produce very inefficient code if wielded without understanding the consequences. The same can be said of just about any language construct, but LINQ does do a remarkable amount in the background.

After the session I was speaking with Daniel and we discussed the power of the feature. He said that, since C# 3.0 produces IL for CLR 2.0, it is possible to use existing tools, such as Lutz Roeder’s Reflector, to see exactly what is happening under the hood. An examination of that will yield a better understanding of how LINQ code is compiled.

LINQ code looks similar to SQL. For example:

var result =
    from p in Process.GetProcesses()
    where p.Threads.Count > 6
    orderby p.ProcessName descending
    select p;

This allows the developer to write set based operations in C# a lot more easily than before. A rough equivalent in C# 2.0 to do the same thing would probably look something like this:

List<Process> result = new List<Process>();
foreach (Process p in Process.GetProcesses())
{
    if (p.Threads.Count > 6)
        result.Add(p);
}
result.Sort(new DescendingProcessNameComparer());

* NOTE: Assumes that DescendingProcessNameComparer is an existing comparer that compares two Process objects by their name in descending order.
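
For completeness, here is one way the assumed comparer might be written, together with the same query expressed as ordinary method calls (which is roughly what the compiler turns the query syntax into). Treat it as a sketch: DescendingProcessNameComparer is the hypothetical class named in the note above, not a framework type, and the output depends entirely on which processes are running on your machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical comparer assumed by the C# 2.0 example: orders processes by name, descending.
class DescendingProcessNameComparer : IComparer<Process>
{
    public int Compare(Process x, Process y)
    {
        // Swapping the operands turns an ascending comparison into a descending one.
        return string.Compare(y.ProcessName, x.ProcessName, StringComparison.CurrentCulture);
    }
}

static class ProcessQuerySketch
{
    static void Main()
    {
        // C# 2.0 style: filter by hand, then sort with the comparer.
        List<Process> result = new List<Process>();
        foreach (Process candidate in Process.GetProcesses())
        {
            if (candidate.Threads.Count > 6)
                result.Add(candidate);
        }
        result.Sort(new DescendingProcessNameComparer());

        // C# 3.0 method syntax: the same filter and ordering as the query expression.
        IEnumerable<Process> viaMethods = Process.GetProcesses()
            .Where(proc => proc.Threads.Count > 6)
            .OrderByDescending(proc => proc.ProcessName);

        foreach (Process item in viaMethods)
            Console.WriteLine(item.ProcessName);
    }
}
```

One small difference worth knowing: List<T>.Sort is not a stable sort, whereas the LINQ ordering is, so processes with identical names may come out in a different relative order.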

C# 3.0 introduces the var keyword. This is unlike var in JavaScript or VB. It is not a variant type. It is strongly typed and the compiler will complain if it is used incorrectly. For example:

var i = 5;
i = "five"; // This will produce a compiler error because i is an integer

In short this was only a fraction of what I learned from just one session – I’ll continue the update as I can.


NOTE: This post was rescued from the Google Cache. The original date was Monday, 5th March 2007.


Original comments:

Glad you enjoyed it Colin 🙂

Be sure to check out the written version of my talk on my blog!

3/6/2007 11:34 AM | Daniel Moth