Handling bounces on Amazon SES

If you send to an email address that does not exist, Amazon SES will perform some handling of the bounce before passing the details on to you.

When you send email through Amazon SES you may notice that the email arrives with a Return Path that looks something like this: 00000331b8b1d648-b8302192-701f-124d-a1d5-d268912677de-135246@email-bounces.amazonses.com

As it happens, the large delimited hex number before the @ sign is the same value that you got back from the SendEmail or SendRawEmail response. (If you’re unfamiliar with sending an email, see the previous posts on SendEmail and SendRawEmail.)

// client is an AmazonSimpleEmailServiceClient
// request is a SendEmailRequest
SendEmailResponse response = client.SendEmail(request);
string messageId = response.SendEmailResult.MessageId;

When the email bounces, it will go first to Amazon SES, where they will note which email address bounced. The bounce is then forwarded on to you. (Be aware, tho’, that it may end up in your spam folder, as mine did.) Exactly where the bounce email will go depends on the API call you are using and the fields that you have populated in the outgoing email. The rules are detailed on the Bounce and Complaints notifications page of the Amazon SES Developer’s Guide.

If you look in the headers of this email you’ll see that Message Id again in various parts of the header. e.g.

X-Original-To: 00000331b8b1d648-b8302192-701f-124d-a1d5-d268912677de-135246@email-bounces.amazonses.com
Delivered-To: 00000331b8b1d648-b8302192-701f-124d-a1d5-d268912677de-135246@email-bounces.amazonses.com
Message-Id: <00000331b8b1d648-b8302192-701f-124d-a1d5-d268912677de-135246@email.amazonses.com>
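
Since the Message Id is just the part of the bounce address before the @ sign, pulling it back out of a header value is straightforward. Here is a minimal sketch; the class and method names are my own for illustration (they are not part of the AWS SDK), and it assumes the address has the shape shown above.

```csharp
using System;

public static class BounceAddressParser
{
    // Returns everything before the last '@' in an address,
    // or null if the input does not look like an address.
    public static string ExtractMessageId(string address)
    {
        if (string.IsNullOrWhiteSpace(address))
        {
            return null;
        }

        // Header values such as Message-Id are wrapped in angle brackets.
        string trimmed = address.Trim().TrimStart('<').TrimEnd('>');

        int at = trimmed.LastIndexOf('@');
        return at > 0 ? trimmed.Substring(0, at) : null;
    }
}
```

You could then match the extracted value against the MessageId you stored when you called SendEmail.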

How you process these bounces on your side is up to you. Amazon do not, yet (I’m hopeful they will and it has been requested a lot) provide an automated way of using the API for querying which emails are bouncing, are complained about or are rejected.

At present the best detail you are going to get on bounced emails is in the aggregate data provided through the GetSendStatistics API call or via the graphs on the AWS Console.

What happens if I send more email to an address that bounced?

If you continue to send emails to an address that bounces you will get a MessageRejectedException when you call SendEmail or SendRawEmail with the message “Address blacklisted.”

Conclusion on bounce handling

At present bounce handling using Amazon SES isn’t great (but it’s certainly no worse than using a plain old SMTP service). However, Amazon do appear to be interested in providing better support for handling bounces and the like, so it may very well be better supported in the future.

Verifying Senders with Amazon SES

I’ve already written a couple of pieces about Amazon Simple Email Service (SES) on sending Email and sending emails with attachments.

Why do you have to verify senders?

It is important to note that while in development mode you have to verify all recipients and senders, in production mode you still have to verify the senders (this is, presumably, an anti-spam measure to ensure the high quality of email).

If you attempt to send an email from an email address that is not registered you will get a MessageRejectedException when you call SendEmail or SendRawEmail with the message “Email address is not verified”.

Listing and verifying senders

You can add and view senders via the AWS Console, which is fine if all you need is to add the odd sender now and again. However, if your application is going to send on behalf of a number of people then you need a way to automate this.

The AWS API contains three methods that help with managing verified email addresses. You can VerifyEmailAddress, DeleteVerifiedEmailAddress and ListVerifiedEmailAddresses.

To Verify an email address

Here is the code to verify an email address

var config = new AmazonSimpleEmailServiceConfig();
var client = new AmazonSimpleEmailServiceClient(config);
VerifyEmailAddressRequest request = new VerifyEmailAddressRequest();
request.EmailAddress = "joe.bloggs@example.com";
var response = client.VerifyEmailAddress(request);

Then an email will be sent to the email address listed:

from        no-reply-aws@amazonaws.com via email-bounces.amazonses.com
to:         joe.bloggs@example.com
date:       13 November 2011 15:08
subject:    Amazon SES Address Verification Request
mailed-by:  email-bounces.amazonses.com

Dear Amazon SES customer:

We have received a request to authorize an email address for use with Amazon
SES.  To confirm that you are authorized to use this email address, please go
to the following URL:


Your request will not be processed unless you confirm the address using this

To learn more about sending email from Amazon SES, please refer to the Amazon
SES Developer Guide.

Sincerely, Amazon Web Services

Once you’ve clicked the link you’ll get a page with a message like this:


You have successfully verified an email address with Amazon Simple Email Service. You can now begin sending email from this address.

If you are a new Amazon SES user and have not yet received production access to Amazon SES, then you can only send email to addresses that you have previously verified. To view your list of verified email addresses, go to the AWS Management Console or refer to the Amazon SES Developer Guide.

If you have already been approved for production access, then you can send email to any address.

Thank you for using Amazon SES.

Once this message has been displayed, the email address will appear in the SES Console and you will be able to send email from it (in development mode it also means you will be able to send email to the address).

Listing the verified email addresses

In order to check the email addresses that have passed through the verification process you can use the method ListVerifiedEmailAddresses.

var config = new AmazonSimpleEmailServiceConfig();
var client = new AmazonSimpleEmailServiceClient(config);
var request = new ListVerifiedEmailAddressesRequest();
var response = client.ListVerifiedEmailAddresses(request);
var result = response.ListVerifiedEmailAddressesResult;
List<string> addresses = result.VerifiedEmailAddresses;

The addresses that have been successfully verified will be listed in the addresses list.

If the verification email has gone out (from VerifyEmailAddress or from the AWS Console) but the address has not yet been verified, then it won’t appear in the list.

Removing a verified email address

If you no longer need to send from an email address you can use the DeleteVerifiedEmailAddress method.

var config = new AmazonSimpleEmailServiceConfig();
var client = new AmazonSimpleEmailServiceClient(config);
var request = new DeleteVerifiedEmailAddressRequest();
request.EmailAddress = viewModel.NewEmailAddress;
var response = client.DeleteVerifiedEmailAddress(request);

Sending more than a basic email with Amazon SES

Previously, I wrote about getting started with Amazon’s Simple Email Service, and I included details of how to send a basic email. The SendEmail method is excellent at sending basic emails with HTML or Text bodies. However, it doesn’t handle attachments. For that, you need to use SendRawEmail.

SendRawEmail doesn’t give you much functionality. In fact, you have to do all the work to construct the email yourself. However, it does mean that you can do pretty much what you need with the email.

There are still some limitations. Amazon imposes a limit of 50 recipients per email, a maximum size of 10MB per email, and you can only add a small number of file types as attachments. This is, I suspect, in order to reduce the ability of people to use the service to spam and infect other people while permitting most legitimate uses of the service.
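
If you want to catch the recipient limit before Amazon SES rejects the message, a pre-flight check is easy enough to write. This is only a sketch: the class name is made up, the limit of 50 is simply the figure quoted above (it may have changed), and I am assuming the limit counts To, CC and Bcc recipients together.

```csharp
using System.Net.Mail;

public static class SesLimits
{
    // Figure quoted in the text above; treat it as an assumption.
    public const int MaxRecipients = 50;

    // Assumes To, CC and Bcc all count towards the limit.
    public static int CountRecipients(MailMessage message)
    {
        return message.To.Count + message.CC.Count + message.Bcc.Count;
    }

    public static bool IsWithinRecipientLimit(MailMessage message)
    {
        return CountRecipients(message) <= MaxRecipients;
    }
}
```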

Building an email

When I said that you have to do all the work to construct the email, I really did mean that. You have to figure out the headers, the way the multi-part MIME is put together, the character encoding (because email is always sent using a 7-bit encoding) and so on.

I tried to do this, and it was most frustrating work. The tiniest thing seemed to put Amazon SES into a sulk.

However, I did find a piece of code that someone else had written to do the heavy work for me. Essentially, what he’s doing is constructing a mail message using the built in System.Net.Mail.MailMessage type in .NET and then using .NET’s own classes to create the raw mail message as a MemoryStream, which is what Amazon SES wants.

I’ve refactored the code in the linked post so that it is slightly more efficient if you are calling it multiple times. It uses reflection, and some of the operations need only be carried out once regardless of the number of times you generate emails, so it removes those bits off to a static initialiser so that they only happen the once.

Here’s my refactored version of the code:

using System;
using System.IO;
using System.Net.Mail;
using System.Reflection;

public class BuildRawMailHelper
{
    private const BindingFlags nonPublicInstance =
        BindingFlags.Instance | BindingFlags.NonPublic;

    private static readonly ConstructorInfo _mailWriterContructor;
    private static readonly MethodInfo _sendMethod;
    private static readonly MethodInfo _closeMethod;

    // The reflection lookups only need to happen once.
    // These are internal framework members, so the names and
    // signatures may differ between .NET versions.
    static BuildRawMailHelper()
    {
        Assembly systemAssembly = typeof(SmtpClient).Assembly;
        Type mailWriterType = systemAssembly
            .GetType("System.Net.Mail.MailWriter");

        _mailWriterContructor = mailWriterType
            .GetConstructor(nonPublicInstance, null,
                new[] { typeof(Stream) }, null);

        _sendMethod = typeof(MailMessage).GetMethod("Send",
            nonPublicInstance);

        _closeMethod = mailWriterType.GetMethod("Close",
            nonPublicInstance);
    }

    public static MemoryStream ConvertMailMessageToMemoryStream(
        MailMessage message)
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
            object mailWriter = _mailWriterContructor.Invoke(
                new object[] {memoryStream});

            _sendMethod.Invoke(message, nonPublicInstance, null,
                                new[] {mailWriter, true}, null);

            _closeMethod.Invoke(mailWriter, nonPublicInstance,
                null, new object[] {}, null);

            return memoryStream;
        }
    }
}

At first glance, the fact that the MemoryStream is disposed of does seem a bit counter-intuitive, however some methods of MemoryStream still function when the stream is closed, such as ToArray().
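
If that behaviour seems surprising, it is easy to check in isolation. The following sketch (the class and method names are mine, purely for illustration) shows that ToArray() still returns the contents of a disposed MemoryStream, while an ordinary read throws an ObjectDisposedException.

```csharp
using System;
using System.IO;
using System.Text;

public static class DisposedStreamDemo
{
    // Writes some text, disposes the stream, then reads the bytes back.
    public static byte[] WriteThenDispose(string text)
    {
        var stream = new MemoryStream();
        using (stream)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            stream.Write(bytes, 0, bytes.Length);
        }

        // The stream is disposed here, but ToArray() still works.
        return stream.ToArray();
    }

    // Returns true if reading from a disposed stream throws.
    public static bool ReadThrowsAfterDispose()
    {
        var stream = new MemoryStream(new byte[] { 1, 2, 3 });
        stream.Dispose();
        try
        {
            stream.ReadByte();
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }
}
```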

Incidentally, if you want to see what the raw email looks like you can use a piece of code like this to get the raw email as a string:

MemoryStream memoryStream =
byte[] data = rawMessage.Data.ToArray();
using (StreamReader reader = new StreamReader(new MemoryStream(data)))
{
    string rawMail = reader.ReadToEnd();
}

Using SendRawEmail

Because you’re doing all the work, the code that actually interacts with Amazon SES is very simple.

// mailMessage is an instance of a System.Net.Mail.MailMessage
var config = new AmazonSimpleEmailServiceConfig();
var client = new AmazonSimpleEmailServiceClient(config);
SendRawEmailRequest request = new SendRawEmailRequest();
request.RawMessage = new RawMessage();
request.RawMessage.Data = BuildRawMailHelper
    .ConvertMailMessageToMemoryStream(mailMessage);
var response = client.SendRawEmail(request);

And that’s it. You can now send emails with attachments, and anything else you can do with a MailMessage.

First(OrDefault) Vs. Single(OrDefault)

There are two mechanisms (each with an …OrDefault variant) in LINQ for getting one item out of an enumeration. They are First and Single. There is a difference between the two and you can produce code that functions incorrectly if the wrong one is used.

So, what’s the main difference? They both sound like they’ll return just one item out from the enumeration. And, indeed, they do.

First will return the first item that it encounters that matches the predicate (if supplied), whereas Single will return the one and only item that matches the predicate (if supplied). If Single encounters a second item that matches the predicate then it throws an exception. If no predicate is supplied, it throws an exception simply if the enumeration has more than one item.

Why would there be two things that do almost the same thing that are so subtly different? First exists so that you can get the first item regardless of how many items there may actually be. Single exists to get you the one and only item. Single is useful when your predicate operates on a primary key. For example:

data.Single(d => d.PrimaryKey == idToMatch)

The …OrDefault variants will return the default value for the type (for reference types that will be null) if there are no matches found. Otherwise, both First and Single throw an exception if no items are encountered.

Let’s look at some code.


string[] data = new[]{"Zero", "One", "Two", "Three",
    "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"};
var first = data.First();

In this case, first will contain the value of "Zero".

If a predicate is added to the call to First then we can see what happens if there is no match.

string[] data = new[]{"Zero", "One", "Two", "Three",
    "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"};
var first = data.First(x => x.Length > 10);

In this case, there are no matches, and an InvalidOperationException is thrown with the message “Sequence contains no matching element”

The same exception type is thrown if the initial set of data is empty, although without a predicate the message is “Sequence contains no elements”:

string[] empty = new string[0];
var first = empty.First();

You can happily supply a predicate that may match more than one item in the enumeration


For example

string[] onlyOneItem = new string[]{"Only item"};
var single = onlyOneItem.Single();

This will return the one and only item that matches.

string[] data = new[]{"Zero", "One", "Two", "Three",
    "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"};
var single = data.Single();

This will throw an exception. If the result set contains more than one item, an InvalidOperationException will be thrown with a message of “Sequence contains more than one element”.

string[] empty = new string[0];
var single = empty.Single();

This will throw the same exception as its First counterpart: an InvalidOperationException with the message “Sequence contains no elements”.


This is where things get a little bit more interesting. With the …OrDefault variants, if the result set contains zero items then the default value (null for reference types) is returned. In the case of FirstOrDefault, the result set can contain zero, one or many items and it won’t throw an exception. In the case of SingleOrDefault, only result sets containing zero or one item will return, while any more will result in an exception.
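
As a quick illustration of those rules, here is a small sketch (the helper name Describe is just something I made up for the demonstration) that runs both …OrDefault methods over the same input and reports what came back.

```csharp
using System;
using System.Linq;

public static class OrDefaultDemo
{
    // Reports what FirstOrDefault and SingleOrDefault do with the same input.
    public static string Describe(string[] source)
    {
        string first = source.FirstOrDefault();

        string single;
        try
        {
            single = source.SingleOrDefault();
        }
        catch (InvalidOperationException)
        {
            // Thrown when there is more than one element.
            single = "<threw>";
        }

        return string.Format("First: {0}, Single: {1}",
            first ?? "<null>", single ?? "<null>");
    }
}
```

An empty array gives "First: <null>, Single: <null>", a one-item array gives the item twice, and anything longer gives the first item followed by "<threw>".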

So… what about this scenario:

string[] data = new[]{null, "Zero", "One", "Two", "Three",
    "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"};
var first = data.FirstOrDefault();

The first value of the set is genuinely null. How do you tell the difference between that and the result set being simply empty without throwing an exception?

You could just go back to using the First variant and catching the exception. Or you could (if your result set can be enumerated many times without issue, e.g. the underlying object is an Array or List) use Any to test if the set contains any data in advance. Like this:

string[] data = new[]{null, "Zero", "One", "Two", "Three",
    "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"};
if (data.Any())
{
    var first = data.FirstOrDefault();
    // Do stuff with the value
}
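
If the sequence can’t safely be enumerated twice, another option is a small helper that walks the enumerator once and reports whether it found anything. This is a sketch of my own rather than anything built in to LINQ; the names SequenceHelpers and TryFirst are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceHelpers
{
    // Returns true and the first element if there is one (even if that
    // element is null); returns false if the sequence is empty.
    public static bool TryFirst<T>(IEnumerable<T> source, out T value)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            if (enumerator.MoveNext())
            {
                value = enumerator.Current;
                return true;
            }
        }

        value = default(T);
        return false;
    }
}
```

Because the source is only enumerated once, this distinguishes "first item is null" from "no items" without catching an exception.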

Tip of the day: Expire a cookie, don’t remove it

I recently found a bug in my code that I couldn’t fathom initially until I walked through the HTTP headers in Firebug. In short, you cannot simply remove a cookie by calling Remove(cookieName) on the HttpCookieCollection. That will have no effect on the browser. You have to expire the cookie in order for it to be removed.

In other words, you need code like this:

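HttpCookie cookie = new HttpCookie("MyCookie");
cookie.Expires = DateTime.UtcNow.AddYears(-1);
// Add the expired cookie to the response so the browser receives the new expiry
Response.Cookies.Add(cookie);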

When you create a cookie, the response from the server will contain an HTTP Header called Set-Cookie that contains the value of the cookie.

For example, if we create a cookie like this:

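HttpCookie cookie = new HttpCookie("MyCookie");
cookie.Value = "The Value of the cookie";
// Attach the cookie to the response so that it is sent to the browser
Response.Cookies.Add(cookie);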

Then the Response will contain this:

Set-Cookie    MyCookie=The Value of the cookie; path=/

Each subsequent request to the server will contain the cookie, like this:

Cookie        MyCookie=The Value of the cookie

The responses from the server do not contain the cookie unless the server is updating the value of the cookie.

When the cookie is to be removed forcefully, the server must update the cookie with a new expiry, like this:

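HttpCookie cookie = new HttpCookie("MyCookie");
cookie.Expires = DateTime.UtcNow.AddYears(-1);
// Add the expired cookie to the response so the browser receives the new expiry
Response.Cookies.Add(cookie);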

The response will then have this header:

Set-Cookie    MyCookie=; expires=Mon, 20-Sep-2010 21:32:53 GMT; path=/

And in subsequent requests the cookie won’t be present any more as the browser will have removed it.
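
To make the mechanism concrete, here is a sketch that builds a header value of the same shape as the one above by hand. It is not how ASP.NET produces the header internally (the class and method names are mine); it only shows that "removing" a cookie amounts to sending an empty value with an expiry date in the past.

```csharp
using System;
using System.Globalization;

public static class CookieHeaderSketch
{
    // Builds a Set-Cookie value with an empty cookie value and the given expiry.
    public static string BuildExpiredSetCookie(string name, DateTime expiresUtc)
    {
        string expires = expiresUtc.ToString(
            "ddd, dd-MMM-yyyy HH:mm:ss", CultureInfo.InvariantCulture);

        return string.Format("{0}=; expires={1} GMT; path=/", name, expires);
    }
}
```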

Installing a web site on a new server

Here are some blog posts that have been useful to me lately when I got caught out installing a website on a new server (I will eventually get that automated build and deploy process actually performing the deploy step successfully!!)

The configuration section ‘system.web.extensions’ cannot be read because it is missing a section declaration:

While installing a website on a new Windows Server I came across this error. In short, it was because the App Pool was set up as a .NET 2.0 application rather than a 4.0. The blog post explains what was going on and how to fix it.

[Resolved] Could not load file or assembly ‘XXXXX’ or one of its dependencies. An attempt was made to load a program with an incorrect format:

Although this didn’t help me in the end, it does suggest a solution. In my case, because of a third-party dependency that requires an x86 build, it couldn’t be used. In time that dependency will be removed, in the meantime the following was more helpful to me…

Could not load file or assembly ‘PresentationCore’ or one of its dependencies. An attempt was made to load a program with an incorrect format. : A solution:

This post did give me the pointer I needed to the setting that had to be changed to get the web site working.

LINQ query performance

A while ago I was reviewing some code and I came across some code that looked like this

if (corpus.Where(a => a.SomeProperty == someValue).Count() > 0)
{
    // Do Stuff
}

And it got me thinking that it may not be the best way to do this. What is really being asked here is: “Are there any items in the enumerable?” The count is not actually important in this situation. I considered that it would probably be more efficient to write:

if (corpus.Where(a => a.SomeProperty == someValue).Any())
{
    // Do stuff
}

Then I read somewhere (unfortunately, I didn’t note the URL) that in certain scenarios the .Any() extension method on IEnumerable<T> can be inefficient. One example is when the concrete type is actually a List<T>, which maintains its own Count. In that instance the cost of setting up the Enumerator and calling MoveNext() to determine the existence of at least one element would be greater than that of calling Count on the List<T>.

I was curious about that so I set about working out the relative performance characteristics of a number of the LINQ extension methods. I should note that these were all on LINQ to Objects out of the box so don’t measure how these methods would perform relatively for things like, say, LINQ to SQL.

I tested various scenarios. In some the IEnumerable<T> is a lightweight generator of elements (in this case Enumerable.Range(…)); in others I used a List<T>, either via a concrete reference or via a reference to the IEnumerable<T> interface.

All timings in this post relate to my desktop machine, which is running Windows 7 64-bit with 8GB RAM and an AMD Phenom II X4 955 running at 1.6GHz (for some unknown reason it won’t run at the full 3.2GHz).

Counting elements

In the first set of tests I counted the number of elements. For the cases where I called the Count property directly on the List<T> and used the Count() extension method on IEnumerable<T> where the concrete type was the List<T> the result was returned in O(1). The LINQ method was 24 times slower.

Where the IEnumerable<T> did not also implement the ICollection<T> interface (as in the case where the values were being generated by Enumerable.Range(…) method) the Count() extension method took O(n) time to return the answer.

The graph above shows the number of Ticks (vertical axis) taken to complete the counting task with n (horizontal axis) elements. A tick is roughly 1/1600th of a millisecond. So for 2000000 elements it took 72.5ms to count them.

Compare that for instances where the Count property was called directly (0.00413 Ticks or 0.00258µs [millionths of a second]) or where the Count() method was called on something that could be cast to an ICollection or ICollection<T> (0.0989 Ticks or 0.0618µs)

So far it looks good for cases where the underlying type implements the ICollection<T> or ICollection interface. However remember as soon as you start filtering the data (e.g. with a Where() method call) then you are returning an IEnumerable<T> which then operates in O(n) time. Also remember that the Where() clause will add some overhead as it has to process the filter as well.

Any elements

It should be no surprise that using our test set of a List<T> and an Enumerable.Range(…) the Any() method runs in O(1) time. Both took similar amounts of time, the former taking 0.278 Ticks (0.174µs) per call, and the latter taking 0.296 Ticks (0.185µs) per call. I suspect that the extra time on the latter is due to the small amount of additional time taken to generate elements as the enumerator progresses.

However, if you have a reference to something that already implements ICollection<T> which defines a Count property, such as a List<T>, you may find it is faster to perform (corpus.Count>0). I found that for the List<T> I’d created for the test runs it was only marginally slower than the raw call to Count taking 0.00607 ticks (0.00379µs) per call.

Any elements with filter

If you have a filter (a Where clause) then Any may take longer than O(1). It will take as long as it takes to find the first element that matches the filter, or O(n) if nothing matches the filter.

I ran three tests, one where the filtered condition was met on the first element, one where the condition was met in the middle of the set and one where the condition was not met until the last element.


If you have a concrete type the performance is better when using the Count property, both for cases when you need to know the number of elements in the corpus and when you need to know if there are any elements at all.

If you simply need to know if there are any elements at all in the corpus then Any() works out better than the LINQ extension method Count(), as Count() must traverse the entire corpus (unless it implements ICollection<T>), whereas Any() will short-circuit at the first available opportunity.
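
You can see the short-circuiting without any timing at all by counting how many elements each method pulls from a generator. This sketch is not the test harness I used for the timings above; the names are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShortCircuitDemo
{
    // Yields 0..n-1 and invokes the callback for each element handed out.
    private static IEnumerable<int> Counted(int n, Action onYield)
    {
        for (int i = 0; i < n; i++)
        {
            onYield();
            yield return i;
        }
    }

    // Count() has to walk the whole sequence.
    public static int PullsForCount(int n)
    {
        int pulls = 0;
        Counted(n, () => pulls++).Count();
        return pulls;
    }

    // Any() stops after the first element.
    public static int PullsForAny(int n)
    {
        int pulls = 0;
        Counted(n, () => pulls++).Any();
        return pulls;
    }

    // Where(...).Any() stops at the first match, or walks everything if none.
    public static int PullsForAnyWithFilter(int n, int target)
    {
        int pulls = 0;
        Counted(n, () => pulls++).Where(x => x == target).Any();
        return pulls;
    }
}
```

For a generator of 1000 elements, Count() pulls all 1000, Any() pulls one, and the filtered Any() pulls as many as it takes to reach the first match.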

Tip of the day: Splitting a string when encountering whitespace

In .NET the string class has a Split method that splits the string at the separator character(s) that you specify. However, if you want to split the string at any instance of whitespace you don’t have to create a Split call that enumerates all those different types of whitespace… and there are actually quite a lot! Instead you can just call Split without any parameters and it will split at whitespace regardless of the type.
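
One thing to watch for: with the parameterless call, consecutive whitespace characters produce empty entries in the result. If you don’t want those, you can pass a null separator array along with StringSplitOptions.RemoveEmptyEntries, which also splits on whitespace. A minimal sketch (the class and method names are just for illustration):

```csharp
using System;

public static class WhitespaceSplit
{
    // Splits on whitespace; runs of whitespace yield empty strings.
    public static string[] SplitKeepingEmpties(string source)
    {
        return source.Split();
    }

    // A null separator array also means "split on whitespace",
    // and the option drops the empty entries.
    public static string[] SplitDroppingEmpties(string source)
    {
        return source.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }
}
```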

For example, the following program, in which I hope I’ve managed to use all the different types of whitespace in Unicode, will produce the output below:

static void Main(string[] args)
{
  string source = "An\u0020inspired\rcalligrapher\ncan\u1680create\u180epages\u2000of\u2001"+
    "Whitespace\u0085For\u00a0the win!";

  string[] words = source.Split();

  foreach(string word in words)
  {
    Console.WriteLine(word);
  }
}

Produces this output:




Building messages in parallel

I recently saw some code where the developer was attempting to build up messages inside tasks that were being reported outside of the task.

In a sequential system it is easy enough to do this. You have various options available to you, such as

  • message += …;
  • StringBuilder
  • Streams

However, in a parallel system these all fall down because you lose control over the sequencing. You can regain some control by using appropriate locks but then you add in bottlenecks around the synchronisation points which is something you want to minimise in a parallel system.
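
To make the lock option concrete, here is a minimal sketch of it (my own illustration, with made-up names, not the code I was reviewing). Each task builds its own message privately and only takes the lock for the brief moment it appends to the shared StringBuilder, which keeps the bottleneck as small as possible.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;

public static class LockedBuilderExample
{
    public static string Build()
    {
        var builder = new StringBuilder();
        object sync = new object();

        Parallel.For(0, 10, i =>
        {
            // Build the individual message privately; no lock needed here.
            var line = new StringBuilder();
            for (char c = 'A'; c <= 'Z'; c++)
            {
                line.Append(i).Append(c);
            }

            // Only the append to the shared builder is serialised.
            lock (sync)
            {
                builder.AppendLine(line.ToString());
            }
        });

        return builder.ToString();
    }
}
```

The individual messages come out intact, but their order still depends on which task reaches the lock first.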

I’ll show you what I mean. Each example below is attempting to build up a large message containing messages from smaller subroutines. For the moment, let’s assume that the exact order of the individual messages is not important. It may be a series of log entries, or a list of errors to correct. The only important thing is that each individual message is not garbled in any way.

The example message is actually just a set of letters and numbers. In the final message each letter must appear 10 times and each number 26 times. Once the tasks have finished, the final messages are examined to see what happened.

Sequential Reference code

Here is the code:

class Program
{
    static void Main(string[] args)
    {
        string result = SequentialReference();
        ShowResult(result);

        Console.WriteLine("Program finished");
    }

    private static string SequentialReference()
    {
        string result = string.Empty;

        for(int i=0; i<10; i++)
        {
            for(char c='A'; c<='Z'; c++)
            {
                result += string.Format("{0}{1}", i, c);
            }
            result += Environment.NewLine;
        }

        return result;
    }

    private static void ShowResult(string message)
    {
        // Code to display the message and the
        // results of the tests
    }
}

The code generates the messages, then outputs the results. For the reference sequential code (which is what we want all the results to look like) we get:


Does the result contain all the necessary parts?
10 of each letter; 26 of each number
0: 26 occurrences: Pass
1: 26 occurrences: Pass
2: 26 occurrences: Pass
3: 26 occurrences: Pass
4: 26 occurrences: Pass
5: 26 occurrences: Pass
6: 26 occurrences: Pass
7: 26 occurrences: Pass
8: 26 occurrences: Pass
9: 26 occurrences: Pass
A: 10 occurrences: Pass
B: 10 occurrences: Pass
C: 10 occurrences: Pass
D: 10 occurrences: Pass
E: 10 occurrences: Pass
F: 10 occurrences: Pass
G: 10 occurrences: Pass
H: 10 occurrences: Pass
I: 10 occurrences: Pass
J: 10 occurrences: Pass
K: 10 occurrences: Pass
L: 10 occurrences: Pass
M: 10 occurrences: Pass
N: 10 occurrences: Pass
O: 10 occurrences: Pass
P: 10 occurrences: Pass
Q: 10 occurrences: Pass
R: 10 occurrences: Pass
S: 10 occurrences: Pass
T: 10 occurrences: Pass
U: 10 occurrences: Pass
V: 10 occurrences: Pass
W: 10 occurrences: Pass
X: 10 occurrences: Pass
Y: 10 occurrences: Pass
Z: 10 occurrences: Pass
Does the result contain correctly sequenced individual messages?
Each sequence 52 chars; 0A0B0C... 1A1B1C.... etc.
Message 0: PASS - 52 char; PASS - Message content as expected
Message 1: PASS - 52 char; PASS - Message content as expected
Message 2: PASS - 52 char; PASS - Message content as expected
Message 3: PASS - 52 char; PASS - Message content as expected
Message 4: PASS - 52 char; PASS - Message content as expected
Message 5: PASS - 52 char; PASS - Message content as expected
Message 6: PASS - 52 char; PASS - Message content as expected
Message 7: PASS - 52 char; PASS - Message content as expected
Message 8: PASS - 52 char; PASS - Message content as expected
Message 9: PASS - 52 char; PASS - Message content as expected
Program finished

String Concatenation in parallel

The first bad parallel example is this one, where the message is built up using string concatenation.  The code is almost identical to the sequential example, except that the for loop is now a Parallel.For and I’ve injected a Sleep to simulate performing other work (such as getting the data necessary to build the messages).

class Program
{
    static void Main(string[] args)
    {
        string message = StringConcat();
        ShowResult(message);

        Console.WriteLine("Program finished");
    }

    private static string StringConcat()
    {
        string result = string.Empty;

        Parallel.For(0, 10,
                        (int i) =>
                            {
                                for (char c = 'A'; c <= 'Z'; c++)
                                {
                                    // Simulate other work (the delay value here is illustrative)
                                    Thread.Sleep(1);
                                    result += string.Format("{0}{1}",i, c);
                                }
                                result += Environment.NewLine;
                            });

        return result;
    }
}

And the results are starkly different:


Does the result contain all the necessary parts?
10 of each letter; 26 of each number
0: 26 occurrences: Pass
1: 24 occurrences: Fail
2: 26 occurrences: Pass
3: 24 occurrences: Fail
4: 22 occurrences: Fail
5: 20 occurrences: Fail
6: 16 occurrences: Fail
7: 21 occurrences: Fail
8: 26 occurrences: Pass
9: 26 occurrences: Pass
A: 10 occurrences: Pass
B: 10 occurrences: Pass
C: 8 occurrences: Fail
D: 9 occurrences: Fail
E: 8 occurrences: Fail
F: 10 occurrences: Pass
G: 9 occurrences: Fail
H: 8 occurrences: Fail
I: 9 occurrences: Fail
J: 9 occurrences: Fail
K: 9 occurrences: Fail
L: 10 occurrences: Pass
M: 8 occurrences: Fail
N: 9 occurrences: Fail
O: 9 occurrences: Fail
P: 9 occurrences: Fail
Q: 9 occurrences: Fail
R: 10 occurrences: Pass
S: 10 occurrences: Pass
T: 9 occurrences: Fail
U: 8 occurrences: Fail
V: 8 occurrences: Fail
W: 7 occurrences: Fail
X: 9 occurrences: Fail
Y: 9 occurrences: Fail
Z: 8 occurrences: Fail
Does the result contain correctly sequenced individual messages?
Each sequence 52 chars; 0A0B0C... 1A1B1C.... etc.
Message 0: FAIL - Expected 52 / Got 232 characters
Message 1: FAIL - Expected 52 / Got 2 characters
Message 2: FAIL - Expected 52 / Got 2 characters
Message 3: FAIL - Expected 52 / Got 2 characters
Message 4: FAIL - Expected 52 / Got 2 characters
Message 5: FAIL - Expected 52 / Got 222 characters
Program finished

As you can see, some of it works out… Most of it is a mess!

So what happened?

The string that will contain the result was created outside of the parallel tasks. Inside the tasks the result was updated without any synchronisation structure in place. That means that all the tasks could update the intermediate stages of the result and in the process overwrite each other’s changes, insert items out of sequence and so on.

I’ve written before about some of the ways that the Parallel Extensions can help with synchronisation of data across parallel tasks (e.g. the ConcurrentDictionary being used to help with the aggregation of grouped counts), so perhaps here is an example of where another of the concurrent collections may come in handy. A ConcurrentBag could be used to hold each of the individual completed messages.

A ConcurrentBag is an unordered collection of objects that you can access across multiple threads in a safe way. You can add the same object to the bag as many times as you like. As it is unordered you cannot rely on the objects being retrieved in any particular sequence.

The code that builds the messages now looks like this:

private static string ConcurrentBagExample()
{
    ConcurrentBag<string> bag = new ConcurrentBag<string>();

    Parallel.For(0, 10,
                    (i) =>
                    {
                        string result = string.Empty;
                        for (char c = 'A'; c <= 'Z'; c++)
                        {
                            result += string.Format("{0}{1}", i, c);
                        }
                        // The finished message is added to the bag
                        bag.Add(result);
                    });

    return string.Join(Environment.NewLine, bag);
}

What has changed is that the building of the string has moved inside the task. This means that the task can only build the string for itself. Once it is done the string is added to the ConcurrentBag. The final string is built outside the parallel tasks. At the end of the method a simple string.Join() is used to pull all the data that’s been built up in the ConcurrentBag.

And now the messages are formed correctly. The only difference between the output of this program and that of the sequential reference program (see above) is the ordering of the individual messages.
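
If the order of the messages does matter, one option is to give each task its own slot in a pre-sized array instead of using a bag. This is a sketch of an alternative rather than part of the original example, and the class name is made up. Each task writes only to the element at its own index, so no locking is needed, and joining the array afterwards puts the messages back in sequence.

```csharp
using System;
using System.Threading.Tasks;

public static class OrderedParallelExample
{
    public static string Build()
    {
        string[] messages = new string[10];

        Parallel.For(0, messages.Length, i =>
        {
            string result = string.Empty;
            for (char c = 'A'; c <= 'Z'; c++)
            {
                result += string.Format("{0}{1}", i, c);
            }

            // Each task writes only to its own slot.
            messages[i] = result;
        });

        return string.Join(Environment.NewLine, messages);
    }
}
```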


Parallel Tasks and the HttpContext

A few days ago I spotted a question on StackOverflow by someone trying to use a parallel loop in an ASP.NET application. It may have been an ASP.NET MVC application (I don’t recall) but the issue is the same.

This person had some code in a parallel task that was using the HttpContext object. I would be hesitant to use that object in the first instance as I don’t know how thread safe it is. I suspect that since it holds a lot of information about the state of a request/response that it would be quite dangerous to access an instance in many threads.

His main issue was that he was getting a null back from HttpContext.Current inside the parallel tasks.

ASP.NET is already multithreaded. It abstracts most of that away so that when you are writing against it you only really ever see the request you are currently dealing with. Many other requests are happening around you, but the framework does its best to shield you from that so that you can write code cleanly. It is also its downfall in some cases.

If you don’t realise what the framework is doing for you then you could very easily fall into a number of traps when you get to the edges of that abstraction. For example, someone might use HttpContext.Current inside parallel tasks without realising that there must already be multiple requests being handled, and therefore multiple simultaneous HttpContext objects floating around masquerading as the Current context. It can become very difficult to track down bugs if you don’t know the constraints of what Current means in this… erm… context.

Ultimately, HttpContext.Current is only available on the thread that you started with in ASP.NET. If you create new threads then it is no longer available unless you explicitly set it yourself.
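
The effect is easy to reproduce without ASP.NET at all. The sketch below uses a [ThreadStatic] field as a stand-in for an ambient "current" value. That is an analogy of my own and not a description of how HttpContext.Current is actually implemented, but it shows the same symptom (null on a new thread) and the usual workaround of capturing what you need on the original thread first.

```csharp
using System;
using System.Threading;

public static class AmbientContextDemo
{
    // Stand-in for a per-thread "current" value (an analogy only).
    [ThreadStatic]
    private static string _current;

    // Reads the ambient value from a new thread: it comes back null.
    public static string ReadAmbientOnNewThread()
    {
        _current = "request-42";
        string seen = "unset";

        var worker = new Thread(() => seen = _current);
        worker.Start();
        worker.Join();

        return seen;
    }

    // Captures the value on the original thread, then uses the copy.
    public static string ReadCapturedOnNewThread()
    {
        _current = "request-42";
        string captured = _current;
        string seen = null;

        var worker = new Thread(() => seen = captured);
        worker.Start();
        worker.Join();

        return seen;
    }
}
```

In a real application I would capture the specific values needed (a user name, a URL and so on) rather than the context object itself, given the doubts above about how thread safe it is.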