Spatial Data in a .NET application

THIS POST REFERS TO THE NOVEMBER 2007 CTP (CTP 5) OF SQL SERVER 2008

Following from my last post about Getting Started with Spatial Data in SQL Server 2008 in the database, how about actually using it in an application? Since that is where most data will end up at some point, it must be possible to use the spatial data in an application some how.

So, using the Town table created before and populated with our fictional village of Pitcardine I wrote this very simple program to see what actually came through ADO.NET to the end application.

static void Main(string[] args)
{
    SqlConnection connection =
        new SqlConnection("Server=(local);"+
            "database=SpatialSandbox;Trusted_Connection=Yes;");
    SqlCommand command =
        new SqlCommand("SELECT * FROM Town", connection);
    connection.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            for (int i = 0; i < reader.FieldCount; i++)
            {
                object value = reader.GetValue(i);
                Console.WriteLine("{0} ({1}) = {2}",
                    reader.GetName(i), value.GetType(), value);
            }
        }
    }

    Console.ReadLine();
}

The result was:

TownId (System.Int32) = 1
Name (System.String) = Pitcardine
TownGeom (Microsoft.SqlServer.Types.SqlGeometry) = POLYGON ((0 0, 370 0, 370 160
, 200 250, 0 250, 0 0))
TownGeomText (System.String) = POLYGON ((0 0, 370 0, 370 160, 200 250, 0 250, 0
0))

At this point I’m looking around to find the assembly that I can reference in my .NET application. After some looking I found it in C:MSSQLMSSQL10.MSSQLSERVERMSSQLBinn however that directory is not guaranteed as it depends on your SQL Server installation. It would be nice if the assembly showed up in the .NET tab in the Add Reference dialog rather than require the developer to hunt it down. (I’m guessing that this is due to the CTP nature of the product I’m currently using)

Once the column value can be accessed as an SqlGeometry object in the code it becomes possible to query the object. However, the results are pretty much all SQL data types. For example:

SqlGeometry geometry = (SqlGeometry)reader.GetValue(2);
SqlDouble sqlArea = geometry.STArea();
double area = sqlArea.Value; // could also cast to double

Note that STArea() is a method, not a property as you might expect in this sort of scenario. I suspect that each time it is called the area is worked out which may be an expensive operation hence keeping the method syntax.

Getting started with Spatial Data in SQL Server 2008

THIS POST REFERS TO THE NOVEMBER 2007 CTP (CTP 5) OF SQL SERVER 2008

This post is probably going to be a wee bit random. After the running around over the last couple of weeks with the MSDN event (Sharepoint for Developers) in Edinburgh, trying to get the Developer Day Scotland website up, an invite to a VBUG event in Reading, the Community Leaders Day at the IoD, DDD6 and the TechNet event of Andrew Fryer‘s 8 Reasons to migrate to SQL Server 2008 I’ve finally got round to trying CTP5 of SQL Server 2008. I actually installed it in a virtual machine within hours of it becoming available for download, but it is only now I’m getting round to trying it out.

First off lets start with the way spatial data is held in SQL Server. There are two spatial types, geometry and geography. Although they sounds very close there is a fair difference between then and it is probably best not to confuse them. However, both types are very well named.

Geometry is a simple two-dimensional grid with X,Y coordinates. The British National Grid is an example of this. I would guess geometry would be most useful in systems where data comes in a specific planar/flat-earth projection, or where mapping of small areas (such as the internals of a building) are needed. Lengths and areas for geometry are easy to work out. The coordinates will have a unit of measurement attached, for instance the National Grid in the UK is in metres, so the distance between any two points can be worked out by simple Pythagorean maths and the result returned in the same unit of measurement as the coordinates.

Geography fits the spatial data on a sphere with lat/long coordinates. This is a better choice for international data or for countries where the land mass is simply too big to fit in one planar projection. However, it is important to realise that lat/long is still projected. There are various schemes for fitting a lat/long position to a place on the earth. It is important to know which is being used otherwise data from different sources may not match up. It is not so simple to calculate distances and areas on a geography type as the distance between two coordinates changes depending on where those coordinates are. For example, a line that is 5º from east to west is smaller the closer to the pole it gets with the largest distance at the equator.

According to the documentation geography also has some other limitations. No geography object may be greater in size that a hemisphere. When loading data into the database it will generate an ArgumentException if tried, and if the result of an spatial operation results in a geography greater than a single hemisphere then the result will be null.

Finally, before getting on with some code, a note on SRID (Spatial Reference Identifier). Each piece of spatial data must be tagged with an SRID. Geometry types can have a SRID of 0 (which means undefined) but geographies must have a defined SRID. By default geography types use an SRID of 4326 which equates to WGS84. Spatial operations can only occur between spatial types with the same SRID. The result of spatial operations between two pieces of data with different SRIDs is null.

With that brief introduction to geometry and geography how do you create data in the database.

CREATE TABLE Town
(
    TownId int NOT NULL,
    Name nvarchar(256),
    TownGeom geometry)

To populate the column there are a number of ways of getting the data in. Currently SQL Server supports WKT (Well Known Text), WKB (Well Known Binary) and GML (Geographic Markup Language). For other data types converters will need to be written. The following example shows WKT:

INSERT Town
VALUES(1, 'Pitcardine',
       geometry::STGeomFromText(
       'POLYGON ((0 0, 370 0, 370 160, 200 250, 0 250, 0 0))', 0));

It is also possible to use the more precise method STPolyFromText. Naturally the parser will be more strict about what WKT it accepts when using the more specialised methods. For example if the WKT for a line string is sent to the STPolyFromText method the error would look like this:

Msg 6522, Level 16, State 1, Line 2
A .NET Framework error occurred during execution of user defined routine or aggregate 'geometry':
System.FormatException: 24142: Expected POLYGON at position 1. The input has LINESTR.
System.FormatException:
   at Microsoft.SqlServer.Types.OpenGisWktReader.RecognizeToken(String token)
   at Microsoft.SqlServer.Types.OpenGisWktReader.ParsePolygonTaggedText()
   at Microsoft.SqlServer.Types.OpenGisWktReader.ReadPolygon()
   at Microsoft.SqlServer.Types.SqlGeometry.STPolyFromText(SqlChars polygonTaggedText, Int32 srid)
.

At present it doesn’t seem to be possible to return the column as GML as the method isn’t found. The documentation for the method doesn’t work either so I suspect it is a feature that isn’t ready yet.

GIS and Me

Years ago, when I left university I, along with some friends, set up a company to develop a GIS product. We then went after some venture capital and just around that time rumours filtered through that Microsoft were hiring some of the best minds in GIS and we thought “Oh, bugger!” our newly formed business would not be able to compete with Microsoft. We were left in a state of shock when Microsoft release MapPoint. It was rubbish! (At least it was rubbish as a competitor to our product.) You couldn’t do even half of what our product could do or our customers were asking for in terms of new features. MapPoint was strong on route planning, but that was the only competitive edge over our own product. We didn’t do route planning at all. MapPoint offered very little in the way of spatial processing and, as far as we could see, was just a very limited way to display data on a map.

However, it looks like things are changing. Spatial data support was finally released in the November CTP of SQL Server 2008. There is no graphical front end, but the processing ability is there and that is the more important part. It is easy to put a new front end on to the data. Microsoft have some solutions in that area already such as Virtual Earth and MapPoint. There are also other vendors out there who have their own, more powerful, front end solution.

My only reservation about these new abilities, at least in the UK, is that spatial data sets are very expensive. Unlike the USA the UK government holds the copyright on the data it produces and it then charges a king’s ransom for access to it. This stifles business’s ability to use spatial data effectively and it will have to resort to poorer quality datasets or being restricted in the ways that it can use the data.A business could, for example, use map data generated by OpenStreetMap. Doing that stifles the way the business can use the data as OpenStreetMap is released under a Creative Commons BY-SA (Attribution Share-Alike) license. This means that anything a company does with OpenStreetMap data must be attributed to OpenStreetMap (not an onerous condition, if they use OS data they must include a Crown Copyright notice) and the derivative works it creates must be covered under the same license – How many companies are going to be willing to do that?

According to the OSM website “CC-By-Sa do not force you to make any source data available. You are only restricted in the license you choose for distributed data”. On the face of it that sounds like companies can use the license so long as the map is kept internally. Well, I can’t really see how that is possible. If one employee creates a derivative work, say a sales chart, and passes it out at a meeting then it has been distributed and must therefore be under the CC-By-Sa license and once in that state the company cannot stop its distribution. Obviously I am not a lawyer, but that is my interpretation of the license.

Currently, I’m looking at ways to put together demos and articles on using the Spatial data types in SQL Server 2008, however I cannot us OS data because it is too expensive and I cannot use OpenStreetMap because I’m not willing to release my articles, presentations and demos (my derivative works) under that license. I’d also like to use UK data because that is where I’ll be doing my presentations/demos.

If anyone has links to vector data that I can use then I’d love to know about it.

Geek Dinners

A few days ago I was invited to a dinner by Tim Leung of VBUG in Reading. Thursday night I attended that dinner. It was a dinner with a difference.

First of all Tim gave a short presentation on some of the new things he’d picked up at TechEd. As it was a small group of us around a table at the back of the restaurant it was much easier to engage the speaker and the presentation was more interactive as a result. By the end of the presentation a number of conversations had been started based on what had been discussed previously.

Afterwards came the food. The conversation continued in its technology vein. Everyone had something to share and some advice to give.

What was most interesting about this over other geek dinners that I have attended is the combination of the presentation at the start and the small (less than 10) number of attendees. I think the combination made for an excellent more intimate evening and I think that everyone got something useful out of it.

This is something I’d like to replicate in Scotland and interestingly earlier this evening Frank Kerrigan, my co-conspirator at the west of Scotland branch of Scottish Developers, was suggesting the same kind of thing.

Watch this space – we will be arranging something in the new year.

To much to do, not enough time to blog about it!

I’ve not posted any blog entries because I’ve been working on some things that I’m hoping will benefit a lot of people.

Right now I’m installing SQL Server 2008 CTP 5 (November 2007) in a virtual machine. Earlier today I downloaded the RTM of Visual Studio 2008 Team Suite and I’ll be installing that on my desktop PC shortly.

So, what have I been working on? Mainly something that is going to get announced at DDD6 if I can hold back my enthusiasm. It is going to be good and I’m really excited.

I’ve still got some preparation to do before DDD6 for my session on Recruitment. Although not as much as other speakers as most of my session will be impromptu and I’ll just be going with the flow (so long as it isn’t flowing over too many rapids).

Scottish Developers have also been active. Last week we had a double bill in conjunction with the Scottish SQL Server User Group with an Introduction to Reporting Services by Frank Kerrigan and SMO by Steve Phillips. One bit of feedback we will be taking into account is not having two main presentations back to back. Although everyone appreciated the content given many people also commented that there just wasn’t enough time for both subjects to be covered well. So, rather than trying to cram as much as possible in, we will be returning to one main speaker for future evening events.

Frank’s Introduction to Reporting Services slide deck is available on Scottish Developers, as is Steve’s SQL Server Managed Objects slide deck and source code. (Note: you will have to register, for FREE, in order to download the information)

We also managed to give away a lot of swag. Everyone got a SQL Server fleece, which was absolutely perfect for a cold November and one lucky person won a complete set of “Inside SQL Server 2005” books worth nearly £100.

So, what does Scottish Developers have lined up for next year. Well, I don’t really know – I’ve not got anything firmed up yet, but the feedback we’ve been getting from events consistently puts “Patterns and Practices” as the number one item people want to see talks about. So, in this regard, Gary Short has agreed to make the trek from Dundee again to talk about his favourite patterns (which he will be doing at DDD6) or the web client software factory (a repeat of the talk he did at the VBUG conference) or possibly even something else patterns and practices related. I’ll also be keeping a look out for any other talks in the area of patterns and practices.

We have some close ties to the Scottish SQL Server User Group so no doubt there will be a SQL Server talk on the agenda at some point – I’ll even be weighing in on this front as I am very keen on all the new spatial querying that is going to be available in SQL Server 2008.

DALs and the DAAB

I’ve been pondering something that was raised in passing earlier this week and that is the relationship between a DAL (Data Abstraction Layer) and the DAAB (Data Access Application Block).

It was briefly mentioned in a conversation that I had that the DAAB provides the functionality of a DAL because the developer doesn’t need to worry about the back end database that is being used. I suppose to some extent that is true. However, I don’t believe that it fully functions as a DAL.

To my mind a DAL abstracts the access of data away from the rest of the application. Most people seem to restrict this view to data being held in a database.  But databases are not the only repository of data. Data can be held in plain text files, CSV files, XML files and many other formats. It doesn’t need to arrive by file, it could be data from a service or other mechanism.

If you treat sources of data as being more than a database then the DAAB is not a suitable substitute for building a DAL.

Also the DAAB has some limitations in that it cannot translate the SQL itself. For example the flavour of SQL in Oracle has differences to the flavour of SQL in SQL Server. This means that any SQL code will have to be translated. One possible solution is to ensure that everything is done through stored procedures. Then all that the DAAB needs is the stored procedure’s name and the values for the parameters.

But what of passing stored procedure names and parameters to the DAAB? Wouldn’t they need to be known in the business layer? Surely the business layer should know absolutely nothing about the database? Absolutely, the business layer should not be concerning itself at all with the database. It shouldn’t know about stored procedure names, parameters or anything else, even if the DAAB takes care of figuring out the details under the hood from information picked up from the config file. The Business Layer should just need to know there is a DAL and a method on the DAL can be called and some results come back. How the DAL does anything is of no concern to the business layer.

A quick and dirty test, in my opinion, is to look out for any classes from the System.Data namespace in the business or presentation layer to determine if the DAL is well designed.

In my mind the DAAB is just a tool that can be used to make the creation and maintenance of a DAL easier when dealing with databases. It makes it easy to change the location of databases as the development process moves along from the developers’ machines, to continuous integration, test, pre-live and finally live (or what ever your process calls for). The argument that the DAAB makes it easy to swap out one type of database for another isn’t something that is actually going to be done all that often. From what I’ve seen, companies generally run two systems concurrently until the old one is discontinued. Rarely do they ever actually update the old system to use the new database and when they do it is usually via some form of orchestration system so the old system doesn’t need to be changed in any great way. If it isn’t broke don’t fix it.

Tags: