Inserting geometry through a .NET Application

THIS POST REFERS TO THE NOVEMBER 2007 CTP (CTP 5) OF SQL SERVER 2008

Following from my previous posts (Getting started with Spatial Data in SQL Server 2008, Spatial Data in a .NET application) on the new spatial features of SQL Server 2008 I’ve been looking at how to get spatial data into SQL Server from a .NET application. This has not been as easy as expected.

I suspect I just have not found the right piece of documentation because my eventual solution isn’t one I’m entirely happy with.

I was unable to add a SqlGeometry as a parameter to the collection of parameters on the SqlCommand object in my .NET application. The SqlGeometry does not appear in the enumeration for SQL Server data types. My first thought was to put the data through as binary as I vaguely remember reading something about using binary in something I read recently, but I couldn’t quite remember where. However, when I created the geometry object in .NET then used .STAsBinary() on it so the parameter object would accept it the INSERT failed. The exception message was:

A .NET Framework error occurred during execution of user defined routine or aggregate 'geometry':
System.FormatException: One of the identified items was in an invalid format.
System.FormatException:
   at Microsoft.SqlServer.Types.GeometryData.Read(BinaryReader r)
   at Microsoft.SqlServer.Types.SqlGeometry.Read(BinaryReader r)
   at SqlGeometry::.DeserializeValidate(IntPtr , Int32 )
.
The statement has been terminated.

 

Yes, that was the MESSAGE from the exception. The stack trace above comes from within SQL Server itself. There is a separate stack track for my application. (I’m guessing if you are familiar with CLR stored procedures you may have seen this sort of thing before. I’ve not used the CLR code in SQL Server before, so it is a new experience)

The solution I found was to mark the parameter as an NVarChar, skip the creation of an SqlGeometry object and use a string containing the WKT (Well Known Text) representation. For example:

cmd.Parameters.Add("@Point", SqlDbType.NVarChar) = "POINT (312500 791500)";

I mentioned earlier that I’m not all that happy with this solution. That is because if I already have an SqlGeometry object I don’t want to have to convert it to a human readable format and have SQL Server parse it back again. Human Readable formats tend not to be the most efficient way of representing data and the geometry type already has a more efficient (as I understand it) format. Although in this case I bypassed the creation of an SqlGeometry object in my code, I can foresee cases where I might have created an SqlGeometry object through various operations and would have produced a fairly complex object. In those cases I wouldn’t want the additional overhead formatting it into WKT, passing it over the network and parsing it on the other side.

If I find a better solution I will post it.

Getting started with Spatial Data in SQL Server 2008

THIS POST REFERS TO THE NOVEMBER 2007 CTP (CTP 5) OF SQL SERVER 2008

This post is probably going to be a wee bit random. After the running around over the last couple of weeks with the MSDN event (Sharepoint for Developers) in Edinburgh, trying to get the Developer Day Scotland website up, an invite to a VBUG event in Reading, the Community Leaders Day at the IoD, DDD6 and the TechNet event of Andrew Fryer‘s 8 Reasons to migrate to SQL Server 2008 I’ve finally got round to trying CTP5 of SQL Server 2008. I actually installed it in a virtual machine within hours of it becoming available for download, but it is only now I’m getting round to trying it out.

First off lets start with the way spatial data is held in SQL Server. There are two spatial types, geometry and geography. Although they sounds very close there is a fair difference between then and it is probably best not to confuse them. However, both types are very well named.

Geometry is a simple two-dimensional grid with X,Y coordinates. The British National Grid is an example of this. I would guess geometry would be most useful in systems where data comes in a specific planar/flat-earth projection, or where mapping of small areas (such as the internals of a building) are needed. Lengths and areas for geometry are easy to work out. The coordinates will have a unit of measurement attached, for instance the National Grid in the UK is in metres, so the distance between any two points can be worked out by simple Pythagorean maths and the result returned in the same unit of measurement as the coordinates.

Geography fits the spatial data on a sphere with lat/long coordinates. This is a better choice for international data or for countries where the land mass is simply too big to fit in one planar projection. However, it is important to realise that lat/long is still projected. There are various schemes for fitting a lat/long position to a place on the earth. It is important to know which is being used otherwise data from different sources may not match up. It is not so simple to calculate distances and areas on a geography type as the distance between two coordinates changes depending on where those coordinates are. For example, a line that is 5º from east to west is smaller the closer to the pole it gets with the largest distance at the equator.

According to the documentation geography also has some other limitations. No geography object may be greater in size that a hemisphere. When loading data into the database it will generate an ArgumentException if tried, and if the result of an spatial operation results in a geography greater than a single hemisphere then the result will be null.

Finally, before getting on with some code, a note on SRID (Spatial Reference Identifier). Each piece of spatial data must be tagged with an SRID. Geometry types can have a SRID of 0 (which means undefined) but geographies must have a defined SRID. By default geography types use an SRID of 4326 which equates to WGS84. Spatial operations can only occur between spatial types with the same SRID. The result of spatial operations between two pieces of data with different SRIDs is null.

With that brief introduction to geometry and geography how do you create data in the database.

CREATE TABLE Town
(
    TownId int NOT NULL,
    Name nvarchar(256),
    TownGeom geometry)

To populate the column there are a number of ways of getting the data in. Currently SQL Server supports WKT (Well Known Text), WKB (Well Known Binary) and GML (Geographic Markup Language). For other data types converters will need to be written. The following example shows WKT:

INSERT Town
VALUES(1, 'Pitcardine',
       geometry::STGeomFromText(
       'POLYGON ((0 0, 370 0, 370 160, 200 250, 0 250, 0 0))', 0));

It is also possible to use the more precise method STPolyFromText. Naturally the parser will be more strict about what WKT it accepts when using the more specialised methods. For example if the WKT for a line string is sent to the STPolyFromText method the error would look like this:

Msg 6522, Level 16, State 1, Line 2
A .NET Framework error occurred during execution of user defined routine or aggregate 'geometry':
System.FormatException: 24142: Expected POLYGON at position 1. The input has LINESTR.
System.FormatException:
   at Microsoft.SqlServer.Types.OpenGisWktReader.RecognizeToken(String token)
   at Microsoft.SqlServer.Types.OpenGisWktReader.ParsePolygonTaggedText()
   at Microsoft.SqlServer.Types.OpenGisWktReader.ReadPolygon()
   at Microsoft.SqlServer.Types.SqlGeometry.STPolyFromText(SqlChars polygonTaggedText, Int32 srid)
.

At present it doesn’t seem to be possible to return the column as GML as the method isn’t found. The documentation for the method doesn’t work either so I suspect it is a feature that isn’t ready yet.

SQL Server 2008 (July CTP): More installation woes

I’ve done it again. I’ve attempted to install SQL Server 2008 (this time the July 2007 CTP) and it hasn’t quite gone the way I expected.

This time there was no Management Studio. Why? Well, when I tried to go through the set up process again, I got a message that said “The following components you chose to install are already installed on the machine… Workstation components, Books Online and development tools 9.2.3042.00”. The first time round I just ignored it, but this time I paid more attention and clicked the button to give me details. It said:

Name: Microsoft SQL Server 2005 Tools Express Edition
Reason: Your upgrade is blocked. For more information about upgrade support, see the “Version and Edition Upgrades” and “Hardware and Software Requirements” topics in SQL Server 2008 Setup Help or SQL Server 2008 Books Online.

Okay – So, I already have SQL Server 2005 express edition installed, but there is still no management studio and no books online (that I can see). I espcially like the way it helpfully directs me to go to a topic in a help file it failed to install so I can resolve my problem.

Not to worry though… I discovered that the books online for the July CTP are available as a separate download too.

Also, I’m not entirely sure why it is talking about upgrades. I was planning to have it sit side-by-side, just like all the previous versions. But, it looks like I can’t do that. The BOL says “Presence of SQL Server 2005 Management Tools or BI Development Studio will block installation of SQL Server 2008 Management Tools and BI Development Studio.”

Oh, well… I guess intalling the July CTP in the same virtual machine as the Orcas Beta 2 isn’t going to work – I’m just going to have to install it in its own virtual PC now.

Tags:

Installing SQL Server 2008 (Katmai)

I’m just installing SQL Server 2008 (Codenamed Katmai) June 2007 CTP. The installation isn’t without its problems – which is fine for the moment. It is a CTP after all.

The main problem is with the prerequisits. The installer will try an install them for you, but it fails to get as far as installing even the .NET Framework. It would seem that these things have to be installed manually in advance at the moment.

If installing the full thing SQL Server 2008 does require access to IIS. I don’t think I’ll by using any of those features so I’m not going to bother – we’ll see how that works out. It will be interesting to see if it is easy to go back and add features or if it will require a complete reinstall. I’d imagine that most people will install just the minimum they need and go back and install more as they need it. This helps keep the attack surface minimised.

So, other than the installation of the prerequisits being a bit a bit dodgy everything else seems to have gone very smoothly. Hopefully, I’ll be playing with all the new spatial stuff. That’s what I’m really looking forward to.

Tags:

The benefits of Stored Procedures

There are a number of ways to access data in SQL Server, or any enterprise DBMS. There are lots of books that discuss getting data in and out of databases and the best ways to do that. Many advocate the use of stored procedures to ensure the safety of the data.

The tree main benefits that I see on stored procedures are:

  • Abstraction
  • Security
  • Performance

Stored Procedures add an extra layer of abstraction in to the design of a software system. This means that, so long as the interface on the stored procedure stays the same, then the underlying table structure can change with no noticable consequence to the application that is using the database.

For instance, if the database has to be denormalised to get a little extra performance in certain situations then the stored procedures can handle the additional updates and inserts necessary to ensure the integrity of the data across the tables. Without this the each of the callers would have ensure that these changes had taken place. Of course, the use of stored procedures does not in anyway grant a waiver from properly designing the data model, but it can help if the perfect normalised model has to give way for performance improvements.

This layer of abstraction also helps put up an extra barrier to would be intruders. If access to the data in SQL Server is only ever permitted via stored procedures then permission does not need to be explicitly set on any of the tables. Therefore none of the tables should ever need to be exposed directly to outside applications. For an outside application to modify the database, it must go through stored procedures.

Stored procedures can be written to validate any input that is sent to them to ensure the integrity of the data beyond the simple constraints otherwise available on the tables. Parameters can be checked for valid ranges. Information can be cross checked with data in other tables.

Even if it is thought that someone attempting to crack into a website will never get this far in, from a security perspective, anything that can reduce the attack surface is beneficial. 

Performance can be improved by the use of stored procedures. They are precompiled so when they are run there is no additional lag as the SQL is parsed, compiled, execution plans drawn up and then run, they just run because all that extra work is done at the time the CREATE PROCEDURE or ALTER PROCEDURE commands are run rather than when procedures themselves are run.

Another area in which stored procedures improve performance is that is pushes all the work onto the server in one go. A stored procedure can perform a series of queries and return many tables in, what is to the outside world, one operation. This saves the calling process from making many requests and the additional time of several network roundtrips. It also means that, if the contents of one set of data being returned is dependent on the results of a previous set of data that is being retrieved through the same stored procedure, that the data only has to flow from the database server to the application. If stored procedures were not being used it would mean that the data from the first database call has to get sent back to the database for the second call in order for it to continue retrieving the information needed by the application.

For instance. Lets say that Northwind traders send out a quarterly statement to its customers, and that for each statement certain information needs to be extracted from the database. The tables Customer, Order and Order Details are used. This information could be retrieved in several steps by calling the database for each set of information as it is needed to generate the statements. First with a SELECT * FROM Customers WHERE CustomerID = @CustomerID. This gets the details for the head of the statement. Then a SELECT * FROM Orders WHERE CustomerID = @CustomerID AND OrderDate>=@StartDate AND OrderDate<=@EndDate to get the details for each individual order by that customer. And finally a series of calls (one for each of the Order records that were retrieved) like SELECT * FROM [Order Details] WHERE OrderID = @OrderID

Assuming that the customer in question is “Rattlesnake Canyon Grocery” and the period for the statement is Q1 1998 then that is 5 roundtrips to the database and 5 times the database has to parse some SQL. This could be done by a single stored procedure that takes only one trip to the database and is precompiled.

 

CREATE PROCEDURE GetQuarterlyStatement
@CustomerID nvarchar(5),
@StartDate datetime,
@EndDate datetime
AS
SELECT * FROM Customers
    WHERE CustomerID=@CustomerID
SELECT * FROM Orders
    WHERE CustomerID=@CustomerID
    AND OrderDate>=@StartDate
    AND OrderDate<=@EndDate
    ORDER BY OrderDate DESC
SELECT [Order Details].* FROM [Order Details]
    INNER JOIN Orders ON [Order Details].OrderID = Orders.OrderID
    WHERE CustomerID=@CustomerID
    AND OrderDate>=@StartDate
    AND OrderDate<=@EndDate
    ORDER BY OrderDate DESC
GO

 

The stored procedure is now doing in one trip what previously took 5 trips. Of course, this example is somewhat contrived for brevity, in a real application there would be joins to the product tables, and the columns would be listed rather than using SELECT * and so on.

NOTE: This was rescued from the Wayback Machine. The original date was Friday, 1st October, 2004.

Tags:


Original comments:

A drawback using stored procedures is portability. For each targetted DBMS, you may have to rewrite SP code. If a client side abstraction such as ODBC or ADO is used, one can keep a common code base, making CM and Release Engineering activities easier. Of course it comes with a price – performance.

10/12/2004 12:23 AM | Jörgen

 

Jörgen, While that is a potential drawback, in my experience when moving from one DBMS to another the client side code has to change anyway regardless of client side abstraction. However, with a well structured DAL, the impact can be kept to a minumum.

10/12/2004 12:28 AM | Colin Angus Mackay

SQL Exception because of a timeout

You think it would be easy to find information on exactly what error number a SqlException has when the command timed out. But, it isn’t. MSDN, for all that it is normally an excellent resource, fails to even mention that the Number property of a SqlException or SqlError can even be negative. (I suppose if I am to be fair, it doesn’t say it can be positive either). What it does say is exactly this:

This number corresponds to an entry in the master.dbo.sysmessages table.[^]

If you look at the sysmessages table, you will notice that all the numbers are positive. So, why am I concerned about negative numbers? Because sometimes the SqlException.Number property returns negative numbers, and therefore it does not correspond to an entry in the sysmessages table. So, the documentation is not telling the whole story.

I want to find out when an SqlException happened because of a timeout. The easiest, and I should imagine, the more reliable way of checking a SqlException for any particular error message is to check the Number property in case the message description has been localised or changed for some other reason. For most cases this is perfectly fine. I can check the sysmessages table for the description for the error that I am wanting to handle and I can check for that number when I catch a SqlException. But, there isn’t any error number for a timeout.

The exact error message that is in the SqlException is

Error: System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

So, if the error is potentially because the server isn’t responding then the error message cannot be generated on SQL Server itself. How would it generate this error message if the server isn’t responding? The error, I can only assume, is being generated by ADO.NET. Why do I say I can only assume? Because after spending some time fruitlessly googling for a definitive answer I have yet to find one.

The best answer I’ve found is some sample code, that happens to check the Number property of the SqlException in a switch statement and next to case -2: is the comment /* Timeout */

So, at the moment I’m working on the assumption that -2 just means a timeout caused the execution of the SqlCommand to fail. I don’t like developing software in this way because if there is a bug in the future I cannot be certain that this code is okay or if sometimes -2 is something else, or perhaps sometime a timeout is something other than -2

NOTE: This was rescued from the Wayback Machine. The original date was Monday 17th October, 2005.

Tags:


Original Comments:

-2 is the error code for timeout, returned from DBNETLIB, the MDAC driver for SQL Server. So, its not ADO.NET.

A full list of possible error numbers can be found by using Reflector on System.Data.SqlClient.TdsParser.ProcessNetLibError.

10/18/2005 1:35 AM | Steve Campbell

Steve: That is most helpful. I feel much more confident that my code is doing the right thing now. Many thanks.

10/18/2005 6:48 AM | Colin Angus Mackay

Oh No! More on SQL Injection Attacks

I’ve not written about this in a while becuase it seemed that people were getting the message. But today I was asked, on Code Project, “I am wondering why injecting values into the [SQL] string is considered a security risk?” Here is my response: Because if you inject strings into the SQL, especially ones that come straight from the user interface, then an attacker can produce malformed SQL and gain access to your system. (Where do you live? I can come and do one of my SQL Injection Attack presentations in your town if you want a real live demonstration where I compromise a SQL Server into divulging the inner most secrets of the server it is running on. And I mean the whole server, not just the SQL Server process.*) Lets say you have a simple bit of SQL like this:

cmd.CommandText = "SELECT * FROM Products where Name = '"+txtSearch.Text+"'";

What happens if the user types in the following?

'; DELETE FROM Products; --

The whole string becomes:

SELECT * FROM Products where Name = ''; DELETE FROM Products; --

That will return a dataset back to the application, which is what it expects, and then deletes all the products from the database. When the next customer comes to the website what is it going to show when there are no products in the database? Okay – there may be some constraints on the table (foreign key constraints) that don’t permit the rows to be deleted. How about something equally damaging to the company. Let’s set their entire inventory to a penny! The mallicious user then types:

'; UPDATE Products SET Price = 0.01; --

The word will quickly spread around the internet and the company will soon be out of business or have a huge number of very pissed off customers. If you don’t secure your system the possibilities for attack are endless. Finally, if you want to know more, I encourage you to read my article SQL Injection Attacks and Tips on How To Prevent Them

* The demonstration is done on a server box that I own. Performing a SQL Injection Attack on a system without the permission of the system owner is a breach of the 1990 Misue of Computers Act and can carry a penalty of 5 years in jail.

NOTE: This was rescued from the Google Cache. The original date was Wednesday 7th June, 2006.


Original comments: As a MCSD I was entertained by the examples that you provided. I’m relieved however that your not a chemist or a physics guru or something… Nicely done…

7/10/2006 10:03 PM | Dee Illuminati

SQL Injection Attacks and executing dynamically created SQL

There is a very important difference between EXEC[UTE] and sp_executesql that anyone who executes dynamically generated SQL statements ought to know.

Typically dynamic SQL is generated when a particular construct is not possible by using parameters alone or when certain parts are added to the statement depending on other conditions. In the latter case, sp_executesql trumps EXEC[UTE] by allowing the developer the ability to pass in parameters to the dynamic SQL statement.

For example, consider this code:

SELECT *
FROM MyTable
WHERE a = @wantedA
AND b = @wantedB

If you were dynamically building this and were using EXEC then the code to dynamically build and execute it might look like this:

DECLARE @sql NVARCHAR(4000);
SET @sql = N'SELECT * FROM MyTable '+
           N'WHERE a = '''+@wantedA+N''' AND b = '''+@wantedB+N'''';
EXEC(@sql)

As you can probably guess, without extreme care as to the values of @wantedA and @wantedB an SQL Injection Attack is possible. However, it is possible to dynamically create the SQL statement and still use parameters within it like this:

SET @sql = N'SELECT * FROM MyTable '+
           N'WHERE a = @dynWantedA AND b = @dynWantedB';
sp_executesql @sql, N'@dynWantedA varchar(100), @dynWantedB varchar(100)',
              @dynWantedA = @wantedA, @dynWantedB = @wantedB;

As you can see in the second example, instead of injecting the value of the parameters we can just write parameters directly into the dynamic SQL statement and then pass them in.

There is, of course, caution to be exercised. Certain things cannot take parameters. For example, in SQL Server 2000 the TOP keyword must be followed by a literal value. It isn’t possible to write TOP @numRows* so if that must be dynamic then the value would have to be injected into the SQL statement like this:

SET @sql = 'SELECT TOP '+CAST(@numRows AS varchar(10))+' * FROM MyTable';

So, using sp_executesql is not a panacea that will make all issues with SQL injection go away when building dynamic SQL, but it does help in certain cases.

* This is possible in SQL Server 2005, but not SQL Server 2000

NOTE: This was rescued from the Google Cache: The original date was Thursday, 26th January, 2006.

Moving Databases

If you ever move a database from one SQL Server to another you may come across the situation where the logins no longer map to the users in your database (and that’s assuming that the SQL Server you’ve moved the database to has the same logins).

If the new SQL Server does have the same logins then you can fix the mapping by using sp_change_users_login. The neat thing is that if the user and login names already match then there is an “Auto Fix” setting. And if you just don’t know what is mismatched there is a “Report” option too.

NOTE: This was rescued from the Google Cache. The original was dated Saturday 1st July, 2006.

Tags: