Normalising the data model

Sometimes I see someone on a forum trying to get a SQL statement to yield data in a particular way, but the data model is thwarting their attempts, or, if they do get something to work, the SQL statement that does the job is horrendously complex. This tends to happen because the data is not normalised (or “normalized” if you are using the American spelling) to third normal form. Denormalised data models tend to arise for two reasons: firstly, because the modeller is inexperienced and does not realise the faux pas they have made in the model; and secondly, because the modeller has found that a properly normalised data model just doesn’t have the performance needed to do the job required.

The Scenario

In the example scenario I am going to present, a private education company is looking to build a system that helps track their tutors and students. So as not to be overwhelming, I am only going to concentrate on one aspect of the system – the tutor. A tutor may be multilingual and can teach in a variety of languages, and they may also be able to teach a number of subjects. The Tutors table has joins to a table for languages and a table for subjects. The model looks like this:

The denormalised data model
Partially denormalised data model

As you can see there are three joins from Tutors to Languages and four joins from Tutors to Subjects. This makes queries across these tables particularly complex. For example, to find out the languages that a tutor speaks, a query like this has to be formed:

SELECT  l1.Name AS LanguageName1,
        l2.Name AS LanguageName2,
        l3.Name AS LanguageName3
FROM Tutors AS t
LEFT OUTER JOIN Languages AS l1 ON l1.LanguageID = t.Language1
LEFT OUTER JOIN Languages AS l2 ON l2.LanguageID = t.Language2
LEFT OUTER JOIN Languages AS l3 ON l3.LanguageID = t.Language3
WHERE t.TutorID = @TutorID

So, what happens if the tutor is fluent in more than three languages? Either the system cannot accept the fourth language, or the schema will have to be changed to accommodate it. If the latter option is chosen, imagine the amount of work needed to make that change.

A similar situation occurs with the join to the Subjects table.

Solution

A better way to handle this sort of situation is with a many-to-many join. Many database systems cannot create a many-to-many join between two tables directly and must use an intermediate table. For those database systems that appear to model a many-to-many join directly (GE Smallworld comes to mind), what actually happens is that an intermediate table is created in the background; it isn’t normally visible and the database takes care of it automatically.

The resulting data model will look like this:

The normalised data model
Normalised data model
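The intermediate (junction) table itself is small. As a sketch, assuming the table and column names shown in the diagram (the exact data types and constraints are assumptions you would adapt to your own schema), it could be created like this:

```sql
-- Sketch of the intermediate table; names are taken from the diagram,
-- types and constraints are assumptions to adapt to your own schema.
CREATE TABLE TutorLanguage
(
    TutorID    INT NOT NULL REFERENCES Tutors (TutorID),
    LanguageID INT NOT NULL REFERENCES Languages (LanguageID),
    CONSTRAINT PK_TutorLanguage PRIMARY KEY (TutorID, LanguageID)
)
```

The composite primary key has a useful side effect: a tutor cannot accidentally be registered for the same language twice.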

This allows a tutor to register any number of languages or subjects. It also makes joins on the data easier, as there is no longer a duplicated join for each language or subject. The earlier SELECT statement can be rewritten as:

SELECT  l.Name AS LanguageName
FROM Tutors AS t
INNER JOIN TutorLanguage AS tl ON tl.TutorID = t.TutorID
INNER JOIN Languages AS l ON tl.LanguageID = l.LanguageID
WHERE t.TutorID = @TutorID

This will result in one row being returned for each language rather than all the languages being returned in one row. It is possible to pivot the results back into one row, but in SQL Server 2000 that would add more complexity to the query than I am willing to discuss in this article. If you want to know how to pivot results in SQL Server 2000, see the page on Cross-Tab Reports in the SQL Server Books Online. SQL Server 2005 will allow PIVOTed results directly. For more information on the differences between the SQL Server 2000 and 2005 ways of doing things, see: Pivot (or Unpivot) Your Data – Windows IT Pro.
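For illustration only, here is a sketch of what the SQL Server 2005 PIVOT syntax might look like over this model; the language names in the IN list are assumptions, and in practice a crosstab over an arbitrary set of languages would need dynamically built SQL:

```sql
-- Hedged sketch of SQL Server 2005 PIVOT over the normalised model.
-- COUNT is used as the aggregate, so each language column holds 1 if
-- the tutor speaks that language and 0 if not.
SELECT TutorID, [English], [French], [German]
FROM (SELECT tl.TutorID, l.Name, l.Name AS Spoken
      FROM TutorLanguage AS tl
      INNER JOIN Languages AS l ON l.LanguageID = tl.LanguageID) AS src
PIVOT (COUNT(Spoken) FOR Name IN ([English], [French], [German])) AS p
```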

Migrating existing data

Naturally, if you have existing data using the denormalised schema and you want to migrate it to the normalised schema, you will need to be careful about the order in which changes are made lest you lose your data.

  1. Create the intermediate table.
  2. Change any stored procedures using the denormalised schema to the normalised schema.
    • You may also need to change code outside the database. If you find yourself needing to do this then I strongly recommend that you read about the benefits of stored procedures.
  3. Perform an insert for each of the denormalised joins into the intermediate table.
  4. Remove the old joins.

If possible, the above should be scripted so that the database changes happen as quickly as possible; depending on your situation, you may have to take your production system off-line while making the change. Testing the changes in a development environment first should ensure that the scripts are well written and don’t fall over when run against the production database.

To move the denormalised Language joins to the normalised schema, SQL like this can be used:

INSERT INTO TutorLanguage (TutorID, LanguageID)
    SELECT TutorID, Language1 AS LanguageID
    FROM Tutors
    WHERE Language1 IS NOT NULL
UNION
    SELECT TutorID, Language2 AS LanguageID
    FROM Tutors
    WHERE Language2 IS NOT NULL
UNION
    SELECT TutorID, Language3 AS LanguageID
    FROM Tutors
    WHERE Language3 IS NOT NULL

It can, of course, be written as a series of individual INSERT INTO…SELECT statements rather than one large UNIONed SELECT.
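For example, the first of those individual statements might look like the following; the explicit column list is an assumption about the intermediate table’s layout, and guards against relying on column order:

```sql
-- One of the three individual statements, repeated for Language2 and
-- Language3; a primary key on TutorLanguage would reject duplicates
-- that the UNION would otherwise have removed.
INSERT INTO TutorLanguage (TutorID, LanguageID)
SELECT TutorID, Language1
FROM Tutors
WHERE Language1 IS NOT NULL
```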

NOTE: This was rescued from the Google Cache. The original date was Sunday 3rd April 2005.


Creating an informative workspace

Earlier this evening Agile Scotland ran a presentation by Rachel Davies called Creating an Informative Workspace which I found really interesting. The basic idea is that you permit developers to define their workspace to ensure that communication improves. The “Informative Workspace” is concentrated around “Big Visible Charts”. The following is a write up of my notes from the presentation.

The idea of the Big Visible Chart is to communicate the plan, the progress, the test results and the domain model. This is simply because it is too much information to keep in our heads all at once.

At the smaller end of the scale are the charts that each team member puts up in their own space. These charts contain information that the individual developer finds useful, but they remain visible to all who want to view them. They become “Information Radiators”. That is, they radiate information so that other people can see the state of things without having to interrupt the developer. It is well known that an interruption that lasts maybe only a few seconds has a knock-on effect, as the developer then has to get back into the topic they were concentrating on in the first place.

It is important that the developer is careful about what they present. The chart needs to be understandable so that people don’t go away with incorrect ideas or have to interrupt the developer. It also needs to be readable from a distance.

The team should be permitted to maintain their own space. Management should not bog down the developers’ charts with information that is not useful to the developers. If the information is not useful to the developers then it will distract them from the job at hand.

The team could, for example, use a team wall with lots of index cards pinned to it. Coloured stickers could be added to the cards to increase visibility and add information that can be easily spotted from a distance. At the more high-tech end of the scale, a plasma screen could be used to display information such as the build results from CruiseControl.

The Big Visible Chart should also include the “big picture” so that everyone is reminded of what is being worked towards. For instance: a scrum burn-down chart; timelines showing what has been done and what is still to do; the release date of the iteration, along with what is in the release and what is not.

All the information on the charts is a form of feedback that can be used to modify actions and keep everything on track. However, beware of too much feedback, which will overload the developers with information. If there is too much noise then people will filter it out and valuable information will not be recognised.

The process of creating the charts, and defining what is put on them, should be reviewed periodically throughout the project. It is important to have a “retrospective” meeting to discuss what went well, what went badly, and what is just puzzling. Update the charts as a result to ensure that the best information is available. These meetings could be about once per month, and their length could vary from an hour to an afternoon.

In the retrospective meeting, do root cause analysis. Sometimes the process needs to be rethought rather than just speeded up, as the observations made might be a symptom of a deeper problem. Also, beware of focusing too much on one problem: it is often the case that by moving on to another part, the solution to the original problem becomes more apparent.

Getting positive feedback to developers is also important. If a developer can find out how useful the system they are building actually is to the users, or how it has benefited the business, it can increase morale and improve productivity.

Some teams have mood indicators on their Big Visible Chart. This works well for indicating the mood of individual developers; however, it isn’t so good at showing the mood of the whole team.

Since the majority of Big Visible Charts are paper-based and have to be updated manually, it is important to ensure that they are updated regularly. For example, there could be a daily stand-up meeting around the chart where it is reviewed and updated. It is important that information is removed when it is no longer relevant.

NOTE: This was rescued from the Google Cache. The original date was Tuesday, 26th April 2005.



Original comments:

Hmmm. I read this and I can’t help but think that it’s pop psychology coupled with poor information management. The worst thing about information, when working in a group, is that everyone expresses it differently. There’s a lot to be said for a consistent approach, legibility, etc. That’s why the team lead should be the one orchestrating this stuff.

A lot of this is a rehash of the classic Gantt chart showing milestones, who’s working on what, critical paths, who’s accomplished what, etc.

I guess the thing I’ve never understood about XP is that “they”, the XP proponents, feel they have to come up with all sorts of goofy communication tactics, like “scrum” and “mood indicators”.

XP is not a “one size fits all” solution. Nothing really is. This “Big Visible Chart” idea, sure, is a tool, in fact it’s something I’ve employed before XP was even invented. It’s not appropriate all the time though. Sometimes it can be a big time waster.

In my book, a better starting place would be to take the existing management tools and figure out how they need to be tweaked for the particular team makeup, project requirements, and management style.

My 2c. Yeah, I guess I’m pretty opinionated.

4/26/2005 10:59 PM | Marc

Hmm… I guess i go more for Little Visible Tables. Gobs of ’em, tacked all over my walls. Get the colors and layout right, and i can pick out what i’m looking for just by turning my chair, no leaning necessary.

Oh, well, this doesn’t really do anyone else much good, but… by the time someone is standing near enough to my cube to read anything, they’re probably already asking me.

That said, having a great big whiteboard around can be a very handy thing on occasion.

4/27/2005 1:27 AM | Shog9

Marc,

XP is not Scrum. They are both different ideas within the Agile movement.

XP has never said it was a one-size-fits-all. A common theme in the Agile movement is picking the correct methodology for the project. That is partly why it should be the team that chooses their workspace. Also, you said it should be the team lead that orchestrates “this stuff”. Remember that the team lead is part of the development team. It is important to get feedback from the team, otherwise the developers in the team will feel that their ideas are not worth anything. If, after some discussion, some ideas get left at the side then that’s fine; at least the team thought about the idea and weighed it against the goals of the team. This is as much about morale boosting as it is about getting information efficiently over to developers. If the information is no use to the developers then it is waste (another part of the Agile movement is Lean Development, which seeks to eliminate as much waste as possible).

The problem with Gantt charts is that they become out of date very quickly. Especially as the project moves on and new information comes to light as the client is actually coaxed to ‘fess up about what they really want rather than the vague fuzzy requirements they came out with in the first place.

The “Big visible chart” is just a tool, I quite agree with that, and if I suggested otherwise then that was not my intention. It should be used along with many other tools.

Yes, Marc, you are opinionated 🙂 But that is good, because now I have to try and justify myself rather than just blindly accepting the presenter’s viewpoint.

4/27/2005 12:21 PM | Colin Angus Mackay

eXtreme Programming in .NET

This is a summary of a presentation by Dr. Neil Roodyn for the Scottish Developers that took place in Microsoft’s offices in Edinburgh on the 21st of July, 2005. At the end of the presentation I won a copy of Dr. Neil’s book eXtreme .NET: Introducing eXtreme Programming Techniques to .NET Developers which I started reading on the train to and from work today and my initial impressions are very positive.

What is eXtreme Programming?

XP is a set of five values, although only the first four are well known.

  • Communication
  • Simplicity
  • Feedback
  • Courage
  • Respect

Communication is very important, but it is often underrated. One of the important aspects is that everyone should be located together to improve communication flow. This is one of the reasons that Microsoft get all their developers together in Redmond.

Simplicity, or just keeping it simple, makes it easy to communicate and reduces the possibility of bugs. If the developers don’t understand how the code works, then how are they going to understand what is causing a bug and, importantly, how to fix it without creating new bugs in the process?

Rigorous feedback loops improve the software. Customers always ask for change, but if they don’t see the software evolving, the change request usually comes after much additional work has been done. This is prevalent in traditional software development. It is therefore important to show the customer the software frequently.

Courage is required as many of the XP practices seem hard and don’t naturally make much sense. For example, making changes when they are needed or throwing away code. However, the XP values as a whole are like a safety harness that ensures the project can proceed quickly and safely. It is important that the other values are adhered to, as without them it would be like jumping out of an aeroplane without a parachute.

Respect everyone in the team so that everything runs smoothly. The team means the software developers, the project manager, the customer, and so on – basically everyone who has an interest in developing the software. If there is no respect then poor software development results, as the developers will grumble that the customer is stupid, and the customer will grumble that the developers don’t understand the business needs, and so forth. This lack of respect is also a symptom of a breakdown in communication.

Traditional engineering says that the cost of change increases exponentially. This concept was stolen by software engineering, but it is inappropriate. Consider, for example, the cost of having to move a concrete structure once it has set, in comparison to creating the structure in the correct location in the first place, versus the cost of moving a method in a piece of software. Nevertheless, the traditional engineering approach has been pervasive in software development, placing a burden on developers that simply does not exist if the XP values are taken on board.

Traditionally, software development has been “out of focus”. When a problem comes along, the first thing that is thought about is the technology that can be used to solve it; however, technology is just made up of features and toys. Then the process, that is, the methodologies and best practices, is considered. Finally, the people involved are considered. This is the wrong way around. The people should be considered first, then the processes, and then the technology. If the people working on the project are happy then they write better code and tend to meet business objectives more readily.

What is software?

Software is just code in an executable form. Code is just a set of instructions that tell a dumb box of silicon what to do. Code is the core of software development. The end result does not exist without code. However, it is often overlooked. Most companies write reams of documentation before any code is ever written.

In order to produce a better product it must be easy to install. The easier it is to install, the more easily the customer can test the product themselves.

The software must have features the customer wants. Often people are focused on unimportant things without realising it. XP has the “planning game” to ensure that the customer can create a priority list. This priority list can be changed at any point by the customer.

The software must be of high quality, which means that it repeatedly works. Every bug is treated as a high priority task so that it must be fixed before a lower priority task.

If every bug is treated as a high priority task then the need for a bug database is removed as new features are not permitted to be added until the bug is fixed which means that at any one point the known bug count will always be close to zero. Pair programming and frequent code reviews help find bugs early so they can be fixed early.

The attitude of the developers is very different if they are adding yet another bug to a database with 500 bugs in it rather than adding a bug to a database with close to zero bugs in it. An analogy in the non-software world is the broken window syndrome. If you see a house that is clean with all the windows intact, you may think it is a well-maintained house. If one of the windows gets smashed and is not repaired quickly, then more windows get smashed, perhaps some graffiti is added, and the house will look dilapidated and rundown very quickly.

Also, to be a better product the software must also be upgradeable. This ensures that new features can be added easily.

It isn’t so hard to do!

If it isn’t so hard to do, then why, depending on the statistics you read, do somewhere between 60% and 85% of software projects fail?

The main reason is likely to be politics. People may not be interested in the software. They may have a vested interest in ensuring the software does not get written; for instance, they may lose their job once the software goes live. There may be a lack of respect (see above) between the software developers and the customer.

Some companies base their business model on making money from RFCs (requests for change). They charge a lot more for the changes than for the initial development. They deliberately produce poor requirements and specification documents to ensure a high number of RFCs down the line. It is often the budgeting system in place at the customer that drives this area of poor quality.

A lot of energy is wasted arguing over petty things such as what language to use, what technology, or that the team should use X set of complex design patterns. It should be emphasised that a team should not use a set of design patterns just because they are there. Patterns should be used as a vocabulary to show what has been done, rather than what should be done.

A software company may impose a set of practices. But having one set of practices imposed over a whole company is counterproductive. The practices used should be tailored for each project. The practices must be examined to determine whether they will add value to a project or hinder it.

Why do developers make software that is so complex?

There are three main reasons for this. (1) is to make themselves look smart; (2) is to justify their “high” salary; (3) is to cover their backsides, for example, if they can exclaim “It was a tough project, look at how hard it was” then it can be used as an excuse if things fail.

Software Development the XP way

First and foremost do the simplest thing that could possibly work. Be careful not to interpret simplest as easiest. Simple does not mean easy.

Eliminate (or reduce) comments in the code. Comments are a sign that the code is unreadable and that the block of code being commented should be refactored into a method of its own with an appropriately descriptive method name.

Remove duplicate code. There are many patterns that can be used to remove code duplication. Once the code is refactored then it will be easier to read and there will be a single point in the code to change if the functionality is to be changed.

Limit the number of classes to only those necessary to get the software to work. Do not create extra classes for “future requirements”, as these may change and it would be extra work to alter the classes to fit the direction in which the software is going. Also, if old classes are no longer required then they must be removed.

As quickly as possible get feedback; interpret it; act on it. Feedback can come from many areas, for example the tests, the customer (via story cards or their response to a new iteration) or daily stand-up meetings.

Assume simplicity on a day-by-day basis. Each day, create a task list with the average task being about 30 minutes; some may take 5 minutes, some may take 2 hours. If a task looks like it will take more than 4 hours then it needs to be broken down into smaller tasks. That way most problems will be easy to solve. If 90+% of tasks are easy then less than 10% will require more effort, and the ability to get through so many tasks and cross them off the task list will improve morale and the quality of the code.

Make changes incrementally because big changes don’t work as there is too much disruption caused and large changes are harder to understand.

Keep the quality of the work consistently high – the only two choices for quality level are “excellent” and “high”.

Everyone should learn from everyone else. It is important to teach everyone to learn and think about how to teach others the information that you have. That way information flows around the team and enables everyone to contribute at a high standard all the time.

It is important to make sure that the software is in a state that is ready to ship on a regular basis.

Put in time for experimentation (called “spiking” in XP). Each spike should be limited to 4 hours. If it takes longer than that then break the initial task down into smaller tasks.

Everyone should be able to work in an environment where honesty and openness are encouraged. This aids communication, and any problems can be avoided or fixed as quickly as possible.

Everyone should go with their instincts. They are there for a reason.

Everyone shares the responsibilities. For example, if a developer finds a bug they should fix it (or pair with the developer that created the code). They should not put it in a big database and wait for the other developer to come back from their holiday to fix it. Sharing responsibilities also means that everyone is “aligned” and going in the same direction.

Everyone needs to be adaptable because change is to be expected. Adaptability also means not carrying unnecessary baggage. For example, if the class is no longer needed then get rid of it, or if there is duplicate code then refactor out the duplication.

It is important to make realistic measurements of the time it will take to do something. Functionality is not complete until the customer is using it.

Back to basics

The basic stuff is, in this order:

  • Coding
  • Testing
  • Listening
  • Designing

Without code there is no program.

Without tests, nothing is known about the quality of the program.

Without listening the developers won’t know what the other developers are doing and won’t understand the business problem that is to be solved.

Without designing there is no organisation and no plan. But designing is last on the list – why do everything up front when it is going to change? Design just has to provide enough flexibility, but no more. Too much design makes things more rigid and inflexible.

Iteration Zero

The very first iteration before any code is written is to set up the build machine to create automated builds and an installer for the software.

The Planning Game

This takes place during a customer meeting. The customer is asked to come up with a set of user stories. The developers then break the stories down into tasks that they can work on.

User stories are at the level of things such as “the user can log on to the system” which is a basic step the user would have to take to accomplish a larger overall task.

When the user stories are written down, the customer must then prioritise them. Preferably this would be done by stacking the cards in order but, if the customer is unwilling to commit to that level of detail, have them create, say, three stacks: high priority (critical, must be done to succeed), medium priority (the software should have these implemented) and low priority (it would be nice to have these implemented).

Test Driven Development

Although Test Driven Development (TDD) is used by many, including Dr. Neil Roodyn, to mean “test first”, this isn’t a universally accepted definition. Many people make a distinction between the two and use TDD to mean that there is testing involved and that development cannot proceed until the tests are written and pass.

Writing the tests up front means that the developer has to think about the interface more than the implementation. It ensures that the least possible solution is delivered. It means that the tests can be run the moment the code is written. It puts quality first. It helps the developer understand the problem better. And it gives the developer confidence that they are doing the right thing.

Refactoring

The purpose of refactoring is to allow the next piece of code to be written faster and provides a mindset of constant improvement. It ensures that code is reread and reviewed constantly which improves the quality. It makes life easier as refactored code is easier to read and understand. It also means that it is cheaper to add new features in the future as the code is clean and easy to understand.

Testing the GUI

It is possible to use reflection to drive the GUI in a test environment. There is also a very positive side effect that controls are named better from the start and that user feedback through the GUI is improved as the test framework needs to know what has happened.

Spiking the Unknown

When developers find an area that they don’t understand, they need to explore it, experiment with it and be able to explain it to someone else (which is part of learning – see above).

Why do customers back off from XP?

Do they want the project to fail?

In fact, people still give Object Orientation lip service just as eXtreme Programming is paid lip service now. The reason is that these people don’t adhere to the values of the practice.

In some software development companies the business analysts make great proxy customers. However, if the business analysts don’t really understand the business, and therefore don’t understand their job, then they get scared because they will be discovered as frauds.

NOTE: This was rescued from the Google Cache. The original date was Friday, 22nd July, 2005.



Original Comments:

This is why I think that while XP has some great ideas, they totally lose it on presentation. To quote from your blog (nice post, BTW, I’m not criticizing you at all):

Why do developers make software that is so complex?
There are three main reasons for this. (1) is to make themselves look smart; (2) is to justify their “high” salary; (3) is to cover their backsides, for example, if they can exclaim “It was a tough project, look at how hard it was” then it can be used as an excuse if things fail.

This is plain BS, IMO. Developers make software too complex because they’re taught that OOD is the cat’s meow, and programming “the Microsoft way” leads to a lot of awful design. It all comes down to education and experience, not puffing the feathers.

I wish XP would lose its own “attitude” and simply discuss the merits of its practices, and also talk about when and why and how XP fails. No system is perfect, and XP would be stronger if it looked at its own imperfections.

My 2c. 🙂

Marc

7/23/2005 12:37 AM | Marc

I accept your 2c and offer in return my tuppenny-worth.

I don’t take what you said as any form of criticism of me as this blog entry is, as I said at the top, a summary of a presentation that I attended. That said, I am a big fan of agile software methods (not just XP).

The presentation that I attended did discuss the weaknesses. In fact some of it is above: “A software company may impose a set of practices. But having one set of practices imposed over a whole company is counterproductive. The practices used should be tailored for each project. The practices must be examined to determine whether they will add value to a project or hinder it.”

So, if the practices of XP don’t fit the organisation then simply don’t use it.

As to other limitations – from what I’ve seen of XP, team member buy-in is a big stumbling block. If they don’t buy in to the ideas then they’ll oppose it, even if they don’t mean to and are trying to stay neutral. The presenter gave an example of one team where one person just wouldn’t do pair programming – eventually he quit, even though the team were willing to accommodate him and permit him to code alone. However, if a more substantial number of team members don’t buy into the idea then XP simply won’t work.

Continuing with the example of pair programming: in the company where I work I have pair programmed on a small number of occasions. Each time the knowledge transfer was fantastic and we managed to get the work done in a fraction of the time. However, if I suggest it to most people they back off and say it isn’t necessary or it is a waste of time. I reckon that a piece of work that took me almost a month to complete, as I was stumbling through other people’s code, would have been completed in less than a week if the guy who wrote the code, or was involved with it when it was created, had paired with me – however, he wasn’t having any of it.

On to the “puffing the feathers”. I have to say that it is a problem that I’ve seen. In my younger days I may even have been guilty of it myself.

What is “programming the Microsoft way”? I’ve briefly looked at MSF (Microsoft Solution Framework) and it seems a reasonable way of constructing software. It is obviously biased towards MS technologies, but many of the principles can be transferred to other technologies. If, however, you are talking about some of those awful examples that can be seen in MSDN Magazine on how to use some funky control, then I agree that is often bad, but that is on a different level as it deals with the implementation and not the way the team works.

You mentioned that “XP would be stronger if it looked at its own imperfections”. I think that if you follow the values of XP then you are being critical of it all the time. The tight feedback loop provides a fast way of discovering what is wrong, not just of the software, but of the process and practices that are in place.

7/23/2005 1:02 AM | Colin Angus Mackay

I’ve often heard the problems of XP from the XP community, many of which are actually project problems rather than specific XP issues. To take a few….

1. Doesn’t scale to large teams
However, the majority of really large projects would probably turn out to be a series of smaller projects if someone took the time to look hard enough.
The ideal XP team size is 12 or fewer. Can you imagine a billion pound health system being done by a team of 12? Perhaps XP and small teams is actually the answer to government projects?

2. Designing and coding for today
Although this is a strength it is also a weakness. Sometimes we should take the time to code for something we’ll use tomorrow. This approach also means we can see significant refactoring effort as the project design evolves.

3. It requires good people
Any successful team requires a majority of good / experienced people to be part of the project. It doesn’t matter whether they are an XP team or not; a bad / inexperienced team is in danger of failing.

4. Distributed Teams
It is certainly more difficult to run distributed XP teams than using one of the more traditional development methods.

5. Feedback
Getting effective communication between the business-oriented client and the techies in the development team can be difficult.

I could go on and on. The weaknesses are actually challenges and many are present in non-Agile projects as well.

I have heard from many hardened XPers how and when it has failed them. Usually they learn the lessons from this and hopefully don’t repeat the same experience in future projects. All you need to do is ask the right questions of the XP community and they will tell you where it breaks or fails, sometimes suggesting alternatives or workarounds to mitigate the risks.

Remember, the original XP project was considered a failure!

Kent wouldn’t have brought out a 2nd edition of the white book if he didn’t want to modify and change some elements of XP to improve things and on occasions correct stuff that may be at risk of failing.

I think the previous commentator was referring to the flawed model of dropping components onto forms and running wizards, amongst other such intrinsically bad practices. This isn’t just a flaw of Microsoft – just about all the major compiler / tool developers have this poor software engineering approach to development. Oh! I’m in agreement with you. I hate the idea of a component that knows how to display itself, can control what it does and can do things with the persistence layer! This really is dreadful software engineering IMHO!

I agree that this is a really good post. Glad you enjoyed the events so much. As someone else who was there, I thought it was rather good too.

Regards

John

7/24/2005 1:16 AM | John A Thomson

More information about this session can be found on Craig Murphy’s blog: http://craigmurphy.com/blog/?p=111

8/2/2005 7:16 PM | Colin Angus Mackay

I find this all very interesting.

I have been in the role of the client/business analyst on a complex software development project, using the XP programming philosophy for the last 8 months.

For people who own or manage a software development company here are a few observations about XP programming from “the other side”:

For XP programming to work, the development company needs to:

– commit to hiring programmers for their attitude and communication skills. They need to be able to talk to the client.

– look for programmers that are interested in learning, not just about technology, but about their clients’ business models. Programmers that are truly interested ask more questions, and produce results that better reflect what the client wants.

– find out what would make the client want to spend more time at their company (their own workstation, extension etc.)

– encourage interaction between the client and the developers in a non-work environment (arrange team lunches for the clients and the developers, without upper management)

It really comes down to building trust. This trust goes a long way with the client when things get rough (and a story estimated as 2 points turns into 8 points 😉).

1. As a client I believe I will get a better product from the developer who asks me 50 questions about my business than from the developer who makes assumptions, even if the second developer is a more “skilled” programmer.

2. The development company should do what they can to encourage client, developer interaction. Here are a few things that have worked for our project:

– The development company that I work with has set up a fully equipped workstation for me in the project room. As a result I spend more time working there, where I am available to answer developers’ questions, than I do at my own company.

– The development team is given a budget for team lunches, and I am invited as a member of the team. These team lunches happen without upper management from the development company attending, which makes it a more relaxed environment.

– The development company arranges sporting events and invites the clients.

8/17/2005 7:25 AM | Jen

Thanks Jen, that sounds like some excellent advice.

8/17/2005 10:19 AM | Colin Angus Mackay

SQL Exception because of a timeout

You would think it would be easy to find information on exactly what error number a SqlException has when a command times out. But it isn’t. MSDN, for all that it is normally an excellent resource, fails to even mention that the Number property of a SqlException or SqlError can be negative. (I suppose, if I am to be fair, it doesn’t say it can be positive either.) What it does say is exactly this:

This number corresponds to an entry in the master.dbo.sysmessages table.[^]

If you look at the sysmessages table, you will notice that all the numbers are positive. So, why am I concerned about negative numbers? Because sometimes the SqlException.Number property returns negative numbers, and therefore it does not correspond to an entry in the sysmessages table. So, the documentation is not telling the whole story.

I want to find out when a SqlException happened because of a timeout. The easiest, and I should imagine the most reliable, way of checking a SqlException for any particular error is to check the Number property, in case the message description has been localised or changed for some other reason. For most cases this is perfectly fine. I can check the sysmessages table for the description of the error that I want to handle and check for that number when I catch a SqlException. But there isn’t any error number for a timeout.

The exact error message that is in the SqlException is

Error: System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

So, if the error is potentially because the server isn’t responding then the error message cannot be generated on SQL Server itself. How would it generate this error message if the server isn’t responding? The error, I can only assume, is being generated by ADO.NET. Why do I say I can only assume? Because after spending some time fruitlessly googling for a definitive answer I have yet to find one.

The best answer I’ve found is some sample code that happens to check the Number property of the SqlException in a switch statement, and next to case -2: is the comment /* Timeout */.

So, at the moment I’m working on the assumption that -2 simply means a timeout caused the execution of the SqlCommand to fail. I don’t like developing software in this way because, if there is a bug in the future, I cannot be certain that this code is okay – perhaps -2 sometimes means something else, or perhaps a timeout sometimes produces something other than -2.
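To make that assumption explicit, here is a rough sketch of how a timeout could be detected when executing a command. Bear in mind that the -2 value (and the TimeoutErrorNumber constant naming it) is based on observation, not documentation, so treat this as a sketch rather than a definitive implementation:

```csharp
using System.Data.SqlClient;

public static class SqlCommandRunner
{
    // Assumed, not documented: the value observed when a command times out.
    private const int TimeoutErrorNumber = -2;

    // Returns true if the command ran, false if it timed out.
    // Any other SqlException is rethrown for normal handling.
    public static bool TryExecute(SqlCommand command)
    {
        try
        {
            command.ExecuteNonQuery();
            return true;
        }
        catch (SqlException ex)
        {
            if (ex.Number == TimeoutErrorNumber)
            {
                return false; // timeout - the caller can retry or report it
            }
            throw; // some other database error - let it propagate
        }
    }
}
```

The caller can then retry or report the timeout without string-matching on a message that might be localised.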

NOTE: This was rescued from the Wayback Machine. The original date was Monday 17th October, 2005.

Tags:


Original Comments:

-2 is the error code for timeout, returned from DBNETLIB, the MDAC driver for SQL Server. So, it’s not ADO.NET.

A full list of possible error numbers can be found by using Reflector on System.Data.SqlClient.TdsParser.ProcessNetLibError.

10/18/2005 1:35 AM | Steve Campbell

Steve: That is most helpful. I feel much more confident that my code is doing the right thing now. Many thanks.

10/18/2005 6:48 AM | Colin Angus Mackay

What's new in Visual Source Safe 2005?

I’ve been looking at the new version of Visual SourceSafe as the company I work for is desperately looking for something that isn’t VSS 6.0 and isn’t going to cost as much as Team System. So far, initial investigations look very promising. There are other source code control providers out there that aren’t being investigated, such as SourceGear’s Vault, which may also provide a good (or even better) solution but which, due to budget constraints, are being ruled out without being looked at. That isn’t to say that Vault is expensive, just that because we will receive VSS 2005 bundled with our MSDN subscription it would have to compete with an effectively zero-cost alternative.

Anyway, this entry is all about Microsoft Visual SourceSafe 2005 so here is some initial stuff that I’ve looked at.

It is now possible to open VSS connections over the internet through a remote source control provider. This will be excellent for people who are on the move. We also have some clients who would like access to the source tree for their projects. Previously this was not possible unless a third party product was purchased, like SourceGear’s SourceOffSite.

There are limitations when using Visual SourceSafe through the remote provider. For example, there is no access to an individual file’s history. The reason given for this is that people who are accessing the source remotely are unlikely to need that sort of functionality. From a personal perspective, I have worked on distributed projects where the developers were spread out across many sites, and that sort of feature is actually useful in those scenarios. Perhaps a developer who’s just taken his laptop away from the office for a few days might get by without it, but I can see there being many projects where it will be a requirement. In those instances solutions like SourceGear’s Vault may be more suitable.

Integration with Visual Studio is much better. The open solution dialog now has the option to browse the source code control tree as well as the file system, so there is no need to remember to click Open from Source Control the first time the project is opened.

Visual Studio retrieves the files asynchronously from Visual SourceSafe so that the developer can get to work faster as files can be checked out and edited before the whole source tree has been built on disk. The only caveat that I saw to this was that, naturally, you cannot build the solution until all the files have been retrieved.

Visual Studio also makes it easy for developers who operate on and off site by providing the facility to switch between multiple source code control providers. This was always possible, but it was a tedious task and prone to problems.

Visual SourceSafe now supports Windows Authentication, which means that you don’t have to use the login dialog as the connection from the client to the server is authenticated based on the account you are logged in with.

Plugins are available so that third party diff engines can be used to allow the differences between various versions of a file to be seen where previously it wasn’t possible. This is especially useful for viewing the differences between certain proprietary or binary formats.

All in all, VSS 2005 looks like a good step forward; however, there are a number of cheap or free alternatives that might suit some people better.

NOTE: This was rescued from the Google Cache. The original date was: Friday, 28th October, 2005.

Tags:


Original comments:

You forget to mention my most favorite VSS2005 addition….

File transfers are compressed, giving a speed increase of 2x-5x during check-in and check-out. This can be especially important if the VSS server is remote on the network or internet, where speed might be at a premium. In the rare case of a dialup connection, this is a life saver.

And don’t quote me on this, but I believe file check-in and check-out is now officially transaction-based. I know I’ve come across 2 files in my day that appear to have been corrupted in VSS.

Finally, keep in mind VSS 6 came out in 1998, making it 7 years old, which is ancient history in computer terms. OK, there have been a couple of patches/service packs, but c’mon, VSS 6’s age has been showing since VS.NET came out. I think we were all surprised when VS 2003 didn’t include a new VSS.

11/3/2005 8:55 PM | Travis Owens

Additional: It should be pointed out that VSS is not transaction based. Team System is transaction based.

 

Sony DRM Hides Trojan

Further to my post last week about Sony’s malware disguised as DRM it seems that a trojan is now taking advantage of the Sony malware.

From The Register: “This means, that for systems infected by the Sony DRM rootkit technology, the dropped file is entirely invisible to the user. It will not be found in any process and file listing. Only rootkit scanners, such as the free utility RootkitRevealer, can unmask the culprit,” warns Ivan Macalintal, a senior threat analyst at security firm Trend Micro.

The full story can be found here: First Trojan using Sony DRM spotted

This was rescued from the Google Cache. The original date was Thursday 10th November, 2005.

Tags:

To be scammed, or not to be scammed

A little while ago I wrote about the poor security procedures that some banks had in place. The BBC have an article on today’s edition of their news website about tactics scammers use, called “How to stay off the suckers list“. The common theme is that you have to be constantly vigilant about the situation or the scammers will get away with your money or belongings. However, how do you tell the difference? One reader summed it up succinctly:

The thing that always amazes me is when your bank rings up and asks you to answer some security questions. They could be anyone, and yet they always seem surprised when you ask them to prove who they are.
John James, London

And another wrote:

A further bit of advice when checking out credentials is not to ring the number on the ID card shown but to get the official number via the telephone book.
Peter Lockwood, Loughborough

I totally agree with both these sentiments. As I mentioned previously, when my bank’s fraud department rang, I verified the phone number left in the voicemail message and when I couldn’t correlate it to any existing correspondence I had with my bank I phoned their customer service department. I spoke at length about the security implications of what they had done, but despite the assurances of the person I spoke to, I still have the nagging feeling that it wasn’t going to be taken any further.

NOTE: This was rescued from the Google Cache. The original was dated: Tuesday, 7th February 2006.

Tags:


Original comments:

We got cold-called today by some kind of business directory company. I didn’t talk to them, my colleague did. Towards the end of the conversation, as a ‘security question’ he got asked his place of birth. He refused to give it. The telesaleswoman said that she calls 400 people every day and he’s the first to refuse. He refused again and asked why she needed it. Allegedly it was to confirm to her supervisor, should he call, that she had indeed spoken to us.

In the end to get rid of her he simply lied.

My dad says that for his online bank account, he actually hasn’t answered any of the questions as stated. Instead he’s supplied other information which he can remember based on the information he was asked for. I’m not that smart – I couldn’t even remember the right answers to some of the questions (e.g. ‘memorable name’ – clearly not that memorable!)

2/7/2006 10:48 PM | Mike Dimmick

I completely agree with John James’ comments too about two-way verification. If I get called by my bank, telco etc., I always request certain information from them to make sure they are who they say they are. It absolutely works both ways. I am also always surprised when they do not expect it. Recently, BT received a call from my partner to report a fault on our line. She is not the account holder, nor is she documented anywhere as living there (apart from council tax, data BT does not have access to), however BT were more than happy to disclose details about my account and even went so far as to divert calls to her mobile number (big security risk – what if I was having an affair, or what if she wasn’t indeed my partner – an easy trick to pull off!). Obviously, this is all going in my letter to them (they finally managed to fix my fault after 6 weeks).
Anyway, back to work for me.
Thanks Colin for the SQL injection attacks article on codeproject.com

2/10/2006 4:16 PM | Andrew Lewis

Types of join

Occasionally there is a post on a forum asking what a certain type of join is all about, so I thought it would probably be good to have a stock explanation to refer people to, so that I don’t rewrite near enough the same response each time the question arises.

First let’s consider these two tables.

A

Key         Data
----------- ----------
1           a
2           b

B

Key         Data
----------- ----------
1           c
3           d

We can see that the only match is where Key is 1.

INNER JOIN

In an INNER JOIN that will be the only thing returned. If we use the query

SELECT A.[Key] AS aKey, A.Data AS aData, B.[Key] AS bKey, b.Data AS bData
FROM A
INNER JOIN B ON a.[Key] = b.[Key]

the returned set will be

aKey        aData      bKey        bData
----------- ---------- ----------- ----------
1           a          1           c

In the case of the various outer joins non-matches will be returned also.

LEFT OUTER JOIN

In a LEFT OUTER JOIN everything on the left side will be returned. Any matches on the right side will be returned also, but if there is no match on the right side then nulls are returned instead.

The query

SELECT A.[Key] AS aKey, A.Data AS aData, B.[Key] AS bKey, b.Data AS bData
FROM A
LEFT OUTER JOIN B ON a.[Key] = b.[Key]

returns

aKey        aData      bKey        bData
----------- ---------- ----------- ----------
1           a          1           c
2           b          NULL        NULL

RIGHT OUTER JOIN

The RIGHT OUTER JOIN is very similar to the LEFT OUTER JOIN, except that, of course, the matching is reversed. Everything on the right side is returned, and only matches on the left side are returned. Any non-matches will be filled with nulls on the left side.

The query

SELECT A.[Key] AS aKey, A.Data AS aData, B.[Key] AS bKey, b.Data AS bData
FROM A
RIGHT OUTER JOIN B ON a.[Key] = b.[Key]

returns

aKey        aData      bKey        bData
----------- ---------- ----------- ----------
1           a          1           c
NULL        NULL       3           d

FULL OUTER JOIN

A FULL OUTER JOIN returns a set containing all rows from either side, matched if possible, but nulls put in place if not.

The query

SELECT A.[Key] AS aKey, A.Data AS aData, B.[Key] AS bKey, b.Data AS bData
FROM A
FULL OUTER JOIN B ON a.[Key] = b.[Key]

returns

aKey        aData      bKey        bData
----------- ---------- ----------- ----------
1           a          1           c
2           b          NULL        NULL
NULL        NULL       3           d

CROSS JOIN

The CROSS JOIN doesn’t obey the same set of rules as the other joins. This is because it doesn’t care about matching rows from either side, so there is no ON qualifier within the join clause. It simply joins every row on the left side to every row on the right side. Where, when joining on unique keys, the inner join and left/right outer joins cannot return more rows than exist in the most populous of the source tables, and the full outer join’s maximum result set is the sum of the source rows, the CROSS JOIN will return the product of the rows from each side. If you have 5 rows in Table A and 6 rows in Table B it will return a set containing 30 rows.

The query

SELECT A.[Key] AS aKey, A.Data AS aData, B.[Key] AS bKey, b.Data AS bData
FROM A
CROSS JOIN B

returns

aKey        aData      bKey        bData
----------- ---------- ----------- ----------
1           a          1           c
2           b          1           c
1           a          3           d
2           b          3           d

NOTE: This was rescued from the Google Cache: The original date was Monday, 27th February 2006.

Tags:

Confucius Say….

“Man who stand on hill with mouth open will wait long time for roast duck to drop in.”
— Confucius

If you want something you are going to have to put in some effort to get it; it will not arrive to you exactly as you want it. This is especially true in on-line forums. Many times I see questions from people that just want the answer to their homework. There is no intention to actually understand the problem; they just want something they can hand to their tutor the next morning. This is really frustrating because I spend some of my lunch hour or free time on these forums trying to help people. Most people are genuinely stuck and cannot make sense of the documentation (you will have noticed from other blog entries where I’ve written up a clarification of some documentation because it wasn’t written as I’d have liked) or they’ve been trying various things to get it to work and they’ve got their code in a “richt fankle”* and as a result they’ve lost the thread somewhat. Getting back to the analogy that Confucius made, these people have actually attempted to obtain a duck, pluck it and roast it, but somewhere along the way it isn’t working out. These people deserve to get help because they have shown a willingness to learn by themselves.

What about the students that need to get their homework assignments in on time? Well, as it is obvious they’ve not done a jot of work themselves (unless you count copy and pasting their assignment to an online forum) their needs will most likely go unmet.

People get help because they deserve it, not because they need it. Does this sound unfair? Is it fair for me to waste my time helping someone who cannot even help themselves? To use another famous quote: “Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime”. If I just answer their question I use my time to give them a fish even although I am trying to teach – I give them the fish that I caught to demonstrate during the lesson. If I can see that they are willing to learn then I know that if I teach them to fish, they will learn and they can then help themselves to as many as are in the sea.

* A “richt fankle” is a Scots expression meaning to get something in a complete mess. Code that is in a “richt fankle” would most likely also be described as spaghetti code.

NOTE: This was rescued from the Google Cache. The original date was Wednesday 6th October, 2005.

Tags:


Original Comments:

I couldn’t have said it better Colin!

10/5/2005 4:00 AM | Rob Manderson

I continually experience co-worker brain death. When they realize that they can ask me for help, their brain stops working and they ask me the stupidest questions, that they could figure out themselves if they didn’t go into “I’ll ask Marc” mode. Sigh.

It is a balance though, to decide how much time to spend figuring out the answer oneself vs. asking someone for help.

Marc

10/5/2005 1:57 PM | Marc