Monday, 13 October 2003

a testing time for all

The problem is not the defects themselves, although, admittedly, they are a problem. The problem is recognising where the defects come from and what causes them. I've found that opinions on this subject can differ quite widely. For example, I once worked for a boss whose argument was that defects can only ever come from a programmer. For him there were no two ways about it: only programmers wrote code and so only programmers could create defects. In his mind defects were a problem for programmers and programmers only. He believed the one and only cause of defects was bad workmanship on the part of programmers and even proposed a league table so we could identify (and remove) the worst offenders.

The funny thing is, this same company had a very strict recruitment policy, putting candidates through a long interview process and quite extensive technical tests so that it took on only the very highest calibre of developers. I would be surprised if more than one in a hundred applicants actually survived the selection process and was accepted. Bearing in mind that programmers are usually considered to have the highest level of technical knowledge among all IT workers, and knowing the team as well as I did, I found it difficult to reconcile the boss's opinion with the reality of life in the development department. Nevertheless, the problem remained. Despite the high quality of our development staff, we had an unacceptably high number of defects.

Before we go any further, let's define what a defect is, because once we know what we are looking for, we can start looking at where defects come from.

A defect, to put it simply, is when reality does not meet expectation: the customer is expecting certain functionality and the product does something different. It might be only slightly different or it may be completely different. Either way it's still a defect. To find a defect you need to compare the product against the specification. You do this by performing a test that measures the performance of the product against its specification. From this it follows that you cannot have a defect unless you have a test. Furthermore, you cannot have a test unless you have a specification. No test or no specification means no defects.

Yeah, yeah! So we have a system where work is measured against a specification and a test report is issued. We all know this, and there are few software development companies now that don't have testing or Quality Assurance (QA) departments, so why do we still have too many defects?

In this type of situation, the first question I always ask myself is, "Do we believe the employees are all acting with the best intentions? In other words, do we believe the staff are all acting honestly and no-one is deliberately creating defects?"

If the answer to the question is no, the problems are much, much worse than anything that can be solved by a magazine article. If the answer is yes, we have a situation where we have a good quality, trusted workforce producing poor quality products. If they're not the problem, maybe we should examine the production process itself?

The way the system usually works is we spend a good deal of time at the beginning of the project gathering requirements. Anyone familiar with the cost of change curve observed by Barry Boehm in 1982 will know that this is accepted wisdom and we should spend as much time getting things right at this stage in the process as possible. Boehm's cost of change curve says that any mistakes we make here will cost ten times as much to fix in the next stage, one hundred times as much to fix in the stage after that and so on. The cost of change is exponential and rises tenfold as we proceed through each stage.
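To make that arithmetic concrete, here is a minimal Python sketch. The stage names and the base cost of one unit are assumptions made purely for illustration; only the tenfold rise per stage comes from the curve as described above.

```python
# Hypothetical stage names, chosen for illustration only.
STAGES = ["requirements", "design", "coding", "testing", "production"]


def relative_cost_of_fix(introduced_in: str, fixed_in: str, factor: int = 10) -> int:
    """Relative cost of fixing a mistake, rising `factor`-fold per stage of delay."""
    delay = STAGES.index(fixed_in) - STAGES.index(introduced_in)
    if delay < 0:
        raise ValueError("a mistake cannot be fixed before it is introduced")
    return factor ** delay


# Cost of fixing a requirements mistake, depending on when it is caught.
for stage in STAGES:
    print(f"{stage:<12} {relative_cost_of_fix('requirements', stage):>6}")
```

Under these assumptions, a requirements mistake that costs one unit to fix on the spot costs ten units in the next stage and one hundred in the stage after that.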

Unfortunately, everybody knows this, including the customer and the requirements engineers. Their natural response to this is twofold. The first response is to delay signing off the requirements specification document for as long as possible, knowing that from the moment they sign, the requirements are set in stone and any errors or omissions may be blamed on them. Delays at this stage of the project have a domino effect on the rest of the project, with slip progressing through each stage. If the project deadline cannot slip, the last stage of the project cannot slip either and must, therefore, be shortened. As everyone knows, the last stage of the project is always testing and so the delays at the beginning (and anywhere else in the project) have the effect of reducing the time allowed for testing, probably the most important stage of all.

The second response is to couch the requirements in the vaguest and most ambiguous terms imaginable, so that should the finished product not perform to specification, they can argue about their exact meanings. This is an even better approach than the first one. What happens here is that one copy of the specification is given to the programming department and another exact copy is given to the testing department.

Human beings are unique and no two are exactly alike. Even identical twins are subject to different personal experiences, and personal experience helps shape the way we think, having a profound effect on our beliefs and our outlook on the world. The guys from the learning organisation talk about something called "the ladder of inference". The theory is that we all think our beliefs are true and the truth is obvious. We also think our beliefs are based on real data and the data we use is the real data.

This causes major problems. In a development process, the programmers read their copy of the specification, which is written in English - an ambiguous language - and couched in extraordinarily ambiguous terms. When they find an omission, they fill in the missing item with an assumption based on personal experiences of similar situations. Similarly, any ambiguities are resolved by drawing conclusions from previous experiences. Each assumption and conclusion drawn is a tiny step towards the top of a ladder of inference.

Now this wouldn't be too much of a problem if it were only the programmers we had to worry about but, unfortunately, we have the QA department to deal with too. They have gone through a very similar process, usually working away in isolation, in a separate department, maybe in a separate building or even a separate country, filling in omissions and resolving ambiguities in the specification document, making assumptions, drawing conclusions based on their experiences of the world, climbing to the top of their own ladder of inference.

Depending on the project schedule, it may be many months before the code is ready to go into QA. When it does, each side has invested a great deal of resources in producing its contribution to the testing phase. In addition, each side also believes that the conclusions it has drawn at the top of its ladder are the true conclusions, the only conclusions that could possibly be made based on the available data. Is it any wonder that animosity can develop between these two departments?

As an exercise, think of a simple requirement, or take one from the nearest requirements document, and count how many questions you could ask about that requirement. Write them down, examine each question individually and think of how many answers could be given to each.

We do a similar exercise, based on the simple requirement for a print button, on one of our courses, and every group of students has come up with at least four reasonable questions and at least three valid answers for each question. When we show them that even four questions with three answers each means a total of 81 combinations (3 × 3 × 3 × 3), giving odds of 80:1 against getting it right, they are usually quite surprised. They are even more surprised when we tell them that the odds against the programmers and the testers both independently arriving at the right specification for the print button are 6,560:1 (81 × 81 = 6,561 possible pairings, only one of which has both teams getting it right).
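For anyone who wants to check the figures, a short Python sketch of the arithmetic follows. It assumes that every interpretation is equally likely and that the two teams choose independently of each other; the variable and function names are just for this example.

```python
questions = 4             # reasonable questions raised about the print button
answers_per_question = 3  # valid answers to each question

# Each question is answered independently, so the counts multiply: 3 ** 4.
interpretations = answers_per_question ** questions

# Programmers and testers each pick one interpretation on their own.
pairings = interpretations ** 2


def odds_against(outcomes: int) -> str:
    """Odds against hitting the single right outcome among `outcomes`."""
    return f"{outcomes - 1:,}:1"


print(interpretations, odds_against(interpretations))  # 81 80:1
print(pairings, odds_against(pairings))                # 6561 6,560:1
```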

And this is for just one tiny requirement. Expand that to a reasonably sized project, take into account the assumptions made and conclusions drawn by the requirements engineer too, and it starts to look remarkable that we ever deliver software to any degree of customer satisfaction at all.

The reason we do sometimes deliver is that our experiences aren't totally unique and we share some of them with each other, bringing the odds down quite a bit. Still, I wouldn't like to gamble my wages on guessing it right, which is what a lot of companies seem to do. Last year's 66% project failure rate reported by the Standish Group seems to bear that out.

So the problem lies with human nature, but how can we solve it, given that we need to use human beings to develop software?

The answer is simple. We know that defects are caused by ambiguous requirements. We also know that tests are unambiguous: you either pass or you don't. So why don't we couch our requirements in the form of tests?

We could get the testers to meet with the customer and developers prior to starting the coding phase of the project. If it's not practical for the customer to attend, we can appoint a 'proxy' customer, maybe the requirements engineer. In the meeting all three parties (the 'Holy Trinity') could contribute to the design of the tests for the system and agree them. After the meeting the programmers could go away and code knowing exactly what they have to do and the testers could go away and automate the tests so the programmers can run them quickly and easily while they're working to make sure they're still on track. The programmers know that when all the tests have passed it is time to stop coding and they can hand over their code to the QA department who can run the tests again, just to make sure.
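As a rough idea of what such agreed tests might look like, here is a minimal Python sketch for the print button. Everything in it is hypothetical: the Document and FakePrinter classes, the press_print function, and the three answers the tests pin down (whole document, one copy, empty documents rejected) are invented for illustration and are not taken from any real project or library.

```python
class FakePrinter:
    """Stand-in for a real printer; it only records what it was asked to print."""

    def __init__(self):
        self.jobs = []

    def submit(self, pages, copies):
        self.jobs.append((list(pages), copies))


class Document:
    def __init__(self, pages, current_page=0):
        self.pages = pages
        self.current_page = current_page


def press_print(document, printer):
    """Hypothetical behaviour agreed in the meeting."""
    if not document.pages:
        raise ValueError("nothing to print")
    printer.submit(document.pages, copies=1)


# Each test records one question and the single answer all three parties agreed.
def test_prints_whole_document_not_just_current_page():
    printer = FakePrinter()
    press_print(Document(pages=["p1", "p2", "p3"], current_page=1), printer)
    assert printer.jobs[0][0] == ["p1", "p2", "p3"]


def test_prints_exactly_one_copy():
    printer = FakePrinter()
    press_print(Document(pages=["p1"]), printer)
    assert printer.jobs == [(["p1"], 1)]


def test_empty_document_is_rejected_and_nothing_is_printed():
    printer = FakePrinter()
    try:
        press_print(Document(pages=[]), printer)
    except ValueError:
        pass
    else:
        raise AssertionError("expected an empty document to be rejected")
    assert printer.jobs == []


for check in (
    test_prints_whole_document_not_just_current_page,
    test_prints_exactly_one_copy,
    test_empty_document_is_rejected_and_nothing_is_printed,
):
    check()
    print("passed:", check.__name__)
```

The test names carry the requirement, so a programmer and a tester reading them have far less room to climb separate ladders of inference. The functions are also named so that a test runner such as pytest could collect them, although the loop at the end runs them without one.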

We could have everybody work closely together, maybe even in the same room. That way, if more ambiguities arise, or we find another omission, or even if the requirements change, we just have to tell everybody and change the tests. We could simply call over and let them know.

No, that would be much too easy!

It couldn't possibly work, could it?


  • Software Engineering Economics, Barry W. Boehm, Prentice Hall PTR; 1982
  • The Fifth Discipline Fieldbook, Peter Senge et al, Currency; 1994

First published in Application Development Advisor
