Figures 15-12 and 15-13 are two examples of trends that work together. Figure 15-12 shows a trend of the total number of methods each iteration. Figure 15-13 shows the matching code coverage. These examples show why graphs need to be looked at in context. If you only look at the first graph, showing the number of methods, you’ll get only half the story. The number of methods is increasing, which looks good, but the coverage is actually decreasing. We do not know the reason for the decreased coverage, but it should be a trigger to ask the team, “Why?”
Figure 15-12 Number of methods trend
Figure 15-13 Test coverage
Remember that these tools can only measure coverage of the code you’ve written. If some functionality was missed, your code coverage report will not bring that to light. You might have 80% code coverage with your tests but be missing 10% of the code you should have written. Driving development with tests helps avoid this problem, but don’t value code coverage statistics more than they deserve.
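To make that limitation concrete, here is a minimal sketch in C (a hypothetical example, not from the book): the tests exercise every statement that was written, yet a required rule was never implemented, so no coverage tool can flag it.

#include <assert.h>

/* Computes an order total in cents, with a 10% discount for bulk orders.
   The requirement "loyalty members get a further 5% off" was never written,
   so no coverage tool can report it as untested. */
long order_total_cents(long unit_cents, int quantity)
{
    long total = unit_cents * quantity;
    if (quantity >= 10) {
        return total - total / 10;   /* bulk discount */
    }
    return total;
}

int main(void)
{
    /* Both branches run: 100% statement coverage of the code that exists,
       yet the missing loyalty rule is invisible to the report. */
    assert(order_total_cents(1000, 10) == 9000);
    assert(order_total_cents(1000, 1) == 1000);
    return 0;
}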
Know What You Are Measuring
Alessandro Collino, a computer science and information engineer with Onion S.p.A. who works on agile projects, told us about an experience where code coverage fell suddenly and disastrously. His agile team developed middleware for a real-time operating system on an embedded system. He explained:
We followed a TDD approach to develop a great number of good unit tests aimed at achieving good code coverage. We wrote many effective acceptance tests to check all of the complex functionality. After that, we instrumented the code with a code coverage tool and reached a statement coverage of 95%.
The code that couldn’t be tested was verified by inspection, leading them to declare 100% statement coverage after ten four-week sprints.
After that, the customer asked us to add a small feature before we delivered the software product. We implemented this request and enabled the compiler’s code optimization.
This time, when we ran the acceptance tests, the result was disastrous: 47% of the acceptance tests failed, and the statement coverage had fallen to 62%!
What happened? The problem turned out to be compiler optimization enabled with an incorrect setting. Because of this, a key value was read once as the application started up and was stored in a CPU register. Even when the variable was modified in memory, the value in the CPU register was never refreshed. The routine kept reading the same stale value instead of the correct, updated value, causing tests to fail.
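In C, this kind of stale-register bug typically shows up when a variable shared with an interrupt handler or another task is not declared volatile. The sketch below is a hypothetical illustration of the pattern and its usual fix, not Alessandro’s actual code.

#include <stdbool.h>

/* Without the volatile qualifier, an optimizing compiler may read this flag
   once, keep it in a CPU register, and never notice that memory has changed. */
static volatile bool done = false;

/* Called from an interrupt handler or another task when the work completes. */
void on_completion(void)
{
    done = true;
}

void wait_for_completion(void)
{
    /* With volatile, each iteration re-reads the flag from memory.
       Drop the qualifier and build with optimization enabled, and this loop
       can spin forever on the stale register value. */
    while (!done) {
        /* busy-wait */
    }
}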
Alessandro concludes, “The lesson learned from this example is that enabling the compiler optimization options should be planned at the beginning of the project. It’s a mistake to activate them in the final stages of the project.”
Good metrics require some good planning. Extra effort can give you more meaningful data. Pierre Veragen’s team members use a break-test baseline technique to learn whether their code coverage metric is meaningful. They manually introduce a flaw into each method and then run their tests to make sure the tests catch the problem. Some tests just make sure the code returns some value, any value. Pierre’s team makes sure the tests verify that the correct value is returned.
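As a rough illustration of the idea (hypothetical code, not Pierre’s), compare a weak assertion that survives a deliberately introduced flaw with one that checks the exact expected value.

#include <assert.h>

int add(int a, int b)
{
    return a + b;          /* to run the break-test baseline, deliberately break
                              this, e.g. change it to `return a - b;`, then
                              re-run the tests and see which ones notice */
}

void weak_test(void)
{
    int result = add(2, 3);
    assert(result != 0);   /* "returns some value, any value": still passes
                              even with the flaw in place */
}

void strong_test(void)
{
    assert(add(2, 3) == 5); /* fails as soon as the flaw is introduced */
}

int main(void)
{
    weak_test();
    strong_test();
    return 0;
}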
Code coverage is just one small part of the puzzle. Use it as such. It doesn’t tell you how good your tests are, only whether a certain chunk of code was run during the test. It does not tell you whether different paths through the application were exercised, either. Understand your application, identify your highest-risk areas, and set a higher coverage goal for those areas than for low-risk areas. Don’t forget to include your functional tests in the coverage report as well.