When metrics are used as guideposts, telling the team when it’s getting off track or confirming that it’s on the right track, they’re worth gathering. Is our number of unit tests going up every day? Why did the code coverage take a dive from 75% to 65%? There might have been a good reason: maybe we got rid of unused code that was covered by tests. Metrics can alert us to problems, but in isolation they don’t usually provide value.
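To see why deleting covered code drags the percentage down, a quick back-of-the-envelope calculation helps (the line counts below are invented to match the 75%-to-65% drop). Removing lines that are 100% covered takes an above-average slice out of the total, so the overall ratio falls even though the codebase got better.

```python
# Invented numbers: 7,500 of 10,000 lines covered (75%), plus 2,860 dead
# lines, all of which happened to be exercised by tests.
covered, total, dead = 7500, 10000, 2860

# Deleting lines that are 100% covered removes an above-average slice,
# so the overall percentage drops.
print(f"before: {covered / total:.0%}")                    # before: 75%
print(f"after:  {(covered - dead) / (total - dead):.0%}")  # after:  65%
```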
Metrics that measure milestones along the journey toward team goals are useful. If our goal is to increase unit test code coverage by 3%, we might run a code coverage report every time we check in to make sure we didn’t slack on unit tests. If we don’t achieve the desired improvement, it’s more important to figure out why than to lament whatever amount our bonus was reduced as a result. Rather than focus on individual measurements, we should focus on the goal and the trend toward reaching it.
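One way to get that check-in feedback automatically is a coverage gate that fails the build when the total slips below the target. This is only a sketch, assuming a Python project whose tests have already run under coverage.py; the 78% threshold (last release’s 75% plus the 3% we’re aiming for) is invented for illustration.

```python
# Hypothetical check-in gate. "coverage report --fail-under=N" exits
# non-zero when total coverage is below N, which fails the build.
import subprocess
import sys

TARGET = 78  # illustrative: 75% from last release plus our 3% goal

result = subprocess.run(["coverage", "report", f"--fail-under={TARGET}"])
sys.exit(result.returncode)  # non-zero exit fails the build
```

Wiring a gate like this into the build means nobody has to remember to run the report by hand.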
Metrics help the team, customers included, track progress within the iteration and within the release or epic. If we’re using a burndown chart and we’re burning up instead of down, that’s a red flag to stop, take a look at what’s happening, and make sure we understand and address the problems. Maybe the team lacked important information about a story. Metrics, including burndown charts, shouldn’t be used as a form of punishment or a source of blame. For example, questions like “Why were your estimates too low?” or “Why can’t you finish all of the stories?” are better coming from the team itself, phrased as “Why were our estimates too low?” and “Why didn’t we get our stories finished?”
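A burndown chart makes burning up easy to spot by eye, but the same red flag can be raised automatically. Here’s a minimal sketch with invented iteration data: it flags any day on which the remaining story points grew instead of shrinking.

```python
# Invented iteration data: story points remaining at the end of each day.
remaining = [40, 37, 35, 38, 42]  # days 1-5 of the iteration

# Flag any day where remaining work grew instead of shrinking.
for day in range(1, len(remaining)):
    if remaining[day] > remaining[day - 1]:
        print(f"day {day + 1}: burning up "
              f"({remaining[day - 1]} -> {remaining[day]} points), stop and ask why")
```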
Metrics, used properly, can be motivating for a team. Lisa’s team tracks the number of unit tests run in each build. Big milestones, such as 100 tests, 1,000 tests, and 3,000 tests, are a reason to celebrate. Watching that number go up every day is a nice bit of feedback for the development and customer teams. However, it is important to recognize that the number itself means nothing: the tests might be poorly written, or a well-tested product might actually need 10,000 tests. Numbers don’t work in isolation.
Lisa’s Story
Pierre Veragen told me about a team he worked on that was allergic to metrics. The team members stopped measuring how much code their tests covered. When they measured again six months later, they were stunned to discover that coverage had dropped from 40% to 12%.
How much is it costing you to not use the right metrics?
—Lisa
When you’re trying to figure out what to measure, first understand what problem you’re trying to solve. Once you know the problem statement, you can set a goal, and that goal needs to be measurable. “Reduce average response time on the XYZ application to 1.5 seconds with 20 concurrent users” works better than “Improve the XYZ application performance.” If your goals are measurable, the measurements you need to gather to track them will be obvious.
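As a sketch of how a goal like that might be checked, the following times a batch of requests issued by 20 concurrent users and compares the average to the 1.5-second target. The endpoint URL and sample size are invented.

```python
# Rough concurrency check for the (hypothetical) XYZ application.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://xyz.example.com/search"  # invented endpoint
USERS = 20      # concurrent users, from the goal statement
REQUESTS = 200  # invented sample size

def timed_request(_):
    start = time.perf_counter()
    urllib.request.urlopen(URL).read()
    return time.perf_counter() - start

# Issue REQUESTS requests with at most USERS in flight at once.
with ThreadPoolExecutor(max_workers=USERS) as pool:
    timings = list(pool.map(timed_request, range(REQUESTS)))

average = sum(timings) / len(timings)
print(f"average response time: {average:.2f}s (goal: 1.50s)")
```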
Remember to use metrics as a motivating force, not as a way to beat down a team’s morale. This wisdom bears repeating: focus on the goal, not the metrics. Maybe you’re not using the right metrics to measure whether you’re achieving your team’s objectives, or perhaps you’re not interpreting them in context. An increased number of defect reports might mean the team is doing a better job of testing, not that it’s writing buggier code. If your metrics aren’t helping you understand your progress toward your goal, you might have the wrong metrics.
What Not to Do with Metrics
Mark Twain popularized the saying, which he attributed to Benjamin Disraeli, “There are three kinds of lies: lies, damned lies, and statistics.” Measurable goals are a good thing; if you can’t gauge them in some way, you can’t tell if you achieved them. On the other hand, using metrics to judge individual or team performance is dangerous. Statistics by themselves can be twisted into any interpretation and used in detrimental ways.
Take lines of code, a traditional software measuring stick. Are more lines of code a good thing, meaning the team has been productive, or a bad thing, meaning the team is writing inefficient spaghetti-style code?
What about the number of defects found? Does it make any sense to judge testers by the number of defects they find? How does that help them do their jobs better? Is it safe to say that a development team that produces more defects per line of code is doing a bad job? Or that a team that finds more defects is doing a good job? Even if that thought holds up, how motivating is it for a team to be whacked over the head with numbers? Will that make the team members start writing defect-free code?
Communicating Metrics