Wednesday, December 15, 2010

Signal to Noise ratio in software testing

When using techniques like automated functional testing, we want a high signal-to-noise ratio: a failing test should tell us that something has really broken, not just that we changed something.
Tracking the ratio of 'useful failures' to 'wasteful failures' gives us a signal-to-noise ratio for our tests. Ideally we want something like 10:1, say, so that only roughly one failure in ten is spurious, caused by a software change or by fragility in our tests rather than by a real defect. Unfortunately, for many teams it is more like 1:10, so most test failures are due to test fragility and not to something really being broken.
When we get too much noise and not enough signal, we tend to start ignoring problems and disabling tests. While this might improve the ratio, it does so at the cost of losing some of the signal. You might want to try tracking this ratio for a few weeks; the trend over time can help focus attention on eliminating areas of high noise.
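As a rough illustration of what tracking the ratio might look like, here is a minimal Python sketch. It assumes someone on the team labels each test failure by hand as either a real breakage or a spurious one; the function names and the record format are hypothetical and not taken from any particular tool.

```python
from collections import defaultdict
from datetime import date


def weekly_signal_to_noise(failures):
    """Count useful and wasteful failures per ISO week.

    `failures` is an iterable of (day, is_real_breakage) pairs, where
    `day` is a datetime.date and `is_real_breakage` is a bool that a
    person assigned after investigating the failure (an assumption).
    Returns a dict mapping (year, week) to (signal, noise) counts.
    """
    counts = defaultdict(lambda: [0, 0])  # week -> [signal, noise]
    for day, is_real in failures:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # ISO year and ISO week number
        counts[week][0 if is_real else 1] += 1
    return {week: (s, n) for week, (s, n) in sorted(counts.items())}


def ratio(signal, noise):
    """Signal-to-noise as a single number; infinite when there is no noise."""
    return float("inf") if noise == 0 else signal / noise


if __name__ == "__main__":
    log = [
        (date(2010, 12, 13), True),
        (date(2010, 12, 14), False),
        (date(2010, 12, 15), False),
        (date(2010, 12, 20), True),
    ]
    for week, (signal, noise) in weekly_signal_to_noise(log).items():
        print(week, f"{signal}:{noise}", ratio(signal, noise))
```

Printing the raw counts next to the ratio makes it easier to see whether a good week reflects fewer spurious failures or simply fewer test runs.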
In my experience the sources of noise are often related, and are often due to things like timeout issues, hard-coded IDs, or unnecessary dependencies on the order in which things get displayed or happen. Another common source of noise is teams working with very large backlogs of 'low impact' bugs. The issue here is more complex and probably worth a post of its own, but when prioritizing bugs it is worth considering the impact they have on team productivity and not just the production impact.
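Two of these noise sources, timeouts and ordering dependencies, can often be reduced with small helpers. The following Python sketch is one possible approach, not a prescribed fix: `wait_until` and `same_items` are hypothetical names, and the default timeout and polling interval are arbitrary assumptions.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Polling like this replaces a fixed sleep, which is either too short
    (a spurious failure) or too long (a slow build). Returns True if the
    condition held before the deadline, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def same_items(actual, expected):
    """Compare two collections while ignoring the order of their items."""
    return sorted(actual) == sorted(expected)


if __name__ == "__main__":
    started = time.monotonic()
    # Hypothetical stand-in for "the page has finished loading".
    print(wait_until(lambda: time.monotonic() - started > 0.2))
    # Passes even if the display order changes between runs.
    print(same_items(["beta", "alpha"], ["alpha", "beta"]))
```

A test that asserts on `same_items` still fails when an item is missing or duplicated, so it only drops the dependency on order and keeps the signal.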
Whatever the cause, a relatively small effort can sometimes dramatically improve the signal-to-noise ratio.

1 comment:

Pavan said...

Hey Ian,

I really like the idea of Signal to Noise ratio. I work on the Go team and have worked on Twist before that. In the past 4 years, I have not worked on a non-flaky functional test build.

It's not a horrendously bad pragmatic decision to follow this idea and live with the noise, as long as we are diligent about not calling bugs "noise" - which we inadvertently do sometimes as people stop checking functional test results on every checkin.