Posted 2016-06-21 10:45:00 GMT
Testing improves software. So much so that lack of unit tests is called technical debt and blanket statements from celebrated engineers like
A good test
— covers the code that runs in production
— tests behaviour that actually matters
— does not fail for spurious reasons or when code is refactored
For example, I made a change to the date parsing function in Wine, Here adding a unit test to record the externally defined behaviour is uncontroversial.
Tests do take time. The MS paper suggests that they add about 15-35% more development time. If correctness is not a priority (and it can be reasonable for it not to be) then adding automatic tests could be a bad use of resources: the chance of the project surviving might be low and depend only on a demo, so taking on technical debt is actually the right choice. More importantly, tests take time from other people: especially if some subjective and unimportant behaviour is enshrined in a test, then the poor people who come later to modify the code will suffer. This is especially true for engineers who aren't confident making sweeping refactorings, so that adding or removing a parameter from an internal function is turned into (for them) a tiresome project. The glib answer is not to accept contributions from these people, but that's really sad — it means rejecting people from diverse backgrounds with specialised skills (just not fluent coding) who would contribute meaningfully otherwise.
Unit tests in particular can enshrine a sort of circular thinking: a test is defined as the observed behaviour of a function, without thinking about whether that behaviour is the right behaviour. For example this change I made to Pandas involved more changing of test code than real code that people will use. This balance of effort causes less time to be spent on improving the behaviour.
In my experience, the worst effect of automatic tests is the shortcut they give to engineers — that a change is correct if the tests pass. Without tests, it's obvious that one must think hard about the correctness of a change and try to validate it: with tests, this validation step is easy to rationalise. In this way, bugs are shipped to production that would have been easy to catch by just running the software once in a setting closer to the production one.
It's hard to write a good test and so, so much easier to write a bad
test that is tautologically correct, and avoids all behaviour relevant
to production. These bad tests are easy to skip in code review as
they're typically boring to read, but give a warm fuzzy feeling that
things are being tested — when they're not. Rather than counting
the coverage of tests as a metric, we could improve it by using
test coverage of the real code that runs in production.
Unfortunately, these are not the same thing. False confidence
from irrelevant tests measurably reduces reliability.
Post a comment