Tuning the DRY/DAMP trade-off in tests
Unlike the semantics of the words dry and damp, the principles DRY and DAMP are not antonyms and can be complementary.
🇧🇷 This article is also available in Portuguese.
When talking to other developers — no matter if you’re in a bar or in an interview — it’s common to hear b̶u̶z̶z̶words like good engineering practices, clean and reusable code, software craftsmanship, etc. We all want to be good developers and write code that makes us proud.
Those good practices include avoiding unnecessary code duplication, thus respecting the Don’t Repeat Yourself (DRY) principle, and writing idiomatic, meaningful, and readable code, thus respecting the Descriptive and Meaningful Phrases (DAMP) principle.
There’s a popular opinion that the application (implementation) code must prioritize DRY over DAMP while the test code must prioritize DAMP over DRY. This encourages us to treat implementation and test code differently and we shouldn't: from a maintenance point of view, they're no different. Both principles are equally important and can coexist in both cases.
Since we tend to be less strict with test code, and I would like to change that, I will focus on how to balance DRY and DAMP in our test suite.
There are several reasons why I write tests, and one of them is to provide living documentation. This emphasizes the importance of our tests being DAMP: we like docs that are well structured and easy to read and understand.
Speaking about test structure: respecting its phases by making them visually separated helps a lot. Of course there are exceptions but, in most cases, you should glance at a test and easily identify its phases.
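To make "visually separated phases" concrete, here is a minimal sketch in Python. The phase names in the comments (setup, exercise, verify) are one common convention and are my assumption, not something the examples in this article depend on; the test itself uses only the built-in sorted().

```python
def test_sorted_returns_items_in_ascending_order():
    # Setup: prepare the input data
    items = [3, 1, 2]

    # Exercise: call the behavior under test
    result = sorted(items)

    # Verify: check the outcome
    assert result == [1, 2, 3]
```

The blank lines do most of the work here: even without the comments, a reader can tell the three phases apart at a glance.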
Regarding DRY: we must be careful not to obsess over it or to think that every piece of duplicated code is a design problem. It's not.
"(…) copying and pasting may well be the right thing to do if the two chunks of code evolve in different directions. If they don’t — that is, if we keep making the same changes to different parts of the program — that’s when we get a problem." — Software Design X-Rays
Unlike the semantics of the words dry and damp, which are a̶l̶m̶o̶s̶t̶ antonyms, the uppercased DRY and DAMP can be complementary: by avoiding unnecessary duplication we can make the test more descriptive as well. Let's see how to do it.
Show me the code
This is a really d̶u̶m̶b̶ simple example but the concept can be expanded and applied to real cases. The class Person is our System Under Test (SUT) and we will iteratively test
❌ DRY ❌ DAMP
In this example, we can clearly see the test phases, good! 👏
On the other hand, we're repeatedly instantiating Person with all its required attributes in each test. In real cases, our SUT will probably have many more attributes and that's when the problem arises. Besides the duplication per se, the relevant data is not explicit: to test fullName(), only the name attributes such as lastName are relevant; to test isUnderaged(), only age is relevant; they should be highlighted.
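A rough Python sketch of what this first version could look like. Everything about Person here is an assumption for illustration: the attribute names, the age threshold of 18, and the snake_case methods full_name() and is_underaged(), which stand in for fullName() and isUnderaged().

```python
from dataclasses import dataclass

ADULT_AGE = 18  # assumed threshold, not taken from the article


@dataclass
class Person:
    # Hypothetical SUT: the attributes are assumptions for illustration
    first_name: str
    last_name: str
    age: int

    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"

    def is_underaged(self) -> bool:
        return self.age < ADULT_AGE


def test_full_name():
    # age is required by the constructor but irrelevant to this test
    person = Person(first_name="Ada", last_name="Lovelace", age=36)

    result = person.full_name()

    assert result == "Ada Lovelace"


def test_kid_is_underaged():
    # the names are required by the constructor but irrelevant to this test
    person = Person(first_name="Ada", last_name="Lovelace", age=10)

    result = person.is_underaged()

    assert result is True


def test_adult_is_not_underaged():
    person = Person(first_name="Ada", last_name="Lovelace", age=36)

    result = person.is_underaged()

    assert result is False
```

Every test repeats the full constructor call, and nothing tells the reader which argument actually matters.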
❌ DRY ✅ DAMP
To minimize code duplication and make it more DRY, we decided to extract Person creation to the buildPerson() factory method. But the relevant data is still not explicit and we introduced another problem: fragility. Anyone can change the buildPerson() implementation and break our tests. This is an unnecessary risk, especially considering that factories are normally reused (that's why we created one!).
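A possible sketch of this step in Python, with build_person() standing in for buildPerson(). The Person class, its attributes, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical SUT: the attributes are assumptions for illustration
    first_name: str
    last_name: str
    age: int

    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"


def build_person() -> Person:
    # Parameterless factory: removes the duplication but hides all the data
    return Person(first_name="Ada", last_name="Lovelace", age=36)


def test_full_name():
    person = build_person()

    result = person.full_name()

    # Fragile: this only passes while build_person() keeps these exact names
    assert result == "Ada Lovelace"
```

The expected value in the assertion is coupled to data that lives somewhere else, so a harmless-looking edit to the factory breaks the test.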
We realized that buildPerson() can't be reused to test isUnderaged(), so we thought about using two more methods: one to build kids and another to build adults.
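Sketched in Python, the three factories might look like this. The names build_kid() and build_adult() are hypothetical, as are the Person attributes and the age threshold of 18.

```python
from dataclasses import dataclass

ADULT_AGE = 18  # assumed threshold, not taken from the article


@dataclass
class Person:
    # Hypothetical SUT: the attributes are assumptions for illustration
    first_name: str
    last_name: str
    age: int

    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"

    def is_underaged(self) -> bool:
        return self.age < ADULT_AGE


def build_person() -> Person:
    return Person(first_name="Ada", last_name="Lovelace", age=36)


def build_kid() -> Person:
    # Differs from build_person() only in the age
    return Person(first_name="Ada", last_name="Lovelace", age=10)


def build_adult() -> Person:
    # Identical to build_person() in this sketch
    return Person(first_name="Ada", last_name="Lovelace", age=36)


def test_full_name():
    person = build_person()

    result = person.full_name()

    assert result == "Ada Lovelace"


def test_kid_is_underaged():
    person = build_kid()

    result = person.is_underaged()

    assert result is True


def test_adult_is_not_underaged():
    person = build_adult()

    result = person.is_underaged()

    assert result is False
```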
In more complex cases, I highly recommend the use of factory methods with meaningful names that refer to the ubiquitous language. But be careful not to have an explosion of factory methods: one for each variation of the SUT. In this example, the three methods are almost identical. It's unnecessary to have them and it's not DRY enough.
I consider this a little bit "DAMPer" than the previous example, but there's still room for improvement.
✅ DRY ❌ DAMP
Another approach is to extract the Person creation to the test class body or to use before helpers. Now, most of the problems from the previous example persist, and the tests are less DAMP because the setup phase is not explicit anymore.
It may seem it's not a big deal in this simple case but, in real life, we might force the reader into an expedition — jumping between methods — to find out what is relevant to the test.
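As an illustration of the "before helper" variant, here is a sketch that uses setUp() from Python's unittest module as the helper. The Person class, its attributes, and the age threshold are assumptions.

```python
import unittest
from dataclasses import dataclass

ADULT_AGE = 18  # assumed threshold, not taken from the article


@dataclass
class Person:
    # Hypothetical SUT: the attributes are assumptions for illustration
    first_name: str
    last_name: str
    age: int

    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"

    def is_underaged(self) -> bool:
        return self.age < ADULT_AGE


class PersonTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test: the setup phase now lives outside the tests
        self.person = Person(first_name="Ada", last_name="Lovelace", age=36)

    def test_full_name(self):
        result = self.person.full_name()

        # The reader has to jump to setUp() to see where these names come from
        self.assertEqual(result, "Ada Lovelace")

    def test_adult_is_not_underaged(self):
        result = self.person.is_underaged()

        self.assertFalse(result)
```

Note that testing the underaged case would need a second shared object or a local override, which is where the expedition starts.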
✅ DRY ✅ DAMP
Finally, the test that found the balance! ⚖️
Its phases are clearly separated and the data that matters to each test is explicit, yet the implementation details of buildPerson() are hidden. It's concise, readable code and the reader doesn't have to drill down to understand it: "what you see is what you get".
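One way to get there in Python is a factory with keyword-only parameters and default values. This is a sketch under assumptions: the defaults, the Person attributes, and the age threshold are mine, and build_person() stands in for buildPerson().

```python
from dataclasses import dataclass

ADULT_AGE = 18  # assumed threshold, not taken from the article


@dataclass
class Person:
    # Hypothetical SUT: the attributes are assumptions for illustration
    first_name: str
    last_name: str
    age: int

    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"

    def is_underaged(self) -> bool:
        return self.age < ADULT_AGE


def build_person(
    *, first_name: str = "Any", last_name: str = "Name", age: int = 30
) -> Person:
    # Defaults fill in whatever the test does not care about
    return Person(first_name=first_name, last_name=last_name, age=age)


def test_full_name():
    person = build_person(first_name="Ada", last_name="Lovelace")

    result = person.full_name()

    assert result == "Ada Lovelace"


def test_kid_is_underaged():
    person = build_person(age=10)

    result = person.is_underaged()

    assert result is True


def test_adult_is_not_underaged():
    person = build_person(age=36)

    result = person.is_underaged()

    assert result is False
```

Each test passes only the data it asserts on, so changing a default inside build_person() no longer breaks anything.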
⚠️ Special case
There's one special case with this approach and it happens when the SUT receives injected dependencies.
In those cases, achieving the balance is not so trivial. If we extract the Service creation to a factory method, we would hide the fact that the class has too many dependencies. And this might be a sign that we have a design problem: it could be a missing abstraction, for example.
Leaving the SUT creation with all its dependencies explicit in every test, which may not seem DRY, is more appropriate in cases like this. That way we are leaving points of attention in the code, and they will be screaming for a redesign or refactor.
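A sketch of what "explicit in every test" could look like in Python, using Mock from unittest.mock for the collaborators. Only the name Service comes from the text; the four dependencies and the register() method are hypothetical.

```python
from unittest.mock import Mock


class Service:
    # Hypothetical SUT: the dependencies and register() are assumptions
    def __init__(self, repository, notifier, auditor, clock):
        self.repository = repository
        self.notifier = notifier
        self.auditor = auditor
        self.clock = clock

    def register(self, name: str) -> None:
        self.repository.save(name)
        self.notifier.notify(name)
        self.auditor.record("registered", name, self.clock.now())


def test_register_saves_the_name():
    repository = Mock()
    # The long constructor call is repeated on purpose: it keeps the
    # number of dependencies visible to whoever reads the test
    service = Service(
        repository=repository, notifier=Mock(), auditor=Mock(), clock=Mock()
    )

    service.register("Ada")

    repository.save.assert_called_once_with("Ada")


def test_register_notifies_about_the_name():
    notifier = Mock()
    service = Service(
        repository=Mock(), notifier=notifier, auditor=Mock(), clock=Mock()
    )

    service.register("Ada")

    notifier.notify.assert_called_once_with("Ada")
```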
Balancing the DRY/DAMP principles in the examples above may seem a little too much, but the impact of achieving it in real cases is remarkable: it gives us more robust tests, more flexibility to support refactorings, and more confidence in our test suite. Win-win!
As a bonus, I highly encourage the use of fixtures that generate random values to build more complex objects: that way we’re forced to pass only the relevant data to the test. But this might be a subject for another article…