While I am a big fan of automated tests and TDD, I have seen code bases that were unnecessarily difficult to change, because of sub-optimal tests. The following list contains the 7 most common problems I have seen in such test suites.
In this first sin, a test contains multiple assertions on different outcomes or behaviors. Creating such tests slows us down in multiple ways. It leads to our tests becoming unfocused, there is no possibility to immediately tell why this test has failed without looking at the exact assertion that failed. Even worse, fixing such a test could require multiple test-run & fixing cycles since the test will abort at the first failing assertion.
But maybe the biggest problem with this is how in the long term such tests tend to become very unfocused duplicating the same assertion between different test cases and making refactoring or changing the application a daunting task.
Mocking typically comes at a huge cost, especially if it requires heavy mocking libraries and/or includes a large number of mocked objects.
Each mock we create needs to be maintained and to be held in sync with what they are actually mocking. Therefore, depending on the number of mocks used, the effort to change parts of the systems might easily be multiplied by the number of mocks.
Furthermore, excessive mocking often indicates poor software design. Instead of using powerful mocking libraries, it is often a wise choice to utilize tests as an additional driver for clean software design.
Remember the Open-Closed Principle of the SOLID design principles?
software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
Something similar applies to our tests. In cases where it is okay to add additional data and fields to return values of the tested code, our tests should absolutely allow doing so.
We know we should test mainly at the unit level, but… what exactly is a unit?
Imagine you write a service to map from one representation of an object to another.
Depending on the language, your preferences, and coding style, this could be a class or function which contains all logic within its inner scope, or it could be composed of several different sub-mappers (even one for each specific field).
This should NOT influence your tests in any way. The mapper stays the reusable and exposed “unit”, providing a stable interface to the rest of the application. Possible sub-mappers are an implementation detail and MUST NOT be tested separately.
In general, you can view a unit as the smallest reusable collection of code.
Assertions should be as strong as needed, but as weak as possible.
This point ties into point three above. We do not want to assert behavior or results that we do not depend upon. If it is not a requirement, it should not be asserted by a test.
Think about code that returns a URL with query parameters. It is perfectly legit to assert the presence of required query parameters, but should our test really break if we add another parameter? Should it break if the order of the parameters changes?
Definitely not! We should not have to adapt our tests, just because we decide to assemble the query string in a different way.
What do I mean by non-requirement? Calls and assertions you wouldn’t find in a user story.
There are cases where it is necessary to spy on an object and verify that it has been called. For example, when sending a notification email is the side-effect of creating an order.
But, typically, when testing a method that has to use mocks, we should not focus our testing on the exact parameters passed to the mock object.
Don’t add assertions like
.hasBeenCalledTimes(1)
unless the number of calls is critical to assert on.
The moment a test starts to act flaky and starts to randomly break builds, you must act immediately. Fixing this should have the highest priority of all tasks.
If there is no feasible way to fix a flaky test immediately, it is perfectly okay to delete the test for now and to create a follow-up ticket to create a proper test in its place.