HOW TO UNIT TEST WITHOUT TESTING EVERYTHING
This article is about verifying only the behaviour of the function you are writing, without also running all the other code that the function calls. The goal is to isolate that behaviour from the rest of the application, and to do it in a way that is simple, clear and understandable.
But how do you avoid executing the code that is called from your function?
To explain that I have to show you another thing first:
Dependency Injection.
Have you ever heard of SOLID code? SOLID is a set of principles that help you write good code. It is an acronym that stands for:
- Single responsibility principle
- Open-closed principle
- Liskov substitution principle
- Interface segregation principle
- Dependency inversion principle
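I will talk about the last one, the ‘D’, because it is essential when writing unit tests.
You may have noticed that the principle is called …inversion, but I said …injection before. They are not exactly the same thing, but they are closely related: the dependency inversion principle says that your code should depend on abstractions, and dependency injection is the technique of handing those dependencies to the code from the outside. By applying the dependency inversion principle, you are able to do dependency injection.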
I highly recommend that you learn the SOLID principles. They will make you a better developer. Here’s a great video with Uncle Bob Martin to start with:
https://www.youtube.com/watch?v=TMuno5RZNeE
And a link to an article about SOLID:
https://scotch.io/bar-talk/s-o-l-i-d-the-first-five-principles-of-object-oriented-design
Code that does not apply the dependency inversion principle
What is the problem? Let’s start by defining it. And what better place to look than code that does not apply the dependency inversion principle?
The problem
- We want to test the function we are writing.
- It calls other functions in other classes to do its work.
- We do not want to include that functionality in our tests.
The code with the problem
Our function calls the data layer and gets all users. We want to write a test that checks what our function does if the data layer returns an empty list. To do that with tightly coupled code means that we have to set up the test so it executes all the way down to the database, and the database has to be prepped so it has no users in it. Now, that is not a unit test, and it tests way more than our function. Just imagine the setup code needed to do all this…
What should we do instead then? Wouldn’t it be great if we could tell the data layer to return an empty list without executing the code that gets the users? To just short circuit it?
If we write tightly coupled code, like this:
internal class ClassUnderTest
{
    public int TheFunction()
    {
        var dataLayer = new DataLayer(); // Tightly coupled DataLayer.
        var users = dataLayer.GetUsers();
        // Do stuff with users.
        return users.Count; // Number of users handled, so a test can verify it
    }
}
There’s no way we can take control of the functions it calls. The function creates the DataLayer itself, and there’s nothing we can do from the outside to change that. It is as hard-wired as it gets.
So, what do we do instead?
Enter Dependency Inversion Principle.
It is pretty simple. Instead of letting the function create the class it depends on, we create that class elsewhere and hand it over, typically through the constructor. We inject the class that the function depends on.
Then we can send in whatever class we want. It could be the real class, which is probably a good idea in the production code. But it could also be a class that we have prepared to do just what we want it to do, and that is what we do in test code.
“Wait a minute!”, you say. “A class is a class! You can’t have two classes with the same name doing different stuff.” True. And that’s why we have interfaces.
Interfaces to the rescue
An interface decouples the signature from the implementation. That way, you can have two classes that have the same signature but different functionality. All the receiver has to know is that whatever comes in through the IDataLayer variable will fit. It will have all the functions and properties that the interface declares, or the code won’t compile. And the sender can send in whatever class it wants, as long as that class implements everything the interface requires. So, if it looks like a duck and sounds like a duck, the code thinks it is a duck.
For instance, for an interface that declares the function ReadFromFile(), one implementation reads a file directly, and another reads the file and stores it in a cache so that when you read it often you save time. Both do the same thing, the end result is the same, they just do it slightly differently.
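Here is a minimal sketch of that idea. The names IFileReader, DirectFileReader and CachingFileReader are made up for this illustration, and I have assumed that ReadFromFile() takes a path and returns the file content as a string.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical interface: only the function name ReadFromFile() comes from the text above.
public interface IFileReader
{
    string ReadFromFile(string path);
}

// Reads the file from disk on every call.
public class DirectFileReader : IFileReader
{
    public string ReadFromFile(string path)
    {
        return File.ReadAllText(path);
    }
}

// Reads each file from disk once and serves later calls from memory.
public class CachingFileReader : IFileReader
{
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    public string ReadFromFile(string path)
    {
        if (!_cache.TryGetValue(path, out var content))
        {
            content = File.ReadAllText(path);
            _cache[path] = content;
        }
        return content;
    }
}
```

Code that holds an IFileReader variable can use either class without knowing which one it got. Note that this simple cache never notices when the file changes on disk, so it is only a sketch of the principle.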
If you don’t know how interfaces work, I suggest that you read up on them. You need to understand them to get this working.
And finally, some code. An example:
public interface IDataLayer
{
    List<User> GetUsers();
}

internal class DataLayer : IDataLayer
{
    public List<User> GetUsers()
    {
        var users = new List<User>();
        // Read users from the database and add them to the list...
        return users;
    }
}
internal class MainClass
{
    public void MainFunction()
    {
        IDataLayer dataLayer = new DataLayer();
        var theClass = new ClassUnderTest( dataLayer ); // Injection
        theClass.TheFunction();
    }
}
internal class ClassUnderTest
{
    private readonly IDataLayer _dataLayer;

    public ClassUnderTest( IDataLayer dataLayer )
    {
        // Is the dataLayer a test class or a real implementation?
        // Right here, right now, who cares?
        _dataLayer = dataLayer;
    }

    public int TheFunction()
    {
        var users = _dataLayer.GetUsers(); // Use the injected class
        // Do stuff with users.
        return users.Count; // Number of users handled, so a test can verify it
    }
}
Now we’re halfway there. We have a way to take control of all the external code that our function depends on. That means that we can provide any response we need. If we want the real implementation, as we do in production code, we create the real DataLayer.
But if we write a test that verifies what our function does if it gets an empty list of users, we can create a class that does just that, and never anything else. That is crucial to the test. To be able to do the same verification every time it runs it has to know that the data layer never returns anything but an empty list. It has to be 100 % repeatable.
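A hand-written class like that could look roughly like this. EmptyDataLayer is a name I made up for the illustration, and User and IDataLayer are repeated here as minimal stand-ins so that the snippet compiles on its own.

```csharp
using System.Collections.Generic;

// Minimal stand-ins for the types used in this article.
public class User { }

public interface IDataLayer
{
    List<User> GetUsers();
}

// Hypothetical hand-written test double: it never touches a database
// and always returns an empty list, so every test run sees the same data.
public class EmptyDataLayer : IDataLayer
{
    public List<User> GetUsers()
    {
        return new List<User>();
    }
}
```

In a test, you would hand it to the class under test through the constructor, for example new ClassUnderTest(new EmptyDataLayer()).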
Mocking tools
Now, it is not practical to write a new mock class by hand every time you need a new behaviour, or even a single class with logic to set up whatever return values you need. There are better ways: tools that help you do this with minimal effort.
Tools like FakeItEasy and Moq for C#, or Mockito and JMock for Java, to mention just a few, let you declare the expected behaviour without actually coding the classes. The tools do that for you. Example time!
[Test]
public void TestFunction()
{
    // Arrange
    var fakeSubClass = A.Fake<IDataLayer>(); // Create an empty canvas
    A.CallTo(() => fakeSubClass.GetUsers())
        .Returns(new List<User>()); // Add customised behaviour.
    var sut = new ClassUnderTest(fakeSubClass); // Inject

    // Act
    var result = sut.TheFunction();

    // Assert
    Assert.That(result, Is.EqualTo(0));
}
By the way, I use FakeItEasy in the example.
The fun stuff happens in the Arrange section.
var fakeSubClass = A.Fake<IDataLayer>();
This line creates an empty object that implements the IDataLayer interface. It has absolutely no functionality. So instead of coding a whole mock class, you only add this one line in the test.
The next line defines the behaviour of the class.
A.CallTo(() => fakeSubClass.GetUsers()).Returns(new List<User>());
Now you have an object with no real behaviour, that takes two lines to create, and it will always return an empty list. Short, readable and repeatable.
An object graph
What happens if the class I am injecting also needs a class?
Now that we can do injection, we have to make sure that every class gets everything it needs injected. And because it all starts at the very top of the application, we have to send everything from there to wherever it is needed. That’s not a problem. Just construct it and inject it. Like this:
internal class MainClass
{
    public void MainFunction()
    {
        IUserSorter userSorter = new UserSorter();
        IDataLayer dataLayer = new DataLayer( userSorter ); // The data layer gets its dependency
        var theClass = new ClassUnderTest( dataLayer );     // The class under test gets the data layer
        theClass.TheFunction();
    }
}
This is called an object graph. Simply put, it is the tree with all the objects that are allocated. Google it, I think you will get a better explanation than I can give you.
This will grow as your application grows, and it will become big. When that happens, you have a choice to make: either continue building your graph by hand, or introduce a dependency injection container, or DI container for short.
You will move from explicitly declaring the graph with ‘new’, which is strongly typed (the compiler catches errors early), to a more declarative way of building it, where the tool builds the graph for you. Both approaches have benefits and drawbacks. I can’t make the decision for you.
Here is an article from Mark Seemann on choosing:
https://blog.ploeh.dk/2012/11/06/WhentouseaDIContainer/