
TDD and Ansible - getting started
How do you combine test-driven development with Ansible?
Test-driven development has been a thing for me since I discovered it 10-15 years ago. Back then it was mostly unit tests for C++ and .NET (VB.NET, actually).
I have since started doing a lot of Ansible, but never really made the effort to formalize my approach to using TDD in that context. I have been doing tests using Vagrant, and later Molecule. My colleagues have mostly not understood the value of automated tests, or in some cases outright thought they were a hindrance to “doing actual work”.
In my latest projects, I have tried to stay true to the TDD way of thinking, and I will outline here how I approach it now.
Basics
TDD is about testing first and implementing second. This is an alien thought for a lot of developers. It is even more alien when we add that the first test must fail.
The process
1. Create a failing test covering a feature to add or a bug to fix
2. Verify that the test is failing
3. Add or change the code needed
4. Verify that the test succeeds
5. Refactor for readability and maintainability (and add some documentation)
6. Go to 4 until you are happy with the code
Steps #1 and #2 are the most important ones here.
Doing a test-first strategy will force you to consider what you are trying to implement. This will help you keep focus, (mostly) avoid scope creep, and make your interfaces nicer. The latter was an interesting spin-off that I had not considered when I started doing tests. It comes down to the fact that you will be using classes/roles/functions/structures in multiple different ways, and that nudges you towards better interfaces.
The second step is in many ways magical. You think you have made a nice test that shows what the problem is, and then the test doesn’t fail. Then what?
When a system becomes large and complexity rises, you lose the overview of details buried in the code, and your idea of how to test something is off. The logical conclusion is that you don’t know the problem well enough to actually state (using a test) what the problem is you want to solve - and if you cannot do that, how can you implement a solution? This problem gets even worse when you are in a team and you are not the only one adding code to the system. Sometimes tests are the only “documentation” showing how an interface should be used.
Step 3 is where most people start and end: add code, run it once, and call it a day. That is not a recipe for a high-quality, maintainable system.
Step 4 is your quality control that you have implemented what you stated you wanted to implement. It is also a vital step to be able to rerun as part of the refactoring. Do a git commit when it works, to ensure that you don’t break things too badly when refactoring.
Step 5 is refactoring, which includes cleaning up: deleting stale code, turning magic values into variables, updating comments, and so on.
Step 6 is your check that the refactored code is still a valid solution to your problem.
There are also some version control strategies that are relevant to consider, but we will leave those for another time.
Applying it to Ansible
The simplest test sequence is similar to the following: each role is as self-contained as possible and has tests to verify its proper functioning. Depending on the system, you will also have integration testing, and perhaps multiple levels of system tests, before reaching production.
When testing, we want to test the interfaces between components. In Ansible, I treat a role or a collection as an “interface”. It doesn’t matter if it is a mono-repo with lots of roles or separate repos with one role each; that is a version control and pipeline strategy discussion for another time.
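For orientation, this is roughly the default scenario layout Molecule generates inside a role (`myrole` is a placeholder name):

```
roles/
  myrole/
    molecule/
      default/
        molecule.yml    # scenario and driver configuration
        converge.yml    # playbook that applies the role
        verify.yml      # the tests for the role
```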
My process is as follows:
1. Add an entry in the readme.md for the role and state the requirements
2. In the molecule scenario for the role, add the test in `verify.yml` (see the sketch after this list)
3. Run `molecule converge`
4. Run `molecule verify` and see that it fails
5. Change the code
6. Go to 3, until step 4 does not fail
7. Refactor, rerun the test and update the repo docs
8. Run `molecule test` to finish up and ensure that we haven’t forgotten anything
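As an example, here is a minimal sketch of what the test in `verify.yml` could look like, assuming a hypothetical role whose stated requirement is that nginx ends up running and enabled (nginx is a placeholder for whatever your role manages):

```yaml
# molecule/default/verify.yml - a minimal sketch
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect facts about system services
      ansible.builtin.service_facts:

    - name: Assert that nginx is running and enabled
      ansible.builtin.assert:
        that:
          - ansible_facts.services['nginx.service'].state == 'running'
          - ansible_facts.services['nginx.service'].status == 'enabled'
        fail_msg: nginx is not running and enabled
```

Run right after a `molecule converge` on the unchanged role, this fails exactly as step 4 demands, and it only goes green once the role actually delivers the requirement.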
After the above, there is a good chance that my new feature is working, and we want to use it.
If you are lucky enough to have a test environment (that is different from your production environment), start there. Otherwise,

```
ansible-playbook -i prod -C -D someplaybook.yml
```

will tell you the changes you are introducing on your servers.
This step often highlights some small difference between the Molecule test scenario and the real world. You will have to evaluate whether you need to go back and create a new feature/test, or whether you believe the difference to be benign.
I find that `ignore_errors: "{{ ansible_check_mode }}"` needs to be added to selected tasks.
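A typical case is a task that can only succeed once an earlier task has actually run, which never happens in check mode. A sketch, with nginx again as a placeholder:

```yaml
- name: Install nginx
  ansible.builtin.package:
    name: nginx
    state: present

- name: Start and enable nginx
  ansible.builtin.service:
    name: nginx
    state: started
    enabled: true
  # In check mode the package is never actually installed, so the
  # service module cannot find the unit and fails; ignore that failure.
  ignore_errors: "{{ ansible_check_mode }}"
```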
Finalize with

```
ansible-playbook -i prod -D someplaybook.yml
```

Well, maybe wait with that command until you have a maintenance window.
Closing notes
Tests are a broad topic, and I just wanted to present my thoughts on my TDD strategy for Ansible.
In a team there would also be a lot of decisions needed about the amount of tests, duration of tests, integration tests, version control, reviews, pipelines and so on, but we leave that for a later time.