
TDD and Ansible - getting started
How do you combine test-driven development with Ansible?
Test-driven development has been a thing for me since I discovered it 10-15 years ago. Back then it was mostly unit tests for C++ and .NET (VB.NET, actually).
I have since started doing a lot of Ansible, but never really made the effort to formalize my approach to using TDD in that context. I have been doing tests using Vagrant, and later Molecule. My colleagues have mostly not understood the value of automated tests, or in some cases outright thought they were a hindrance to “doing actual work”.
In my latest projects, I have tried to stay true to the TDD way of thinking, and I will outline here how I approach it now.
Basics
TDD is about testing first and implementing second. This is an alien thought for a lot of developers. It is even more alien when we add that the first test must fail.
The process
1. Create a failing test covering a feature to add or a bug to fix
2. Verify that the test is failing
3. Add or change the code needed
4. Verify that the test succeeds
5. Refactor for readability and maintainability (and add some documentation)
6. Go to 4 until you are happy with the code
Steps #1 and #2 are the most important ones here.
Doing a test-first strategy will force you to consider what you are trying to implement. This will help you keep focus, (mostly) avoid scope creep, and make your interfaces nicer. The latter was an interesting spin-off that I had not considered when I started doing tests. It comes down to the fact that you will be using classes/roles/functions/structures in multiple different ways, and that nudges you towards better interfaces.
The second step is in many ways magical. You think you have made a nice test that shows what the problem is, and then the test doesn’t fail. Then what?
When a system becomes large and complexity rises, you lose the overview of details buried in the code, and your idea of how to test something is off. The logical conclusion is that you don’t know the problem well enough to actually state (using a test) what the problem is you want to solve - and if you cannot do that, how can you implement a solution? This problem gets even worse when you are in a team and you are not the only one adding code to the system. Sometimes tests are the only “documentation” showing how an interface should be used.
Step 3 is where most people start and end: add code, run it once, and call it a day. That is not a recipe for a high-quality, maintainable system.
Step 4 is your quality control that you have implemented what you stated you wanted to implement. It is also a vital step to be able to rerun as part of the refactoring. Do a git commit when it works, to ensure that you don’t break things too badly when refactoring.
Step 5 is refactoring, which includes cleaning up: deleting stale code, turning magic values into variables, updating comments, and so on.
Step 6 is your check that the refactored code is still a valid solution to your problem.
There are also some version control strategies that are relevant to consider, but we will leave those for another time.
Applying it to Ansible
The simplest test sequence is similar to the following: each role is as self-contained as possible and has tests to verify its proper functioning. Depending on the system, you will also have integration testing, and perhaps multiple levels of system tests, before reaching production.
When testing, we want to test the interfaces between components. In Ansible, I treat a role or a collection as an “interface”. It doesn’t matter if it is a mono-repo with lots of roles or separate repos with one role each; that is a version control and pipeline strategy discussion for another time.
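For orientation, this is roughly the default scenario layout Molecule generates inside a role (`myrole` is a placeholder name):

```
roles/
  myrole/
    molecule/
      default/
        molecule.yml    # scenario and driver configuration
        converge.yml    # playbook that applies the role
        verify.yml      # the tests for the role
```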
My process is as follows:
1. Add an entry in the readme.md for the role and state the requirements
2. In the molecule scenario for the role, add the test in `verify.yml` (see the sketch after this list)
3. Run `molecule converge`
4. Run `molecule verify` and see that it fails
5. Change the code
6. Go to 3, until step 4 does not fail
7. Refactor, rerun the test and update the repo docs
8. Run `molecule test` to finish up and ensure that we haven’t forgotten anything
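As an example, here is a minimal sketch of what the test in `verify.yml` could look like, assuming a hypothetical role whose stated requirement is that nginx ends up running and enabled (nginx is a placeholder for whatever your role manages):

```yaml
# molecule/default/verify.yml - a minimal sketch
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect facts about system services
      ansible.builtin.service_facts:

    - name: Assert that nginx is running and enabled
      ansible.builtin.assert:
        that:
          - ansible_facts.services['nginx.service'].state == 'running'
          - ansible_facts.services['nginx.service'].status == 'enabled'
        fail_msg: nginx is not running and enabled
```

Run right after a `molecule converge` on the unchanged role, this fails exactly as step 4 demands, and it only goes green once the role actually delivers the requirement.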
After the above, there is a good chance that my new feature is working, and we want to use it.
If you are lucky enough to have a test environment (that is different from your production environment), start there. Otherwise,

```
ansible-playbook -i prod -C -D someplaybook.yml
```

will tell you the changes you are introducing on your servers.
This step often highlights some small difference between the Molecule test scenario and the real world. You will have to evaluate whether you need to go back and create a new feature/test, or whether you believe the difference to be benign.
I find that `ignore_errors: "{{ ansible_check_mode }}"` needs to be added to selected tasks.
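A typical case is a task that can only succeed once an earlier task has actually run, which never happens in check mode. A sketch, with nginx again as a placeholder:

```yaml
- name: Install nginx
  ansible.builtin.package:
    name: nginx
    state: present

- name: Start and enable nginx
  ansible.builtin.service:
    name: nginx
    state: started
    enabled: true
  # In check mode the package is never actually installed, so the
  # service module cannot find the unit and fails; ignore that failure.
  ignore_errors: "{{ ansible_check_mode }}"
```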
Finalize with

```
ansible-playbook -i prod -D someplaybook.yml
```

Well, maybe wait with that command until you have a maintenance window.
Closing notes
Tests are a broad topic, and I just wanted to present my thoughts on my TDD strategy for Ansible.
In a team there would also be a lot of decisions needed about the amount of tests, duration of tests, integration tests, version control, reviews, pipelines and so on, but we leave that for a later time.