When developing new modules, or editing existing ones, the following standards should be met to ensure a consistent workflow is used.
But before we delve into the rules surrounding modules, consider the following: do you need to write a module to begin with? You should consider copying code before you consider making a module - more often than not, the code you’re writing doesn’t need to be a module.
This is a big one because modules are great and it’s easy to get carried away with them, but they’re also not the ideal choice in most cases. There are some (loose) questions regarding modules that should be answered before you begin development:
- Are you managing at least five resources?
- Do you have a need to use the module more than three times right now?
- Are you just trying to organise code?
- Are you willing to place this module/code into its own repository AND implement and manage all the security that comes with that task?
In most cases code that you’re repeating once or twice can just be copied and pasted. The costs associated with creating a module aren’t always obvious:
- Every resource inside your module can no longer be directly configured, so you have to expose some or all properties via variables (inputs.tf), which is probably more repetition than just copying the code a few times
- You also have to expose all the attributes via output{} blocks
- The variables and functionality you expose with a module constitute an API, and APIs are fragile - if the underlying systems they abstract change, your API breaks and so does any code using your module (which is why version pinning is important)
- There is a learning curve associated with a module: what inputs does it take? What outputs will it give me? Can I safely use this module multiple times in one state?
Before writing your code as a module stop and consider if you can’t simply copy/paste some code instead.
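To make the first cost concrete, here is a minimal sketch of what wrapping a single EC2 instance in a module could look like. The variable, resource and output names are hypothetical, chosen only for illustration. The point is that every property the caller needs to set, and every attribute they need to read, has to be re-declared by hand.

```hcl
# Hypothetical module wrapping one aws_instance. All names are illustrative.

# Each argument the caller should control needs its own variable...
variable "image_id" {
  type        = string
  description = "The id of the machine image (AMI) to use for the server."
}

variable "instance_type" {
  type        = string
  description = "The EC2 instance type to launch."
  default     = "t3.micro"
}

resource "aws_instance" "web" {
  ami           = var.image_id
  instance_type = var.instance_type
}

# ...and each attribute the caller should read needs its own output.
output "instance_id" {
  description = "The id of the EC2 instance."
  value       = aws_instance.web.id
}

output "private_ip" {
  description = "The private IP address of the EC2 instance."
  value       = aws_instance.web.private_ip
}
```

The resource block on its own is four lines; the plumbing around it is several times that, and it grows with every property you decide to expose.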
The following rules apply and are broken down into more detail below:
- All module inputs should be stored in inputs.tf
- All module outputs should be stored in outputs.tf
- Any locals{} that are required should go in locals.tf
- All variable{}s should use input validation and type constraints
- All variable{}s should have a description property defined
- Any variable{}s that are defined should go in variables.tf
- Filenames should be named after the service they manage (ec2.tf, rds.tf)
- Filenames may include a specific service name such as ami if the name ec2.tf is too broad, for example ec2_ami.tf for code that manages AMIs
- Resource names should never contain the type of resource they’re managing - it’s in the resource type already
- Using dashes (-) in strings is preferred whenever it’s possible
- Modules should always be stored in their own git repository
- Git repositories should make use of semantic versioning and use git tags to mark release candidates
- All modules require a README.MD file explaining how to use the module effectively
Below we’ll break down the rules a bit more and provide examples (when applicable or appropriate.)
Providing an inputs.tf file makes it clear and obvious where an engineer has to look to review or manage a module’s variable{} code.
Providing an outputs.tf file makes it clear and obvious where an engineer has to look to review or manage a module’s output{} code.
Any locals{} that are defined should be stored in a central location called locals.tf so that it’s easy to locate, review and manage such information.
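The three file rules above can be sketched as follows. The file names come from the rules themselves; the variable, local and output names are hypothetical and only there to show what belongs where.

```hcl
# inputs.tf - everything the caller can set
variable "environment" {
  type        = string
  description = "The environment this module is deployed to, for example \"staging\"."
}

# locals.tf - values derived inside the module, defined once
locals {
  name_prefix = "web-${var.environment}"

  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

# outputs.tf - everything the caller can read
output "name_prefix" {
  description = "The prefix applied to the names of resources in this module."
  value       = local.name_prefix
}
```

Terraform itself treats all .tf files in a directory as one configuration, so the split is purely a convention for the people reading the code.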
Terraform affords us the ability to validate input as of 0.13 (it was an experimental, opt-in feature in later 0.12 releases such as 0.12.29). This grants us some obvious benefits but also the not-so-obvious benefit of saving time: instead of waiting for the AWS or Azure APIs to tell us a field’s value (such as an AMI ID, or a Key Vault secret name) is incorrect, we’ll learn of this fact locally, when Terraform evaluates the variable and before it calls any provider API.
Using validation is easy:
variable "image_id" {
  type        = string
  description = "The id of the machine image (AMI) to use for the server."

  validation {
    # regex(...) fails if it cannot find a match
    condition     = can(regex("^ami-", var.image_id))
    error_message = "The image_id value must be a valid AMI id, starting with \"ami-\"."
  }
}
We can also provide our variable with a type constraint. This further allows Terraform to catch problems early, before any infrastructure is touched, and also acts as a form of documentation. An example of this can be seen above with the type property.
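Type constraints are not limited to primitive types such as string. As a rough sketch, the hypothetical variables below (the names and limits are assumptions for illustration) combine collection and object types with validation rules:

```hcl
variable "subnet_ids" {
  type        = list(string)
  description = "The ids of the subnets the servers are launched into."

  validation {
    condition     = length(var.subnet_ids) > 0
    error_message = "At least one subnet id must be provided."
  }
}

variable "root_volume" {
  type = object({
    size_gb   = number
    encrypted = bool
  })
  description = "The size and encryption settings of the root volume."

  validation {
    condition     = var.root_volume.size_gb >= 8
    error_message = "The root volume must be at least 8 GB."
  }
}
```

Passing a string where a list is expected, or omitting the encrypted attribute, is rejected by the type constraint before the validation rules are even evaluated.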
Giving every variable{} a description should be self-explanatory - it documents the purpose of the variable.
Keeping variable{} definitions in variables.tf is about discovery again and is similar in concept to inputs.tf and outputs.tf - it makes it easy to find where variables are being defined, as opposed to hunting around loads of files or using an IDE’s go-to-reference capabilities (which can take people out of their flow).
Being able to locate the configuration for an Auto Scaling Group by going to ec2_asg.tf is obviously easier than trying to find it among many ambiguous files. This is also about discoverability and making it easy to find the code you’re looking for.
Filenames may include a specific service name such as ami if the name ec2.tf is too broad, for example ec2_ami.tf for code that manages AMIs
Files can get big quickly if you’re using a lot of EC2 services and you’ve got them all in ec2.tf. Instead it’s acceptable to break them out into individual files such as ec2_ami.tf, vpc_subnets.tf, vpc_sg.tf and so on. This is a better option than using sub-directories (which are Terraform modules by definition) to organise code, which just results in modules within modules and causes a lot of extra work and problems further down the line.
Resource names should never contain the type of resource they’re managing - it’s in the resource type already
Writing resource "aws_instance" "ec2-web-server" is pointless because the same information is present in the resource type - instance - so we know it’s an EC2 resource. The same is true of virtually every other resource type.
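A short sketch of the difference, reusing the image_id variable from the validation example earlier; the instance type and the output name are assumptions for illustration:

```hcl
# Redundant: "ec2" repeats what the resource type already says.
# resource "aws_instance" "ec2-web-server" { ... }

# Preferred: the name describes the role only.
resource "aws_instance" "web-server" {
  ami           = var.image_id
  instance_type = "t3.micro"
}

# References already carry the type, so nothing is lost:
output "web-server-id" {
  description = "The id of the web server instance."
  value       = aws_instance.web-server.id
}
```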
Remember that all the information you need about resources is available from the command line using Terraform commands such as terraform state list and terraform state show.
Using dashes makes it easier to move around the code using keyboard shortcuts. For example, using Alt+<Arrow Keys> on the following strings yields different results:
- my-web-server will allow the cursor to move from the very beginning of the string to the beginning of the next word, web
- my_web_server pushes the cursor to the end of the string, making it more difficult and time consuming to edit the word web inside of the string
This is a small detail but when you’re editing this code every day it quickly becomes very convenient to move between words regardless of your IDE choice.
If you’re going to write a module and just keep it alongside the code that imports the module then you’re best not even using a module to begin with. The sole purpose of modules existing as a concept is that they can be shared with others - sticking them in a sub-directory isn’t sharing, it’s organising code.
When we agree that some code is repeated often enough and is going to be encapsulated in a module we must keep the module in its own repository. This comes with many advantages:
- You can version control the individual module and have code bases pull the module based on a specific version
  - This is good practice and prevents breaking changes to legacy environments
  - Others can also pin their use of your module to a specific version
- You can control access to the module, locking it down to specific users
  - So a module aimed at managing IAM policies should be locked down to security personnel versus allowing anyone to edit their own access to the system
- It becomes a lot easier to share with others given its isolated nature
- You can use a private CI pipeline for the module to do tests and linting on changes
- You get a separate commit log for easy auditing of who did what, when, etc.
But the most important advantage is the sharable nature of a module in a git repository.
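As a sketch of how a code base might pull such a module at a pinned version, the block below uses Terraform’s git module source with a ref argument. The repository URL, the tag and the AMI id are placeholders, and image_id assumes the module exposes the variable shown in the validation example.

```hcl
module "web-server" {
  # "git::" selects the git source type; "?ref=" pins a tag, branch or commit.
  # The URL and tag below are hypothetical.
  source = "git::https://git.example.com/platform/terraform-aws-web-server.git?ref=v1.2.0"

  image_id = "ami-0123456789abcdef0"
}
```

Upgrading then becomes a deliberate, reviewable change to the ref value rather than something that happens whenever the module’s default branch moves.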
All modules require a README because documentation is important. At minimum the README should contain:
- What the module manages for the user
- What inputs it takes and which have default values (and what those values are)
- What attributes it exposes
- Who wrote it and how they can be contacted
And anything else that you believe can assist others to better understand the nature and function of the module.