Terraform vs AWS CloudFormation
Infrastructure as Code (IaC) allows creating and managing remote infrastructure using definitions in code, rather than manually. This gives you the ability to create resources and make changes that are easily replicable, making infrastructure management safer, more consistent and easier to collaborate on.
We knew we wanted to manage our infrastructure in this way, and with most of our infrastructure hosted on AWS, our decision came down to the two top contenders: AWS CloudFormation and HashiCorp’s Terraform. Here in our latest Tenzo Tech blog, we want to share how we came to a decision and what we learned along the way.
We had lots of questions and a number of requirements, including:
- How easy are CloudFormation and Terraform to work with?
- Can we easily replicate environments?
- Can they fit into our existing deployment pipeline?
- Are they secure?
- How reliable are they?
How easy are these tools to work with?
Terraform has its own configuration language that is easy to get started with and understand, and manages to remain clear even when defining multiple resources with complex relationships across various services.
Terraform is also flexible with how it lets you organise files. Resources can all be contained in one giant file, or arbitrarily split across multiple files and Terraform will treat them the same. This has allowed us to logically group resources (usually by AWS service) into separate files. For example, we might have:
- main.tf - contains setup and providers, which allow interaction with various services and APIs (like AWS)
- variables.tf - contains variables we can tweak, which is especially useful in provisioning different environments (production, staging)
- network.tf - here we define networking resources like a VPC, subnets, and route tables
- ec2.tf - define instances and EC2 launch configurations
- outputs.tf - using outputs to expose information about resources we’ve created
The language itself has been designed to avoid the complex nesting of JSON and YAML (which CloudFormation relies on), while still offering features that allow complexity when required. We found the resources encapsulate AWS services well and are clearly documented, which makes it easy and fast to develop and deploy infrastructure. Below is an example definition for an EC2 instance using Terraform.
Launching an EC2 instance using Terraform’s configuration language
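As a rough illustration of what such a definition can look like, here is a minimal sketch. It is not the configuration we actually run: the resource name, AMI ID, instance type and tag value are all placeholders.

```hcl
# Hypothetical example: every value here is a placeholder, not our real setup.
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"              # assumed instance size

  # Tags make the instance easy to identify in the AWS console.
  tags = {
    Name = "example-web"
  }
}
```

The block reads almost like a description of the instance itself, which is a large part of why we found the language easy to pick up.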
To create groups of related resources easily, Terraform has modules: reusable, shareable containers for multiple resources. CloudFormation offers comparable reuse through nested stacks (StackSets, by contrast, deploy the same stack across multiple accounts and regions). We found modules great for getting started, letting us quickly create multiple resources and test things out, but we later decided to use the individual resources, which map more clearly to what’s being created in AWS and make it easier to understand exactly what is happening just by looking at the definitions.
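To give a feel for how compact a module call can be, here is a sketch using the community VPC module from the public Terraform Registry. Choosing this particular module is an assumption made for illustration, and the name, CIDR ranges and availability zones are placeholders.

```hcl
# Hypothetical example: one module call creates a VPC, subnets and route
# tables. The module choice and all values are assumptions for illustration.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "example-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]
}
```

The trade-off is visible here: a dozen lines stand in for many underlying AWS resources, which is quick to write but hides what is actually being created.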
CloudFormation allows writing definitions as templates in either a JSON or YAML formatted text file. These can be written manually, or created using the drag-and-drop graphical tool AWS CloudFormation Designer. Each template, containing the definition for a group of AWS resources, can be used to create a stack, which can then be reviewed and launched.
Defining an EC2 Instance, Subnet and VPC in AWS CloudFormation Designer
We found that the Designer can be difficult to work with. It can take time to find the right resource, and the relationships between resources are not always easy to see once they are defined. It’s also hard to see all the configuration options available for each resource.
Another issue with the Designer is that it lives entirely outside of our codebase. We wanted to be able to review changes as part of our pull request review process, meaning these definitions would live inside our code repository. In order to make changes using the designer, we’d need to copy the relevant template from our code repository into the Designer, make our changes, then copy it back to the code. We felt this would create friction in the development process.
Alternatively, we could define our templates manually, directly in code. There are plugins that help with this (for example the Typeformation and the JetBrains CloudFormation plugins in PyCharm), but even with these, we found that the templates quickly became unclear and complex when defining multiple resources with relationships to other resources. We felt these definitions didn’t have the clarity and flexibility of Terraform’s configuration files.
Example from the AWS Docs. Launching an EC2 instance using JSON
Example from the AWS Docs. Launching an EC2 instance using YAML
After experimenting with both, we found Terraform easier to work with. The configuration language does a good job of representing complex resources, allowing for fine-grained control when required, whilst managing to remain user-friendly. The flexible file structure meant we could organise our resources clearly, reflecting the logical distinctions between our services (e.g. API vs Web Application). The answer here may vary, and we encourage trying both out for yourself, but for us, Terraform came out as the clear winner.
Can we easily replicate environments?
A key requirement for us is the ability to easily provision multiple environments, replicating the resources across production and staging environments, whilst being able to spin up or tear down extra environments as required.
For example, using Terraform, we might have prod.tfvars containing:
env = "prod"
And stage.tfvars containing:
env = "stage"
We’d then be able to switch to the right workspace and run a plan with:
terraform workspace select prod
terraform plan -var-file=prod.tfvars
With some resources defined like an EC2 instance (see above), a Route 53 record and a load balancer, we’d be able to launch these resources into a production environment, using the variables in prod.tfvars, and another copy of these into a separate staging environment, replicating the setup as closely as possible.
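To show how a value from a .tfvars file flows into a resource, here is a minimal sketch. The zone_id variable, the example.com domain and the record values are hypothetical; only the env variable comes from the files described above.

```hcl
# variables.tf: declares the inputs that prod.tfvars / stage.tfvars fill in.
variable "env" {
  type        = string
  description = "Environment name, e.g. prod or stage"
}

# Hypothetical extra variable: the Route 53 hosted zone to create records in.
variable "zone_id" {
  type        = string
  description = "ID of an existing Route 53 hosted zone (placeholder)"
}

# The same definition yields api-prod.example.com or api-stage.example.com,
# depending on which variable file is supplied.
resource "aws_route53_record" "api" {
  zone_id = var.zone_id
  name    = "api-${var.env}.example.com"
  type    = "CNAME"
  ttl     = 300
  records = ["lb-${var.env}.example.com"] # placeholder target
}
```

Because each workspace keeps its own state, applying this once per workspace gives two independent copies of the record rather than one copy being overwritten.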
In CloudFormation we can define the same resources as a stack, using parameters to pass in different values for our environments.
We could then pass different values for each parameter as part of a deployment script.
We like Terraform’s solution for its clarity - the variable files live side-by-side with the resource definition files, and having the ability to define the variables separately keeps our resource definitions simple and easy to maintain. The workspace system is well-suited to maintaining parity between environments. Both Terraform and CloudFormation give us the ability to replicate environments, but we prefer the Terraform solution.
Can Terraform and CloudFormation fit into our existing deployment pipeline?
We use CircleCI for deployment, so ideally our solution would fit nicely within that. We also wanted it to fit side-by-side with our code in the repository, so making modifications becomes part of the same pull request review process and part of the same deployment pipeline. It would be possible to use the CloudFormation Orb to create and deploy stacks, and for Terraform we’d use a couple of CircleCI jobs to create and apply a plan.
We liked the idea of using a Hold in CircleCI (which pauses the deployment pipeline until a human approves it) combined with the plan to be able to review the changes Terraform will make before applying them. When generating the plan, Terraform doesn’t rely solely on what it knows from the state file: by default it first refreshes the state of each resource it manages against the real infrastructure, which would catch unintended changes made to those resources outside of Terraform.
Example section of a Terraform Plan, a detailed execution plan generated before deciding to apply or discard these changes
To deploy changes using CloudFormation, we could make use of the AWS command-line interface (AWS CLI), using the deploy command to apply changes to our templates. CloudFormation also allows checking for drift, giving the ability to detect and correct unintended changes to infrastructure.
We found that CloudFormation couldn’t give us the same detailed feedback on what would be changing that the Terraform plan gives us. The plan is really powerful and is a huge win for Terraform, giving us the confidence that we’re not making any unintended changes.
We also had a lot of existing infrastructure. Fortunately, both tools have the ability to ‘import’ existing resources, allowing them to be managed by the tool going forwards without having to destroy and recreate them, avoiding any downtime. In CloudFormation, it’s possible to create a stack from existing resources, and in Terraform each resource can be brought in with the import command, whose syntax is documented on each resource’s page in the docs (for example, for an EC2 instance).
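As a sketch of what importing looks like on the Terraform side, newer versions (1.5 and later) also allow the import to be declared in configuration with an import block, alongside the command-line form. The instance ID, AMI ID and resource name below are placeholders.

```hcl
# Hypothetical example: adopt an existing EC2 instance without recreating it.
# Requires Terraform 1.5 or later. Rough CLI equivalent:
#   terraform import aws_instance.web i-0123456789abcdef0
import {
  to = aws_instance.web
  id = "i-0123456789abcdef0" # placeholder instance ID
}

# The resource block should describe the instance as it already exists,
# otherwise the next plan will propose changes to it.
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"
}
```

Running a plan after the import is a useful check: an empty plan suggests the written definition matches the real resource.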
Are they secure?
The main consideration here surrounds how the state is stored, which tracks the existence and condition of remote resources (how they are) vs how they’re defined (how they should be). CloudFormation manages the state automatically within AWS, whereas Terraform requires a bit of configuration and consideration.
S3 and Terraform Cloud are two options for remote state that allow the state, which may contain sensitive data, to be stored securely. Either option must be treated carefully and can take some configuration to get right.
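For the S3 option, a minimal backend configuration might look like the sketch below. The bucket name, key and region are placeholders, and the bucket is assumed to exist already with access restricted to the people and pipelines that need it.

```hcl
# Hypothetical example: store Terraform state in an existing S3 bucket.
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"          # placeholder bucket name
    key     = "infrastructure/terraform.tfstate" # path of the state object
    region  = "eu-west-1"                        # placeholder region
    encrypt = true                               # server-side encryption at rest
  }
}
```

Note that backend blocks cannot reference variables, so these values have to be written literally or supplied when running terraform init.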
How reliable are they?
CloudFormation can monitor the state of resources in real-time, checking infrastructure is provisioned correctly, whereas Terraform will only do this when the commands are run - in our case as part of the deployment pipeline, though a task could be created to run regularly and ensure infrastructure is always up to date.
If a CloudFormation update fails, it will attempt to roll back to the previous working state. If something goes wrong in Terraform, it will simply fail and resources could be left in an unintended state.
CloudFormation also has wait conditions, giving the ability to wait until a service is running before proceeding. Terraform has relationships between services that define the order in which actions will be taken, and this is customisable using depends_on, but this won’t check that a service has actually started and is running. Therefore, it’s usually best to make changes in a secondary staging environment first.
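To illustrate the explicit ordering described above, here is a minimal sketch of depends_on. The bucket name, AMI ID and resource names are placeholders.

```hcl
# Hypothetical example: force the instance to be created after the bucket,
# even though nothing in the instance's arguments references the bucket.
resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-bucket" # placeholder bucket name
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"

  # Ordering only: Terraform waits for the bucket to be created, but does not
  # check that any software on the instance has started successfully.
  depends_on = [aws_s3_bucket.assets]
}
```

In most cases Terraform infers this ordering automatically from references between resources, so depends_on is only needed for hidden dependencies like this one.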
Overall we found that both tools have the functionality to do everything we required, but Terraform executes it better. It manages to be simple to work with whilst supporting more complex configurations without becoming unwieldy. CloudFormation does the job, but Terraform makes it easy, taking the headache out of managing AWS infrastructure. Whilst we encourage you to try both and make up your own mind, we went with Terraform and can happily say it has served its purpose very well so far.