Automating Terraform: Terrateam vs. GitHub Actions


Introduction

In this post we will compare GitHub Actions to Terrateam for managing a Terraform workflow. We’ll show that Terrateam provides a safer and more robust solution than using GitHub Actions alone.

Terrateam uses GitHub Actions to execute Terraform. However, the Terrateam backend provides the necessary functionality to ensure changes are applied safely.

We will compare GitHub Actions to Terrateam across the following:

  • Safety - How well do both integrations guarantee the safety of an operation?
  • Robustness - How robust is the solution? How well does it handle failures? How difficult is it to add new functionality?
  • Configurability - How configurable is the solution? Is modifying the workflow easy or cumbersome?

Terrateam and GitHub Actions

Terrateam uses GitHub Actions, so one might wonder: what is this blog post about?

GitHub Actions: on-demand compute

Almost no decision logic happens in the action. Instead, Terrateam has a backend service that listens to GitHub events, evaluates those events, then decides whether or not to run the GitHub Action.

The standalone GitHub Actions solution compared in this post is attempting to solve the same problem solely with GitHub Actions.

Therefore, when this post refers to “standalone GitHub Action” or just “GitHub Actions”, it means a Terraform automation workflow implemented using solely GitHub Actions.

TL;DR

The purpose of this post is not to dismiss a standalone GitHub Actions workflow for Terraform automation. This approach can take a team very far. However, there is a trade-off. As the complexity of a workflow grows and the team grows, a GitHub Actions solution starts to show its limitations. Some workflows are simply not possible using GitHub Actions without giving up on safety. A complicated GitHub Action is a piece of software that needs to be maintained like any other software and that is a burden a team should take into account.

Terrateam is able to simultaneously scale with the workflow and team while maintaining safety, robustness, and configurability. Because Terrateam is a service, users benefit from improvements and bug fixes without having that maintenance burden.

Terraform and GitHub Actions

The basis for comparing Terrateam to GitHub Actions is the blog post Elevate Your Terraform Workflow With GitHub Actions by Andrew Walker. It is a great post that develops a GitHub Action supporting linting, planning, and applying. He also describes how to use GitHub Branch Protection to give some safety guarantees.

Safety

Terrateam excels at safety. Being a service, it easily tracks the state of the changes in the repository, which is imperative to ensuring changes are applied safely.

The GitHub Actions solution does provide some guarantees, but they are not nearly as strong as the Terrateam guarantees.

  • All changes are up to date
    • GitHub Actions - By using Branch Protection rules, all code is guaranteed to include the latest updates from the destination branch.
    • Terrateam - Tracks every run as well as every update to the destination branch, and will not perform an apply if the plan does not reflect the latest code.
  • Requirements are met prior to applying
    • GitHub Actions - Can only perform an apply after merge and uses Branch Protection to control when a merge can be performed.
    • Terrateam - Can perform an apply before or after a merge. Supports configurable requirements around when an apply can be performed.
  • Applies are serialized
    • GitHub Actions - Cannot guarantee this. It ensures that all pull requests have the latest code prior to merge, but it is possible to merge a pull request right after another one has been merged, before the first apply has finished. In this case, multiple apply operations would execute at the same time, but with locking of Terraform state enabled, one apply would fail. There is no way to execute the failed apply without another code change.
    • Terrateam - Guarantees only a single apply is executing at a time. In the case that a second pull request is merged before the previous apply has finished, Terrateam will also prevent any future apply from being performed until the second pull request has been successfully applied.
  • Locking
    • GitHub Actions - Only provides locking during the Terraform run via Terraform state locking. This prevents two operations from overlapping, but in the case of a failure, there is no guarantee that no further change can be applied until the previous failure is resolved.
    • Terrateam - Guarantees that a change is both merged and applied before allowing other applies to be performed. A lock is associated with the pull request, not with a specific Terraform run. A pull request owns a lock once it has either been merged or any of its changes have been applied, and it releases the lock once all changes have been successfully applied and it is also merged.
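
The lock rule in the last bullet can be modeled in a few lines. The following is only a minimal sketch of the rule as stated in this post, not Terrateam's implementation: the `PullRequest` class, its fields, and the choice of a single global lock are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical in-memory model of a pull request (not Terrateam's schema)."""
    number: int
    dirs: set[str]                                    # directories with Terraform changes
    applied: set[str] = field(default_factory=set)    # directories applied successfully
    merged: bool = False

    def fully_applied(self) -> bool:
        return self.dirs <= self.applied

    def owns_lock(self) -> bool:
        # Acquired once the pull request is merged OR any change is applied.
        acquired = self.merged or bool(self.applied)
        # Released only once it is merged AND every change is applied.
        released = self.merged and self.fully_applied()
        return acquired and not released


def may_apply(pr: PullRequest, all_prs: list[PullRequest]) -> bool:
    """Assumption: one global lock, so any other lock owner blocks the apply."""
    return not any(o.owns_lock() for o in all_prs if o.number != pr.number)
```

In this model a pull request that was merged but never applied keeps its lock, and so does one that was applied but never merged. That is the property Terraform state locking alone cannot provide, because state locks disappear as soon as the run ends.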

Safety: More than the act of applying changes. Also security.

Terrateam, being a backend service, is able to make decisions using more information than just what is available in the pull request.

For example, Terrateam will fetch sensitive configuration, such as permissions, from the default branch rather than the branch in the pull request.
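
To make the idea concrete, here is a minimal sketch of taking sensitive settings from the default branch and ignoring whatever the pull request branch says about them. The key name `permissions` and the function `effective_config` are hypothetical and chosen only for illustration; they are not Terrateam configuration keys or APIs.

```python
# Hypothetical set of keys considered sensitive; the real set is not named in this post.
SENSITIVE_KEYS = {"permissions"}


def effective_config(default_branch_cfg: dict, pr_branch_cfg: dict) -> dict:
    """Start from the pull request's config, but force sensitive keys to
    the values found on the default branch."""
    result = dict(pr_branch_cfg)
    for key in SENSITIVE_KEYS:
        # Drop whatever the pull request says about a sensitive key ...
        result.pop(key, None)
        # ... and use the default branch's value if there is one.
        if key in default_branch_cfg:
            result[key] = default_branch_cfg[key]
    return result
```

With this shape, a pull request that edits or deletes its own permissions has no effect on the run that evaluates it.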

Terrateam can also verify that the GitHub Action workflow file is correct. A standalone GitHub Action workflow will blindly run whatever is in the branch without any external verification or validation that it is running the intended code.

Robustness

To some degree, this is an unfair comparison.

Terrateam is a backend service that uses GitHub Actions to execute Terraform in a trusted environment. As a backend service, it is written in a general-purpose programming language and backed by a transactional database.

GitHub Actions, on the other hand, can only call other programs as specified by a workflow written in YAML, which limits what can be expressed.

The theme for this category is the following:

Most things are possible in a GitHub Action, but the devil is in the details. Implementation can often be challenging and brittle.

Failed operation

  • GitHub Actions - Reports the failure to the user via a comment on the pull request. However, there is no way to trigger a re-run except by making a code change. This is problematic for a plan or apply operation: there are many situations where one fails for reasons unrelated to the code, and simply retrying would resolve it.
  • Terrateam - Reports the failure to the user via a comment. Operations can be re-run by commenting terrateam plan or terrateam apply on the pull request. Terrateam also provides actionable feedback for some errors, and adds more with each release.

Actionable feedback

  • GitHub Actions - This solution provides nothing beyond the error messages Terraform provides. A user can always add their own code to improve the output.
  • Terrateam - Constantly improving its feedback; a great developer experience is a goal of the product. For example, if one tries to apply a pull request while another pull request owns the lock, the error message names the pull request that holds the lock so the user can take action. Terrateam also looks at what Terraform reports to provide help: if it sees an error that looks related to credentials not being configured, it provides additional information to the user.

Adding new functionality

  • GitHub Actions - Possible, however at some point the amount of YAML will make it hard to maintain and modify. Being a GitHub Action also puts a limit on how sophisticated new features can be implemented without running one’s own infrastructure.
  • Terrateam - Easy to add new functionality, and it gains new features with each release. A lot of custom functionality can be added to a Terrateam workflow via hooks and workflows. However, the Terrateam product is maintained by the Terrateam company, which means one depends on us to add functionality that cannot be achieved with hooks and workflows. Customer requests and feedback are always considered for each release.
  • Bugfixes and Improvements
    • GitHub Actions - Each GitHub Actions solution is a custom solution and does not benefit from a community addressing a bug or adding a new feature.
    • Terrateam - Improves on each release. Using Terrateam means that even if one has not come across a particular bug, if any one customer has, the service will improve based on that collective experience.

Manual overrides

  • GitHub Actions - Does not provide strong safety guarantees, so there are not many escape hatches to provide. As mentioned previously, the solution does not support running a plan or apply without a code change. Having to make a code change just to re-run an apply is a bad experience, especially during an outage.
  • Terrateam - Allows a user to get around the safety guarantees if necessary. Commenting terrateam unlock in a pull request will remove any locks that pull request owns. This should not be part of everyday usage, but in an emergency the system should not be an unnecessary hindrance. Terrateam also supports manually re-running a plan or apply. In the case of an outage, if an apply fails due to reasons outside of the code, it is trivial to re-run.
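
The comment-driven commands mentioned in this section (terrateam plan, terrateam apply, terrateam unlock) amount to a small dispatch step on pull request comments. The sketch below shows one way such a comment could be recognized. The function name and the exact parsing rules are assumptions for illustration, not a description of how Terrateam parses comments.

```python
from typing import Optional

# The three commands named in this post.
COMMANDS = {"plan", "apply", "unlock"}


def parse_comment(body: str) -> Optional[str]:
    """Return the requested operation, or None for an ordinary comment."""
    words = body.strip().split()
    if len(words) >= 2 and words[0] == "terrateam" and words[1] in COMMANDS:
        return words[1]
    return None
```

Because the trigger is a comment rather than a commit, a failed apply can be retried without touching the code under review.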

Configurability

This is where Terrateam really shines compared to GitHub Actions.

A goal of Terrateam is to fit into its users’ workflow, not impose a workflow on its users. This makes Terrateam very configurable. While it is very configurable, the default behaviors of Terrateam work out-of-the-box for most use-cases, requiring very little upfront setup. Reasonable defaults are configured against Terraform repositories with safety as the number one priority and usability as a close second.

For sophisticated workflows, some configuration will be required. Terrateam supports and continues to support more complex workflows through configurations such as hooks, workflows, tagging, and directory globbing.

Standalone GitHub Actions workflows don’t have any runtime Terraform defaults in place, so even a simple workflow requires a lot of manual setup and configuration. Additionally, standalone GitHub Actions workflows don’t have off-the-shelf features to satisfy complex Terraform workflow requirements. Writing custom scripts and injecting other GitHub Actions into a workflow file are the typical solutions, and both can lead to more complexity and fragility.

See how stark the difference is:

Linting

Before running any change with Terraform, many teams want to lint the new code. The Terraform CLI makes this easy with the terraform fmt command.

To configure Terrateam to run terraform fmt:

workflows:
  - tag_query: ""
    plan:
      - type: run
        cmd: ["terraform", "fmt", "-check", "-diff", "."]
        capture_output: true
      - type: init
      - type: plan

The configuration in the GitHub Action is a fair bit longer:

name: Plan / Test On PR
on:
  pull_request:
    branches:
      - main
jobs:
  lint:
    name: Lint
    runs-on: ubuntu-20.04
    steps:
      - name: Check out code
        uses: actions/checkout@v2
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 1.0.9
      - name: Run terraform fmt check
        run: terraform fmt -check -diff -recursive ./terraform

But these experiences are not the same. On a successful plan, Terrateam only provides the plan output on a pull request comment. It’s only when something fails that Terrateam shows more details in the comment. In the case of the GitHub Action, if it fails, one must go to the action output to investigate.

Additionally, what if we only want to run fmt against specific directories? In Terrateam this is easy using tagging. The below configuration will tag any file under the production directory with the production tag and only run fmt on those files.

dirs:
  production/**:
    tags: [production]
workflows:
  - tag_query: production
    plan:
      - type: run
        cmd: ["terraform", "fmt", "-check", "-diff", "."]
        capture_output: true
      - type: init
      - type: plan
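
As a rough illustration of how a glob such as production/** can map paths to tags, and how a tag query then selects a workflow, consider the sketch below. It is a simplified model under stated assumptions: it handles only a single-tag query, it treats an empty query as matching everything, and the function names are hypothetical. It is not Terrateam's matching logic.

```python
from fnmatch import fnmatchcase

# Mirrors the `dirs` section of the configuration above.
DIRS = {"production/**": ["production"]}


def tags_for(path: str, dirs: dict[str, list[str]] = DIRS) -> set[str]:
    """Collect the tags of every glob pattern that matches the path."""
    tags: set[str] = set()
    for pattern, pattern_tags in dirs.items():
        if fnmatchcase(path, pattern):
            tags.update(pattern_tags)
    return tags


def workflow_applies(path: str, tag_query: str) -> bool:
    """Assumption: an empty query matches everything; otherwise the
    query is a single tag that must be present on the path."""
    return tag_query == "" or tag_query in tags_for(path)
```

Under this model, `workflow_applies("production/vpc/main.tf", "production")` is true while `workflow_applies("staging/main.tf", "production")` is false, so the fmt step only runs for the production tree.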

Planning

For this workflow, we’d like to run the following steps for each environment (dev, stage, prod):

  • Check out code
  • Run linting
  • Setup Terraform
  • Configure AWS Credentials using GitHub Secrets
  • Initialize Terraform
  • Plan Terraform
  • Post Plan to GitHub PR

To configure Terrateam:

Modify the Terrateam configuration .terrateam/config.yml:

workflows:
  - tag_query: ""
    plan:
      - type: run
        cmd: ["terraform", "fmt", "-check", "-diff", "."]
        capture_output: true
      - type: env
        name: TF_VAR_allowed_account_id
        cmd: ["echo", "$ALLOWED_ACCOUNT_ID"]
      - type: init
      - type: plan

To perform this in the GitHub Action:

name: Plan / Test On PR
on:
  pull_request:
    branches:
      - main
jobs:
  lint:
    name: Lint
    runs-on: ubuntu-20.04
    steps:
      - name: Check out code
        uses: actions/checkout@v2
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 1.0.9
      - name: Run terraform fmt check
        run: terraform fmt -check -diff -recursive ./terraform
  plan:
    name: Plan
    env:
      TF_VAR_allowed_account_id: ${{ secrets.ALLOWED_ACCOUNT_ID }}
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        path:
          - dev
          - stage
          - prod
    steps:
      - name: Check out code
        uses: actions/checkout@v2
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 1.0.9
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: us-east-1
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      - name: Initialize Terraform
        run: |
          cd terraform/${{ matrix.path }}
          terraform init -input=false
      - name: Plan Terraform
        id: plan
        continue-on-error: true
        run: |
          cd terraform/${{ matrix.path }}
          terraform plan -input=false -no-color
      - name: Post Plan to GitHub PR
        uses: mshick/add-pr-comment@v1
        with:
          allow-repeats: true
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: "github-actions[bot]"
          message: |
            ## ${{ matrix.path }} plan
            ${{ steps.plan.outputs.stdout || steps.plan.outputs.stderr }}

But, again, these experiences are not the same. Some key differences:

  • The GitHub Action solution mixes business logic (how to run Terraform) with configuration (what unique values to use). Making a configuration change could break the business logic, purely by accident.
  • The GitHub Action uses a build matrix to run multiple directories (dev, stage, prod) on every pull request, even if no change has been made to those directories, or even to a Terraform file. This means updating the README will result in a plan and apply of all environments.
  • Terrateam only runs Terraform on those directories that have Terraform changes in them. A pull request opened without Terraform changes is ignored.
  • It is possible for a plan to be too large to fit in a comment. The GitHub Actions solution simply fails in this case and the user receives no output. Terrateam will try to remove unneeded output from the plan to make it smaller and if that still does not fit in a comment it will comment with a warning and direct the user where to see the full output.
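
The difference between a fixed build matrix and change-driven runs can be sketched as a function from the list of changed files to the directories that need a plan. This is a simplified illustration: the set of file suffixes that count as Terraform changes and the function name are assumptions, not Terrateam's actual rules.

```python
from pathlib import PurePosixPath

# Assumption: these suffixes are what counts as a "Terraform change".
TERRAFORM_SUFFIXES = (".tf", ".tfvars")


def dirs_to_plan(changed_files: list[str]) -> list[str]:
    """Return the directories containing changed Terraform files."""
    dirs = {
        str(PurePosixPath(path).parent)
        for path in changed_files
        if path.endswith(TERRAFORM_SUFFIXES)
    }
    return sorted(dirs)
```

A pull request that only touches README.md yields an empty list, so nothing runs. A fixed matrix of dev, stage, and prod would run all three regardless.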

The blog post goes on to show various improvements around planning, such as removing extraneous output and modifying the output to fit the diff syntax. All this comes at the cost of more lines of YAML.

Terrateam does this already, and more, combining all of the outputs into a concise single comment.
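
The fallback behaviour described for oversized plans (trim first, then warn and link) can be sketched as follows. The character limit, the choice of "Refreshing state..." lines as the output to drop, and the `details_url` parameter are all assumptions made for this example. Terrateam's actual trimming rules are not described in this post.

```python
# Assumed maximum comment size in characters; treat this as a parameter.
MAX_COMMENT_CHARS = 65536


def render_plan_comment(plan: str, details_url: str,
                        limit: int = MAX_COMMENT_CHARS) -> str:
    """Return the plan if it fits, a trimmed plan if that fits,
    otherwise a warning that points at the full output."""
    if len(plan) <= limit:
        return plan
    # Step 1: drop lines that rarely matter during review.
    trimmed = "\n".join(
        line for line in plan.splitlines()
        if "Refreshing state..." not in line
    )
    if len(trimmed) <= limit:
        return trimmed
    # Step 2: give up on inlining and direct the user to the full output.
    return ("Plan output is too large to display in a comment. "
            f"See the full output: {details_url}")
```

The important property is that the user always receives something actionable, rather than a failed step and no output.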

The output of Terrateam requires no configuration:

Terrateam Plan Screenshot

and the output of GitHub Action:

GitHub Actions Plan Screenshot

The configuration to achieve this output:

- name: Plan Terraform
  id: plan
  continue-on-error: true
  run: |
    cd terraform/${{ matrix.path }}
    terraform plan -input=false -no-color -out=tfplan \
    && terraform show -no-color tfplan
- name: Reformat Plan
  run: |
    echo '${{ steps.plan.outputs.stdout || steps.plan.outputs.stderr }}' \
    | sed -E 's/^([[:space:]]+)([-+])/\2\1/g' > plan.txt
- name: Put Plan in Env Var
  run: |
    PLAN=$(cat plan.txt)
    echo "PLAN<<EOF" >> $GITHUB_ENV
    echo "$PLAN" >> $GITHUB_ENV
    echo "EOF" >> $GITHUB_ENV
- name: Post Plan to GitHub PR
  uses: mshick/add-pr-comment@v1
  with:
    allow-repeats: true
    repo-token: ${{ secrets.GITHUB_TOKEN }}
    repo-token-user-login: "github-actions[bot]"
    message: |
      ## ${{ matrix.path }} plan
      ${{ env.PLAN }}
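
For readers less familiar with sed, the Reformat Plan step moves a leading + or - to the start of each line so that the output renders as a diff. A rough Python equivalent of that one substitution is shown below. The function name is hypothetical and the sketch is only meant to explain the transformation, not to replace the step.

```python
import re

# Leading spaces or tabs (group 1) followed by a + or - sign (group 2).
_LEADING_SIGN = re.compile(r"^([ \t]+)([-+])")


def to_diff_syntax(plan: str) -> str:
    """Swap the indentation and the sign on each line, like the sed
    expression s/^([[:space:]]+)([-+])/\\2\\1/ applied line by line."""
    return "\n".join(
        _LEADING_SIGN.sub(r"\2\1", line) for line in plan.split("\n")
    )
```

For example, the line `  + resource "x" {` becomes `+   resource "x" {`, while lines that do not start with an indented sign are left unchanged.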

Applying

Terrateam supports pre-merge and post-merge apply workflows. Operations can run automatically when a pull request is created, updated, or merged, and on demand by commenting terrateam plan or terrateam apply on the pull request. In a pre-merge workflow, Terrateam can also automatically merge the change after all changes have been applied successfully.

The GitHub Actions workflow only supports post-merge apply, and it has to be this way to maintain the few safety guarantees it offers. As mentioned before, this solution uses Branch Protection rules to ensure that all pull requests have the latest changes before merging, which is how applies are guaranteed to always have the latest changes. But this means that there cannot be a pre-merge workflow because then an apply could be performed while the pull request is not mergeable.

The inability to perform an apply without a code change is a serious limitation for the GitHub Action workflow. Our customers commonly see an apply fail for various reasons unrelated to the code, for example an IAM permission not yet being visible when the change using it executes.

Finally, the GitHub Action workflow does not store the plan between the plan and apply, and instead it performs a second plan before applying. The consequence is the apply that is performed may not be the plan that was reviewed in the pull request. How big of a risk this is depends on your organization, but this is a silent failure mode that will not be visible until it’s too late.

Terrateam stores the plan and applies exactly that plan, so one is guaranteed that the plan they reviewed is the plan they applied.
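
One way to picture the guarantee is a stored plan that remembers which commits it was built from, together with a check that refuses to apply it once either side has moved. The sketch below is a simplified model: the `StoredPlan` fields and the `can_apply` function are hypothetical and are not Terrateam's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredPlan:
    """Hypothetical record kept between the plan and the apply."""
    directory: str
    head_sha: str     # pull request commit the plan was built from
    base_sha: str     # destination branch commit at plan time
    plan_file: bytes  # the saved plan that will be applied as-is


def can_apply(plan: StoredPlan, current_head: str, current_base: str) -> bool:
    """Apply only if neither the pull request nor the destination branch
    has changed since the plan was created."""
    return plan.head_sha == current_head and plan.base_sha == current_base
```

If the check fails, the safe response is to produce and review a new plan rather than to re-plan silently at apply time.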

For the sake of completeness, here is what the Terrateam configuration looks like in order to match the apply-after-merge workflow of the GitHub Action:

when_modified:
  autoapply: true
workflows:
  - tag_query: ""
    plan:
      - type: run
        cmd: ["terraform", "fmt", "-check", "-diff", "."]
        capture_output: true
      - type: env
        name: TF_VAR_allowed_account_id
        cmd: ["echo", "$ALLOWED_ACCOUNT_ID"]
      - type: init
      - type: plan

And here is what the GitHub Actions configuration has grown to:

name: Plan / Apply On Merge
on:
  push:
    branches:
      - main
jobs:
  inform_about_apply:
    name: Inform About Apply
    runs-on: ubuntu-20.04
    steps:
      - name: Inform on PR that Apply is Running
        uses: mshick/add-pr-comment@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: "github-actions[bot]"
          message: |
            ***Running terraform apply***
            Results will display here momentarily...
  plan_and_apply:
    name: Plan and Apply
    env:
      TF_VAR_allowed_account_id: ${{ secrets.ALLOWED_ACCOUNT_ID }}
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        path:
          - dev
          - stage
          - prod
    steps:
      - name: Check out code
        uses: actions/checkout@v2
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 1.0.9
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: us-east-1
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      - name: Initialize Terraform
        run: |
          cd terraform/${{ matrix.path }}
          terraform init -input=false
      - name: Plan Terraform
        id: plan
        continue-on-error: true
        run: |
          cd terraform/${{ matrix.path }}
          terraform plan -input=false -no-color -out=tfplan \
          && terraform show -no-color tfplan
      # Sed is taking all lines that begin with one or more spaces followed by a `+` or `-`.
      # It stores the amount of spaces in `\1` and the +/- in `\2`.
      # Then replace that portion of the line with `\2\1` (+/- followed by the number of matched spaces).
      - name: Reformat Plan
        if: steps.plan.outcome == 'success'
        run: |
          echo '${{ steps.plan.outputs.stdout || steps.plan.outputs.stderr }}' \
          | sed -E 's/^([[:space:]]+)([-+])/\2\1/g' > plan.txt
      - name: Put Plan in Env Var
        if: steps.plan.outcome == 'success'
        run: |
          PLAN=$(cat plan.txt)
          echo "PLAN<<EOF" >> $GITHUB_ENV
          echo "$PLAN" >> $GITHUB_ENV
          echo "EOF" >> $GITHUB_ENV
      - name: Apply Terraform
        if: steps.plan.outcome == 'success'
        id: apply
        continue-on-error: true
        run: |
          cd terraform/${{ matrix.path }}
          terraform apply \
            -input=false \
            -no-color \
            tfplan
      - name: Post Plan and Apply to GitHub PR
        if: steps.plan.outcome == 'success' && steps.apply.outcome == 'success'
        uses: mshick/add-pr-comment@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: "github-actions[bot]"
          message: |
            Applying **${{ matrix.path }}**:
            ${{ env.PLAN }}
            ${{ steps.apply.outputs.stdout }}
      - name: Post Plan Failure
        if: steps.plan.outcome == 'failure'
        uses: mshick/add-pr-comment@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: "github-actions[bot]"
          message: |
            Plan failed for **${{ matrix.path }}**:
            ${{ steps.plan.outputs.stderr }}
      - name: Post Apply Failure
        if: steps.apply.outcome == 'failure'
        uses: mshick/add-pr-comment@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: "github-actions[bot]"
          message: |
            Apply failed for **${{ matrix.path }}**:
            ${{ steps.apply.outputs.stderr }}

Extras

Terrateam supports even more features for pull requests:

  • Cost estimation
  • Static analysis
  • RBAC

Cost estimation and static analysis can be added to the GitHub Action solution, but at the cost of more YAML in a workflow that has already grown quite large.

RBAC, on the other hand, is simply not feasible to implement in this way. Any RBAC solution in this YAML can be turned off in a pull request, circumventing its own protection.

Conclusion

One might think this comparison is unfair. The GitHub Actions solution implements some of what Terrateam does using only GitHub Actions, and this blog post has only shown the configuration file for Terrateam. Not only that, but the GitHub Actions solution fits in a few hundred lines of YAML. Surely Terrateam is more code than that.

And Terrateam is more lines of code. Terrateam is supporting a more diverse range of workflows. It’s providing stronger safety guarantees with more flexibility. But consider a small change to one’s workflow: wanting to apply before merge. This would require a significant change to the GitHub Actions YAML, and may not be possible to do safely without running an external service. Standing up a homegrown solution with GitHub Actions can quickly lead to a tangled web of workflow steps with very little safety or visibility.

And that is why we don’t dismiss a standalone GitHub Actions solution outright. A lot can be accomplished with it. Despite the strong words in this blog post, if the GitHub Actions workflow matches your needs, use it. If your workflow outgrows it, switching to Terrateam is as easy as removing the existing GitHub Actions workflows and replacing them with Terrateam.

With Terrateam, we have a lot of out-of-the-box functionality that will allow your team to make safe Terraform changes with very little required configuration.

Please give Terrateam a shot by following our getting-started guide.
