Import AWS S3 Buckets into Terraform

Introduction

This tutorial covers how to import AWS S3 buckets using version 4.0 of the HashiCorp AWS provider.

In version 4.0, the HashiCorp AWS provider split the aws_s3_bucket resource into several resources, for example aws_s3_bucket_acl, aws_s3_bucket_ownership_controls, and aws_s3_bucket_public_access_block, each managing a different aspect of the bucket.
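
To make the split concrete, here is a rough before-and-after sketch for static website hosting. The resource label example is made up for illustration, and the 3.x form is shown only as a comment for comparison:

```hcl
# 3.x style, commented out for comparison: website settings were
# configured inline on the bucket resource.
#
# resource "aws_s3_bucket" "example" {
#   bucket = "terrateam-test-bucket"
#   website {
#     index_document = "index.html"
#   }
# }

# 4.0 style: the bucket and its website configuration are separate resources.
resource "aws_s3_bucket" "example" {
  bucket = "terrateam-test-bucket"
}

resource "aws_s3_bucket_website_configuration" "example" {
  bucket = aws_s3_bucket.example.id

  index_document {
    suffix = "index.html"
  }
}
```

Each of these separate resources has its own entry in the Terraform state, which is why each one needs its own import.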

Bucket Setup

For this tutorial, I created a bucket to demonstrate the import. I recommend making a bucket that you can modify as you want in order to test importing different attributes. The bucket made for this blog post has the following configuration:

  • Tag named purpose with the value blog-test
  • Static website hosting enabled.
  • Block all public access is disabled.
  • Object ownership is set to “object writer”.

The bucket name terrateam-test-bucket is used throughout this tutorial; replace it with the name of your bucket.

Create the bucket with default settings first, and then modify its configuration. Some resources cannot be imported otherwise; the “Error: Cannot import non-existent remote object” section below explains why.

Import Workflow

The overall workflow for importing resources is as follows:

  1. If you haven’t already, run terraform init.
  2. Write rough Terraform code for the resources to be imported. It doesn’t have to match exactly; we’ll fix it up after the import.
  3. Run terraform import on each resource.
  4. Run terraform plan.
  5. Use the output of terraform plan and update the Terraform code to reflect any changes it lists.
  6. Repeat planning and updating the code until you’re happy with the output.

Importing a Single S3 Bucket

Write Initial Terraform Code

Put the following code in a Terraform file (I called mine s3.tf, but the name isn’t important).

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  alias  = "default"
  region = "us-east-1"
}

resource "aws_s3_bucket" "bucket" {
  bucket = "terrateam-test-bucket"
}

resource "aws_s3_bucket_public_access_block" "bucket" {
  bucket = aws_s3_bucket.bucket.id
}

resource "aws_s3_bucket_ownership_controls" "bucket" {
  bucket = aws_s3_bucket.bucket.id
  rule {
    object_ownership = "ObjectWriter"
  }
}

resource "aws_s3_bucket_website_configuration" "bucket" {
  bucket = aws_s3_bucket.bucket.bucket
  index_document {
    suffix = "index.html"
  }
}

Perform Import

The bucket:

$ terraform import 'aws_s3_bucket.bucket' terrateam-test-bucket
aws_s3_bucket.bucket: Importing from ID "terrateam-test-bucket"...
aws_s3_bucket.bucket: Import prepared!
Prepared aws_s3_bucket for import
aws_s3_bucket.bucket: Refreshing state... [id=terrateam-test-bucket]
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.

The public access block:

$ terraform import 'aws_s3_bucket_public_access_block.bucket' terrateam-test-bucket
aws_s3_bucket_public_access_block.bucket: Importing from ID "terrateam-test-bucket"...
aws_s3_bucket_public_access_block.bucket: Import prepared!
Prepared aws_s3_bucket_public_access_block for import
aws_s3_bucket_public_access_block.bucket: Refreshing state... [id=terrateam-test-bucket]
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.

The ownership controls:

$ terraform import 'aws_s3_bucket_ownership_controls.bucket' terrateam-test-bucket
aws_s3_bucket_ownership_controls.bucket: Importing from ID "terrateam-test-bucket"...
aws_s3_bucket_ownership_controls.bucket: Import prepared!
Prepared aws_s3_bucket_ownership_controls for import
aws_s3_bucket_ownership_controls.bucket: Refreshing state... [id=terrateam-test-bucket]
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.

The website configuration:

$ terraform import 'aws_s3_bucket_website_configuration.bucket' terrateam-test-bucket
aws_s3_bucket_website_configuration.bucket: Importing from ID "terrateam-test-bucket"...
aws_s3_bucket_website_configuration.bucket: Import prepared!
Prepared aws_s3_bucket_website_configuration for import
aws_s3_bucket_website_configuration.bucket: Refreshing state... [id=terrateam-test-bucket]
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.

Iterate on the Code with a Plan

Our initial Terraform code declared only the bare resources, but the real bucket has configuration that is now in our state file after the import. We want our Terraform code to match what was imported. To get there, we run terraform plan, manually update the code to reflect what the plan reports, and repeat until the plan shows no changes.

$ terraform plan
aws_s3_bucket.bucket: Refreshing state... [id=terrateam-test-bucket]
aws_s3_bucket_ownership_controls.bucket: Refreshing state... [id=terrateam-test-bucket]
aws_s3_bucket_public_access_block.bucket: Refreshing state... [id=terrateam-test-bucket]
aws_s3_bucket_website_configuration.bucket: Refreshing state... [id=terrateam-test-bucket]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# aws_s3_bucket.bucket will be updated in-place
~ resource "aws_s3_bucket" "bucket" {
+ force_destroy = false
id = "terrateam-test-bucket"
~ tags = {
- "purpose" = "blog-test" -> null
}
~ tags_all = {
- "purpose" = "blog-test"
} -> (known after apply)
# (10 unchanged attributes hidden)
# (4 unchanged blocks hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.

In this case, we just need to add the tags to the aws_s3_bucket.bucket resource:

resource "aws_s3_bucket" "bucket" {
  bucket = "terrateam-test-bucket"
  tags = {
    purpose = "blog-test"
  }
}

Run terraform plan again, and there will be no differences.

Importing S3 Buckets in a for_each

If you have a lot of buckets, it might make sense to manage them with a for_each. I think this is a great idea if you have a lot of buckets that all have, roughly, the same configuration. Having all of them defined in one location can make it easier to understand and update your infrastructure. You know that if you add a bucket to the list of buckets, it will get created with a known set of attributes that have already been reviewed and approved. Doing this does not stop us from separately defining those buckets that might have a very different configuration.

In this example, I will make another bucket called terrateam-test-bucket2, which will be similar to terrateam-test-bucket. To make it interesting, this bucket will block public access, in order to demonstrate how to provide different configuration when using a for_each.

Write Initial Terraform Code

This code is to get us to the point where we can modify it using terraform plan. At that point, we will add the ability to modify some of the configuration of each bucket independently.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  alias  = "default"
  region = "us-east-1"
}

locals {
  buckets = {
    "terrateam-test-bucket": {}
    "terrateam-test-bucket2": {}
  }
}

resource "aws_s3_bucket" "buckets" {
  for_each = local.buckets
  bucket   = each.key
}

resource "aws_s3_bucket_public_access_block" "buckets" {
  for_each = local.buckets
  bucket   = each.key
}

resource "aws_s3_bucket_ownership_controls" "buckets" {
  for_each = local.buckets
  bucket   = each.key
  rule {
    object_ownership = "ObjectWriter"
  }
}

resource "aws_s3_bucket_website_configuration" "buckets" {
  for_each = local.buckets
  bucket   = each.key
  index_document {
    suffix = "index.html"
  }
}

Perform Import

The import is going to look a little bit different because we’re using for_each. We have a resource called aws_s3_bucket.buckets but because it is a resource with a for_each it is not a single resource but actually a collection of resources. We need to tell import which entry we want to import. The syntax for this is:

terraform import 'aws_s3_bucket.buckets["terrateam-test-bucket"]' terrateam-test-bucket

In the import, the terrateam-test-bucket in aws_s3_bucket.buckets["terrateam-test-bucket"] is the key in our for_each (the each.key value). I chose to make that key the name of our bucket but that is a decision I made, not something Terraform requires.
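
To illustrate that the key and the bucket name are independent, here is a hypothetical variant in which the keys are short labels and the bucket name lives in each entry. The keys primary and secondary and the name attribute are assumptions for this sketch, not part of the tutorial’s code:

```hcl
# Hypothetical variant: for_each keys that differ from the bucket names.
locals {
  buckets = {
    primary   = { name = "terrateam-test-bucket" }
    secondary = { name = "terrateam-test-bucket2" }
  }
}

resource "aws_s3_bucket" "buckets" {
  for_each = local.buckets
  bucket   = each.value.name
}

# The resource address uses the for_each key, while the import ID is
# still the real bucket name:
#
#   terraform import 'aws_s3_bucket.buckets["primary"]' terrateam-test-bucket
```

The rest of this tutorial keeps the bucket name as the key, which keeps the import commands easy to script.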

We need to do an import for each combination of bucket and resource:

terraform import 'aws_s3_bucket.buckets["terrateam-test-bucket"]' terrateam-test-bucket
terraform import 'aws_s3_bucket_public_access_block.buckets["terrateam-test-bucket"]' terrateam-test-bucket
terraform import 'aws_s3_bucket_ownership_controls.buckets["terrateam-test-bucket"]' terrateam-test-bucket
terraform import 'aws_s3_bucket_website_configuration.buckets["terrateam-test-bucket"]' terrateam-test-bucket
terraform import 'aws_s3_bucket.buckets["terrateam-test-bucket2"]' terrateam-test-bucket2
terraform import 'aws_s3_bucket_public_access_block.buckets["terrateam-test-bucket2"]' terrateam-test-bucket2
terraform import 'aws_s3_bucket_ownership_controls.buckets["terrateam-test-bucket2"]' terrateam-test-bucket2
terraform import 'aws_s3_bucket_website_configuration.buckets["terrateam-test-bucket2"]' terrateam-test-bucket2

I made a little shell script to make this easier:

#!/usr/bin/env bash
set -e
set -u
for bucket in terrateam-test-bucket terrateam-test-bucket2; do
  terraform import 'aws_s3_bucket.buckets["'"$bucket"'"]' "$bucket"
  terraform import 'aws_s3_bucket_public_access_block.buckets["'"$bucket"'"]' "$bucket"
  terraform import 'aws_s3_bucket_ownership_controls.buckets["'"$bucket"'"]' "$bucket"
  terraform import 'aws_s3_bucket_website_configuration.buckets["'"$bucket"'"]' "$bucket"
done

Iterate on the Code with a Plan

I know that terrateam-test-bucket2 has a different configuration than terrateam-test-bucket but our Terraform code does not support configuring our buckets differently. We will use the output of terraform plan to guide what features we need to add to our Terraform code.

$ terraform plan
aws_s3_bucket_ownership_controls.buckets["terrateam-test-bucket"]: Refreshing state... [id=terrateam-test-bucket]
aws_s3_bucket_ownership_controls.buckets["terrateam-test-bucket2"]: Refreshing state... [id=terrateam-test-bucket2]
aws_s3_bucket_public_access_block.buckets["terrateam-test-bucket2"]: Refreshing state... [id=terrateam-test-bucket2]
aws_s3_bucket_public_access_block.buckets["terrateam-test-bucket"]: Refreshing state... [id=terrateam-test-bucket]
aws_s3_bucket_website_configuration.buckets["terrateam-test-bucket"]: Refreshing state... [id=terrateam-test-bucket]
aws_s3_bucket_website_configuration.buckets["terrateam-test-bucket2"]: Refreshing state... [id=terrateam-test-bucket2]
aws_s3_bucket.buckets["terrateam-test-bucket2"]: Refreshing state... [id=terrateam-test-bucket2]
aws_s3_bucket.buckets["terrateam-test-bucket"]: Refreshing state... [id=terrateam-test-bucket]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# aws_s3_bucket.buckets["terrateam-test-bucket"] will be updated in-place
~ resource "aws_s3_bucket" "buckets" {
+ force_destroy = false
id = "terrateam-test-bucket"
~ tags = {
- "purpose" = "blog-test" -> null
}
~ tags_all = {
- "purpose" = "blog-test"
} -> (known after apply)
# (10 unchanged attributes hidden)
# (4 unchanged blocks hidden)
}
# aws_s3_bucket_public_access_block.buckets["terrateam-test-bucket2"] will be updated in-place
~ resource "aws_s3_bucket_public_access_block" "buckets" {
~ block_public_acls = true -> false
~ block_public_policy = true -> false
id = "terrateam-test-bucket2"
~ ignore_public_acls = true -> false
~ restrict_public_buckets = true -> false
# (1 unchanged attribute hidden)
}
Plan: 0 to add, 2 to change, 0 to destroy.

With this output we can see that we need to support two configurations:

  • The tags for a bucket.
  • The public_access_block. In our case terrateam-test-bucket2 blocks public access, while the resource defaults every setting to false, which leaves public access unblocked.

To support these configuration options, first we’ll modify our resources to look up configuration in the for_each.

Start with tags in aws_s3_bucket. Modify the resource to look like this:

resource "aws_s3_bucket" "buckets" {
  for_each = local.buckets
  bucket   = each.key
  tags     = lookup(each.value, "tags", null)
}

This says to look up the tags value in the each.value, and if it is not present, use the default value of null.
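
As a small standalone sketch of that behavior, the locals below apply the same lookup call to two made-up entries, one with tags and one without. The names example_buckets and example_tags are hypothetical and exist only for this illustration:

```hcl
locals {
  # Hypothetical sample entries, mirroring the shape of local.buckets.
  example_buckets = {
    "with-tags"    = { tags = { purpose = "blog-test" } }
    "without-tags" = {}
  }

  # lookup returns the "tags" value when the key exists and the
  # default (null) when it does not.
  example_tags = {
    for name, config in local.example_buckets :
    name => lookup(config, "tags", null)
  }
}

output "example_tags" {
  value = local.example_tags
}
```

Passing null for tags is treated the same as leaving the argument out, so buckets without a tags entry get no tags.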

Next, aws_s3_bucket_public_access_block:

resource "aws_s3_bucket_public_access_block" "buckets" {
  for_each                = local.buckets
  bucket                  = each.key
  block_public_acls       = lookup(each.value, "block_public_access", false)
  block_public_policy     = lookup(each.value, "block_public_access", false)
  ignore_public_acls      = lookup(each.value, "block_public_access", false)
  restrict_public_buckets = lookup(each.value, "block_public_access", false)
}

This says to look up the block_public_access value in each.value, and if it is not there use the default value for this resource, which is false. For simplicity, I configure all of these options using one value, block_public_access, but you can separate them out if you want.
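
If you do want to separate them, one possible shape is to give each setting its own key. This is only a sketch: the key names below are my own choice, mirroring the argument names, and it would replace the single-value version above rather than sit alongside it:

```hcl
# Sketch: one optional key per setting instead of a single
# block_public_access switch. The key names are assumptions.
resource "aws_s3_bucket_public_access_block" "buckets" {
  for_each                = local.buckets
  bucket                  = each.key
  block_public_acls       = lookup(each.value, "block_public_acls", false)
  block_public_policy     = lookup(each.value, "block_public_policy", false)
  ignore_public_acls      = lookup(each.value, "ignore_public_acls", false)
  restrict_public_buckets = lookup(each.value, "restrict_public_buckets", false)
}
```

The trade-off is more keys to set per bucket in exchange for finer control. The rest of this tutorial sticks with the single block_public_access value.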

With those updates, we need to modify our buckets local value:

locals {
  buckets = {
    "terrateam-test-bucket": {
      tags = {
        purpose = "blog-test"
      }
    }
    "terrateam-test-bucket2": {
      block_public_access = true
    }
  }
}

With these changes, run terraform plan and we’ll see no differences.

Now, if we decide that terrateam-test-bucket should also block public access, we can update the buckets local value:

locals {
  buckets = {
    "terrateam-test-bucket": {
      tags = {
        purpose = "blog-test"
      }
      block_public_access = true
    }
    "terrateam-test-bucket2": {
      block_public_access = true
    }
  }
}

We can add these configurations for any resource. For example, as an exercise for the reader, make the object_ownership attribute of aws_s3_bucket_ownership_controls configurable.
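
If you want to check your answer to that exercise, here is one possible sketch. It assumes an optional object_ownership key in each bucket’s entry (a name I picked for illustration) and falls back to the ObjectWriter value used earlier:

```hcl
# Sketch: object_ownership read from each bucket's entry, defaulting
# to "ObjectWriter". The key name "object_ownership" is an assumption.
resource "aws_s3_bucket_ownership_controls" "buckets" {
  for_each = local.buckets
  bucket   = each.key

  rule {
    object_ownership = lookup(each.value, "object_ownership", "ObjectWriter")
  }
}
```

A bucket entry could then set, for example, object_ownership = "BucketOwnerEnforced", while entries that omit the key keep the default.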

Adding A Resource For Just One Bucket

It’s possible that we might have one bucket in our list that requires a special resource. Adding this configuration like we did above might not be worth the effort because it is unique. We can still accomplish this without taking that bucket out of the for_each. For example, we want to add an aws_s3_bucket_logging resource for terrateam-test-bucket, and we want it to use the terrateam-test-bucket2 for logs. We can add the following:

resource "aws_s3_bucket_logging" "terrateam-test-bucket" {
  bucket        = aws_s3_bucket.buckets["terrateam-test-bucket"].id
  target_bucket = aws_s3_bucket.buckets["terrateam-test-bucket2"].id
  target_prefix = "logs/"
}

This creates a resource called aws_s3_bucket_logging.terrateam-test-bucket, which adds logging to the terrateam-test-bucket bucket with terrateam-test-bucket2 as the target bucket. We reference the individual for_each instances with the same index syntax we used for terraform import.
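
The same index syntax works anywhere an expression is allowed. As a small sketch, the outputs below (whose names are made up for this example) expose the ARN of one bucket and of every bucket in the for_each:

```hcl
# A single instance, addressed by its for_each key.
output "test_bucket_arn" {
  value = aws_s3_bucket.buckets["terrateam-test-bucket"].arn
}

# All instances: aws_s3_bucket.buckets is a map keyed by the for_each keys.
output "bucket_arns" {
  value = { for name, bucket in aws_s3_bucket.buckets : name => bucket.arn }
}
```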

Gotchas

What Resources to Import?

One major downside to how the S3 resource was refactored in version 4 is that you have to know every type of resource that applies to a bucket in order to import it. If you are creating a new bucket, it’s easy to go from the idea of configuring your bucket as a website to creating an aws_s3_bucket_website_configuration resource. But if you’re importing a bucket, how do you know that you need to import an aws_s3_bucket_website_configuration resource?

There are two options, neither of them good:

  1. Manually compare your Terraform code to the bucket and find everything that differs from the default configuration.
  2. Import your bucket into every S3 resource that is available. This will work, but is not ideal.

The other downside to splitting the S3 attributes into separate resources is that drift becomes much harder to detect and resolve. If someone modifies an attribute of an S3 bucket that no resource has been created for, the change will not be detected during drift detection or planning. This could be quite serious: for example, if a user makes a bucket public and aws_s3_bucket_public_access_block does not exist in your Terraform code, there is no way to detect it with Terraform.

If someone is concerned about this, their best option is to implement option (2): for every bucket, create all of the possible S3 resources. For more discussion on HashiCorp splitting out the S3 resources, see the GitHub issue.

Error: Cannot import non-existent remote object

The AWS S3 API has a bug where some configuration that is set during bucket creation cannot be found by the API afterwards. This is not a bug in Terraform or the AWS provider; it is a bug in AWS. If you make a new bucket by hand, I recommend creating it with default settings first and then changing its configuration. You will then be able to import it into Terraform.

To resolve this, there are two options:

  1. Modify the configuration, modify it back, and then import. This seems to make the API able to find the configuration.
  2. If you know the configuration, just write your Terraform code to match and do not bother importing that resource. Be sure to still import the bucket itself, but you do not need to import, for example, the website configuration. If you take this approach, be sure to run terraform apply and double-check that your bucket still has the configuration you expect after the apply.

Closing Thoughts

While Terraform provides a standard tool for importing a resource, one needs to understand the resources that the provider offers. In version 4.0 of the HashiCorp AWS provider, the S3 resources were split from one resource to several, one for each aspect of a bucket that can be configured.

Be sure to look at the documentation for the S3 resources.