August 29, 2025 · Terrateam

Deploying an AWS EKS Cluster with Terraform and GitHub Actions

What you'll find in this guide

This guide shows you how to provision a production-ready Amazon EKS cluster using Terraform and automate deployments with GitHub Actions. You'll build a complete infrastructure from VPC to node groups, set up secure CI/CD automation, and deploy a working Kubernetes cluster.

Introduction to deploying an AWS EKS Cluster

Spinning up an EKS cluster manually through the AWS console is easy. Building a production-ready cluster that won't break under load is hard. You need proper networking, security groups, IAM roles, node groups with autoscaling, and monitoring. Dozens of interdependent resources must work together perfectly.

Manual provisioning creates inconsistency. Every environment is a unique snowflake with enough subtle differences to cause deployment failures. Your staging cluster works fine, but production mysteriously fails because someone clicked the wrong subnet during setup.

This article discusses ways to reduce the chaos by using infrastructure as code. You'll write Terraform configurations that create everything from VPCs to worker nodes, then automate deployments with GitHub Actions.

⚡ ⚡ ⚡

Provisioning a production-ready EKS cluster using Terraform

To run EKS clusters in production, you need networking, security, worker nodes, and monitoring, all configured correctly. Get any of it wrong and you'll face outages and security holes.

Essential components include:

  • A dedicated VPC with public/private subnets across multiple AZs
  • Security groups restricting traffic to necessary ports
  • IAM roles with least-privilege access
  • Managed node groups with autoscaling
  • Key add-ons like AWS Load Balancer Controller

Terraform's strength is consistency. It also handles complexity better than manual provisioning, especially when managing dependency relationships. EKS clusters depend on VPCs, node groups depend on clusters, and security groups reference each other. Try to build a large infrastructure manually and you'll spend hours figuring out the correct sequence.

The AWS provider includes dedicated EKS resources: aws_eks_cluster, aws_eks_node_group, aws_eks_addon. These understand EKS requirements like proper subnet tagging and load balancer discovery. When someone modifies your cluster through the console, Terraform detects the drift and lets you fix it.
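Drift detection doesn't have to wait for the next deployment. As a rough sketch, a scheduled job can run terraform plan with -detailed-exitcode, which returns 2 when the live infrastructure has diverged from the configuration (the workflow name, cron schedule, and warning handling here are assumptions, and the job still needs an AWS credentials step like the one shown later in this guide):

```yaml
# Hypothetical scheduled drift check; add your AWS credentials step before "Detect drift".
name: Drift Detection
on:
  schedule:
    - cron: "0 6 * * *"  # daily at 06:00 UTC
jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false  # so the plan exit code reaches the shell
      - name: Detect drift
        run: |
          terraform init
          # -detailed-exitcode: 0 = no changes, 1 = error, 2 = drift detected
          terraform plan -detailed-exitcode || {
            code=$?
            [ "$code" -eq 2 ] && echo "::warning::Drift detected" || exit "$code"
          }
```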

⚡ ⚡ ⚡

Integrating the Amazon EKS cluster into a CI pipeline

As mentioned previously, manual EKS deployments create configuration drift and contribute to human errors. Your staging and production clusters diverge over time, causing mysterious application failures.

A CI pipeline treats EKS infrastructure like application code. Changes go through pull requests, get reviewed, and deploy automatically:

Code Change → PR → Plan → Review → Merge → Deploy → Verify

Benefits include predictable deployments, identical environments across stages, and simple rollbacks through Git reverts. Advanced workflows, such as security scanning, add-on validation, and automated testing, are then easy to implement.

GitHub Actions works well here because it integrates with both Terraform and kubectl:

- name: Deploy Infrastructure
  run: terraform apply
- name: Configure Cluster
  run: |
    aws eks update-kubeconfig --name ${{ vars.CLUSTER_NAME }}
    kubectl apply -f manifests/

The pipeline handles everything from VPC creation to running workloads in a single automated workflow. But first, let's see what the Terraform configuration for an EKS cluster looks like.

⚡ ⚡ ⚡

Writing Terraform configurations for an EKS cluster

For a full guide on how to organize your Terraform code, refer to this article. For this guide, you can keep everything in one file.

Your EKS cluster needs a foundation of networking resources before the cluster itself can exist. Start with a VPC that provides isolation and proper subnet layout:

# vpc.tf
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name                                        = "eks-vpc"
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
  }
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 1}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name                                        = "eks-private-${count.index + 1}"
    "kubernetes.io/cluster/${var.cluster_name}" = "owned"
    "kubernetes.io/role/internal-elb"           = "1"
  }
}

resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index + 10}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name                                        = "eks-public-${count.index + 1}"
    "kubernetes.io/cluster/${var.cluster_name}" = "owned"
    "kubernetes.io/role/elb"                    = "1"
  }
}

The subnet tags tell the AWS Load Balancer Controller where to place load balancers.

Create the EKS cluster with proper IAM configuration:

# eks.tf
resource "aws_eks_cluster" "main" {
  name     = var.cluster_name
  role_arn = aws_iam_role.cluster.arn
  version  = "1.29"

  vpc_config {
    subnet_ids              = concat(aws_subnet.private[*].id, aws_subnet.public[*].id)
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["0.0.0.0/0"] # tighten to your CI and office CIDRs in production
  }

  enabled_cluster_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]

  depends_on = [
    aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy,
  ]
}

resource "aws_iam_role" "cluster" {
  name = "${var.cluster_name}-cluster-role"

  assume_role_policy = jsonencode({
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "eks.amazonaws.com"
      }
    }]
    Version = "2012-10-17"
  })
}

resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.cluster.name
}

Add managed node groups for your worker nodes:

# node-groups.tf
resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "main-nodes"
  node_role_arn   = aws_iam_role.node_group.arn
  subnet_ids      = aws_subnet.private[*].id

  instance_types = ["t3.medium"]

  scaling_config {
    desired_size = 2
    max_size     = 4
    min_size     = 1
  }

  update_config {
    max_unavailable = 1
  }

  depends_on = [
    aws_iam_role_policy_attachment.node_group_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.node_group_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.node_group_AmazonEC2ContainerRegistryReadOnly,
  ]
}

resource "aws_iam_role" "node_group" {
  name = "${var.cluster_name}-node-group-role"

  assume_role_policy = jsonencode({
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
    Version = "2012-10-17"
  })
}

# The policy attachments referenced in the node group's depends_on
resource "aws_iam_role_policy_attachment" "node_group_AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.node_group.name
}

resource "aws_iam_role_policy_attachment" "node_group_AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.node_group.name
}

resource "aws_iam_role_policy_attachment" "node_group_AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.node_group.name
}

Using this configuration, you can create a cluster with proper networking, security, and autoscaling. The node groups run in private subnets for security; note that they still need an internet gateway, NAT gateways, and route tables (not shown above) for outbound internet access.
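The gateway resources the private subnets rely on can be sketched as follows. This is a minimal, assumed layout using a single NAT gateway to keep the example short; production setups often run one NAT gateway per AZ for availability:

```hcl
# gateways.tf (hypothetical) - outbound access for the private subnets

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Route public subnet traffic to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  count          = 2
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Single NAT gateway in the first public subnet (use one per AZ for HA)
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
  depends_on    = [aws_internet_gateway.main]
}

# Route private subnet traffic out through the NAT gateway
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}

resource "aws_route_table_association" "private" {
  count          = 2
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
```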

⚡ ⚡ ⚡

Using GitHub Actions to plan and apply changes

GitHub Actions automates your EKS deployments the same way it handles any Terraform infrastructure. The key difference is adding Kubernetes verification steps after the cluster deploys.

The workflow builds on the same OIDC patterns from our CI/CD pipeline guide, but adds EKS-specific verification:

name: EKS Deployment

on:
  pull_request:
    paths: ['**.tf']
  push:
    branches: [main]

permissions:
  id-token: write
  contents: read
  pull-requests: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ github.ref == 'refs/heads/main' && 'production' || 'development' }}
    
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ vars.AWS_ACCOUNT_ID }}:role/GitHubActionsTerraformRole
          aws-region: ${{ vars.AWS_REGION }}

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init

      - name: Terraform Plan
        id: plan
        run: terraform plan -var="cluster_name=${{ vars.CLUSTER_NAME }}"

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform apply -auto-approve -var="cluster_name=${{ vars.CLUSTER_NAME }}"

      - name: Verify Cluster
        if: github.ref == 'refs/heads/main'
        run: |
          aws eks update-kubeconfig --name ${{ vars.CLUSTER_NAME }}
          kubectl get nodes
          kubectl get pods -A

Environment-specific deployments work through GitHub repository variables. Set CLUSTER_NAME to dev-cluster in development and prod-cluster in production. The same workflow code handles both environments with different configurations.

You get:

  • Consistent cluster configuration across all environments
  • Built-in approval gates through GitHub Environments
  • Automatic rollback capability through Git reverts
  • Integration with existing code review processes

The verification step makes sure your cluster is functional before marking the deployment successful. Failed cluster creation gets caught immediately rather than discovered later during application deployment.

For multiple environments, you can duplicate the workflow with different branch triggers.
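One way to sketch that is to copy the job and change only the trigger and environment; the branch and environment names below are assumptions:

```yaml
# Hypothetical staging variant of the same workflow
name: EKS Deployment (staging)
on:
  push:
    branches: [staging]
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: staging  # pulls staging-scoped values of CLUSTER_NAME, AWS_REGION, etc.
    # ...same steps as the main workflow
```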

⚡ ⚡ ⚡

Handling secrets and verification steps

OIDC handles AWS authentication, but your applications running in EKS need their own secrets, like database passwords, API keys, and third-party service credentials. Never store these in your Terraform code or GitHub repository.

Instead, use AWS Secrets Manager and integrate it with Kubernetes:

- name: Create Application Secrets
  run: |
    aws secretsmanager create-secret \
      --name "eks-app-secrets" \
      --secret-string '{"db_password":"${{ secrets.DB_PASSWORD }}"}'
    
    # Install AWS Secrets Store CSI Driver
    kubectl apply -f https://raw.githubusercontent.com/aws/secrets-store-csi-driver-provider-aws/main/deployment/aws-provider-installer.yaml

Never commit kubeconfig files to version control:

- name: Configure kubectl
  run: |
    aws eks update-kubeconfig --name ${{ vars.CLUSTER_NAME }} --region ${{ vars.AWS_REGION }}
    kubectl config current-context

Here, the aws eks update-kubeconfig command writes a kubeconfig that authenticates through short-lived AWS credentials rather than embedding long-lived tokens. Access expires with your OIDC session, which maintains security.

Next, you can add a few verification steps to confirm your cluster works before declaring success:

- name: Cluster Health Check
  run: |
    # Verify nodes are ready
    kubectl wait --for=condition=Ready nodes --all --timeout=300s
    
    # Check system pods
    kubectl get pods -n kube-system
    
    # Verify DNS resolution
    kubectl run test-pod --image=busybox --rm -it --restart=Never -- nslookup kubernetes.default
    
    # Test load balancer controller
    kubectl get deployment -n kube-system aws-load-balancer-controller

You can also add connectivity verification to make sure your cluster can reach external services:

# Test internet connectivity from nodes
kubectl run connectivity-test --image=curlimages/curl --rm -it --restart=Never -- curl -Is https://www.google.com

# Verify ECR access for image pulls (EKS nodes authenticate via their node IAM role)
aws ecr get-login-password --region ${{ vars.AWS_REGION }} > /dev/null && echo "ECR auth OK"

These checks catch common issues like misconfigured security groups, broken DNS, or missing IAM permissions before your applications try to deploy.

⚡ ⚡ ⚡

Tips to deploy and manage your cluster

kubeconfig in CI/CD requires different handling than local development. Generate cluster access at runtime from your AWS credentials instead of committing config files or storing kubeconfigs in secrets:

- name: Setup Cluster Access
  run: |
    aws eks update-kubeconfig --name ${{ vars.CLUSTER_NAME }} --kubeconfig "$RUNNER_TEMP/kubeconfig"
    # Persist KUBECONFIG for later steps in the job
    echo "KUBECONFIG=$RUNNER_TEMP/kubeconfig" >> "$GITHUB_ENV"

Also, bootstrap essential add-ons through Terraform, not manual kubectl commands. Your cluster needs the AWS Load Balancer Controller and EBS CSI driver to function properly:

resource "aws_eks_addon" "ebs_csi" {
  cluster_name = aws_eks_cluster.main.name
  addon_name   = "aws-ebs-csi-driver"
}

resource "helm_release" "argocd" {
  name       = "argocd"
  repository = "https://argoproj.github.io/argo-helm"
  chart      = "argo-cd"
  namespace  = "argocd"

  create_namespace = true
}

Your Terraform creates the cluster and core add-ons. ArgoCD or similar tools handle application deployments. This separation prevents application changes from triggering expensive infrastructure plans.

Tools that talk to the cluster need its endpoint, certificate data, and an auth token, all of which change between deployments. You can use Terraform data sources to pull this dynamic cluster information:

data "aws_eks_cluster_auth" "main" {
  name = aws_eks_cluster.main.name
}
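For instance, the helm_release shown earlier only works if the helm provider can reach the cluster. One way to wire it up with this data source, using helm provider 2.x syntax (this wiring is an assumption, not part of the original configuration):

```hcl
# Hypothetical provider wiring for the new cluster
provider "kubernetes" {
  host                   = aws_eks_cluster.main.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.main.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.main.token
}

provider "helm" {
  kubernetes {
    host                   = aws_eks_cluster.main.endpoint
    cluster_ca_certificate = base64decode(aws_eks_cluster.main.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.main.token
  }
}
```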

Finally, test cluster functionality immediately after creation. Deploy a simple nginx pod and expose it through a LoadBalancer service. If simple workloads can't run, fix the cluster before deploying complex applications.
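A smoke test along those lines might look like this; the resource names are placeholders, and kubectl must already point at the new cluster:

```shell
# Hypothetical post-creation smoke test
kubectl create deployment smoke-test --image=nginx
kubectl expose deployment smoke-test --type=LoadBalancer --port=80
kubectl rollout status deployment/smoke-test --timeout=120s

# Wait for AWS to provision the load balancer, then check a hostname appears
kubectl get service smoke-test -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

# Clean up
kubectl delete service smoke-test
kubectl delete deployment smoke-test
```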

The goal is a cluster that works immediately without manual configuration steps. Everything your applications need should be automated through Terraform or ArgoCD manifests.

⚡ ⚡ ⚡

AWS EKS cluster version upgrade best practices

EKS upgrades are high-risk operations that can break your entire workload. Kubernetes changes APIs between versions, deprecates features, and introduces new security policies that might reject your existing pods.

To plan upgrades, check the Kubernetes changelog for breaking changes affecting your workloads. Also, test application compatibility in development clusters running the target version before upgrading production.

Don't skip over versions. For example, go from 1.27 to 1.28, then 1.28 to 1.29. Never jump directly from 1.27 to 1.29. The upgrade sequence matters too. Upgrade the cluster first, then managed add-ons, and finally the node groups:

resource "aws_eks_cluster" "main" {
  version = "1.29"  # Update this first
}

resource "aws_eks_addon" "vpc_cni" {
  addon_version = "v1.15.0-eksbuild.2"  # Update after cluster
}

resource "aws_eks_node_group" "main" {
  version = "1.29"  # Update last
}
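The no-skip rule is easy to enforce in CI before running a plan. A minimal sketch, with hard-coded versions standing in for values you would read from Terraform variables:

```shell
# Hypothetical guard: refuse EKS upgrades that skip a minor version.
check_upgrade_path() {
  current_minor=${1#1.}  # "1.27" -> "27"
  target_minor=${2#1.}
  if [ $((target_minor - current_minor)) -gt 1 ]; then
    echo "refusing to skip minors: upgrade to 1.$((current_minor + 1)) first"
    return 1
  fi
  echo "upgrade path ok"
}

check_upgrade_path "1.27" "1.28"           # prints: upgrade path ok
check_upgrade_path "1.27" "1.29" || true   # prints: refusing to skip minors: upgrade to 1.28 first
```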

Instead of clicking through the console and hoping you remember all the steps, drive upgrades through code:

- name: Upgrade Cluster
  run: |
    terraform plan -var="cluster_version=1.29"
    terraform apply -auto-approve

Test upgrades in lower environments first. Your pipeline can automatically upgrade your development environment when you merge version changes, but you should require manual approval for staging and production deployments.

The key is treating upgrades like any other infrastructure change: planned, tested, and deployed through your existing automation rather than emergency console operations.

⚡ ⚡ ⚡

Conclusion

Your Terraform configurations create consistent clusters with proper networking, security, and autoscaling. GitHub Actions automates the entire process from code changes to running workloads. When you need a new environment, you modify variables and let the pipeline handle deployment.

With such a setup, your clusters deploy predictably across development, staging, and production. Version upgrades happen through pull requests rather than risky console operations. Applications get consistent infrastructure regardless of who deploys them or when.

Terrateam takes this foundation further by adding enterprise-grade features. Instead of maintaining custom workflows for dependency management between networking and application layers, Terrateam automatically orchestrates complex deployments. Policy enforcement, drift detection, and compliance reporting work out of the box.

For EKS specifically, Terrateam understands Kubernetes deployment patterns and integrates with tools like ArgoCD and Helm. Your infrastructure and application deployments coordinate automatically without custom scripting.

Start with the GitHub Actions approach to learn the fundamentals. When your team needs advanced orchestration and enterprise controls, Terrateam provides the next level of automation without throwing away your existing Terraform code.