Environment as Code: Testing Your Terraform and Pulumi Infrastructure

Infrastructure code breaks. An S3 bucket gets configured with public ACLs. A security group rule opens port 22 to the world. A load balancer misconfiguration sends traffic to the wrong target group. These aren't hypothetical — they are the kinds of bugs that reach production because infrastructure code review is, for most teams, just eyeballing HCL.

Testing infrastructure code is not optional. It's the same engineering discipline applied to the layer one level below your application code. The tools exist. The patterns are well established. Here's how to apply them.

Why Test Infrastructure Code

Application code has tests because bugs in application code cause failures. Infrastructure code deserves the same treatment for the same reason, amplified: an infrastructure bug can cause a security breach, a complete service outage, or data loss. The consequences are higher, not lower.

The specific problems that infrastructure tests catch:

  • Misconfigured resources. Wrong AMI, wrong instance type, public bucket in a module that should only create private buckets.
  • Missing required outputs. A module that no longer exports database_url after a refactor silently breaks everything that depends on it.
  • Policy violations. Encryption disabled, logging disabled, no tags for cost tracking.
  • Dependency breakage. Changing a module interface breaks the caller — tests catch this before terraform apply.
  • Drift detection. Running tests against deployed infrastructure reveals whether it matches the declared state.
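The drift check in the last bullet reduces to a comparison between declared and observed attributes. A minimal sketch in Python — the `declared` and `actual` dicts are assumed to have been fetched elsewhere (from the state file and the cloud API respectively), and the function name is hypothetical:

```python
# Hypothetical drift check: compare attributes declared in Terraform state
# against values read back from the cloud API (both dicts fetched elsewhere).
def find_drift(declared: dict, actual: dict) -> list:
    """List every attribute whose deployed value differs from the declared one."""
    return [
        f"{key}: declared={declared[key]!r}, actual={actual.get(key)!r}"
        for key in declared
        if declared[key] != actual.get(key)
    ]

if __name__ == "__main__":
    declared = {"instance_type": "t3.micro", "storage_encrypted": True}
    actual = {"instance_type": "t3.large", "storage_encrypted": True}
    for line in find_drift(declared, actual):
        print(line)  # flags the instance_type mismatch
```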

Terraform: Static Analysis Before Anything Else

The cheapest tests run without touching AWS or any cloud provider. Start here.

fmt and validate

These two commands run in milliseconds and catch formatting drift and syntax errors:

terraform fmt -check -recursive
terraform validate

Add them to a pre-commit hook so developers never push malformed Terraform:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.92.0
    hooks:
      - id: terraform_fmt
      - id: terraform_validate
      - id: terraform_tflint
      - id: terraform_docs

tflint for Policy Rules

tflint applies rules beyond syntax validation — it checks for deprecated resources, invalid resource configurations, and custom organization policies:

# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.33.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "aws_instance_invalid_type" {
  enabled = true
}

rule "aws_s3_bucket_name" {
  enabled = true
  regex   = "^[a-z0-9-]+$"
}

Run it in CI:

tflint --recursive --config=.tflint.hcl

Asserting on terraform plan Output

terraform plan tells you exactly what would change before anything changes. You can parse the plan output and assert on it — this is particularly useful for module tests where you want to verify that a given input produces the expected resource configuration.

Use terraform show -json on a saved plan file:

terraform plan -out=tfplan
terraform show -json tfplan > plan.json

Then parse it in your test script:

#!/usr/bin/env python3
# test_plan.py
import json
import sys

with open('plan.json') as f:
    plan = json.load(f)

changes = plan['resource_changes']
errors = []

for change in changes:
    resource_type = change['type']
    config = change.get('change', {}).get('after', {}) or {}

    # Assert S3 buckets are never public
    if resource_type == 'aws_s3_bucket':
        acl = config.get('acl', 'private')
        if acl != 'private':
            errors.append(f"S3 bucket {change['address']} has ACL '{acl}' — must be 'private'")

    # Assert RDS instances have encryption enabled
    if resource_type == 'aws_db_instance':
        if not config.get('storage_encrypted', False):
            errors.append(f"RDS instance {change['address']} has storage_encrypted=false")

    # Assert EC2 instances have required tags
    if resource_type == 'aws_instance':
        tags = config.get('tags', {})
        for required_tag in ['Environment', 'Team', 'CostCenter']:
            if required_tag not in tags:
                errors.append(f"EC2 {change['address']} missing required tag '{required_tag}'")

if errors:
    print("PLAN ASSERTION FAILURES:")
    for error in errors:
        print(f"  - {error}")
    sys.exit(1)

print(f"Plan assertions passed. {len(changes)} resources checked.")
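The same pattern extends to the security-group failure mode from the introduction. A hedged sketch of one more check you might append to the script above — the ingress-rule shape used here (`from_port`/`to_port`/`cidr_blocks`) is an assumption about how the plan JSON represents inline `aws_security_group` ingress blocks:

```python
# Hypothetical extra plan assertion: flag security group rules that open
# SSH to the world. The ingress shape is assumed to match the plan JSON
# for inline aws_security_group ingress blocks.
def open_ssh_violations(resource_changes: list) -> list:
    errors = []
    for change in resource_changes:
        if change.get("type") != "aws_security_group":
            continue
        config = change.get("change", {}).get("after", {}) or {}
        for rule in config.get("ingress", []):
            in_range = rule.get("from_port", 0) <= 22 <= rule.get("to_port", 0)
            if in_range and "0.0.0.0/0" in rule.get("cidr_blocks", []):
                errors.append(f"{change['address']} exposes port 22 to 0.0.0.0/0")
    return errors

if __name__ == "__main__":
    sample = [{
        "address": "aws_security_group.ssh",
        "type": "aws_security_group",
        "change": {"after": {"ingress": [
            {"from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]},
        ]}},
    }]
    print(open_ssh_violations(sample))
```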

Terratest: Integration Testing Terraform Modules

Terratest is a Go library that deploys real infrastructure and runs assertions against it. It's the closest thing to integration tests for Terraform modules.

Install it as a Go module in a test/ directory:

mkdir test && cd test
go mod init github.com/myorg/myinfra/test
go get github.com/gruntwork-io/terratest/modules/terraform
go get github.com/gruntwork-io/terratest/modules/aws

A Terratest suite for a VPC module:

// test/vpc_test.go
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/require"
)

func TestVpcModule(t *testing.T) {
    t.Parallel()

    opts := &terraform.Options{
        TerraformDir: "../modules/vpc",
        Vars: map[string]interface{}{
            "environment":     "test",
            "cidr_block":      "10.0.0.0/16",
            "private_subnets": []string{"10.0.1.0/24", "10.0.2.0/24"},
            "public_subnets":  []string{"10.0.101.0/24", "10.0.102.0/24"},
        },
    }

    // Destroy on test completion
    defer terraform.Destroy(t, opts)

    // Deploy the module
    terraform.InitAndApply(t, opts)

    // Read outputs
    vpcId := terraform.Output(t, opts, "vpc_id")
    privateSubnetIds := terraform.OutputList(t, opts, "private_subnet_ids")
    publicSubnetIds := terraform.OutputList(t, opts, "public_subnet_ids")

    require.NotEmpty(t, vpcId)
    assert.Len(t, privateSubnetIds, 2)
    assert.Len(t, publicSubnetIds, 2)

    // Assert against real AWS resources
    region := "us-east-1"
    vpc := aws.GetVpcById(t, vpcId, region)
    assert.Equal(t, "10.0.0.0/16", vpc.CidrBlock)

    // Assert private subnets have no public IP auto-assignment
    for _, subnetId := range privateSubnetIds {
        subnet := aws.GetSubnetById(t, subnetId, region)
        assert.False(t, subnet.MapPublicIpOnLaunch,
            "Private subnet %s should not auto-assign public IPs", subnetId)
    }
}

Terratest tests deploy real infrastructure, which means they cost real money and take real time (minutes to tens of minutes). Gate them behind a separate CI stage that runs less frequently than unit tests:

# .github/workflows/terraform-integration.yml
on:
  push:
    branches: [main]
    paths: ['modules/**', 'test/**']

jobs:
  terratest:
    runs-on: ubuntu-latest
    permissions:
      id-token: write  # For OIDC AWS auth
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/GitHubActionsTest
          aws-region: us-east-1
      - uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Run Terratest
        working-directory: test
        run: go test -v -timeout 30m ./...
        env:
          TF_VAR_environment: test

Pulumi Testing SDK

Pulumi programs are real code (TypeScript, Python, Go), so they get real unit tests. The Pulumi testing SDK runs your infrastructure code without making any cloud API calls — it mocks the provider and lets you assert on what resources would be created.

// infra/__tests__/vpc.test.ts
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Mock the Pulumi runtime
pulumi.runtime.setMocks({
  newResource: (args: pulumi.runtime.MockResourceArgs) => {
    return {
      id: `${args.name}_id`,
      state: {
        ...args.inputs,
        // Simulate AWS filling in computed values
        arn: `arn:aws:ec2:us-east-1:123456789:vpc/${args.name}`,
      },
    };
  },
  call: (args: pulumi.runtime.MockCallArgs) => {
    return args.inputs;
  },
});

describe("VPC module", () => {
  let vpc: aws.ec2.Vpc;
  let privateSubnets: aws.ec2.Subnet[];

  beforeAll(async () => {
    const module = await import("../vpc");
    vpc = module.vpc;
    privateSubnets = module.privateSubnets;
  });

  it("creates VPC with correct CIDR", async () => {
    const cidr = await vpc.cidrBlock;
    expect(cidr).toBe("10.0.0.0/16");
  });

  it("enables DNS hostnames", async () => {
    // Our VPCs should have DNS hostnames enabled
    const enabled = await vpc.enableDnsHostnames;
    expect(enabled).toBe(true);
  });

  it("creates private subnets without public IP auto-assignment", async () => {
    for (const subnet of privateSubnets) {
      const mapPublicIp = await subnet.mapPublicIpOnLaunch;
      expect(mapPublicIp).toBe(false);
    }
  });

  it("tags all resources with required tags", async () => {
    const tags = await vpc.tags;
    expect(tags).toMatchObject({
      Environment: expect.any(String),
      ManagedBy: "pulumi",
    });
  });
});

Run these with your normal test runner:

npx vitest run infra/__tests__/

No AWS credentials needed. No deployed infrastructure. Fast feedback.

LocalStack for AWS IaC Testing

LocalStack runs AWS services locally in a Docker container. It's the middle ground between pure static analysis and real cloud deployment — you get real AWS API behavior without incurring cloud costs or requiring credentials.

Start LocalStack in your CI environment:

services:
  localstack:
    image: localstack/localstack:3
    ports:
      - "4566:4566"
    environment:
      SERVICES: s3,sqs,dynamodb,iam,lambda,secretsmanager
      DEBUG: 0
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"

Configure Terraform to use LocalStack:

# providers.tf (test override)
provider "aws" {
  region                      = "us-east-1"
  access_key                  = "test"
  secret_key                  = "test"
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  endpoints {
    s3             = "http://localhost:4566"
    sqs            = "http://localhost:4566"
    dynamodb       = "http://localhost:4566"
    iam            = "http://localhost:4566"
    lambda         = "http://localhost:4566"
    secretsmanager = "http://localhost:4566"
  }
}

With Terratest pointing at LocalStack:

opts := &terraform.Options{
    TerraformDir: "../modules/storage",
    EnvVars: map[string]string{
        "AWS_ACCESS_KEY_ID":     "test",
        "AWS_SECRET_ACCESS_KEY": "test",
        "AWS_DEFAULT_REGION":    "us-east-1",
    },
    Vars: map[string]interface{}{
        "localstack_endpoint": "http://localhost:4566",
    },
}

LocalStack tests run in seconds rather than minutes and don't require AWS credentials — making them suitable for running on every PR.
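One practical wrinkle: CI jobs can race the container's startup and fail on the first API call. A small readiness poll avoids that flakiness — this sketch assumes LocalStack's health endpoint at `/_localstack/health` and its `"available"`/`"running"` status strings, which current LocalStack versions expose:

```python
# Sketch: block until LocalStack reports the needed services as up before
# running tests. The /_localstack/health endpoint and its status strings
# are assumptions based on LocalStack's health API.
import json
import time
import urllib.request

def services_ready(health: dict, needed: list) -> bool:
    """True when every needed service reports an up status."""
    statuses = health.get("services", {})
    return all(statuses.get(s) in ("available", "running") for s in needed)

def wait_for_localstack(needed, endpoint="http://localhost:4566", timeout=60):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{endpoint}/_localstack/health") as resp:
                if services_ready(json.load(resp), needed):
                    return True
        except OSError:
            pass  # container not accepting connections yet
        time.sleep(2)
    return False
```

Call `wait_for_localstack(["s3", "sqs"])` at the top of the test job, before `go test` or `terraform apply` runs.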

The Complete CI Pipeline

Combine all layers into a pipeline that progresses from fast/cheap to slow/thorough:

name: Infrastructure CI

on:
  pull_request:
    paths: ['terraform/**', 'pulumi/**', 'modules/**']

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Terraform fmt check
        run: terraform fmt -check -recursive terraform/
      - name: Terraform validate
        run: |
          cd terraform/
          terraform init -backend=false
          terraform validate
      - name: tflint
        uses: terraform-linters/setup-tflint@v4
      - run: tflint --recursive

  unit:
    runs-on: ubuntu-latest
    needs: lint
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Pulumi unit tests
        run: npx vitest run pulumi/__tests__/
      - name: Plan assertions
        run: |
          cd terraform/
          terraform init -backend=false
          terraform plan -var-file=test.tfvars -out=tfplan
          terraform show -json tfplan > plan.json
          python3 ../scripts/assert-plan.py plan.json

  localstack:
    runs-on: ubuntu-latest
    needs: unit
    services:
      localstack:
        image: localstack/localstack:3
        ports: ["4566:4566"]
        env:
          SERVICES: s3,sqs,dynamodb,iam,lambda
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Run LocalStack tests
        working-directory: test
        run: go test -v -run TestLocalStack ./...
        env:
          LOCALSTACK_ENDPOINT: http://localhost:4566

  integration:
    runs-on: ubuntu-latest
    needs: localstack
    if: github.event.pull_request.base.ref == 'main'
    permissions:
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.TEST_ROLE_ARN }}
          aws-region: us-east-1
      - uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Run Terratest (real AWS)
        working-directory: test
        run: go test -v -timeout 30m -run TestIntegration ./...

The integration stage only runs on PRs targeting main, not on every feature branch commit. The first three stages — lint, unit, and LocalStack — run on every pull-request update and complete in under two minutes. Integration tests run on the path to production, where the cost of a full cloud deployment is justified.

The result is an infrastructure codebase that can be refactored with confidence, reviewed with data instead of intuition, and deployed knowing that the tests caught the issues that code review would have missed.
