Advanced Atlantis: Plan Policies, Custom Workflows, and Integration Testing
The basics of Atlantis—comment atlantis plan on a PR, review the plan, apply—are well documented. What's less covered is how to enforce policy at plan time, build reusable workflow components, and test your Atlantis configuration before it breaks a production PR. This post goes deep on those topics.
Policy-as-Code with conftest
Atlantis supports policy checking as a first-class feature. After terraform plan, Atlantis runs conftest against the plan JSON and blocks apply if policies fail.
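The gate itself comes down to conftest's exit status: a non-zero exit means at least one deny rule matched, and apply stays blocked. Here is a minimal sketch of that decision as a shell function. The function name policy_gate and the CONFTEST_BIN override are hypothetical, introduced only so the logic can be run against a stub; neither is part of Atlantis or conftest.

```shell
#!/usr/bin/env bash
# policy_gate PLAN_JSON POLICY_DIR...
# Sketch of the plan-time gate: run conftest against a plan JSON and turn
# its exit status into an allow/block decision.
# CONFTEST_BIN is a hypothetical override (not an Atlantis or conftest
# setting) that lets the function be exercised with a stub command.
policy_gate() {
  local plan_json="$1"
  shift
  local conftest_bin="${CONFTEST_BIN:-conftest}"
  local dir

  # Rewrite the remaining arguments as repeated "--policy DIR" pairs.
  # The loop list is expanded once, so appending and shifting is safe.
  for dir in "$@"; do
    set -- "$@" --policy "$dir"
    shift
  done

  # conftest exits non-zero when any deny rule matches.
  if "$conftest_bin" test "$plan_json" "$@"; then
    echo "policies passed: apply may proceed"
  else
    echo "policies failed: apply is blocked" >&2
    return 1
  fi
}
```

Atlantis performs this step on its own once policy checks are enabled; the function only mirrors the decision so it can be reasoned about locally, for example with policy_gate plan.json policies/security policies/tagging.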
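Policy sets are declared in the server-side repo config (the file passed to atlantis server with --repo-config), not in a repository's own atlantis.yaml, and the server must be started with --enable-policy-checks:

# Server-side repo config (passed to atlantis server with --repo-config)
policies:
  conftest_version: v0.45.0
  policy_sets:
    - name: security-policies
      path: policies/security
      source: local
    - name: tagging-policies
      path: policies/tagging
      source: local

The plan JSON that conftest receives looks like: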
{
  "format_version": "1.1",
  "resource_changes": [
    {
      "address": "aws_s3_bucket.data",
      "type": "aws_s3_bucket",
      "change": {
        "actions": ["create"],
        "after": {
          "bucket": "my-data-bucket",
          "force_destroy": false
        }
      }
    }
  ]
}

Write Rego policies against this structure:
# policies/security/s3.rego
package main

import future.keywords.every

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_s3_bucket"
  resource.change.actions[_] == "create"
  resource.change.after.force_destroy == true
  msg := sprintf(
    "S3 bucket '%s' has force_destroy=true. This is not allowed in production.",
    [resource.address]
  )
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_s3_bucket"
  resource.change.actions[_] == "create"
  not resource.change.after.tags.team
  msg := sprintf(
    "S3 bucket '%s' is missing required 'team' tag.",
    [resource.address]
  )
}

# policies/security/iam.rego
package main

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_iam_policy"
  resource.change.actions[_] == "create"
  doc := json.unmarshal(resource.change.after.policy)
  statement := doc.Statement[_]
  statement.Effect == "Allow"
  statement.Action == "*"
  statement.Resource == "*"
  msg := sprintf(
    "IAM policy '%s' grants full admin access (Action:* Resource:*). Use least-privilege instead.",
    [resource.address]
  )
}

Test the policies without Atlantis involved:
# Generate a plan JSON locally
terraform init
terraform plan -out=tfplan
terraform show -json tfplan > plan.json

# Run conftest directly
conftest test plan.json \
  --policy policies/security \
  --policy policies/tagging \
  --namespace main

Write unit tests for your Rego policies:
# policies/security/s3_test.rego
package main

test_deny_force_destroy {
  deny[_] with input as {
    "resource_changes": [{
      "address": "aws_s3_bucket.test",
      "type": "aws_s3_bucket",
      "change": {
        "actions": ["create"],
        "after": {
          "bucket": "test",
          "force_destroy": true,
          "tags": {"team": "platform"}
        }
      }
    }]
  }
}

test_allow_no_force_destroy {
  count(deny) == 0 with input as {
    "resource_changes": [{
      "address": "aws_s3_bucket.test",
      "type": "aws_s3_bucket",
      "change": {
        "actions": ["create"],
        "after": {
          "bucket": "test",
          "force_destroy": false,
          "tags": {"team": "platform"}
        }
      }
    }]
  }
}

conftest verify --policy policies/security

Custom Pre/Post-Plan Hooks
Hooks run shell commands at specific points in the Atlantis workflow. They're the escape hatch for anything the built-in workflow doesn't handle.
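Because a hook is just a shell command, its logic can live in a small function and be exercised outside Atlantis. Here is a minimal sketch of an unpinned-module check, assuming each module source sits on a single source = "..." line in a .tf file. The function name find_unpinned_sources is hypothetical and not part of Atlantis.

```shell
#!/usr/bin/env bash
# find_unpinned_sources DIR
# Prints every GitHub module source line under DIR that has no ?ref= pin.
# Exit status is 0 when at least one unpinned source is printed and
# non-zero when there is nothing to report.
find_unpinned_sources() {
  local dir="${1:-.}"

  # Match only real "source = ..." assignments, so commented-out lines
  # that begin with '#' are never reported.
  grep -rEn --include='*.tf' \
    '^[[:space:]]*source[[:space:]]*=.*github\.com' "$dir" \
    | grep -v '?ref='
}
```

In a run step this could be used as: if find_unpinned_sources .; then echo "ERROR: unpinned module source"; exit 1; fi. Anchoring the pattern at the start of the line is a deliberate choice: it skips comments without discarding every line that happens to contain a '#'.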
Pre-plan: Enforce Module Versions
# atlantis.yaml
workflows:
  production:
    plan:
      steps:
        - env:
            name: ATLANTIS_TERRAFORM_VERSION
            command: 'cat .terraform-version 2>/dev/null || echo "1.8.0"'
        - run: |
            # Ensure no module sources use latest without a version pin
            if grep -r 'source.*github.com' . | grep -v '?ref=' | grep -v '#'; then
              echo "ERROR: Found unpinned GitHub module source. All modules must specify ?ref=<tag>"
              exit 1
            fi
        - init
        - plan
  staging:
    plan:
      steps:
        - init
        - plan

Post-plan: Cost Estimation
Run Infracost after planning and post the estimate as an Atlantis comment:
workflows:
  default:
    plan:
      steps:
        - init
        - plan:
            extra_args: ["-out", "$PLANFILE"]
        - run: |
            infracost breakdown \
              --path $PLANFILE \
              --format json \
              --out-file /tmp/infracost.json
            COST=$(jq -r '.totalMonthlyCost' /tmp/infracost.json)
            DIFF=$(jq -r '.diffTotalMonthlyCost' /tmp/infracost.json)
            echo "## Cost Estimate" >> $PLANFILE.txt
            echo "Total monthly: \$${COST}" >> $PLANFILE.txt
            echo "Delta: \$${DIFF}" >> $PLANFILE.txt

Post-apply: Slack Notification
workflows:
  production:
    apply:
      steps:
        - apply
        - run: |
            # BASE_REPO_NAME, DIR, USER_NAME and PULL_NUM are set by Atlantis for run steps
            curl -s -X POST "$SLACK_WEBHOOK_URL" \
              -H 'Content-Type: application/json' \
              -d "{
                \"text\": \"Terraform apply completed in *$BASE_REPO_NAME/$DIR* by $USER_NAME\",
                \"attachments\": [{
                  \"color\": \"good\",
                  \"text\": \"PR #$PULL_NUM\"
                }]
              }"

Testing Atlantis Workflows Locally
The atlantis testdrive command spins up a local Atlantis instance with your config, but it's limited. For full workflow testing, use Docker:
# docker-compose.yml for local Atlantis testing
version: '3.8'
services:
  atlantis:
    image: ghcr.io/runatlantis/atlantis:v0.28.0
    ports:
      - "4141:4141"
    environment:
      ATLANTIS_GH_USER: your-bot-user
      ATLANTIS_GH_TOKEN: ${GH_TOKEN}
      ATLANTIS_GH_WEBHOOK_SECRET: test-secret
      ATLANTIS_REPO_ALLOWLIST: github.com/your-org/*
      ATLANTIS_ATLANTIS_URL: http://localhost:4141
    volumes:
      - ./atlantis.yaml:/atlantis.yaml
      - ./policies:/policies
    # --repo-config loads repos, workflows, and policies; --config is for server flags
    command: server --repo-config /atlantis.yaml

For workflow validation without a GitHub webhook, use the atlantis CLI to validate config:
atlantis validate --atlantis-yaml atlantis.yaml

Write an integration test script that exercises the full plan+policy cycle:
#!/bin/bash
# scripts/test-atlantis-workflow.sh
set -euo pipefail

TERRAFORM_DIR=${1:?Usage: $0 <terraform-dir>}
echo "=== Testing Atlantis workflow for: $TERRAFORM_DIR ==="

# Step 1: Init and plan
cd "$TERRAFORM_DIR"
terraform init -backend=false
terraform plan -out=tfplan.binary \
  -var-file=test.tfvars 2>&1

# Step 2: Generate plan JSON
terraform show -json tfplan.binary > plan.json

# Step 3: Run conftest
echo "--- Running policy checks ---"
conftest test plan.json \
  --policy ../../policies/security \
  --policy ../../policies/tagging \
  --namespace main \
  --output tap

# Step 4: Run tflint
echo "--- Running tflint ---"
tflint --format compact

echo "=== Workflow test passed for: $TERRAFORM_DIR ==="

Integrate into CI to test the testing infrastructure:
# .github/workflows/test-atlantis-config.yml
name: Test Atlantis Config
on:
  pull_request:
    paths:
      - 'atlantis.yaml'
      - 'policies/**'
jobs:
  validate-policies:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate atlantis.yaml
        run: |
          docker run --rm \
            -v $(pwd):/workspace \
            ghcr.io/runatlantis/atlantis:latest \
            validate --atlantis-yaml /workspace/atlantis.yaml
      - name: Test conftest policies
        run: |
          conftest verify --policy policies/security
          conftest verify --policy policies/tagging

Team Patterns for Large Orgs
Per-Team Workflows
Large organizations often need different workflows per team—some require approval from a security group, others are autonomous.
# Server-side repo config (passed to atlantis server with --repo-config)
repos:
  - id: github.com/example/platform
    allowed_overrides: [workflow, apply_requirements]
    allowed_workflows: [production, staging, sandbox]
    apply_requirements: [approved, mergeable]
  - id: github.com/example/sandbox
    allowed_overrides: [workflow, apply_requirements]
    allowed_workflows: [sandbox]
    apply_requirements: []
workflows:
  production:
    plan:
      steps:
        - run: ./scripts/pre-plan-checks.sh
        - init:
            extra_args: ["-backend-config=prod.hcl"]
        - plan
    apply:
      steps:
        - run: ./scripts/pre-apply-checks.sh
        - apply
  sandbox:
    plan:
      steps:
        - init
        - plan
    apply:
      steps:
        - apply

Directory-Level Apply Requirements
Require security team approval only for Terraform that touches IAM:
repos:
  - id: github.com/example/infra
    projects:
      - name: iam-prod
        dir: terraform/iam
        workspace: production
        apply_requirements: [approved, mergeable]
        required_approvals: 2
      - name: networking-prod
        dir: terraform/networking
        workspace: production
        apply_requirements: [approved, mergeable]

Preventing Concurrent Applies
Atlantis has a built-in queue, but for cross-repo dependencies, use workspace locking with an external backend:
# pre-plan hook: acquire lock (BASE_REPO_NAME, DIR and PULL_NUM are set by Atlantis)
run: |
  aws dynamodb put-item \
    --table-name atlantis-locks \
    --item "{\"LockId\": {\"S\": \"$BASE_REPO_NAME-$DIR\"}, \"Owner\": {\"S\": \"$PULL_NUM\"}, \"TTL\": {\"N\": \"$(date -d '+1 hour' +%s)\"}}" \
    --condition-expression "attribute_not_exists(LockId)"

# post-apply hook: release lock
run: |
  aws dynamodb delete-item \
    --table-name atlantis-locks \
    --key "{\"LockId\": {\"S\": \"$BASE_REPO_NAME-$DIR\"}}"

The Atlantis policy system (conftest integration, pre/post hooks, and per-repo workflow configuration) gives you control that goes well beyond basic PR automation. The key is testing the testing infrastructure itself: Rego unit tests for policies, local workflow scripts in CI, and atlantis validate to catch config errors before they break a production apply.