Infrastructure-as-code (Terraform, Bicep, Pulumi, CloudFormation, OpenTofu) belongs in CI just as much as application code. Running it in pipelines makes infra changes reviewable, repeatable, and auditable.
The Plan / Apply Lifecycle
Terraform-style tools split changes into two steps:
- Plan — show what would change. Read-only.
- Apply — actually make the changes.
The CI pattern follows naturally:
PR opened → run terraform plan → post diff as a comment
PR merged → run terraform apply on main
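The same two steps can be reproduced from a terminal. The following is a minimal sketch, not a prescribed workflow: the function names `tf_plan` and `tf_apply_saved` and the `TF` variable are illustrative assumptions, while the `plan -out`, `show`, and `apply <planfile>` commands are standard Terraform CLI usage.

```shell
# Assumption: TF names a Terraform-compatible binary (terraform, tofu, ...).
TF="${TF:-terraform}"

# Step 1 (read-only): save the proposed changes to a plan file and print them.
tf_plan() {
  local plan_file="${1:-tfplan}"
  "$TF" plan -input=false -no-color -out="$plan_file" || return 1
  "$TF" show -no-color "$plan_file"
}

# Step 2 (mutating): apply exactly the saved plan file.
# No new plan is computed and no confirmation prompt is shown.
tf_apply_saved() {
  local plan_file="${1:-tfplan}"
  "$TF" apply -input=false "$plan_file"
}

# Usage:
#   tf_plan tfplan          # review the output
#   tf_apply_saved tfplan   # apply what was reviewed
```

Applying a saved plan file means the changes that were reviewed are the changes that get made; Terraform refuses a saved plan if the state has changed since it was created.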
A GitHub Actions Example
name: Terraform
on:
  pull_request:
    paths: ['infra/**']
  push:
    branches: [main]
    paths: ['infra/**']
permissions:
  id-token: write       # OIDC token for configure-aws-credentials
  contents: read
  pull-requests: write  # lets the workflow comment on the PR
jobs:
  plan:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./infra
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123:role/tf-plan
          aws-region: us-east-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform fmt -check
      - run: terraform validate
      - id: plan
        run: terraform plan -no-color -out=tfplan
      # Render the saved plan as text so a later step can post it to the PR
      - run: terraform show -no-color tfplan > plan.txt
  apply:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production # pauses for approval if the environment has required reviewers
    defaults:
      run:
        working-directory: ./infra
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123:role/tf-apply
          aws-region: us-east-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform apply -auto-approve
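Note the two distinct IAM roles. The plan role needs only read access to the resources, plus access to the state and its lock; the apply role can change things. As long as the apply role's trust policy rejects pull-request runs, even a hijacked PR run cannot mutate infra.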
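The separation only holds if each role trusts the right kind of run. As a hedged sketch, the hypothetical helper `github_oidc_trust_policy` below prints an IAM trust policy that accepts a single GitHub OIDC subject. The account ID and `example-org/example-repo` are placeholders, and the subject formats shown in the usage comments follow GitHub's documented OIDC claims: `pull_request` for PR-triggered runs, and `environment:<name>` for jobs that declare an environment, as the apply job does.

```shell
# Print an IAM trust policy that lets exactly one GitHub OIDC subject assume a role.
# $1 = AWS account ID, $2 = expected "sub" claim of the workflow's OIDC token
github_oidc_trust_policy() {
  local account_id="$1" subject="$2"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${account_id}:oidc-provider/token.actions.githubusercontent.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
        "token.actions.githubusercontent.com:sub": "${subject}"
      }
    }
  }]
}
EOF
}

# Usage (hypothetical account ID and repository):
#   github_oidc_trust_policy 123456789012 \
#     "repo:example-org/example-repo:pull_request" > tf-plan-trust.json
#   github_oidc_trust_policy 123456789012 \
#     "repo:example-org/example-repo:environment:production" > tf-apply-trust.json
```

With policies like these, a token minted for a pull-request run cannot assume the apply role, whatever the workflow file in the PR says.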
Remote State and Locking
Local .tfstate files do not work for teams. Use a remote backend:
| Option | Notes |
| --- | --- |
| S3 + DynamoDB | State in S3, lock in DynamoDB. Classic AWS setup. |
| Terraform Cloud / HCP Terraform | SaaS — managed state, runs, policy, team workflows. |
| Azure Storage | Container with blob lease for locking. |
| GCS | Native locking on the bucket. |
| OpenTofu | Community fork of Terraform; works with the same backends. |
Locking prevents two pipeline runs from corrupting state when they apply concurrently. Always enable it.
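For the S3 + DynamoDB option, the backend settings can be supplied at init time. This sketch assumes the configuration already declares an empty `backend "s3" {}` block (Terraform's partial backend configuration); the function name `init_s3_backend` and the bucket, key, and table values in the usage comment are hypothetical.

```shell
# Assumption: TF names a Terraform-compatible binary.
TF="${TF:-terraform}"

# Initialise a working directory against an S3 state bucket with DynamoDB locking.
# $1 = bucket, $2 = state object key, $3 = region, $4 = DynamoDB lock table
init_s3_backend() {
  local bucket="$1" key="$2" region="$3" lock_table="$4"
  "$TF" init -input=false \
    -backend-config="bucket=${bucket}" \
    -backend-config="key=${key}" \
    -backend-config="region=${region}" \
    -backend-config="dynamodb_table=${lock_table}" \
    -backend-config="encrypt=true"
}

# Usage (hypothetical names):
#   init_s3_backend example-tf-state prod/terraform.tfstate us-east-1 example-tf-locks
```

Keeping these values out of the committed configuration lets the same code initialise against a different bucket or key per environment.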
Posting the Plan to the PR
Make the plan readable to reviewers:
- name: Comment plan
  uses: actions/github-script@v7
  with:
    script: |
      const fs = require('fs');
      // Path is relative to the repo root. An earlier step must have written it,
      // e.g. `terraform show -no-color tfplan > plan.txt` run inside ./infra.
      const plan = fs.readFileSync('infra/plan.txt', 'utf8');
      const body = '### Terraform Plan\n\n\`\`\`\n' + plan + '\n\`\`\`';
      await github.rest.issues.createComment({
        ...context.repo,
        issue_number: context.issue.number,
        body,
      });
Reviewers approve the diff, not just the code. "What will this do to prod?" becomes a one-glance answer.
Policy as Code
Stop bad plans before they're applied. Tools:
- OPA / Conftest — write Rego rules ("no public S3 buckets")
- Sentinel — Terraform Cloud's policy engine
- Checkov, tfsec, Terrascan — pre-built rule packs
- Open Policy Agent Gatekeeper — at the Kubernetes admission layer
- name: Run Checkov
  uses: bridgecrewio/checkov-action@v12
  with:
    directory: infra
    framework: terraform
    soft_fail: false # fail the build on findings
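For a rule too small to justify a Rego policy, the plan's JSON form can be checked directly. The sketch below is a homegrown example, not a replacement for the tools above: it assumes `jq` is installed and that `plan.json` was produced by `terraform show -json tfplan`. The function names and the "no deletions" rule are illustrative.

```shell
# Count resources the plan would delete, including delete-then-recreate.
# $1 = path to JSON produced by: terraform show -json tfplan > plan.json
count_deletions() {
  jq '[.resource_changes[]? | select(.change.actions | index("delete"))] | length' "$1"
}

# Return non-zero if the plan deletes more than $2 resources (default 0).
enforce_no_deletions() {
  local plan_json="$1" max="${2:-0}" n
  n="$(count_deletions "$plan_json")" || return 2
  if [ "$n" -gt "$max" ]; then
    echo "policy violation: plan deletes ${n} resource(s), limit is ${max}" >&2
    return 1
  fi
  echo "policy ok: ${n} deletion(s)"
}

# Usage in a pipeline step:
#   terraform show -json tfplan > plan.json
#   enforce_no_deletions plan.json
```

This is the same plan JSON that Conftest evaluates, so a check like this can later be ported to a Rego rule without changing the pipeline's shape.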
Multi-Environment Layout
Two common patterns:
Workspaces
terraform workspace new dev
terraform workspace new prod
terraform workspace select prod
terraform apply
One config, many state files. Light-touch but discourages env-specific differences.
Per-environment directories
infra/
├── modules/
└── envs/
    ├── dev/
    ├── staging/
    └── prod/
Each env is its own top-level config, with its own state, that calls the shared modules. Easy to give different envs different shapes; preferred for non-trivial setups.
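With per-environment directories, a pipeline can treat each directory as an independent unit. A minimal sketch, assuming the `infra/envs/<name>/` layout above and using Terraform's global `-chdir` option; the function name `plan_each_env` is hypothetical.

```shell
# Assumption: TF names a Terraform-compatible binary.
TF="${TF:-terraform}"

# Run init and plan in every directory under <root>/envs/.
# $1 = root of the IaC tree (e.g. infra)
plan_each_env() {
  local root="$1" dir
  for dir in "$root"/envs/*/; do
    [ -d "$dir" ] || continue      # glob matched nothing
    dir="${dir%/}"                 # drop the trailing slash
    echo "== ${dir}"
    "$TF" -chdir="$dir" init -input=false || return 1
    "$TF" -chdir="$dir" plan -input=false -no-color || return 1
  done
}

# Usage:
#   plan_each_env infra
```

In a real pipeline each environment would more likely get its own job and credentials; the loop just shows that nothing is shared between the directories except the modules they call.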
Drift Detection
Someone clicks something in the cloud console, or an auto-scaling group changes, and your IaC no longer matches reality. Run a scheduled terraform plan and alert if it shows changes:
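on:
  schedule:
    - cron: '0 8 * * *' # daily at 08:00 UTC
permissions:
  id-token: write
  contents: read
jobs:
  drift:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./infra
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Reuses the read-only plan role; its trust policy must allow scheduled runs
          role-to-assume: arn:aws:iam::123:role/tf-plan
          aws-region: us-east-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - id: plan
        # Exit code 0 = no changes, 1 = error, 2 = changes present
        run: terraform plan -detailed-exitcode -no-color
        continue-on-error: true
      - if: steps.plan.outputs.exitcode == '2'
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }} # assumes a secret with this name
        run: |
          echo "Drift detected, alerting Slack"
          curl -X POST -H 'Content-Type: application/json' \
            -d '{"text":"⚠️ Terraform drift in prod"}' "$SLACK_WEBHOOK"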
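The drift check hinges on the three exit codes of `terraform plan -detailed-exitcode`: 0 for no changes, 1 for an error, 2 for pending changes. The sketch below, with a hypothetical function name `check_drift`, keeps those cases apart so that a failed plan is not mistaken for "no drift".

```shell
# Assumption: TF names a Terraform-compatible binary.
TF="${TF:-terraform}"

# Print "in-sync", "drift", or "error" for the current working directory.
# $1 = optional file to receive the plan output (default: discard it)
check_drift() {
  local log="${1:-/dev/null}" rc=0
  "$TF" plan -detailed-exitcode -input=false -no-color >"$log" || rc=$?
  case "$rc" in
    0) echo "in-sync" ;;
    2) echo "drift" ;;
    *) echo "error"; return 1 ;;   # the plan itself failed; drift is unknown
  esac
}

# Usage:
#   status="$(check_drift drift-plan.txt)" || echo "plan failed, investigate"
#   [ "$status" = "drift" ] && echo "send alert"
```

Treating exit code 1 as its own case matters for a scheduled job: expired credentials or a locked state would otherwise look like a quiet, healthy run.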
Managed IaC Platforms
Once you have many states, tens of pipelines, and several teams, raw GitHub Actions starts to creak. Specialised products take over:
- Atlantis — open-source, driven by PR comments (atlantis plan, atlantis apply)
- Terraform Cloud / HCP Terraform — runs, RBAC, policy, audit
- Spacelift — multi-IaC (Terraform + Pulumi + Ansible + CloudFormation)
- env0, Scalr — similar managed platforms
They give you queueing, drift detection, RBAC, cost estimates, and audit logs out of the box.
Cost Estimation
Add a "what will this cost?" step to every plan:
- uses: infracost/actions/setup@v3
  with:
    api-key: ${{ secrets.INFRACOST_API_KEY }}
- run: infracost diff --path=infra --format=table
Reviewers see "+$1,200/month" before approving, which is much harder to ignore than a lengthy plan diff.
The Pipeline Pattern in One Sentence
Plan on PRs, apply on merge with a separate, more-privileged identity, gate with environment approvals, scan with policy-as-code, and run scheduled drift checks.