# Infra-Terraform Agent

This document is the human-readable companion to the .claude/agents/infra-terraform.md subagent registered in the ume-data-infra repository.

# Role

The infra-terraform agent handles all Terraform changes: layers, environments, and modules. It creates and modifies infrastructure-as-code but never touches application code (DAGs, dbt models, ingestion recipes).

# Scope

# Can edit

  • layers/**/*.tf, layers/**/*.tfvars, layers/**/backend.hcl
  • environments/**/*.tf, environments/**/*.tfvars, environments/**/backend.hcl
  • modules/**/*.tf
  • scripts/ (CI helper scripts)
  • .github/workflows/terraform-*.yml
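
As an illustration of the backend.hcl files listed above, a partial backend configuration might look like the sketch below. It assumes a GCS backend, which this document does not state, and the bucket name and prefix are hypothetical placeholders rather than values from the repository.

```hcl
# Hypothetical contents of a per-stack backend.hcl (assumes the GCS backend).
# Giving each stack its own prefix keeps every stack in a separate state file.
bucket = "example-terraform-state" # hypothetical bucket name
prefix = "layers/example-layer"    # hypothetical per-stack prefix
```

A file like this is normally supplied at init time with terraform init -backend-config=backend.hcl, while the stack itself declares only an empty backend "gcs" {} block.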

# Must not edit

  • The DAGs repo, or any other repo outside ume-data-infra
  • Helm chart source code (Helm values may be changed only through Terraform helm_release)
  • Secret values (the agent manages secret names and IAM bindings only)
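
The Helm and secret restrictions can be sketched in Terraform as follows. This is a minimal illustration, not the repository's actual code: the release name, chart, repository URL, secret ID, and the labels and reader_service_account_email variables are all hypothetical, and the replication block uses the syntax of Google provider 5.x and later.

```hcl
variable "project_id" {
  type        = string
  description = "Target GCP project."
}

variable "labels" {
  type        = map(string)
  description = "Hypothetical input carrying the mandatory labels."
}

variable "reader_service_account_email" {
  type        = string
  description = "Hypothetical service account allowed to read the secret."
}

# Helm values are set through helm_release; the chart source is not modified.
resource "helm_release" "example" {
  name       = "example"                    # hypothetical release name
  repository = "https://charts.example.com" # hypothetical chart repository
  chart      = "example"
  namespace  = "example"

  values = [yamlencode({
    replicaCount = 2
  })]
}

# Terraform creates the secret's name only. There is deliberately no
# google_secret_manager_secret_version resource, so no value is committed.
resource "google_secret_manager_secret" "api_token" {
  project   = var.project_id
  secret_id = "example-api-token" # hypothetical secret name
  labels    = var.labels

  replication {
    auto {} # provider 5.x+ syntax; older versions use `automatic = true`
  }
}

# IAM binding that lets a workload read the secret once a value exists.
resource "google_secret_manager_secret_iam_member" "reader" {
  project   = google_secret_manager_secret.api_token.project
  secret_id = google_secret_manager_secret.api_token.secret_id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${var.reader_service_account_email}"
}
```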

# Required Reading

Before proposing changes, the agent must read:

  1. Terraform Structure — repo layout, state, contracts, labels, security
  2. CI/CD — workflow conventions, changed-stack detection
  3. GKE Platform — cluster config, node pools, zero-downtime
  4. Observability and Cost — alert provisioning, label conventions

# Invariants

These rules must never be violated:

  1. Environment-scoped resources go into local modules from the start: the need to replicate them in prod justifies the module, so do not "wait for 2+ callers."
  2. Modules expose all configurable settings as variables with defaults — don't force callers into module code to change a setting.
  3. Local modules can wrap upstream terraform-google-modules/* when the upstream handles complexity worth reusing; evaluate per-module.
  4. Never hardcode project IDs — always use var.project_id or similar.
  5. Every resource carries 5 mandatory labels: env, layer, service, owner, cost_center.
  6. Never commit secret values — Terraform creates secret names + IAM; values are populated out-of-band.
  7. Use terraform_remote_state for inter-stack contracts; do not duplicate another stack's outputs into tfvars.
  8. No cert-manager, no nginx-ingress — we use GKE Ingress (GCLB) + Certificate Manager.
  9. No service-account key files — WIF for CI, Workload Identity for in-cluster.
  10. State files are per-stack — never merge multiple stacks into one state.
  11. Module sources are local paths in wave-1 and must switch to tagged git refs before prod.
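
Several of these invariants can be seen together in a single stack file. The sketch below is illustrative only and assumes a GCS state backend. The module path and its inputs, the state_bucket variable, the upstream output name subnet_self_link, and every label value are hypothetical.

```hcl
variable "project_id" {
  type        = string
  description = "Target project, passed in rather than hardcoded (invariant 4)."
}

variable "env" {
  type        = string
  description = "Environment name, for example dev or prod."
}

variable "state_bucket" {
  type        = string
  description = "Hypothetical bucket holding the per-stack state files."
}

locals {
  # The five mandatory labels (invariant 5); all values are placeholders.
  mandatory_labels = {
    env         = var.env
    layer       = "example-layer"
    service     = "example-service"
    owner       = "example-team"
    cost_center = "cc-0000"
  }
}

# Read another stack's outputs instead of copying them into tfvars (invariant 7).
data "terraform_remote_state" "network" {
  backend = "gcs"

  config = {
    bucket = var.state_bucket
    prefix = "layers/network" # hypothetical prefix of the upstream stack
  }
}

module "example" {
  # Local path during wave-1 (invariant 11).
  source = "../../modules/example"

  project_id = var.project_id
  subnetwork = data.terraform_remote_state.network.outputs.subnet_self_link
  labels     = local.mandatory_labels
}
```

Before prod, the source argument would point at a tagged git ref instead, for example a git:: URL ending in ?ref=<tag>.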

# Verification

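After any change, run the following in the affected stack (terraform validate and terraform plan require the directory to have been initialized with terraform init):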

cd <stack-directory>
terraform fmt -check -recursive
terraform validate
terraform plan -out=plan.tfplan

For module changes, run terraform plan on every stack that depends on the module; the detect-changed-stacks.sh script identifies them.