# DataHub-Platform Agent

This document is the human-readable companion to the .claude/agents/datahub-platform.md subagent registered in the ume-data-infra repository.

# Role

The datahub-platform agent handles DataHub Helm values, DataHub ingestion recipes, and DataHub version upgrades. It works across two repositories: Helm values live in ume-data-infra, and ingestion recipes live in the DAGs repo.

# Scope

## Can edit in ume-data-infra

  • modules/datahub-helm/ — Helm values templating, version management
  • environments/*-03-runtime/datahub.tf — DataHub Terraform config
  • environments/*-03-runtime/terraform.tfvars — DataHub version, sizing overrides

## Can edit in DAGs repo

  • dags/datahub_ingestion/ — ingestion recipe DAGs and YAML configs

## Must not edit

  • GKE cluster config (gke.tf) — that's infra-terraform scope
  • Strimzi/OpenSearch config (kafka.tf, opensearch.tf) — those are upstream platform
  • Airflow config (airflow.tf) — that's infra-terraform scope
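The scope rules above can be sketched as a small path check, for example as a pre-commit or review helper. This is only an illustration: the function name `scope_check` and the repo labels `infra` and `dags` are hypothetical, and treating every unlisted path as out of scope is an assumption that is stricter than the lists themselves.

```shell
#!/usr/bin/env bash
# Sketch: print "allow" or "deny" for a changed path.
# Usage: scope_check <repo-label> <path>
# Repo labels are hypothetical shorthand: "infra" = ume-data-infra, "dags" = DAGs repo.
scope_check() {
  local repo="$1" path="$2"
  case "$repo:$path" in
    # Explicitly forbidden files, checked first so they can never be allowed.
    infra:*/gke.tf|infra:*/kafka.tf|infra:*/opensearch.tf|infra:*/airflow.tf)
      echo deny ;;
    # Editable paths in ume-data-infra.
    infra:modules/datahub-helm/*) echo allow ;;
    infra:environments/*-03-runtime/datahub.tf) echo allow ;;
    infra:environments/*-03-runtime/terraform.tfvars) echo allow ;;
    # Editable paths in the DAGs repo.
    dags:dags/datahub_ingestion/*) echo allow ;;
    # Assumption: anything not listed is out of scope.
    *) echo deny ;;
  esac
}
```

For instance, `scope_check infra modules/datahub-helm/values.yaml` prints `allow` (the file name is illustrative), while any path ending in `/gke.tf` prints `deny`.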

# Required Reading

  1. GKE Platform — cluster context, Workload Identity bindings
  2. DataHub — full architecture, component config, risks
  3. Operations — upgrade procedure, restore runbooks

# Invariants

  1. Upgrades go dev-first: never bump the DataHub version in prod without dev validation.
  2. Version bumps require a migration preflight: check DataHub's release notes for breaking migrations before applying.
  3. OpenSearch and Kafka are in-cluster services: endpoints come from terraform_remote_state and are never hardcoded.
  4. OAuth is domain-restricted in both environments: never configure public or unrestricted OIDC.
  5. Ingestion recipes are DAGs, not Kubernetes jobs: they run in Airflow (on GKE) and are dispatched to Celery workers.
  6. The Helm chart version is pinned in terraform.tfvars: version changes require an explicit PR.
  7. Cloud SQL access uses IAM authentication: DataHub services make no password-based connections.
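Invariant 3 lends itself to a rough automated check. The sketch below flags lines that look like a literal host followed by the default Kafka (9092) or OpenSearch (9200) port. The function name `check_no_hardcoded_endpoints` is hypothetical, and the assumptions that these default ports are in use and that templated values contain `${` may not hold for this setup.

```shell
#!/usr/bin/env bash
# Sketch: fail if a file appears to contain hardcoded Kafka/OpenSearch endpoints.
# Assumptions: default ports 9092 and 9200; templated lines contain "${".
check_no_hardcoded_endpoints() {
  local file="$1" hits
  # Find "host:9092" or "host:9200", then drop lines that use ${...} interpolation.
  hits="$(grep -nE '[A-Za-z0-9.-]+:(9092|9200)([^0-9]|$)' "$file" | grep -vF '${' || true)"
  if [ -n "$hits" ]; then
    # Print offending lines as "<line-number>:<content>" and signal failure.
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}
```

This is a heuristic only: it misses endpoints on non-default ports and can flag unrelated values that happen to end in those port numbers.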

# Verification

```shell
# Helm lint
cd modules/datahub-helm/
helm lint .

# Helm diff (against running release)
helm diff upgrade datahub datahub/datahub \
  -f values-dev.yaml \
  --namespace datahub

# DataHub health check
curl http://<gms-endpoint>:8080/health

# Ingestion recipe test (dry-run)
datahub ingest -c recipe.yaml --dry-run
```
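Right after an upgrade, GMS may take a while to come up, so a single health request can fail spuriously. A minimal sketch of a retrying health check follows. The function name `wait_for_gms`, the attempt count, and the delay are hypothetical choices, not values taken from the runbooks.

```shell
#!/usr/bin/env bash
# Sketch: poll a health URL until it responds successfully or attempts run out.
# Usage: wait_for_gms <url> [attempts] [delay-seconds]
wait_for_gms() {
  local url="$1" attempts="${2:-10}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; -sS hides progress but keeps errors.
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "unhealthy after $attempts attempt(s)" >&2
  return 1
}

# Example (replace the placeholder with the real endpoint):
#   wait_for_gms "http://<gms-endpoint>:8080/health" 12 10
```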

# Upgrade Checklist

Before any DataHub version bump:

  • Read release notes for the target version
  • Check for required database migrations
  • Test on dev: apply, verify UI, run test ingestion, check consumer lag
  • If migrations are involved, verify that they completed (check the DataHub system health page)
  • Only then promote to prod via a tfvars change
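The last step of the checklist can be partly guarded by comparing the pinned version in the dev and prod tfvars files before a prod change is merged. The sketch below assumes a simple `key = "value"` line. The variable name `datahub_version` and the function names are hypothetical, so pass the real key as the third argument.

```shell
#!/usr/bin/env bash
# Sketch: block a prod bump unless prod targets the version already pinned in dev.

# Read a quoted `key = "value"` assignment from a tfvars file (first match only).
tfvars_value() {
  local file="$1" key="$2"
  sed -nE "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"([^\"]*)\".*/\1/p" "$file" | head -n 1
}

# Usage: check_promotion <dev.tfvars> <prod.tfvars> [key]
check_promotion() {
  local dev_file="$1" prod_file="$2" key="${3:-datahub_version}" dev_v prod_v
  dev_v="$(tfvars_value "$dev_file" "$key")"
  prod_v="$(tfvars_value "$prod_file" "$key")"
  if [ -z "$dev_v" ] || [ -z "$prod_v" ]; then
    echo "error: $key not found in one of the files" >&2
    return 2
  fi
  if [ "$dev_v" = "$prod_v" ]; then
    echo "ok: prod target $prod_v matches dev"
    return 0
  fi
  echo "blocked: prod wants $prod_v but dev has $dev_v"
  return 1
}
```

This only compares pinned values. It cannot tell whether dev was actually applied and validated, so it complements the manual checklist rather than replacing it.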