#
DataHub-Platform Agent
This document is the human-readable companion to the `.claude/agents/datahub-platform.md` subagent registered in the `ume-data-infra` repository.
#
Role
The `datahub-platform` agent handles DataHub Helm values, ingestion recipes, and version upgrades. It works across two repos: Helm values live in `ume-data-infra`, and ingestion recipes live in the DAGs repo.
#
Scope
#
Can edit in ume-data-infra
- `modules/datahub-helm/`: Helm values templating, version management
- `environments/*-03-runtime/datahub.tf`: DataHub Terraform config
- `environments/*-03-runtime/terraform.tfvars`: DataHub version, sizing overrides
#
Can edit in DAGs repo
- `dags/datahub_ingestion/`: ingestion recipe DAGs and YAML configs
#
Must not edit
- GKE cluster config (`gke.tf`): that's `infra-terraform` scope
- Strimzi/OpenSearch config (`kafka.tf`, `opensearch.tf`): those are upstream platform
- Airflow config (`airflow.tf`): that's `infra-terraform` scope
#
Required Reading
- GKE Platform — cluster context, Workload Identity bindings
- DataHub — full architecture, component config, risks
- Operations — upgrade procedure, restore runbooks
#
Invariants
- Upgrades go dev-first — never bump the DataHub version in prod without dev validation.
- Version bumps require migration preflight — check DataHub's release notes for breaking migrations before applying.
- OpenSearch and Kafka are in-cluster services: endpoints come from `terraform_remote_state`, never hardcoded.
- OAuth is domain-restricted on both envs: never configure public or unrestricted OIDC.
- Ingestion recipes are DAGs, not Kubernetes jobs — they run in Airflow (on GKE), dispatched to Celery workers.
- Helm chart version is pinned in `terraform.tfvars`: explicit PRs for version changes.
- Cloud SQL access via IAM authentication: no password-based connections for DataHub services.
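
The dev-first and pinned-version invariants above can be checked mechanically before a PR is opened. The following is a minimal sketch, not the repository's actual tooling. It assumes the chart version is stored as a quoted string in each environment's `terraform.tfvars`; the variable name `datahub_chart_version` used in the usage note is hypothetical, and `sort -V` requires a `sort` that supports version ordering (GNU coreutils does).

```shell
# Sketch of a dev-first guard. Assumes the pin is a quoted string in tfvars.

# Print the value of a quoted string variable from a tfvars file.
tfvars_value() {
  local key="$1" file="$2"
  sed -n -E "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"([^\"]*)\".*/\1/p" "$file" | head -n 1
}

# Succeed only if the prod pin is not ahead of the dev pin.
# Exit codes: 0 = ok, 1 = prod is ahead of dev, 2 = a pin could not be read.
check_dev_first() {
  local key="$1" dev_file="$2" prod_file="$3"
  local dev prod highest
  dev="$(tfvars_value "$key" "$dev_file")"
  prod="$(tfvars_value "$key" "$prod_file")"
  if [ -z "$dev" ] || [ -z "$prod" ]; then
    echo "could not read ${key} from both files" >&2
    return 2
  fi
  # sort -V orders version strings; the last line is the higher version.
  highest="$(printf '%s\n%s\n' "$dev" "$prod" | sort -V | tail -n 1)"
  if [ "$highest" != "$dev" ]; then
    echo "prod pins ${prod} but dev only pins ${dev}" >&2
    return 1
  fi
  return 0
}
```

A call might look like `check_dev_first datahub_chart_version environments/dev-03-runtime/terraform.tfvars environments/prod-03-runtime/terraform.tfvars`, where the `dev-` and `prod-` directory prefixes are assumed from the `*-03-runtime` pattern. The guard only shows that dev carries the version first; it does not prove that dev validation actually happened.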
#
Verification
```shell
# Helm lint
cd modules/datahub-helm/
helm lint .

# Helm diff (against running release)
helm diff upgrade datahub datahub/datahub \
  -f values-dev.yaml \
  --namespace datahub

# DataHub health check
curl "http://<gms-endpoint>:8080/health"

# Ingestion recipe test (dry-run)
datahub ingest -c recipe.yaml --dry-run
```
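
Right after an apply, a single `curl` against the health endpoint may fail while GMS is still starting. The sketch below wraps the same health check in a retry loop. The function name `wait_for_health` and its default attempt count and delay are illustrative assumptions, not values taken from the runbooks.

```shell
# Poll a health URL until curl succeeds, or give up after a fixed number of attempts.
# Arguments: URL, max attempts (default 30), delay in seconds between attempts (default 10).
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses (400 and above).
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after ${attempts} attempts: ${url}" >&2
  return 1
}
```

For example, `wait_for_health "http://<gms-endpoint>:8080/health" 30 10` would wait up to roughly five minutes, with the placeholder replaced by the real GMS endpoint.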
#
Upgrade Checklist
Before any DataHub version bump:
- Read release notes for the target version
- Check for required database migrations
- Test on dev: apply, verify UI, run test ingestion, check consumer lag
- If migrations involved: verify they completed (check DataHub system health page)
- Only then: promote to prod via tfvars change
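
The final step, promoting via a tfvars change, amounts to editing one pinned value and opening a PR. As a rough sketch, assuming the pin is a quoted string on its own line, a helper like the hypothetical `set_tfvars_value` below rewrites just that line and refuses to run if the key is missing. It is meant for plain version strings; values containing `/` or `&` would need escaping before being passed to `sed`.

```shell
# Rewrite one quoted string pin in a tfvars file, leaving all other lines untouched.
# Arguments: variable name, new value, path to the tfvars file.
set_tfvars_value() {
  local key="$1" value="$2" file="$3"
  local tmp
  # Fail loudly if the key is not already pinned; a silent no-op would hide typos.
  if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
    echo "${key} not found in ${file}" >&2
    return 1
  fi
  tmp="$(mktemp)" || return 1
  # Write to a temp file first so a failed sed never truncates the original.
  if sed -E "s/^([[:space:]]*${key}[[:space:]]*=[[:space:]]*)\"[^\"]*\"/\1\"${value}\"/" "$file" > "$tmp"; then
    # Copy contents back rather than mv, to keep the original file's permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
    return 0
  fi
  rm -f "$tmp"
  return 1
}
```

The resulting one-line diff is what the version-bump PR would contain; the variable name passed as the first argument depends on how the pin is actually named in `terraform.tfvars`.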