Slice G1 of EPIC-0 (#1095, Group G "Multi-cluster substrate"). Today
infra/hetzner/main.tf only realises regions[0] end-to-end — every wizard
payload's regions[1..N] entries silently no-op. EPIC-6 (#1101) Continuum
DR demo needs 3 regions (mgmt + fsn + hel per docs/EPICS-1-6-unified-design.md
§3.8 + §11), so this slice closes the gap.
Architecture: hybrid singular-path + secondary-region overlay.
- The legacy singular path (var.region + count = local.control_plane_count)
STAYS untouched — every existing Sovereign state (omantel, otech*) keeps
its resource addresses (hcloud_server.control_plane[0],
hcloud_load_balancer.main, etc) and produces a no-op plan diff.
- New regions (regions[1+]) are realised via a parallel for_each set keyed
by "{cloudRegion}-{index}" (e.g. fsn1-1, hel1-2). Each secondary region
gets its own /24 subnet inside the shared /16 hcloud_network, its own
CP server, its own workers, and its own lb11 load balancer. The
hcloud_firewall + hcloud_ssh_key stay shared (one tenant boundary per Sovereign).
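
A minimal sketch of how the secondary-region key set could be derived. This
is not the code in infra/hetzner/main.tf: the variable shape, the names
local.secondary_regions and secondary_region_subnets, the `provider`
attribute, and the 10.0.0.0/16 range are assumptions made for illustration.

```hcl
# Hypothetical shape of the wizard payload; the real schema may differ.
variable "regions" {
  type = list(object({
    provider    = string
    cloudRegion = string
  }))
  default = []
}

locals {
  # Index 0 is left to the legacy singular path, and non-Hetzner entries
  # are skipped. The list index is part of the key, so two entries in the
  # same cloud region still get distinct keys (e.g. fsn1-1 and fsn1-2).
  secondary_regions = {
    for idx, r in var.regions :
    "${r.cloudRegion}-${idx}" => merge(r, { index = idx })
    if idx > 0 && r.provider == "hetzner"
  }
}

# One /24 per secondary region carved out of an assumed shared /16.
output "secondary_region_subnets" {
  value = {
    for key, r in local.secondary_regions :
    key => cidrsubnet("10.0.0.0/16", 8, r.index)
  }
}
```

With a three-entry payload (mgmt, fsn1, hel1) this yields the keys fsn1-1
and hel1-2. An empty or single-entry payload yields an empty map, so no
secondary resources are planned.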
Why hybrid rather than full for_each: a wholesale refactor would change every
existing resource address (hcloud_server.control_plane[0] →
hcloud_server.control_plane["mgmt"]), forcing every running Sovereign
to run `tofu state mv` for ~12 resources or face destructive recreates.
The brief explicitly bans that. Hybrid is purely additive — secondary
resources are NEW addresses no existing state carries.
No `tofu state mv` runbook required. Existing Sovereigns provisioned
with var.regions = [] or length(var.regions) == 1 produce identical plans
before and after this PR.
Slice G3 (out of scope here) will wire Cilium ClusterMesh between secondary
regions and add per-cluster GitOps path differentiation; today every
secondary CP renders an identical Flux Kustomization pointed at
clusters/<sovereign_fqdn>/.
Tests: tests/multi_region.tftest.hcl exercises 5 scenarios offline via
mock_provider + override_resource (no real Hetzner):
- legacy_no_regions_payload (var.regions=[])
- single_region_entry_does_not_double_provision (len==1)
- three_region_mgmt_fsn_hel (EPIC-6 shape)
- same_region_duplicates_produce_distinct_keys
- non_hetzner_regions_are_filtered_out (oci entries skipped)
All 5 pass. CI workflow infra-hetzner-tofu.yaml runs validate + fmt -check
+ test on every PR touching infra/hetzner/**.
Per CLAUDE.md "every workflow MUST be event-driven, NEVER scheduled":
push-on-merge + pull-request-on-touch + workflow_dispatch only. No cron.
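
For orientation, one of the offline scenarios might look roughly like the
run block below. It is an illustrative sketch, not an excerpt from
tests/multi_region.tftest.hcl: the resource name
hcloud_server.secondary_control_plane, the `provider` attribute, and the
mgmt location nbg1 are assumptions, and the real module will need its other
required variables set as well.

```hcl
# Mocked provider: no Hetzner API calls and no credentials needed.
mock_provider "hcloud" {}

run "three_region_mgmt_fsn_hel" {
  command = plan

  variables {
    regions = [
      { provider = "hetzner", cloudRegion = "nbg1" }, # mgmt (location assumed)
      { provider = "hetzner", cloudRegion = "fsn1" },
      { provider = "hetzner", cloudRegion = "hel1" },
    ]
  }

  # regions[0] stays on the legacy path, so only two secondary
  # control planes should appear in the plan.
  assert {
    condition     = length(hcloud_server.secondary_control_plane) == 2
    error_message = "expected one secondary control plane each for fsn1 and hel1"
  }
}
```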
Validation:
$ tofu validate
Success! The configuration is valid.
$ tofu fmt -check -recursive
exit=0
$ tofu test
tests/multi_region.tftest.hcl... pass
run "legacy_no_regions_payload"... pass
run "single_region_entry_does_not_double_provision"... pass
run "three_region_mgmt_fsn_hel"... pass
run "same_region_duplicates_produce_distinct_keys"... pass
run "non_hetzner_regions_are_filtered_out"... pass
Success! 5 passed, 0 failed.
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>