openova/platform/bp-mgmt-vcluster/README.md
e3mrah c073db28a9
feat(bootstrap-kit): bp-mgmt-vcluster + bp-dmz-vcluster + bp-rtz-vcluster — implement DoD A4 vCluster topology (#1526)
Founder ruling 2026-05-16: docs/SOVEREIGN-MULTI-REGION-DOD.md A4 has
been promised on every multi-region prov for weeks but never built in
code — the bootstrap-kit had NO mgmt/dmz/rtz vCluster blueprints and
the Sovereign Console canvas reported `vCluster 0/0` on every prov.
This PR ships the 3 missing blueprints + wires them into the
bootstrap-kit so the topology contract finally lands.

DoD A4 ratified contract:
  primary    region → MGMT  + DMZ  vCluster
  secondary  region → DMZ   + RTZ  vCluster
Cross-vCluster intra-region traffic stays inside host k3s via Cilium.
Inter-region traffic goes over the DMZ WireGuard hop per A2.

Charts (all 3 mirror the canonical bp-cert-manager umbrella pattern —
loft-sh/vcluster 0.20.0 bundled as a Helm subchart via
`helm dependency build`, MIRROR-EVERYTHING image via
harbor.openova.io/proxy-ghcr by default, fail-fast image-tag guard
per INVIOLABLE-PRINCIPLES #4a, default-OFF via subchart `condition:`
key, NetworkPolicy isolation baseline):

  platform/bp-mgmt-vcluster/   primary-only,    slot 58
  platform/bp-dmz-vcluster/    every region,    slot 54 (default-ON)
  platform/bp-rtz-vcluster/    secondary-only,  slot 59

Each chart's tests/render.sh covers 3 contracts:
  1. default-OFF renders zero resources (subchart condition gate)
  2. enabled-with-empty-image-tag fails fast (SHA-pin guard)
  3. full-ON renders Namespace + NetworkPolicy + subchart
     StatefulSet + Service

Bootstrap-kit wiring:
  clusters/_template/bootstrap-kit/{54,58,59}-bp-*-vcluster.yaml
  clusters/_template/bootstrap-kit/kustomization.yaml (3 new resources)
  scripts/expected-bootstrap-deps.yaml (slots 54/58/59 + adjacent
    bp-openova-flow-server bp-cnpg dep drift fix)

scripts/check-bootstrap-deps.sh passes 0-drift after the change
(48 HRs present on disk, 14 deferred for W2.K4).

Region-key threading uses the existing `${SOVEREIGN_REGION_KEY}`
postBuild.substitute that the cloud-init tftpl already exports (per
the brief's "DON'T touch infra/hetzner/*" directive). The per-role
enable gates default safely (mgmt=false, dmz=true, rtz=false); a
follow-up tofu PR will add MGMT_VCLUSTER_ENABLED + RTZ_VCLUSTER_ENABLED
substitutes flipped on only on the appropriate CP, taking the canvas
count from `vCluster 3/3` to `vCluster 6/6` on a 3-region Sovereign.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:13:17 +04:00


bp-mgmt-vcluster

Bootstrap-kit Blueprint #58. Provisions the MGMT vCluster, which hosts each Sovereign's mgmt-tier control plane (catalyst-api, catalyst-ui, openova-flow-server) in the primary region of a multi-region Sovereign.

Why this exists — DoD A4

docs/SOVEREIGN-MULTI-REGION-DOD.md, ratified 2026-05-15, declares invariant A4:

vCluster topology: primary region = MGMT + DMZ vCluster; each secondary region = DMZ + RTZ vCluster. Cross-vCluster intra-region traffic stays inside host k3s via Cilium.

This Blueprint implements the MGMT half of that contract.

| Region role | vClusters this Blueprint renders | Companion charts |
|---|---|---|
| Primary | MGMT | bp-dmz-vcluster (slot 54) |
| Secondary | (skipped, gated off) | bp-dmz-vcluster + bp-rtz-vcluster (slot 59) |

The bootstrap-kit Kustomization gates rendering through a SOVEREIGN_REGION_ROLE substitute. The primary CP's cloud-init template sets it to primary; secondary CPs set it to secondary. The slot 58 manifest turns mgmtVcluster.enabled on only when the role is primary.
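The gate described above could be wired roughly as follows. This is a minimal sketch, not the manifests shipped in the repository: it assumes Flux's post-build substitution (`postBuild.substitute` with `${VAR:=default}` defaults), and every resource name, path, interval and source reference is hypothetical. Flux substitution has no conditionals, so the sketch also assumes the role is translated into a boolean upstream (for example by cloud-init) and passed as a MGMT_VCLUSTER_ENABLED substitute, the variable named in the commit message as follow-up work.

```yaml
# Flux Kustomization on the primary CP (all names and paths are hypothetical).
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: bootstrap-kit
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/example/bootstrap-kit
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  postBuild:
    substitute:
      SOVEREIGN_REGION_ROLE: "primary"
      # Assumption: derived from the role before Flux sees it.
      MGMT_VCLUSTER_ENABLED: "true"
---
# Slot 58 HelmRelease consuming the substitute (sketch).
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: bp-mgmt-vcluster
  namespace: flux-system
spec:
  interval: 10m
  chart:
    spec:
      chart: bp-mgmt-vcluster
      sourceRef:
        kind: HelmRepository
        name: openova-blueprints   # hypothetical source
  values:
    mgmtVcluster:
      # Falls back to false when the substitute is absent,
      # so secondary CPs render nothing for this slot.
      enabled: ${MGMT_VCLUSTER_ENABLED:=false}
```

Leaving the substituted value unquoted lets it be read back as a YAML boolean after substitution; quoting it would hand Helm the string "false", which a template would treat as truthy.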

Resources rendered (full-ON)

  • Namespace mgmt, labeled catalyst.openova.io/vcluster-role=mgmt so that the OpenovaFlow canvas adapter counts it in the dashboard's vCluster X/Y tile
  • NetworkPolicy: default-deny, plus an allowFrom rule for dmz that admits cross-vCluster intra-region traffic from the public-facing DMZ vCluster
  • Upstream loft-sh/vcluster 0.20.0 subchart resources (StatefulSet, Service, RBAC, etc.) under the mgmt namespace with:
    • nodeSelector: openova.io/region=<primary-region-key> so the StatefulSet pod always lands on the primary CP node
    • local-path storage class, 5Gi PVC for the embedded SQLite backing store
    • 200m CPU / 384Mi memory request (limits 2 CPU / 1Gi memory)
    • MIRROR-EVERYTHING image: harbor.openova.io/proxy-ghcr/loft-sh/vcluster:0.20.0
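For orientation, the Namespace and NetworkPolicy baseline listed above might render to something like the following. This is a sketch under stated assumptions rather than the chart's actual output: the policy names are hypothetical, the DMZ vCluster is assumed to live in a host namespace named dmz, and only ingress is restricted here because the source does not say whether egress is denied as well.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mgmt
  labels:
    # Label the canvas adapter counts for the vCluster X/Y tile.
    catalyst.openova.io/vcluster-role: mgmt
---
# Default-deny: selects every pod in mgmt and allows no ingress.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
  namespace: mgmt
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Exception: admit traffic that originates in the dmz namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-dmz         # hypothetical name
  namespace: mgmt
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Set automatically by Kubernetes on every namespace.
              kubernetes.io/metadata.name: dmz
```

NetworkPolicies are additive, so the second policy widens the first without replacing it: pods in mgmt accept ingress from dmz and from nowhere else.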

Topology dependency

Phase 0 (cloud-init Hetzner CP)
   ↓
bp-cilium             — CNI + Gateway API (slot 01)
   ↓
bp-cert-manager       — TLS for ClusterIssuers (slot 02)
   ↓
bp-mgmt-vcluster      — THIS chart (slot 58, primary-only)
bp-dmz-vcluster       — slot 54 (every region)
bp-rtz-vcluster       — slot 59 (secondary-only)

Testing

tests/render.sh exercises three contracts via helm template:

  1. Default-OFF renders zero umbrella resources
  2. Enabled-with-empty-image-tag fails fast (#4a SHA-pin guard)
  3. Full-ON renders Namespace + NetworkPolicy + subchart StatefulSet + Service
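The first contract rests on Helm's dependency condition key. A dependency stanza along these lines would produce that behavior; treat it as a sketch, since the umbrella chart version and the repository URL are assumptions (the real chart may pull the subchart from a mirror), and the condition path simply reuses the mgmtVcluster.enabled value named earlier.

```yaml
# Chart.yaml (sketch; version and repository are assumptions)
apiVersion: v2
name: bp-mgmt-vcluster
version: 0.1.0                 # hypothetical chart version
dependencies:
  - name: vcluster
    version: 0.20.0
    repository: https://charts.loft.sh   # upstream repo, shown for illustration
    # Subchart is rendered only when this value is true.
    condition: mgmtVcluster.enabled
```

The condition key controls the subchart only. For the default-OFF render to be completely empty, the umbrella chart's own templates (Namespace, NetworkPolicy) would need their own guard on the same value.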

See also

  • docs/SOVEREIGN-MULTI-REGION-DOD.md — A4 contract
  • infra/hetzner/README.md lines 50-100 — topology diagram
  • platform/bp-dmz-vcluster/ — companion (every region)
  • platform/bp-rtz-vcluster/ — companion (secondary regions)
  • scripts/expected-bootstrap-deps.yaml slot 58 — dependency-graph audit declaration