openova/clusters
e3mrah 4881692159
feat(tenant-gitops): emit Continuum CR for each multi-region tenant app (Refs #2066) (#2074)
Per the 2026-05-20 Pillar 3 audit (audit-pillar3-cnpg-2026-05-20.md,
surface #12, marked MISSING): even with bp-cnpg-pair rendered inline
by the WordPress tenant chart, no Continuum.dr.openova.io/v1 resource
is ever created for the new tenant. The bp-continuum controller (wired
by PR #2072 / Refs #2065) therefore has nothing to reconcile, and
killing the primary triggers no automated failover, which breaks the
Pillar 3 "≤30s failover / zero-tx-loss" claim from CLAUDE.md §0.

This change extends renderSMETenantOverlay in
products/catalyst/bootstrap/api/internal/handler/sme_tenant_gitops.go
to emit a per-Application Continuum CR (continuum.yaml) alongside
the bp-wordpress-tenant HelmRelease whenever
SOVEREIGN_ENABLE_HOT_STANDBY=true AND both regions are non-empty
and distinct (the same defence-in-depth gate that the existing
pg.activeHotStandby.* block already passes through). The
kustomization.yaml conditionally references the new file under
resources:, and the overlay writer now skips empty template
contents so single-cluster tenants never see a stray empty file.
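
The gate and the empty-content skip described above can be sketched as follows. This is a minimal illustration, not the code in sme_tenant_gitops.go: the helper names continuumEnabled and filesToWrite are hypothetical, and both the exact string comparison against "true" and the whitespace trimming are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// continuumEnabled mirrors the gate from the commit message: hot standby must
// be switched on and both regions must be non-empty and distinct.
// Hypothetical helper; comparing against the literal "true" and trimming
// whitespace are assumptions about how the real handler reads its settings.
func continuumEnabled(hotStandby, primary, replica string) bool {
	primary = strings.TrimSpace(primary)
	replica = strings.TrimSpace(replica)
	return hotStandby == "true" && primary != "" && replica != "" && primary != replica
}

// filesToWrite returns the sorted names of rendered files that have content.
// Entries whose template rendered to nothing are skipped, so a disabled
// template never leaves an empty file in the overlay.
func filesToWrite(rendered map[string]string) []string {
	var names []string
	for name, content := range rendered {
		if strings.TrimSpace(content) == "" {
			continue
		}
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Region names below are placeholders, not real Sovereign regions.
	fmt.Println(continuumEnabled("true", "region-a", "region-b"))
	fmt.Println(continuumEnabled("true", "region-a", "region-a"))
	fmt.Println(filesToWrite(map[string]string{
		"helmrelease.yaml": "kind: HelmRelease\n",
		"continuum.yaml":   "",
	}))
}
```

Keeping the gate in one predicate lets the CR template and the kustomization.yaml entry share a single decision, so the two cannot drift apart.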

Continuum CR shape per products/catalyst/chart/crds/continuum.yaml:
- applicationRef = bp-wordpress-tenant
- primaryRegion / hotStandbyRegions[] = SOVEREIGN_{PRIMARY,REPLICA}_REGION
- rto: 30s, rpo: 5s (matches CLAUDE.md §0 + PR #2071 remote_apply
  synchronous-replication shape)
- leaseClient.kind: dns-quorum (canonical Sovereign-internal default;
  3 in-cluster PowerDNS resolvers)
- luaRecord.healthCheck.url: https://<WordPressHost>/healthz
- autoFailover: false (the first failover walk is operator-driven;
  flip to true after handover)

This PR creates the CR; PR #2071 (Refs #2064) ships synchronous
replication; PR #2072 (Refs #2065) wires bp-continuum into the
bootstrap-kit. All three are needed for Pillar 3 to actually achieve
zero-tx-loss + ≤30s failover. D31 acceptance test (#2067) and
standalone bp-cnpg-pair install path (#2068) remain separate.

Tests:
- TestRenderSMETenantOverlay_HotStandby_On_EmitsContinuumCR asserts
  the CR + kustomization.yaml entry both appear with correct fields
  when SOVEREIGN_ENABLE_HOT_STANDBY=true + distinct regions.
- TestRenderSMETenantOverlay_HotStandby_Off_NoContinuumCR asserts
  the symmetric case: no CR file and no kustomization.yaml reference
  when HA is off (avoids stray missing-resource or unknown-apiGroup
  reconcile errors on single-cluster tenants).
- Existing TestRenderSMETenantOverlay_HotStandby_* tests still pass
  (full handler suite green, 87s wall).
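
The invariant behind the two new tests is that kustomization.yaml and the written files always agree. A self-contained sketch of that check, using a hypothetical in-memory overlay type rather than the handler's real output structures, could look like this:

```go
package main

import "fmt"

// overlay is a hypothetical in-memory view of a rendered tenant overlay:
// the files that were written (name -> content) and the entries listed
// under resources: in kustomization.yaml.
type overlay struct {
	Files     map[string]string
	Resources []string
}

// checkSymmetry reports an error when kustomization.yaml references a file
// that was not written, or when a written file is not referenced.
func checkSymmetry(o overlay) error {
	referenced := make(map[string]bool, len(o.Resources))
	for _, r := range o.Resources {
		referenced[r] = true
		if _, ok := o.Files[r]; !ok {
			return fmt.Errorf("resources lists %q but no such file was rendered", r)
		}
	}
	for name := range o.Files {
		if name == "kustomization.yaml" {
			continue // the kustomization does not list itself
		}
		if !referenced[name] {
			return fmt.Errorf("file %q was rendered but is not listed under resources", name)
		}
	}
	return nil
}

func main() {
	// File names other than continuum.yaml and kustomization.yaml are placeholders.
	haOn := overlay{
		Files: map[string]string{
			"kustomization.yaml": "resources: [...]",
			"helmrelease.yaml":   "kind: HelmRelease",
			"continuum.yaml":     "kind: Continuum",
		},
		Resources: []string{"helmrelease.yaml", "continuum.yaml"},
	}
	haOff := overlay{
		Files: map[string]string{
			"kustomization.yaml": "resources: [...]",
			"helmrelease.yaml":   "kind: HelmRelease",
		},
		Resources: []string{"helmrelease.yaml"},
	}
	fmt.Println(checkSymmetry(haOn), checkSymmetry(haOff))
}
```

A dangling resources entry is what would surface as a missing-resource reconcile error on a single-cluster tenant, so the check fails on either direction of mismatch.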

Chart bump (Principle #14 lockstep):
- products/catalyst/chart/Chart.yaml: 1.4.229 → 1.4.230
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml
  pinned version: 1.4.229 → 1.4.230

Refs #2066 (NOT Closes: the issue closes after an operator walks the
surface on a fresh provision and confirms the Continuum CR reconciles
into a synchronizing state).

Validation (Principle #15):
- go test ./internal/handler/... -count=1 PASSES (89s wall, full
  handler suite).
- helm lint products/catalyst/chart PASSES.
- Render dump confirmed generated continuum.yaml + kustomization.yaml
  match CRD shape character-for-character.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:35:38 +04:00
_template feat(tenant-gitops): emit Continuum CR for each multi-region tenant app (Refs #2066) (#2074) 2026-05-20 10:35:38 +04:00
contabo-mkt/tenants provision: deploy tenant e2e-wp-test (plan: m, apps: 1) 2026-05-06 02:23:14 +04:00
omantel.omani.works fix(bp-crossplane): align ProviderConfig secretRef with cloud-init seam (Refs #1947) (#1963) 2026-05-19 19:23:04 +04:00
otech.omani.works fix(bp-crossplane): align ProviderConfig secretRef with cloud-init seam (Refs #1947) (#1963) 2026-05-19 19:23:04 +04:00