openova/scripts
e3mrah 53f510b983
feat(bootstrap-kit): wire bp-continuum (failover orchestrator) — Pillar 3 unblock (Refs #2065) (#2072)
* feat(bootstrap-kit): wire bp-continuum (failover orchestrator) — Pillar 3 unblock

Adds bootstrap-kit slot 62 (62-bp-continuum.yaml) so the Continuum DR
controller actually deploys on a fresh Sovereign. Without this slot the
chart at products/continuum/chart/ sat in-tree with no install path —
catalyst-platform's QA fixtures (slot 13 qa-continuum-status-seed-job)
reference a Continuum CR named `cont-omantel` that no controller was
ever deployed to reconcile, leaving Pillar-3 unverifiable end-to-end.

Pillar-3 of the canonical end-user DoD ("multi-region BCP — region kill
zero-data-loss failover") requires three pieces:

  1. bp-cnpg-pair (Pillar-3 follow-up #2068) — primary + replica CNPG
     with ReplicaCluster sync over Cilium ClusterMesh on the WG-public-
     IP DMZ data plane.
  2. Continuum CR + the per-app HTTPRoute drain hook (follow-up #2066).
  3. THIS controller — without bp-continuum deployed, every Continuum
     CR sits unhandled and the lua-record flip never fires, so a
     region kill loses every in-flight transaction.

This PR ships piece 3 — the controller itself, gated default-OFF.

Files
- NEW clusters/_template/bootstrap-kit/62-bp-continuum.yaml — HelmRepository
  + HelmRelease pinned to bp-continuum 0.1.1, targetNamespace
  catalyst-system, dependsOn [bp-catalyst-platform, bp-nats-jetstream,
  bp-powerdns], default-OFF gate via ${CONTINUUM_ENABLED:-false}.
- UPDATE clusters/_template/bootstrap-kit/kustomization.yaml — slot 62
  appended after slot 60 (bp-vcluster-helmrepo), with a header comment
  explaining the Pillar-3 dependency analysis.
- UPDATE scripts/expected-bootstrap-deps.yaml — slot 62 declared with the
  same dep set so scripts/check-bootstrap-deps.sh stays drift-free.
- UPDATE products/continuum/chart/Chart.yaml — version 0.1.0 → 0.1.1
  (first PUBLISHED version; 0.1.0 sat in-tree, but blueprint-release.yaml
  never pushed it to GHCR because no path change triggered the workflow),
  plus the `catalyst.openova.io/smoke-render-mode: default-off` annotation
  required by blueprint-release's smoke-render gate for default-OFF charts.

Default-OFF rationale
The chart's own values.yaml ships `continuum.enabled: false` (the chart
fails fast on an empty `image.tag` when enabled=true, per the Inviolable
Principle #4a no-`:latest` guard). We surface a CONTINUUM_ENABLED
envsubst placeholder so per-Sovereign overlays may flip the gate on
once bp-cnpg-pair + bp-powerdns + lease witness are ready. Default
`false` matches the MARKETPLACE_ENABLED / SANDBOX_ENABLED knob shape.
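
The `${CONTINUUM_ENABLED:-false}` placeholder uses the same default-value
syntax as POSIX shell parameter expansion. A minimal sketch of how the gate
would resolve, assuming the overlay substitution follows shell `:-`
semantics (the `resolve_gate` helper is hypothetical, for illustration only):

```shell
#!/usr/bin/env bash
# Hypothetical helper: mirrors how ${CONTINUUM_ENABLED:-false} resolves.
# Assumption: substitution follows shell ":-" semantics, so an unset OR
# empty variable falls back to the default value "false".
resolve_gate() {
  local CONTINUUM_ENABLED="${1-}"
  printf '%s\n' "${CONTINUUM_ENABLED:-false}"
}

resolve_gate        # no overlay value: prints "false"
resolve_gate ""     # empty overlay value: prints "false"
resolve_gate true   # overlay flipped the gate: prints "true"
```

Under these assumptions, a Sovereign that never sets the variable keeps the
controller gated off, and only an explicit `true` turns it on.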

Why dependsOn does NOT include bp-cnpg-pair
The chart ships default-OFF — the controller installs idle and only
exercises bp-cnpg-pair when an operator flips `continuum.enabled=true`.
Adding bp-cnpg-pair to dependsOn today would break the install on every
Sovereign that hasn't shipped #2068 yet. Per-Sovereign cnpg-pair
provisioning is the gating dependency at flip-time, not install-time.

Validation (Principle #15 — fresh state, NOT --dry-run=server)
- `helm package products/continuum/chart` → bp-continuum-0.1.1.tgz
- `helm template smoke products/continuum/chart` → empty (default-OFF,
  matches smoke-render-mode annotation contract).
- `helm template smoke products/continuum/chart --set
  continuum.enabled=true` → 6 resources rendered cleanly (Deployment,
  Service, ServiceAccount, RBAC, NetworkPolicy).
- `bash scripts/check-bootstrap-deps.sh` → "Drift: 0  Cycles: 0  PASSED".
- `bash scripts/check-bootstrap-kit-pin-sync.sh` → "bp-continuum:
  chart=0.1.1 pin=0.1.1  PASS".
- `kubectl kustomize clusters/_template/bootstrap-kit/` → 52 HelmReleases
  rendered (51 before this change, plus bp-continuum); `kubectl apply
  --dry-run=client` on the rendered YAML produces no errors for bp-continuum.

GHCR publication path
bp-continuum:0.1.0 was never published — git history shows the chart
committed in-tree but the blueprint-release workflow (which triggers on
`products/*/chart/**` diffs) had no path-change to detect since the
initial commit. Bumping Chart.yaml to 0.1.1 forces a fresh publish on
this PR's merge; the auto-bump-pin hook (TBD-A6) then converges the
slot pin via a no-op (already matches at 0.1.1).

Verified bp-continuum:0.1.1 will publish via blueprint-release.yaml's
detect step (`git diff HEAD~1 HEAD | grep -E
'^(platform|products)/[^/]+/(chart/|blueprint.yaml)'`) which catches
products/continuum/chart/Chart.yaml in this commit's diff.
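
The path filter can be exercised in isolation. The sketch below applies the
regex quoted above to a list of changed paths, assuming the workflow hands
grep bare file paths (for example from `git diff --name-only`) rather than
raw diff text. The `detect_blueprint_changes` helper and the sample
`blueprint.yaml` path are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: filters changed paths (one per line on stdin) with
# the regex quoted in the commit message.
# Assumption: the input is a list of file paths, not raw diff output.
detect_blueprint_changes() {
  # "|| true" keeps the exit status zero when nothing matches.
  grep -E '^(platform|products)/[^/]+/(chart/|blueprint.yaml)' || true
}

# Illustrative path list; only the first two lines match the filter.
printf '%s\n' \
  'products/continuum/chart/Chart.yaml' \
  'products/continuum/blueprint.yaml' \
  'scripts/expected-bootstrap-deps.yaml' \
  'clusters/_template/bootstrap-kit/62-bp-continuum.yaml' \
  | detect_blueprint_changes
```

The `scripts/` and `clusters/` paths are dropped because the pattern is
anchored to `platform/` or `products/` at the start of the line.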

Refs #2065

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(continuum): bump blueprint.yaml spec.version 0.1.0 → 0.1.1 (lockstep)

TestBootstrapKit_BlueprintVersionLockstepSweep enforces
Chart.yaml.version == blueprint.yaml.spec.version for every
bootstrap-kit blueprint. The previous commit bumped Chart.yaml but missed
the blueprint manifest; this commit restores the lockstep.
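
The lockstep invariant can be approximated locally before pushing. This is
a rough sketch, not the project's test: it assumes Chart.yaml has a
top-level unquoted `version:` key and blueprint.yaml has an unquoted
`version:` key indented two spaces under `spec:`. All function names and
the sample file contents are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the lockstep invariant; not the project's test.
# Assumptions: unquoted versions; Chart.yaml uses a top-level "version:"
# key; blueprint.yaml nests "version:" two spaces under "spec:".
chart_version() {
  awk '/^version:/ { print $2; exit }' "$1"
}

blueprint_version() {
  awk '/^spec:/ { in_spec = 1; next }
       /^[^ ]/  { in_spec = 0 }
       in_spec && /^  version:/ { print $2; exit }' "$1"
}

# Succeeds only when both files declare the same version.
in_lockstep() {
  [ "$(chart_version "$1")" = "$(blueprint_version "$2")" ]
}

# Demo with throwaway files reproducing the drift this commit fixes.
tmp="$(mktemp -d)"
printf 'apiVersion: v2\nname: bp-continuum\nversion: 0.1.1\n' > "$tmp/Chart.yaml"
printf 'kind: Blueprint\nspec:\n  version: 0.1.0\n' > "$tmp/blueprint.yaml"
in_lockstep "$tmp/Chart.yaml" "$tmp/blueprint.yaml" \
  || echo "drift: chart and blueprint versions differ"
rm -rf "$tmp"
```

A line-oriented check like this is brittle compared with a real YAML
parser; it only illustrates the comparison being enforced.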

Same Refs #2065 thread.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hati.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:10:59 +04:00
check-bootstrap-deps.sh | fix(bp-external-secrets-stores): split ClusterSecretStore into separate chart per #247 pattern (closes #331) (#426) | 2026-05-01 17:33:47 +04:00
check-bootstrap-kit-pin-sync.sh | feat(ci): TBD-A26 pin-sync audit verifies GHCR artifact exists for each bootstrap-kit pin (#1874) | 2026-05-19 03:12:13 +04:00
check-controller-workflow-uniformity.sh | chore(ci): add auto-bump-images + pkg/** path filter to all build-*-controller workflows (Closes #2006) (#2012) | 2026-05-20 04:11:04 +04:00
check-vendor-coupling.sh | fix(ci): vendor-coupling guardrail path - products/catalyst/bootstrap/api/internal/objectstorage (closes #438) (#440) | 2026-05-01 18:21:57 +04:00
expected-bootstrap-deps.yaml | feat(bootstrap-kit): wire bp-continuum (failover orchestrator) — Pillar 3 unblock (Refs #2065) (#2072) | 2026-05-20 10:10:59 +04:00
generate-blueprint-deps.sh | fix(wizard): blueprint deps sourced from Flux dependsOn (single source of truth) (#652) | 2026-05-03 09:47:52 +04:00
operator-recover-sovereign.sh | docs(ops): comprehensive operator runbook + remediation playbook + idempotent recovery script | 2026-04-29 19:26:29 +02:00