8e96522d67
35 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
f6757c7c93
|
feat(docs): lean documentation strategy — consolidate 16 docs into 7 canonical + 3 subdirs (#2094)
* docs(arch): consolidate ARCHITECTURE + PLATFORM-TECH-STACK + NAMING + EPICS-1-6 + BOOTSTRAP-KIT-EXPANSION → docs/ARCHITECTURE.md (lean doc strategy)
Single canonical "how OpenOva works" doc per founder's lean-doc strategy.
2926 source lines → 1110 consolidated lines, no semantic loss.
Sections:
- §1 High-level model (Catalyst/Sovereign/Org/Env/Application/Blueprint)
- §2 Repo layout
- §3 Tech stack by layer (CNI/GitOps/IaC/event-spine/data/secrets/identity/...)
- §4 Naming conventions (dimensions, patterns, labels, DOMAINS-CANON)
- §5 Catalyst control plane (rules, CRDs, controllers, cutover, identity, surfaces)
- §6 Per-host-cluster infrastructure
- §7 Application Blueprints
- §8 Multi-region topology (1 cpx52/region, WireGuard-over-public-IPs, ClusterMesh)
- §9 Bootstrap-kit slot ordering (full 48-slot canonical list)
- §10 EPIC-level design overview (EPIC-0 through EPIC-6)
- §11 Per-chart DESIGN.md inventory
- §12 OAM influence
- §13 Read further
Stale literal fixes:
- omantel.openova.io → omantel.biz / <sovereign>.<tld> / t38.omani.works (7 instances)
- SPIRE marked DEFERRED / opt-in only (PR #665, TBD-V29 #2055)
- failover-controller marked REPLACED by bp-continuum
New PR refs wired into §3:
- PR #665 SPIRE deferral
- PR #2071 bp-cnpg-pair synchronous remote_apply (zero-tx-loss multi-region)
- PR #2087 bp-cnpg-pair pre-merge guard
- PR #2093 bp-cnpg-pair pre-merge guard
New stack components added to §3:
- bp-cnpg-pair (synchronous remote_apply ReplicaCluster across ClusterMesh)
- bp-continuum (lease-based failover orchestrator)
- bp-self-sovereign-cutover (8-tether pivot, ADR-0002, Principle #11)
Source docs (to be deleted by orchestrator in final PR):
- docs/PLATFORM-TECH-STACK.md
- docs/NAMING-CONVENTION.md
- docs/EPICS-1-6-unified-design.md
- docs/BOOTSTRAP-KIT-EXPANSION-PLAN.md
* docs(principles): consolidate INVIOLABLE-PRINCIPLES + ANTI-PATTERN-CATALOG → docs/PRINCIPLES.md (lean doc strategy)
* docs(dod): consolidate 5-PILLAR-DOD + DOMAINS-CANON + SOVEREIGN-MULTI-REGION-DOD + PERSONAS-AND-JOURNEYS → docs/DOD.md (lean doc strategy)
* docs(runbooks+status+glossary): consolidate 5 runbooks → RUNBOOKS.md + refresh STATUS.md + fold banned-terms into GLOSSARY.md (lean doc strategy)
Part 1 - Runbook consolidation:
- NEW docs/RUNBOOKS.md with 7 numbered sections (provisioning, day-2 ops, Blueprint authoring, chart conventions, demo walk, failover, troubleshooting)
- Folds BLUEPRINT-AUTHORING / CHART-AUTHORING / DEMO-RUNBOOK / RUNBOOK-OPERATIONS / RUNBOOK-PROVISIONING into one canonical surface
- Documents dual-annotation requirement for charts with enabled.default: false (GUARD 1 #2087 no-upstream + GUARD 2 #2093 smoke-render) with bp-network-policies:1.0.1 dead-reserve incident as the live evidence
- All admin.<fqdn> legacy URL refs → console.<fqdn>/bss (BSS lives in operator console)
- All openova.io / omantel.omani.works test commands → canonical t<NN>.omani.works
- Cites PRs #2076 (docs migration), #2082 (no-auto-close-keyword), #2087, #2093
Part 2 - STATUS.md refresh (renamed from IMPLEMENTATION-STATUS.md):
- Header dated 2026-05-20 (was 2026-04-29; 22 days stale per audit)
- Adds 🟦 CODE-COMPLETE state for "controllers + CRDs + tests landed, awaiting fresh-prov walk" (per 5-pillar DoD)
- Pillar 3 marked CODE-COMPLETE (PRs #2071/#2072/#2073/#2074/#2075/#2053)
- Adds 3 new CRDs verified in products/catalyst/chart/crds/: CNPGPair, PDM, Sandbox
- Sandbox controller chain CODE-COMPLETE (PRs #1615/#1618/#1621/#1622/#1626/#1631/#1632)
- SPIRE marked DEFERRED, opt-in only (PRs #665, #2056, #2061)
- New §6 CI / supply-chain guards table: hollow-chart (#2087), smoke-render (#2093), no-auto-close-keyword (#2082), observability-toggle, subchart 4-step, Flux version-pin replay
- New §9 Pillar-status table: Pillars 1/2/3/4 CODE-COMPLETE, Pillar 5 🚧
- Pillar 1 (PRs #2038 V18, #2043 V18-D), Pillar 2 (PR #2029 V20), Pillar 3 (per above), Pillar 4 (Sandbox chain)
Part 3 - GLOSSARY.md folded as single source of truth for banned terms:
- Header dated 2026-05-20, notes "single source of truth for banned terms" and "no separate BANNED-TERMS.md"
- Existing 11 banned-terms rows rewritten with italicized qualifiers
- NEW Forbidden test domains subsection: openova.io (mothership-only), omantel.openova.io (hallucinated), Nova Cloud (predecessor brand), eventforge.io (hallucinated), admin.<fqdn> (dead BSS URL)
- SPIFFE/SPIRE identity row + acronym row marked deferred per PR #665 with TBD-V29 (#2055) re-introduction roadmap
- Cross-links updated: IMPLEMENTATION-STATUS → STATUS, SOVEREIGN-PROVISIONING + BLUEPRINT-AUTHORING → RUNBOOKS.md
CLAUDE.md NOT touched. Source files NOT deleted (orchestrator owns deletion). No push, no PR.
Manifest at /tmp/merge-D-runbooks-status-glossary-manifest.txt.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs: assemble lean doc strategy: delete legacy sources, move ledger/sessions/archive, ADR-0004, rewrite cross-refs
Per founder direction 2026-05-20 + user-global ~/.claude/CLAUDE.md §11.
This is the orchestrator commit on top of the four cherry-picked consolidation commits (ARCHITECTURE, PRINCIPLES, DOD, RUNBOOKS+STATUS+GLOSSARY). It:
1. Deletes 15 legacy source docs (now folded into the 7 canonical): PLATFORM-TECH-STACK, NAMING-CONVENTION, EPICS-1-6-unified-design, BOOTSTRAP-KIT-EXPANSION-PLAN, INVIOLABLE-PRINCIPLES, ANTI-PATTERN-CATALOG, 5-PILLAR-DOD, DOMAINS-CANON, SOVEREIGN-MULTI-REGION-DOD, PERSONAS-AND-JOURNEYS, BLUEPRINT-AUTHORING, CHART-AUTHORING, DEMO-RUNBOOK, RUNBOOK-OPERATIONS, RUNBOOK-PROVISIONING.
2. Moves transient + historical docs into proper subdirs:
- docs/ledger/{TRUST,TRACKER}.md (cron-refreshed live state)
- docs/sessions/{2026-05-17-convergence,2026-05-19-20-trust-recovery,2026-05-20-trust-audit,2026-05-20-walk-runbook}.md
- docs/archive/{validation-log,orchestrator-state,omantel-handover-wbs}.md
3. Adds docs/adr/0004-cnpg-sync-replication.md (Pillar 3 zero-tx-loss decision) + docs/adr/README.md index.
4. Updates CLAUDE.md reading-order + repo-structure block to match the lean strategy and current core/ tree (controllers/, marketplace/, etc.).
5. Sweeps all .md files + .github/workflows + scripts to repoint old doc paths to the new canonical homes. ADR cross-references kept intact (ADRs are immutable historical artifacts).
Operator-side cron scripts that still write to the old paths (/home/openova/bin/refresh-dod-dashboard.sh, refresh-wbs.sh and openova-private/bin/trust-audit.sh) need a one-line path update, flagged in the PR body.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(bootstrap-kit): update repo-root sentinel to docs/PRINCIPLES.md
The bootstrap-kit Go test used `docs/INVIOLABLE-PRINCIPLES.md` as its repo-root sentinel; the file no longer exists after the lean-doc consolidation (it's now `docs/PRINCIPLES.md`). Update the walker to match the new canonical filename.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
6d38089895 | deploy(bp-harbor): bump bootstrap-kit pin -> 1.2.19 + blueprint.yaml lockstep (auto, Refs TBD-A6 + TBD-A20, retry 2) | ||
|
|
59980125ed
|
fix(networkpolicy): egress to CNPG data-plane Pods, not cnpg-system operator NS (TBD-A39, Closes #1901) (#1911)
The CNPG operator runs in the `cnpg-system` namespace, but the actual
Postgres workload Pods reconcile into the same namespace as the CNPG
`Cluster` CR — for the auto-provisioned-DB blueprints that's
`.Release.Namespace` (e.g. `newapi`, `harbor`). A NetworkPolicy egress
rule that namespace-selects on `cnpg-system` reaches the operator pods
only, NOT the Postgres workloads — every 5432 connection times out.
Verified live on t31: `newapi-bp-newapi-newapi-pg-1` runs in `newapi`
ns with label `cnpg.io/cluster=newapi-bp-newapi-newapi-pg`, while
`newapi-bp-newapi-…` is stuck 1/2 Ready with 20 restarts because its
egress NP allows 5432 only to `cnpg-system`.
Fix: every affected NP now selects the Postgres workload Pods by the
operator-emitted `cnpg.io/cluster=<clusterName>` Pod label — namespace-
agnostic, survives the operator namespace being different from the
data-plane namespace.
Charts fixed (4):
- bp-newapi (1.4.22 → 1.4.23) — auto-provisions CNPG Cluster in
`.Release.Namespace`. Removed the bogus `namespaceLabel: cnpg-system`
egress entry from values.yaml; added a podSelector-based rule
(cnpg.io/cluster=<release>-bp-newapi-newapi-pg) directly in the
template, gated by `.Values.cnpg.enabled`.
- bp-harbor (1.2.17 → 1.2.18) — Cluster CR in
`postgres.cluster.namespace | default .Release.Namespace` (default
`harbor`). Changed egress from namespaceSelector=cnpg to
podSelector cnpg.io/cluster=<postgres.cluster.name|default harbor-pg>.
- bp-matrix (1.0.0 → 1.0.1) — chart points at
matrix-postgres-rw.matrix.svc.cluster.local (Cluster CR in
`.Release.Namespace`). Replaced `cnpgNamespace` value with
`cnpgClusterName` (default `matrix-postgres`) and switched egress
rule to podSelector.
- bp-openmeter (1.0.0 → 1.0.1) — operator-supplied CNPG endpoint
pattern. Replaced `cnpgNamespace` with `cnpgClusterName` (default
`openmeter-pg`) and switched egress rule to podSelector. Same
pattern as matrix.
Audited and clean:
- bp-cnpg-pair: already uses podSelectors throughout.
- bp-wordpress-tenant: cnpgNamespaceLabel="" path resolves to
`.Release.Namespace` via the `cnpgNamespace` helper.
- bp-llm-gateway: already pod-selects on
`cnpg.io/cluster=bp-llm-gateway-audit`.
- bp-keycloak / bp-gitea / bp-grafana / bp-mimir: no own
networkpolicy.yaml template (grafana/mimir pass enabled=false
to upstream subcharts).
Validation:
- helm template render clean for all 4 charts.
- `kubectl apply --dry-run=server` on t31 — all 4 NetworkPolicies
accepted by the API server.
- Verbatim render confirms the auto-emitted cluster name matches the
label on the existing CNPG Pod (newapi-bp-newapi-newapi-pg).
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
|
||
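The selector change in 59980125ed can be illustrated with a small model of how NetworkPolicy peers match Pods. This is a minimal sketch, not the chart's template: the `Pod` class, both helper functions, and the operator Pod name are hypothetical, and it assumes the standard Kubernetes behaviour that a bare `podSelector` peer only matches Pods in the policy's own namespace.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    namespace: str
    labels: dict = field(default_factory=dict)


def allowed_by_namespace_peer(pod: Pod, namespace: str) -> bool:
    """Old rule: the egress peer is 'any Pod in the named namespace'."""
    return pod.namespace == namespace


def allowed_by_pod_peer(pod: Pod, policy_namespace: str, match_labels: dict) -> bool:
    """New rule: a bare podSelector peer matches Pods in the policy's own
    namespace whose labels include every key/value in match_labels."""
    if pod.namespace != policy_namespace:
        return False
    return all(pod.labels.get(k) == v for k, v in match_labels.items())


# The Postgres Pod name and label follow the t31 example in the commit
# message; the operator Pod name is hypothetical.
operator = Pod("cnpg-operator", "cnpg-system")
postgres = Pod(
    "newapi-bp-newapi-newapi-pg-1",
    "newapi",
    {"cnpg.io/cluster": "newapi-bp-newapi-newapi-pg"},
)
pods = (operator, postgres)
selector = {"cnpg.io/cluster": "newapi-bp-newapi-newapi-pg"}

# Old rule reaches only the operator; new rule reaches the Postgres workload.
old_rule = [p.name for p in pods if allowed_by_namespace_peer(p, "cnpg-system")]
new_rule = [p.name for p in pods if allowed_by_pod_peer(p, "newapi", selector)]
```

Under these assumptions the old rule never reaches the database Pod on port 5432, which matches the timeout symptom described in the commit.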
|
|
0a45a790e7
|
fix: omit HTTPRoute sectionName across blueprint charts — match PR #1888 pattern (Closes #1902) (#1909)
PR #1888 (TBD-A30) fixed catalyst-system HTTPRoutes for multi-zone Sovereigns whose Cilium Gateway renames HTTPS listeners from `https` to `https-<sanitised-zone>` (e.g. `https-omani-works`, `https-omani-homes`) when more than one parent zone is enabled. Every public HTTPRoute pinned to `sectionName: https` got `Accepted=False NoMatchingListener` and the hosted service 404'd / connection-refused.
That fix only touched products/catalyst/chart. Per-blueprint HTTPRoutes shipped the same `sectionName: https` default in values.yaml, so on a multi-zone Sovereign every blueprint route (gitea, grafana, harbor, keycloak, newapi, openbao, powerdns, stalwart-tenant) silently failed to attach. TBD-A40 / issue #1902.
Sweep verbatim:
$ git grep -nE 'sectionName:[[:space:]]+(https|"https")[[:space:]]*$' \
    platform/*/chart/ products/ clusters/ core/ 2>/dev/null \
    | grep -v 'platform/gateway-api/chart/templates'
platform/gitea/chart/values.yaml:168: sectionName: https
platform/grafana/chart/values.yaml:124: sectionName: https
platform/harbor/chart/values.yaml:437: sectionName: https
platform/keycloak/chart/values.yaml:482: sectionName: https
platform/newapi/chart/values.yaml:721: sectionName: https
platform/openbao/chart/values.yaml:72: sectionName: https
platform/powerdns/chart/values.yaml:407: sectionName: https
platform/stalwart-tenant/chart/values.yaml:297: sectionName: https
products/catalyst/bootstrap/api/internal/handler/sme_tenant_gitops.go:802: sectionName: https
Fix (Option C: omit sectionName, same as PR #1888):
- 8 blueprint values.yaml defaults flipped from `sectionName: https` to `sectionName: ""`. The chart templates already guard with `{{- with .Values.gateway.parentRef.sectionName }}`, so a blank value drops the field entirely and Cilium Gateway matches by hostname filter.
- platform/newapi/chart/templates/httproute.yaml was the outlier: it used `default "https" $parent.sectionName` which fell back to `https` even when values.yaml said empty. Rewritten to `{{- with $parent.sectionName }}` so empty drops the field, the same pattern as the other 7 blueprints.
- products/catalyst/bootstrap/api/internal/handler/sme_tenant_gitops.go renders a per-tenant bp-keycloak HelmRelease and injected `sectionName: https` into spec.values. Flipped to `sectionName: ""` so the bp-keycloak chart's `{{- with }}` guard drops the field.
Validation (real `helm template`, default values, gateway enabled, no sectionName override), per Principle #15:
gitea : sectionName lines in rendered output = 0
grafana : sectionName lines in rendered output = 0
harbor : sectionName lines in rendered output = 0
keycloak : sectionName lines in rendered output = 0
openbao : sectionName lines in rendered output = 0
powerdns : sectionName lines in rendered output = 0
newapi : sectionName lines in rendered output = 0
stalwart-tenant : sectionName lines in rendered output = 0
Override path preserved: `--set ...parentRef.sectionName=https-omani-works` on each chart renders `sectionName: "https-omani-works"` correctly, so operators on single-zone clusters or non-Cilium gateways can still pin explicitly via bootstrap-kit overlay.
helm lint clean on all 8 blueprint charts (newapi cnpg-cluster.yaml lint error is pre-existing on origin/main, unrelated to this fix).
Chart bumps (each blueprint also bumps blueprint.yaml spec.version per #817 lockstep):
bp-gitea 1.2.7 -> 1.2.8
bp-grafana 1.0.1 -> 1.0.2
bp-harbor 1.2.17 -> 1.2.18
bp-keycloak 1.4.5 -> 1.4.6
bp-newapi 1.4.22 -> 1.4.23
bp-openbao 1.2.16 -> 1.2.17
bp-powerdns 1.2.3 -> 1.2.4
bp-stalwart-tenant 0.1.2 -> 0.1.3
Refs TBD-A40.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
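The difference between the two template idioms named in 0a45a790e7 can be mimicked in a few lines. This is a rough sketch in Python rather than Helm; the function names are hypothetical and the output format is simplified to the single `sectionName` line.

```python
from typing import Optional


def render_with_guard(section_name: str) -> Optional[str]:
    """Mimics `{{- with $parent.sectionName }}`: an empty value emits nothing."""
    if not section_name:
        return None
    return f'sectionName: "{section_name}"'


def render_with_default(section_name: str) -> str:
    """Mimics `default "https" $parent.sectionName`: an empty value silently
    falls back to https, which is what pinned the wrong listener."""
    return f'sectionName: "{section_name or "https"}"'
```

With an empty value the guarded form drops the field, while the `default` form still renders `https`; an explicit override such as `https-omani-works` renders identically in both.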
|
|
cf35b4a9b6
|
fix(ci): blueprint.yaml spec.version lockstep in auto-bump (Closes #1856) (#1858)
A17 (#1855) hot-patched 6 drifted blueprints (cilium, cert-manager, flux, openbao, keycloak, gitea) where blueprint.yaml spec.version had silently fallen behind chart/Chart.yaml version, breaking TestBootstrapKit_BlueprintCardsHaveRequiredFields. The structural root cause: the TBD-A6 auto-bump hook in blueprint-release.yaml updated only clusters/_template/bootstrap-kit/<N>-<chart>.yaml pins on every chart publish, never the upstream platform/<bp>/blueprint.yaml.
This PR extends the auto-bump hook to lockstep platform/<bp>/blueprint.yaml spec.version whenever Chart.yaml version bumps. Both file edits land in the SAME commit (subject becomes `deploy(<chart>): bump bootstrap-kit pin X -> Y (auto, Refs TBD-A6)` with a secondary line noting the blueprint lockstep). Idempotent reset-and-rewrite retry preserved for the existing parallel-matrix race case.
Workflow changes (.github/workflows/blueprint-release.yaml):
* New step `bump_blueprint` after `bump_pin`: locates ${matrix.path}/blueprint.yaml OR ${matrix.path}/chart/blueprint.yaml (handles both platform-leaf and products-umbrella conventions), filters to kind:Blueprint (defensive against CRD yaml at the products/catalyst/chart/crds path), reads current spec.version at 2-space indent, sed-rewrites to CHART_VERSION, verifies post-write.
* Commit step renamed to "Commit + push bootstrap-kit pin bump + blueprint.yaml lockstep"; stages both files, single commit, with convergent retry on conflict.
* Summary block surfaces both bumps separately.
Regression test (tests/e2e/bootstrap-kit/main_test.go):
* New TestBootstrapKit_BlueprintVersionLockstepSweep: walks platform/* and products/*, discovers every Blueprint manifest with a sibling Chart.yaml, asserts spec.version == Chart.yaml version. Covers ALL ~70 blueprints, not just the canonical 10 kit ones the existing TestBootstrapKit_BlueprintCardsHaveRequiredFields gates.
* Failure messages name the file, drift direction, and the exact sed command to fix, so drift remediation is mechanical.
Drift cleanup (mandatory companion, same shape as A17/#1855):
26 Application-Blueprint blueprints whose spec.version had been left at 1.0.0 / 0.1.0 while Chart.yaml moved forward, synced down to Chart.yaml as authoritative. All currently surface in the new sweep test; without the cleanup the test would block this PR (and every subsequent one).
Affected: alloy, cert-manager-{dynadot,powerdns}-webhook, cluster-autoscaler-hcloud, cnpg, crossplane-claims, external-secrets[-stores], falco, grafana, guacamole, harbor, hcloud-csi, k8s-ws-proxy, mimir, netbird, newapi, openclaw, powerdns, seaweedfs, self-sovereign-cutover, trivy, valkey, velero, vpa, products/dmz-vcluster.
After this lands, the next chart-version bump in any platform/<bp>/ folder auto-converges all three artifacts (Chart.yaml, blueprint.yaml, bootstrap-kit pin) in a single bot commit. No more manual collector PRs; no more silent drift between chart and Blueprint manifest.
Closes #1856. Refs #1855 (A17 hot-patch this replaces structurally), #1713 (original TBD-A6 auto-bump hook).
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
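The lockstep invariant from cf35b4a9b6 (blueprint.yaml `spec.version` equals Chart.yaml `version`) can be sketched as a text-level check. This is an illustration only, not the repository's Go sweep test: the function names are hypothetical, and it assumes `spec.version` sits at 2-space indent under a top-level `spec:` key, as the commit message describes.

```python
import re
from typing import Optional


def chart_version(chart_yaml: str) -> Optional[str]:
    """Top-level `version:` from Chart.yaml text (ignores `appVersion:`)."""
    m = re.search(r"^version:\s*(\S+)\s*$", chart_yaml, re.MULTILINE)
    return m.group(1).strip("\"'") if m else None


def blueprint_spec_version(blueprint_yaml: str) -> Optional[str]:
    """`version:` at 2-space indent inside the top-level `spec:` block."""
    in_spec = False
    for line in blueprint_yaml.splitlines():
        if re.match(r"^\S", line):  # a new top-level key starts here
            in_spec = line.startswith("spec:")
            continue
        m = re.match(r"^  version:\s*(\S+)\s*$", line)
        if in_spec and m:
            return m.group(1).strip("\"'")
    return None


def drift(chart_yaml: str, blueprint_yaml: str) -> Optional[str]:
    """Return a human-readable drift message, or None when in lockstep."""
    chart = chart_version(chart_yaml)
    blueprint = blueprint_spec_version(blueprint_yaml)
    if chart == blueprint:
        return None
    return f"spec.version {blueprint} != Chart.yaml version {chart}"


# Sample manifests; the field values are illustrative.
CHART = 'apiVersion: v2\nname: bp-harbor\nversion: 1.2.18\nappVersion: "2.11"\n'
BLUEPRINT = "kind: Blueprint\nmetadata:\n  name: bp-harbor\nspec:\n  version: 1.2.17\n"
message = drift(CHART, BLUEPRINT)
```

A real check would more likely parse the YAML properly; the regex form is used here only to keep the sketch dependency-free.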
|
|
3d929e69d7
|
fix(httproute): collapse double-prefix when releaseName contains chart name (gitea/harbor/openbao 500/404) (#1483)
* fix(tls): cilium-gateway-cert STAGING/PROD issuer selectable via tofu
clusters/_template/sovereign-tls/cilium-gateway-cert.yaml hardcoded letsencrypt-dns01-prod-powerdns regardless of qa_test_session_enabled. On high-cadence QA reprov cycles this hits the LE PROD 5/168h rate limit (caught on prov #76 at 13:45 UTC, retry-after 16:49 UTC) and the wildcard Certificate sticks Ready=False: Cilium Gateway has no valid TLS secret → envoy listener never binds → public TLS handshake to console.<fqdn> dies with SSL_ERROR_SYSCALL.
Add tofu local.wildcard_cert_issuer = qa_test_session_enabled ? staging : prod. Thread WILDCARD_CERT_ISSUER through the sovereign-tls Kustomization postBuild.substitute. cilium-gateway-cert.yaml references it as ${WILDCARD_CERT_ISSUER}.
Default behaviour unchanged for non-QA (production) Sovereigns: they still resolve to letsencrypt-dns01-prod-powerdns.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cilium-gateway): allow world ingress to Cilium Gateway reserved:ingress endpoint
When Cilium Gateway API runs with gatewayAPI.hostNetwork.enabled=true and a default-deny CCNP is present, every public request to a Sovereign host (console, auth, gitea, registry, api, ...) hits the gateway listener and gets DENIED at envoy's cilium.l7policy filter with:
    cilium.l7policy: Ingress from 1 policy lookup for endpoint X for port 30443: DENY
Public response: HTTP/1.1 403 Forbidden, body "Access denied", server: envoy.
Root cause: Cilium creates a special endpoint with identity reserved:ingress (8) representing the gateway listener. By default this endpoint has policy-enabled=both with allowed-ingress-identities=[1 (host)] and empty L4 rules, so no port is permitted. The default-deny CCNP's NotIn-namespace endpointSelector does NOT cover this endpoint (it has no io.kubernetes.pod.namespace label), and our qa-fixtures didn't ship a matching allow-template for it. Net effect: TLS handshake succeeds, HTTPRoutes are Programmed, backends are healthy in-cluster, but every request 403s.
Caught live on prov #80 (omantel.biz, 2026-05-14) after the Gateway hostNetwork fix (#1480) finally activated host-bind on :30443. Verified by:
- envoy debug log: cilium.l7policy DENY for endpoint 10.42.0.201 port 30443
- cilium-dbg endpoint get 3282 -o json: l4.ingress: [] and allowed-ingress-identities: [1]
- transiently applying the same CCNP via kubectl: console.omantel.biz → 200
Fix: ship a CCNP scoped to reserved:ingress that allows ingress from world, cluster, host, remote-node (multi-region CP-to-CP), and kube-apiserver, plus egress to all so envoy can forward to any backend service. This is the canonical Cilium hostNetwork Gateway-API zero-trust pattern.
Chart bump: catalyst 1.4.142 → 1.4.143.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(httproute): match upstream chart fullname-collapse when releaseName contains chart name
Three Sovereign-facing HTTPRoute templates (gitea, harbor, openbao) had backend defaults hardcoded as `<release>-<chart>-<resource>` (e.g. `gitea-gitea-http`, `harbor-harbor-core`, `openbao-openbao`). The upstream subcharts use a `<chart>.fullname` helper that COLLAPSES the prefix when `.Release.Name` already contains the chart name. That is, when the bootstrap-kit releaseName is the chart name (the convention), the live Service is `<release>-<resource>` (or just `<release>` for openbao), not `<release>-<chart>-<resource>`.
Effect on prov #80 (omantel.biz):
- gitea/gitea HTTPRoute → backendRef `gitea-gitea-http` (does not exist; live is `gitea-http`) → BackendNotFound → gitea.omantel.biz returns HTTP 500
- harbor/harbor HTTPRoute → `harbor-harbor-core` (live is `harbor-core`) → registry.omantel.biz returns HTTP 500
- openbao/openbao HTTPRoute → `openbao-openbao` (live is `openbao`) → bao.omantel.biz dead
Fix: replicate the upstream chart's `.fullname` collapse logic via `(ternary .Release.Name (printf "%s-<chart>" .Release.Name) (contains "<chart>" .Release.Name))` so the default backend always matches the live Service name regardless of releaseName choice. Operators retain the `gateway.backendService` override for non-standard release names.
Chart bumps: bp-gitea 1.2.6 → 1.2.7, bp-harbor 1.2.16 → 1.2.17, bp-openbao 1.2.14 → 1.2.15.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: e3mrah <catalyst@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: e3mrah <1234567+e3mrah@users.noreply.github.com> |
||
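The fullname-collapse rule quoted in 3d929e69d7 (`ternary .Release.Name (printf "%s-<chart>" .Release.Name) (contains "<chart>" .Release.Name)`) translates directly into a conditional. The sketch below is a Python rendering of that expression; `fullname` and `default_backend` are hypothetical names, and the resource suffixes come from the examples in the commit message.

```python
def fullname(release: str, chart: str) -> str:
    """Collapse the prefix when the release name already contains the chart name."""
    return release if chart in release else f"{release}-{chart}"


def default_backend(release: str, chart: str, resource: str = "") -> str:
    """Default Service name for an HTTPRoute backendRef; an empty resource
    means the Service is named after the fullname alone (the openbao case)."""
    base = fullname(release, chart)
    return f"{base}-{resource}" if resource else base
```

For the conventional release names this yields `gitea-http`, `harbor-core`, and `openbao`, while a non-standard release such as `scm` (hypothetical) would still get the un-collapsed `scm-gitea-http`.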
|
|
74d23ab3dc
|
fix(charts): explicit harbor.openova.io/proxy-dockerhub prefix on all chart-hook images (#163) (#1367)
Per CLAUDE.md MIRROR-EVERYTHING inviolable rule: every chart-hook image reference (pre/post-install Jobs, helper Pods) must use the explicit Harbor proxy-cache form. Fix #158's bitnami → bitnamilegacy swap was a band-aid; the architecturally correct fix is to defeat upstream-deletion blast radius entirely by routing through Harbor.
The node-level containerd mirror in infra/hetzner/cloudinit-control-plane.tftpl (line 706) already redirects docker.io/* → harbor.openova.io/proxy-dockerhub/* implicitly, but implicit routing:
- Hides the routing from SBOM scans
- Bypasses the Kyverno harbor-proxy-pull ClusterPolicy
- Means a chart audit (`grep docker.io`) misses a real dependency
- Was the proximate cause of prov #27 wedging when Bitnami deleted docker.io/bitnami/kubectl:1.30.4 (Fix #158 had to chase the deletion mid-flight instead of being insulated by Harbor cache)
19 chart-hook image: refs + 5 chart values.yaml repository: defaults now carry the explicit harbor.openova.io/proxy-dockerhub prefix. Application/subchart images (keycloak, postgresql, mongodb in keycloak+litmus subcharts) are intentionally out of scope for this PR; those still go through the node-level containerd mirror.
Affected blueprints + chart version bumps:
bp-cert-manager 1.2.1 -> 1.2.2
bp-external-secrets-stores 1.0.4 -> 1.0.5
bp-crossplane-claims 1.1.4 -> 1.1.5
bp-flux 1.2.1 -> 1.2.2
bp-guacamole 0.1.16 -> 0.1.17
bp-self-sovereign-cutover 0.1.28 -> 0.1.29
bp-k8s-ws-proxy 0.1.9 -> 0.1.10
bp-harbor 1.2.15 -> 1.2.16
bp-gitea 1.2.5 -> 1.2.6
bp-newapi 1.4.5 -> 1.4.6
bp-wordpress-tenant 0.2.0 -> 0.2.1
catalyst-platform 1.4.138 -> 1.4.139
Co-authored-by: e3mrah <1234567+e3mrah@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
890fa67eff
|
fix(bp-harbor): inline labels on admin Secret to drop duplicate keys (#949) (#950)
PR #947 (bp-harbor 1.2.14) added templates/admin-secret.yaml that included the canonical bp-harbor.labels helper AND re-declared app.kubernetes.io/name + catalyst.openova.io/component with admin-credential-specific values. Helm's strict YAML post-render parser rejected the rendered manifest with `mapping key "app.kubernetes.io/name" already defined at line 8`, blocking the upgrade chain on otech113: bp-self-sovereign-cutover dependsOn bp-harbor and re-blocked, stalling cutover indefinitely.
Per the issue's recommended Option A, labels are inlined verbatim on the admin Secret. Every key the helper would emit is reproduced explicitly, except the two that need a Secret-specific value (catalyst.openova.io/component=harbor-admin) plus an explicit admin-credentials sub-component label.
A regression guard (Case 6) is added to tests/admin-secret.sh: the rendered Secret block is parsed through PyYAML's safe_load_all, which enforces mapping-key uniqueness the same way Helm's post-render does. Duplicate keys raise and break the test.
Bumps:
- platform/harbor/chart/Chart.yaml 1.2.14 → 1.2.15
- clusters/_template/bootstrap-kit/19-harbor.yaml slot pin
Verification (all green locally):
    helm template smoke . --namespace harbor   # renders OK
    bash tests/admin-secret.sh                 # 6 gates green
    helm lint .                                # 0 failed
Closes one half of #949 (bp-harbor side); the slot pin update delivers it to fresh Sovereigns; existing otech113 picks up the upgrade on next Flux reconcile after the new chart publishes.
Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com> |
||
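The duplicate-label failure in 890fa67eff comes down to the same key appearing twice in one mapping. The commit's guard uses PyYAML; the sketch below is a separate, stdlib-only illustration that only handles a flat `key: value` block such as a rendered `labels:` section. The function name and the sample label values are hypothetical.

```python
from collections import Counter
from typing import List


def duplicate_keys(block: str) -> List[str]:
    """Return keys that appear more than once in a flat `key: value` block."""
    keys = []
    for line in block.splitlines():
        stripped = line.strip()
        # Skip blanks, comments, and lines that are not key/value pairs.
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        keys.append(stripped.split(":", 1)[0].strip().strip("\"'"))
    return [key for key, count in Counter(keys).items() if count > 1]


# A labels block where a helper's output and an inline re-declaration collide.
LABELS = """\
app.kubernetes.io/name: harbor
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: harbor-admin
"""
found = duplicate_keys(LABELS)
```

A nested or multi-document manifest would need a real YAML parser; this only shows the uniqueness check itself.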
|
|
88a8ecd8bb
|
fix(cutover): Reflector-mirror harbor-admin Secret + in-cluster trigger endpoint (#935) (#947)
Two bugs surfaced live on otech113 2026-05-05 blocking Self-Sovereignty Cutover end-to-end. Fix both in lockstep:
Bug 1: bp-self-sovereign-cutover Step 02 (harbor-projects) Job in `catalyst` namespace was hitting `secret "harbor-core" not found` for 11+ retries because the upstream Harbor `harbor-core` Secret only exists in the `harbor` namespace and Kubernetes forbids cross-namespace secretKeyRef. Step 02 was stuck in CreateContainerConfigError forever.
Fix: bp-harbor 1.2.13 → 1.2.14 ships a Catalyst-curated `harbor-admin` Secret in the `harbor` namespace with Reflector mirror annotations (allowed-namespaces=catalyst, auto-enabled). The same Secret name auto-materialises in `catalyst` so the cutover Job's secretKeyRef resolves natively. Password is randomly generated on first install (32-char alphanum, 190 bits entropy per feedback_passwords.md) and preserved across reconciles via `lookup`. The upstream Harbor subchart consumes it via `existingSecretAdminPassword: harbor-admin`. bp-self-sovereign-cutover 0.1.16 → 0.1.17 updates `harbor.adminSecretRef.name` from `harbor-core` to `harbor-admin`.
Bug 2: The 0.1.16 auto-trigger Helm post-install Job (#933) POSTed /api/v1/sovereign/cutover/start which sits behind RequireSession middleware. The Job has no human session cookie, so every request 401'd forever and cutover never started.
Fix: new catalyst-api endpoint POST /api/v1/internal/cutover/trigger lives OUTSIDE RequireSession and validates the bearer token via the apiserver's TokenReview API + checks the resolved username matches the canonical `bp-self-sovereign-cutover-runner` SA. Same engine, same idempotency, same state machine; different auth surface. The auto-trigger Job now mounts its projected SA token at /var/run/secrets/kubernetes.io/serviceaccount/token and sends it as `Authorization: Bearer <token>`. SA username + accepted list are runtime-overridable per Inviolable Principle #4.
Tests
- 6 Go unit tests for HandleCutoverInternalTrigger covering happy path, missing bearer (401), TokenReview rejection (502), wrong SA (403), idempotency (no Jobs created when complete), wrong method (405). All pass.
- bp-harbor admin-secret contract test (5 cases): Secret renders, HARBOR_ADMIN_PASSWORD key present, Reflector annotations, keep policy, upstream consumes via existingSecretAdminPassword.
- bp-self-sovereign-cutover cutover-contract test extended with 3 new cases: auto-trigger uses /internal/cutover/trigger, sends SA bearer token, references harbor-admin (not harbor-core).
- All 12 cutover-contract gates green; all 4 observability-toggle gates green; helm template + helm lint clean on both charts.
Bootstrap-kit slot pins
- clusters/_template/bootstrap-kit/19-harbor.yaml: 1.2.13 → 1.2.14
- clusters/_template/bootstrap-kit/06a-bp-self-sovereign-cutover.yaml: 0.1.16 → 0.1.17
Closes #935
Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
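The auth flow of the internal trigger endpoint in 88a8ecd8bb can be sketched as a pure decision function. This is not the Go handler: the check order, the 200 success code, and the `catalyst` namespace are assumptions, `review` is a hypothetical stand-in for a TokenReview call, and only the status codes for the failure cases are taken from the commit's test list.

```python
from typing import Callable, Optional

# SA name is from the commit message; the namespace default is an assumption.
EXPECTED_SA = "bp-self-sovereign-cutover-runner"


def trigger_status(
    method: str,
    authorization: Optional[str],
    review: Callable[[str], Optional[str]],
    namespace: str = "catalyst",
) -> int:
    """Map a request to an HTTP status.

    `review(token)` stands in for TokenReview: it returns the resolved
    username, or None when the token is rejected.
    """
    if method != "POST":
        return 405  # wrong method
    if not authorization or not authorization.startswith("Bearer "):
        return 401  # missing bearer
    username = review(authorization[len("Bearer "):])
    if username is None:
        return 502  # TokenReview rejection
    if username != f"system:serviceaccount:{namespace}:{EXPECTED_SA}":
        return 403  # wrong ServiceAccount
    return 200  # assumed success code


# A fake reviewer keyed by token, for illustration only.
KNOWN = {
    "good": f"system:serviceaccount:catalyst:{EXPECTED_SA}",
    "other": "system:serviceaccount:catalyst:default",
}
fake_review = KNOWN.get
```

The `system:serviceaccount:<namespace>:<name>` username format is the standard Kubernetes form for ServiceAccount identities.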
|
|
6baf7e56e7
|
fix(bp-harbor): grep-oE for password (multi-line tolerant) (chart 1.2.13) (#651)
Co-authored-by: hatiyildiz <hatiyildiz@openova.io> |
||
|
|
d519dc8ba2
|
fix(bp-harbor): switch sync Job to curl-against-apiserver (chart 1.2.12) (#650)
rancher/kubectl is distroless (no /bin/sh) so the inline shell script can't run. Replace with curlimages/curl which has alpine sh + curl. Talk to k8s API directly via the in-pod ServiceAccount token. The PATCH merges password + HARBOR_DATABASE_PASSWORD into the existing pre-install-hook Secret without touching annotations.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io> |
||
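The curl-against-apiserver approach in d519dc8ba2 amounts to one authenticated PATCH on a Secret. The sketch below only builds the request and sends nothing; it assumes a JSON merge patch against the in-cluster `kubernetes.default.svc` endpoint, and the function name, namespace, and Secret name used in the example call are hypothetical.

```python
import base64
import json
from typing import Dict, Tuple

# Standard in-pod ServiceAccount token location.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_secret_patch(
    namespace: str, name: str, values: Dict[str, str], token: str
) -> Tuple[str, Dict[str, str], str]:
    """Return (url, headers, body) for a merge patch that sets Secret data keys.

    Only `data` is present in the body, so a merge patch leaves metadata
    such as annotations untouched.
    """
    url = f"https://kubernetes.default.svc/api/v1/namespaces/{namespace}/secrets/{name}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/merge-patch+json",
    }
    encoded = {k: base64.b64encode(v.encode()).decode() for k, v in values.items()}
    return url, headers, json.dumps({"data": encoded})


# Example call; in a Pod the token would be read from TOKEN_PATH.
url, headers, body = build_secret_patch(
    "harbor",
    "example-db-secret",  # hypothetical Secret name
    {"password": "s3cret", "HARBOR_DATABASE_PASSWORD": "s3cret"},
    token="<token>",
)
```

The two data keys mirror the ones named in the commit message; Secret values are base64-encoded because the `data` field requires it.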
|
|
08432b540e
|
fix(bp-harbor): switch sync Job to rancher/kubectl (chart 1.2.11) (#649)
bitnami/kubectl moved to sha256-only tags; bitnami/kubectl:1.31.4 returns 'not found' from Docker Hub. rancher/kubectl is always available on k3s clusters. Bumps chart 1.2.10 -> 1.2.11.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io> |
||
|
|
de51fa3f7a
|
fix(bp-harbor): post-install Job copies CNPG password (chart 1.2.10) (#648)
* fix(wizard): SOLO default CPX42 → CPX52 (8→12 vCPU / 16→24 GB)
CPX42 fit 30/40 HRs on otech29 but keycloak-keycloak-config-cli
post-upgrade Job sat Pending 8h with 'Insufficient cpu' — 35-component
bootstrap-kit + post-install hooks at peak exceed 8 vCPU. CPX52 (12
vCPU / 24 GB / €36/mo) is the smallest SKU that schedules every default
Pod on one node.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* test(bp-openbao): align Case-4 expectation with #600 RBAC-hook removal
Commit
|
||
|
|
8bb66fe43e
|
fix(bp-{harbor,gitea,powerdns}): bp-cnpg dependsOn + Reflector auto-enabled (#644)
* fix(infra): break tofu cycle — resolve CP public IP at boot via metadata service PR #546 (Closes #542) introduced a dependency cycle: hcloud_server.control_plane.user_data → local.control_plane_cloud_init local.control_plane_cloud_init → hcloud_server.control_plane[0].ipv4_address `tofu plan` failed with: Error: Cycle: local.control_plane_cloud_init (expand), hcloud_server.control_plane Caught live during otech23 first-end-to-end provisioning attempt. Fix: stop templating `control_plane_ipv4` at plan time. cloud-init runs ON the CP node, so it resolves its own public IPv4 at boot via Hetzner's metadata service: curl http://169.254.169.254/hetzner/v1/metadata/public-ipv4 Same observable behavior as #546 (kubeconfig server: rewritten to CP public IP, not LB IP — preserves the wizard-jobs-page-not-stuck-PENDING fix), with no graph cycle. Co-authored-by: hatiyildiz <hatiyildiz@openova.io> * fix(infra+api): wire handover_jwt_public_key end-to-end The OpenTofu cloud-init template references ${handover_jwt_public_key} (infra/hetzner/cloudinit-control-plane.tftpl:371) and variables.tf declares the variable, but neither side wires it: - main.tf templatefile() call did not pass the key → "vars map does not contain key handover_jwt_public_key" on tofu plan - provisioner.writeTfvars never set the var → empty even when wired Caught live during otech23 provisioning, immediately after the tofu-cycle fix landed. tofu plan failed with: Error: Invalid function argument on main.tf line 170, in locals: 170: control_plane_cloud_init = replace(templatefile(... Invalid value for "vars" parameter: vars map does not contain key "handover_jwt_public_key", referenced at ./cloudinit-control-plane.tftpl:371,9-32. 
Fix:
- main.tf templatefile() now passes handover_jwt_public_key = var.handover_jwt_public_key
- provisioner.Request gains a HandoverJWTPublicKey field (json:"-", server-stamped, never accepted from client JSON)
- handler.CreateDeployment stamps it from h.handoverSigner.PublicJWK() when the signer is configured (CATALYST_HANDOVER_KEY_PATH set)
- writeTfvars emits the value into tofu.auto.tfvars.json
variables.tf default "" preserves the no-signer path: cloud-init writes an empty handover-jwt-public.jwk and the new Sovereign is provisioned without the handover-validation surface (the handover flow is simply not wired on that Sovereign; it degrades gracefully rather than failing hard).
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(api): cloud-init kubeconfig postback must live outside RequireSession
The PUT /api/v1/deployments/{id}/kubeconfig route was registered inside the RequireSession-gated chi.Group, so every cloud-init postback was rejected with HTTP 401 {"error":"unauthenticated"} before PutKubeconfig could run. Cloud-init has no browser session cookie; it authenticates with the SHA-256-hashed bearer token PutKubeconfig already verifies internally.
Result on otech23: Phase 0 finished (Hetzner CP + LB up), but every cloud-init `curl --retry 60 -X PUT ... /kubeconfig` returned 401 unauth. catalyst-api never received the kubeconfig, Phase 1 helmwatch never started, the wizard's Jobs page stayed in PENDING forever.
Fix: register the PUT outside the auth group so cloud-init's bearer-hash auth path is the only gate. The matching GET stays inside session auth, because the operator's "Download kubeconfig" button needs the session cookie.
Caught live during otech23 first end-to-end provisioning. Per the new "punish-back-to-zero" rule, otech23 was wiped (Hetzner + PDM + PowerDNS + on-disk state) and the next provision will use otech24.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(catalyst-api): wire harbor_robot_token through to tofu - never pull from docker.io
PR #557 added the registries.yaml mirror in cloudinit-control-plane.tftpl and declared var.harbor_robot_token in infra/hetzner/variables.tf with a default of "". The catalyst-api side never set it, so every Sovereign so far provisioned with an empty token in registries.yaml: containerd's auth to harbor.openova.io's proxy projects failed silently and pulls fell through to docker.io. On a fresh Hetzner IP, Docker Hub returns rate-limit HTML and:
Failed to pull image "rancher/mirrored-pause:3.6": unexpected media type text/html for sha256:...
cilium / coredns / local-path-provisioner sit at Init:0/6 forever; Flux pods stay Pending; no HelmReleases ever land; the wizard's job stream shows everything PENDING because there's nothing to watch. Caught live during otech24.
Wiring (mirrors the GHCRPullToken pattern):
1. Provisioner.HarborRobotToken: read from CATALYST_HARBOR_ROBOT_TOKEN env at New().
2. Stamped onto every Request in Provision() and Destroy() before writeTfvars.
3. Request.HarborRobotToken: server-stamped (json:"-"); never accepted from the wizard payload.
4. writeTfvars emits "harbor_robot_token" into tofu.auto.tfvars.json.
5. api-deployment.yaml mounts the catalyst/harbor-robot-token Secret (mirrored from openova-harbor; Reflector-managed on Sovereign clusters, copied per-namespace on Catalyst-Zero contabo) as CATALYST_HARBOR_ROBOT_TOKEN, optional=true so degraded paths still come up.
variables.tf default "" preserves graceful fall-through if the operator hasn't issued a robot token yet, and the architecture rule is now enforced end-to-end: every image on every Sovereign goes through harbor.openova.io.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(handler): stamp CATALYST_HARBOR_ROBOT_TOKEN before Validate() (#638 follow-up)
PR #638 added Validate() rejection for missing harbor_robot_token, but the handler only stamped req.HarborRobotToken from p.HarborRobotToken inside Provision(), and Validate() runs in the handler BEFORE Provision() gets the chance to stamp. Result: every wizard launch returned
Provisioning rejected: Harbor robot token is required (CATALYST_HARBOR_ROBOT_TOKEN missing)
even though the env var is set on the Pod. Caught immediately on the otech25 launch attempt.
Fix: same env-stamp pattern as GHCRPullToken at the top of the CreateDeployment handler. Provisioner-level stamp in Provision() stays as defense-in-depth.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(infra): registries.yaml needs rewrite - Harbor proxy URL is /v2/<proj>/<repo>, not /<proj>/v2/<repo>
PR #557 wrote registries.yaml with mirror endpoints like
https://harbor.openova.io/proxy-dockerhub
hoping containerd would build URLs like
https://harbor.openova.io/proxy-dockerhub/v2/rancher/mirrored-pause/manifests/3.6
But Harbor proxy-cache projects expose their API at
https://harbor.openova.io/v2/proxy-dockerhub/rancher/mirrored-pause/manifests/3.6
(the project name is the first segment of the image path AFTER /v2/, not a path prefix in front of it). Harbor returns its SPA UI HTML (status 200, content-type text/html) for the wrong shape; containerd then errors with:
"unexpected media type text/html for sha256:... not found"
and pause-image / cilium / coredns pulls fail forever. Caught live during otech24 and otech25.
Fix: switch to k3s registries.yaml `rewrite` syntax. Endpoint is the bare Harbor host; per-mirror rewrite re-maps the image path so containerd's final URL is correctly project-prefixed.
Verified manually:
curl https://harbor.openova.io/v2/proxy-dockerhub/rancher/mirrored-pause/manifests/3.6 -> 200 application/vnd.docker.distribution.manifest.list.v2+json
This unblocks every Sovereign image pull through the central Harbor.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(bp-vpa): drop registry.k8s.io/ prefix from repository - upstream chart prepends it
cowboysysop/vertical-pod-autoscaler subchart prepends `.image.registry` (default registry.k8s.io) to `.image.repository`. Catalyst's bp-vpa overrode `repository: registry.k8s.io/autoscaling/vpa-...` so the rendered image was `registry.k8s.io/registry.k8s.io/autoscaling/vpa-...:1.5.0`: doubled prefix, image-not-found, ImagePullBackOff on every fresh Sovereign. Caught live during otech26.
Fix: drop the redundant prefix. Subchart's default `.image.registry` keeps it pointing at registry.k8s.io which the new Sovereign's containerd routes through harbor.openova.io/v2/proxy-k8s/... via registries.yaml rewrite (#640). Bumps bp-vpa chart version to 1.0.1 and bootstrap-kit reference to match.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(wizard): SOLO default SKU CPX32 → CPX42 - 35-component bootstrap-kit needs 8 vCPU / 16 GB
CPX32 (4 vCPU / 8 GB) cannot fit the full SOLO bootstrap-kit on a single node. Caught live during otech26: 38 pods Running, 34 pods stuck Pending indefinitely with "Insufficient cpu". Cilium + Crossplane + Flux + cert-manager + CNPG + Keycloak + OpenBao + Harbor + Gitea + Mimir + Loki + Tempo + … each request 50-500m vCPU and the node hits 100% allocatable before half the workloads schedule.
CPX42 (8 vCPU / 16 GB / 320 GB SSD) at €25.49/mo is the smallest size that fits the bootstrap-kit with VPA-recommendation headroom. Operators can still pick CPX32 explicitly if they trim the component set on StepComponents, but the default SOLO path now provisions a node that actually boots into a steady state.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(bp-cert-manager-dynadot-webhook): pin SHA tag + add ghcr-pull imagePullSecret (chart 1.1.2)
- Replace forbidden `:latest` tag with current short-SHA `942be6f` per docs/INVIOLABLE-PRINCIPLES.md #4.
- Add default `webhook.imagePullSecrets: [{name: ghcr-pull}]` so kubelet authenticates against private ghcr.io/openova-io/openova/* via the Reflector-mirrored `ghcr-pull` Secret in cert-manager namespace. Without this, the webhook Pod was stuck ErrImagePull/ImagePullBackOff on every Sovereign (caught live during otech27).
- Bumps chart version 1.1.1 -> 1.1.2 and bootstrap-kit reference.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
* fix(bp-{harbor,gitea,powerdns}): add bp-cnpg dependency + Reflector auto-enabled
Two related Phase-8a stragglers diagnosed live during otech28:
1. bp-powerdns missed bp-cnpg in dependsOn. Helm renders BEFORE postgresql.cnpg.io/v1 CRD is registered → templates/cnpg-cluster.yaml `Capabilities.APIVersions.Has` gate evaluates false → no Cluster CR → no pdns-pg-app Secret → powerdns Pods stuck CreateContainerConfigError forever ("secret pdns-pg-app not found"). Adds explicit dependsOn.
2. bp-harbor/gitea/powerdns CNPG inheritedMetadata only set reflection-allowed; missing reflection-auto-enabled. Reflector races when destination Secret (harbor-database-secret) is created BEFORE CNPG provisions the source (harbor-pg-app). Reflector logs "Source could not be found" once and never retries, leaving harbor-core stuck CreateContainerConfigError. Adding auto-enabled makes Reflector actively watch the source and re-fire when it appears.
Bumps:
- bp-harbor 1.2.8 -> 1.2.9
- bp-gitea 1.2.1 -> 1.2.2
- bp-powerdns 1.1.5 -> 1.1.7 (skips 1.1.6 which was a non-released bump)
Bootstrap-kit references updated to pull the new chart versions on the next Sovereign provisioning.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
---------
Co-authored-by: hatiyildiz <hatiyildiz@openova.io> |
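The registries.yaml fix in the squash above turns on where the Harbor project name sits in the manifest URL. As a minimal sketch (the function names are hypothetical and exist only for illustration; the host, project, and image are the ones quoted in the commit message), the two URL shapes can be compared like this:

```python
def harbor_proxy_manifest_url(host: str, project: str, repository: str, reference: str) -> str:
    """Shape the commit message reports as working: /v2/<project>/<repository>/..."""
    return f"https://{host}/v2/{project}/{repository}/manifests/{reference}"


def prefixed_endpoint_manifest_url(host: str, project: str, repository: str, reference: str) -> str:
    """Shape produced when the project is used as an endpoint path prefix (reported as broken)."""
    return f"https://{host}/{project}/v2/{repository}/manifests/{reference}"


if __name__ == "__main__":
    args = ("harbor.openova.io", "proxy-dockerhub", "rancher/mirrored-pause", "3.6")
    print(harbor_proxy_manifest_url(*args))
    print(prefixed_endpoint_manifest_url(*args))
```

The first URL is the one the commit verified with curl; the second is the shape for which Harbor reportedly answered with its UI HTML.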
||
|
|
93627ada20
|
fix(bp-harbor): convert harbor-database-secret to Helm pre-install hook (1.2.8) (#603)
The 1.2.7 fix dropped the `data:` block from the chart template, but
Helm's three-way merge still owns the Secret as a release resource and
resets `data: {}` (no keys) on every chart upgrade — verified on otech22
where 1.2.6→1.2.7 reconcile wiped Reflector-populated keys back to nil.
Architectural fix: convert the Secret to a Helm pre-install hook.
- `helm.sh/hook: pre-install` — Secret is created at install time only.
On `helm upgrade`, Helm does NOT touch the Secret (no three-way merge),
so keys populated by Reflector persist across every chart bump.
- `helm.sh/hook-delete-policy: before-hook-creation` — On a re-install,
Helm deletes the previous Secret first so the hook recreates clean.
- `helm.sh/resource-policy: keep` — `helm uninstall` does NOT delete the
Secret (paired with hook means standard upgrade path never sees a delete).
- Hook resources are NOT recorded in the Helm release manifest, so they're
invisible to `helm upgrade`'s three-way merge.
Also drops the inline `data:` block (kept from 1.2.7) — Reflector still
populates everything from harbor-pg-app once CNPG bootstraps the source.
Bumps bp-harbor 1.2.7 → 1.2.8, bootstrap-kit refs (_template, otech, omantel).
Closes #585
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
09208ca58f
|
fix(bp-harbor): omit data block in harbor-database-secret — Helm overwrite regression (1.2.7) (#602)
On every helm upgrade, Helm three-way merge resets `data.password` and `data.HARBOR_DATABASE_PASSWORD` to "" because the chart declares them empty in the template. After Reflector populates them from `harbor-pg-app`, the next bp-harbor upgrade silently empties them again; harbor-core then crashloops on the next pod restart with "password authentication failed".
Observed on otech22 after the 1.2.5→1.2.6 Flux upgrade: harbor-database-secret.password went from 64 bytes back to 0 bytes, harbor-core entered CrashLoopBackOff. Resolved at runtime by touching harbor-pg-app to bump its resourceVersion and re-trigger Reflector, but the architectural fix is needed so it doesn't recur on the next chart upgrade.
Fix: drop the entire `data:` block from templates/database-secret.yaml. The Secret is created by Helm with no data keys (Helm owns nothing in the data field). Reflector adds ALL keys from `harbor-pg-app` (password, HARBOR_DATABASE_PASSWORD, username, host, dbname, jdbc-uri, etc.) on the first SecretWatcher event after CNPG bootstraps the source. On subsequent helm upgrades, Helm's three-way merge has nothing to overwrite in `data:` because the chart no longer declares any keys there.
Bumps bp-harbor 1.2.6 → 1.2.7, bootstrap-kit refs (_template, otech, omantel).
Closes #585 (regression of)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
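The overwrite described in this commit can be illustrated with a deliberately simplified model of a three-way merge over flat key/value maps. This is a sketch under stated assumptions, not Helm's actual strategic-merge implementation: `three_way_merge` is a hypothetical helper, and the Secret contents are made-up example values.

```python
def three_way_merge(original: dict, modified: dict, current: dict) -> dict:
    """Simplified model: keys the chart declares win over live state;
    keys the chart used to declare but dropped are removed;
    live keys the chart never declared are left alone."""
    merged = dict(current)
    for key in original:
        if key not in modified:
            merged.pop(key, None)  # chart stopped declaring this key
    merged.update(modified)        # chart-declared values overwrite live ones
    return merged


if __name__ == "__main__":
    # Live Secret after Reflector populated it (example values only).
    live = {"password": "s3cret", "username": "harbor"}

    # Chart declares the key as empty: the upgrade empties the live value.
    declared = {"password": ""}
    print(three_way_merge(declared, declared, live))

    # Chart declares no data keys: the live values survive the upgrade.
    print(three_way_merge({}, {}, live))
```

In this model the fix corresponds to the second call: once the template declares nothing under `data:`, there is no chart-owned value left to write over what Reflector copied in.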
||
|
|
8d50402038
|
fix(bp-harbor): remove cnpg-app-annotator Job — CNPG inheritedMetadata handles annotation (1.2.6) (#601)
The post-install Job `harbor-pg-app-annotator` (with curlimages/curl:8.7.1) is no longer needed: bp-harbor 1.2.5 already uses CNPG's `inheritedMetadata` stanza in cnpg-cluster.yaml to stamp `reflection-allowed: true` onto `harbor-pg-app` at CNPG bootstrap time. The Job was causing ErrImagePull on otech22 because Docker Hub is proxied through Harbor itself (chicken-and-egg).
Removes:
- templates/cnpg-app-annotator-job.yaml
- templates/cnpg-app-annotator-rbac.yaml
- values.yaml cnpgAnnotator section
Updates database-secret.yaml comment to reflect the inheritedMetadata approach.
Bumps Chart.yaml 1.2.5 → 1.2.6, bootstrap-kit refs (_template, otech, omantel).
Closes #585
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
cba1b5070a
|
fix(bp-gitea+harbor): use CNPG inheritedMetadata to propagate reflector annotations to pg-app Secret (#595)
The Cluster CR `metadata.annotations` are NOT propagated by CNPG onto the
generated `{name}-app` Secrets. Reflector requires the SOURCE Secret (e.g.
`gitea-pg-app`) to carry `reflection-allowed: "true"` before it will copy
data into the DESTINATION Secret (`gitea-database-secret`). On otech22 this
caused `gitea-database-secret` to stay empty indefinitely — gitea init container
failed auth with "password authentication failed for user gitea".
Fix: use CNPG's `inheritedMetadata.annotations` stanza (v1.24+) to instruct
CNPG to annotate all generated Secrets with the reflector permission annotations.
Applied to both bp-gitea (1.2.0→1.2.1) and bp-harbor (1.2.4→1.2.5) since
harbor-pg-app had the same issue.
Bootstrap-kit: bump bp-gitea chart ref 1.2.0→1.2.1 (template + otech + omantel).
Co-authored-by: alierenbaysal <alierenbaysal@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
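The gate described in this commit, where the source Secret must carry the permission annotation before anything is copied, can be sketched as follows. This is an illustrative model rather than Reflector's code: `reflect` is a hypothetical helper, Secrets are modelled as plain dicts, and the fully qualified annotation key is an assumption built from the `reflector.v1.k8s.emberstack.com/` prefix that appears elsewhere in this history.

```python
# Assumed fully qualified form of the `reflection-allowed` annotation.
ALLOWED = "reflector.v1.k8s.emberstack.com/reflection-allowed"


def reflect(source: dict, destination: dict) -> dict:
    """Return the destination's data after one reflection attempt.

    The copy happens only when the SOURCE carries the permission annotation;
    otherwise the destination keeps whatever it already had.
    """
    annotations = source.get("annotations") or {}
    if annotations.get(ALLOWED) != "true":
        return dict(destination.get("data") or {})
    return dict(source.get("data") or {})


if __name__ == "__main__":
    destination = {"name": "gitea-database-secret", "data": {}}

    # Annotations set only on the Cluster CR never reach the generated Secret.
    bare_source = {"name": "gitea-pg-app", "data": {"password": "example"}}
    print(reflect(bare_source, destination))

    # With inheritedMetadata, the generated Secret itself carries the annotation.
    annotated_source = dict(bare_source, annotations={ALLOWED: "true"})
    print(reflect(annotated_source, destination))
```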
|
||
|
|
fe03b8cc42
|
fix(bp-harbor): use curl for CNPG annotator PATCH + add values defaults (1.2.4) (#594)
busybox wget does not support --method=PATCH (only GET/POST). The harbor-pg-app-annotator Job silently succeeded without actually patching harbor-pg-app, leaving harbor-database-secret empty on fresh install.
Fixes:
1. Switch cnpg-app-annotator-job.yaml from busybox:1.36.1 + wget to curlimages/curl:8.7.1 + curl -X PATCH. curl natively supports all HTTP verbs. HTTP response code checked explicitly; non-2xx exits 1 so the Job retries instead of silently passing with no-op.
2. Add cnpgAnnotator.image stanza to values.yaml (was missing: prior charts defaulted via nil-safe dict fallback but the section was never actually written to values.yaml). Defaults to curlimages/curl:8.7.1.
3. readOnlyRootFilesystem: false (curl writes /tmp/patch-response.json for error diagnostics).
4. Bump chart 1.2.3 → 1.2.4.
Closes #585
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
97abf9dedb
|
fix(bp-harbor): nil-safe image value extraction in cnpg-app-annotator Job (#593)
.Values.cnpgAnnotator.image.repository triggers a nil-pointer error when the values tree is partially absent in Helm's default-values render. Use chained `| default dict` assignments to safely extract the image repo/tag/pullPolicy. Fixes blueprint-release smoke render failure on 1.2.3.
Closes #585
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
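The nil-safe lookup this commit describes has a direct analogue outside Helm templates. The following is a rough Python sketch of the same idea, not the chart's template code: `annotator_image` is a hypothetical helper, and the fallback values are assumptions chosen for illustration (busybox:1.36.1 is the image a later commit says the Job used at this point; the pull policy is invented).

```python
def annotator_image(values: dict) -> dict:
    """Walk values.cnpgAnnotator.image one level at a time, substituting an
    empty dict whenever a level is missing or None, so a partially absent
    tree never raises."""
    annotator = values.get("cnpgAnnotator") or {}
    image = annotator.get("image") or {}
    return {
        "repository": image.get("repository") or "busybox",       # assumed fallback
        "tag": image.get("tag") or "1.36.1",                      # assumed fallback
        "pullPolicy": image.get("pullPolicy") or "IfNotPresent",  # assumed fallback
    }


if __name__ == "__main__":
    print(annotator_image({}))                         # section absent entirely
    print(annotator_image({"cnpgAnnotator": None}))    # section present but null
    print(annotator_image({"cnpgAnnotator": {"image": {"tag": "1.37.0"}}}))
```

Each level is defaulted before the next is read, which is what makes the lookup tolerate a missing intermediate key.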
||
|
|
74d526c276
|
fix: bp-gateway-api 5→10 CRDs + bp-gitea CNPG + bp-harbor CNPG race fix + DAG audit (#592)
* fix(bp-gitea): switch to CNPG-managed postgres, drop bitnamilegacy subchart (Closes #584) The bundled Bitnami postgresql subchart pulls docker.io/bitnamilegacy/postgresql which is unavailable (DH deprecated namespace) — gitea-postgresql-0 stuck in ImagePullBackOff on otech22, cascading to gitea Init:CrashLoopBackOff. Mirrors the bp-harbor pattern (PR #578): provision a CNPG Cluster CR (gitea-pg, namespace gitea, 5Gi, pg16) + a reflector-managed gitea-database-secret, wiring GITEA__database__PASSWD from the CNPG-generated gitea-pg-app Secret. All Bitnami subchart config removed; postgresql.enabled: false. Bootstrap-kit (template + otech + omantel): bump bp-gitea 1.1.2 → 1.2.0, add dependsOn: bp-cnpg so the postgresql.cnpg.io/v1 CRD is registered before the Capabilities gate in cnpg-cluster.yaml fires. omantel overlay migrated from legacy ingress: to gateway: (Cilium Gateway API, issue #387). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(dependency-audit): add bp-reflector (5a) to expected DAG + external-dns dep edge bp-reflector was added to the bootstrap-kit (slot 05a) in issue #543 but was never registered in scripts/expected-bootstrap-deps.yaml, causing the dependency-graph-audit CI gate to error on every PR that includes this branch. Also declare bp-reflector in bp-external-dns's depends_on to match the actual HR file (12-external-dns.yaml dependsOn bp-reflector). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(bp-gateway-api): update CRD-count test 5→10 for experimental channel + DAG audit Two fixes to unblock bp-gateway-api:1.1.0 OCI publish and the dependency-graph-audit CI gate: 1. crd-render.sh: expect 10 CRDs (experimental channel) not 5. Chart 1.1.0 vendors experimental-install.yaml (TLSRoute, TCPRoute, UDPRoute, BackendLBPolicy, BackendTLSPolicy in addition to 5 standard CRDs) because Cilium 1.16.x checks for TLSRoute at operator startup. 
Without this fix the blueprint-release workflow for 1.1.0 fails the chart-test step and never pushes to GHCR — leaving all 13 dependent HRs stuck dependency-not-ready on every Sovereign. 2. expected-bootstrap-deps.yaml: add bp-reflector (slot 5a) and update bp-external-dns depends_on to include bp-reflector. bp-reflector was added to the bootstrap-kit in issue #543 but was missing from the expected DAG, causing dependency-graph-audit ERRORs on every PR. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: alierenbaysal <alierenbaysal@openova.io> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: hatiyildiz <hatice@openova.io> |
||
|
|
8d2ba0495d
|
fix(bp-gitea): switch to CNPG-managed postgres, drop bitnamilegacy subchart (Closes #584) (#586)
Squash merge: fix(bp-gitea) switch to CNPG-managed postgres (Closes #584) |
||
|
|
2adc3a9493
|
fix(bp-harbor): CNPG database must be 'registry' not 'harbor' — matches coreDatabase (#579)
Harbor upstream always connects to a database named 'registry' (harbor.database.external.coreDatabase default). The CNPG Cluster was initialised with database='harbor', causing:
FATAL: database "registry" does not exist (SQLSTATE 3D000)
Fix: change postgres.cluster.database default from 'harbor' → 'registry' in values.yaml and cnpg-cluster.yaml template. Both the CNPG bootstrap and Harbor's coreDatabase now use 'registry'.
Runtime fix on otech22: CREATE DATABASE registry OWNER harbor was run against harbor-pg-1. harbor-core is now 1/1 Running.
Bump bp-harbor 1.2.1 → 1.2.2. Bootstrap-kit refs updated.
Co-authored-by: alierenbaysal <alierenbaysal@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
b647aa2561
|
fix(bp-harbor): provision harbor-pg CNPG cluster + database-secret (Closes #566) (#578)
Replace Helm lookup in database-secret.yaml with reflector annotation: harbor-database-secret now reflects harbor-pg-app via reflector.v1.k8s.emberstack.com/reflects. This fixes the race between Helm rendering (fresh install) and CNPG cluster bootstrap: reflector is event-driven and propagates the CNPG password within seconds of harbor-pg-app being created, with no operator action required.
Also includes:
- templates/cnpg-cluster.yaml: harbor-pg CNPG Cluster (1 inst, 5Gi, pg16)
- values.yaml: postgres: block + database.external.host = harbor-pg-rw
- Chart 1.2.0 → 1.2.1; bootstrap-kit refs updated (_template, otech, omantel)
Co-authored-by: alierenbaysal <alierenbaysal@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
06844d3a70
|
fix(bp-external-dns): point NetworkPolicy egress + pdns-server at powerdns ns (Closes #569) (#573)
bp-powerdns was moved to the `powerdns` namespace in PR #556/#553, but bp-external-dns still had `powerdnsNamespace: openova-system` in its NetworkPolicy egress rule and `--pdns-server=...openova-system...` in extraArgs. Both pointed at the wrong namespace, blocking DNS reconciliation.
Fix:
- externalDns.networkPolicy.powerdnsNamespace: openova-system → powerdns
- extraArgs --pdns-server: ...openova-system... → ...powerdns...
Bump bp-external-dns 1.1.2 → 1.1.3. Bootstrap-kit slot 12 updated.
Co-authored-by: alierenbaysal <alierenbaysal@openova.io> |
||
|
|
0511efbdac
|
feat(bp-harbor): vendor-agnostic Object Storage backend (closes #383) (#437)
Reworks bp-harbor to write blobs DIRECTLY to the cloud-provider's
native S3 endpoint (Hetzner Object Storage on Hetzner Sovereigns)
per ADR-0001 §13. Mirrors the post-#425 vendor-agnostic seam shipped
in bp-velero:1.2.0 (PR #435 / SHA
|
||
|
|
a1bd550208
|
fix(charts): HTTPRoute templates skip-render on missing host (was failing default-values render) (#402)
Blueprint-release for #401 failed because HTTPRoute templates use
{{- fail }} when gateway.host is not set, which trips the chart default-values
render gate in CI. Switched 6 templates from 'fail loud' to 'skip render':
if .Values.gateway.host → emit HTTPRoute
else → emit nothing
The Gateway API admission already rejects HTTPRoute with empty hostnames,
so the loud-fail wasn't buying anything an operator wouldn't see at apply
time. Default-values render now produces zero HTTPRoute resources, which
is the correct shape for the upstream chart consumers that don't set
the Sovereign-only gateway block.
Files: keycloak, gitea, openbao, grafana, harbor, catalyst-platform.
Verified:
helm template t products/catalyst/chart/ → 0 HTTPRoutes (clean)
helm template t products/catalyst/chart/ --set ingress.gateway.enabled=true --set ingress.hosts.console.host=console.test --set ingress.hosts.api.host=api.test → 2 HTTPRoutes
Closes the blueprint-release failure on commit
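The switch from fail-loud to skip-render can be modelled in a few lines. This is only a sketch of the control flow, written in Python rather than Helm template syntax: `render_httproutes` and the resource name `example-route` are hypothetical, and the emitted manifest is trimmed to the fields that matter here.

```python
def render_httproutes(values: dict) -> list:
    """Emit one HTTPRoute when gateway.host is set, otherwise emit nothing.

    The earlier behaviour corresponded to raising an error in the empty-host
    case, which broke the default-values render in CI.
    """
    host = (values.get("gateway") or {}).get("host")
    if not host:
        return []  # skip render: default values produce zero HTTPRoutes
    return [{
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": "example-route"},  # hypothetical name
        "spec": {"hostnames": [host]},
    }]


if __name__ == "__main__":
    print(len(render_httproutes({})))
    print(len(render_httproutes({"gateway": {"host": "console.test"}})))
```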
|
||
|
|
abf01b6f21
|
feat(platform): Gateway API migration audit (#387) (#401)
Migrates every minimal-Sovereign-set blueprint chart from networking.k8s.io/v1.Ingress to gateway.networking.k8s.io/v1.HTTPRoute, replacing the legacy Traefik-on-Sovereigns assumption with the canonical Cilium + Envoy + Gateway API path per ADR-0001 §9.4 and the WBS §2 correction note (#388).
The single per-Sovereign Gateway is added as additional documents in the existing bootstrap-kit slot clusters/_template/bootstrap-kit/01-cilium.yaml (NOT a new top-level slot), since Cilium owns the GatewayClass. It includes:
- Certificate `sovereign-wildcard-tls` requesting `*.${SOVEREIGN_FQDN}` from `letsencrypt-dns01-prod` (cert-manager + #373 webhook)
- Gateway `cilium-gateway` in `kube-system` with HTTPS (443, TLS terminate) + HTTP (80) listeners, allowedRoutes.namespaces.from=All
Per-blueprint HTTPRoute templates (canonical seam: each wrapper chart's existing `templates/` directory):
| Blueprint | Host pattern | Backend port |
|---|---|---|
| bp-keycloak | auth.<sov> | 80 |
| bp-gitea | git.<sov> | 3000 |
| bp-openbao | bao.<sov> | 8200 |
| bp-grafana | grafana.<sov> | 80 |
| bp-harbor | registry.<sov> | 80 |
| bp-powerdns | pdns.<sov>/api (dual-mode) | 8081 |
| bp-catalyst-platform | console.<sov>, api.<sov> | 80, 8080 |
bp-powerdns supports both Ingress (contabo legacy) and HTTPRoute (Sovereign) simultaneously: the per-Sovereign overlay sets `api.gateway.enabled=true` while leaving `api.enabled=true`. The Ingress object is harmless on Cilium clusters with no Traefik. This preserves contabo's existing pdns.openova.io flow per ADR-0001 §9.4.
bp-harbor flips `expose.type` from `ingress` to `clusterIP` in platform/harbor/chart/values.yaml so the upstream chart no longer emits its own Ingress; the HTTPRoute is the sole HTTP exposure. TLS terminates at the Gateway (wildcard cert) rather than per-host Certificates inside the chart.
bp-catalyst-platform's `templates/httproute.yaml` is NOT excluded by .helmignore (unlike templates/ingress.yaml + templates/ingress-console-tls.yaml, which remain contabo-only legacy demo infra). The contabo path keeps serving console.openova.io/sovereign via Traefik unchanged.
Bootstrap-kit slot updates (per-Sovereign hostname interpolation):
- 08-openbao.yaml → gateway.host: bao.${SOVEREIGN_FQDN}
- 09-keycloak.yaml → gateway.host: auth.${SOVEREIGN_FQDN}
- 10-gitea.yaml → gateway.host: gitea.${SOVEREIGN_FQDN}
- 11-powerdns.yaml → api.host: pdns.${SOVEREIGN_FQDN}, api.gateway.enabled: true
- 19-harbor.yaml → gateway.host: registry.${SOVEREIGN_FQDN}
- 25-grafana.yaml → gateway.host: grafana.${SOVEREIGN_FQDN}
Server-side dry-run validation against the live Cilium Gateway API CRDs on contabo: every HTTPRoute and the per-Sovereign Gateway + Certificate apply cleanly via `kubectl apply --dry-run=server`.
Contabo unaffected: clusters/contabo-mkt/* not modified. The legacy SME ingresses (console-nova, marketplace, admin, axon, talentmesh, stalwart, ...) continue to serve via Traefik as before. powerdns on contabo remains on the Ingress path (api.gateway.enabled defaults to false at the chart level).
Closes #387.
Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ba2ff05292
|
feat(charts): bp-seaweedfs + bp-harbor + bp-vpa wrapper charts (#284)
W2.5.B — first authoring of the three Catalyst Blueprint wrapper charts
that fill bootstrap-kit slots 18 (seaweedfs), 19 (harbor) and 29 (vpa).
Each wraps an upstream chart as a Helm subchart and ships Catalyst-
curated overlay templates (NetworkPolicy + ServiceMonitor) gated behind
opt-in toggles, per docs/BLUEPRINT-AUTHORING.md §11 and
docs/INVIOLABLE-PRINCIPLES.md.
bp-seaweedfs (slot 18 — storage foundation)
- Wraps seaweedfs/seaweedfs 4.22.0; Chart name `bp-seaweedfs`.
- Catalyst defaults: 1 master + 3 volume + 1 filer + 2 s3 replicas.
- S3 API on 8333 — single S3 surface every consumer talks to per
docs/PLATFORM-TECH-STACK.md §3.5 (no per-app MinIO).
- Overlay templates: NetworkPolicy (cross-namespace S3 reachability,
cold-tier egress allowlist), ServiceMonitor (Capabilities-gated,
DEFAULT FALSE per §11.2).
- Default helm template kinds: ClusterRole, ClusterRoleBinding,
ConfigMap, Deployment, Secret, Service, ServiceAccount, StatefulSet.
bp-harbor (slot 19 — per-Sovereign OCI registry)
- Wraps goharbor/harbor 1.18.3 (appVersion 2.14.3); Chart name
`bp-harbor`.
- Catalyst defaults: blob backend = SeaweedFS S3 (regionendpoint
seaweedfs-s3.seaweedfs.svc:8333), metadata DB = bp-cnpg external
Postgres, ingress class `cilium`, expose.tls.enabled true (cert-
manager-issued Secret).
- Overlay templates: NetworkPolicy (CNPG/SeaweedFS/Keycloak egress),
ServiceMonitor (Capabilities-gated, DEFAULT FALSE).
- Trivy + SSO + pull-mirror are operator-flag opt-ins per per-
Sovereign overlay (default false; trivy/keycloak/cnpg deps land on
later slots).
- Default helm template kinds: ConfigMap, Deployment, Ingress,
PersistentVolumeClaim, Secret, Service, StatefulSet.
bp-vpa (slot 29 — vertical autoscaling)
- Wraps cowboysysop/vertical-pod-autoscaler 11.1.1 (appVersion
1.5.0); Chart name `bp-vpa`.
- Catalyst defaults: 1 replica each of recommender + updater +
admission-controller. Default mode `Off` (recommend only).
- Admission webhook self-signs via init Job (cluster-internal); per-
Sovereign overlay MAY swap to cert-manager.
- Overlay templates: NetworkPolicy (apiserver + metrics-server
egress, admission webhook ingress).
- Upstream metrics.serviceMonitor / metrics.prometheusRule defaulted
false per §11.2.
- Default helm template kinds: ClusterRole, ClusterRoleBinding,
ConfigMap, Deployment, Job, Pod, Secret, Service, ServiceAccount.
Lint + observability-toggle results
helm lint: 1 chart(s) linted, 0 chart(s) failed (each)
tests/observability-toggle.sh: PASS on all three (default render has
zero monitoring.coreos.com/v1 references; opt-in render produces a
ServiceMonitor; explicit-off render is clean).
Path isolation: only platform/seaweedfs/, platform/harbor/, and
platform/vpa/ — no HR slot files or other charts touched.
Refs: bootstrap-kit slots 18, 19, 29 reconcile against
ghcr.io/openova-io/bp-seaweedfs:1.0.0, bp-harbor:1.0.0, bp-vpa:1.0.0
which this commit produces on next blueprint-release CI run.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
|
||
|
|
7cafa3c894 |
docs(seaweedfs+guacamole): replace MinIO with SeaweedFS as unified S3 encapsulation; add Guacamole to bp-relay
Component-level architectural correction (two changes): 1. MinIO → SeaweedFS as unified S3 encapsulation layer The old design used MinIO for in-cluster S3 plus separate cold-tier configuration scattered across consumers. The new design positions SeaweedFS as the single S3 encapsulation layer: every Catalyst component talks to one endpoint (seaweedfs.storage.svc:8333). SeaweedFS internally handles hot tier (in-cluster NVMe), warm tier (in-cluster bulk), and cold tier (transparent passthrough to cloud archival storage — Cloudflare R2 / AWS S3 / Hetzner Object Storage / etc., chosen at Sovereign provisioning). One audit/lifecycle/encryption boundary instead of N. No Catalyst component talks to cloud S3 directly anymore — Velero, CNPG WAL archive, OpenSearch snapshots, Loki/Mimir/Tempo, Iceberg, Harbor blob store, Application buckets all share one S3 surface. 2. Apache Guacamole added as Application Blueprint §4.5 Communication Clientless browser-based RDP/VNC/SSH/kubectl-exec gateway. Keycloak SSO, full session recording to SeaweedFS for compliance evidence (PSD2/DORA/SOX). Composed into bp-relay. Replaces VPN+native-client distribution for auditable remote access. Component changes: - DELETED: platform/minio/ - CREATED: platform/seaweedfs/README.md (unified S3 + cold-tier encapsulation; bucket layout; multi-region replication via shared cold backend; migration-from-MinIO section) - CREATED: platform/guacamole/README.md (clientless remote-desktop gateway; GuacamoleConnection CRD; compliance integration via session recordings) Doc updates: PLATFORM-TECH-STACK §1+§3.5+§4.5+§5+§7.4; TECHNOLOGY-FORECAST L11+mandatory+a-la-carte counts (52 → 53); ARCHITECTURE §3 topology; SECURITY §4 DB engines; SOVEREIGN-PROVISIONING §1 inputs; SRE §2.5+§7; IMPLEMENTATION-STATUS §3; BLUEPRINT-AUTHORING stateful examples; BUSINESS-STRATEGY 13 component-count anchors + Relay product line; README.md backup row; CLAUDE.md folder count. 
Component README updates (S3 endpoint + dependency renames): cnpg, clickhouse, flink, gitea, iceberg, harbor, grafana, livekit, kserve, milvus, opensearch, flux, stalwart, velero (substantive rewrite of velero — now writes exclusively to SeaweedFS with cold-tier auto-routing). Products: relay, fabric. UI scaffold: products/catalyst/bootstrap/ui/src/shared/constants/components.ts — minio entry replaced with seaweedfs; velero+harbor deps updated; new guacamole entry added. VALIDATION-LOG entry "Pass 104 — MinIO → SeaweedFS swap + Guacamole add" captures the encapsulation principle and adds Lesson #22: storage tier policy belongs at the encapsulation boundary, not inside every consumer. Verification: zero remaining MinIO references in canonical docs (one intentional retention in TECHNOLOGY-FORECAST L37 explaining the swap); 53 platform/ folders matching all "53 components" anchors; bp-relay composition includes guacamole. |
||
|
|
2a1d6f5d3f |
docs(pass-41): SOVEREIGN-PROVISIONING §4 + minio namespace drift across 3 components
SOVEREIGN-PROVISIONING.md §4 (Phase 1 Hand-off) "self-sufficient" list had 6 items vs PLATFORM-TECH-STACK §2.3's 6 control-plane supporting services. List was missing SPIRE (5-min rotating SVIDs — critical to SECURITY model) and observability (Grafana stack — Catalyst's self-monitoring). Same drift category as Pass 40: summary list drifted independently from canonical reference. Added both, plus enumerated the §2.1+§2.2 services in the "Catalyst control plane" bullet. Mid-pass sweep finding: kserve L217 used minio.minio-system.svc but canonical minio README declares namespace: storage (L70). Three other components also used minio-system: milvus L78, harbor L145. Fixed all three to align with canonical `storage` namespace per PLATFORM-TECH-STACK §3.5. Drift likely came from Helm-chart upstream defaults. platform/kserve substantively clean apart from namespace fix. Pass 41 lesson: union-equality check applies to ALL summary passages in canonical docs. When a passage enumerates items derived from a canonical source list, count both and verify equality. |
||
|
|
4043e1d51c |
docs(pass-32): registry-DNS sweep — harbor.<domain> across 9 component READMEs
Pass 25's deferred sweep, executed. Image refs of the form
harbor.<domain>/... (and one registry.<domain>/... in temporal) collapse
the location-code segment. Per NAMING §5.1, Catalyst per-host-cluster
Harbor DNS is harbor.{location-code}.{sovereign-domain} (e.g.
harbor.hfmp.openova.io).
Fixed (11 instances, 9 files):
- anthropic-adapter, bge (×2), debezium, harbor (×2 — ingress + Kyverno
policy), knative (×2 — serving + traffic-split), llm-gateway, strimzi,
trivy — all standardized to harbor.<location-code>.<sovereign-domain>.
- temporal had two drift items in one line: registry.<domain> (off-spec
placeholder — Catalyst's only per-host-cluster registry is Harbor) AND
legacy "fuse" namespace (renamed to bp-fabric per BUSINESS-STRATEGY
§16.2 / Pass 26). Rewritten to fabric/order-worker.
Out of scope (deliberate): :latest tag hygiene, and whether Application
Blueprint READMEs should reference ghcr.io/openova-io/bp-<name>:<semver>
vs the Sovereign Harbor mirror. Stalwart customer-email-domain <domain>
placeholders preserved (correct semantics). external-dns illustrative
gslb/api/svc.<domain> preserved (upstream-doc generic).
With Pass 29 (canonical-doc DNS) + Pass 31 (carry-over fixes) + Pass 32
(image registry), the recurring DNS-placeholder collapse drift category
is addressed end-to-end.
Validation log Pass 32 entry added.
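A sweep for the collapsed placeholder form described above could look something like the sketch below. It assumes the only off-spec forms are the literal strings `harbor.<domain>` and `registry.<domain>`; the function name `find_collapsed_refs` and the sample image paths are illustrative, not taken from the repository.

```python
import re

# Matches the collapsed forms "harbor.<domain>" and "registry.<domain>".
# The canonical form "harbor.<location-code>.<sovereign-domain>" does not match.
COLLAPSED = re.compile(r"\b(?:harbor|registry)\.<domain>")


def find_collapsed_refs(text):
    """Return (line_number, matched_text) for every collapsed registry reference."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in COLLAPSED.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Illustrative README excerpt with hypothetical image paths.
sample = "\n".join([
    "image: harbor.<domain>/ai/bge:1.0",
    "image: harbor.<location-code>.<sovereign-domain>/ai/bge:1.0",
    "image: registry.<domain>/fabric/order-worker:1.0",
])

for number, ref in find_collapsed_refs(sample):
    print(number, ref)
```

Placeholders with other semantics, such as the Stalwart customer-email-domain `<domain>`, are not matched by this pattern because it requires the `harbor.` or `registry.` prefix.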
|
||
|
|
eff264b077 |
docs(pass-17): ARCHITECTURE OAM table pipe-fix + Harbor README de-drift
Pass 17 — drift-detection sweep on ARCHITECTURE + harbor. Two real
findings.
ARCHITECTURE §13 (OAM table):
- `| Trait | Blueprint overlay (`overlays/small|medium|large`) |`
has pipe chars inside backticks inside a Markdown table cell —
a known GFM rendering hazard. Replaced with comma-separated
examples.
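The hazard is that GFM splits table cells on every unescaped `|`, including ones inside inline code spans. A rough detector for this pattern might look like the sketch below; the function name `rows_with_pipe_in_code` is hypothetical, and the check deliberately handles only single-backtick code spans on rows that start with a pipe.

```python
import re

# Single-backtick inline code spans only; a deliberate simplification.
CODE_SPAN = re.compile(r"`[^`]*`")
# A pipe that is not preceded by a backslash.
UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")


def rows_with_pipe_in_code(markdown):
    """Return (line_number, code_span) for table rows whose code spans contain a bare pipe."""
    hits = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        if not line.lstrip().startswith("|"):
            continue  # not a table row in this simplified check
        for span in CODE_SPAN.findall(line):
            if UNESCAPED_PIPE.search(span):
                hits.append((number, span))
    return hits


before = "| Trait | Blueprint overlay (`overlays/small|medium|large`) |"
after = "| Trait | Blueprint overlay (`overlays/small`, `overlays/medium`, `overlays/large`) |"

print(rows_with_pipe_in_code(before))
print(rows_with_pipe_in_code(after))
```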
platform/harbor/README.md:
- The banner added in Pass 9 said "every host cluster runs a
Harbor instance" but the body still described an older
"Harbor Primary / Harbor Replica" cross-region replication
topology. Same shape of architectural drift Pass 7 caught in
OpenBao/ESO/Gitea/Flux — banner-add doesn't rewrite the body.
- Three sections rewritten:
* Overview mermaid: now shows upstream-OCI → multiple
independent per-cluster Harbors with local Trivy scan + local
Pod pulls.
* "Multi-Region Replication" → "Per-host-cluster mirroring (NOT
primary-replica)". Single source of truth = upstream OCI
(ghcr.io/openova-io/* for Catalyst+Blueprints, customer CI for
application images), not a "primary Harbor".
* Example replication policy: was a `dest_registry` cross-region
push policy → now a pull-mirror policy from ghcr.io with
scheduled-cron trigger.
- "Why Mandatory" table reframed in per-host-cluster terms.
VALIDATION-LOG: Pass 17 entry added with the specific drift-detection
lesson — banner-addition passes don't catch body-level drift; need
explicit body re-reads.
Refs #37
|
||
|
|
a52bda30cb |
docs(pass-9b): retry banners on harbor / falco / sigstore / syft-grype
Pass 9's commit
|
||
|
|
c9d04a53b4 |
refactor: flatten platform/ structure (41 components)
Remove hierarchical grouping (networking/, security/, etc.) and use flat
structure for all 41 platform components.
Changes:
- All components now directly under platform/ (no subfolders)
- AI Hub components moved from meta-platforms/ai-hub/components/ to platform/
- Open Banking components (lago, openmeter) moved to platform/
- meta-platforms/ now only contains README files that reference platform/
- Open Banking custom services remain in meta-platforms/open-banking/services/
Structure:
- platform/ (41 components, flat)
- meta-platforms/ai-hub/ (README only, references platform/)
- meta-platforms/open-banking/ (README + 6 custom services)
All documentation links updated.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> |