openova/clusters/_template/bootstrap-kit/27-kyverno.yaml
e3mrah 7f2a121a9a
feat(security/kyverno): split policies into bp-kyverno-policies@1.0.0 Blueprint (Refs #2019) (#2022)
* feat(security/kyverno): split policies into bp-kyverno-policies@1.0.0 Blueprint

Splits the 20 EPIC-1 (#1096) compliance ClusterPolicy templates out of
bp-kyverno (engine umbrella chart) into a dedicated Blueprint
bp-kyverno-policies@1.0.0 with its own HelmRelease, ordered via HR-to-HR
dependsOn on bp-kyverno in the bootstrap-kit Kustomization.

WHY (the bug we're killing):
PR #1138 (2026-05-08) shipped 20 ClusterPolicy templates with
`enabled: false` defaults → dead-on-arrival for 11 days. PR #1933
(2026-05-19) flipped 18 defaults to `enabled: true` + bumped chart
1.1.0 → 1.2.0 + bumped the bootstrap-kit pin — but hit a CRD install-
ordering race on fresh prov t33: ClusterPolicy CRs (in
templates/policies/baseline/*.yaml) and Kyverno CRDs (in upstream
charts/crds/templates/) render in the SAME Helm pass, and the
apiserver's RESTMapper has not yet learned kyverno.io/v1.ClusterPolicy
when Helm applies the ClusterPolicy CRs. PR #1935 reverted ONLY the
bootstrap-kit pin (1.2.0 → 1.1.0) — chart source kept claiming policies
were on by default while the deployed pin pulled an engine-only artifact
with zero policies. "Theater on theater" — founder walk on t34 confirmed
GET /api/v1/sovereigns/<id>/compliance/policies returns `policyCount=0`,
only `useraccess-boundary` (from bp-crossplane-claims) was installed.

The structural fix is splitting the chart so the engine + CRDs reconcile
+ register first, THEN the policy chart applies its CRs cleanly. Audit
mode default = non-blocking (admission still passes, PolicyReport rows
populate). Operators flip individual policies to Enforce per-Sovereign
overlay or via EnvironmentPolicy.spec.compliance.modes (slice C2
controller path — separate work item).
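
A minimal sketch of what Audit mode means on a single policy, assuming the
spec-level `validationFailureAction` field of `kyverno.io/v1` (newer Kyverno
releases also accept a per-rule failure action). The policy name and rule are
hypothetical and are not one of the 20 shipped templates:

```yaml
# Hypothetical example policy, for illustration only.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: example-require-team-label      # hypothetical name
spec:
  # Audit: admission still passes; violations are recorded in PolicyReports.
  # Changing this to Enforce would make the webhook reject violating resources.
  validationFailureAction: Audit
  background: true                      # also scan resources that already exist
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Pods should carry a `team` label."
        pattern:
          metadata:
            labels:
              team: "?*"                # any non-empty value
```

With Audit as the default, turning a policy on cannot break running workloads;
the Enforce flip stays an explicit per-Sovereign decision.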

CHANGES:

1. NEW chart `platform/kyverno-policies/chart/`:
   - Chart.yaml: name=bp-kyverno-policies, version=1.0.0, no subchart deps
   - values.yaml: `compliancePolicies:` block moved verbatim from bp-kyverno
     (defaults: 18 enabled+Audit, 2 intentionally OFF — `hubbleFlowsSeen`
     stub for W2 evaluator, `cosignVerified` until operator supplies PEM)
   - templates/baseline/01-..20-*.yaml: 20 ClusterPolicy templates moved
     via `git mv` (preserves blame; preserves PR #1933's 3 operator fixes
     — regex_match JMESPath + operator: Equals for 11/12/19)
   - tests/fixtures/: moved with the policies (fixtures reference policy
     output, not engine output)
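
For orientation, the new chart's metadata could look roughly like the sketch
below. Only `name`, `version`, and the absence of subchart dependencies come
from this change; `type` and `description` are assumptions:

```yaml
# platform/kyverno-policies/chart/Chart.yaml (sketch, not the committed file)
apiVersion: v2                 # Helm 3 chart format
name: bp-kyverno-policies
version: 1.0.0
type: application              # assumed
description: Baseline compliance ClusterPolicies for Kyverno   # assumed wording
# No `dependencies:` block: the Kyverno engine and its CRDs come from
# bp-kyverno, which is installed first by a separate HelmRelease.
```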

2. ENGINE chart `platform/kyverno/chart/`:
   - Chart.yaml: 1.2.0 → 1.2.1 (policies removed, so the source no longer
     claims compliance content that the deployed chart does not ship)
   - values.yaml: `compliancePolicies:` block deleted (now lives in
     bp-kyverno-policies)
   - templates/clusterpolicy-mutate-add-openova-labels.yaml +
     ...require-openova-labels.yaml KEPT (engine-coupled mutating policies,
     EPIC-0 label-vocab E1/E2, defaults OFF; a separate concern from the
     EPIC-1 compliance library)
   - Empty `templates/policies/` directory removed

3. NEW bootstrap-kit slot
   `clusters/_template/bootstrap-kit/27a-kyverno-policies.yaml`:
   - HelmRelease bp-kyverno-policies pinned at chart `1.0.0`
   - HR-level `dependsOn: [bp-kyverno]`: same-kind, so honored by Flux
     (per docs/INVIOLABLE-PRINCIPLES.md #14, a cross-kind HR→Kustomization
     dependsOn is silently ignored, so we keep ordering at HR→HR within
     the single bootstrap-kit Kustomization)
   - targetNamespace: kyverno (same as engine — ClusterPolicy is cluster-
     scoped but the umbrella overlay namespacing matches the engine)
   - disableWait: true — Kyverno reports ClusterPolicy Ready asynchronously
     so we don't want downstream HRs stalling on policy-level health

4. UPDATED `clusters/_template/bootstrap-kit/kustomization.yaml`:
   - Added `27a-kyverno-policies.yaml` immediately after `27-kyverno.yaml`
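
A sketch of the relevant part of that `kustomization.yaml`, with the other
slots omitted. Only the two file names and their relative order come from this
change; the header fields are the standard Kustomize ones:

```yaml
# clusters/_template/bootstrap-kit/kustomization.yaml (excerpt, sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # earlier slots omitted
  - 27-kyverno.yaml             # engine + CRDs
  - 27a-kyverno-policies.yaml   # ClusterPolicy CRs, ordered by HR dependsOn
  # later slots omitted
```

The list order here is for readability only; the actual install ordering comes
from the HelmRelease `dependsOn`, not from the position in `resources`.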

5. BUMPED `clusters/_template/bootstrap-kit/27-kyverno.yaml`:
   - Engine pin 1.1.0 → 1.2.1 (engine-only; install behavior identical
     to 1.1.0 since policies + their values are no longer in this chart)

VALIDATION (Principle #15 — validate against fresh state, not stable state):

  $ helm template bp-kyverno-policies platform/kyverno-policies/chart \
      | grep -c '^kind: ClusterPolicy'
  18
  $ helm lint platform/kyverno-policies/chart && helm lint platform/kyverno/chart
  ==> 1 chart(s) linted, 0 chart(s) failed (both)
  $ helm template bp-kyverno platform/kyverno/chart \
      | grep -c '^kind: ClusterPolicy'
  0   # engine no longer renders any ClusterPolicy CRs
  $ helm package platform/kyverno-policies/chart
  Successfully packaged → bp-kyverno-policies-1.0.0.tgz (20 templates)

  CRD-race REPRODUCED locally without container runtime: applying the
  rendered policy YAML to a cluster WITHOUT Kyverno CRDs returns
    "no matches for kind \"ClusterPolicy\" in version \"kyverno.io/v1\"
     ensure CRDs are installed first"
  for every policy — proving the install-order fix is necessary.

  Full `helm install` from-scratch on Kind blocked locally (no container
  runtime on bastion); the Blueprint-Release CI workflow runs the full
  `helm dependency build` + package + GHCR push pipeline AND a
  `helm template` smoke render at publish time — that is the fresh-state
  Helm install gate before any pin lands.

CI / GHCR (Principle #13):
  Blueprint-Release workflow auto-detects `platform/kyverno-policies/chart/**`
  and publishes `oci://ghcr.io/openova-io/bp-kyverno-policies:1.0.0`
  on push to main. The slot pin in 27a-kyverno-policies.yaml is set to
  `1.0.0` to match (auto-bump-pin step is a no-op when source version
  already matches the slot pin).

DELIBERATELY OUT OF SCOPE:
  - W2 Go evaluator for `hubble-flows-seen` (stub stays a no-op)
  - Cosign publicKey supply path for `cosign-verified`
  - Per-Environment EnvironmentPolicy.spec.compliance.modes enforcement
    flip controller
  - Score-aggregator weight defaults configuration UI
  - `useraccess-boundary` (lives in bp-crossplane-claims, unchanged)

This does NOT close #1096. The EPIC remains open until a fresh-prov walk
shows `kubectl get clusterpolicies -A` returning the 18 baseline policies
+ useraccess-boundary, plus the AppDetail Compliance tab rendering non-
zero policyCount. Founder closes #1096 after that walk.

Refs #1096, Refs #2019, Refs #1929, Refs #1936

* fix(ci): register bp-kyverno-policies in expected-bootstrap-deps.yaml

* fix(blueprints): blueprint.yaml lockstep for kyverno 1.2.1 + add kyverno-policies 1.0.0 blueprint.yaml

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-20 04:42:29 +04:00


# bp-kyverno — Catalyst bootstrap-kit Blueprint #27 (W2.K3, Tier 7 — Security/Policy).
# Kubernetes-native admission policy engine. Validating/mutating/generating
# admission control via ClusterPolicy/Policy CRDs. HA mode with separate
# admission/background/cleanup/reports controllers. The first guardrail
# downstream Catalyst Apps land behind once the platform is bootstrapped.
#
# Wrapper chart: platform/kyverno/chart/ (umbrella over upstream
# kyverno/kyverno chart, Catalyst-curated values under the `kyverno:` key).
# Reconciled by: Flux on the new Sovereign's k3s control plane.
#
# dependsOn:
# - bp-cilium — Kyverno admission webhooks need a working CNI + Service
# mesh substrate to receive AdmissionReview requests from the apiserver.
# Cilium is the root of the Catalyst-Zero DAG; until it is Ready the
# apiserver→webhook path is not reachable and Kyverno install is racy.
#
# No further dependsOn: Kyverno installs its own CRDs and does not require
# cert-manager (it auto-generates admission webhook TLS via its built-in
# certificate controller).
---
apiVersion: v1
kind: Namespace
metadata:
  name: kyverno
  labels:
    catalyst.openova.io/sovereign: ${SOVEREIGN_FQDN}
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bp-kyverno
  namespace: flux-system
spec:
  type: oci
  interval: 15m
  url: oci://ghcr.io/openova-io
  secretRef:
    name: ghcr-pull
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: bp-kyverno
  namespace: flux-system
  labels:
    catalyst.openova.io/slot: "27"
spec:
  interval: 15m
  releaseName: kyverno
  targetNamespace: kyverno
  dependsOn:
    - name: bp-cilium
  chart:
    spec:
      chart: bp-kyverno
      version: 1.2.1
      sourceRef:
        kind: HelmRepository
        name: bp-kyverno
        namespace: flux-system
  # Event-driven install: Kyverno HA mode brings up four controller
  # Deployments (admission, background, cleanup, reports) plus the
  # admission webhook TLS bootstrap. Pod Ready is multi-minute on a
  # cold cluster; Helm `--wait` would hold the HR's Ready=True signal
  # past the point where downstream HRs could legitimately reconcile.
  # disableWait lets Flux mark this Ready as soon as manifests apply.
  install:
    timeout: 15m
    disableWait: true
    remediation:
      retries: 3
  upgrade:
    timeout: 15m
    disableWait: true
    remediation:
      retries: 3