History

e3mrah ba2ff05292 feat(charts): bp-seaweedfs + bp-harbor + bp-vpa wrapper charts (#284 ) W2.5.B — first authoring of the three Catalyst Blueprint wrapper charts that fill bootstrap-kit slots 18 (seaweedfs), 19 (harbor) and 29 (vpa). Each wraps an upstream chart as a Helm subchart and ships Catalyst- curated overlay templates (NetworkPolicy + ServiceMonitor) gated behind opt-in toggles, per docs/BLUEPRINT-AUTHORING.md §11 and docs/INVIOLABLE-PRINCIPLES.md. bp-seaweedfs (slot 18 — storage foundation) - Wraps seaweedfs/seaweedfs 4.22.0; Chart name `bp-seaweedfs`. - Catalyst defaults: 1 master + 3 volume + 1 filer + 2 s3 replicas. - S3 API on 8333 — single S3 surface every consumer talks to per docs/PLATFORM-TECH-STACK.md §3.5 (no per-app MinIO). - Overlay templates: NetworkPolicy (cross-namespace S3 reachability, cold-tier egress allowlist), ServiceMonitor (Capabilities-gated, DEFAULT FALSE per §11.2). - Default helm template kinds: ClusterRole, ClusterRoleBinding, ConfigMap, Deployment, Secret, Service, ServiceAccount, StatefulSet. bp-harbor (slot 19 — per-Sovereign OCI registry) - Wraps goharbor/harbor 1.18.3 (appVersion 2.14.3); Chart name `bp-harbor`. - Catalyst defaults: blob backend = SeaweedFS S3 (regionendpoint seaweedfs-s3.seaweedfs.svc:8333), metadata DB = bp-cnpg external Postgres, ingress class `cilium`, expose.tls.enabled true (cert- manager-issued Secret). - Overlay templates: NetworkPolicy (CNPG/SeaweedFS/Keycloak egress), ServiceMonitor (Capabilities-gated, DEFAULT FALSE). - Trivy + SSO + pull-mirror are operator-flag opt-ins per per- Sovereign overlay (default false; trivy/keycloak/cnpg deps land on later slots). - Default helm template kinds: ConfigMap, Deployment, Ingress, PersistentVolumeClaim, Secret, Service, StatefulSet. bp-vpa (slot 29 — vertical autoscaling) - Wraps cowboysysop/vertical-pod-autoscaler 11.1.1 (appVersion 1.5.0); Chart name `bp-vpa`. - Catalyst defaults: 1 replica each of recommender + updater + admission-controller. Default mode `Off` (recommend only). - Admission webhook self-signs via init Job (cluster-internal); per- Sovereign overlay MAY swap to cert-manager. - Overlay templates: NetworkPolicy (apiserver + metrics-server egress, admission webhook ingress). - Upstream metrics.serviceMonitor / metrics.prometheusRule defaulted false per §11.2. - Default helm template kinds: ClusterRole, ClusterRoleBinding, ConfigMap, Deployment, Job, Pod, Secret, Service, ServiceAccount. Lint + observability-toggle results helm lint: 1 chart(s) linted, 0 chart(s) failed (each) tests/observability-toggle.sh: PASS on all three (default render has zero monitoring.coreos.com/v1 references; opt-in render produces a ServiceMonitor; explicit-off render is clean). Path isolation: only platform/seaweedfs/, platform/harbor/, and platform/vpa/ — no HR slot files or other charts touched. Refs: bootstrap-kit slots 18, 19, 29 reconcile against ghcr.io/openova-io/bp-seaweedfs:1.0.0, bp-harbor:1.0.0, bp-vpa:1.0.0 which this commit produces on next blueprint-release CI run. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>		2026-04-30 18:37:50 +04:00
..
chart	feat(charts): bp-seaweedfs + bp-harbor + bp-vpa wrapper charts (#284 )	2026-04-30 18:37:50 +04:00
blueprint.yaml	feat(charts): bp-seaweedfs + bp-harbor + bp-vpa wrapper charts (#284 )	2026-04-30 18:37:50 +04:00
README.md	docs(seaweedfs+guacamole): replace MinIO with SeaweedFS as unified S3 encapsulation; add Guacamole to bp-relay	2026-04-28 10:23:46 +02:00

README.md

Harbor

Container registry with vulnerability scanning. Per-host-cluster infrastructure (see docs/PLATFORM-TECH-STACK.md §3.5) — every host cluster runs a Harbor instance for Catalyst component images, mirrored Blueprint OCI artifacts, and customer images.

Status: Accepted | Updated: 2026-04-27

Overview

Harbor is mandatory on every host cluster. Each host cluster runs its own Harbor instance that mirrors from upstream sources (ghcr.io/openova-io/... for Catalyst components and Blueprint OCI artifacts; the customer's own CI for application images). Local Harbor = fast Pod pulls, no cross-region traffic on every image pull, air-gap ready.

flowchart TB
    subgraph Upstream["Upstream OCI sources"]
        GHCR[ghcr.io/openova-io/* — Catalyst + Blueprints]
        CustCI[Customer CI — Application images]
    end

    subgraph Cluster1["Host cluster A (e.g. hz-fsn-rtz-prod)"]
        H1[Harbor — local mirror]
        T1[Trivy Scanner]
        Pods1[Pods pull locally]
    end

    subgraph Cluster2["Host cluster B (e.g. hz-hel-rtz-prod)"]
        H2[Harbor — local mirror]
        T2[Trivy Scanner]
        Pods2[Pods pull locally]
    end

    GHCR -.->|"pull mirror"| H1
    CustCI -.->|"push"| H1
    GHCR -.->|"pull mirror"| H2
    CustCI -.->|"push"| H2
    H1 --> T1
    H2 --> T2
    H1 --> Pods1
    H2 --> Pods2

Why Mandatory?

Requirement	Harbor (per host cluster)	External Registry
Local pulls (no cross-region traffic)	✅ Each cluster's Pods pull from local Harbor	❌ Pods pull cross-region
Vulnerability scanning	✅ Trivy integrated	⚠️ Depends on provider
Air-gap support	✅ Self-hosted	❌
RBAC	✅ Full control	⚠️ Provider-specific
Audit logging	✅ Complete	⚠️ Limited
No external dependency at runtime	✅ Once mirrored	❌

Features

Feature	Support
Image storage	OCI-compliant
Vulnerability scanning	Trivy integration
Image signing	Cosign/Notary
Replication	Push/pull between regions
RBAC	Project-based access
Quotas	Per-project storage limits
Garbage collection	Automatic cleanup

Per-host-cluster mirroring (NOT primary-replica)

Catalyst's agreed model is one Harbor per host cluster, each independently pulling from upstream OCI sources. There is no Harbor-to-Harbor replication primary/replica.

sequenceDiagram
    participant CI as CI / Upstream OCI
    participant H1 as Harbor (cluster A)
    participant T1 as Trivy (cluster A)
    participant H2 as Harbor (cluster B)
    participant T2 as Trivy (cluster B)
    participant Pods as Pods

    CI->>H1: pull-mirror sync (configured per project)
    H1->>T1: scan on ingest
    CI->>H2: pull-mirror sync (independent of H1)
    H2->>T2: scan on ingest
    Pods->>H1: pull (cluster A Pods)
    Pods->>H2: pull (cluster B Pods)

Why pull-mirror, not Harbor-to-Harbor replication:

Single source of truth = upstream (ghcr.io/openova-io/... or customer CI), not a "primary Harbor".
Each cluster is its own failure domain — primary-replica drift between Harbors would be one more thing to fail.
Air-gap path is the same shape: a one-time mirror import vs ongoing primary-pushed replication.

Benefits:

Images available locally in each cluster.
Survives any cluster (including the management cluster) going down — workload clusters keep pulling locally.
Faster pulls (no cross-region traffic per Pod start).

Storage Backend Options

Backend	Use Case	Notes
PVC	Small deployments	Local storage
S3 (SeaweedFS)	Production	Recommended - tiered archiving
Cloud S3	Managed	AWS S3 / GCS / Azure Blob

Recommended: S3 via SeaweedFS

flowchart LR
    Harbor[Harbor] -->|"S3 API"| SeaweedFS[SeaweedFS]
    SeaweedFS -->|"Tier cold data"| Archive[Archival S3]

Configuration

Helm Values

expose:
  type: ingress
  ingress:
    className: cilium
    hosts:
      core: harbor.<location-code>.<sovereign-domain>
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod

# S3 Storage (SeaweedFS)
persistence:
  imageChartStorage:
    type: s3
    s3:
      region: us-east-1
      bucket: harbor-registry
      accesskey: ""  # From ESO secret
      secretkey: ""  # From ESO secret
      regionendpoint: http://seaweedfs.storage.svc:8333
      v4auth: true

trivy:
  enabled: true

database:
  type: internal  # or external for CNPG

redis:
  type: internal  # or external for Valkey

core:
  secretName: harbor-core-secret

Pull-mirror policy

{
  "name": "ghcr-openova-mirror",
  "src_registry": {
    "type": "harbor",
    "url": "https://ghcr.io",
    "credential": {
      "access_key": "",
      "access_secret": ""
    }
  },
  "trigger": {
    "type": "scheduled",
    "trigger_settings": {
      "cron": "0 */6 * * *"
    }
  },
  "filters": [
    {
      "type": "name",
      "value": "openova-io/**"
    }
  ],
  "enabled": true
}

Security Scanning

Trivy Integration

Scan Type	Trigger
On push	Automatic when image pushed
Scheduled	Daily full scan
Manual	On-demand via UI/API

Scan Policy

Severity	Action
Critical	Block pull
High	Allow (configurable)
Medium	Allow
Low	Allow

Kyverno Policies

Require Harbor Images

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-harbor-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-harbor-registry
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must be pulled from Harbor registry"
        pattern:
          spec:
            containers:
              - image: "harbor.<location-code>.<sovereign-domain>/*"

Resource Requirements

Component	CPU	Memory
Harbor Core	0.5	512Mi
Registry	0.5	512Mi
Database	0.5	512Mi
Redis	0.25	256Mi
Trivy	0.5	1Gi
Total	2.25	2.75Gi

Backup Strategy

Harbor data backed up via Velero to Archival S3:

flowchart LR
    Harbor[Harbor] --> Velero[Velero]
    Velero --> S3[Archival S3]

Backed up:

Database (PostgreSQL)
Registry storage (blobs)
Configuration

Consequences

Positive:

Complete control over image lifecycle.
Built-in vulnerability scanning (Trivy on ingest).
Per-cluster mirror = no cross-region pull traffic; each cluster is an independent failure domain.
Air-gap ready (one-time import works the same way as ongoing pull-mirror).
Audit trail for compliance.

Negative:

Resource overhead (~3GB RAM)
Operational responsibility
Backup requirements (handled by Velero)

Part of OpenOva