OpenOva Project Memory
Last Updated: 2026-04-27 (Catalyst-unified rewrite)
Purpose: Persistent context for Claude Code sessions about Catalyst platform strategy and architecture.
This file is now an index and decision log. The full architecture lives in docs/. When in doubt, the canonical docs win over this file.
1. Read these first
In strict order:
- docs/GLOSSARY.md: terminology source of truth
- docs/STATUS.md: what's built vs designed
- docs/ARCHITECTURE.md: Catalyst target architecture
- docs/ARCHITECTURE.md: naming patterns
- docs/DOD.md: who uses what
- docs/SECURITY.md: identity, secrets, rotation
- docs/SOVEREIGN-PROVISIONING.md: bringing a Sovereign online
- docs/RUNBOOKS.md: writing Blueprints
If any older notes in this file contradict those docs, those docs win.
2. Core positioning (locked 2026-04-27)
- OpenOva = the company.
- Catalyst = the OpenOva platform (the control plane that turns a Kubernetes cluster into a self-sufficient deployment).
- Sovereign = a deployed instance of Catalyst.
- Organization = multi-tenancy unit inside a Sovereign.
- Environment = {org}-{env_type} scope where Applications run.
- Application = an installed Blueprint.
- Blueprint = the unified unit of installable software (replaces the older "module" + "template" split).
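As a minimal sketch of the {org}-{env_type} scope, the helper below composes an Environment name from its two dimensions. The function name, the token rule (lowercase alphanumerics, no hyphens) and the sample values are illustrative assumptions; the canonical naming rules live in docs/ARCHITECTURE.md and may differ.

```python
import re

# Assumption: org and env_type are lowercase alphanumeric tokens without
# hyphens, so the composed name can be split back unambiguously.
_TOKEN = re.compile(r"^[a-z][a-z0-9]*$")


def environment_name(org: str, env_type: str) -> str:
    """Compose an Environment name of the form {org}-{env_type}."""
    for label, value in (("org", org), ("env_type", env_type)):
        if not _TOKEN.match(value):
            raise ValueError(f"invalid {label}: {value!r}")
    return f"{org}-{env_type}"


# "acme" and "prod" are hypothetical example values.
print(environment_name("acme", "prod"))
```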
What was previously called Nova was just a Sovereign run by us hosting our SaaS Organizations. The "Nova" brand is retired in favor of "the openova Sovereign."
OpenOva's other products (Cortex, Axon, Fingate, Fabric, Relay, Specter, Exodus) are now positioned as composite Blueprints that run on Catalyst — not as parallel platform layers.
3. Stack decisions (locked 2026-04-27)
| Concern | Choice | Notes |
|---|---|---|
| Event spine | NATS JetStream | Apache 2.0 (no BSL risk); native KV; native multi-tenant Accounts. Replaces the older "Redpanda + Valkey" combo for the control plane only. Application-level event needs choose freely (Redpanda, Kafka, NATS, RabbitMQ). |
| Secrets | OpenBao + ESO | Apache 2.0 fork of Vault (LF-led, IBM-backed). Replaces Vault. |
| Multi-region OpenBao | Independent Raft per region + async perf replication | NOT a stretched cluster. Each region is its own failure domain. |
| Workload identity | SPIFFE/SPIRE | 5-min rotating SVIDs, mTLS everywhere. |
| User identity | Keycloak | Per-Org realm in SME-style Sovereigns; per-Sovereign realm in corporate-style. SME tier uses minimal single-replica Keycloak (no HA). |
| GitOps | Flux per vcluster | Lightweight (source + kustomize + helm controllers). One Flux per vcluster, watching its Environment Gitea repo. |
| Git | Gitea | Per-Sovereign. Hosts public Blueprint mirror, Org-private Blueprints, per-Environment workspace repos. |
| IaC for non-K8s | Crossplane | Only IaC. Never user-facing. Advanced users author Compositions as Blueprints. |
| Bootstrap IaC | OpenTofu | One-shot only. Archived after Phase 0. Crossplane takes over. |
| Multi-tenancy | vcluster (loft.sh) | One per Organization per host cluster. |
| CNI / Service Mesh | Cilium | eBPF mTLS, L7 policies, Gateway API. |
| Bootstrap host | catalyst-provisioner.openova.io | Permanent service. Each Sovereign is fully self-sufficient after Phase 0; provisioner stays online for the next Sovereign. |
4. User-facing surfaces (locked 2026-04-27)
Three first-class surfaces. No fourth.
- UI — Catalyst console. Form / Advanced / IaC editor depths. Default for all personas.
- Git — direct push or PR to the Environment Gitea repo (or Blueprint repos). Equal weight with UI.
- API — REST + GraphQL for portal integrations (Backstage, ServiceNow). Not a primary IaC surface.
kubectl is debug-only, scoped to one's own vcluster. No Terraform/Pulumi/CLI for production changes.
5. Banned terms
Replaced terms — never use in new docs, code, UI strings:
| Banned | Use instead |
|---|---|
| Tenant | Organization |
| Operator (entity / person) | sovereign-admin (role) |
| Client (UX sense) | User |
| Module / Template (Catalyst sense) | Blueprint |
| Backstage | Catalyst console |
| Synapse (the OpenOva product) | Axon |
| Lifecycle Manager (separate) | Catalyst |
| Bootstrap wizard (separate) | Catalyst bootstrap |
| Workspace (Catalyst scope) | Environment |
| Instance (user-facing object) | Application |
Full glossary: docs/GLOSSARY.md.
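One way to enforce the table above is a small text check run over new docs or UI strings. The sketch below is an assumption about how such a check could look, not an existing tool in this repo. It covers only a subset of the table and ignores the qualifiers (for example "Catalyst sense"), so every hit still needs human review; a legitimate mention of Backstage as an integration target would be flagged too.

```python
import re

# Subset of the banned-terms table; qualifiers such as "(Catalyst sense)"
# are dropped here, so matches are candidates, not verdicts.
BANNED = {
    "tenant": "Organization",
    "module": "Blueprint",
    "template": "Blueprint",
    "backstage": "Catalyst console",
    "synapse": "Axon",
    "workspace": "Environment",
}

# Whole-word match, case-insensitive, with an optional plural "s".
_PATTERN = re.compile(r"\b(" + "|".join(BANNED) + r")s?\b", re.IGNORECASE)


def find_banned_terms(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, matched_word, suggested_replacement) tuples."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(0), BANNED[match.group(1).lower()]))
    return hits


sample = "Each Tenant gets a workspace.\nInstall the module."
for hit in find_banned_terms(sample):
    print(hit)
```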
6. Sovereign topology
    catalyst-provisioner (always on) ──Phase 0──► Target cloud (Hetzner / AWS / etc.)
                                                        │
                                                        ▼
                                                  Sovereign deployment:
                                                    ─ Management cluster (mgt)
                                                        - Catalyst control plane
                                                        - Gitea, JetStream, OpenBao,
                                                          Keycloak, projector, …
                                                    ─ Workload clusters (rtz, dmz)
                                                        - Per-Org vclusters
                                                        - Each with lightweight Flux
After Phase 0: Sovereign is self-sufficient. Provisioner is no longer in the path.
See docs/SOVEREIGN-PROVISIONING.md for full details.
7. Promotion model (no chain object)
There is no ApplicationGroup or ChainPolicy CRD. Promotion is the act of copying an Application's manifest from one Environment Gitea repo to another, gated by an EnvironmentPolicy attached to the destination Environment.
The Blueprint detail page in the console is the cross-Environment view: it shows every Application using a given Blueprint across all Environments in the Org, with version drift visible at a glance.
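The promotion step (a manifest copy gated by the destination's EnvironmentPolicy) can be sketched against two local checkouts of Environment repos. Everything specific here is an assumption made for illustration: the `allowed_sources` policy field, the `applications/<app>.yaml` layout, and the names `acme-dev`, `acme-prod` and `shop`. The real EnvironmentPolicy schema is defined elsewhere, and in practice the copy would be followed by a push or PR to the destination Gitea repo.

```python
import shutil
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class EnvironmentPolicy:
    # Hypothetical field: which source Environments may promote into
    # the Environment this policy is attached to.
    allowed_sources: set[str] = field(default_factory=set)


def promote(app: str, src_env: str, src_repo: Path, dst_repo: Path,
            policy: EnvironmentPolicy) -> Path:
    """Copy one Application manifest between two Environment repo checkouts."""
    if src_env not in policy.allowed_sources:
        raise PermissionError(f"policy rejects promotion from {src_env}")
    # Assumed layout: applications/<app>.yaml in each repo checkout.
    rel = Path("applications") / f"{app}.yaml"
    target = dst_repo / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src_repo / rel, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    dev, prod = Path(tmp, "acme-dev"), Path(tmp, "acme-prod")
    (dev / "applications").mkdir(parents=True)
    (dev / "applications" / "shop.yaml").write_text("version: 1.2.0\n")
    policy = EnvironmentPolicy(allowed_sources={"acme-dev"})
    copied = promote("shop", "acme-dev", dev, prod, policy)
    print(copied.read_text())
```

Keeping the gate on the destination side mirrors the text: no chain object is involved, only the policy attached to the Environment being written to.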
8. Multi-region semantics
- Clusters named by building block, not failover role. Same building blocks deployed in multiple regions; k8gb routes traffic. Section 1.3 of docs/ARCHITECTURE.md.
- Each region's OpenBao is an independent Raft cluster with async perf replication. No stretched clusters. See docs/SECURITY.md §5.
- Catalyst Environment is a logical scope realized by N vclusters across regions; Placement metadata on each Application controls fan-out.
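To illustrate how Placement could drive fan-out across the vclusters that realize one Environment, here is a minimal sketch. It assumes Placement reduces to a set of region names and that a missing Placement means every region; the `VCluster` type, the `fan_out` function, and the region and vcluster names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VCluster:
    region: str
    name: str


def fan_out(vclusters: list[VCluster],
            placement_regions: Optional[set[str]]) -> list[VCluster]:
    """Select the vclusters an Application should be deployed to.

    Assumption: an absent Placement (None) targets every region of the
    Environment; an explicit set restricts deployment to those regions.
    """
    if placement_regions is None:
        return list(vclusters)
    return [vc for vc in vclusters if vc.region in placement_regions]


# Hypothetical Environment realized by two vclusters in two regions.
env_vclusters = [VCluster("region-a", "acme-a"), VCluster("region-b", "acme-b")]
print(fan_out(env_vclusters, {"region-b"}))
```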
9. Naming changes vs older docs
| Old | New |
|---|---|
| {env} dimension in NAMING-CONVENTION | {env_type} |
| "Workspace" (Catalyst scope) | "Environment" |
| "Tenant" (anywhere) | "Organization" |
| "Bootstrap mode" / "Manager mode" of core/ app | Both fold under "Catalyst control plane" |
| Catalyst as a sub-product | Catalyst as the platform itself |
| Cortex / Fingate / etc. as products | Composite Blueprints running on Catalyst |
| OpenBao multi-region as stretched | OpenBao multi-region as independent Raft + async perf replication |
| Vault | OpenBao |
| Redpanda (control plane) | NATS JetStream |
| Valkey (control plane) | NATS JetStream KV (Valkey remains as Application Blueprint) |
10. Component count
The historical "52 components" framing is retained at the marketing level for continuity, but the platform's identity is now Catalyst, not "the 52 components." Components are Blueprints. The list is in docs/ARCHITECTURE.md. Adding or removing a component is a Blueprint addition or removal; it does not require any platform-level rebrand.
11. Customer sync (unchanged in spirit)
Each Sovereign's Gitea mirrors the public Blueprint catalog from this repo. Pull cadence is Sovereign-local; air-gapped Sovereigns mirror offline. See docs/SOVEREIGN-PROVISIONING.md §9.
12. Open follow-ups (post-rewrite)
- Per-Blueprint README.md audit: most are clean; remaining cleanup tracked in issue #37.
- core/ directory may be reorganized to match the Catalyst component naming (no urgency; functional code unchanged).
- Specter and Exodus positioning: Specter is a composite Blueprint (bp-specter) installed by default in corporate-style Sovereigns; Exodus is a deliverable migration service (people + playbooks), not a Blueprint. Documented at length in docs/BUSINESS-STRATEGY.md.
13. Approved key phrases
- "Cloud-native is the foundation. Catalyst is how you operate it."
- "Catalyst — the OpenOva platform."
- "A Sovereign is a self-sufficient deployment of Catalyst."
- "Nova was just a Sovereign run by us. Now we say 'the openova Sovereign'."
- "Same code in every Sovereign — whether run by us, by Omantel, or by Bank Dhofar."
14. Phrases to avoid
- "Tenant" anywhere in product context.
- "Operator" as an entity (the role is "sovereign-admin").
- "Module" / "Template" in the Catalyst sense.
- "Backstage" — replaced.
- "Lifecycle Manager" or "Bootstrap wizard" as separate products.
- "Stretched cluster" in OpenBao context — we deliberately reject that pattern.
- "Workspace" as Catalyst scope — replaced by Environment.
Older sections from earlier project-memory revisions were removed during the 2026-04-27 unified rewrite. Historical decisions remain available in this repository's git log if needed.