openova/.claude/project-memory.md
e3mrah f6757c7c93
feat(docs): lean documentation strategy — consolidate 16 docs into 7 canonical + 3 subdirs (#2094)
* docs(arch): consolidate ARCHITECTURE + PLATFORM-TECH-STACK + NAMING + EPICS-1-6 + BOOTSTRAP-KIT-EXPANSION → docs/ARCHITECTURE.md (lean doc strategy)

Single canonical "how OpenOva works" doc per founder's lean-doc strategy.
2926 source lines → 1110 consolidated lines, no semantic loss.

Sections:
 §1  High-level model (Catalyst/Sovereign/Org/Env/Application/Blueprint)
 §2  Repo layout
 §3  Tech stack by layer (CNI/GitOps/IaC/event-spine/data/secrets/identity/...)
 §4  Naming conventions (dimensions, patterns, labels, DOMAINS-CANON)
 §5  Catalyst control plane (rules, CRDs, controllers, cutover, identity, surfaces)
 §6  Per-host-cluster infrastructure
 §7  Application Blueprints
 §8  Multi-region topology (1 cpx52/region, WireGuard-over-public-IPs, ClusterMesh)
 §9  Bootstrap-kit slot ordering (full 48-slot canonical list)
 §10 EPIC-level design overview (EPIC-0 through EPIC-6)
 §11 Per-chart DESIGN.md inventory
 §12 OAM influence
 §13 Read further

Stale literal fixes:
 - omantel.openova.io → omantel.biz / <sovereign>.<tld> / t38.omani.works (7 instances)
 - SPIRE marked DEFERRED / opt-in only (PR #665, TBD-V29 #2055)
 - failover-controller marked REPLACED by bp-continuum

New PR refs wired into §3:
 - PR #665   SPIRE deferral
 - PR #2071  bp-cnpg-pair synchronous remote_apply (zero-tx-loss multi-region)
 - PR #2087  bp-cnpg-pair pre-merge guard
 - PR #2093  bp-cnpg-pair pre-merge guard

New stack components added to §3:
 - bp-cnpg-pair  (synchronous remote_apply ReplicaCluster across ClusterMesh)
 - bp-continuum  (lease-based failover orchestrator)
 - bp-self-sovereign-cutover (8-tether pivot, ADR-0002, Principle #11)

Source docs (to be deleted by orchestrator in final PR):
 - docs/PLATFORM-TECH-STACK.md
 - docs/NAMING-CONVENTION.md
 - docs/EPICS-1-6-unified-design.md
 - docs/BOOTSTRAP-KIT-EXPANSION-PLAN.md

* docs(principles): consolidate INVIOLABLE-PRINCIPLES + ANTI-PATTERN-CATALOG → docs/PRINCIPLES.md (lean doc strategy)

* docs(dod): consolidate 5-PILLAR-DOD + DOMAINS-CANON + SOVEREIGN-MULTI-REGION-DOD + PERSONAS-AND-JOURNEYS → docs/DOD.md (lean doc strategy)

* docs(runbooks+status+glossary): consolidate 5 runbooks → RUNBOOKS.md + refresh STATUS.md + fold banned-terms into GLOSSARY.md (lean doc strategy)

Part 1 — Runbook consolidation:
- NEW docs/RUNBOOKS.md with 7 numbered sections (provisioning, day-2 ops,
  Blueprint authoring, chart conventions, demo walk, failover, troubleshooting)
- Folds BLUEPRINT-AUTHORING / CHART-AUTHORING / DEMO-RUNBOOK /
  RUNBOOK-OPERATIONS / RUNBOOK-PROVISIONING into one canonical surface
- Documents dual-annotation requirement for charts with enabled.default: false
  (GUARD 1 #2087 no-upstream + GUARD 2 #2093 smoke-render) with bp-network-policies:1.0.1
  dead-reserve incident as the live evidence
- All admin.<fqdn> legacy URL refs → console.<fqdn>/bss (BSS lives in operator console)
- All openova.io / omantel.omani.works test commands → canonical t<NN>.omani.works
- Cites PRs #2076 (docs migration), #2082 (no-auto-close-keyword), #2087, #2093

Part 2 — STATUS.md refresh (renamed from IMPLEMENTATION-STATUS.md):
- Header dated 2026-05-20 (was 2026-04-29; 22 days stale per audit)
- Adds 🟦 CODE-COMPLETE state for "controllers + CRDs + tests landed,
  awaiting fresh-prov walk" (per 5-pillar DoD)
- Pillar 3 marked CODE-COMPLETE (PRs #2071/#2072/#2073/#2074/#2075/#2053)
- Adds 3 new CRDs verified in products/catalyst/chart/crds/:
  CNPGPair, PDM, Sandbox
- Sandbox controller chain CODE-COMPLETE
  (PRs #1615/#1618/#1621/#1622/#1626/#1631/#1632)
- SPIRE marked DEFERRED — opt-in only (PRs #665, #2056, #2061)
- New §6 CI / supply-chain guards table: hollow-chart (#2087),
  smoke-render (#2093), no-auto-close-keyword (#2082), observability-toggle,
  subchart 4-step, Flux version-pin replay
- New §9 Pillar-status table — Pillars 1/2/3/4 CODE-COMPLETE, Pillar 5 🚧
- Pillar 1 (PRs #2038 V18, #2043 V18-D), Pillar 2 (PR #2029 V20),
  Pillar 3 (per above), Pillar 4 (Sandbox chain)

Part 3 — GLOSSARY.md folded as single source of truth for banned terms:
- Header dated 2026-05-20, notes "single source of truth for banned terms"
  and "no separate BANNED-TERMS.md"
- Existing 11 banned-terms rows rewritten with italicized qualifiers
- NEW Forbidden test domains subsection:
  openova.io (mothership-only), omantel.openova.io (hallucinated),
  Nova Cloud (predecessor brand), eventforge.io (hallucinated),
  admin.<fqdn> (dead BSS URL)
- SPIFFE/SPIRE identity row + acronym row marked deferred per PR #665
  with TBD-V29 (#2055) re-introduction roadmap
- Cross-links updated: IMPLEMENTATION-STATUS → STATUS,
  SOVEREIGN-PROVISIONING + BLUEPRINT-AUTHORING → RUNBOOKS.md

CLAUDE.md NOT touched. Source files NOT deleted (orchestrator owns deletion).
No push, no PR. Manifest at /tmp/merge-D-runbooks-status-glossary-manifest.txt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: assemble lean doc strategy — delete legacy sources, move ledger/sessions/archive, ADR-0004, rewrite cross-refs

Per founder direction 2026-05-20 + user-global ~/.claude/CLAUDE.md §11.

This is the orchestrator commit on top of the four cherry-picked consolidation
commits (ARCHITECTURE, PRINCIPLES, DOD, RUNBOOKS+STATUS+GLOSSARY). It:

1. Deletes 15 legacy source docs (now folded into the 7 canonical):
   PLATFORM-TECH-STACK, NAMING-CONVENTION, EPICS-1-6-unified-design,
   BOOTSTRAP-KIT-EXPANSION-PLAN, INVIOLABLE-PRINCIPLES, ANTI-PATTERN-CATALOG,
   5-PILLAR-DOD, DOMAINS-CANON, SOVEREIGN-MULTI-REGION-DOD,
   PERSONAS-AND-JOURNEYS, BLUEPRINT-AUTHORING, CHART-AUTHORING,
   DEMO-RUNBOOK, RUNBOOK-OPERATIONS, RUNBOOK-PROVISIONING.

2. Moves transient + historical docs into proper subdirs:
   - docs/ledger/{TRUST,TRACKER}.md (cron-refreshed live state)
   - docs/sessions/{2026-05-17-convergence,2026-05-19-20-trust-recovery,
     2026-05-20-trust-audit,2026-05-20-walk-runbook}.md
   - docs/archive/{validation-log,orchestrator-state,omantel-handover-wbs}.md

3. Adds docs/adr/0004-cnpg-sync-replication.md (Pillar 3 zero-tx-loss decision)
   + docs/adr/README.md index.

4. Updates CLAUDE.md reading-order + repo-structure block to match the
   lean strategy and current core/ tree (controllers/, marketplace/, etc.).

5. Sweeps all .md files + .github/workflows + scripts to repoint old doc
   paths to the new canonical homes. ADR cross-references kept intact
   (ADRs are immutable historical artifacts).

Operator-side cron scripts that still write to the old paths
(/home/openova/bin/refresh-dod-dashboard.sh, refresh-wbs.sh and
openova-private/bin/trust-audit.sh) need a one-line path update —
flagged in the PR body.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(bootstrap-kit): update repo-root sentinel to docs/PRINCIPLES.md

The bootstrap-kit Go test used `docs/INVIOLABLE-PRINCIPLES.md` as its
repo-root sentinel; the file no longer exists after the lean-doc
consolidation (it's now `docs/PRINCIPLES.md`). Update the walker to
match the new canonical filename.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 14:40:01 +04:00


OpenOva Project Memory

Last Updated: 2026-04-27 (Catalyst-unified rewrite)
Purpose: Persistent context for Claude Code sessions about Catalyst platform strategy and architecture.

This file is now an index and decision log. The full architecture lives in docs/. When in doubt, the canonical docs win over this file.


1. Read these first

In strict order:

  1. docs/GLOSSARY.md — terminology source of truth
  2. docs/STATUS.md — what's built vs designed
  3. docs/ARCHITECTURE.md — Catalyst target architecture
  4. docs/ARCHITECTURE.md §4 — naming patterns
  5. docs/DOD.md — who uses what
  6. docs/SECURITY.md — identity, secrets, rotation
  7. docs/SOVEREIGN-PROVISIONING.md — bringing a Sovereign online
  8. docs/RUNBOOKS.md — writing Blueprints

If any older notes in this file contradict those docs, those docs win.


2. Core positioning (locked 2026-04-27)

  • OpenOva = the company.
  • Catalyst = the OpenOva platform (the control plane that turns a Kubernetes cluster into a self-sufficient deployment).
  • Sovereign = a deployed instance of Catalyst.
  • Organization = multi-tenancy unit inside a Sovereign.
  • Environment = {org}-{env_type} scope where Applications run.
  • Application = an installed Blueprint.
  • Blueprint = the unified unit of installable software (replaces the older "module" + "template" split).

What was previously called Nova was just a Sovereign run by us hosting our SaaS Organizations. The "Nova" brand is retired in favor of "the openova Sovereign."

OpenOva's other products (Cortex, Axon, Fingate, Fabric, Relay, Specter, Exodus) are now positioned as composite Blueprints that run on Catalyst — not as parallel platform layers.


3. Stack decisions (locked 2026-04-27)

| Concern | Choice | Notes |
|---|---|---|
| Event spine | NATS JetStream | Apache 2.0 (no BSL risk); native KV; native multi-tenant Accounts. Replaces the older "Redpanda + Valkey" combo for the control plane only. Application-level event needs choose freely (Redpanda, Kafka, NATS, RabbitMQ). |
| Secrets | OpenBao + ESO | Apache 2.0 fork of Vault (LF-led, IBM-backed). Replaces Vault. |
| Multi-region OpenBao | Independent Raft per region + async perf replication | NOT a stretched cluster. Each region is its own failure domain. |
| Workload identity | SPIFFE/SPIRE | 5-min rotating SVIDs, mTLS everywhere. |
| User identity | Keycloak | Per-Org realm in SME-style Sovereigns; per-Sovereign realm in corporate-style. SME tier uses minimal single-replica Keycloak (no HA). |
| GitOps | Flux per vcluster | Lightweight (source + kustomize + helm controllers). One Flux per vcluster, watching its Environment Gitea repo. |
| Git | Gitea | Per-Sovereign. Hosts public Blueprint mirror, Org-private Blueprints, per-Environment workspace repos. |
| IaC for non-K8s | Crossplane | Only IaC. Never user-facing. Advanced users author Compositions as Blueprints. |
| Bootstrap IaC | OpenTofu | One-shot only. Archived after Phase 0. Crossplane takes over. |
| Multi-tenancy | vcluster (loft.sh) | One per Organization per host cluster. |
| CNI / Service Mesh | Cilium | eBPF mTLS, L7 policies, Gateway API. |
| Bootstrap host | catalyst-provisioner.openova.io | Permanent service. Each Sovereign is fully self-sufficient after Phase 0; provisioner stays online for the next Sovereign. |

4. User-facing surfaces (locked 2026-04-27)

Three first-class surfaces. No fourth.

  • UI — Catalyst console. Form / Advanced / IaC editor depths. Default for all personas.
  • Git — direct push or PR to the Environment Gitea repo (or Blueprint repos). Equal weight with UI.
  • API — REST + GraphQL for portal integrations (Backstage, ServiceNow). Not a primary IaC surface.

kubectl is debug-only, scoped to one's own vcluster. No Terraform/Pulumi/CLI for production changes.


5. Banned terms

Replaced terms — never use in new docs, code, UI strings:

| Banned | Use instead |
|---|---|
| Tenant | Organization |
| Operator (entity / person) | sovereign-admin (role) |
| Client (UX sense) | User |
| Module / Template (Catalyst sense) | Blueprint |
| Backstage | Catalyst console |
| Synapse (the OpenOva product) | Axon |
| Lifecycle Manager (separate) | Catalyst |
| Bootstrap wizard (separate) | Catalyst bootstrap |
| Workspace (Catalyst scope) | Environment |
| Instance (user-facing object) | Application |

Full glossary: docs/GLOSSARY.md.
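
One way to enforce the replacements above is a small lint pass over docs and UI strings. The sketch below is illustrative only: the `BANNED` mapping copies a subset of the banned-terms list, while the function name `find_banned_terms` and the whole-word, case-insensitive matching rule are assumptions, not an existing OpenOva tool. It does not model the qualifiers in parentheses (for example, "Client" is banned only in the UX sense), so every hit is a candidate for human review rather than an automatic failure.

```python
import re

# Subset of the banned-terms list; qualifiers such as "(UX sense)" are
# not modeled, so every hit is only a candidate for review.
BANNED = {
    "Tenant": "Organization",
    "Workspace": "Environment",
    "Synapse": "Axon",
    "Lifecycle Manager": "Catalyst",
    "Bootstrap wizard": "Catalyst bootstrap",
}


def find_banned_terms(text):
    """Return (line_number, banned_term, replacement) for each hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for term, replacement in BANNED.items():
            # Whole-word, case-insensitive match; an optional trailing "s"
            # catches simple plurals such as "tenants".
            pattern = r"\b" + re.escape(term) + r"s?\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((lineno, term, replacement))
    return hits


sample = "Each tenant gets a workspace.\nAxon replaces the old name."
for lineno, term, replacement in find_banned_terms(sample):
    print(f"line {lineno}: '{term}' -> use '{replacement}'")
```

The canonical list lives in docs/GLOSSARY.md, so a real check would load its terms from there instead of hard-coding them.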


6. Sovereign topology

catalyst-provisioner (always on)  ──Phase 0──►  Target cloud (Hetzner / AWS / etc.)
                                                       │
                                                       ▼
                                           Sovereign deployment:
                                           ─ Management cluster (mgt)
                                             - Catalyst control plane
                                             - Gitea, JetStream, OpenBao,
                                               Keycloak, projector, …
                                           ─ Workload clusters (rtz, dmz)
                                             - Per-Org vclusters
                                             - Each with lightweight Flux

After Phase 0: Sovereign is self-sufficient. Provisioner is no longer in the path.

See docs/SOVEREIGN-PROVISIONING.md for full details.


7. Promotion model (no chain object)

There is no ApplicationGroup or ChainPolicy CRD. Promotion is the act of copying an Application's manifest from one Environment Gitea repo to another, gated by an EnvironmentPolicy attached to the destination Environment.

The Blueprint detail page in the console is the cross-Environment view: it shows every Application using a given Blueprint across all Environments in the Org, with version drift visible at a glance.
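
The promotion step described in this section can be sketched as a gated file copy between two repo checkouts. Everything concrete here is hypothetical: the `applications/<name>.yaml` layout, the `acme` Organization, the `bp-billing` Blueprint, and the `allowed_sources` field standing in for an EnvironmentPolicy are assumptions made for illustration. In a real Sovereign the copy would land as a commit or PR on the destination Environment Gitea repo.

```python
import shutil
import tempfile
from pathlib import Path


def promote(app, src_repo, dst_repo, policy):
    """Copy one Application manifest between Environment repo checkouts.

    `policy` stands in for the destination's EnvironmentPolicy; its single
    `allowed_sources` key is a hypothetical field used only for this sketch.
    """
    if src_repo.name not in policy["allowed_sources"]:
        raise PermissionError(f"{src_repo.name} may not promote into {dst_repo.name}")
    src = src_repo / "applications" / f"{app}.yaml"
    dst = dst_repo / "applications" / src.name
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)  # promotion is a plain copy, no chain object
    return dst


# Demo with two throwaway directories named {org}-{env_type}.
root = Path(tempfile.mkdtemp())
dev, prod = root / "acme-dev", root / "acme-prod"
(dev / "applications").mkdir(parents=True)
(dev / "applications" / "billing.yaml").write_text("blueprint: bp-billing\nversion: 1.2.0\n")

promoted = promote("billing", dev, prod, {"allowed_sources": ["acme-dev"]})
print(promoted.read_text())
```

Because the gate is evaluated against the destination only, no object needs to describe the whole chain, which matches the "no chain object" rule above.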


8. Multi-region semantics

  • Clusters are named by building block, not failover role. The same building blocks are deployed in multiple regions; k8gb routes traffic. See Section 1.3 of docs/ARCHITECTURE.md.
  • Each region's OpenBao is an independent Raft cluster with async perf replication. No stretched clusters. See docs/SECURITY.md §5.
  • Catalyst Environment is a logical scope realized by N vclusters across regions — Placement metadata on each Application controls fan-out.
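
The fan-out rule in the last bullet can be illustrated with a small selection function. This is a sketch under stated assumptions: the `VCluster` type, the idea that Placement metadata reduces to a list of region names, the "empty list means every region" default, and the vcluster and region names are all invented for the example and are not taken from the Catalyst CRDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VCluster:
    name: str
    region: str


def fan_out(env_vclusters, placement_regions):
    """Return the vclusters an Application should land on.

    `placement_regions` models the Application's Placement metadata as a
    list of region names; an empty list means "every region" (assumed default).
    """
    if not placement_regions:
        return list(env_vclusters)
    wanted = set(placement_regions)
    return [vc for vc in env_vclusters if vc.region in wanted]


# One logical Environment realized by three vclusters (names are made up).
acme_prod = [
    VCluster("acme-prod-fsn1", "fsn1"),
    VCluster("acme-prod-hel1", "hel1"),
    VCluster("acme-prod-nbg1", "nbg1"),
]
targets = fan_out(acme_prod, ["fsn1", "hel1"])
print([vc.name for vc in targets])
```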

9. Naming changes vs older docs

| Old | New |
|---|---|
| {env} dimension in NAMING-CONVENTION | {env_type} |
| "Workspace" (Catalyst scope) | "Environment" |
| "Tenant" (anywhere) | "Organization" |
| "Bootstrap mode" / "Manager mode" of core/ app | Both fold under "Catalyst control plane" |
| Catalyst as a sub-product | Catalyst as the platform itself |
| Cortex / Fingate / etc. as products | Composite Blueprints running on Catalyst |
| OpenBao multi-region as stretched | OpenBao multi-region as independent Raft + async perf replication |
| Vault | OpenBao |
| Redpanda (control plane) | NATS JetStream |
| Valkey (control plane) | NATS JetStream KV (Valkey remains as Application Blueprint) |

10. Component count

The historical "52 components" framing is retained at the marketing level for continuity, but the platform's identity is now Catalyst, not "the 52 components." Components are Blueprints. The list is in docs/ARCHITECTURE.md. Adding or removing a component is a Blueprint addition or removal and does not require any platform-level rebrand.


11. Customer sync (unchanged in spirit)

Each Sovereign's Gitea mirrors the public Blueprint catalog from this repo. Pull cadence is Sovereign-local; air-gapped Sovereigns mirror offline. See docs/SOVEREIGN-PROVISIONING.md §9.


12. Open follow-ups (post-rewrite)

  • Per-Blueprint README.md audit — most are clean; remaining cleanup tracked in issue #37.
  • core/ directory may be reorganized to match the Catalyst component naming (no urgency; functional code unchanged).
  • Specter and Exodus positioning: Specter is a composite Blueprint (bp-specter) installed by default in corporate-style Sovereigns; Exodus is a deliverable migration service (people + playbooks), not a Blueprint. Documented at length in docs/BUSINESS-STRATEGY.md.

13. Approved key phrases

  • "Cloud-native is the foundation. Catalyst is how you operate it."
  • "Catalyst — the OpenOva platform."
  • "A Sovereign is a self-sufficient deployment of Catalyst."
  • "Nova was just a Sovereign run by us. Now we say 'the openova Sovereign'."
  • "Same code in every Sovereign — whether run by us, by Omantel, or by Bank Dhofar."

14. Phrases to avoid

  • "Tenant" anywhere in product context.
  • "Operator" as an entity (the role is "sovereign-admin").
  • "Module" / "Template" in the Catalyst sense.
  • "Backstage" — replaced.
  • "Lifecycle Manager" or "Bootstrap wizard" as separate products.
  • "Stretched cluster" in OpenBao context — we deliberately reject that pattern.
  • "Workspace" as Catalyst scope — replaced by Environment.

Older sections from earlier project-memory revisions were removed during the 2026-04-27 unified rewrite. Historical decisions remain available in this repository's git log if needed.