Milestones¶
Development progress across all phases of Civitas.
Status legend¶
| Symbol | Status |
|---|---|
| ✅ | Completed |
| 🔄 | In Progress |
| ⏳ | Planned |
| ⏸️ | Deferred |
| 💡 | Idea — to be specced |
Overview¶
Phase 1 — Core Runtime¶
Status: ✅ Completed — March 2026
| # | Deliverable | Priority | Status |
|---|---|---|---|
| M1.1 | `AgentProcess` base class, mailbox, `handle()` lifecycle | 🔴 High | ✅ |
| M1.2 | Supervisor with ONE_FOR_ONE, ONE_FOR_ALL, REST_FOR_ONE strategies | 🔴 High | ✅ |
| M1.3 | Backoff policies (CONSTANT, LINEAR, EXPONENTIAL), restart windows, crash timestamps | 🔴 High | ✅ |
| M1.4 | Serializer with msgpack + schema versioning; `DeserializationError` contract | 🔴 High | ✅ |
| M1.5 | InProcessTransport + MessageBus routing; request-reply with ephemeral topics | 🔴 High | ✅ |
| M1.6 | `StateStore` protocol; SQLite plugin; state persistence across restarts | 🟡 Medium | ✅ |
| M1.7 | Plugin system; LLM providers (Anthropic, OpenAI, Gemini, Mistral, LiteLLM) | 🔴 High | ✅ |
| M1.8 | Personal AI Assistant demo (Telegram gateway + skill agents) | 🟡 Medium | ⏸️ Deferred |
Phase 2 — Ecosystem¶
M2.1 — ZMQ Multi-Process Transport¶
Status: ✅ Completed — March 2026
| Deliverable | Status |
|---|---|
| `ZMQTransport` with XSUB/XPUB proxy | ✅ |
| `ZMQProxy` daemon thread | ✅ |
| PUB/SUB bridging across OS processes | ✅ |
| Request-reply over ephemeral topics | ✅ |
| `Worker` process class for multi-process deployment | ✅ |
M2.2 — NATS Distributed Transport¶
Status: ✅ Completed — March 2026
| Deliverable | Status |
|---|---|
| `NATSTransport` with JetStream support | ✅ |
| At-least-once delivery via durable consumers | ✅ |
| Multi-machine deployment support | ✅ |
| Worker multi-transport handoff | ✅ |
M2.3 — OTEL Observability¶
Status: ✅ Completed — April 2026
| Deliverable | Status |
|---|---|
| `Tracer` with automatic span generation per message | ✅ |
| `SpanQueue` with overflow protection | ✅ |
| `OTELAgent` batch exporter with configurable flush interval | ✅ |
| `ConsoleBackend` and `FanOutBackend` | ✅ |
| OTLP gRPC exporter plugin | ✅ |
| Trace propagation across agents (trace_id, parent_span_id) | ✅ |
M2.5 — EvalLoop (Local)¶
Status: ✅ Completed — April 2026
Corrective observability loop: a supervised EvalAgent process monitors agent behaviour and injects correction signals back into running agents. Local in-process evaluation only — remote eval engine integrations are M2.6. See design spec.
| Deliverable | Status |
|---|---|
| `civitas/evalloop.py` — `EvalEvent`, `CorrectionSignal`, `EvalAgent` base class | ✅ |
| `AgentProcess.emit_eval(event_type, payload, eval_agent)` — emit observable events | ✅ |
| `AgentProcess.on_correction(message)` — override hook for nudge/redirect signals | ✅ |
| `civitas.eval.halt` message type — cleanly stops target agent (`on_stop` still runs) | ✅ |
| Rate limiting — sliding window per target agent (`max_corrections_per_window`, `window_seconds`) | ✅ |
| `EvalExporter` protocol — interface defined, not implemented (M2.6) | ✅ |
| Topology YAML — `type: eval_agent` shorthand in `Runtime.from_config()` | ✅ |
| 20 unit + integration tests | ✅ |
| `EvalAgent` exported from civitas top-level package | ✅ |
Implementation checklist¶
- Core module — `civitas/evalloop.py`
    - `EvalEvent` dataclass: `agent_name`, `event_type`, `payload`, `trace_id`, `message_id`, `timestamp`
    - `CorrectionSignal` dataclass: `severity` (nudge / redirect / halt), `reason`, `payload`
    - `EvalExporter` protocol: `async export(event: EvalEvent) -> None`
    - `EvalAgent(AgentProcess)` — `handle()` routes `civitas.eval.event` messages
    - `on_eval_event(event: EvalEvent) -> CorrectionSignal | None` — override point
    - Rate limiter — sliding window, keyed by target agent name, drops + logs when exceeded
    - For nudge/redirect: send `civitas.eval.correction` to target agent
    - For halt: send `civitas.eval.halt` to target agent
- AgentProcess integration
    - `emit_eval(event_type, payload, eval_agent="eval_agent")` — sends `civitas.eval.event`; no-op if bus not wired
    - `on_correction(message: Message)` — override hook called on `civitas.eval.correction`
    - `civitas.eval.halt` handled in `_message_loop()` — breaks loop, `on_stop()` still runs
- Runtime + package
    - `type: eval_agent` shorthand in `Runtime.from_config()` `_build_node()`
    - `EvalAgent` exported from `civitas.__init__`
- Tests (≥ 12 unit + ≥ 1 integration)
    - `EvalEvent` and `CorrectionSignal` field validation
    - `on_eval_event()` returning None sends no correction
    - nudge signal delivered to `on_correction()` hook
    - redirect signal delivered to `on_correction()` hook
    - halt signal stops target agent (status → STOPPED, `on_stop` runs)
    - Rate limiter allows corrections up to the window limit
    - Rate limiter drops corrections beyond the window limit
    - Rate limiter resets after `window_seconds`
    - `emit_eval()` is no-op when bus not wired
    - `emit_eval()` reaches EvalAgent in a live runtime
    - Integration: full supervision tree — EvalAgent halts a misbehaving sibling
- Example + release
    - `examples/eval_agent.py` — policy enforcement with halt, redirect, nudge
    - `CHANGELOG.md` entry
M2.6 — Remote Eval Exporters¶
Status: ✅ Completed — v0.4 | Priority: 🔴 High
Plugin adapters connecting Civitas's EvalEvent stream to external eval engines. All platforms consume the same EvalEvent schema; each exporter translates to the platform's expected format. OTEL GenAI Semantic Conventions are the alignment layer — EvalEvent fields map directly to standard OTEL attributes. See design spec.
| Deliverable | Status |
|---|---|
| `EvalExporter` protocol implementation + registration on `EvalAgent` | ✅ |
| `civitas[arize]` — Arize Phoenix exporter (OTEL GenAI spans via OTLP) | ✅ |
| `civitas[fiddler]` — Fiddler exporter (export to Fiddler AI; two-way guardrail receive deferred to M4.2) | ✅ |
| `civitas[langfuse]` — Langfuse exporter (open-source, self-hostable) | ✅ |
| `civitas[braintrust]` — Braintrust exporter | ✅ |
| `civitas[langsmith]` — LangSmith exporter | ✅ |
| `emit_eval()` forwards to all registered exporters in addition to local EvalAgent | ✅ |
| Topology YAML — declare exporters per eval_agent node | ✅ |
| ≥ 5 unit tests per exporter (mocked SDK calls) | ✅ |
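Because every exporter consumes the same `EvalEvent` schema, each adapter reduces to a translation function over the event fields. A minimal sketch of the attribute-mapping idea, assuming the `EvalEvent` fields listed in the M2.5 checklist; the attribute names below are illustrative, not the shipped OTEL GenAI mapping:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class EvalEvent:
    # Fields as listed in the M2.5 checklist.
    agent_name: str
    event_type: str
    payload: dict[str, Any]
    trace_id: str
    message_id: str
    timestamp: float

def to_span_attributes(event: EvalEvent) -> dict[str, Any]:
    """Flatten an EvalEvent into span attributes; an exporter would
    attach these to an OTLP span correlated via event.trace_id."""
    attrs: dict[str, Any] = {
        "civitas.agent.name": event.agent_name,
        "civitas.eval.event_type": event.event_type,
        "civitas.message.id": event.message_id,
    }
    # Payload entries become namespaced attributes.
    for key, value in event.payload.items():
        attrs[f"civitas.eval.payload.{key}"] = value
    return attrs
```

Each platform adapter then only differs in how it ships the resulting attribute dict (OTLP span, SDK call, HTTP POST).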
Phase 3 — Developer Experience¶
M3.1–M3.3 — CLI and Dashboard¶
Status: ✅ Completed — March 2026
| Deliverable | Status |
|---|---|
| `civitas init` project scaffolding | ✅ |
| `civitas run` supervisor + worker modes | ✅ |
| `civitas topology` validate / show / diff | ✅ |
| `civitas deploy` docker-compose generation | ✅ |
| `civitas state` list / clear | ✅ |
| `civitas dashboard` live terminal dashboard | ✅ |
M3.4 — MCP Integration¶
Status: ✅ Completed — April 2026
MCP protocol plumbing — the wire layer between Civitas agents and MCP tool servers. Agents call tools by direct address (mcp://server/tool); the runtime handles handshake, transport, schema negotiation, and tracing. Agents also expose themselves as MCP servers so external LLM clients can discover and call them.
Scope: protocol wire layer only. Connection pooling, circuit breakers, unified tool namespacing, and semantic retrieval are not in scope — they belong to Fabrica. See design spec.
Dependency chain: M3.4 → M4.4 (ToolStore) → Fabrica (pooling + retrieval)
| Deliverable | Status |
|---|---|
| `civitas[mcp]` optional extra — `mcp>=1.0` dependency | ✅ |
| `MCPClient` — connect (stdio + SSE), list_tools, call_tool, persistent session via AsyncExitStack | ✅ |
| `MCPTool(ToolProvider)` — `mcp://server_name/tool_name` name scheme | ✅ |
| `AgentProcess.connect_mcp()` — connect + auto-register tools into `self.tools`; idempotent | ✅ |
| `self.tools.get("mcp://server/tool")` resolves to the registered MCPTool | ✅ |
| `MCPTool.execute()` emits `civitas.mcp.call` OTEL span | ✅ |
| `CivitasMCPServer(GenServer)` — deferred to Fabrica (scope boundary decision) | ⏸️ |
| Topology YAML `mcp.servers` block — auto-connect at agent startup | ✅ |
| 23 unit tests | ✅ |
Explicitly out of scope for M3.4:
- Connection pooling / persistent sessions — Fabrica (MCPToolSource)
- Circuit breakers per server — Fabrica
- Semantic or keyword tool retrieval (find_tools) — Fabrica
- Unified cross-agent tool namespace — M4.4 ToolStore
- Per-agent credential isolation — M4.2 Security Hardening
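The `mcp://server/tool` direct-address scheme above is easy to illustrate. A minimal sketch of the naming convention as a hypothetical helper (not the shipped parser):

```python
def parse_mcp_name(name: str) -> tuple[str, str]:
    """Split an mcp://server_name/tool_name address into its parts.

    Raises ValueError for anything that is not a well-formed
    mcp:// address.
    """
    prefix = "mcp://"
    if not name.startswith(prefix):
        raise ValueError(f"not an MCP tool address: {name!r}")
    server, sep, tool = name[len(prefix):].partition("/")
    if not server or not sep or not tool:
        raise ValueError(f"expected mcp://server/tool, got: {name!r}")
    return server, tool
```

With this scheme, `self.tools.get("mcp://github/create_issue")` is an exact-name lookup: no retrieval or namespace resolution happens at this layer, which is precisely the scope boundary with Fabrica.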
Implementation checklist¶
Ordered tasks — each step is independently mergeable.
- Package setup
    - `civitas/mcp/__init__.py` — package stub
    - `civitas/mcp/types.py` — `MCPServerConfig` (name, transport, command/args/env/url), `MCPToolSchema`
    - `civitas[mcp]` extra in `pyproject.toml` — `mcp>=1.0`
- MCP client
    - `civitas/mcp/client.py` — `MCPClient.__init__(config: MCPServerConfig)`
    - `MCPClient.list_tools()` — stdio transport: open subprocess session, call `list_tools`, close
    - `MCPClient.list_tools()` — SSE transport: open HTTP session, call `list_tools`, close
    - `MCPClient.call_tool(name, arguments)` — stdio transport
    - `MCPClient.call_tool(name, arguments)` — SSE transport
- MCPTool
    - `civitas/mcp/tool.py` — `MCPTool(ToolProvider)` wrapping `MCPClient` + `MCPToolSchema`
    - `MCPTool.name` returns `mcp://server_name/tool_name`
    - `MCPTool.schema` returns the JSON Schema from the MCP tool definition
    - `MCPTool.execute(**kwargs)` calls `client.call_tool()` and returns result
    - `MCPTool.execute()` emits `civitas.mcp.call` OTEL span (attributes: server, tool, transport)
- AgentProcess integration
    - `AgentProcess.connect_mcp(config)` — creates `MCPClient`, calls `list_tools`, registers each as `MCPTool` in `self.tools`
    - `connect_mcp()` is idempotent: deregisters existing tools for the same server before re-registering
    - `self.tools.get("mcp://github/create_issue")` resolves correctly via registered name
- MCP server exposure
    - `civitas/mcp/server.py` — `CivitasMCPServer(GenServer)`
    - `CivitasMCPServer.init()` — starts MCP stdio server in background task via `mcp.Server`
    - `list_tools` handler — returns schemas from injected `ToolRegistry`
    - `call_tool` handler — calls the matching `MCPTool.execute()` or raises `ToolNotFoundError`
- Topology YAML support
    - Runtime loader reads `mcp.servers` block, creates `MCPServerConfig` instances
    - Agents auto-connect configured servers during startup (before first message)
    - `mcp.expose.enabled: true` starts `CivitasMCPServer` as a supervised child
    - `civitas topology validate` accepts `mcp:` section without errors
- Tests (≥ 10 unit, ≥ 2 integration)
    - `MCPServerConfig` validation (missing transport fields, unknown transport)
    - `MCPTool.name` follows `mcp://` scheme
    - `MCPTool.schema` returns correct JSON Schema
    - `MCPTool.execute()` calls `client.call_tool()` with correct args
    - `MCPTool.execute()` emits OTEL span
    - `connect_mcp()` registers tools in `self.tools`
    - `connect_mcp()` deregisters old tools on reconnect (idempotency)
    - `self.tools.get("mcp://server/tool")` returns correct tool
    - `CivitasMCPServer` `list_tools` returns all registered tools
    - `CivitasMCPServer` `call_tool` routes to correct tool
    - Integration: agent connects to real stdio MCP echo server, calls a tool
    - Integration: `CivitasMCPServer` handles `list_tools` request from real MCP client
- Release
    - `CHANGELOG.md` entry under `## [0.3.0]`
    - Example: `examples/mcp_agent.py` — agent connecting to a stdio MCP server
    - `mkdocs.yml` nav updated with MCP integration design doc
M3.5 — GenServer¶
Status: ✅ Completed — April 2026
OTP-style generic server primitive for separating stateful API/RPC service processes from AI agent processes on the message bus. See design spec.
| Deliverable | Status |
|---|---|
| `GenServer` base class with `handle_call` / `handle_cast` / `handle_info` dispatch | ✅ |
| `call()` — synchronous request-reply with timeout | ✅ |
| `cast()` — async fire-and-forget | ✅ |
| `send_after()` — delayed self-message (tick / timer support) | ✅ |
| `init()` — startup initialisation hook | ✅ |
| Supervision-compatible (works as a child of any Supervisor) | ✅ |
| Topology YAML support (`type: gen_server`) | ✅ |
| 19 unit tests | ✅ |
| `examples/rate_limiter.py` — token-bucket rate limiter demo | ✅ |
Implementation checklist¶
Ordered tasks — each step is independently mergeable.
- Core module — `civitas/genserver.py`
    - `GenServer(AgentProcess)` class — no LLM or tool plugin injection
    - `handle()` dispatcher: route by `reply_to` → `handle_call`; `__cast__` marker → `handle_cast`; else → `handle_info`
    - `handle_call` / `handle_cast` / `handle_info` stubs with correct signatures
    - `async def init()` hook invoked once at process start
    - `send_after(delay_ms, payload)` — schedules `handle_info` to self
    - Track `send_after` tasks; cancel all on `stop()`
    - Enforce `handle_call` returns a dict (reject `None` to prevent caller hangs)
- `call()` / `cast()` aliases
    - `AgentProcess.call(name, payload, timeout)` — alias over existing `ask()`
    - `AgentProcess.cast(name, payload)` — `send()` with `__cast__` marker
    - `Runtime.call()` / `Runtime.cast()` — external entry points
- Topology YAML support
    - Loader accepts `type: gen_server` (module/class resolution identical to `type: agent`)
    - `civitas topology validate` passes for gen_server nodes
    - `civitas topology show` renders gen_server with distinct icon/label
    - `civitas topology diff` treats gen_server nodes correctly
- Observability
    - Emit `civitas.genserver.call` span for `handle_call`
    - Emit `civitas.genserver.cast` span for `handle_cast`
    - Emit `civitas.genserver.info` span for `handle_info`
    - Trace propagation preserved across `call()` boundaries
- Tests (≥ 15 cases in `tests/test_genserver.py`)
    - `handle_call` returns reply via `reply_to`
    - `handle_cast` runs, no reply emitted
    - `handle_info` invoked for non-call non-cast messages
    - `call()` timeout raises within configured bound
    - `send_after` fires `handle_info` after delay
    - `send_after` tasks cancelled cleanly on `stop()`
    - `init()` runs before first message handled
    - GenServer as child of `ONE_FOR_ONE`, `ONE_FOR_ALL`, `REST_FOR_ONE` supervisors
    - Restart triggers `init()` again (state resets unless `StateStore` configured)
    - `StateStore`-backed state survives restart
    - `self.llm` not present on GenServer instance
    - `self.tools` not present on GenServer instance
    - `handle_call` returning non-dict raises
    - GenServer ↔ AgentProcess sibling communication round-trip
    - Topology YAML round-trip: load → run → `topology show` matches
- Example + documentation
    - `examples/rate_limiter/` — end-to-end `RateLimiter(GenServer)` with consumer agent
    - User guide page referencing `docs/design/genserver.md`
    - API reference entry for `civitas.genserver`
    - `mkdocs.yml` nav updated
- Release
    - `CHANGELOG.md` entry under `## [0.3.0]`
    - Cross-reference M3.4 (MCP) and M2.5 (EvalLoop) for coordinated v0.3 cut
Infrastructure & Release¶
Status: ✅ Completed — April 2026
| Deliverable | Status | Completed |
|---|---|---|
| Agency → Civitas rename (115 files) | ✅ | Apr 2026 |
| Pre-commit hooks (ruff, mypy, file hygiene) | ✅ | Apr 2026 |
| GitHub Actions CI (Python 3.12 / 3.13 / 3.14) | ✅ | Apr 2026 |
| PyPI publishing via OIDC trusted publishing | ✅ | Apr 2026 |
| GitHub Pages documentation site | ✅ | Apr 2026 |
| Test coverage raised from 85% → 90%+ | ✅ | Apr 2026 |
| Framework adapters: LangGraph, OpenAI Agents SDK | ✅ | Mar 2026 |
| Framework adapters: CrewAI | ⏳ | — |
Phase 4 — Platform Maturation¶
M4.1b — Dynamic Agent Spawning¶
Status: ✅ Completed — April 2026 | Priority: 🔴 High
Agents spawn and decommission other agents at runtime. Enables LLM-driven orchestrators that create specialist agents on demand. See design spec.
Design decisions locked:
- DynamicSupervisor is a separate class from Supervisor (Erlang-faithful separation — ONE_FOR_ONE only, starts empty)
- DynamicSupervisor is declared as a static child in topology YAML; its children are dynamic
- self.spawn() targets the nearest ancestor DynamicSupervisor — no explicit target at the call site
- on_spawn_requested is a governance veto hook on DynamicSupervisor (return False to deny)
- max_children enforces blast radius per DynamicSupervisor
Open design questions (being resolved):
- ~~Q2 — Restart semantics~~ → transient default; no escalation on exhaustion; on_child_terminated hook
- Q3 — on_spawn_requested placement (supervisor vs agent vs both)
- ~~Q4 — Limit semantics~~ → both: max_children (concurrent) + max_total_spawns (lifetime budget)
- ~~Q5 — Despawn semantics~~ → despawn() hard stop + stop(drain, timeout) soft stop (awaitable, timeout fallback to hard stop)
- ~~Q6 — Cross-process spawning~~ → bus message protocol from day one; in-process v0.4; cross-process v0.5 (homogeneous deployments)
- ~~Q7 — topology show live state~~ → TopologyServer(GenServer) JSON HTTP endpoint; CLI pings /topology; falls back to static YAML if unreachable
| Deliverable | Status |
|---|---|
| `DynamicSupervisor` class — starts empty, ONE_FOR_ONE, `max_children` + `max_total_spawns` limits | ✅ |
| `type: dynamic_supervisor` in topology YAML | ✅ |
| `self.spawn(AgentClass, name, config)` — nearest ancestor routing | ✅ |
| `self.despawn(name)` — hard stop; `self.stop(drain, timeout)` — soft stop | ✅ |
| `on_spawn_requested` governance hook on DynamicSupervisor | ✅ |
| `on_child_terminated` notification hook on spawning agent | ✅ |
| `Runtime.spawn()` / `Runtime.despawn()` / `Runtime.stop_agent()` — external entry points | ✅ |
| `SpawnError` added to error hierarchy | ✅ |
| 38 unit + integration tests | ✅ |
| `TopologyServer(GenServer)` — supervised JSON HTTP management endpoint | ✅ |
| `topology show` pings TopologyServer; falls back to static YAML | ✅ |
| `examples/dynamic_spawning.py` | ✅ |
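The two limits resolved in Q4 interact in a specific way: `max_children` caps concurrency and is freed when a child terminates, while `max_total_spawns` is a lifetime budget that never replenishes. A minimal sketch of just that accounting (the class is hypothetical; only the limit names come from the milestone):

```python
class SpawnError(Exception):
    """Raised when a spawn request exceeds a blast-radius limit."""

class SpawnBudget:
    """Tracks the Q4 limits: max_children (concurrent)
    and max_total_spawns (lifetime)."""

    def __init__(self, max_children: int, max_total_spawns: int) -> None:
        self.max_children = max_children
        self.max_total_spawns = max_total_spawns
        self.active = 0   # currently running dynamic children
        self.total = 0    # spawns ever granted; never decreases

    def on_spawn(self) -> None:
        if self.active >= self.max_children:
            raise SpawnError("max_children reached")
        if self.total >= self.max_total_spawns:
            raise SpawnError("max_total_spawns exhausted")
        self.active += 1
        self.total += 1

    def on_terminated(self) -> None:
        # Frees a concurrency slot; the lifetime counter stays put.
        self.active -= 1
```

An LLM-driven orchestrator that churns through specialists therefore hits the lifetime budget eventually even if it never exceeds the concurrent cap.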
M4.2 — Security Hardening¶
Status: ✅ Completed — v0.4 | Priority: 🔴 High
Design approved. Splits into five independently shippable sub-milestones — see docs/design/security-hardening.md for full rationale, design decisions, and resolved questions.
Recommended delivery order: a → c → d → e → b.
M4.2a — Identity & Signing¶
Status: ✅ Complete
| Deliverable | Status |
|---|---|
| `civitas/security/` package: `IdentityConfig`, `SigningConfig`, `SecurityConfig` | ✅ |
| `AgentIdentity`: Ed25519 keypair generation, OpenSSH-style storage (`id_ed25519` / `id_ed25519.pub`) | ✅ |
| `KeyRegistry`: public key lookup by agent name | ✅ |
| `MessageSigner`: sign outgoing envelopes (v=2 wire format), verify incoming | ✅ |
| `NonceCache`: bounded LRU replay protection (10k entries) | ✅ |
| `SignatureError` — new CivitasError subclass | ✅ |
| `SigningSerializer` wrapping `MsgpackSerializer` | ✅ |
| Multi-node key distribution: public keys in topology YAML; spawn-message vouching for dynamic agents | ✅ |
| `security:` YAML block parsing in `Runtime.from_config()` | ✅ |
| InProcess transport: signing bypassed entirely (D9 performance rule) | ✅ |
| `signing.allow_unsigned: true` escape hatch for rolling upgrades | ✅ |
| Unit + integration tests, ≥ 90% coverage on new code | ✅ |
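The bounded-LRU replay protection behind `NonceCache` can be sketched with an `OrderedDict`. A minimal illustration, assuming only what the table states (bounded LRU, 10k entries); the method name here is hypothetical:

```python
from collections import OrderedDict

class NonceCache:
    """Bounded LRU of seen message nonces for replay protection.

    A nonce seen twice within the retention window is a replay and
    is rejected; once evicted by capacity, a nonce is forgotten.
    """

    def __init__(self, max_entries: int = 10_000) -> None:
        self.max_entries = max_entries
        self._seen: OrderedDict[str, None] = OrderedDict()

    def check_and_store(self, nonce: str) -> bool:
        """Return True if the nonce is fresh, False on replay."""
        if nonce in self._seen:
            self._seen.move_to_end(nonce)  # keep hot replays resident
            return False
        self._seen[nonce] = None
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # evict least-recently-seen
        return True
```

The bound is the usual trade-off: replay protection holds only within the cache's retention horizon, which is why envelope timestamps typically back it up.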
M4.2b — Transport mTLS¶
Status: ✅ Complete
| Deliverable | Status |
|---|---|
| ZMQ CURVE: server keypair on proxy, client keypairs on Workers | ✅ |
| NATS TLS + nkeys: Ed25519-based subject auth, TLS cert/key/CA config | ✅ |
| `security.transport` YAML block plumbing into ZMQ and NATS transports | ✅ |
| `civitas security init` CLI — scaffold keys and config for ZMQ/NATS deployments | ✅ |
M4.2c — Credential Isolation¶
Status: ✅ Complete
| Deliverable | Status |
|---|---|
${VAR_NAME} env-var substitution in Runtime.from_config() |
✅ |
Unset variable raises ConfigurationError with clear message |
✅ |
civitas.secrets.SecretsProvider protocol + file/env/Vault implementations |
✅ |
Per-agent credentials: block in topology YAML |
✅ |
Plugin handles: self.llm("anthropic") resolves per-agent credential at call time |
✅ |
M4.2d — Tool Sandbox¶
Status: ✅ Complete
| Deliverable | Status |
|---|---|
| Bubblewrap wrapper for MCP subprocess execution on Linux | ✅ |
| `sandbox:` YAML block per MCP server (network, filesystem allowlists) | ✅ |
| Refuse-to-start when `sandbox.enabled: true` and `bwrap` unavailable | ✅ |
| Clear error messages with per-distro install instructions | ✅ |
M4.2e — Audit Log¶
Status: ✅ Complete
| Deliverable | Status |
|---|---|
| `civitas.audit` module: `AuditEvent` TypedDict, `AuditSink` protocol | ✅ |
| `JsonlFileSink`: batched fsync (100ms / 100 events), `sync_writes` option, SIGHUP rotation | ✅ |
| `NullSink` for tests | ✅ |
| Emission at chokepoints: `MessageBus.route()`, `MCPTool.execute()`, sandbox violations, secret access | ✅ |
| `SyslogSink` and `OtlpSink` implementations | ✅ |
M4.3 — Codebase Security & Enterprise Posture¶
Status: ✅ Completed — April 2026 | Priority: 🔴 High
Complements M4.2. Where M4.2 hardens the runtime (mTLS, message signing, credential isolation, sandboxing), M4.3 hardens the codebase and supply chain so enterprises have a clear security story before adoption: known vulnerabilities tracked, dependencies scanned, secrets never committed, a published threat model, and a documented disclosure process.
The deliverables are split across tooling (CI-enforced scanners), documentation (threat model, security architecture, adoption checklist), and process (disclosure policy, release notes, third-party audit).
| Deliverable | Status |
|---|---|
| SAST in CI — Bandit + Semgrep on every PR, fail build on HIGH+ | ✅ |
| Dependency scanning — pip-audit in CI + Dependabot weekly | ✅ |
| SBOM generation — CycloneDX SBOM published with every release | ✅ |
| Secret scanning — gitleaks pre-commit hook + CI job on full history | ✅ |
| `docs/security/threat-model.md` — STRIDE analysis per runtime component | ✅ |
| `docs/security/architecture.md` — security model (trust boundaries, supervision, transport isolation) | ✅ |
| `SECURITY.md` — responsible disclosure policy, contact, supported versions, response SLAs | ✅ |
| `docs/security/enterprise-checklist.md` — adoption checklist (deployment hardening, config review, audit log integration) | ✅ |
| External security audit before v1.0 — fix all HIGH+ findings, publish summary | ⏳ Deferred to pre-v1.0 |
| Continuous posture — CVE watch on runtime deps, security release notes, CVSS-scored advisories | ⏳ Ongoing process |
M4.4 — Capability-Aware Registry¶
Status: ✅ Completed — May 2026 | Priority: 🟡 Medium
Agents declare capability tags at the class level; the registry supports filtered lookups; agents can route to any capable peer without knowing its name.
| Deliverable | Status |
|---|---|
| `RoutingEntry.capabilities` + `RoutingEntry.capability_metadata` fields | ✅ |
| `LocalRegistry.register()` / `register_remote()` accept capabilities | ✅ |
| `find_by_capability(tag)` — all agents (local + remote) with that tag | ✅ |
| `find_by_capabilities(tags, match="any"\|"all")` — multi-tag filtered lookups | ✅ |
| `AgentProcess.capabilities` / `capability_metadata` class-level declarations | ✅ |
| `AgentProcess.send_capable(capability, payload)` — fire-and-forget to any capable agent | ✅ |
| `CapabilityNotFoundError` raised when no registered agent declares the tag | ✅ |
| YAML `capabilities:` / `capability_metadata:` block overrides class-level defaults | ✅ |
| Distributed propagation: Worker announcements carry capabilities; `_on_remote_register` populates remote entries | ✅ |
| `RegistryListener` hook: async callbacks fired after every register/deregister (Presidium integration point) | ✅ |
| `LocalRegistry.add_listener()` / `remove_listener()` — fire-and-forget tasks with error logging | ✅ |
| Public exports: `RoutingEntry`, `RegistryListener`, `CapabilityNotFoundError` from civitas top-level | ✅ |
| 29 unit tests covering all registry operations, listener lifecycle, and `send_capable` | ✅ |
Design notes¶
Boundary with Presidium: Civitas capability tags are operational routing data — plain strings by convention (e.g., "text.summarize"). Presidium owns the controlled vocabulary, human-readable descriptions, and governance metadata. Presidium plugs in via the RegistryListener hook — it receives every register/deregister event with full capability info and maintains its own authoritative Agent Registry.
Distributed topology: Every node (Runtime and Worker) has a complete capability view of the deployment. Worker announcements include capabilities and capability_metadata; the Runtime's _on_remote_register handler populates register_remote() entries. send_capable() thus works transparently across process boundaries.
Tag format: plain strings, dot-namespaced by convention ("domain.action"). No enum enforcement — Presidium owns the controlled vocabulary and Civitas treats tags as opaque routing keys.
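Because tags are opaque strings, the `match="any"|"all"` lookup reduces to set operations. A minimal sketch of that semantics, assuming only the names in the table (the standalone function stands in for the registry method):

```python
from dataclasses import dataclass, field

class CapabilityNotFoundError(Exception):
    """No registered agent declares the requested capability tag(s)."""

@dataclass
class RoutingEntry:
    name: str
    capabilities: set[str] = field(default_factory=set)

def find_by_capabilities(entries: list[RoutingEntry],
                         tags: set[str],
                         match: str = "any") -> list[RoutingEntry]:
    """Filtered lookup: match='any' needs at least one shared tag,
    match='all' needs every requested tag declared."""
    if match == "any":
        hits = [e for e in entries if e.capabilities & tags]
    elif match == "all":
        hits = [e for e in entries if tags <= e.capabilities]
    else:
        raise ValueError(f"match must be 'any' or 'all', got {match!r}")
    if not hits:
        raise CapabilityNotFoundError(f"no agent declares {sorted(tags)}")
    return hits
```

`send_capable()` is then a thin layer on top: pick one entry from the hits and route to it by name, without the caller ever knowing that name.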
HTTP Gateway¶
Status: ✅ Completed — April 2026
Supervised edge process bridging external HTTP traffic into the Civitas message bus. HTTP/1.1 + HTTP/2 (uvicorn) and HTTP/3 / QUIC (aioquic) in v0.4. gRPC deferred to v0.5. See design spec.
| Deliverable | Status |
|---|---|
| `HTTPGateway(AgentProcess)` — ASGI app, request translation, route table | ✅ |
| HTTP/1.1 + HTTP/2 via `uvicorn[standard]` — uvloop + httptools (`civitas[http]`) | ✅ |
| HTTP/3 / QUIC via aioquic — Alt-Svc header, 0-RTT (`civitas[http3]`) | ✅ |
| TLS config from topology YAML / env vars | ✅ |
| Topology YAML support (`type: http_gateway`) | ✅ |
| Graceful drain on supervisor shutdown | ✅ |
| ≥ 20 unit tests + ≥ 5 integration tests | ✅ |
| `examples/http_gateway.py` | ✅ |
| gRPC via grpclib / grpcio | ⏸️ v0.5 |
| Custom `.proto` loading from `proto_dir` | ⏸️ v0.5 |
Implementation checklist¶
- Package setup
    - `civitas/gateway/__init__.py` — package stub, re-export `HTTPGateway`
    - `civitas[http]` extra in `pyproject.toml` — `uvicorn[standard]>=0.30`
    - `civitas[http3]` extra — `aioquic>=1.0`
- Core — `civitas/gateway/core.py`
    - `GatewayConfig` dataclass — `host`, `port`, `port_quic`, `tls_cert`, `tls_key`, `request_timeout`, `enable_http3`
    - `HTTPGateway(AgentProcess)` — holds config, route table, uvicorn server reference
    - `on_start()` — install uvloop (Linux/macOS), start uvicorn server as background task
    - `on_stop()` — signal uvicorn to drain in-flight requests, cancel server task
    - `handle()` — handles internal messages (e.g., topology-triggered reconfiguration); no-op for now
- ASGI app — `civitas/gateway/asgi.py`
    - `GatewayASGI.__call__(scope, receive, send)` — ASGI callable
    - HTTP scope: parse method, path, headers, body
    - Route lookup: path + method → agent name, mode (`call` vs `cast`)
    - Default routes: `POST /agents/{name}` → `call`, `POST /agents/{name}/cast` → `cast`
    - HTTP → `Message` translation: body → `payload`, `X-Civitas-Type` → `type`, `traceparent` → trace context
    - `call()` mode: await reply, serialise `payload` as JSON response body
    - `cast()` mode: fire-and-forget, return HTTP 202
    - Timeout: `asyncio.wait_for` with `request_timeout`; return HTTP 504 on expiry
    - Error mapping: `payload.error` → 400, no route → 404, unhandled exception → 500
- Router — `civitas/gateway/router.py`
    - `RouteEntry` dataclass — `method`, `path_pattern`, `agent`, `mode`
    - `RouteTable` — ordered list of `RouteEntry`; `match(method, path)` returns `(RouteEntry, path_params)`
    - Path parameter extraction: `{name}` segments captured into dict
    - Default route fallback when no custom routes are configured
    - YAML route loading: `config.routes` list → `RouteEntry` instances
- HTTP/3 — `civitas/gateway/h3.py`
    - `H3Server` — wraps aioquic QUIC server; runs on `port_quic` (UDP)
    - HTTP/3 request → same `GatewayASGI` handler (reuse ASGI layer)
    - `Alt-Svc: h3=":port_quic"` header injected into all HTTP/1.1 and HTTP/2 responses
    - `H3Server` started / stopped alongside uvicorn in `on_start()` / `on_stop()`
- Topology YAML support
    - `type: http_gateway` in `Runtime.from_config()` `_build_node()`
    - `GatewayConfig` populated from YAML `config:` block; `!ENV` resolver for TLS cert/key paths
    - `civitas topology validate` accepts `type: http_gateway` nodes without errors
    - `civitas topology show` displays gateway node with `[http]` / `[http3]` label
- Tests (≥ 20 unit, ≥ 5 integration)
    - `RouteTable.match()` — exact path, path parameters, method mismatch, no route
    - Default route fallback: `POST /agents/foo` → `call("foo", body)`
    - `call` mode: reply payload returned as JSON 200
    - `cast` mode: 202 returned immediately
    - Timeout: `request_timeout=0.001` → 504
    - Error mapping: `payload.error` → 400; unhandled exception → 500
    - No route: 404
    - `traceparent` header propagated into `message.trace_id`
    - `GatewayConfig` validation: missing TLS cert when `enable_http3=True`
    - `on_start()` installs uvloop on Linux
    - `on_stop()` cancels server task cleanly
    - Integration: real HTTP client (`httpx.AsyncClient`) → gateway → `AgentProcess` → reply
    - Integration: concurrent requests all return correct replies
    - Integration: gateway node in topology YAML starts correctly via `Runtime.from_config()`
- Example + release
    - `examples/http_gateway.py` — minimal REST API with two agent endpoints
    - `CHANGELOG.md` entry under `## [Unreleased]`
Gateway API Surface¶
Status: ✅ Completed — April 2026
Declarative routes, Pydantic request/response validation, middleware chain, and auto-generated OpenAPI 3.1 docs on top of HTTPGateway. See design spec.
| Deliverable | Status |
|---|---|
| `@route` decorator — documents HTTP method + path on agent handler (YAML is authoritative for wiring) | ✅ |
| Path parameter extraction into `message.payload` | ✅ |
| `@contract` decorator — Pydantic request/response validation, 422 error shape | ✅ |
| `GatewayRequest` / `GatewayResponse` / `NextMiddleware` types | ✅ |
| Global + route-scoped middleware chain | ✅ |
| Stateful GenServer middleware via `request.gateway.call()` | ✅ |
| Auto-generated OpenAPI 3.1 spec at `GET /openapi.json` | ✅ |
| Swagger UI at `GET /docs`, ReDoc at `GET /redoc` | ✅ |
| YAML-declared routes and schemas (no decorators required) | ✅ |
| `civitas topology validate` cross-checks YAML routes against `@route` decorators | ✅ |
| ≥ 15 unit tests + ≥ 3 integration tests | ✅ |
Routing authority: YAML is the single source of truth for gateway wiring. @route stores metadata on the method object only — it is never read by the gateway at runtime. Its value is (1) colocated documentation of intent and (2) a machine-checkable annotation that civitas topology validate cross-references against YAML to warn on drift.
Implementation checklist¶
- Types —
civitas/gateway/types.py -
GatewayRequestdataclass —method,path,path_params,query_params,headers,body,client_ip,gateway(AgentProcess ref) -
GatewayResponsedataclass —status,body,headers -
NextMiddlewaretype alias —Callable[[GatewayRequest], Awaitable[GatewayResponse]] -
Route decorator —
civitas/gateway/router.py -
@route(method, path, mode="call")— stores_civitas_routemetadata dict on the decorated function; no side effects, no global registry -
RouteTable.from_config(routes_config)— sole runtime source; buildsRouteEntrylist from topology YAMLroutes:block -
RouteTable.from_class(cls)— validation-only helper; scans class methods for_civitas_routemetadata; used exclusively bycivitas topology validate -
civitas topology validate: when a gateway node references an agent, import the class and warn if a YAML route has no matching@routeon the handler, or if a@routeexists with no corresponding YAML entry -
Contract decorator —
civitas/gateway/contracts.py -
- `@contract(request=Model, response=Model)` — stores `_civitas_contract` metadata on the function; `request` and `response` are optional Pydantic `BaseModel` subclasses
- Request validation in ASGI dispatch: if the route has a contract, `Model.model_validate(body)` before calling the bus; 422 on `ValidationError` with FastAPI-compatible error shape `{"detail": [...]}`
- Response validation: `Model.model_validate(reply_payload)` after the reply is received; 500 on mismatch
- No-op when `@contract` is not applied — pass-through

Middleware — `civitas/gateway/middleware.py`

- `MiddlewareChain` — ordered list of async callables; builds the `call_next` chain via closures
- Global middleware loaded from `config.middleware` (dotted import path → callable)
- Route-scoped middleware loaded from `route.middleware`
- Execution order: global → route-scoped → contract validation → bus dispatch
- Short-circuit: middleware returning a `GatewayResponse` without calling `call_next` skips the remainder of the chain

Wire into ASGI — `civitas/gateway/asgi.py` updates

- Replace direct bus dispatch with: build `GatewayRequest` → run middleware chain → contract validation → dispatch
- `GatewayRequest.gateway` set to the `HTTPGateway` instance (for stateful GenServer middleware)
- Contract metadata read from the agent class method via `@route` + `@contract` on the matched handler

OpenAPI — `civitas/gateway/openapi.py`

- `build_spec()` — reads the `RouteTable` (from YAML) + loads the agent class to read `@contract` metadata
- Generates OpenAPI 3.1 `paths` from route entries
- Request body schema from `@contract(request=Model)` via `Model.model_json_schema()`
- Response schema from `@contract(response=Model)`
- Tags from agent name
- Auto-includes a 422 response schema when a request model is declared
- `GET /openapi.json` — returns the generated spec
- `GET /docs` — Swagger UI (CDN-hosted, no static assets)
- `docs.enabled: false` config disables all three endpoints

Tests (≥ 15 unit, ≥ 3 integration)

- `@route` stores metadata on the function, no global registry side effect
- `RouteTable.from_config()` builds routes correctly from a config dict
- `RouteTable.from_class()` reads `@route` metadata from class methods
- Path parameters extracted correctly from the URL
- `@contract` request validation: valid body → dispatched; invalid → 422 with FastAPI error shape
- `@contract` response validation: valid reply → 200; invalid → 500
- Middleware chain: all middleware called in order
- Middleware short-circuit: returning a response without `call_next` skips the rest of the chain
- Global middleware runs before route-scoped middleware
- `/openapi.json` returns a valid OpenAPI 3.1 spec
- `/docs` returns 200 with Swagger UI HTML
- `docs.enabled: false` → `/docs` returns 404
- Tags populated from agent name
- Integration: end-to-end with a real HTTP client

Example + release

- `examples/http_gateway.py` — minimal REST API with agent endpoints
- `CHANGELOG.md` entry
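The closure-based `call_next` chain and the short-circuit behaviour described above can be sketched roughly as follows. This is an illustrative standalone sketch, not the Civitas implementation: the `GatewayRequest`/`GatewayResponse` stand-ins and the `audit`/`auth` middleware are invented for the example; only the `MiddlewareChain`/`call_next` shape follows the milestone text.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-ins for the gateway types referenced above.
class GatewayRequest:
    def __init__(self, path: str) -> None:
        self.path = path

class GatewayResponse:
    def __init__(self, status: int, body: str) -> None:
        self.status = status
        self.body = body

Handler = Callable[[GatewayRequest], Awaitable[GatewayResponse]]
Middleware = Callable[[GatewayRequest, Handler], Awaitable[GatewayResponse]]

class MiddlewareChain:
    """Ordered middleware; builds the call_next chain via closures."""

    def __init__(self, middleware: list[Middleware]) -> None:
        self._middleware = middleware

    def build(self, endpoint: Handler) -> Handler:
        handler = endpoint
        # Wrap from the inside out so list order equals execution order.
        for mw in reversed(self._middleware):
            def wrapped(req, mw=mw, call_next=handler):
                return mw(req, call_next)
            handler = wrapped
        return handler

async def audit(req: GatewayRequest, call_next: Handler) -> GatewayResponse:
    resp = await call_next(req)  # pass-through middleware
    resp.body += " [audited]"
    return resp

async def auth(req: GatewayRequest, call_next: Handler) -> GatewayResponse:
    if req.path.startswith("/admin"):
        # Short-circuit: call_next never runs, rest of chain is skipped.
        return GatewayResponse(403, "forbidden")
    return await call_next(req)

async def endpoint(req: GatewayRequest) -> GatewayResponse:
    return GatewayResponse(200, "ok")

chain = MiddlewareChain([audit, auth]).build(endpoint)
print(asyncio.run(chain(GatewayRequest("/users"))).body)    # ok [audited]
print(asyncio.run(chain(GatewayRequest("/admin"))).status)  # 403
```

The same order applies when route-scoped middleware is appended after the global list: earlier entries wrap later ones, so global middleware always sees the request first.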
Postgres StateStore + Migration¶
Status: ✅ Completed — May 2026 | Priority: 🔴 High
SQLite works for single-process deployments but breaks under concurrent cross-process writes (ZMQ Level 2+, NATS Level 3). PostgresStateStore extends the StateStore protocol — switching backends is a topology YAML change with no agent code changes.
| Deliverable | Status |
|---|---|
| `StateStore` protocol extended with `list_agents()` and `close()` | ✅ |
| `InMemoryStateStore.list_agents()` / `close()` | ✅ |
| `PostgresStateStore` — asyncpg backend, connection pool, lazy init | ✅ |
| `civitas_agent_state` table — JSONB, upsert, `updated_at` timestamp | ✅ |
| Pool config via topology YAML — `min_size`, `max_size`, `timeout` | ✅ |
| `civitas[postgres]` optional extra — `asyncpg>=0.29` | ✅ |
| Plugin loader entry: `type: postgres` | ✅ |
| `@runtime_checkable` `StateStore` — `isinstance()` checks work | ✅ |
| `civitas state migrate <src> <dst>` — dry-run by default, `--execute` to apply | ✅ |
| `_parse_dsn()` — `sqlite:<path>`, `.db`/`.sqlite` extension, `postgresql://` URL | ✅ |
| Helpful `ImportError` with install hint if asyncpg not installed | ✅ |
| 20 unit tests covering protocol, `PostgresStateStore` (mocked), and migrate CLI | ✅ |
| Zero-downtime dual-write migration | ⏸️ Deferred — maintenance-window copy is sufficient for v0.4 |
| PgBouncer deployment guide | ⏸️ Deferred to docs pass |
| MySQL StateStore (aiomysql/asyncmy backend) | ⏸️ Deferred — see below |
MySQL StateStore — deferred because Postgres covers the multi-process persistence gap and asyncpg is a better async foundation. Add it if users are already running MySQL and cannot introduce a second database. The implementation is a clean ~100-line addition following the same plugin pattern (`civitas[mysql]` extra, `type: mysql` loader entry, `mysql://` DSN in `_parse_dsn`).
M4.1 — Visual Topology Editor¶
Status: ⏸️ Deferred | Priority: 🟢 Low
Web-based drag-and-drop editor for designing agent topologies visually.
| Deliverable | Status |
|---|---|
| Drag-and-drop agent/supervisor canvas | ⏸️ |
| Visual message flow connections | ⏸️ |
| Supervision strategy configuration via UI | ⏸️ |
| Export to valid Civitas topology YAML | ⏸️ |
| Round-trip: imported YAML renders correctly | ⏸️ |
Phase 5 — Agentic Platform¶
Ideas awaiting full design specs. Each is a supervised GenServer (or group of GenServers) that runs inside the user's deployment — not external services, not SaaS. The SaaS boundary sits above these: hosted registries, managed observability, and multi-tenant governance are separate concerns.
Prompt Library & Playground¶
Status: 💡 Idea — to be specced | Priority: 🔴 High
Prompts become first-class versioned entities, stored and served by a supervised `PromptStore` (GenServer). Agents load instructions by name rather than hardcoding strings — prompt changes never require a code deploy. The playground (CLI + dashboard tab) lets you test a prompt version against a live agent before promoting it.
This is one of the strongest SaaS upgrade stories: the OSS PromptStore runs in your deployment; a hosted version adds a web UI for non-engineers, team collaboration, cross-deployment promotion, and output analytics.
| Idea | Notes |
|---|---|
| `PromptStore` (GenServer) — versioned prompt storage on the bus | Agents call `call("prompt_store", {"agent": "assistant", "slot": "system"})` |
| SQLite backend (runtime-mutable) + YAML dir backend (git-tracked) | User chooses per deployment |
| Named version aliases — `latest`, `stable`, `experimental` | Pinned per agent per environment in topology YAML |
| Per-agent, per-slot prompt mapping | Each agent can have multiple slots: `system`, `few_shot`, `tools` |
| Hot-swap support — reload prompt without restarting agent | Agent subscribes to prompt update events |
| `civitas playground` CLI — interactive session with a specified prompt version | Test against the live runtime before promoting |
| Dashboard tab — side-by-side prompt diff, test messages, output comparison | Lightweight eval harness backed by EvalLoop (M2.5) |
| A/B traffic splitting between prompt versions | Random split; metrics tracked via OTEL spans |
| SaaS layer — web UI, team collaboration, cross-deployment promotion, analytics | `design/prompt-library.md` — to be written |
| Spec | `design/prompt-library.md` — to be written |
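The (agent, slot) keying and named-alias ideas above can be sketched with a minimal in-memory store. Everything here is a sketch of the idea, not the real Civitas API: method names, the 1-based version numbers, and the alias mechanics are assumptions; only the `(agent, slot)` keying and the `latest`/`stable` alias concept come from the table.

```python
class PromptStore:
    """Minimal in-memory sketch of a versioned prompt store."""

    def __init__(self) -> None:
        # (agent, slot) -> ordered list of prompt texts (versions).
        self._versions: dict[tuple[str, str], list[str]] = {}
        # (agent, slot) -> {alias name: pinned version number}.
        self._aliases: dict[tuple[str, str], dict[str, int]] = {}

    def put(self, agent: str, slot: str, text: str) -> int:
        versions = self._versions.setdefault((agent, slot), [])
        versions.append(text)
        return len(versions)  # 1-based version number

    def alias(self, agent: str, slot: str, name: str, version: int) -> None:
        self._aliases.setdefault((agent, slot), {})[name] = version

    def get(self, agent: str, slot: str, ref: str = "latest") -> str:
        versions = self._versions[(agent, slot)]
        if ref == "latest":
            return versions[-1]
        return versions[self._aliases[(agent, slot)][ref] - 1]

store = PromptStore()
v1 = store.put("assistant", "system", "You are terse.")
store.put("assistant", "system", "You are verbose.")
store.alias("assistant", "system", "stable", v1)

print(store.get("assistant", "system"))            # You are verbose.
print(store.get("assistant", "system", "stable"))  # You are terse.
```

In the GenServer version, `get` would be the handler behind `call("prompt_store", ...)`, and updating an alias would publish a prompt-update event for hot-swapping agents.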
LLM Gateway¶
Status: ⏸️ Moved to Presidium (presidium-llm-gateway)
Model routing without governance (multi-provider fallback for reliability) is a thin Civitas utility — CompositeModelProvider. It is not a full gateway.
The full governed LLM gateway — per-agent rate limits, cost tracking, budget enforcement, grant-based provider routing — belongs in Presidium. It wraps any Civitas ModelProvider via the plugin protocol and enforces governance policy before delegating to the underlying provider.
Civitas provides the ModelProvider protocol (integration point 2 for Presidium). Civitas does not provide rate limiting, budgets, or grant-based routing — those are governance concerns.
Residual Civitas utility: CompositeModelProvider — a simple ordered fallback chain (primary → fallback) for reliability. No governance, no per-agent tracking. Infrastructure, not governance.
See Presidium presidium-llm-gateway for the governed implementation.
See docs/design/civitas-presidium-boundary.md for the full boundary definition.
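The residual `CompositeModelProvider` utility is just an ordered try-each-in-turn loop. A minimal sketch, assuming providers are async callables (the real `ModelProvider` protocol surface is not shown here; `complete` and the provider shapes are illustrative):

```python
import asyncio
from typing import Awaitable, Callable

Provider = Callable[[str], Awaitable[str]]

class CompositeModelProvider:
    """Ordered fallback chain (primary → fallback): try each provider
    in turn, raising only if all fail. No governance, no per-agent
    tracking — that belongs in Presidium."""

    def __init__(self, providers: list[Provider]) -> None:
        self._providers = providers

    async def complete(self, prompt: str) -> str:
        last_error = None
        for provider in self._providers:
            try:
                return await provider(prompt)
            except Exception as exc:  # fall through to the next provider
                last_error = exc
        raise RuntimeError("all providers failed") from last_error

async def flaky_primary(prompt: str) -> str:
    raise ConnectionError("primary unavailable")

async def fallback(prompt: str) -> str:
    return f"fallback: {prompt}"

composite = CompositeModelProvider([flaky_primary, fallback])
print(asyncio.run(composite.complete("hello")))  # fallback: hello
```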
Fabrica — Tools Gateway¶
Status: 💡 Idea — to be specced | Priority: 🔴 High
Product: Fabrica (pip install fabrica) — lives in civitas-io/civitas-forge, not in python-civitas.
Fabrica solves the tool schema token problem: passing all tool schemas to every LLM call is token-expensive and degrades selection accuracy beyond ~20–30 tools. Instead of N schemas, the LLM receives one find_tools(query) meta-tool and retrieves only the schema it needs.
Fabrica aggregates tool sources (local ToolStore, MCP servers, Composio, custom), serves a unified namespace, and exposes a retrieval interface. Civitas agents connect to it as a tool source — any other LLM framework can too.
Dependency chain: M3.4 (MCP plumbing) → M4.4 (ToolStore) → Fabrica (retrieval)
See RFC 0001 (docs/rfc/0001-tool-retrieval.md) for the formal problem statement and proposed interface standard.
| Idea | Notes |
|---|---|
| `find_tools(query)` meta-tool — one schema sent to the LLM, not N | Keyword backend (default) + embedding backend (`fabrica[search]`) |
| Tool source aggregation — local ToolStore, MCP servers, Composio, custom | Pluggable `ToolSource` protocol |
| Unified tool namespace across all sources | `gateway://source/tool_name` address scheme |
| Per-source credential isolation | Each source has its own auth config; agents never see other sources' secrets |
| Tool call sandboxing | Filesystem + network isolation for untrusted tool execution |
| Health monitoring + circuit breaker per source | Unhealthy sources removed from routing automatically |
| MCP-compatible interface | Fabrica itself exposes `list_tools` + `call_tool` — any MCP client can connect |
| Civitas integration — `ToolSource` plugin pointing at Fabrica | `civitas[fabrica]` extra |
| SaaS upgrade path — hosted Fabrica with team tool registry, analytics | Future |
| Spec | `civitas-forge/packages/fabrica/` — to be created |
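The default keyword backend for `find_tools(query)` can be sketched as word-overlap scoring over tool descriptions. The tool addresses and descriptions below are invented examples; only the `find_tools` meta-tool idea and the `gateway://source/tool_name` scheme come from the table above.

```python
# Hypothetical tool registry; the real Fabrica would aggregate these
# from local ToolStores, MCP servers, Composio, etc.
TOOLS = {
    "gateway://local/send_email": "Send an email to a recipient with subject and body",
    "gateway://mcp/create_issue": "Create an issue in a bug tracker project",
    "gateway://composio/post_slack": "Post a message to a Slack channel",
}

def find_tools(query: str, limit: int = 3) -> list[str]:
    """Keyword-backend sketch of the find_tools(query) meta-tool:
    score tools by query-word overlap with their descriptions and
    return only the matching addresses, not all N schemas."""
    words = set(query.lower().split())
    scored = [
        (len(words & set(desc.lower().split())), name)
        for name, desc in TOOLS.items()
    ]
    return [name for score, name in sorted(scored, reverse=True) if score > 0][:limit]

print(find_tools("send an email"))
# ['gateway://local/send_email', 'gateway://mcp/create_issue']
```

The LLM then fetches the full schema only for the address it selects, which is what keeps the prompt token cost flat as the tool count grows; the `fabrica[search]` embedding backend would replace the overlap score with cosine similarity.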
Skills Gateway¶
Status: 💡 Idea — to be specced | Priority: 🟡 Medium
A supervised registry of composable agent workflows — "skills" — that can be discovered and invoked by name or capability. A skill is a named, versioned sequence of tool calls, LLM steps, or sub-agent invocations exposed as a single callable unit on the bus.
Extends the Capability-Aware Registry (M4.4): where M4.4 answers "which agent can do X?", the Skills Gateway answers "invoke skill X, wherever it runs."
| Idea | Notes |
|---|---|
| `@skill` decorator — declare a reusable workflow on any agent | Versioned, named, queryable by capability tags |
| Skill discovery by capability / input type | `gateway.find_skill("summarise", input_type="text/html")` |
| Cross-agent skill composition | Skills can invoke other skills; the gateway handles routing |
| Skill versioning with semver + forward compatibility | Old callers keep working when a skill is upgraded |
| Local + remote skill sources | Skills can live in the local registry or a remote Civitas deployment |
| Hosted skills marketplace | Future SaaS layer — shared skills across organisations |
| Spec | `design/skills-gateway.md` — to be written |
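The `@skill` decorator and capability-based discovery can be sketched in a few lines. This is a local, single-process sketch of the idea only: the decorator parameters, the module-level registry, and `find_skill` returning a callable directly are assumptions — the real gateway would route invocations over the bus, possibly to a remote deployment.

```python
from typing import Any, Callable

_REGISTRY: list[dict[str, Any]] = []  # stand-in for the gateway's registry

def skill(name: str, version: str, capabilities: list[str]) -> Callable:
    """Attach versioned, capability-tagged metadata to a workflow
    function and register it for discovery (hypothetical API)."""
    def decorate(fn: Callable) -> Callable:
        fn._skill = {"name": name, "version": version, "capabilities": capabilities}
        _REGISTRY.append({**fn._skill, "fn": fn})
        return fn
    return decorate

def find_skill(capability: str) -> Callable:
    """Discovery by capability tag — the gateway.find_skill() idea."""
    for entry in _REGISTRY:
        if capability in entry["capabilities"]:
            return entry["fn"]
    raise LookupError(capability)

@skill(name="summarise_html", version="1.0.0", capabilities=["summarise"])
def summarise(text: str) -> str:
    # Stand-in for a real sequence of tool calls / LLM steps.
    return text[:20] + "..."

print(find_skill("summarise")("A long HTML document about agent runtimes"))
# A long HTML document...
```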