Design: EvalLoop (M2.5 + M2.6)

Status: M2.5 Implemented (v0.3) | M2.6 Planned (v0.4)
Author: Jeryn Mathew Varghese
Last updated: 2026-04

Motivation

LLM agents can go off the rails: they hallucinate, ignore constraints, loop, or produce unsafe outputs. Detecting this inside the agent's own handle() is possible, but it mixes application logic with correctness monitoring, and a single handle() call cannot see patterns that only emerge across multiple messages.

EvalLoop introduces a dedicated, supervised EvalAgent process that sits alongside regular agents in the supervision tree. Agents emit observable events; the EvalAgent scores them and injects correction signals back. If needed, it halts the offending agent entirely.

There are two tiers:

Tier	Scope	Release
M2.5 — Local EvalLoop	In-process `EvalAgent` with real-time correction signals	v0.3
M2.6 — Remote Eval Exporters	Plugin adapters for Arize Phoenix, Fiddler, Langfuse, Braintrust, LangSmith	v0.4

Both tiers share the same EvalEvent schema. The local agent consumes it in-process; remote exporters translate it to OTEL GenAI spans and forward to external eval engines.

Architecture

AgentProcess
    │
    ├── await self.emit_eval("llm_output", {"content": response, ...})
    │           │
    │           ├──▶ EvalAgent (local, in-process)
    │           │       │
    │           │       ├── on_eval_event(event) → CorrectionSignal | None
    │           │       ├── rate limit check
    │           │       └── send civitas.eval.correction / civitas.eval.halt
    │           │
    │           └──▶ EvalExporter (remote, M2.6)
    │                   ├── Arize Phoenix (OTEL GenAI spans)
    │                   ├── Fiddler (production guardrails)
    │                   ├── Langfuse (open-source)
    │                   ├── Braintrust (eval science)
    │                   └── LangSmith (LangChain ecosystem)
    │
    └── on_correction(message)   ◀── civitas.eval.correction (nudge / redirect)
        [auto-halt]              ◀── civitas.eval.halt

Why a separate process: the evaluator is independently supervised, independently restartable, stateful (rate limit counters, violation history), and swappable without touching agent code. This follows the OTP design principle: concerns that can fail independently should be separate processes.

Scope boundary

Concern	M2.5 (Local)	M2.6 (Remote)
`EvalAgent`, `EvalEvent`, `CorrectionSignal`	✅	—
`emit_eval()` on `AgentProcess`	✅	—
`on_correction()` hook	✅	—
Rate limiting per target agent	✅	—
Topology YAML (`type: eval_agent`)	✅	—
`EvalExporter` protocol	✅ (interface only)	—
Arize Phoenix plugin	—	✅
Fiddler plugin (two-way guardrails)	—	✅
Langfuse plugin	—	✅
Braintrust plugin	—	✅
LangSmith plugin	—	✅

Core types

EvalEvent

Emitted by agents via await self.emit_eval(event_type, payload). Schema is aligned with OTEL GenAI Semantic Conventions so remote exporters can forward as standard spans.

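import time
from dataclasses import dataclass, field
from typing import Any

@dataclass
class EvalEvent:
    agent_name: str          # who emitted it
    event_type: str          # e.g. "llm_output", "tool_call", "decision", "custom"
    payload: dict[str, Any]  # event data (model output, tool result, etc.)
    trace_id: str = ""       # correlates the event with its trace
    message_id: str = ""     # message being handled when the event was emitted
    timestamp: float = field(default_factory=time.time)  # Unix time at creation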

Event type conventions:

event_type	When to use
`llm_output`	After receiving an LLM response
`tool_call`	Before or after a tool invocation
`decision`	When the agent makes a routing or branching decision
`message_sent`	When the agent sends a message to another agent
`custom`	Any application-specific checkpoint

CorrectionSignal

Returned by EvalAgent.on_eval_event(). Three severity levels:

Severity	Meaning	Agent behaviour
`nudge`	Soft guidance — minor issue detected	Agent receives correction in `on_correction()`, continues running
`redirect`	Significant concern — approach needs to change	Agent receives correction in `on_correction()`, should alter course
`halt`	Critical violation	Agent's message loop is stopped via `civitas.eval.halt`

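from dataclasses import dataclass, field
from typing import Any, Literal

@dataclass
class CorrectionSignal:
    severity: Literal["nudge", "redirect", "halt"]
    reason: str                                         # human-readable explanation
    payload: dict[str, Any] = field(default_factory=dict)  # optional extra context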

EvalAgent

class EvalAgent(AgentProcess):
    def __init__(
        self,
        name: str,
        max_corrections_per_window: int = 10,
        window_seconds: float = 60.0,
        **kwargs,
    ): ...

    async def on_eval_event(self, event: EvalEvent) -> CorrectionSignal | None:
        """Override to implement eval logic. Return None to take no action."""
        return None
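
As a rough illustration of what an override could look like, the sketch below applies a keyword and length policy to llm_output events. To stay runnable on its own it re-declares EvalEvent and CorrectionSignal as documented above and subclasses a stand-in base class rather than the real EvalAgent. The names EvalAgentStub, PolicyEvalAgent, and evaluate, along with the banned-term list and length threshold, are assumptions made for this example.

```python
from __future__ import annotations

import asyncio
import time
from dataclasses import dataclass, field
from typing import Any, Literal


@dataclass
class EvalEvent:  # re-declared from the section above
    agent_name: str
    event_type: str
    payload: dict[str, Any]
    trace_id: str = ""
    message_id: str = ""
    timestamp: float = field(default_factory=time.time)


@dataclass
class CorrectionSignal:  # re-declared from the section above
    severity: Literal["nudge", "redirect", "halt"]
    reason: str
    payload: dict[str, Any] = field(default_factory=dict)


class EvalAgentStub:
    """Stand-in for the real EvalAgent so the sketch has no dependencies."""

    async def on_eval_event(self, event: EvalEvent) -> CorrectionSignal | None:
        return None


class PolicyEvalAgent(EvalAgentStub):
    BANNED_TERMS = ("rm -rf", "DROP TABLE")  # hypothetical policy
    MAX_CHARS = 2000                          # hypothetical threshold

    async def on_eval_event(self, event: EvalEvent) -> CorrectionSignal | None:
        if event.event_type != "llm_output":
            return None  # this policy only scores model output
        content = str(event.payload.get("content", ""))
        for term in self.BANNED_TERMS:
            if term in content:
                return CorrectionSignal("halt", f"banned term: {term!r}")
        if len(content) > self.MAX_CHARS:
            return CorrectionSignal(
                "nudge", "response too long", {"limit": self.MAX_CHARS}
            )
        return None  # nothing to correct


def evaluate(agent: EvalAgentStub, event: EvalEvent) -> CorrectionSignal | None:
    """Run the async hook from synchronous code (convenient in tests)."""
    return asyncio.run(agent.on_eval_event(event))
```

Returning None for anything the policy does not care about keeps the override cheap for event types it ignores.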

EvalAgent.handle() receives civitas.eval.event messages, calls on_eval_event(), checks the rate limiter, and then sends the correction to the agent that emitted the event. For halt, it sends a civitas.eval.halt message, which breaks the target agent's message loop cleanly. This is the same path as graceful shutdown, so on_stop() still runs.

Rate limiting uses a sliding window keyed by target agent name. Once an agent has received max_corrections_per_window corrections in the last window_seconds, further corrections are dropped (and logged). This prevents correction storms against a broken agent.
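
The sliding window described above can be sketched as follows. This is a minimal illustration, not the shipped implementation: the class name SlidingWindowLimiter, the allow() method, and the injectable clock parameter are assumptions, while max_corrections_per_window and window_seconds mirror the constructor arguments of EvalAgent.

```python
from __future__ import annotations

import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Per-target sliding-window limiter (hypothetical helper)."""

    def __init__(
        self,
        max_corrections_per_window: int = 10,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self.max_corrections_per_window = max_corrections_per_window
        self.window_seconds = window_seconds
        self._clock = clock
        # Timestamps of corrections already sent, keyed by target agent name.
        self._sent: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, target: str) -> bool:
        """Return True and record the correction if the target is under its limit."""
        now = self._clock()
        sent = self._sent[target]
        # Evict timestamps that have slid out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_corrections_per_window:
            return False  # caller drops the correction and logs it
        sent.append(now)
        return True
```

Keeping one deque per target means a misbehaving agent exhausts only its own budget; corrections to other agents are unaffected.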

AgentProcess integration

Two additions to AgentProcess:

async def emit_eval(
    self,
    event_type: str,
    payload: dict[str, Any],
    eval_agent: str = "eval_agent",
) -> None:
    """Send an EvalEvent to the named EvalAgent. No-op if bus not wired."""

async def on_correction(self, message: Message) -> None:
    """Called when this agent receives a civitas.eval.correction message.
    Override to react to nudge/redirect signals. Default: no-op."""

civitas.eval.halt is handled in _message_loop() — it breaks the loop the same way _agency.shutdown does, ensuring on_stop() always runs.
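
A simplified view of that dispatch is sketched below, under stated assumptions: Message is a stand-in with only a type and a payload, the inbox is a plain asyncio.Queue, and AgentLoopSketch is a hypothetical class rather than the real AgentProcess. It shows one way the loop could route corrections to on_correction(), break on halt, and still guarantee that on_stop() runs.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Message:  # stand-in for the real Message type
    type: str
    payload: dict[str, Any] = field(default_factory=dict)


class AgentLoopSketch:
    """Hypothetical, stripped-down agent used only to show the dispatch."""

    def __init__(self) -> None:
        self.inbox: asyncio.Queue[Message] = asyncio.Queue()
        self.log: list[str] = []

    async def handle(self, message: Message) -> None:
        self.log.append(f"handle:{message.type}")

    async def on_correction(self, message: Message) -> None:
        self.log.append(f"correction:{message.payload.get('severity')}")

    async def on_stop(self) -> None:
        self.log.append("stopped")

    async def _message_loop(self) -> None:
        try:
            while True:
                message = await self.inbox.get()
                if message.type == "civitas.eval.halt":
                    break  # leave the loop; cleanup happens in finally
                if message.type == "civitas.eval.correction":
                    await self.on_correction(message)
                    continue
                await self.handle(message)
        finally:
            # Runs on halt and on errors alike, so cleanup is never skipped.
            await self.on_stop()


async def demo() -> list[str]:
    agent = AgentLoopSketch()
    for message in (
        Message("task"),
        Message("civitas.eval.correction", {"severity": "nudge"}),
        Message("civitas.eval.halt"),
        Message("task"),  # queued after halt, never handled
    ):
        agent.inbox.put_nowait(message)
    await agent._message_loop()
    return agent.log
```

In this sketch, messages queued behind the halt are left unprocessed; whether the real runtime drains or discards them is not specified here.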

Topology YAML

supervision:
  name: root
  strategy: ONE_FOR_ONE
  children:
    - type: civitas.evalloop.EvalAgent
      name: eval_agent

    - type: myapp.agents.ResearchAgent
      name: researcher

The shorthand type: eval_agent is also supported in Runtime.from_config(), in place of the fully qualified civitas.evalloop.EvalAgent.

EvalExporter protocol (M2.6)

Defined in M2.5 as an interface; implemented in M2.6. Remote eval engines receive the same EvalEvent objects, translated to their expected format.

from typing import Protocol

class EvalExporter(Protocol):
    async def export(self, event: EvalEvent) -> None:
        """Forward an EvalEvent to a remote eval engine."""
        ...

emit_eval() will forward to all registered exporters in addition to the local EvalAgent. Each exporter is responsible for translating EvalEvent to the target platform's format:

Platform	Integration model	Notes
Arize Phoenix	OTEL GenAI spans via OTLP	Strongest OTEL support; instrument once
Fiddler	Fiddler SDK + `fiddler-client`	Two-way: sends events, receives guardrail decisions
Langfuse	Langfuse Python SDK	Open-source; self-hostable
Braintrust	Braintrust Python SDK	Strong eval science focus
LangSmith	LangSmith SDK	LangChain ecosystem
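
The forwarding step, where emit_eval() hands each event to every registered exporter, might look roughly like the sketch below. How exporters get registered is still an open question, so the function simply takes a sequence. The name forward_to_exporters, the logger name, and the choice to isolate failures so that one unreachable platform does not block the others are assumptions of this example.

```python
from __future__ import annotations

import asyncio
import logging
from typing import Any, Protocol, Sequence

logger = logging.getLogger("evalloop.export")  # hypothetical logger name


class EvalExporter(Protocol):
    async def export(self, event: Any) -> None:
        """Forward an event to a remote eval engine."""
        ...


async def forward_to_exporters(
    event: Any, exporters: Sequence[EvalExporter]
) -> int:
    """Export to every exporter concurrently; return how many succeeded."""
    results = await asyncio.gather(
        *(exporter.export(event) for exporter in exporters),
        return_exceptions=True,  # collect failures instead of raising
    )
    failures = [result for result in results if isinstance(result, Exception)]
    for failure in failures:
        logger.warning("eval export failed: %r", failure)
    return len(results) - len(failures)
```

Running the exports concurrently keeps the added latency close to that of the slowest exporter rather than the sum of all of them.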

Fiddler is the most interesting integration: Fiddler can return a guardrail decision (block/allow) synchronously. The FiddlerExporter would surface this as a CorrectionSignal, making Fiddler a remote eval engine that drives local halt behaviour.
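
To make the idea concrete, here is a small sketch of how a guardrail decision could be turned into a CorrectionSignal. It does not call the Fiddler SDK. The decision is assumed to arrive as the plain string "block" or "allow", and signal_from_decision is a hypothetical helper name.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Literal


@dataclass
class CorrectionSignal:  # re-declared from the Core types section
    severity: Literal["nudge", "redirect", "halt"]
    reason: str
    payload: dict[str, Any] = field(default_factory=dict)


def signal_from_decision(decision: str, reason: str = "") -> CorrectionSignal | None:
    """Map an assumed 'block' / 'allow' decision onto a local signal."""
    if decision == "block":
        return CorrectionSignal(
            severity="halt",
            reason=reason or "blocked by remote guardrail",
            payload={"source": "fiddler"},
        )
    return None  # 'allow' (or anything unrecognised) takes no action
```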

OTEL alignment

EvalEvent fields map to OTEL GenAI Semantic Conventions:

EvalEvent field	OTEL GenAI attribute
`agent_name`	`gen_ai.agent.name`
`event_type`	`gen_ai.operation.name`
`payload["model"]`	`gen_ai.request.model`
`payload["input_tokens"]`	`gen_ai.usage.input_tokens`
`payload["output_tokens"]`	`gen_ai.usage.output_tokens`
`payload["content"]`	`gen_ai.output.text`
`trace_id`	OTEL trace context

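Because of this alignment, a single emit_eval() call produces data that any OTEL-native platform can consume. An exporter only needs to rename fields according to the table above; no platform-specific schema sits in between.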
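
The field mapping in the table can be expressed as a small pure function. The sketch below builds a plain attribute dictionary and does not depend on the OpenTelemetry SDK. The names PAYLOAD_TO_OTEL and to_otel_attributes are assumptions, and EvalEvent is re-declared so the block runs on its own.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EvalEvent:  # re-declared from the Core types section
    agent_name: str
    event_type: str
    payload: dict[str, Any]
    trace_id: str = ""
    message_id: str = ""
    timestamp: float = field(default_factory=time.time)


# Payload keys and the span attributes they map to, per the table above.
PAYLOAD_TO_OTEL = {
    "model": "gen_ai.request.model",
    "input_tokens": "gen_ai.usage.input_tokens",
    "output_tokens": "gen_ai.usage.output_tokens",
    "content": "gen_ai.output.text",
}


def to_otel_attributes(event: EvalEvent) -> dict[str, Any]:
    """Build span attributes from an EvalEvent; unmapped payload keys are skipped."""
    attributes: dict[str, Any] = {
        "gen_ai.agent.name": event.agent_name,
        "gen_ai.operation.name": event.event_type,
    }
    for key, attribute in PAYLOAD_TO_OTEL.items():
        if key in event.payload:
            attributes[attribute] = event.payload[key]
    # trace_id is deliberately absent: it belongs to the span's trace
    # context, not to its attributes.
    return attributes
```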

Open questions

Two-way Fiddler guardrails: should the agent await FiddlerExporter.export() and block until Fiddler responds? Fiddler claims sub-100ms latency, so this is viable, but it adds that latency to every evaluated call on the hot path. Alternative: export fire-and-forget and let Fiddler send halt back asynchronously.
EvalExporter registration: should exporters be registered on the EvalAgent instance or globally on the Runtime? Per-agent registration is more flexible but adds configuration surface.
Sampling: high-throughput agents may emit thousands of eval events per second. Should emit_eval() support a sampling rate, or should that be the exporter's concern?
Correction acknowledgement: should agents be required to acknowledge corrections? Currently on_correction() is a best-effort hook with no reply.
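
The two options in the first question can be compared side by side. The sketch below is only an illustration of the trade-off. fake_guardrail stands in for the remote call, the 100 ms default timeout echoes the latency figure quoted above, and returning None on timeout (failing open) is an assumption, not a decided behaviour.

```python
from __future__ import annotations

import asyncio
from typing import Any, Callable


async def fake_guardrail(event: dict[str, Any], delay: float) -> str:
    """Hypothetical remote call: answers 'block' or 'allow' after a delay."""
    await asyncio.sleep(delay)
    return "block" if "unsafe" in event.get("content", "") else "allow"


async def export_blocking(
    event: dict[str, Any], delay: float, timeout: float = 0.1
) -> str | None:
    """Option 1: wait for the decision, but never longer than `timeout`."""
    try:
        return await asyncio.wait_for(fake_guardrail(event, delay), timeout)
    except asyncio.TimeoutError:
        return None  # assumed policy: fail open when the guardrail is slow


def export_fire_and_forget(
    event: dict[str, Any], delay: float, on_decision: Callable[[str], None]
) -> asyncio.Task:
    """Option 2: return immediately and deliver the decision via a callback."""

    async def _run() -> None:
        on_decision(await fake_guardrail(event, delay))

    # The caller should keep the returned task so it is not garbage-collected.
    return asyncio.create_task(_run())


async def demo() -> tuple[str | None, str | None, list[str]]:
    fast = await export_blocking({"content": "unsafe"}, delay=0.0)
    slow = await export_blocking({"content": "unsafe"}, delay=0.5, timeout=0.01)
    decisions: list[str] = []
    task = export_fire_and_forget({"content": "ok"}, 0.0, decisions.append)
    await task
    return fast, slow, decisions
```

The blocking variant lets a block decision stop the agent before it acts on the output, at the cost of latency. The fire-and-forget variant adds no latency, but the agent may already have acted by the time a halt arrives.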

Acceptance criteria (M2.5)

EvalAgent can receive eval events from any agent in the supervision tree
on_eval_event() is the single override point — no other methods required
nudge and redirect signals delivered via on_correction() hook
halt stops the target agent cleanly (on_stop() still runs)
Rate limiter drops excess corrections without notifying the target agent (each drop is logged at WARNING)
emit_eval() is a no-op when no bus is wired (safe to call in tests)
type: eval_agent supported in topology YAML
≥ 12 unit tests; ≥ 1 integration test with a real supervision tree
EvalExporter protocol defined and documented, not yet implemented