Rich adapters share a common substrate behind the Trace contract¶

There is no longer a single rich adapter — and the one that was meant to be it, ACP, cannot play that role yet. The rich paths that carry the load today are native: openai (Tracing + Interaction) and claude-code-stream (Tracing). ACP — the intended one-contract unifier of ADR 0003 — is only partially realized and uneven across providers (not every capability is reachable from every agent), and a native Claude Agent SDK adapter is planned. So ADR 0003's organizing rationale — one adapter giving "one event-translation path, one interaction-handling path, one place to add agents" — has inverted into several of each. This ADR supersedes 0003's operative claim while keeping ACP's aspiration alive: the unifying force today is the Trace contract plus a shared rich-adapter substrate — deep modules (Tool Kind normalization, a sandbox-confined filesystem, Trace event shaping, and permission mediation) that every rich Adapter composes rather than re-implements; ACP, as it matures toward the one-contract dream, composes the same substrate.

Status¶

accepted (supersedes ADR 0003; extends ADR 0006 and ADR 0009)

Context¶

ADR 0003 routed every Tracing/Interaction agent — even Claude, at the cost of a Node bridge — through one ACP Adapter, precisely so the framework had a single translation path, a single interaction-handling path, and one place to add agents.

That premise no longer holds:

ADR 0006 added claude-code-stream, a second rich path, and explicitly demoted ACP "from the rich layer to one of several."
ADR 0009 added the in-process openai Adapter, a third rich path, concluding "the rich path need not shell out to any external agent."

Both were accepted; neither marked 0003 superseded. So 0003 stood as accepted while its central claim — ACP is the single rich adapter — had become false twice over: not only do native rich paths now exist, but ACP itself cannot yet be the single adapter — it is implemented unevenly and not every capability is reachable from every provider, so the working rich load is carried by the native adapters while ACP remains a maturing aspiration. The cost surfaced as duplication of exactly the concerns 0003 promised to centralize (the duplicated shapes below include ACP's own partial implementation, which exists in the tree today):

Three Tool Kind tables (harness/acp.py:30 map_kind, harness/openai_agent.py:50 tool_kind, harness/claude_stream.py:27 tool_kind) produce the Tool Kind — the portable cross-model grading axis graders depend on (CONTEXT.md) — yet only the ACP Adapter supports per-agent overrides and nothing tests that the axis is applied consistently.
Two _safe_path confinement functions (harness/acp.py:453, harness/openai_agent.py:519) that have already drifted (one resolves the sandbox twice and raises AcpError; the other defaults rel="." and raises ValueError) — a security surface where a sandbox-escape fix must land twice.
Two hand-rolled permission-mediation paths (acp.py:402, openai_agent.py:317) — the very "one interaction-handling path" 0003 promised.
Free-form Trace emission — every Adapter shapes events through raw sink.emit(**payload) and tracks parent_id ad-hoc (a dict in ACP, a local variable elsewhere), so the schema graders rely on is enforced nowhere.

ADR 0006 names the fourth rich Adapter — the Claude Agent SDK path — as planned. Without a decision it will copy a fourth Tool Kind table and a third _safe_path. The duplication is not a set of independent cleanups; it is the unpaid bill from 0003's collapse, and it compounds with every adapter.

Decision¶

The Trace stays the external contract (CONTEXT.md): graders and the LangFuse export read the Trace, never a source protocol. Below that contract, introduce a rich-adapter substrate — deep modules every rich Adapter composes:

Tool Kind normalization. One module owns TOOL_KINDS plus a registry of per-Adapter mapping tables (each native vocabulary genuinely differs — ACP kinds, the openai fixed tool surface, Claude tool names) and per-Adapter overrides, with one conformance test asserting every Adapter's table resolves into the enum. Adapters register a table; none re-implement tool_kind.
Sandbox-confined filesystem. One harness/sandbox_fs.py module owns the genuinely shared, two-consumer surface — safe_resolve plus read and write — with one escape rule, one error type, one truncation constant. The ACP fs/read·fs/write handlers and the openai read_file/write_file tools both compose it. The richer openai-only bodies (edit / grep / list) stay local to that adapter, built on safe_resolve: they have a single consumer, so by "two adapters = a real seam, one = hypothetical" they do not belong in the shared module. (Named sandbox_fs, not sandbox — sandbox.py already owns sandbox creation; the two "sandbox" concepts are distinct.) Today only openai's confinement has an escape test (test_openai_agent.py:330); ACP's is untested — consolidation lets one escape test cover both, which is the security locality this seam is really for.
Trace event shaping. A typed TraceBuilder over the sink owns the canonical event shapes (message / thought / tool_call / tool_result / usage / permission_* / stop) and the parent-id nesting. Adapters call typed methods instead of free-form emit.
Permission mediation. One helper owns the permission_request → policy.answer → permission_response emit pair; ACP, openai, and any future Interaction-capable Adapter share it (the planned SDK Adapter's canUseTool callback routes through it).

Scope. The substrate is for rich Adapters (Tracing and/or Interaction). Output-only Adapters (echo, cli_agent, claude_code) are untouched — they shape no Trace and confine no paths. ACP is kept as a future adapter — the intended unifier of all clients under one contract (the 0003 dream), to be realized incrementally as providers' ACP support firms up. Its present, partial implementation composes the substrate like any other rich Adapter, so maturing ACP means filling in behaviour, not re-deriving the cross-adapter concerns. New rich Adapters MUST compose the substrate, not re-derive it.

Migration. The existing three Adapters move onto the substrate incrementally; each step is a behaviour-preserving refactor whose Trace output is unchanged, which the existing trace tests pin. The substrate is the precondition for the planned Claude Agent SDK Adapter.

Considered options¶

Keep 0003 as written (one ACP Adapter). Already abandoned in practice by 0006 and 0009; pretending otherwise leaves both the stale decision and the duplication. Rejected.
Several independent rich Adapters, duplication accepted (status quo). Cheapest today, but the cross-model grading axis has no owner or conformance test, the confinement function drifts (a security risk), and every new Adapter multiplies the cost. Rejected.
A shared substrate behind the Trace contract (chosen). One home for the genuinely cross-Adapter concerns; the Trace remains the external contract; Adapters stay free to differ where they truly differ — transport, and who owns the agent loop. Cost: an internal refactor of three Adapters.
A single mega-Adapter / deep base class holding everything. Rejected — the Adapters differ in substance (ACP translates an external process; openai owns the loop in-process). Composition of small deep modules fits that reality; a fat inheritance chain would force the shallow CLI Adapters to carry rich-path machinery they never use.

Consequences¶

The Tool Kind axis — the portable cross-model grading contract — gains one owner and a conformance test; graders can trust it across Adapters and overrides are available to all.
Sandbox-escape logic concentrates in one tested module: locality on a security surface.
A Trace schema change lands once and every Adapter inherits it; the Trace contract that is touchstone's differentiator is now enforced, not merely described.
The planned Claude Agent SDK Adapter composes the substrate instead of copy-pasting; its canUseTool callback maps onto the shared permission mediation, as ADR 0006 anticipated.
ADR 0003 is superseded, but its aspiration is preserved. ACP is retained as the hoped-for one-contract unifier of all clients — a future unifying adapter, not today's mandated mechanism. The substrate is what keeps the native adapters coherent while that dream matures (and a hedge should some providers never fully support ACP).
Short-term cost: three behaviour-preserving Adapter refactors, guarded by the existing trace tests; no change to the Trace, the Grader interface, or any Case.
CONTEXT.md gains Substrate as a term so the shared rich-adapter layer has a name in the ubiquitous language.