A native stream-json Claude adapter; ACP is no longer the only rich path
Claude can be driven as a Tracing-capable Harness directly over the Claude Code CLI's
--output-format stream-json, with no ACP and no Node bridge. The new claude-code-stream
harness runs claude -p --output-format stream-json --verbose --dangerously-skip-permissions
and translates the first-party streamed events (assistant tool_use blocks, tool_results,
final usage/cost) into the framework's Trace. This revisits ADR 0003's "ACP as the single
rich adapter" — not by removing ACP, but by demoting it from the rich layer to one of
several, behind the Trace contract.
Status
accepted (revisits ADR 0003)
Context
ADR 0003 routed every Tracing/Interaction-capable agent through one ACP adapter, including
Claude via the claude-agent-acp npm bridge. In practice ACP is a lowest-common-denominator
protocol that some agents implement awkwardly or unstably (BENCHMARK.md already records
droid raising an "internal ACP error"), and for Claude it is strictly less direct than the
vendor's own interfaces. ADR 0003 itself anticipated this: it listed "native
claude-agent-sdk for Claude" as the max-fidelity alternative and noted a native adapter
"may be added later without changing the Trace or Interaction contracts."
The key observation is that ACP bundles two capabilities the framework already models
separately: Tracing (observe tool calls / tokens / cost) and Interaction (answer the
agent's mid-run permission requests — the bidirectional part). Every Case in the suite uses
interaction: auto-approve. Auto-approval needs no protocol — it is a launch flag. True
bidirectional interaction is needed only by scripted / llm-based / manual policies,
which only observed-droid exercises.
So for the common case (autonomous run + observe the result), the bidirectional nature is unnecessary. What is needed is (a) headless autonomy and (b) a Trace — both of which the Claude CLI provides first-party.
Decision
Add claude-code-stream:
- Autonomy without a protocol: --dangerously-skip-permissions runs Claude fully headless, so no Interaction channel is required (Capabilities(tracing=True, interaction=False)). The runner already degrades an auto-approve Interaction request to "use the harness default" with a benign warning, and the harness default here is approve, so the existing trace-grading cases run unchanged.
- Observation from stream-json: the streamed events are replayed into the Trace (message / thought / tool_call / tool_result / usage / stop), with Claude tool names mapped to portable Tool Kinds. The final result event fills output / cost / tokens.
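The replay step can be sketched as follows. This is a minimal illustration, not the harness's actual code: `TraceEvent`, `replay`, and the `TOOL_KINDS` table are hypothetical names, and the event field names (`message.content`, `tool_use`, `tool_result`, `total_cost_usd`, `usage`) are assumptions about the CLI's stream-json shape that should be checked against the installed CLI version.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Iterable

# Hypothetical mapping from Claude tool names to portable Tool Kinds.
TOOL_KINDS = {
    "Read": "read", "Write": "edit", "Edit": "edit",
    "Bash": "execute", "Grep": "search", "Glob": "search",
}


@dataclass
class TraceEvent:
    kind: str  # message | thought | tool_call | tool_result | usage | stop
    data: dict[str, Any] = field(default_factory=dict)


def replay(lines: Iterable[str]) -> list[TraceEvent]:
    """Translate stream-json lines into Trace events (assumed event shapes)."""
    trace: list[TraceEvent] = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        event = json.loads(raw)
        etype = event.get("type")
        if etype in ("assistant", "user"):
            content = event.get("message", {}).get("content", [])
            if not isinstance(content, list):
                continue  # plain-string content carries no tool blocks
            for block in content:
                btype = block.get("type")
                if etype == "assistant" and btype == "text":
                    trace.append(TraceEvent("message", {"text": block.get("text", "")}))
                elif etype == "assistant" and btype == "thinking":
                    trace.append(TraceEvent("thought", {"text": block.get("thinking", "")}))
                elif etype == "assistant" and btype == "tool_use":
                    name = block.get("name", "")
                    trace.append(TraceEvent("tool_call", {
                        "id": block.get("id"),
                        "tool": name,
                        "kind": TOOL_KINDS.get(name, "other"),
                        "input": block.get("input", {}),
                    }))
                elif etype == "user" and btype == "tool_result":
                    trace.append(TraceEvent("tool_result", {
                        "id": block.get("tool_use_id"),
                        "is_error": bool(block.get("is_error", False)),
                    }))
        elif etype == "result":
            usage = event.get("usage") or {}
            # Missing rollup fields become None rather than raising.
            trace.append(TraceEvent("usage", {
                "cost_usd": event.get("total_cost_usd"),
                "input_tokens": usage.get("input_tokens"),
                "output_tokens": usage.get("output_tokens"),
            }))
            trace.append(TraceEvent("stop", {
                "output": event.get("result"),
                "reason": event.get("subtype"),
            }))
    return trace
```

Because `replay` takes any iterable of lines, it can be fed the CLI's stdout directly or a recorded transcript, which keeps the translation testable without launching Claude.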
The Trace stays the contract (CONTEXT.md). ACP remains for agents where it is the best or
only rich path (e.g. droid). Cases that genuinely need bidirectional interaction keep using a
bidirectional adapter; the natural future home for that on Claude is the Claude Agent SDK's
canUseTool callback, not ACP.
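How a runner might degrade an Interaction Policy against a harness without an Interaction channel can be sketched as below. Only the `Capabilities(tracing=True, interaction=False)` shape comes from this ADR; `resolve_interaction`, the `"harness-default"` return value, and the warning text are hypothetical, and the real runner may behave differently for non-auto-approve policies.

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    tracing: bool
    interaction: bool


# Capabilities declared for claude-code-stream in this ADR.
CLAUDE_CODE_STREAM = Capabilities(tracing=True, interaction=False)


def resolve_interaction(policy: str, caps: Capabilities) -> str:
    """Return the policy the harness will actually apply (hypothetical helper)."""
    if caps.interaction:
        return policy  # a bidirectional adapter can honour any policy
    if policy == "auto-approve":
        # Benign degradation: the harness default here is approve.
        warnings.warn("harness has no Interaction channel; using the harness default")
        return "harness-default"
    # scripted / llm-based / manual policies need a real Interaction channel.
    raise ValueError(f"policy {policy!r} needs a bidirectional adapter")
```

Raising for the non-auto-approve policies is one possible design: it makes a case that genuinely needs bidirectional interaction fail loudly instead of silently running as auto-approved.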
Considered options
- Keep everything on ACP (ADR 0003). Uniform, but inherits each agent's ACP quality and, for Claude, needs the Node bridge and is less direct than the CLI/SDK. Rejected as the universal mechanism.
- CLI stream-json (chosen now). First-party, no Node, gives per-tool-call events; the result line carries cost + token usage. Output-only fallback if a future CLI drops the rollup. Lowest-effort path to a stable, native Claude Trace.
- Claude Agent SDK adapter (planned next). Strictly richer: structured SDKResultMessage with cost/usage and a native canUseTool permission callback, the right home for bidirectional Claude cases (scripted / llm-based). Heavier; deferred until an interaction case needs it.
Consequences
- A no-Node, native Claude Harness that satisfies the suite's tracing: true + trace-grader cases; claude-code (output-only, final-JSON envelope) is kept for cheap, trace-free runs.
- Model selection is a plain --model flag (e.g. claude-opus-4-8, claude-opus-4-6), so comparing Claude versions needs no per-agent model registry.
- Bidirectional interaction for Claude is intentionally out of scope here; when a case needs it, add the SDK adapter (its canUseTool maps onto the existing Interaction Policy).
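As an illustration of model selection by flag, the launch command could be assembled as below. The flags are the ones named in this ADR; `build_command` is a hypothetical helper, and passing the prompt as a trailing positional argument (rather than on stdin) is an assumption.

```python
from typing import Optional


def build_command(prompt: str, model: Optional[str] = None) -> list[str]:
    """Assemble the claude-code-stream launch command (hypothetical helper)."""
    cmd = [
        "claude", "-p",
        "--output-format", "stream-json",
        "--verbose",
        "--dangerously-skip-permissions",
    ]
    if model:
        cmd += ["--model", model]  # the only per-version difference
    cmd.append(prompt)  # assumption: prompt passed as a positional argument
    return cmd


# Comparing two Claude versions changes one argument, with no model registry.
commands = [build_command("Fix the failing test", m)
            for m in ("claude-opus-4-8", "claude-opus-4-6")]
```

The resulting list can be passed to `subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)` so that stdout is read line by line as the events stream in.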