Claude Code runner (`claude`)¶

The claude runner drives the Claude Code CLI in headless print mode (runners.claude_runner.ClaudeRunner).

harnessgym run \
  --task task.md --workspace . --iterations 3 \
  --runner claude --claude-model sonnet

How phases map to Claude Code¶

Phase	Command
attempt	`claude -p --output-format json <prompt>`
reflect	`claude -p --output-format json --resume <session_id> <prompt>`
build / repair	`claude -p --output-format json --resume <session_id> <prompt>`

The runner parses Claude's JSON session_id, captures stdout/stderr/ transcripts, and enforces process-group timeouts just like the Codex runner.

Flags¶

Flag	Default	Meaning
`--claude-bin`	`claude`	path to the Claude Code executable
`--claude-model`	—	model alias or full name, e.g. `sonnet` or `opus`
`--claude-permission-mode`	`bypassPermissions`	permission mode for autonomous runs
`--claude-max-budget-usd`	—	optional per-phase print-mode spend cap
`--claude-extra-arg`	—	repeatable escape hatch for newer Claude CLI flags

bypassPermissions is the default because HarnessGym runs Claude Code autonomously with no human in the loop.

MCP activation for Claude Code¶

Qualified MCP servers are written into a repo-local Claude MCP config at .harnessgym/claude_mcp_config.json, and Claude Code is launched with:

--strict-mcp-config --mcp-config .harnessgym/claude_mcp_config.json \
--allowedTools=mcp__<server>

--allowedTools=mcp__<server> is Claude's server-level permission token, which grants the generated MCP's tools.

The stdio framing bridge¶

HarnessGym-generated MCP servers use Content-Length framing (the same framing Codex expects), while Claude Code speaks newline-delimited MCP JSON over stdio. To bridge the two, each generated server is wrapped with harnessgym.claude_mcp_bridge, which translates between the framings and logs every tools/call to .harnessgym/mcp_calls.jsonl. The bridge's response timeout follows each manifest's MCP tool timeout.

This is why a server generated under the Codex runner activates unchanged under the Claude runner: the bridge handles the framing difference, and the registry stays runner-agnostic.

A real validation¶

The Claude runner was validated end to end on the tensor-layout task, including hardening to confirm Claude actually called the generated MCP tools rather than merely activating them. See Claude Code Runner and Claude MCP Telemetry.

Codex runner → Telemetry →

Claude Code runner (claude)¶