Backends¶

All agent sessions in Houmao are executed by a backend. Each backend implements the InteractiveSession protocol, providing a uniform interface for prompt delivery, interruption, and termination regardless of the underlying agent tool or execution mode.

The canonical backend list is defined by the BackendKind literal type in src/houmao/agents/realm_controller/models.py. Adding a new backend requires updating BackendKind, wiring the backend through launch_plan.py, and implementing the InteractiveSession protocol.

BackendKind¶

BackendKind = Literal[
    "local_interactive",
    "claude_headless",
    "codex_headless",
    "gemini_headless",
    "codex_app_server",
    "cao_rest",
    "houmao_server_rest",
]

Backend dispatch¶

flowchart LR
    LP["LaunchPlan<br/>.backend"]

    LI["LocalInteractiveSession<br/>persistent TUI in tmux pane<br/>send_prompt via tmux paste<br/>captures pane output"]
    CH["ClaudeHeadlessSession<br/>claude -p per turn<br/>JSON streaming output<br/>--continue for resume"]
    CX["CodexHeadlessSession<br/>codex exec --json per turn<br/>resume via thread_id"]
    GH["GeminiHeadlessSession<br/>gemini -p per turn<br/>--resume <session_id>"]
    LEG["Legacy/internal backends<br/>cao_rest / houmao_server_rest<br/>rejected for new public launches"]

    LP -->|local_interactive| LI
    LP -->|claude_headless| CH
    LP -->|codex_headless<br/>codex_app_server| CX
    LP -->|gemini_headless| GH
    LP -.->|legacy manifests only| LEG

Backend reference¶

local_interactive (primary)¶

Source: backends/local_interactive.py

The primary backend for interactive agent sessions. The agent runs as a real interactive CLI process inside a tmux pane, preserving the tool's native user experience (colors, interactive prompts, streaming output).

Session class: LocalInteractiveSession
Prompt delivery: via tmux paste-buffer, which simulates typing the prompt into the agent's stdin.
Relaunch continuation: provider-native startup arguments are applied before respawning the TUI in tmux window 0.
Role injection: bootstrap message sent as the first-turn prompt (see Role Injection).
Use case: development, debugging, and any workflow where direct interactive access to the agent is valuable.

claude_headless¶

Source: backends/claude_headless.py

Runs Claude Code CLI in headless mode (claude -p --verbose). Output is captured programmatically rather than displayed in an interactive terminal.

Session class: ClaudeHeadlessSession (extends HeadlessInteractiveSession)
Resume: --continue resumes the provider's latest stored conversation; --resume <session_id> resumes an exact provider session.
Role injection: when the role prompt is non-empty, Houmao passes --append-system-prompt <prompt> and sends one bootstrap message on the first turn. Empty prompts skip both startup inputs.
Use case: automated pipelines, batch processing, and non-interactive agent orchestration.

codex_headless¶

Source: backends/codex_headless.py

Runs Codex CLI in headless mode (codex exec --json). Produces structured JSON output for programmatic consumption.

Session class: CodexHeadlessSession (extends HeadlessInteractiveSession)
Resume: resume --last asks Codex to continue its latest stored chat; resume <session_id> resumes an exact provider session.
Role injection: when the role prompt is non-empty, Houmao passes -c developer_instructions=<prompt>. Empty prompts skip the developer-instructions flag.
Use case: automated pipelines, structured output processing, and non-interactive agent orchestration.

gemini_headless¶

Source: backends/gemini_headless.py

Runs Gemini CLI in headless mode (gemini -p).

Session class: GeminiHeadlessSession (extends HeadlessInteractiveSession)
Auth lanes: managed Gemini homes support GEMINI_API_KEY with optional GOOGLE_GEMINI_BASE_URL, or OAuth via projected oauth_creds.json. OAuth-backed homes inject GOOGLE_GENAI_USE_GCA=true when no explicit API-key or Vertex selector is already present.
Managed skills: Houmao-owned Gemini skills project into .gemini/skills; .agents/skills is only Gemini's upstream alias surface and is not Houmao's maintained contract.
Resume: --resume latest asks Gemini to continue its latest stored chat; --resume <session_id> resumes an exact provider session. Resume stays bound to the same recorded working directory/project context.
Role injection: bootstrap message sent as the first-turn prompt.
Use case: automated pipelines and non-interactive agent orchestration.

Gemini validation checklist¶

API-key lane: create a Gemini auth bundle with --api-key and optional --base-url, build or launch a managed Gemini home, and confirm the effective launch environment exports GEMINI_API_KEY plus GOOGLE_GEMINI_BASE_URL when configured.
OAuth lane: create a Gemini auth bundle with oauth_creds.json only, build or launch a managed Gemini home, and confirm the runtime exports GOOGLE_GENAI_USE_GCA=true without depending on a user-global Gemini settings.json.
Skill projection: inspect the constructed home and confirm Houmao-owned Gemini skills land under top-level .gemini/skills/houmao-.../; Houmao does not maintain a parallel .agents/skills mirror.
First-turn capture: verify the first stream-json Gemini turn emits a session_id and that Houmao persists that id into the managed session manifest.
Resume behavior: send a follow-up Gemini prompt from the same working directory and confirm Houmao launches gemini -p --resume <persisted-session-id>; changing the working directory should fail explicitly instead of silently retargeting another Gemini project store.

codex_app_server¶

Source: backends/codex_app_server.py

Runs Codex in app-server mode, which exposes a local HTTP interface for communication instead of using stdin/stdout.

Role injection: native developer instructions (same as codex_headless).
Use case: scenarios requiring HTTP-based interaction with the Codex agent.

cao_rest (legacy/internal)¶

Source: backends/cao_rest.py

Legacy backend that delegated session management to an external REST API. Public operator starts for this backend are retired and fail with migration guidance.

Role injection: profile-based injection via the external server.
Note: standalone operator use of backend='cao_rest' is retired in favor of maintained houmao-mgr local/headless flows and houmao-passive-server API workflows.

houmao_server_rest (legacy/internal)¶

Source: removed public backend implementation; retained manifests are rejected before new session creation.

Legacy old-server-backed path. New runtime manifests with this backend are not supported.

Role injection: profile-based injection via the server.
Note: use maintained local/headless runtime backends or passive-server-owned headless launch APIs.

Headless backend base class¶

The three headless backends (claude_headless, codex_headless, gemini_headless) share a common base class: HeadlessInteractiveSession.

This base class manages:

tmux-backed process execution: even in headless mode, the agent process runs inside a tmux pane for uniform process management, signal delivery, and output capture.
Resumable session state via HeadlessSessionState, which tracks:
session_id — backend-specific session/thread identifier for resume.
turn_index — number of prompt turns completed in this session.
role_bootstrap_applied — whether the first-turn role bootstrap message has been delivered.

This shared infrastructure ensures consistent behavior across headless backends for concerns like process lifecycle, output buffering, and session persistence.

Relaunch Chat-Session Mapping¶

Tmux-backed relaunch accepts a relaunch-only chat-session selector. It does not affect first launch and is distinct from gateway prompt-control chat-session selectors.

Tool	Local interactive latest	Local interactive exact	Native headless latest	Native headless exact
Codex	`codex resume --last`	`codex resume <session_id>`	`codex exec resume --last <prompt>`	`codex exec resume <session_id> <prompt>`
Claude Code	`claude --continue`	`claude --resume <session_id>`	`claude -p --continue <prompt>`	`claude -p --resume <session_id> <prompt>`
Gemini CLI	`gemini --resume latest`	`gemini --resume <session_id>`	`gemini --resume latest -p <prompt>`	`gemini --resume <session_id> -p <prompt>`

When the relaunch selector is omitted, the runtime uses new and starts a fresh provider chat. When a local interactive relaunch resumes an existing provider chat, bootstrap-message role injection is not replayed into that chat.

InteractiveSession protocol¶

All backends implement the InteractiveSession protocol:

class InteractiveSession(Protocol):
    def send_prompt(self, prompt: str) -> list[SessionEvent]: ...
    def interrupt(self) -> SessionControlResult: ...
    def terminate(self) -> SessionControlResult: ...
    def close(self) -> None: ...

See Session Lifecycle for details on how the protocol is used.