Skip to content

Backends

All agent sessions in Houmao are executed by a backend. Each backend implements the InteractiveSession protocol, providing a uniform interface for prompt delivery, interruption, and termination regardless of the underlying agent tool or execution mode.

The canonical backend list is defined by the BackendKind literal type in src/houmao/agents/realm_controller/models.py. Adding a new backend requires updating BackendKind, wiring the backend through launch_plan.py, and implementing the InteractiveSession protocol.

BackendKind

BackendKind = Literal[
    "local_interactive",
    "claude_headless",
    "codex_headless",
    "gemini_headless",
    "codex_app_server",
    "cao_rest",
    "houmao_server_rest",
]

Backend dispatch

flowchart LR
    LP["LaunchPlan<br/>.backend"]

    LI["LocalInteractiveSession<br/>persistent TUI in tmux pane<br/>send_prompt via tmux paste<br/>captures pane output"]
    CH["ClaudeHeadlessSession<br/>claude -p per turn<br/>JSON streaming output<br/>--continue for resume"]
    CX["CodexHeadlessSession<br/>codex exec --json per turn<br/>resume via thread_id"]
    GH["GeminiHeadlessSession<br/>gemini -p per turn<br/>--resume <session_id>"]
    LEG["Legacy/internal backends<br/>cao_rest / houmao_server_rest<br/>rejected for new public launches"]

    LP -->|local_interactive| LI
    LP -->|claude_headless| CH
    LP -->|codex_headless<br/>codex_app_server| CX
    LP -->|gemini_headless| GH
    LP -.->|legacy manifests only| LEG

Backend reference

local_interactive (primary)

Source: backends/local_interactive.py

The primary backend for interactive agent sessions. The agent runs as a real interactive CLI process inside a tmux pane, preserving the tool's native user experience (colors, interactive prompts, streaming output).

  • Session class: LocalInteractiveSession
  • Prompt delivery: via tmux paste-buffer, which simulates typing the prompt into the agent's stdin.
  • Relaunch continuation: provider-native startup arguments are applied before respawning the TUI in tmux window 0.
  • Role injection: bootstrap message sent as the first-turn prompt (see Role Injection).
  • Use case: development, debugging, and any workflow where direct interactive access to the agent is valuable.

claude_headless

Source: backends/claude_headless.py

Runs Claude Code CLI in headless mode (claude -p --verbose). Output is captured programmatically rather than displayed in an interactive terminal.

  • Session class: ClaudeHeadlessSession (extends HeadlessInteractiveSession)
  • Resume: --continue resumes the provider's latest stored conversation; --resume <session_id> resumes an exact provider session.
  • Role injection: when the role prompt is non-empty, Houmao passes --append-system-prompt <prompt> and sends one bootstrap message on the first turn. Empty prompts skip both startup inputs.
  • Use case: automated pipelines, batch processing, and non-interactive agent orchestration.

codex_headless

Source: backends/codex_headless.py

Runs Codex CLI in headless mode (codex exec --json). Produces structured JSON output for programmatic consumption.

  • Session class: CodexHeadlessSession (extends HeadlessInteractiveSession)
  • Resume: resume --last asks Codex to continue its latest stored chat; resume <session_id> resumes an exact provider session.
  • Role injection: when the role prompt is non-empty, Houmao passes -c developer_instructions=<prompt>. Empty prompts skip the developer-instructions flag.
  • Use case: automated pipelines, structured output processing, and non-interactive agent orchestration.

gemini_headless

Source: backends/gemini_headless.py

Runs Gemini CLI in headless mode (gemini -p).

  • Session class: GeminiHeadlessSession (extends HeadlessInteractiveSession)
  • Auth lanes: managed Gemini homes support GEMINI_API_KEY with optional GOOGLE_GEMINI_BASE_URL, or OAuth via projected oauth_creds.json. OAuth-backed homes inject GOOGLE_GENAI_USE_GCA=true when no explicit API-key or Vertex selector is already present.
  • Managed skills: Houmao-owned Gemini skills project into .gemini/skills; .agents/skills is only Gemini's upstream alias surface and is not Houmao's maintained contract.
  • Resume: --resume latest asks Gemini to continue its latest stored chat; --resume <session_id> resumes an exact provider session. Resume stays bound to the same recorded working directory/project context.
  • Role injection: bootstrap message sent as the first-turn prompt.
  • Use case: automated pipelines and non-interactive agent orchestration.

Gemini validation checklist

  • API-key lane: create a Gemini auth bundle with --api-key and optional --base-url, build or launch a managed Gemini home, and confirm the effective launch environment exports GEMINI_API_KEY plus GOOGLE_GEMINI_BASE_URL when configured.
  • OAuth lane: create a Gemini auth bundle with oauth_creds.json only, build or launch a managed Gemini home, and confirm the runtime exports GOOGLE_GENAI_USE_GCA=true without depending on a user-global Gemini settings.json.
  • Skill projection: inspect the constructed home and confirm Houmao-owned Gemini skills land under top-level .gemini/skills/houmao-.../; Houmao does not maintain a parallel .agents/skills mirror.
  • First-turn capture: verify the first stream-json Gemini turn emits a session_id and that Houmao persists that id into the managed session manifest.
  • Resume behavior: send a follow-up Gemini prompt from the same working directory and confirm Houmao launches gemini -p --resume <persisted-session-id>; changing the working directory should fail explicitly instead of silently retargeting another Gemini project store.

codex_app_server

Source: backends/codex_app_server.py

Runs Codex in app-server mode, which exposes a local HTTP interface for communication instead of using stdin/stdout.

  • Role injection: native developer instructions (same as codex_headless).
  • Use case: scenarios requiring HTTP-based interaction with the Codex agent.

cao_rest (legacy/internal)

Source: backends/cao_rest.py

Legacy backend that delegated session management to an external REST API. Public operator starts for this backend are retired and fail with migration guidance.

  • Role injection: profile-based injection via the external server.
  • Note: standalone operator use of backend='cao_rest' is retired in favor of maintained houmao-mgr local/headless flows and houmao-passive-server API workflows.

houmao_server_rest (legacy/internal)

Source: removed public backend implementation; retained manifests are rejected before new session creation.

Legacy old-server-backed path. New runtime manifests with this backend are not supported.

  • Role injection: profile-based injection via the server.
  • Note: use maintained local/headless runtime backends or passive-server-owned headless launch APIs.

Headless backend base class

The three headless backends (claude_headless, codex_headless, gemini_headless) share a common base class: HeadlessInteractiveSession.

This base class manages:

  • tmux-backed process execution: even in headless mode, the agent process runs inside a tmux pane for uniform process management, signal delivery, and output capture.
  • Resumable session state via HeadlessSessionState, which tracks:
  • session_id — backend-specific session/thread identifier for resume.
  • turn_index — number of prompt turns completed in this session.
  • role_bootstrap_applied — whether the first-turn role bootstrap message has been delivered.

This shared infrastructure ensures consistent behavior across headless backends for concerns like process lifecycle, output buffering, and session persistence.

Relaunch Chat-Session Mapping

Tmux-backed relaunch accepts a relaunch-only chat-session selector. It does not affect first launch and is distinct from gateway prompt-control chat-session selectors.

Tool Local interactive latest Local interactive exact Native headless latest Native headless exact
Codex codex resume --last codex resume <session_id> codex exec resume --last <prompt> codex exec resume <session_id> <prompt>
Claude Code claude --continue claude --resume <session_id> claude -p --continue <prompt> claude -p --resume <session_id> <prompt>
Gemini CLI gemini --resume latest gemini --resume <session_id> gemini --resume latest -p <prompt> gemini --resume <session_id> -p <prompt>

When the relaunch selector is omitted, the runtime uses new and starts a fresh provider chat. When a local interactive relaunch resumes an existing provider chat, bootstrap-message role injection is not replayed into that chat.

InteractiveSession protocol

All backends implement the InteractiveSession protocol:

class InteractiveSession(Protocol):
    def send_prompt(self, prompt: str) -> list[SessionEvent]: ...
    def interrupt(self) -> SessionControlResult: ...
    def terminate(self) -> SessionControlResult: ...
    def close(self) -> None: ...

See Session Lifecycle for details on how the protocol is used.

See also