Backends¶
All agent sessions in Houmao are executed by a backend. Each backend implements the InteractiveSession protocol, providing a uniform interface for prompt delivery, interruption, and termination regardless of the underlying agent tool or execution mode.
The canonical backend list is defined by the BackendKind literal type in src/houmao/agents/realm_controller/models.py. Adding a new backend requires updating BackendKind, wiring the backend through launch_plan.py, and implementing the InteractiveSession protocol.
BackendKind¶
BackendKind = Literal[
"local_interactive",
"claude_headless",
"codex_headless",
"gemini_headless",
"codex_app_server",
"cao_rest",
"houmao_server_rest",
]
Backend dispatch¶
flowchart LR
LP["LaunchPlan<br/>.backend"]
LI["LocalInteractiveSession<br/>persistent TUI in tmux pane<br/>send_prompt via tmux paste<br/>captures pane output"]
CH["ClaudeHeadlessSession<br/>claude -p per turn<br/>JSON streaming output<br/>--continue for resume"]
CX["CodexHeadlessSession<br/>codex exec --json per turn<br/>resume via thread_id"]
GH["GeminiHeadlessSession<br/>gemini -p per turn<br/>--resume <session_id>"]
LEG["Legacy backends<br/>cao_rest / houmao_server_rest<br/>delegate to external server"]
LP -->|local_interactive| LI
LP -->|claude_headless| CH
LP -->|codex_headless<br/>codex_app_server| CX
LP -->|gemini_headless| GH
LP -->|legacy| LEG
Backend reference¶
local_interactive (primary)¶
Source: backends/local_interactive.py
The primary backend for interactive agent sessions. The agent runs as a real interactive CLI process inside a tmux pane, preserving the tool's native user experience (colors, interactive prompts, streaming output).
- Session class:
LocalInteractiveSession - Prompt delivery: via tmux paste-buffer, which simulates typing the prompt into the agent's stdin.
- Role injection: bootstrap message sent as the first-turn prompt (see Role Injection).
- Use case: development, debugging, and any workflow where direct interactive access to the agent is valuable.
claude_headless¶
Source: backends/claude_headless.py
Runs Claude Code CLI in headless mode (claude -p --verbose). Output is captured programmatically rather than displayed in an interactive terminal.
- Session class:
ClaudeHeadlessSession(extendsHeadlessInteractiveSession) - Resume:
--continueflag to resume a previous conversation. - Role injection: when the role prompt is non-empty, Houmao passes
--append-system-prompt <prompt>and sends one bootstrap message on the first turn. Empty prompts skip both startup inputs. - Use case: automated pipelines, batch processing, and non-interactive agent orchestration.
codex_headless¶
Source: backends/codex_headless.py
Runs Codex CLI in headless mode (codex exec --json). Produces structured JSON output for programmatic consumption.
- Session class:
CodexHeadlessSession(extendsHeadlessInteractiveSession) - Resume:
resume <thread_id>command to continue a previous thread. - Role injection: when the role prompt is non-empty, Houmao passes
-c developer_instructions=<prompt>. Empty prompts skip the developer-instructions flag. - Use case: automated pipelines, structured output processing, and non-interactive agent orchestration.
gemini_headless¶
Source: backends/gemini_headless.py
Runs Gemini CLI in headless mode (gemini -p).
- Session class:
GeminiHeadlessSession(extendsHeadlessInteractiveSession) - Auth lanes: managed Gemini homes support
GEMINI_API_KEYwith optionalGOOGLE_GEMINI_BASE_URL, or OAuth via projectedoauth_creds.json. OAuth-backed homes injectGOOGLE_GENAI_USE_GCA=truewhen no explicit API-key or Vertex selector is already present. - Managed skills: Houmao-owned Gemini skills project into
.gemini/skills;.agents/skillsis only Gemini's upstream alias surface and is not Houmao's maintained contract. - Resume:
--resume <session_id>when the session manifest already persists a Gemini session id. Resume stays bound to the same recorded working directory/project context. - Role injection: bootstrap message sent as the first-turn prompt.
- Use case: automated pipelines and non-interactive agent orchestration.
Gemini validation checklist¶
- API-key lane: create a Gemini auth bundle with
--api-keyand optional--base-url, build or launch a managed Gemini home, and confirm the effective launch environment exportsGEMINI_API_KEYplusGOOGLE_GEMINI_BASE_URLwhen configured. - OAuth lane: create a Gemini auth bundle with
oauth_creds.jsononly, build or launch a managed Gemini home, and confirm the runtime exportsGOOGLE_GENAI_USE_GCA=truewithout depending on a user-global Geminisettings.json. - Skill projection: inspect the constructed home and confirm Houmao-owned Gemini skills land under top-level
.gemini/skills/houmao-.../; Houmao does not maintain a parallel.agents/skillsmirror. - First-turn capture: verify the first
stream-jsonGemini turn emits asession_idand that Houmao persists that id into the managed session manifest. - Resume behavior: send a follow-up Gemini prompt from the same working directory and confirm Houmao launches
gemini -p --resume <persisted-session-id>; changing the working directory should fail explicitly instead of silently retargeting another Gemini project store.
codex_app_server¶
Source: backends/codex_app_server.py
Runs Codex in app-server mode, which exposes a local HTTP interface for communication instead of using stdin/stdout.
- Role injection: native developer instructions (same as
codex_headless). - Use case: scenarios requiring HTTP-based interaction with the Codex agent.
cao_rest (legacy)¶
Source: backends/cao_rest.py
Legacy backend that delegates session management to an external server via REST API. Planned for removal.
- Role injection: profile-based injection via the external server.
- Note: standalone operator use of
backend='cao_rest'is retired in favor ofhoumao-server+houmao-mgr.
houmao_server_rest (legacy)¶
Source: backends/houmao_server_rest.py
Legacy server-backed path that wraps cao_rest internally, routing through the Houmao server layer.
- Role injection: profile-based injection via the server.
- Note: this backend exists for backward compatibility and is expected to be consolidated with the newer
houmao-serverarchitecture.
Headless backend base class¶
The three headless backends (claude_headless, codex_headless, gemini_headless) share a common base class: HeadlessInteractiveSession.
This base class manages:
- tmux-backed process execution: even in headless mode, the agent process runs inside a tmux pane for uniform process management, signal delivery, and output capture.
- Resumable session state via
HeadlessSessionState, which tracks: session_id— backend-specific session/thread identifier for resume.turn_index— number of prompt turns completed in this session.role_bootstrap_applied— whether the first-turn role bootstrap message has been delivered.
This shared infrastructure ensures consistent behavior across headless backends for concerns like process lifecycle, output buffering, and session persistence.
InteractiveSession protocol¶
All backends implement the InteractiveSession protocol:
class InteractiveSession(Protocol):
def send_prompt(self, prompt: str) -> list[SessionEvent]: ...
def interrupt(self) -> SessionControlResult: ...
def terminate(self) -> SessionControlResult: ...
def close(self) -> None: ...
See Session Lifecycle for details on how the protocol is used.
See also¶
- Launch Plan — how backend-specific launch plans are composed
- Role Injection — per-backend role injection strategies
- Session Lifecycle — how backends are used within the session lifecycle