Realm Controller¶
houmao.agents.realm_controller is the run-phase orchestration layer in Houmao. It takes a built brain manifest and a role, composes them into a backend-specific launch plan, and manages the full lifecycle of an interactive agent session — start, resume, prompt, and stop.
Source Location¶
src/houmao/agents/realm_controller/ — lazy public exports for realm-controller runtime helpers.
Key modules:
| Module | Responsibility |
|---|---|
models.py |
Canonical backend and session contracts (BackendKind, InteractiveSession, RuntimeSessionController) |
launch_plan.py |
Bridge between build-time manifest and run-time launch (LaunchPlan, build_launch_plan()) |
session.py |
Session lifecycle functions (start_runtime_session(), resume_runtime_session()) |
cli.py |
houmao-cli entrypoint (deprecated; prefer houmao-mgr) |
backends/ |
Per-backend session implementations |
Backend Model¶
Every agent session runs on a specific backend. The BackendKind literal type in models.py is the authoritative list:
BackendKind = Literal[
"local_interactive",
"codex_headless",
"codex_app_server",
"claude_headless",
"gemini_headless",
"cao_rest",
"houmao_server_rest",
]
local_interactive — Primary Backend¶
The default and recommended backend. Launches agent CLI tools as tmux-backed interactive sessions via LocalInteractiveSession. This gives each agent a real terminal with full native UX, scrollback, and the ability to attach/detach at will.
All local backends (codex, claude, gemini headless modes) ultimately use the shared tmux runtime primitives.
Model Selection (Claude Code)¶
Claude model selection remains environment-driven. Set ANTHROPIC_MODEL for the primary model, and use optional companion variables such as ANTHROPIC_SMALL_FAST_MODEL, CLAUDE_CODE_SUBAGENT_MODEL, and ANTHROPIC_DEFAULT_*_MODEL aliases when your local Claude profile needs explicit pins.
Houmao does not invent a second model-selection layer in the realm controller. The launch plan projects the effective runtime home and allowlisted environment, and Claude reads the selected model configuration from that environment during launch.
codex_headless¶
Runs codex exec --json for non-interactive, structured prompt–response cycles. Supports resume via resume <thread_id>.
codex_app_server¶
Runs Codex in app-server mode for persistent, server-backed sessions.
claude_headless¶
Runs claude -p for non-interactive prompt–response cycles. Supports session continuation via --continue.
gemini_headless¶
Runs gemini -p for non-interactive prompt–response cycles. Managed Gemini homes support either GEMINI_API_KEY with optional GOOGLE_GEMINI_BASE_URL, or OAuth via oauth_creds.json; OAuth-backed homes inject GOOGLE_GENAI_USE_GCA=true when no explicit API-key or Vertex selector is already present. Houmao-owned Gemini skills project into .gemini/skills, while .agents/skills remains only Gemini's upstream alias surface. Follow-up turns resume with --resume <persisted-session-id> and must stay in the same recorded working directory.
Legacy Backends¶
Unmaintained — Deprecated Backends. The
cao_restandhoumao_server_restbackends remain in the codebase as legacy escape hatches, but their documentation is no longer actively maintained. Content below may be incorrect or stale. Preferlocal_interactivefor new work.
The following backends exist for backward compatibility and are planned for removal:
Unmaintained — Deprecated Backend. The
cao_restbackend remains in the codebase as an escape hatch. Its documentation is no longer actively maintained and the description below may be incorrect or stale. Preferlocal_interactivefor new work.
cao_rest— Delegates session lifecycle to an external CAO (CLI Agent Orchestrator) server over REST. This was the original orchestration path and carries significant complexity (~86 KB of integration code). New workflows should not target this backend.
Unmaintained — Deprecated Backend. The
houmao_server_restbackend remains in the codebase as an escape hatch. Its documentation is no longer actively maintained and the description below may be incorrect or stale. Preferlocal_interactivefor new work.
houmao_server_rest— A thin wrapper that routes requests throughhoumao-server, which itself delegates tocao_rest. Shares the same deprecation trajectory.
Launch Plan¶
LaunchPlan is the backend-agnostic data object that bridges build-time and run-time. It is constructed by build_launch_plan() from a brain manifest and a role, and encapsulates everything needed to start a session:
| Field | Description |
|---|---|
backend |
Target BackendKind |
tool |
Agent CLI tool name (e.g., codex, claude, gemini) |
executable |
Resolved path to the tool binary |
args |
CLI arguments for the tool |
working_directory |
Working directory for the agent process |
home_env_var |
Environment variable name for the tool’s home directory (e.g., CODEX_HOME) |
home_path |
Resolved path to the projected runtime home |
env |
Merged environment variables (allowlisted vars + launch overrides) |
role_injection |
Backend-specific role injection payload |
metadata |
Freeform metadata passed through to the session |
mailbox |
Optional mailbox binding for inter-agent messaging |
Launch overrides from recipes and direct builds are intentionally limited to secret-free settings. Protocol-required arguments such as claude -p, gemini -p, codex exec --json, resume, and app-server stay backend-owned and are not exposed as overrides.
Session Lifecycle¶
The InteractiveSession Protocol¶
All backends implement the InteractiveSession protocol defined in models.py:
send_prompt(prompt)— Send a prompt to the running agent and return the response.interrupt()— Send an interrupt signal to the agent process.terminate()— Terminate the agent session.close()— Clean up resources associated with the session.
RuntimeSessionController¶
The RuntimeSessionController manages the full session lifecycle. It holds references to the active session, its manifest, and backend-specific state.
Starting a Session¶
start_runtime_session() takes a LaunchPlan and:
- Resolves the backend implementation from
LaunchPlan.backend. - Creates the tmux session (for local backends) or connects to the remote endpoint (for REST backends).
- Injects the role via the backend-specific injection strategy.
- Persists the session manifest to disk.
- Returns the active
InteractiveSession.
Resuming a Session¶
resume_runtime_session() reattaches to an existing session from a persisted manifest. The resume mechanism is backend-specific:
local_interactive: Reattaches to the existing tmux session.codex_headless: Usesresume <thread_id>to continue the Codex thread.claude_headless: Uses--continueto resume the Claude session.gemini_headless: Uses--resume <persisted-session-id>to resume the Gemini session in the same recorded working directory/project context.
Sending Prompts¶
Once a session is running, send_prompt() delivers a user prompt to the agent and returns the response. For local_interactive sessions, this sends keystrokes to the tmux pane. For headless backends, this invokes the CLI tool with the prompt as an argument.
Stopping a Session¶
terminate() stops the agent process and close() cleans up session resources. For tmux-backed sessions, this kills the tmux session.
Role Injection¶
Role injection is backend-specific and handled during launch plan construction in launch_plan.py:
| Backend | Injection Strategy |
|---|---|
codex_headless / codex_app_server |
Native developer instructions when the role prompt is non-empty |
claude_headless |
Native appended system prompt plus a bootstrap message when the role prompt is non-empty |
gemini_headless |
Bootstrap message when the role prompt is non-empty |
local_interactive |
Tool-dependent native injection or bootstrap, skipped when the role prompt is empty |
The role content comes from the role package (roles/<role>/system-prompt.md) in the agent definition directory.
Versioned Unattended Launch Policy¶
Unattended startup is a versioned launch policy resolved at launch time against the installed CLI version of the target tool. If the installed version does not match a known launch policy, the session fails closed rather than guessing a bootstrap strategy. This prevents silent behavioral drift when CLI tools update their interfaces.
CLI Surface¶
The realm controller exposes a CLI via houmao-cli (the cli.py module) with the following commands:
build-brain— Run the build phase and emit a brain manifest.start-session— Build a launch plan and start a new session.send-prompt— Send a prompt to a running session.gateway-send-prompt— Send a prompt through a gateway-attached session.send-keys— Send raw control input to a resumed session.gateway-interrupt— Submit an interrupt through an attached live gateway.stop-session— Stop a session.attach-gateway— Attach a live gateway to a running session.detach-gateway— Detach a live gateway without stopping the session.gateway-status— Read gateway status from the live gateway or stable state artifact.cleanup-registry— Remove stale shared-registry live-agent directories.mail— Run mailbox operations against a resumed session.
Note:
houmao-cliis a deprecated compatibility entrypoint. The preferred CLI surface ishoumao-mgr(src/houmao/srv_ctrl/cli.py), which provides the same lifecycle operations along with managed agent, gateway, mailbox, and server control. See houmao-mgr CLI reference.