Realm Controller¶
houmao.agents.realm_controller is the run-phase orchestration layer in Houmao. It takes a built brain manifest and a role, composes them into a backend-specific launch plan, and manages the full lifecycle of an interactive agent session — start, resume, prompt, and stop.
Source Location¶
src/houmao/agents/realm_controller/ — lazy public exports for realm-controller runtime helpers.
Key modules:
| Module | Responsibility |
|---|---|
models.py |
Canonical backend and session contracts (BackendKind, InteractiveSession, RuntimeSessionController) |
launch_plan.py |
Bridge between build-time manifest and run-time launch (LaunchPlan, build_launch_plan()) |
session.py |
Session lifecycle functions (start_runtime_session(), resume_runtime_session()) |
cli.py |
Retired compatibility CLI helpers retained only for internal/debug use; the package no longer installs houmao-cli |
backends/ |
Per-backend session implementations |
Backend Model¶
Every agent session runs on a specific backend. The BackendKind literal type in models.py is the authoritative list:
BackendKind = Literal[
"local_interactive",
"codex_headless",
"codex_app_server",
"claude_headless",
"gemini_headless",
"cao_rest",
"houmao_server_rest",
]
local_interactive — Primary Backend¶
The default and recommended backend. Launches agent CLI tools as tmux-backed interactive sessions via LocalInteractiveSession. This gives each agent a real terminal with full native UX, scrollback, and the ability to attach/detach at will.
All local backends (codex, claude, gemini headless modes) ultimately use the shared tmux runtime primitives.
Model Selection (Claude Code)¶
Claude model selection remains environment-driven. Set ANTHROPIC_MODEL for the primary model, and use optional companion variables such as ANTHROPIC_SMALL_FAST_MODEL, CLAUDE_CODE_SUBAGENT_MODEL, and ANTHROPIC_DEFAULT_*_MODEL aliases when your local Claude profile needs explicit pins.
Houmao does not invent a second model-selection layer in the realm controller. The launch plan projects the effective runtime home and allowlisted environment, and Claude reads the selected model configuration from that environment during launch.
codex_headless¶
Runs codex exec --json for non-interactive, structured prompt–response cycles. Supports resume via resume <thread_id>.
codex_app_server¶
Runs Codex in app-server mode for persistent, server-backed sessions.
claude_headless¶
Runs claude -p for non-interactive prompt–response cycles. Supports session continuation via --continue.
gemini_headless¶
Runs gemini -p for non-interactive prompt–response cycles. Managed Gemini homes support either GEMINI_API_KEY with optional GOOGLE_GEMINI_BASE_URL, or OAuth via oauth_creds.json; OAuth-backed homes inject GOOGLE_GENAI_USE_GCA=true when no explicit API-key or Vertex selector is already present. Houmao-owned Gemini skills project into .gemini/skills, while .agents/skills remains only Gemini's upstream alias surface. Follow-up turns resume with --resume <persisted-session-id> and must stay in the same recorded working directory.
Legacy/Internal Backends¶
cao_rest and houmao_server_rest may still appear in type definitions or old manifests so retained internal compatibility code can reject or inspect them explicitly. They are not supported public launch targets. New operator workflows should use local_interactive, native headless backends, houmao-mgr managed-agent commands, or passive-server-owned headless routes.
cao_rest— Retired standalone CAO compatibility boundary. Public CLI paths reject new standalone starts with migration guidance.houmao_server_rest— Retired old-server REST backend identity. Runtime refuses to create new sessions with this backend; use maintained local/headless manager flows orhoumao-passive-serverAPIs.
Launch Plan¶
LaunchPlan is the backend-agnostic data object that bridges build-time and run-time. It is constructed by build_launch_plan() from a brain manifest and a role, and encapsulates everything needed to start a session:
| Field | Description |
|---|---|
backend |
Target BackendKind |
tool |
Agent CLI tool name (e.g., codex, claude, gemini) |
executable |
Resolved path to the tool binary |
args |
CLI arguments for the tool |
working_directory |
Working directory for the agent process |
home_env_var |
Environment variable name for the tool’s home directory (e.g., CODEX_HOME) |
home_path |
Resolved path to the projected runtime home |
env |
Merged environment variables (allowlisted vars + launch overrides) |
role_injection |
Backend-specific role injection payload |
metadata |
Freeform metadata passed through to the session |
mailbox |
Optional mailbox binding for inter-agent messaging |
Launch overrides from recipes and direct builds are intentionally limited to secret-free settings. Protocol-required arguments such as claude -p, gemini -p, codex exec --json, resume, and app-server stay backend-owned and are not exposed as overrides.
Session Lifecycle¶
The InteractiveSession Protocol¶
All backends implement the InteractiveSession protocol defined in models.py:
send_prompt(prompt)— Send a prompt to the running agent and return the response.interrupt()— Send an interrupt signal to the agent process.terminate()— Terminate the agent session.close()— Clean up resources associated with the session.
RuntimeSessionController¶
The RuntimeSessionController manages the full session lifecycle. It holds references to the active session, its manifest, and backend-specific state.
Starting a Session¶
start_runtime_session() takes a LaunchPlan and:
- Resolves the backend implementation from
LaunchPlan.backend. - Creates the tmux session or native headless controller for maintained local/headless backends.
- Injects the role via the backend-specific injection strategy.
- Persists the session manifest to disk.
- Returns the active
InteractiveSession.
Resuming a Session¶
resume_runtime_session() reattaches to an existing session from a persisted manifest. The resume mechanism is backend-specific:
local_interactive: Reattaches to the existing tmux session.codex_headless: Usesresume <thread_id>to continue the Codex thread.claude_headless: Uses--continueto resume the Claude session.gemini_headless: Uses--resume <persisted-session-id>to resume the Gemini session in the same recorded working directory/project context.
Sending Prompts¶
Once a session is running, send_prompt() delivers a user prompt to the agent and returns the response. For local_interactive sessions, this sends keystrokes to the tmux pane. For headless backends, this invokes the CLI tool with the prompt as an argument.
Stopping a Session¶
terminate() stops the agent process and close() cleans up session resources. For tmux-backed sessions, this kills the tmux session.
Role Injection¶
Role injection is backend-specific and handled during launch plan construction in launch_plan.py:
| Backend | Injection Strategy |
|---|---|
codex_headless / codex_app_server |
Native developer instructions when the role prompt is non-empty |
claude_headless |
Native appended system prompt plus a bootstrap message when the role prompt is non-empty |
gemini_headless |
Bootstrap message when the role prompt is non-empty |
local_interactive |
Tool-dependent native injection or bootstrap, skipped when the role prompt is empty |
The role content comes from the role package (roles/<role>/system-prompt.md) in the agent definition directory.
Versioned Unattended Launch Policy¶
Unattended startup is a versioned launch policy resolved at launch time against the installed CLI version of the target tool. If the installed version does not match a known launch policy, the session fails closed rather than guessing a bootstrap strategy. This prevents silent behavioral drift when CLI tools update their interfaces.
CLI Surface¶
The package no longer installs houmao-cli. Use houmao-mgr for maintained runtime lifecycle, managed-agent, gateway, mailbox, credential, cleanup, and project workflows. Use houmao-passive-server for maintained API-based discovery, observation, gateway proxying, mailbox proxying, and managed-headless work.