Houmao Server State Tracking¶
houmao-server owns the public tracked-state contract for supported interactive TUIs, but it is no longer always the component that runs the live tracker. Clients and dashboards consume HoumaoTerminalStateResponse; they do not run a second reducer. Interactive Codex tracking now resolves through the codex_tui tracked-TUI family, while headless backend names such as codex_app_server remain outside this subsystem.
There are now two live-tracking execution modes behind the same public models:
- Direct fallback:
houmao-serverruns the tracker locally for eligible managed TUI sessions. - Gateway-owned tracking: an attached healthy per-agent gateway runs the live tracker for that one session, and
houmao-serverprojects the gateway state back through the same server routes.
This keeps one public state contract while enforcing single-owner tracking authority during attach, detach, and gateway health transitions.
The public models live in src/houmao/server/models.py, which re-exports core types from src/houmao/shared_tui_tracking/models.py. The standalone tracker engine now lives in src/houmao/shared_tui_tracking/session.py, while src/houmao/server/tui/tracking.py acts as the live host adapter that merges shared tracker state with server-owned diagnostics and lifecycle metadata. Timed readiness and completion behavior still reuse the shared ReactiveX lifecycle kernel in src/houmao/lifecycle/rx_lifecycle_kernel.py. For the file-by-file package layout of the server watch plane, see internals/tui_tracking_module.md.
Public Contract¶
For the full definition of each state value (intuitive meaning, technical derivation, operational implications), see the State Reference Guide. For state transition diagrams and operation acceptability, see the State Transitions Guide.
The public state is intentionally small:
| Field | Values | Meaning |
|---|---|---|
diagnostics.availability |
available, unavailable, tui_down, error, unknown |
Whether the current sample is usable for normal tracked-state interpretation |
surface.accepting_input |
yes, no, unknown |
Whether typed input would currently land in the prompt area |
surface.editing_input |
yes, no, unknown |
Whether prompt-area input is actively being edited now |
surface.ready_posture |
yes, no, unknown |
Whether the visible surface looks ready for immediate submit |
turn.phase |
ready, active, unknown |
Current turn posture |
last_turn.result |
success, interrupted, known_failure, none |
Most recent completed terminal outcome |
last_turn.source |
explicit_input, surface_inference, none |
Where that recorded terminal turn came from |
stability |
signature, stable, stable_for_seconds, stable_since_utc |
Generic visible-state stability for the published response |
recent_transitions |
bounded list of HoumaoRecentTransition |
Recent visible state changes kept in memory for diagnostics |
Low-level observation detail still remains available alongside the simplified model:
transport_stateprocess_stateparse_status- optional
probe_error - optional
parse_error - nullable
parsed_surface - optional
probe_snapshot - tracked-session metadata such as
tracked_session.observed_tool_version
Server-owned lifecycle sidecars still remain available for consumers that need timing or authority detail:
operator_statelifecycle_timinglifecycle_authority
lifecycle_timing now includes the configured stale-active and final stable-active recovery windows in addition to readiness unknown and anchored completion timing.
Those lower-level fields are diagnostic evidence, not the primary consumer-facing lifecycle vocabulary.
End-To-End Flow¶
A more detailed state composition flowchart reflecting the
shared_tui_tracking/module extraction is maintained in state-transitions.md § State Composition.
One tracking cycle moves through these layers:
flowchart TD
Probe["tmux probe<br/>pane capture"]
Proc["process inspection"]
Parse["official parser"]
Surf["HoumaoParsedSurface"]
Detect["tool/version<br/>raw-text detector profile"]
Shared["standalone tracker session<br/>surface / turn / last_turn"]
Ready["readiness pipeline"]
Anchor{"turn anchor?"}
Anch["anchored settle pipeline"]
Public["diagnostics + shared tracker state<br/>+ server sidecars"]
Stable["stability"]
Hist["recent_transitions"]
Probe --> Proc --> Parse --> Surf
Probe --> Detect
Surf --> Ready
Surf --> Anchor
Anchor -->|active| Anch
Surf --> Public
Detect --> Shared
Ready --> Public
Anch --> Public
Shared --> Public --> Stable --> Hist
LiveSessionTracker.record_cycle() keeps the internal timing and anchor bookkeeping, but it now maps those internals into the simplified public contract rather than exposing readiness/completion/authority terms directly. The standalone tracker session may combine single-snapshot detector output with profile-owned temporal hints over a recent sliding window before those public fields are published.
Mapping Rules¶
Diagnostics¶
diagnostics.availability is derived from the low-level observation outcome:
error: probe or parse failed for this sampleunavailable: tracked tmux target is gonetui_down: tmux is reachable but the supported TUI process is not runningavailable: parser produced a supported parsed surfaceunknown: the server is still watching, but the sample is not classifiable confidently enough for normal interpretation
Foundational Surface Observables¶
surface.accepting_input, surface.editing_input, and surface.ready_posture are built from raw tmux pane text through the shared tracker’s tool/version detector profile. Parsed surface data is published as server-owned sidecar evidence and does not feed tracker authority.
Important consequences:
- progress indicators are supporting evidence only; they are not required for
turn.phase=active - ambiguous interactive UI such as menus, selection boxes, permission prompts, or slash-command pickers degrades to
unknownunless stronger active or terminal evidence exists - unexplained churn may still change diagnostics, surface observables, stability, or transitions without creating a turn
- full tmux scrollback may still be captured for parser and history use, but current activity cues do not have to trust arbitrary historical rows
For Codex, the tracker intentionally splits the surface into two scopes:
- the prompt-scoped latest-turn region for interruption, success context, and temporal transcript growth
- the live-edge tail for current status-row and tool-cell activity
That split prevents stale historical • Working (... esc to interrupt) rows far above the prompt from pinning turn.phase=active when the current surface is already prompt-ready.
Turn Phase¶
turn.phase is intentionally narrower than the internal reducer graph:
ready: the surface looks ready for another turn nowactive: there is enough evidence that a turn is currently in flightunknown: the server cannot safely classify the posture asreadyoractive
Explicit server-owned input acceptance is enough to arm an active turn immediately. Direct interactive prompting can still become active through shared-tracker raw-snapshot evidence without any parser-derived bridge.
There are two host-owned corrections on top of the shared tracker. The narrow stale-active correction clears an unanchored session after it stays submit-ready for the configured stale-active recovery window while the shared tracker is still stuck on stale status_row evidence. The final stable-active correction uses a longer default 20-second window and clears a stable false-active posture when raw TUI evidence and published state stop changing while independent parser evidence still says the prompt is idle/freeform and input-ready. These recoveries:
- require parsed submit-ready posture plus
surface.accepting_input=yesandsurface.editing_input=no - keep the 5-second narrow recovery limited to unanchored stale
status_rowevidence - allow the final 20-second recovery to correct
surface.ready_posture=no - stop anchored completion monitoring when final recovery expires a stale active anchor
- do not manufacture
last_turn.result=success
Last Turn¶
last_turn is sticky and only changes when a tracked turn reaches a terminal outcome:
success: the latest answer-bearing ready surface passed the settle window and has no current interrupt, failure, or advisory blockerinterrupted: the detector saw a supported current interruption signatureknown_failure: the detector saw a specifically supported failure signaturenone: no terminal result has been recorded yet
Two clarifications matter:
- success does not require a
Worked for <duration>-style marker on every turn - a premature success may be retracted if a later observation proves the same turn surface was still evolving; the tracker keeps the success anchor alive briefly to allow that correction before expiry
Turn Sources And Server Anchors¶
The public last_turn.source field and the server-owned lifecycle anchor state are related, but they are not the same thing.
The shared tracker can publish two turn sources:
| Source | How it is armed | Public source |
|---|---|---|
terminal_input |
POST /houmao/terminals/{terminal_id}/input succeeds and the service calls note_prompt_submission() |
explicit_input |
surface_inference |
raw tmux snapshots show enough active/success evidence for the shared tracker to infer a direct in-terminal turn | surface_inference |
Separately, the server still owns explicit-input lifecycle anchors for parser-fed operator_state, lifecycle_timing, and lifecycle_authority.
explicit_inputcan arm both the shared tracker and the server-owned lifecycle anchor.surface_inferenceis tracker-owned only. It can produce publiclast_turn.source=surface_inferencewhile server-ownedlifecycle_authority.turn_anchor_stateremainsabsent.- Parser-derived surface churn does not arm tracker authority. Parsed surface remains server-owned evidence and lifecycle input only.
- Observed tool version for tracker profile selection comes from tracked-session metadata, primarily manifest launch-policy provenance with graceful fallback when unavailable.
Stability And Recent Transitions¶
stability is generic visible-state stability, not public completion vocabulary.
The stability signature includes:
- diagnostics
- parsed surface
- foundational surface observables
- public
turn - public
last_turn
Any change resets stable_since_utc and stable_for_seconds.
recent_transitions is a bounded in-memory change log. It now records public transition facts such as:
diagnostics_availabilitysurface_accepting_inputsurface_editing_inputsurface_ready_postureturn_phaselast_turn_resultlast_turn_source
The low-level transport/process/parse fields remain attached for debugging and coarse managed-agent availability projection.
Because stale-active and final stable-active recovery change the published surface / turn fields, they also reset visible-state stability just like any other public-state change.
Maintainer Validation¶
Maintainer-facing validation is replay-grade and content-first:
live tmux pane + runtime liveness
|
v
content-first groundtruth
^
| compare
ReactiveX replay reducer
Use the scripts/explore/claude-code-state-tracking/ workflow to compare groundtruth against replayed observations. When detector behavior is refined, capture the stable matcher change in openspec/changes/.../tui-signals/ rather than leaving it implicit in code alone.