No push-to-talk — it's always-listening with barge-in (tap the orb / chip to interrupt). The breathing orb shows it's alive; the spoken caption renders the words as they're said. As the agent narrates, each claim materialises a card — appraisal, then comp chart on "here's the chart," then the suburb mini-profile — so you can scan and audit what you can't hold in your head.
RESEARCH §4 + §6: the 2025 move was OpenAI dissolving the voice orb back into content-rich cards. A content-free orb is the failure mode; a spoken valuation you can't check is worse. Here voice carries intent + a short verdict; the eyes get the computed facts (KNOW, mint) with a “tap to see the 6 comps it just mentioned” path — cite-or-abstain, spoken.
The comp chart is provenance-as-visual: every sold bar is a real comp, the over-asking guide bar in coral. Inline [1] on the spoken claim + "tap to see the 6 comps" opens the source cards.
Real shape: Bondi Beach 3-bed comps, $14.2k/m² land, decile 9, Bondi Junction transit, the KNOW-vs-RATE split. Faked: no live audio/Gemini; orb, waveforms and "materialise" timing are CSS animations; picsum/illustrative figures.
The boldest take: it abandons the chat-thread metaphor for an ambient, time-synced surface. Higher risk (timing the cards to speech is hard; latency shows), higher reward (feels like a real phone call with a smart agent). The conservative variant keeps the safe linear transcript.