You hold to talk (PTT — safe for a public inspection). The agent speaks a short verdict and drops an appraisal card into the transcript, pinned to the call so you can audit it after you hang up. Voice is the fast way to ask; the eyes do the comparing.
RESEARCH §4: voice is multimodal, not voice-only — a spoken valuation you can't check is useless. Every number is a computed fact (KNOW, mint-coloured) the agent narrates around; the verdict links to the 6 comps (cite-or-abstain). The agent volunteers a missing-data gap rather than faking completeness.
The “6 comps behind this number” strip inside the card — each comp is a tappable source card linking to the agency-direct listing, caught 3 days early.
Real shape: Eastern-Suburbs addresses, plausible Woollahra 3-bed comps, $/m², SEIFA decile, the KNOW-vs-RATE split. Faked: static mockup — no live audio, the waveforms/spinner are CSS loops, picsum stands in for listing photos, the comp figures are illustrative.
Most shippable of the three: it's a familiar call screen grammar (PTT, transcript, end-call). The card "drops in" but the layout stays a linear thread. The inventive variant dissolves the orb itself into the artifacts.