01 - Real-time voice agent architecture
Front office in crisis.
A LiveKit voice agent for an orthopedic practice drowning in calls. The product call: build a constrained scheduling agent, not a broad receptionist, and prove it through a deterministic policy gate, a medical-grade voice stack, and a staff-facing review queue.
Runtime
LiveKit Agents
Low-latency voice orchestration with inspectable traces.
Voice stack
Deepgram / Cartesia / GPT-4o
Chosen for streaming, speed, and controllability.
System boundary
eClinicalWorks
Enough integration to schedule, not enough to verify payer truth.
Delivery scope
Scheduling and routing MVP
Narrower than a receptionist, safer than a broad automation claim.
02 - Operational State
900 calls. 200 callers. Patients redialing.
Inbound calls / day
~900
Peak daily call load observed in the operating scenario this concept is built around.
Unique callers
~200
The gap points to repeat dialing because callers were not getting through cleanly.
Scheduling intent
50%
Scheduling is the highest-volume candidate for bounded automation.
Billing calls
~33%
Billing routes outside the agent to an external revenue-cycle queue.
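The figures above can be turned into a rough volume picture. This is a back-of-envelope sketch, assuming the scheduling and billing percentages are shares of daily inbound calls (the source does not say whether they are shares of calls or of callers); the variable names are illustrative.

```python
# Approximate figures taken from the operational snapshot above.
calls_per_day = 900
unique_callers = 200
scheduling_share = 0.50  # ASSUMPTION: share of inbound calls, not of callers
billing_share = 0.33     # ASSUMPTION: share of inbound calls, not of callers

# Average dial attempts per caller; values well above 1 suggest redialing.
calls_per_caller = calls_per_day / unique_callers

# Rough daily volume per intent bucket.
scheduling_calls = round(calls_per_day * scheduling_share)
billing_calls = round(calls_per_day * billing_share)
other_calls = calls_per_day - scheduling_calls - billing_calls

print(f"calls per unique caller: {calls_per_caller:.1f}")
print(f"scheduling calls/day:    {scheduling_calls}")
print(f"billing calls/day:       {billing_calls}")
print(f"everything else:         {other_calls}")
```

Under those assumptions the average caller dials about 4.5 times, and roughly 450 calls a day fall inside the scheduling scope the agent is meant to cover.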
Operating environment
A practice growing faster than its front-office process.
eClinicalWorks support
Not available
Risk signal
03 - Decision Point
The decision is not whether voice AI can answer phones. The decision is whether Summit should trust an agent to perform part of its access workflow while the organization is still restructuring.
The product ruling is narrow: launch a constrained scheduling and routing agent with hard exclusions, staff review, and staged rollout. Earn scope through performance, not ambition.
04 - Product Strategy
A bounded scheduling agent, not a fantasy receptionist.
Why workers comp is excluded
Why insurance logic waits
Included in MVP
Excluded from MVP
05 - Architecture
LiveKit is the runtime. Twilio is the carrier.
This build is designed as a LiveKit showcase, not a Twilio bot. Telephony becomes an ingress channel into LiveKit, not the center of the system. Each inbound call creates a LiveKit room: a bounded context for the entire patient interaction.
The room model matters because callers interrupt, pause, search for insurance cards, ask side questions, and become frustrated. A request-response chatbot architecture would feel brittle.
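One way to picture the room as a bounded context is a small per-call state object that survives interruptions and pauses. The sketch below is plain Python and does not call the LiveKit SDK; `CallRoomContext`, its fields, and the `call-<id>` naming scheme are hypothetical illustrations, not the build's actual schema.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class CallRoomContext:
    """Hypothetical per-call state that could be serialized into room metadata."""

    call_id: str
    caller_number: str
    intent: str = "unknown"
    transfer_only: bool = False
    interruptions: int = 0
    notes: list = field(default_factory=list)

    def room_name(self) -> str:
        # ASSUMPTION: one room per call, named after the call id.
        return f"call-{self.call_id}"

    def to_metadata(self) -> str:
        # Serialize the whole context as a JSON string.
        return json.dumps(asdict(self), sort_keys=True)


def open_room_for_call(caller_number: str) -> CallRoomContext:
    """Create a fresh context for one inbound call."""
    return CallRoomContext(call_id=uuid.uuid4().hex[:12], caller_number=caller_number)


# The state persists across a pause instead of resetting per request.
ctx = open_room_for_call("+15555550100")
ctx.interruptions += 1
ctx.notes.append("caller paused to find insurance card")
print(ctx.room_name())
print(ctx.to_metadata())
```

The point of the shape is that interruptions and side questions update one long-lived object rather than starting a new request each time.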
LiveKit room per call
Participants
Room metadata
SIP ingress flow
06 - Technical Design
Every layer earns its place.
STT evaluation set
Voice style parameters
Body-part routing
07 - Safety Layer
The LLM proposes. The policy gate decides.
Core tool signatures
Allowed
Appointment with explicit confirmation
The scheduling tool is valid only after the caller confirms the slot and the path remains inside approved policy.
Excluded
Workers compensation booking
Workers compensation flips the session into transfer-only mode because the integration cannot validate authorization and claim data.
Excluded
Medical advice response
Clinical questions route to capture-and-escalate. The system prompt and tool gate both block generated medical guidance.
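The three rules above can be expressed as a deterministic check that runs before any tool executes. This is a minimal sketch, not the build's real tool layer: the tool names (`book_appointment`, `answer_medical_question`), the `workers_comp` argument, and the `Decision` values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    TRANSFER = "transfer"
    ESCALATE = "capture_and_escalate"
    DENY = "deny"


@dataclass
class Session:
    transfer_only: bool = False
    confirmed_slot_id: Optional[str] = None  # set only after the caller says yes


@dataclass
class ToolCall:
    name: str  # hypothetical tool name proposed by the LLM
    args: dict


def policy_gate(session: Session, call: ToolCall) -> Decision:
    """Decide what happens to a tool call the LLM has proposed."""
    # Workers compensation flips the whole session into transfer-only mode.
    if call.args.get("workers_comp"):
        session.transfer_only = True
    if session.transfer_only:
        return Decision.TRANSFER

    # Clinical questions are captured and escalated, never answered.
    if call.name == "answer_medical_question":
        return Decision.ESCALATE

    # Booking is valid only for the exact slot the caller confirmed.
    if call.name == "book_appointment":
        slot_id = call.args.get("slot_id")
        if slot_id is not None and slot_id == session.confirmed_slot_id:
            return Decision.ALLOW
        return Decision.DENY

    # Anything the gate does not recognize is denied by default.
    return Decision.DENY


session = Session()
print(policy_gate(session, ToolCall("book_appointment", {"slot_id": "A1"})))
session.confirmed_slot_id = "A1"
print(policy_gate(session, ToolCall("book_appointment", {"slot_id": "A1"})))
```

The gate holds no model state and makes no model calls, so the same inputs always produce the same decision, which is what makes it auditable.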
Human approval mode
08 - Voice Engineering
Barge-in and end-of-turn are two different problems.
Fast barge-in protects the caller from being talked over. Slower end-of-turn detection protects elderly or uncertain callers from being interrupted mid-thought. One wants a short threshold and the other a long one, so each needs its own tuning.
Patience prompts
No-action-on-uncertainty rule
Normal scheduling conversation
Elderly or slow caller
Yes / no confirmation
Caller searching for insurance card
Noisy audio
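The caller situations listed above suggest keeping barge-in and end-of-turn as separate thresholds per situation. The sketch below shows that separation only; every millisecond value is an invented placeholder, not a tuned or measured setting, and `TurnProfile` and the profile keys are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnProfile:
    barge_in_ms: int     # caller speech needed before the agent stops talking
    end_of_turn_ms: int  # caller silence needed before the agent replies


# ASSUMPTION: placeholder values chosen only to show the shape of the config.
PROFILES = {
    "normal_scheduling": TurnProfile(barge_in_ms=200, end_of_turn_ms=600),
    "elderly_or_slow": TurnProfile(barge_in_ms=200, end_of_turn_ms=1500),
    "yes_no_confirmation": TurnProfile(barge_in_ms=200, end_of_turn_ms=400),
    "searching_for_card": TurnProfile(barge_in_ms=200, end_of_turn_ms=4000),
    "noisy_audio": TurnProfile(barge_in_ms=400, end_of_turn_ms=1000),
}


def should_stop_speaking(profile: TurnProfile, caller_speech_ms: int) -> bool:
    """Barge-in: yield the floor once the caller has clearly started talking."""
    return caller_speech_ms >= profile.barge_in_ms


def caller_turn_ended(profile: TurnProfile, silence_ms: int) -> bool:
    """End-of-turn: reply only after the caller has been quiet long enough."""
    return silence_ms >= profile.end_of_turn_ms


normal = PROFILES["normal_scheduling"]
slow = PROFILES["elderly_or_slow"]
# The same 900 ms pause ends a normal turn but not a slow caller's turn.
print(caller_turn_ended(normal, 900), caller_turn_ended(slow, 900))
# Barge-in stays equally fast in both profiles.
print(should_stop_speaking(normal, 250), should_stop_speaking(slow, 250))
```

Because the two thresholds are independent fields, the agent can be patient with a slow caller without becoming slow to yield when that caller interrupts.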
09 - Operational Control
The dashboard is the trust layer.
Without the staff review queue, the system becomes an invisible automation layer and staff will not trust it. The dashboard turns the agent from a black box into an auditable work surface.
Most important early metric
Not automation rate. Corrected appointment rate.
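A corrected appointment rate could be computed as below. The definition used here, the share of agent-created appointments that staff later had to change, is an assumption; the source names the metric without defining it, and the function name is illustrative.

```python
def corrected_appointment_rate(agent_booked: int, staff_corrected: int) -> float:
    """Share of agent-created appointments that staff had to correct.

    ASSUMPTION: both counts cover the same review period and
    staff_corrected counts each appointment at most once.
    """
    if agent_booked < 0 or staff_corrected < 0:
        raise ValueError("counts must be non-negative")
    if staff_corrected > agent_booked:
        raise ValueError("cannot correct more appointments than were booked")
    if agent_booked == 0:
        return 0.0  # nothing booked yet, so nothing to correct
    return staff_corrected / agent_booked


# Illustrative numbers only, not measured data.
rate = corrected_appointment_rate(agent_booked=40, staff_corrected=3)
print(f"corrected appointment rate: {rate:.1%}")
```

A high automation rate with a high corrected rate means the agent is creating work for staff rather than removing it, which is why this number is the one to watch first.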
Control surfaces
Room inspector
Live transcript stream
Turn detection timeline
Tool-call panel
Latency metrics
Replay mode
Failure injection
Staff review queue
Transfer simulation
Correction reason ledger
10 - Staged Rollout
Earn scope through performance.
Rollback triggers
Phase 0
Discovery and rule capture
Phase 1
Internal shadow mode
Phase 2
Human approval mode
Phase 3
Limited direct booking
Phase 4
Production V1
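The phases above can be read as a switch on what happens to a booking the agent proposes. The mapping below is one interpretation, not something the source specifies: the disposition strings, the allowlist for Phase 3, and the one-step rollback are all assumptions. Excluded categories would still be blocked by the policy gate in every phase.

```python
from enum import IntEnum


class Phase(IntEnum):
    DISCOVERY = 0       # Discovery and rule capture
    SHADOW = 1          # Internal shadow mode
    HUMAN_APPROVAL = 2  # Human approval mode
    LIMITED_DIRECT = 3  # Limited direct booking
    PRODUCTION_V1 = 4   # Production V1


def booking_disposition(phase: Phase, appointment_type: str, direct_allowlist: set) -> str:
    """ASSUMED mapping from rollout phase to how a proposed booking is handled."""
    if phase == Phase.DISCOVERY:
        return "not_running"      # no agent on the line yet
    if phase == Phase.SHADOW:
        return "log_only"         # record the proposal, change nothing
    if phase == Phase.HUMAN_APPROVAL:
        return "queue_for_staff"  # staff approve every booking
    if phase == Phase.LIMITED_DIRECT:
        if appointment_type in direct_allowlist:
            return "book_direct"
        return "queue_for_staff"
    return "book_direct"


def roll_back(phase: Phase) -> Phase:
    """Step down one phase when a rollback trigger fires (never below Phase 0)."""
    return Phase(max(phase - 1, Phase.DISCOVERY))


allowlist = {"follow_up"}  # hypothetical appointment type
print(booking_disposition(Phase.LIMITED_DIRECT, "follow_up", allowlist))
print(booking_disposition(Phase.LIMITED_DIRECT, "new_patient", allowlist))
print(roll_back(Phase.LIMITED_DIRECT).name)
```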
11 - Acceptance Gates
Targets are design gates, not fake production bragging.
Every number is a target measured against a call simulation set, not a public claim about live production performance.
Intent classification
>=92%
Target on a simulation set after one clarification pass.
DOB capture
>=96%
Required before any patient lookup or appointment logic.
Median turn latency
<700ms
End-of-turn detection to first agent audio frame.
P95 turn latency
<1.5s
Tail latency budget for harder tool-bound replies.
Invariant 01
Appointment without confirmation
Invariant 02
Workers comp booking
Invariant 03
Medical advice response
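The gates above can be checked mechanically against a simulation run. This is a sketch under two assumptions: P95 is computed with the nearest-rank method, and each invariant is treated as requiring zero occurrences (the source lists the invariants but does not state a count). The function names and the sample numbers are illustrative, not results.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def evaluate_gates(intent_accuracy, dob_accuracy, turn_latencies_ms, invariant_violations):
    """Return a pass/fail flag per acceptance gate."""
    return {
        "intent_classification": intent_accuracy >= 0.92,
        "dob_capture": dob_accuracy >= 0.96,
        "median_turn_latency": statistics.median(turn_latencies_ms) < 700,
        "p95_turn_latency": percentile(turn_latencies_ms, 95) < 1500,
        # ASSUMPTION: an invariant passes only with zero violations.
        "invariants": all(count == 0 for count in invariant_violations.values()),
    }


# Made-up simulation numbers, for illustration only.
latencies = [480, 520, 610, 640, 660, 680, 720, 900, 1100, 1450]
violations = {
    "appointment_without_confirmation": 0,
    "workers_comp_booking": 0,
    "medical_advice_response": 0,
}
results = evaluate_gates(0.93, 0.97, latencies, violations)
for gate, passed in results.items():
    print(f"{gate}: {'pass' if passed else 'fail'}")
```

Keeping the invariants as a separate all-or-nothing gate means strong accuracy or latency numbers cannot offset a single unsafe booking.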
12 - Live Build
The agent on the line.
This section describes the instrumented build target without pretending a live production phone path is active.
What the demo will do
What is instrumented
Failure injection
13 - PM Recommendation
Build the voice agent, but do not build the fantasy version.
The product should be judged by whether it makes Summit calmer, not by whether it maximizes automation. The agent earns more scope only after it proves that it can schedule safely, transfer honestly, and create less work than it removes.
14 - Source Note
Serious proposal. Not a deployment claim.
The value of this page is not that it pretends to be shipped work. The value is that it treats Summit like a real product problem with operational pressure, a bounded MVP, explicit safeguards, and a concrete path to a production-ready voice system.
What this case study is
What it is not
Start Here
Start with the business boundary, then choose the AI.
Use the assessment to determine whether the right first move is a constrained intake agent, a deeper workflow build, or a more review-heavy path before any live voice rollout is considered.