01 - Real-time voice agent architecture

Front office in crisis.

A LiveKit voice agent for an orthopedic practice drowning in calls. The product call: build a constrained scheduling agent, not a broad receptionist, and prove it through a deterministic policy gate, a medical-grade voice stack, and a staff-facing review queue.

Public-source implementation simulation. This is a serious proposal for how the system should be designed, not a claim of live production deployment, partnership, or access to private company systems.


Runtime

LiveKit Agents

Low-latency voice orchestration with inspectable traces.

Voice stack

Deepgram / Cartesia / GPT-4o

Chosen for streaming, speed, and controllability.

System boundary

eClinicalWorks

Enough integration to schedule, not enough to verify payer truth.

Delivery scope

Scheduling and routing MVP

Narrower than a receptionist, safer than a broad automation claim.

02 - Operational State

900 calls. 200 callers. Patients redialing.

Summit Orthopedics fired its outsourced call center after sustained patient complaints, poor insurance knowledge, and incorrect scheduling. The product problem was not a generic phone bot. It was an access workflow under pressure.

Inbound calls / day

~900

Observed peak call load in the operating frame used for the concept.

Unique callers

~200

The gap points to repeat dialing because callers were not getting through cleanly.

Scheduling intent

50%

Scheduling is the highest-volume candidate for bounded automation.

Billing calls

~33%

Billing routes outside the agent to an external revenue-cycle queue.

Operating environment

A practice growing faster than its front-office process.

With the outsourced call center gone, the full call volume landed on a front office that had no infrastructure to absorb it.

eClinicalWorks support

Patient lookup by name and date of birth

Provider schedule and availability reads

Appointment creation

Basic patient demographics

Not available

Insurance eligibility checking

Complex insurance mapping

Referral management

Risk signal

Repeat dialing suggests callers were not just waiting on hold. They were failing to complete the access workflow at all.
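The gap between calls and callers can be put in numbers. The sketch below uses only the approximate figures quoted above and assumes both describe the same day, which the source does not state outright.

```python
# Approximate figures from the operational state above.
inbound_calls_per_day = 900
unique_callers_per_day = 200

# Average number of dial attempts per caller.
calls_per_caller = inbound_calls_per_day / unique_callers_per_day

# Calls beyond each caller's first attempt, assuming every caller dialed at least once.
repeat_calls = inbound_calls_per_day - unique_callers_per_day
repeat_share = repeat_calls / inbound_calls_per_day

print(f"{calls_per_caller:.1f} calls per caller")
print(f"{repeat_share:.0%} of calls are repeat attempts")
```

Under those assumptions the average caller dials 4.5 times, and roughly 78% of daily volume is redial traffic rather than new demand.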

03 - Decision Point

The decision is not whether voice AI can answer phones. The decision is whether Summit should trust an agent to perform part of its access workflow while the organization is still restructuring.

The product ruling is narrow: launch a constrained scheduling and routing agent with hard exclusions, staff review, and staged rollout. Earn scope through performance, not ambition.

04 - Product Strategy

A bounded scheduling agent, not a fantasy receptionist.

A broad AI receptionist would be impressive in a demo and dangerous in production. A narrow scheduling agent can be tested, launched, reviewed, and improved.

Why workers comp is excluded

Workers compensation scheduling requires authorization data, claim numbers, and insurance coordination that no current integration supports. One wrong booking creates downstream legal exposure.

Why insurance logic waits

Insurance knowledge exists in staff heads more than in structured systems. Until that knowledge is documented and tested, the agent should not present itself as an eligibility authority.

Included in MVP

Inbound call answer and intent classification

New and existing patient scheduling

Name and DOB lookup against eClinicalWorks

Provider availability lookup

Appointment creation after confirmation

Referred-provider and body-part routing

Billing transfer to RCM queue

Workers compensation hard transfer

Medical request capture and escalation

Staff review dashboard

Excluded from MVP

Medical advice of any kind

Complex insurance interpretation

Insurance eligibility verification

Referral validation

Workers compensation scheduling

Surgery scheduling

Post-operative triage

Prior authorization handling

Payment collection

05 - Architecture

LiveKit is the runtime. Twilio is the carrier.

This build is designed as a LiveKit showcase, not a Twilio bot. Telephony becomes an ingress channel into LiveKit, not the center of the system. Each inbound call creates a LiveKit room: a bounded context for the entire patient interaction.

The room model matters because callers interrupt, pause, search for insurance cards, ask side questions, and become frustrated. A request-response chatbot architecture would feel brittle.

LiveKit room per call

Participants

patient_sip_participant - caller from PSTN via SIP ingress

summit_ai_agent - greeting, classification, scheduling, escalation

human_staff_participant - optional warm transfer or supervisor override

Room metadata

Call ID

Phone number

Agent version

Location

Patient identity status

Appointment workflow state

Exclusion flags

Tool-call eligibility

Escalation status

Transcript confidence
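One way to hold the room metadata listed above is a small typed record that serializes to a JSON string. This is a sketch only: the field names, default values, and the idea of storing the record as JSON are assumptions for illustration, not a LiveKit schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class RoomMetadata:
    """Hypothetical per-call state mirroring the metadata list above."""
    call_id: str
    phone_number: str
    agent_version: str
    location: str
    patient_identity_status: str = "unverified"   # assumed label
    workflow_state: str = "greeting"              # assumed label
    exclusion_flags: List[str] = field(default_factory=list)
    tool_call_eligible: bool = False              # closed until policy opens it
    escalation_status: str = "none"               # assumed label
    transcript_confidence: Optional[float] = None

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable for audit diffs.
        return json.dumps(asdict(self), sort_keys=True)


meta = RoomMetadata(
    call_id="call-0001",
    phone_number="+15550100",
    agent_version="0.1.0",
    location="main-clinic",
)
print(meta.to_json())
```

Defaulting tool-call eligibility to False means a session has to earn write access during the call rather than start with it.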

SIP ingress flow

Patient dials Summit phone number

Twilio or Telnyx receives the PSTN call

SIP trunk routes the call to LiveKit SIP

LiveKit creates a SIP participant in a room

The LiveKit Agent joins and begins the workflow

Human transfer bridges back out through SIP when needed

06 - Technical Design

Every layer earns its place.

The product challenge is not generating answers. The product challenge is managing real-time human conversation over audio in a medical office context while keeping irreversible actions policy-gated.

Runtime. LiveKit Agents for real-time audio sessions, programmable agent participation, tool use, and observability.

Telephony. LiveKit SIP plus Twilio SIP trunk. Twilio is the phone carrier, not the voice-agent runtime.

STT. Deepgram Nova-3 Medical for orthopedic terms, provider names, and phone audio quality.

Turn handling. Silero VAD plus LiveKit turn detector for both barge-in and slow callers.

Live LLM. GPT-4o for constrained tool use and intent classification, with policy carrying safety-critical logic.

Offline review. A stronger model run in batch for transcript review, policy checks, and QA categorization.

TTS. Cartesia Sonic 3 streaming with calm pacing and immediate interruption.

Workflow. Deterministic state machine outside free-form model behavior.
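The workflow layer can be sketched as an explicit transition table that the model cannot bypass. The state names and allowed transitions below are hypothetical. The point is that the model may propose a next step, while the table decides whether that step is legal.

```python
from enum import Enum


class State(Enum):
    GREETING = "greeting"
    INTENT = "intent"
    IDENTIFY = "identify"
    SELECT_SLOT = "select_slot"
    CONFIRM = "confirm"
    BOOKED = "booked"
    TRANSFER = "transfer"


# Hypothetical transition table. Every non-terminal state may fall back to TRANSFER.
ALLOWED = {
    State.GREETING: {State.INTENT, State.TRANSFER},
    State.INTENT: {State.IDENTIFY, State.TRANSFER},
    State.IDENTIFY: {State.SELECT_SLOT, State.TRANSFER},
    State.SELECT_SLOT: {State.CONFIRM, State.TRANSFER},
    State.CONFIRM: {State.BOOKED, State.SELECT_SLOT, State.TRANSFER},
    State.BOOKED: set(),
    State.TRANSFER: set(),
}


def advance(current: State, proposed: State) -> State:
    """Accept a proposed transition only if the table allows it."""
    if proposed in ALLOWED[current]:
        return proposed
    raise ValueError(f"illegal transition: {current.value} -> {proposed.value}")


state = advance(State.GREETING, State.INTENT)  # allowed
# advance(State.INTENT, State.BOOKED) would raise: booking cannot skip confirmation.
```

Because BOOKED is reachable only from CONFIRM, no prompt wording can make the agent book without passing through the confirmation step.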

STT evaluation set

Provider names must clear confirmation-loop testing.

Body-part vocabulary must clear the common orthopedic set.

DOB capture must be confirmed through readback before patient lookup.

Clinical terms are evaluated separately from LLM behavior.

Voice style parameters

Slightly slower than default speaking speed

One question at a time

Calm and clear, not clinical

No sales tone or exaggerated warmth

Pause after confirmations

Body-part routing

The LLM normalizes caller language. The routing table chooses the provider group. No freeform routing decision gets to book an appointment.
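A minimal sketch of that split: the model's only job is to hand over a normalized term, and a plain lookup picks the provider group. The table contents and group names are invented for illustration. An unknown term falls through to a staff task instead of a guess.

```python
from typing import Optional

# Hypothetical routing table: normalized body part -> provider group.
ROUTING_TABLE = {
    "knee": "sports_medicine",
    "shoulder": "sports_medicine",
    "hand": "hand_and_wrist",
    "wrist": "hand_and_wrist",
    "spine": "spine",
}


def route_body_part(normalized_term: str) -> Optional[str]:
    """Deterministic lookup. Returns None when the table has no entry."""
    return ROUTING_TABLE.get(normalized_term.strip().lower())


def resolve(normalized_term: str) -> dict:
    """Turn a normalized term into the next action the workflow may take."""
    group = route_body_part(normalized_term)
    if group is None:
        # No freeform fallback: unknown terms go to staff.
        return {"action": "create_staff_task",
                "reason": f"no route for {normalized_term!r}"}
    return {"action": "offer_availability", "provider_group": group}


print(resolve("Knee"))
print(resolve("elbow"))
```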

07 - Safety Layer

The LLM proposes. The policy gate decides.

Appointment creation is an irreversible action in eClinicalWorks. It writes to a schedule, creates downstream work, and cannot be treated as a loose model suggestion.

Core tool signatures

lookup_patient(name, dob)

get_provider_availability(provider, appointment_type, date_range)

create_appointment(payload) - policy-gated

transfer_call(destination, reason)

create_staff_task(payload)

flag_for_review(call_id, reason, payload)

log_patient_statement(payload)

Allowed

Appointment with explicit confirmation

The scheduling tool is valid only after the caller confirms the slot and the path remains inside approved policy.

Excluded

Workers compensation booking

Workers compensation flips the session into transfer-only mode because the integration cannot validate authorization and claim data.

Excluded

Medical advice response

Clinical questions route to capture-and-escalate. The system prompt and tool gate both block generated medical guidance.

Human approval mode

During pilot, create_appointment prepares the booking record and sends it to staff review for one-click confirmation instead of writing directly to eClinicalWorks.
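The gate in front of create_appointment can be sketched as a pure function over session state. The flag names, the SessionState shape, and the Decision values are hypothetical. The ordering (exclusions first, then confirmation, then pilot mode) is one reasonable reading of the rules above, not a confirmed design.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set, Tuple


class Decision(Enum):
    WRITE_TO_EHR = "write_to_ehr"
    STAFF_REVIEW = "staff_review"
    DENY = "deny"


# Hypothetical flag names for the hard exclusions described above.
HARD_EXCLUSIONS = frozenset({"workers_comp", "medical_advice"})


@dataclass
class SessionState:
    caller_confirmed_slot: bool = False
    exclusion_flags: Set[str] = field(default_factory=set)
    pilot_mode: bool = True  # human approval mode


def gate_create_appointment(session: SessionState) -> Tuple[Decision, str]:
    """Decide what may happen to a booking the model has proposed."""
    blocked = session.exclusion_flags & HARD_EXCLUSIONS
    if blocked:
        return Decision.DENY, "excluded: " + ", ".join(sorted(blocked))
    if not session.caller_confirmed_slot:
        return Decision.DENY, "caller has not confirmed the slot"
    if session.pilot_mode:
        return Decision.STAFF_REVIEW, "pilot: staff confirms before the write"
    return Decision.WRITE_TO_EHR, "confirmed and inside policy"


decision, reason = gate_create_appointment(SessionState(caller_confirmed_slot=True))
print(decision.value, "-", reason)
```

Keeping the gate free of model calls and I/O makes every branch unit-testable, which is what lets the booking path be audited independently of prompt changes.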

08 - Voice Engineering

Barge-in and end-of-turn are two different problems.

Fast barge-in protects the caller from being talked over. Slower end-of-turn detection protects elderly or uncertain callers from being interrupted mid-thought. Each needs its own tuning.

Patience prompts

"No problem, take your time."

"I heard part of that, but I want to make sure I get it right."

"I may have misheard the provider name. Did you say Dr. Chen or Dr. Cohen?"

No-action-on-uncertainty rule

If STT confidence falls below threshold on a provider name, DOB, or date, the agent must request confirmation before proceeding. No irreversible action is taken on low-confidence capture.
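The rule reduces to a threshold check on the fields that feed irreversible actions. In this sketch the threshold value, the field keys, and the shape of the capture record are assumptions. A real threshold would come out of the STT evaluation set, not a constant picked in advance.

```python
from typing import Dict, List, Tuple

# Fields named in the rule above. Keys are hypothetical.
CRITICAL_FIELDS = ("provider_name", "dob", "date")

# Hypothetical threshold, to be calibrated against real call audio.
CONFIDENCE_THRESHOLD = 0.85

# A capture maps field name -> (transcribed value, STT confidence in [0, 1]).
Captures = Dict[str, Tuple[str, float]]


def fields_needing_confirmation(captures: Captures) -> List[str]:
    """Return critical fields whose confidence is below the threshold."""
    return [
        name
        for name in CRITICAL_FIELDS
        if name in captures and captures[name][1] < CONFIDENCE_THRESHOLD
    ]


def may_take_irreversible_action(captures: Captures) -> bool:
    """True only when no captured critical field still needs a readback."""
    return not fields_needing_confirmation(captures)


captures = {
    "provider_name": ("Dr. Chen", 0.62),
    "dob": ("1950-03-14", 0.97),
}
print(fields_needing_confirmation(captures))
```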

Normal scheduling conversation

700-900ms

Elderly or slow caller

1000-1400ms

Yes / no confirmation

500-700ms

Caller searching for insurance card

1500-2500ms

Noisy audio

Conservative plus clarification
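Reading the millisecond ranges above as end-of-turn silence windows, which this section implies but does not state, they can be held as plain configuration. The context keys, the midpoint default, and treating "conservative" as the upper bound are all assumptions made for this sketch.

```python
from typing import Dict, Tuple

# Ranges copied from the figures above, in milliseconds. Keys are hypothetical.
END_OF_TURN_MS: Dict[str, Tuple[int, int]] = {
    "normal_scheduling": (700, 900),
    "elderly_or_slow": (1000, 1400),
    "yes_no_confirmation": (500, 700),
    "searching_for_card": (1500, 2500),
}


def end_of_turn_delay_ms(context: str, noisy_audio: bool = False) -> int:
    """Pick a silence window for the current conversational context."""
    low, high = END_OF_TURN_MS.get(context, END_OF_TURN_MS["normal_scheduling"])
    if noisy_audio:
        # "Conservative": wait for the upper bound before treating the turn as over.
        return high
    return (low + high) // 2


def needs_clarification(noisy_audio: bool) -> bool:
    """Noisy audio also triggers an explicit clarification prompt."""
    return noisy_audio


print(end_of_turn_delay_ms("yes_no_confirmation"))
print(end_of_turn_delay_ms("searching_for_card", noisy_audio=True))
```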

09 - Operational Control

The dashboard is the trust layer.

Without the staff review queue, the system becomes an invisible automation layer and staff will not trust it. The dashboard turns the agent from a black box into an auditable work surface.

Most important early metric

Not automation rate. Corrected appointment rate.
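The metric is not defined here, so the sketch below assumes one plausible definition: the share of agent-prepared appointments that staff had to change or cancel during review. Both the function name and the definition are illustrative.

```python
def corrected_appointment_rate(agent_booked: int, staff_corrected: int) -> float:
    """Share of agent-prepared appointments that staff corrected (assumed definition)."""
    if agent_booked < 0 or staff_corrected < 0:
        raise ValueError("counts must be non-negative")
    if staff_corrected > agent_booked:
        raise ValueError("corrections cannot exceed bookings")
    if agent_booked == 0:
        return 0.0
    return staff_corrected / agent_booked


# Hypothetical week: 40 agent-prepared bookings, 3 corrected by staff.
print(f"{corrected_appointment_rate(40, 3):.1%}")
```

A falling correction rate is the evidence that would justify widening scope, whereas a high automation rate on its own says nothing about whether the bookings were right.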

Control surface