/install llc-phone
Lowest Latency Calls
Architecture, configuration, and reference for the OpenAI Realtime API + Twilio phone system.
To PLACE calls, manage prospects, and run campaigns: pair this skill with your own outbound dialer / campaign layer. This skill is about the real-time call infrastructure itself.
DO NOT CHANGE (confirmed working, breaks if altered)
The call flow, session config format, and audio path below were debugged through many iterations. Do not restructure without reading this entire skill.
Session config — FLAT format only
// CORRECT:
{ type: "session.update", session: {
modalities: ["text", "audio"], voice: "cedar",
turn_detection: { type: "semantic_vad", eagerness: "high", create_response: true, interrupt_response: true },
input_audio_format: "g711_ulaw", output_audio_format: "g711_ulaw",
}}
// WRONG (API rejects): session: { type: "realtime", audio: { input: { format: ... } } }
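As a minimal sketch of the flat format above, the message can be built by a small typed helper. The helper name `buildSessionUpdate`, its parameters, and the `SessionUpdate` type are illustrative assumptions, not part of the skill's source; only the field names and values come from the config shown above.

```typescript
// Sketch: build the flat-format session.update message described above.
// buildSessionUpdate and SessionUpdate are hypothetical names for illustration.
type Eagerness = "low" | "medium" | "high" | "auto";

interface SessionUpdate {
  type: "session.update";
  session: {
    modalities: string[];
    voice: string;
    turn_detection: {
      type: "semantic_vad";
      eagerness: Eagerness;
      create_response: boolean;
      interrupt_response: boolean;
    };
    input_audio_format: string;
    output_audio_format: string;
  };
}

function buildSessionUpdate(voice: string = "cedar", eagerness: Eagerness = "high"): SessionUpdate {
  return {
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      voice,
      turn_detection: { type: "semantic_vad", eagerness, create_response: true, interrupt_response: true },
      // Audio formats sit directly on `session`, not under a nested `audio` object.
      input_audio_format: "g711_ulaw",
      output_audio_format: "g711_ulaw",
    },
  };
}

// Serialize before sending over the Realtime WebSocket.
const sessionUpdatePayload: string = JSON.stringify(buildSessionUpdate());
```

Keeping the shape in one typed helper makes it harder for a later edit to drift back into the nested form that the API rejects.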
Outbound call flow — caller-first
Callee picks up, says hello, THEN the agent responds. No forced greeting. Semantic VAD with create_response: true handles it automatically.
Audio path — direct passthrough
Audio deltas from OpenAI are already base64 g711_ulaw. Forward directly to Twilio. No PCM conversion, no gain control, no resampling.
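The passthrough can be sketched as a pure mapping from an OpenAI audio delta event to a Twilio Media Streams `media` message. The function name `toTwilioMedia` and the two interfaces are hypothetical, and the list of accepted event names is an assumption: check which delta event your API version actually emits.

```typescript
// Sketch: forward an OpenAI audio delta to Twilio without touching the bytes.
// toTwilioMedia, RealtimeEvent, and TwilioMediaMessage are hypothetical names.
interface RealtimeEvent {
  type: string;
  delta?: string;
}

interface TwilioMediaMessage {
  event: "media";
  streamSid: string;
  media: { payload: string };
}

// Assumption: audio arrives as "response.audio.delta"; the newer name
// "response.output_audio.delta" is accepted too in case the API version differs.
const AUDIO_DELTA_EVENTS: string[] = ["response.audio.delta", "response.output_audio.delta"];

function toTwilioMedia(event: RealtimeEvent, streamSid: string): TwilioMediaMessage | null {
  if (AUDIO_DELTA_EVENTS.indexOf(event.type) === -1 || !event.delta) return null;
  // The delta is already base64 g711_ulaw, so it is passed through unchanged:
  // no decode, no resample, no gain stage.
  return { event: "media", streamSid, media: { payload: event.delta } };
}
```

In a server, the result would be serialized with `JSON.stringify` and sent on the Twilio stream socket whenever it is non-null.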
Greeting trigger
conversation.item.create (user message) + response.create. NOT response.create with instructions. Trigger on session.updated, NOT session.created.
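A rough sketch of that trigger follows, assuming the server loop hands each parsed server event to a callback. `createGreetingTrigger` and the cue text are hypothetical; the one-shot guard is an added assumption so that a later `session.update` (for example an eagerness change) does not replay the greeting.

```typescript
// Sketch: emit the greeting messages once, on session.updated only.
// createGreetingTrigger and the types below are hypothetical names.
interface ServerEvent {
  type: string;
}

type ClientMessage =
  | {
      type: "conversation.item.create";
      item: { type: "message"; role: "user"; content: { type: "input_text"; text: string }[] };
    }
  | { type: "response.create" };

function createGreetingTrigger(cue: string): (event: ServerEvent) => ClientMessage[] {
  let sent = false;
  return (event: ServerEvent): ClientMessage[] => {
    // session.created arrives before the config is applied, so it is ignored here.
    if (sent || event.type !== "session.updated") return [];
    sent = true;
    return [
      {
        type: "conversation.item.create",
        item: { type: "message", role: "user", content: [{ type: "input_text", text: cue }] },
      },
      // A bare response.create: no `instructions` override.
      { type: "response.create" },
    ];
  };
}
```

Each returned message would be serialized and sent to the Realtime socket in order.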
Twilio webhook
Must point to /twiml. Verify by querying the Twilio API for the number's configured voice URL, not by assuming local config is correct.
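One way to make that check mechanical is to compare the voice URL that Twilio reports for the phone number against the expected endpoint. The sketch below assumes you have already fetched that URL (the `voice_url` property of the IncomingPhoneNumber resource) and only shows the comparison; `webhookPointsToTwiml` and the host name in the usage note are hypothetical.

```typescript
// Sketch: check that the voice URL reported by Twilio targets /twiml on the expected host.
// webhookPointsToTwiml is a hypothetical helper; fetching the URL is left to the caller.
function webhookPointsToTwiml(voiceUrl: string | null | undefined, expectedHost: string): boolean {
  if (!voiceUrl) return false;
  try {
    const url = new URL(voiceUrl);
    return url.host === expectedHost && url.pathname === "/twiml";
  } catch {
    // Not a parseable absolute URL.
    return false;
  }
}
```

For example, `webhookPointsToTwiml(number.voiceUrl, "voice.example.com")` would return `false` if the number still points at an old tunnel or a different path.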
SAFE TO TUNE
- Prompt size: smaller = faster inference. Reference outbound prompt is ~478 tokens.
- VAD eagerness: "high" first turn, "medium" after. Configurable.
- Tool loading: lean tools first turn, full set after first response.done.
- Voice: cedar is a solid default for all scenarios. Can change per scenario.
- Inference priming: text-only response.create during pre-warm warms pipeline without audio.
- Twilio edge: configure to colocate with your deployment region and OpenAI region for lowest RTT.
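The eagerness and tool-loading levers above both switch after the first turn, so they can be sketched as one small state holder that reacts to the first `response.done`. `createFirstTurnTuner`, `ToolDef`, and `TuningUpdate` are hypothetical names, and sending both changes in a single partial `session.update` is an assumption about how you might wire it, not the skill's confirmed implementation.

```typescript
// Sketch: after the first response.done, relax VAD eagerness and load the full tool set.
// createFirstTurnTuner, ToolDef, and TuningUpdate are hypothetical names.
interface ToolDef {
  type: "function";
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

interface TuningUpdate {
  type: "session.update";
  session: {
    turn_detection: {
      type: "semantic_vad";
      eagerness: "medium";
      create_response: boolean;
      interrupt_response: boolean;
    };
    tools: ToolDef[];
  };
}

function createFirstTurnTuner(fullTools: ToolDef[]): (eventType: string) => TuningUpdate | null {
  let firstTurnDone = false;
  return (eventType: string): TuningUpdate | null => {
    // Only the first response.done triggers the switch; everything else is ignored.
    if (eventType !== "response.done" || firstTurnDone) return null;
    firstTurnDone = true;
    return {
      type: "session.update",
      session: {
        turn_detection: { type: "semantic_vad", eagerness: "medium", create_response: true, interrupt_response: true },
        tools: fullTools,
      },
    };
  };
}
```

The first turn then runs with the lean configuration, and every later turn uses the relaxed eagerness and the complete tool list.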
Debugging Checklist
Before adding patches when calls fail:
- Is the websocket server process running? (systemctl status <your-service>, pm2 status, or your equivalent)
- Single owner on the websocket port? lsof -i :<PORT>
- Twilio webhook URL correct? Check the Twilio API, not local config files.
- Check your server log (whatever path you configured: stdout, file, or journald)
- OpenAI outage? Check status.openai.com
- Session config accepted? Look for session.updated in logs. error after session.created = wrong config format.
Do not pile patches. If it worked before and doesn't now, check infrastructure first.
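The last checklist item can be automated with a small classifier over the event types seen on the Realtime socket. This is a sketch that assumes your log can be reduced to an ordered list of event type strings; `sessionConfigStatus` and the status labels are hypothetical.

```typescript
// Sketch: decide from an ordered list of server event types whether the session config was accepted.
// sessionConfigStatus and ConfigStatus are hypothetical names.
type ConfigStatus = "accepted" | "rejected" | "pending" | "no-session";

function sessionConfigStatus(eventTypes: string[]): ConfigStatus {
  const created = eventTypes.indexOf("session.created");
  if (created === -1) return "no-session";
  // Whichever comes first after session.created decides the outcome.
  for (const eventType of eventTypes.slice(created + 1)) {
    if (eventType === "session.updated") return "accepted";
    if (eventType === "error") return "rejected";
  }
  return "pending";
}
```

A result of `"rejected"` points at the config format, while `"no-session"` points at connectivity or credentials rather than at the session config.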
Restart Procedure (pattern)
Whatever process supervisor you use, the correct sequence is:
stop the websocket server
→ kill any orphaned listeners on the websocket port (lsof -i :<PORT> -t | xargs kill)
→ start the websocket server
Always stop → kill orphans → start. A bare restart can leave a stale listener holding the port.
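The sequence can be sketched as a function that takes a command runner, so the ordering is explicit and testable. `restartWebsocketServer`, the `Run` type, and the stop/start commands are hypothetical placeholders for whatever your supervisor uses; treating a failed orphan-kill step as "nothing to kill" is also an assumption.

```typescript
// Sketch: stop -> kill orphans -> start, with the shell runner injected by the caller.
// restartWebsocketServer, Run, and RestartCommands are hypothetical names.
type Run = (command: string) => void;

interface RestartCommands {
  stop: string;  // e.g. your supervisor's stop command
  start: string; // e.g. your supervisor's start command
}

function restartWebsocketServer(run: Run, port: number, commands: RestartCommands): void {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${port}`);
  }
  run(commands.stop);
  try {
    // Same orphan-kill command as in the procedure above.
    run(`lsof -i :${port} -t | xargs kill`);
  } catch {
    // Assumption: a failure here means no listener was left on the port, so continue.
  }
  run(commands.start);
}
```

In Node, `run` could be a thin wrapper around `execSync` from `node:child_process`; in tests it can simply record the commands it receives.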
Restore from Snapshot (pattern)
Keep a known-good copy of sessionManager.ts (the file most affected by tuning) in a snapshots directory alongside the source. To restore:
copy snapshots/sessionManager-TUNED-<date>.ts → src/sessionManager.ts
restart using the procedure above
Key Files (relative to the websocket-server project)
| What | Path |
|---|---|
| sessionManager.ts | websocket-server/src/sessionManager.ts |
| server.ts | websocket-server/src/server.ts |
| Snapshots | websocket-server/snapshots/ |
| Service unit | your process supervisor unit file (systemd user unit, pm2 ecosystem file, etc.) |
| Logs | wherever you configured (stdout + journald, /var/log/..., pm2 logs, etc.) |
| .env | websocket-server/.env (contains PORT) |
Reference Documents
All reference docs in {baseDir}/docs/:
| File | Content |
|---|---|
| {baseDir}/docs/01-overview.md | Model landscape, changelog |
| {baseDir}/docs/02-session-config.md | session.update reference + defaults |
| {baseDir}/docs/03-prewarm-outbound.md | Pre-warm: buffer, fallback, edge cases |
| {baseDir}/docs/04-inbound-modes.md | AI IVR, Receptionist, CSR with DB |
| {baseDir}/docs/05-async-tools.md | Async tool calling |
| {baseDir}/docs/06-latency-tuning.md | All latency levers |
| {baseDir}/docs/07-twilio-integration.md | PCMU format, edge, AMD, stream events |
| {baseDir}/docs/08-known-issues.md | Bugs, workarounds, watch-later |
| {baseDir}/docs/09-openclaw-config.md | Config + install/publish |
Load the relevant doc before answering architecture or config questions.
Key Facts (always available without file load)
- Model: gpt-realtime-1.5 (flagship), gpt-realtime-mini (cost-sensitive)
- WebSocket: wss://api.openai.com/v1/realtime?model=gpt-realtime-1.5
- Audio: mu-law / PCMU at 8 kHz mono, base64 encoded
- Turn detection: semantic_vad with eagerness: "high" is the tested default
- Pre-warm timeout: 10 seconds (fallback to cold connect)
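The WebSocket URL and the pre-warm timeout from the facts above can be sketched together: race the pre-warmed connection against a 10 second timer and fall back to a cold connect if the timer wins. `realtimeUrl` and `connectWithFallback` are hypothetical names, the connection type is left generic, and falling back when the pre-warm attempt rejects (not only when it times out) is an assumption.

```typescript
// Sketch: use the pre-warmed connection if it is ready within the timeout, else connect cold.
// realtimeUrl and connectWithFallback are hypothetical names.
const PREWARM_TIMEOUT_MS = 10000;
const TIMED_OUT = Symbol("prewarm timed out");

function realtimeUrl(model: string = "gpt-realtime-1.5"): string {
  return `wss://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`;
}

async function connectWithFallback<T>(
  prewarm: Promise<T>,
  coldConnect: () => Promise<T>,
  timeoutMs: number = PREWARM_TIMEOUT_MS,
): Promise<T> {
  let cancelTimer = (): void => {};
  const timeout = new Promise<typeof TIMED_OUT>((resolve) => {
    const timer = setTimeout(() => resolve(TIMED_OUT), timeoutMs);
    cancelTimer = () => clearTimeout(timer);
  });
  try {
    const winner = await Promise.race([prewarm, timeout]);
    if (winner !== TIMED_OUT) return winner as T;
  } catch {
    // Assumption: a failed pre-warm is treated like a timeout and falls through to cold connect.
  } finally {
    cancelTimer();
  }
  return coldConnect();
}
```

A caller that falls back should also close the abandoned pre-warm socket if it completes later, which this sketch leaves out.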
Lessons
- Session config: flat format only. Nested is rejected.
- Trigger greeting on session.updated, not session.created.
- Semantic VAD works without prior audio response.
- Verify infrastructure before debugging behavior.
- Audio is already PCMU. No conversion needed.
- Prompt size directly affects per-turn latency.
- When patches pile up: stop, read docs, rewrite from baseline.
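Installation
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install llc-phone
- Once installed, invoke the Skill by name or trigger it with /llc-phone
- Provide the required inputs as described in the Skill's parameter documentation to get structured output.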
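What is Llc Phone?
Low-latency inbound and outbound AI phone calls via the OpenAI Realtime API and Twilio, covering pre-warm and pre-accept patterns, IVR and receptionist flows... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 160 downloads to date.
How do I install Llc Phone?
Run the command "/install llc-phone" in the OpenClaw or Claude Code chat box for a one-step install. No additional configuration is needed.
Is Llc Phone free?
Yes. Llc Phone is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does Llc Phone support?
Llc Phone is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Llc Phone?
It is developed and maintained by Chris M. (@cygnostik). The current version is v3.0.4.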