Live Transcription
Gladia's live API transcribes audio in real-time over WebSocket.
SDK-first: always use the official SDK — see gladia-sdk-integration for policy, setup, and fallback criteria.
When to Use
- Real-time transcription for microphone, telephony, or broadcast streams
- Voice agents, meeting assistants, call center tools, or live subtitles
- Live audio intelligence (translation, sentiment, NER)
When NOT to use: If the user has a pre-existing audio/video file or URL to transcribe after the fact, use the gladia-pre-recorded-transcription skill instead. Pre-recorded supports additional features like speaker diarization and PII redaction that are unavailable in live mode.
References
Consult these resources as needed:
- ./references/recommended-params.md -- Use-case presets and tuning
- ./references/session-config.md -- Full startSession() config (JS + Python)
- ./references/managing-sessions.md -- get, list, getFile, delete
- ./references/websocket-events.md -- WebSocket event reference
- ../gladia-audio-intelligence/SKILL.md -- Feature availability
- ../gladia-audio-intelligence/references/live-audio-intelligence.md -- Live feature details
- ../gladia-sdk-integration/SKILL.md -- Setup, config, SDK vs raw API
- ../gladia-sdk-integration/references/sdk-versions.md -- Current SDK versions
- ../gladia-troubleshooting/SKILL.md -- Errors and diagnostics
API Endpoints (reference — prefer SDK methods)
| Endpoint | Method | SDK equivalent |
|---|---|---|
| /v2/live | POST | startSession() |
| /v2/live | GET | list() |
| /v2/live/:id | GET | get(id) |
| /v2/live/:id | DELETE | delete(id) |
| /v2/live/:id/file | GET | getFile(id) |
| WebSocket URL from init | — | sendAudio() / session.on() |
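If the raw-API fallback is ever needed, the table above can be read as a small route map. The sketch below only builds request objects and sends nothing. The base URL `https://api.gladia.io` and the `x-gladia-key` header name are assumptions, since this document states neither, and `ROUTES` and `build_request` are hypothetical names used for illustration.

```python
from urllib.request import Request

# Assumptions: neither the base URL nor the auth header name appears in this document.
BASE_URL = "https://api.gladia.io"
AUTH_HEADER = "x-gladia-key"

# (HTTP method, path template) for each SDK method listed in the endpoint table.
ROUTES = {
    "startSession": ("POST", "/v2/live"),
    "list": ("GET", "/v2/live"),
    "get": ("GET", "/v2/live/{id}"),
    "delete": ("DELETE", "/v2/live/{id}"),
    "getFile": ("GET", "/v2/live/{id}/file"),
}


def build_request(sdk_method: str, api_key: str, session_id: str = "") -> Request:
    """Build (but do not send) the HTTP request behind one SDK method."""
    method, template = ROUTES[sdk_method]
    if "{id}" in template and not session_id:
        raise ValueError(f"{sdk_method} requires a session id")
    url = BASE_URL + template.format(id=session_id)
    return Request(url, method=method, headers={AUTH_HEADER: api_key})


if __name__ == "__main__":
    req = build_request("getFile", "YOUR_API_KEY", "abc123")
    print(req.get_method(), req.full_url)
```

In normal use the SDK methods remain the preferred path; a map like this is mainly useful for checking that a hand-written call targets the right path, such as /v2/live/:id/file rather than /v2/live/:id/audio.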
Session Lifecycle
SDK flow: startSession() -> sendAudio() -> receive transcript events -> stopRecording() -> get(id) for final result.
Quick Start
For SDK installation and client initialization, see the gladia-sdk-integration skill.
JavaScript/TypeScript
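// Open a live session; the audio fields must match the stream being sent.
const session = client.liveV2().startSession({
  model: "solaria-1",
  encoding: "wav/pcm",
  sample_rate: 16000,
  bit_depth: 16,
  channels: 1,
  language_config: { languages: ["en"] },
  messages_config: { receive_partial_transcripts: true },
});

// Print each transcript message as it arrives.
session.on("message", (msg) => {
  if (msg.type === "transcript") console.log(msg.data.utterance.text);
});

session.sendAudio(audioBuffer); // audioBuffer: a chunk of audio in the configured format
session.stopRecording(); // call once all audio has been sent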
Python (sync)
from gladiaio_sdk import (
    LiveV2InitRequest,
    LiveV2LanguageConfig,
    LiveV2MessagesConfig,
    LiveV2WebSocketMessage,
)

live_client = client.live()

# Open a live session; the audio fields must match the stream being sent.
session = live_client.start_session(
    LiveV2InitRequest(
        model="solaria-1",
        encoding="wav/pcm",
        sample_rate=16000,
        bit_depth=16,
        channels=1,
        language_config=LiveV2LanguageConfig(languages=["en"]),
        messages_config=LiveV2MessagesConfig(receive_partial_transcripts=True),
    )
)


# Print each transcript message as it arrives.
@session.on("message")
def on_message(message: LiveV2WebSocketMessage):
    if message.type == "transcript":
        print(message.data.utterance.text.strip())


session.send_audio(audio_bytes)  # audio_bytes: a chunk of audio in the configured format
session.stop_recording()  # call once all audio has been sent
Session Configuration
Core fields to set on every session:
- Audio format: encoding, sample_rate, bit_depth, channels (must exactly match the stream)
- Language: language_config.languages and optional code_switching
- Message behavior: messages_config.receive_partial_transcripts and speech events
- Optional processing: pre_processing, realtime_processing, post_processing
See ./references/session-config.md for full examples and gladia-sdk-integration for client retry/timeout settings.
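Because the audio format fields must match the stream exactly, one option is to derive them from the source instead of typing them by hand. The following is a minimal sketch that assumes the audio comes from an uncompressed PCM WAV file and uses only Python's standard `wave` module; `audio_format_from_wav` is a hypothetical helper, not part of the Gladia SDK.

```python
import io
import wave


def audio_format_from_wav(wav_file) -> dict:
    """Read a PCM WAV header and return the matching session audio fields."""
    with wave.open(wav_file, "rb") as wav:
        return {
            "encoding": "wav/pcm",  # assumes the file holds uncompressed PCM
            "sample_rate": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "channels": wav.getnchannels(),
        }


if __name__ == "__main__":
    # Build a tiny in-memory WAV (10 ms of silence) to demonstrate the helper.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(16000)
        out.writeframes(b"\x00\x00" * 160)
    buf.seek(0)
    print(audio_format_from_wav(buf))
```

The returned dictionary uses the same field names as the session config, so its values could be passed straight into the init request.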
Key Tuning Parameters
endpointing is the primary latency-versus-completeness control for final transcripts.
| Use case | Recommended value |
|---|---|
| Voice agent | 0.05 - 0.1 |
| Call center | 0.1 - 0.3 |
| Live subtitles | 0.2 - 0.4 |
| Meeting recorder | 0.3 - 0.5 |
For maximum_duration_without_endpointing, speech_threshold, and full tuning guidance, see ./references/recommended-params.md.
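The recommended ranges above can be kept in code so that each deployment picks a value deliberately. This is a small sketch, not an SDK feature: `ENDPOINTING_RANGES` and `pick_endpointing` are hypothetical names, and the `bias` parameter is an assumption that treats the low end of each range as favoring latency and the high end as favoring completeness.

```python
# Recommended endpointing ranges, copied from the table above.
ENDPOINTING_RANGES = {
    "voice_agent": (0.05, 0.1),
    "call_center": (0.1, 0.3),
    "live_subtitles": (0.2, 0.4),
    "meeting_recorder": (0.3, 0.5),
}


def pick_endpointing(use_case: str, bias: float = 0.5) -> float:
    """Pick a value inside the recommended range for a use case.

    bias=0.0 returns the low end of the range, bias=1.0 the high end.
    """
    if not 0.0 <= bias <= 1.0:
        raise ValueError("bias must be between 0.0 and 1.0")
    low, high = ENDPOINTING_RANGES[use_case]
    return round(low + (high - low) * bias, 3)


if __name__ == "__main__":
    print(pick_endpointing("voice_agent", bias=0.0))
    print(pick_endpointing("meeting_recorder", bias=1.0))
```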
Audio Streaming
Use session.sendAudio(chunk) (JS) / session.send_audio(chunk) (Python) to stream audio data. The SDK sends each chunk as a binary WebSocket frame.
- Chunk size: 100ms of audio per frame (recommended)
- Send continuously — do not batch large chunks
- Audio format MUST match the encoding, sample_rate, bit_depth, and channels in session config
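The 100 ms recommendation translates into a byte count that depends on the audio format. As a rough sketch, assuming raw sample data with no container header, the chunk size and a simple splitter could look like this; `chunk_size_bytes` and `iter_chunks` are hypothetical helpers.

```python
from typing import Iterator


def chunk_size_bytes(sample_rate: int, bit_depth: int, channels: int,
                     chunk_ms: int = 100) -> int:
    """Number of bytes that hold chunk_ms milliseconds of audio."""
    bytes_per_frame = (bit_depth // 8) * channels  # one sample per channel
    frames_per_chunk = sample_rate * chunk_ms // 1000
    return frames_per_chunk * bytes_per_frame


def iter_chunks(audio: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive slices of at most size bytes; the last may be shorter."""
    for start in range(0, len(audio), size):
        yield audio[start:start + size]


if __name__ == "__main__":
    size = chunk_size_bytes(sample_rate=16000, bit_depth=16, channels=1)
    print(size)  # 3200 bytes per 100 ms at 16 kHz, 16-bit, mono
    one_second = b"\x00" * (size * 10)
    print(sum(1 for _ in iter_chunks(one_second, size)))  # 10 chunks
```

Each yielded chunk would then go to session.send_audio(chunk). When replaying a file rather than a live source, pausing about 100 ms between chunks approximates real-time pacing, though that pacing is a suggestion here and not something this document specifies.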
Stopping and Reconnection
Normal stop
session.stopRecording(); // Triggers post-processing, then session ends
session.stop_recording() # Triggers post-processing, then session ends
Force end (skip post-processing)
session.endSession(); // Immediately closes, no post-processing
session.end_session() # Immediately closes, no post-processing
Reconnection
SDK reconnection is automatic (wsRetry). For raw WebSocket fallback, reconnect to the same URL.
Limits
| Constraint | Value |
|---|---|
| Max session duration | 3 hours |
| Supported encodings | wav/pcm, wav/alaw, wav/ulaw |
| Concurrency (paid) | 30 concurrent sessions |
| Concurrency (free) | 1 concurrent session |
| Billing | Per-second of streamed audio |
| Multi-channel | Billed as N x duration |
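The billing rows above reduce to simple arithmetic: billable time is the streamed duration multiplied by the number of channels. The sketch below illustrates that rule together with the 3-hour session cap. `billable_seconds` is a hypothetical helper, and any rounding applied by actual billing is not described in this document, so none is applied here.

```python
MAX_SESSION_SECONDS = 3 * 60 * 60  # 3-hour session limit from the table above


def billable_seconds(duration_seconds: float, channels: int = 1) -> float:
    """Billable audio time: N channels are billed as N x duration."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    if not 0 <= duration_seconds <= MAX_SESSION_SECONDS:
        raise ValueError("duration must be between 0 and the 3-hour session limit")
    return duration_seconds * channels


if __name__ == "__main__":
    # A 10-minute stereo stream is billed as 20 minutes of audio.
    print(billable_seconds(600, channels=2))  # 1200
```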
Managing Sessions
Use SDK methods for post-capture operations:
- JavaScript: client.liveV2().get(id), .list(filters), .getFile(id), .delete(id)
- Python: client.live().get(id), .list(filters), .get_file(id), .delete(id)
For full examples and pagination filters, see ./references/managing-sessions.md.
Common Mistakes
- Audio format mismatch: the encoding, sample_rate, bit_depth, and channels in session config MUST match the actual audio stream exactly.
- Forgetting to stop recording: leaving a session open without stopRecording() keeps it hanging.
- Wrong audio file path: the audio download endpoint is /v2/live/:id/file, not /v2/live/:id/audio.
For the full list of gotchas and diagnostics, see the gladia-troubleshooting skill.
What is Gladia Live Transcription?
Real-time speech-to-text streaming with Gladia via WebSocket. Use when the user needs live transcription, builds a voice agent, meeting recorder, call center... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 31 downloads to date.
How do I install Gladia Live Transcription?
Run the command "/install gladia-live-transcription" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.
Is Gladia Live Transcription free?
Yes. Gladia Live Transcription is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.
Which platforms does Gladia Live Transcription support?
Gladia Live Transcription is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Gladia Live Transcription?
It is developed and maintained by Gladia (@gladiaio); the current version is v1.0.1.