Audio Intelligence
Gladia's audio intelligence features extract structured data and insights from transcripts. They work on top of the base transcription — most are enabled by adding options to the transcribe() call (pre-recorded) or the startSession() config (live).
SDK-first: always use the official SDK — see gladia-sdk-integration for policy, setup, and fallback criteria.
When to Use
- User asks about a specific feature: diarization, translation, PII redaction, sentiment, NER, subtitles, summarization, etc.
- Enabling or configuring one or more audio intelligence features on pre-recorded or live transcription
- Understanding which features are available in live vs pre-recorded mode
- Combining multiple features in a single transcription job
When NOT to use: For basic transcription without audio intelligence features, go directly to gladia-pre-recorded-transcription or gladia-live-transcription. For gotchas and errors related to specific features, see gladia-troubleshooting.
References
Consult these resources as needed:
- ./references/live-audio-intelligence.md -- Detailed config and WebSocket responses for all live-mode features
- ./references/pre-recorded-audio-intelligence.md -- Detailed config and response structures for all pre-recorded audio intelligence features
- ../gladia-pre-recorded-transcription/SKILL.md -- Pre-recorded transcription workflow and options
- ../gladia-live-transcription/SKILL.md -- Live transcription session config and event handling
- ../gladia-sdk-integration/SKILL.md -- SDK setup, client initialization, and SDK vs raw API decision guide
- ../gladia-troubleshooting/SKILL.md -- Common errors, gotchas, and verification checklist
Feature Availability
| Feature | Pre-recorded | Live | Config key |
|---|---|---|---|
| Speaker diarization | Yes | No | diarization |
| Translation | Yes | Yes | translation |
| Sentiment analysis | Yes | Yes | sentiment_analysis |
| Named entity recognition | Yes | Yes | named_entity_recognition |
| Subtitles (SRT/VTT) | Yes | No | subtitles |
| Custom vocabulary | Yes | Yes | custom_vocabulary |
| PII redaction | Yes | No | pii_redaction |
| Chapterization | Yes | Yes | chapterization (post-process) |
| Summarization | Yes | Yes | summarization (post-process) |
| Audio-to-LLM | Yes | No | audio_to_llm |
| Custom spelling | Yes | Yes | custom_spelling |
| Custom metadata | Yes | Yes | custom_metadata |
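The table can be encoded as a lookup so that a config is checked against the target mode before a session is started. The following is a minimal sketch in Python: `LIVE_SUPPORT` and `unsupported_in_live` are hypothetical names, not part of the Gladia SDK, and the mapping simply mirrors the table above.

```python
# Hypothetical helper (not part of the Gladia SDK): mirrors the
# Feature Availability table. True means the feature works in live mode.
LIVE_SUPPORT = {
    "diarization": False,
    "translation": True,
    "sentiment_analysis": True,
    "named_entity_recognition": True,
    "subtitles": False,
    "custom_vocabulary": True,
    "pii_redaction": False,
    "chapterization": True,
    "summarization": True,
    "audio_to_llm": False,
    "custom_spelling": True,
    "custom_metadata": True,
}


def unsupported_in_live(options):
    """Return the enabled config keys that are pre-recorded only, sorted."""
    return sorted(
        key
        for key, value in options.items()
        if value and LIVE_SUPPORT.get(key) is False
    )


requested = {"diarization": True, "translation": True, "subtitles": True}
print(unsupported_in_live(requested))  # ['diarization', 'subtitles']
```

Keys that are not in the table (for example `diarization_config`) are ignored rather than flagged, so the check stays focused on the feature switches themselves.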
Live features split into two groups: real-time (results stream during the session) and post-processing (results arrive after stopRecording()). See ./references/live-audio-intelligence.md for details.
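One way to picture the two groups is a helper that sorts requested features into the `realtime_processing` and `post_processing` blocks used in the Quick Config Examples. This is a sketch, not SDK code: `build_live_processing` is a hypothetical name, the placement of translation and summarization follows those examples, chapterization follows the "(post-process)" note in the table, and putting `sentiment_analysis` and `named_entity_recognition` under `realtime_processing` is an assumption to confirm in the live reference.

```python
# Hypothetical helper, not part of the Gladia SDK.
REALTIME_FEATURES = {
    "translation",
    "sentiment_analysis",        # assumed real-time; confirm in the live reference
    "named_entity_recognition",  # assumed real-time; confirm in the live reference
}
POST_FEATURES = {"summarization", "chapterization"}


def build_live_processing(features):
    """Split {feature: options-dict-or-True} into the two live processing groups."""
    config = {"realtime_processing": {}, "post_processing": {}}
    for name, value in features.items():
        if name in REALTIME_FEATURES:
            group = "realtime_processing"
        elif name in POST_FEATURES:
            group = "post_processing"
        else:
            raise ValueError(f"{name!r} is not a known live processing feature")
        config[group][name] = True
        if isinstance(value, dict):
            # Feature-specific options go under "<feature>_config".
            config[group][f"{name}_config"] = value
    return config


live_config = build_live_processing({
    "translation": {"target_languages": ["fr"]},
    "summarization": {"type": "bullet_points"},
})
print(live_config)
```

Raising on unknown names means a pre-recorded-only feature such as diarization fails at build time instead of being silently dropped.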
Quick Config Examples
Code examples assume GladiaClient is already initialized — see gladia-sdk-integration for setup.
Speaker Diarization (pre-recorded only)
// JavaScript / TypeScript
const result = await client.preRecorded().transcribe("audio.mp3", {
  diarization: true,
  diarization_config: { number_of_speakers: 2 },
});
// Each utterance includes a `speaker` field (0-indexed integer)

# Python
result = client.prerecorded().transcribe("audio.mp3", {
    "diarization": True,
    "diarization_config": {"number_of_speakers": 2},
})
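Once diarization is enabled, the per-utterance speaker index can be used to label a transcript. The sketch below works on plain dictionaries. It assumes each utterance carries a `text` field next to `speaker`, which this page does not confirm (see the pre-recorded reference for the actual response structure), and `format_by_speaker` is a hypothetical helper.

```python
def format_by_speaker(utterances):
    """Render utterances as 'Speaker N: text' lines, merging consecutive turns."""
    lines = []
    last_speaker = None
    for utt in utterances:
        speaker = utt["speaker"]     # 0-indexed integer
        text = utt["text"].strip()   # assumed field name
        if speaker == last_speaker:
            # Same speaker kept talking: extend the previous line.
            lines[-1] += " " + text
        else:
            lines.append(f"Speaker {speaker}: {text}")
            last_speaker = speaker
    return "\n".join(lines)


# Made-up utterances for illustration only.
sample = [
    {"speaker": 0, "text": "Hi, thanks for calling."},
    {"speaker": 1, "text": "Hello, I have a question."},
    {"speaker": 1, "text": "It is about billing."},
]
print(format_by_speaker(sample))
```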
Translation (pre-recorded and live)
Pre-recorded:
// JavaScript / TypeScript
const result = await client.preRecorded().transcribe("audio.mp3", {
  translation: true,
  translation_config: { target_languages: ["fr", "es"] },
});

# Python
result = client.prerecorded().transcribe("audio.mp3", {
    "translation": True,
    "translation_config": {"target_languages": ["fr", "es"]},
})
Live (results stream as translation WebSocket events; see live-audio-intelligence.md):
// JavaScript / TypeScript
const session = client.liveV2().startSession({
  // ... audio format options ...
  realtime_processing: {
    translation: true,
    translation_config: { target_languages: ["fr"] },
  },
});

# Python
from gladiaio_sdk import LiveV2InitRequest, LiveV2RealtimeProcessing

session = client.live().start_session(
    LiveV2InitRequest(
        # ... audio format options ...
        realtime_processing=LiveV2RealtimeProcessing(
            translation=True,
            translation_config={"target_languages": ["fr"]},
        ),
    )
)
Summarization (pre-recorded and live)
Pre-recorded:
const result = await client.preRecorded().transcribe("audio.mp3", {
  summarization: true,
  summarization_config: { type: "bullet_points" },
});
Live (the summary arrives after stopRecording() as a post_summarization event):
const session = client.liveV2().startSession({
  // ... audio format options ...
  post_processing: {
    summarization: true,
    summarization_config: { type: "bullet_points" },
  },
});
// Post-processing results are delivered as typed messages once recording stops
session.on("message", (msg) => {
  if (msg.type === "post_summarization") console.log(msg.data.results);
});
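The same type-based dispatch can be sketched in Python over plain dictionaries. This does not use the Python SDK's event API, which is not shown on this page. It only assumes the message shape seen in the handler above (`type` and `data.results`), and `collect_results` is a hypothetical name.

```python
from collections import defaultdict


def collect_results(messages, wanted=("post_summarization",)):
    """Group data.results by message type, keeping only the wanted types."""
    collected = defaultdict(list)
    for msg in messages:
        if msg.get("type") in wanted:
            collected[msg["type"]].append(msg["data"]["results"])
    return dict(collected)


# Made-up messages for illustration only; real payloads will differ.
stream = [
    {"type": "transcript", "data": {"results": "ignored here"}},
    {"type": "post_summarization", "data": {"results": "- point one\n- point two"}},
]
print(collect_results(stream))
```

Passing a different `wanted` tuple lets the same loop pick up other event types without adding more branches.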
For full per-feature config options and response structures, see:
- Pre-recorded: ./references/pre-recorded-audio-intelligence.md
- Live: ./references/live-audio-intelligence.md
Common Mistakes
- `code_switching: true` with empty `languages`: triggers evaluation across 100+ languages and causes frequent misdetections. Always provide 3-5 expected languages.
- Custom vocabulary `intensity` above 0.6: values over 0.6 cause false positives where unrelated words get replaced. Keep at 0.4-0.6 and use `pronunciations` for better results.
- Expecting diarization, PII redaction, subtitles, or audio-to-LLM in live mode: these four features are pre-recorded only.
- Enabling many features simultaneously without considering cost/latency: each enabled feature adds processing time. Enable only what you need; combine `diarization + summarization + translation` only when all are required.
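The code_switching and intensity checks above lend themselves to a preflight function that runs before a job is submitted. The sketch below is not SDK code: `preflight_warnings` is hypothetical, and the nesting it assumes (`code_switching` and `languages` under `language_config`, per-entry `intensity` under `custom_vocabulary_config.vocabulary`) should be checked against the references.

```python
def preflight_warnings(options):
    """Return human-readable warnings for two of the config mistakes listed above."""
    warnings = []

    # Assumed nesting: language_config = {"code_switching": bool, "languages": [...]}
    lang = options.get("language_config", {})
    if lang.get("code_switching") and not lang.get("languages"):
        warnings.append(
            "code_switching is on but languages is empty; list 3-5 expected languages"
        )

    # Assumed nesting: custom_vocabulary_config = {"vocabulary": [{"value", "intensity"}, ...]}
    vocab = options.get("custom_vocabulary_config", {}).get("vocabulary", [])
    for entry in vocab:
        if isinstance(entry, dict) and entry.get("intensity", 0) > 0.6:
            warnings.append(
                f"intensity {entry['intensity']} for {entry.get('value')!r} is above 0.6"
            )

    return warnings


risky = {
    "language_config": {"code_switching": True, "languages": []},
    "custom_vocabulary_config": {"vocabulary": [{"value": "Gladia", "intensity": 0.9}]},
}
for warning in preflight_warnings(risky):
    print(warning)
```

Returning warnings rather than raising keeps the decision with the caller, since a high intensity may occasionally be intentional.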
For the full gotcha list, see gladia-troubleshooting.
Further Reading
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install gladia-audio-intelligence
- After installation, invoke the skill by name or use /gladia-audio-intelligence
- Provide required inputs per the skill's parameter spec and get structured output
What is Gladia Audio Intelligence?
Configure and use Gladia audio intelligence features: speaker diarization, translation, sentiment analysis, named entity recognition (NER), PII redaction, su... It is an AI Agent Skill for Claude Code / OpenClaw, with 29 downloads so far.
How do I install Gladia Audio Intelligence?
Run "/install gladia-audio-intelligence" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Gladia Audio Intelligence free?
Yes, Gladia Audio Intelligence is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Gladia Audio Intelligence support?
Gladia Audio Intelligence is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Gladia Audio Intelligence?
It is built and maintained by Gladia (@gladiaio); the current version is v1.0.1.