Gladia Pre Recorded Transcription

Name: Gladia Pre Recorded Transcription
Author: gladiaio

By Gladia · GitHub · v1.0.1 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install gladia-pre-recorded-transcription

Description

Transcribe pre-recorded audio files or URLs with Gladia. Use when the user needs batch/async transcription, speaker diarization, subtitles (SRT/VTT), PII red...

Usage (SKILL.md)

Pre-Recorded Transcription

Gladia's pre-recorded API transcribes audio and video files asynchronously.

SDK-first: use the official SDK by default, and fall back to raw REST only when the SDK is not feasible. See gladia-sdk-integration for policy, setup, and fallback criteria.

When to Use

Existing audio/video files or URLs (including social/video links)
Batch or asynchronous transcription workflows
Pre-recorded-only features: diarization, PII redaction, subtitles

When NOT to use: If the user needs real-time / live transcription of a stream, microphone, or ongoing audio feed, use the gladia-live-transcription skill instead. Live transcription uses WebSocket sessions, not the pre-recorded API.

References

Consult these resources as needed:

./references/transcription-options.md -- Full options (JS + Python)
./references/managing-jobs.md -- get, list, getFile, delete
./references/delivery-and-response.md -- Response shape and events
../gladia-audio-intelligence/SKILL.md -- Feature availability and config
../gladia-sdk-integration/SKILL.md -- Setup, config, SDK vs raw API
../gladia-sdk-integration/references/sdk-versions.md -- Current SDK versions
../gladia-troubleshooting/SKILL.md -- Errors and diagnostics

API Endpoints (reference — prefer SDK methods instead)

Endpoint	Method	SDK equivalent
`/v2/upload`	POST	`transcribe()` auto-uploads local files
`/v2/pre-recorded`	POST	`create()` / `transcribe()`
`/v2/pre-recorded`	GET	`list()`
`/v2/pre-recorded/:id`	GET	`get()` / `poll()` / `transcribe()`
`/v2/pre-recorded/:id`	DELETE	`delete()`
`/v2/pre-recorded/:id/file`	GET	`getFile()`

Workflow

Recommended (SDK)

The SDK transcribe() method handles upload, job creation, and polling in one call. Use this by default.

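JavaScript:

// `client` is an already-configured Gladia SDK client (see gladia-sdk-integration).
const result = await client.preRecorded().transcribe("./audio.mp3", {
  language_config: { languages: ["en"] }, // expected language(s)
  diarization: true, // label speakers
});

// Optional chaining guards against a missing result or transcription.
console.log(result.result?.transcription?.full_transcript);

Python:

# `client` is an already-configured Gladia SDK client (see gladia-sdk-integration).
result = client.prerecorded().transcribe(
    "audio.mp3",
    {"language_config": {"languages": ["en"]}, "diarization": True},
)

print(result.result.transcription.full_transcript)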

Audio input can be a local file path, HTTP(S) URL, social/video URL, or binary file object. For full input types, see gladia-sdk-integration.

Fallback (raw REST — only when SDK is not feasible)

Use raw REST only when SDK use is not possible.

Upload (if local file): POST /v2/upload with multipart form data → get audio_url
Create job: POST /v2/pre-recorded with audio_url and config → get id
Poll: GET /v2/pre-recorded/:id until status: "done" (or use webhooks/callbacks)
Parse results: Extract transcription, diarization, translation, etc. from response
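
The create, poll, and parse steps above can be sketched with the standard library alone. This is a minimal illustration, not Gladia's reference client. The base URL and the `x-gladia-key` header name are assumptions not stated in this document, and `make_send`, `create_job`, `fetch_job`, and `full_transcript` are hypothetical helper names. The multipart upload step is omitted, so the sketch assumes the audio is already reachable at a URL.

```python
import json
import urllib.request
from typing import Callable, Optional

# Assumptions (not stated in this document): the base URL and the
# "x-gladia-key" header name. Check the Gladia API reference before use.
BASE_URL = "https://api.gladia.io"

Send = Callable[[str, str, Optional[dict]], dict]


def make_send(api_key: str) -> Send:
    """Return a function that sends one JSON request and decodes the reply."""

    def send(method: str, path: str, body: Optional[dict] = None) -> dict:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(
            BASE_URL + path,
            data=data,
            method=method,
            headers={"x-gladia-key": api_key, "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    return send


def create_job(send: Send, audio_url: str, config: Optional[dict] = None) -> str:
    """Create job: POST /v2/pre-recorded with audio_url and config, return the id."""
    body = {"audio_url": audio_url, **(config or {})}
    return send("POST", "/v2/pre-recorded", body)["id"]


def fetch_job(send: Send, job_id: str) -> dict:
    """One poll: GET /v2/pre-recorded/:id."""
    return send("GET", f"/v2/pre-recorded/{job_id}", None)


def full_transcript(job: dict) -> Optional[str]:
    """Parse results: return the transcript of a finished job, else None."""
    if job.get("status") != "done":
        return None
    result = job.get("result") or {}
    return (result.get("transcription") or {}).get("full_transcript")
```

Passing `send` in as a parameter keeps the network call in one place, so the job logic can be exercised against a fake transport before any audio leaves the machine.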

Managing Jobs

Use SDK methods for post-processing operations:

JavaScript: client.preRecorded().get(id), .list(filters), .getFile(id), .delete(id)
Python: client.prerecorded().get(id), .list(filters), .get_file(id), .delete(id)

For full JS/Python examples, pagination filters, and REST equivalents, see ./references/managing-jobs.md.
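
Because delete removes remote job data, one cautious pattern is to look the job up and ask for confirmation before deleting it. The sketch below uses only the `get` and `delete` method names listed above. `delete_job_if_confirmed` and the `confirm` callback are hypothetical, and the shape of the object returned by `get` is not assumed.

```python
from typing import Any, Callable


def delete_job_if_confirmed(
    client: Any, job_id: str, confirm: Callable[[Any], bool]
) -> bool:
    """Delete a job only after `confirm` approves the fetched job.

    Returns True if the job was deleted, False if the caller declined.
    """
    jobs = client.prerecorded()
    job = jobs.get(job_id)  # fetch first so the caller can inspect what will be removed
    if not confirm(job):
        return False
    jobs.delete(job_id)
    return True
```

In an interactive tool, `confirm` could print the job and prompt the user. In an automated pipeline it could compare the job against an allow-list.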

Transcription Options

All options are passed as the second argument to transcribe(). Key options:

Option	Description
`language_config`	Expected languages, code switching
`diarization`	Speaker identification (pre-recorded only)
`translation`	Translate to target languages
`summarization`	Generate bullet points or paragraph summary
`subtitles`	Generate SRT/VTT files
`pii_redaction`	Redact PII (pre-recorded only)
`audio_to_llm`	Run custom LLM prompts on transcript
`callback_url`	Async webhook delivery

For full option details, see ./references/transcription-options.md. For audio intelligence config, see gladia-audio-intelligence. For client-level retry/timeouts, see gladia-sdk-integration.

Response and Delivery

For full response JSON and event names, see ./references/delivery-and-response.md.

Limits and Specifications

Constraint	Value
Max file size	1000 MB
Max duration	135 minutes (120 min for YouTube)
Enterprise max duration	4 h 15 min (255 minutes)
Concurrency (paid)	25 concurrent jobs
Concurrency (free)	3 concurrent jobs
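
A client-side pre-flight check can catch oversized inputs before a job is submitted, and a bounded worker pool can keep a batch within the concurrency limit. The sketch below encodes the table above under two assumptions: "1000 MB" is read as decimal megabytes, and the three duration limits are treated as mutually exclusive categories. `check_limits` and `run_bounded` are hypothetical helpers, and `submit` stands in for whatever call starts one transcription.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MAX_FILE_BYTES = 1000 * 1000 * 1000  # "1000 MB", read as decimal megabytes (assumption)
MAX_MINUTES = {"standard": 135, "youtube": 120, "enterprise": 255}  # 4 h 15 min = 255
MAX_CONCURRENT_JOBS = {"free": 3, "paid": 25}


def check_limits(size_bytes: int, duration_minutes: float, kind: str = "standard") -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    limit = MAX_MINUTES[kind]
    if duration_minutes > limit:
        problems.append(f"duration is {duration_minutes} min, limit is {limit} min")
    return problems


def run_bounded(items: Iterable[T], submit: Callable[[T], R], plan: str = "free") -> List[R]:
    """Run `submit` over `items` with at most the plan's concurrent jobs in flight."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_JOBS[plan]) as pool:
        return list(pool.map(submit, items))  # map preserves input order
```

The pool bounds concurrency only within one process. Jobs started elsewhere on the same account still count toward the limit.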

Polling Best Practices

The SDK handles polling automatically — transcribe() polls until the job completes with configurable interval and timeout:

const result = await client.preRecorded().transcribe(audio, options, {
  interval: 5000, // Poll every 5s
  timeout: 600000, // Timeout after 10 minutes
});

If using raw REST instead of the SDK:

Use webhooks or callbacks instead of polling when possible
If polling, implement exponential backoff (start at 3s, max 30s)
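
For the raw REST case, the backoff schedule described above (start at 3s, cap at 30s) can be sketched as follows. The doubling factor, the overall timeout, and the `"error"` status value are assumptions. `fetch_job` is a hypothetical zero-argument callable that performs one GET /v2/pre-recorded/:id and returns the decoded JSON.

```python
import time
from typing import Callable, Iterator


def backoff_delays(start: float = 3.0, cap: float = 30.0, factor: float = 2.0) -> Iterator[float]:
    """Yield delays that grow geometrically from `start` and never exceed `cap`."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def poll_until_done(
    fetch_job: Callable[[], dict],
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll with exponential backoff until status is "done" or the timeout would be exceeded."""
    waited = 0.0
    for delay in backoff_delays():
        job = fetch_job()
        status = job.get("status")
        if status == "done":
            return job
        if status == "error":  # assumption: failed jobs report this status
            raise RuntimeError(f"job failed: {job}")
        if waited + delay > timeout:
            raise TimeoutError(f"job not done after {waited:.0f}s")
        sleep(delay)
        waited += delay
    raise AssertionError("unreachable: backoff_delays() never ends")
```

With the defaults, the waits run 3, 6, 12, 24, 30, 30, ... seconds. Injecting `sleep` lets the loop be tested without real delays.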

Common Mistakes

Code switching without language list: enabling code_switching: true with an empty languages list makes Gladia evaluate 100+ languages. Always provide the 3-5 languages you expect.
Polling without backoff: rapid polling wastes requests and may trigger 429s. The SDK handles this; for raw REST, use webhooks or exponential backoff.
Expecting pre-recorded-only features in live mode: diarization, PII redaction, and subtitles are available only for pre-recorded audio, not in live mode.
Wrong audio file path: the audio download endpoint is /v2/pre-recorded/:id/file, not /v2/pre-recorded/:id/audio.
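
The code-switching mistake above is easy to catch before a request is sent. The sketch below assumes `code_switching` sits next to `languages` inside `language_config`, as the options table suggests. `language_config_warnings` is a hypothetical helper, not part of the SDK.

```python
from typing import List


def language_config_warnings(options: dict) -> List[str]:
    """Warn when code switching is enabled without a list of expected languages."""
    config = options.get("language_config") or {}
    languages = config.get("languages") or []
    warnings = []
    if config.get("code_switching") and not languages:
        warnings.append(
            "code_switching is enabled with no languages; "
            "list the 3-5 languages you expect"
        )
    return warnings
```

Running this on the options dict before calling transcribe() turns a slow, hard-to-diagnose job into an immediate warning.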

For the full list of gotchas and diagnostics, see the gladia-troubleshooting skill.

Install this only if you intend to use Gladia for transcription. Do not submit confidential, regulated, or third-party recordings unless you are authorized to share them with Gladia and any callback or downstream LLM destination. Confirm job IDs before using delete because it removes remote job data.

Capability Assessment

✓ Purpose & Capability

The skill purpose, examples, endpoints, and references consistently describe Gladia pre-recorded transcription, audio intelligence options, callbacks, job lookup, file retrieval, and deletion; these capabilities fit the stated purpose.

ℹ Instruction Scope

The instructions disclose Gladia API use, uploads, callbacks, audio-to-LLM, and job deletion, but they do not prominently require user confirmation before processing sensitive recordings or deleting remote job data.

✓ Install Mechanism

The artifact contains only Markdown documentation and references, with no executable scripts, package installs, hidden hooks, or runtime persistence mechanism.

ℹ Credentials

Sending local audio files, remote URLs, transcripts, summaries, and callback payloads to Gladia or configured webhook endpoints is proportionate for transcription, but users should treat those data flows as external processing.

ℹ Persistence & Privilege

The skill documents remote Gladia job listing, original file retrieval, and deletion; this is expected job-management authority and there is no local privilege escalation, though delete actions should be explicit and verified.

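How to Use

Make sure OpenClaw is installed (local or Docker deployment).
Enter the install command in the chat box: /install gladia-pre-recorded-transcription
Once installed, invoke the Skill by name or trigger it with /gladia-pre-recorded-transcription.
Provide the inputs described in the Skill's parameter documentation to get structured output.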

Version History

v1.0.1

- Documentation formatting updated in SKILL.md to add a missing frontmatter separator line.
- No functionality or content changes; only a minor correction to conform with formatting standards.

v1.0.0

- Initial release: enables transcription and audio intelligence for pre-recorded files and URLs using Gladia's async/batch API.
- Supports speaker diarization, PII redaction, subtitles, translation, summarization, chapterization, NER, and audio-to-LLM features.
- Strongly prefers official SDK methods (JS/Python) for file handling, with guidance to use REST API only as fallback.
- Includes full documentation for workflow, job management, options, polling best practices, and common mistakes.
- Documents limits (file size, duration, concurrency) and provides extensive troubleshooting and integration references.

Metadata

Slug gladia-pre-recorded-transcription

Version 1.0.1

License MIT-0

Total installs 1

Current installs 1

Version count 2

FAQ