
Oatda Transcribe Audio

Author: devcsde · GitHub · v1.0.1 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 33 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install oatda-transcribe-audio
Description
Transcribe audio to text using OATDA's unified audio API. Triggers when the user wants speech-to-text, transcription of meetings, podcasts, voice notes, subt...
Usage (SKILL.md)

OATDA Audio Transcription

Transcribe audio files to text through OATDA's unified audio API.

API Key Resolution

All commands need the OATDA API key. Resolve it inline for each exec call:

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}"

If the key is empty or null, tell the user to get one at https://oatda.com and configure it.

Security: Never print the full API key. Only verify existence or show first 8 chars.
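
The empty-or-null check and the "first 8 chars" rule can be sketched as two small helpers. This is a minimal illustration, not part of the skill: the function names oatda_key_present and oatda_mask_key are hypothetical.

```shell
# Hypothetical helpers; these names are not defined by the OATDA skill.

# Succeeds only when the key is non-empty and not the literal string "null"
# (jq -r prints "null" when the field is missing from credentials.json).
oatda_key_present() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# Prints at most the first 8 characters followed by "..." so the
# full key never reaches the terminal or a log file.
oatda_mask_key() {
  printf '%.8s...\n' "$1"
}

if oatda_key_present "${OATDA_API_KEY:-}"; then
  echo "OATDA key found: $(oatda_mask_key "$OATDA_API_KEY")"
else
  echo "No OATDA API key found. Get one at https://oatda.com and configure it."
fi
```

Run this after the export line above; it reports only whether a key was resolved and never echoes more than its prefix.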

Model Mapping

| User says | Provider | Model |
|---|---|---|
| whisper, whisper-1, openai whisper (default) | openai | whisper-1 |
| transcription, speech to text, stt | openai | whisper-1 |

Default: openai / whisper-1 if no model specified.

If the user provides provider/model format directly (for example openai/whisper-1), split on /.

⚠️ Models change over time. If a model ID fails, query oatda-list-models with ?type=audio first.
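
As a rough sketch of the mapping and the split on /, the helper below sets PROVIDER and MODEL from whatever the user typed. The function name oatda_resolve_model is hypothetical, and treating unrecognized names as an error (rather than silently defaulting) is an assumption of this sketch.

```shell
# Hypothetical helper: sets PROVIDER and MODEL from a user-supplied string.
oatda_resolve_model() {
  case "$1" in
    */*)
      PROVIDER=${1%%/*}   # text before the first "/"
      MODEL=${1#*/}       # text after the first "/"
      ;;
    ""|whisper|whisper-1|"openai whisper"|transcription|"speech to text"|stt)
      # Aliases from the mapping table, plus the default when nothing is given.
      PROVIDER=openai
      MODEL=whisper-1
      ;;
    *)
      # Unknown name: let the caller look it up in the audio model list.
      return 1
      ;;
  esac
}
```

For example, `oatda_resolve_model "openai/whisper-1"` leaves PROVIDER=openai and MODEL=whisper-1. Matching here is case-sensitive; lowercase the input first if users may type "Whisper".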

Input Preparation

The transcription endpoint supports:

  • multipart/form-data with a local file upload
  • JSON with a base64 data URL in file
  • JSON with file_base64 for providers that support direct base64 payloads

Maximum audio file size is 25MB.

For local files, prefer multipart upload because it is simpler and avoids large JSON bodies.
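
Checking the size locally avoids uploading a file the API would reject. The sketch below assumes "25MB" means 25 * 1024 * 1024 bytes, which the text does not specify; the names oatda_check_audio_size and OATDA_MAX_AUDIO_BYTES are hypothetical.

```shell
# Assumption: "25MB" is read here as 25 * 1024 * 1024 bytes.
OATDA_MAX_AUDIO_BYTES=$((25 * 1024 * 1024))

# Hypothetical helper: returns 0 when the file exists and is within the limit,
# 1 when it is too large, and 2 when the path is not a regular file.
oatda_check_audio_size() {
  [ -f "$1" ] || { echo "Not a file: $1" >&2; return 2; }
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$1") ))
  if [ "$size" -gt "$OATDA_MAX_AUDIO_BYTES" ]; then
    echo "File is $size bytes; limit is $OATDA_MAX_AUDIO_BYTES. Split it first." >&2
    return 1
  fi
}
```

A caller can run `oatda_check_audio_size "$AUDIO_FILE" || exit 1` before the upload.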

Discovering Audio Model Parameters

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X GET "https://oatda.com/api/v1/llm/models?type=audio" \
  -H "Authorization: Bearer $OATDA_API_KEY" | jq '.audio_models[] | {id, supported_params}'

Look for:

  • audio_modes containing transcription
  • supported response_format values
  • optional timestamp, diarization, or streaming support

API Call (multipart)

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -F "provider=<PROVIDER>" \
  -F "model=<MODEL>" \
  -F "file=@<AUDIO_FILE>" \
  -F "response_format=json"

Alternative API Call (base64 JSON)

# Encode without line wraps. "base64 -w 0" is not available in every base64
# implementation, so strip the newlines with tr instead.
AUDIO_DATA_URL="data:audio/mpeg;base64,$(base64 < audio.mp3 | tr -d '\n')"

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d "$(jq -n \
    --arg provider "<PROVIDER>" \
    --arg model "<MODEL>" \
    --arg file "$AUDIO_DATA_URL" \
    '{provider: $provider, model: $model, file: $file, response_format: "json"}')"

Common Parameters

  • language: ISO-639-1 language code like en, de, fr
  • prompt: Context for names, acronyms, or domain-specific terms
  • response_format: json, text, srt, verbose_json, vtt, or diarized_json
  • temperature: 0 to 1
  • timestamp_granularities: word and/or segment
  • chunking_strategy: auto
  • hotwords: Provider-specific keyword hints
  • stream: true if supported by the selected model
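
One way to pass these parameters in the multipart call is to append a form field only when a value is set. The wrapper below is a sketch under assumptions: the function name oatda_transcribe and the option variables OATDA_LANGUAGE, OATDA_PROMPT, OATDA_TEMPERATURE, OATDA_RESPONSE_FORMAT, and OATDA_DRY_RUN are invented for illustration. Parameters whose multipart encoding is not described above (timestamp_granularities, chunking_strategy, hotwords, stream) are left out.

```shell
# Hypothetical wrapper; the function name and the OATDA_* option variables
# (other than OATDA_API_KEY) are assumptions made for this sketch.
oatda_transcribe() {
  # $1 = provider, $2 = model, $3 = path to the audio file
  set -- -F "provider=$1" -F "model=$2" -F "file=@$3" \
    -F "response_format=${OATDA_RESPONSE_FORMAT:-json}"

  # Append optional fields only when set. --form-string sends the value
  # literally, so a prompt starting with "@" or "<" is not read as a file.
  if [ -n "${OATDA_LANGUAGE:-}" ]; then
    set -- "$@" --form-string "language=$OATDA_LANGUAGE"
  fi
  if [ -n "${OATDA_PROMPT:-}" ]; then
    set -- "$@" --form-string "prompt=$OATDA_PROMPT"
  fi
  if [ -n "${OATDA_TEMPERATURE:-}" ]; then
    set -- "$@" --form-string "temperature=$OATDA_TEMPERATURE"
  fi

  # Dry-run mode prints the form arguments, one per line, without calling the API.
  if [ -n "${OATDA_DRY_RUN:-}" ]; then
    printf '%s\n' "$@"
    return 0
  fi

  curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
    -H "Authorization: Bearer $OATDA_API_KEY" "$@"
}
```

Setting OATDA_DRY_RUN=1 and OATDA_LANGUAGE=de, then calling `oatda_transcribe openai whisper-1 talk.mp3`, lists the fields that would be sent without contacting the API.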

Response Format

The API returns JSON like:

{
  "text": "The transcribed text...",
  "language": "en",
  "duration": 42.5,
  "segments": [],
  "words": [],
  "costs": {
    "inputCost": 0,
    "outputCost": 0.0001,
    "totalCost": 0.0001,
    "currency": "USD"
  }
}

Present the text field to the user. Include subtitles, segments, or words if the requested format includes them.
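
Pulling the transcript out of the JSON response takes a single jq filter. The helpers below are a sketch based only on the sample response shown above; the function names are hypothetical, and real responses may carry additional fields.

```shell
# Hypothetical helper: prints the transcript from a JSON response on stdin.
# Prints nothing when the response has no "text" field.
oatda_print_text() {
  jq -r '.text // empty'
}

# Hypothetical helper: one-line cost summary built from the "costs" object.
oatda_print_cost() {
  jq -r '.costs | "\(.totalCost) \(.currency)"'
}
```

Pipe the curl output into `oatda_print_text` to show only the transcript. For srt or vtt output, the response is presumably subtitle text rather than JSON, so it should be shown as-is.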

Error Handling

| HTTP Status | Meaning | Action |
|---|---|---|
| 401 | Invalid API key | Tell user to check their key |
| 402 | Insufficient credits | Tell user to check balance |
| 400 | Bad request / model not supported | Check model or file format and query oatda-list-models with type=audio |
| 413 | File too large | Keep audio under 25MB or split it |
| 429 | Rate limited or monthly cap | Wait briefly and retry once |
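
The status handling above can be sketched as a lookup plus a single retry on 429. Both function names and the OATDA_RETRY_DELAY variable are hypothetical; the 2-second default delay is an assumption, since the text only says to wait briefly.

```shell
# Hypothetical helper: maps an HTTP status code to the action listed above.
oatda_status_action() {
  case "$1" in
    2??) echo "ok" ;;
    400) echo "check the model or file format, then list audio models" ;;
    401) echo "check the API key" ;;
    402) echo "check the credit balance" ;;
    413) echo "keep audio under 25MB or split it" ;;
    429) echo "wait briefly and retry once" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Runs "$@" (a command that prints an HTTP status code) and repeats it a
# single time if the first attempt reports 429.
oatda_retry_once_on_429() {
  status=$("$@")
  if [ "$status" = "429" ]; then
    sleep "${OATDA_RETRY_DELAY:-2}"
    status=$("$@")
  fi
  echo "$status"
}

# Example call (not executed here): save the body to a file, print only the status.
# oatda_retry_once_on_429 curl -s -o response.json -w '%{http_code}' \
#   -X POST "https://oatda.com/api/v1/llm/transcriptions" \
#   -H "Authorization: Bearer $OATDA_API_KEY" \
#   -F "provider=openai" -F "model=whisper-1" -F "file=@<AUDIO_FILE>"
```

curl's -w '%{http_code}' prints the status after the transfer, and -o keeps the response body out of the captured output.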

Example

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -F "provider=openai" \
  -F "model=whisper-1" \
  -F "file=@<AUDIO_FILE>" \
  -F "response_format=json"

Notes

  • Endpoint: /api/v1/llm/transcriptions
  • Prefer multipart upload for local files
  • Use response_format=srt or vtt for subtitles
  • Use language to improve recognition when source language is known
  • Equivalent capability name: transcribe_audio
  • Related skills: oatda-generate-speech, oatda-translate-audio, oatda-list-models
Security recommendations
This skill appears coherent and limited in scope, but before installing:
  1. Confirm you trust oatda.com, because the audio you send is transmitted to that third party.
  2. Store and use a dedicated OATDA_API_KEY with minimal privileges, and don't reuse high-privilege keys.
  3. Verify the contents and permissions of ~/.oatda/credentials.json; the skill reads that file to obtain the API key.
  4. Be careful with sensitive audio (personal data, secrets), because it is processed by an external service.
  5. SKILL.md tries to avoid printing the full API key, but agents can still expose secrets through logs or mistakes. Consider limiting logging, and rotate keys if they may have been exposed.
Functional analysis
Type: OpenClaw Skill
Name: oatda-transcribe-audio
Version: 1.0.1

The skill is a standard integration for the OATDA audio transcription service. It uses curl and jq to interact with the oatda.com API, handling audio files via multipart uploads or base64 encoding. It includes appropriate logic for API key resolution from environment variables or a local configuration file (~/.oatda/credentials.json) and follows safe practices by advising against printing full secrets. No evidence of malicious intent, data exfiltration to unauthorized domains, or obfuscated code was found.
Capability tags
requires-sensitive-credentials
Capability assessment
Purpose & Capability
Name/description (transcribe audio via OATDA) align with requested resources: curl, jq, OATDA_API_KEY, and ~/.oatda/credentials.json. Those are expected for an instruction-only wrapper around a remote transcription API.
Instruction Scope
SKILL.md only instructs the agent to resolve the OATDA API key (from env or the declared ~/.oatda/credentials.json), call OATDA endpoints, and format/handle transcription responses. It does not direct the agent to read unrelated files, scan system state, or transmit data to destinations other than oatda.com.
Install Mechanism
No install spec — instruction-only. Nothing is downloaded or written to disk by the skill itself, which minimizes install-time risk.
Credentials
Only a single provider credential (OATDA_API_KEY) and a local credentials path are required. This is proportionate for a service that forwards audio to a third-party API. Required binaries (curl, jq) are standard for the described curl/jq examples.
Persistence & Privilege
always is false and the skill does not request persistent or elevated privileges. It only reads a declared per-user config path and an API key; it does not modify other skills or system-wide settings.
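How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat: /install oatda-transcribe-audio
  3. Once installed, invoke the skill by name or trigger it with /oatda-transcribe-audio
  4. Provide the inputs described in the skill's parameter notes to get structured output.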
Version history
v1.0.1
Fix: replaced with correct OpenClaw skill format
v1.0.0
Initial release: Speech-to-text transcription via OATDA unified audio API
Metadata
Slug: oatda-transcribe-audio
Version: 1.0.1
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 2
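FAQ

What is Oatda Transcribe Audio?

Transcribe audio to text using OATDA's unified audio API. Triggers when the user wants speech-to-text, transcription of meetings, podcasts, voice notes, subt... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has 33 total downloads so far.

How do I install Oatda Transcribe Audio?

Run the command "/install oatda-transcribe-audio" in an OpenClaw or Claude Code chat to install it in one step. Installation needs no extra configuration, but the skill requires an OATDA API key at run time.

Is Oatda Transcribe Audio free?

Yes. The skill itself is free under the MIT-0 license and can be freely downloaded, installed, and used. Calls to the OATDA API may consume credits (the API returns 402 when credits are insufficient).

Which platforms does Oatda Transcribe Audio support?

Oatda Transcribe Audio is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Oatda Transcribe Audio?

It is developed and maintained by devcsde (@devcsde); the current version is v1.0.1.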
