Oatda Transcribe Audio

Name: Oatda Transcribe Audio
Author: devcsde

by devcsde · v1.0.1 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install oatda-transcribe-audio

Description

Transcribe audio to text using OATDA's unified audio API. Triggers when the user wants speech-to-text, transcription of meetings, podcasts, voice notes, subt...

README (SKILL.md)

OATDA Audio Transcription

Transcribe audio files to text through OATDA's unified audio API.

API Key Resolution

All commands need the OATDA API key. Resolve it inline for each exec call:

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}"

If the key is empty or null, tell the user to get one at https://oatda.com and configure it.

Security: Never print the full API key. Only verify existence or show first 8 chars.
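The check described above can be sketched as a small shell function. The name check_oatda_key is hypothetical and not part of the skill; it only illustrates how to confirm that a key exists while revealing at most its first 8 characters.

```shell
# Hypothetical helper, not part of the skill: confirm a key exists
# without ever echoing more than its first 8 characters.
check_oatda_key() {
  key="$1"
  # jq -r prints the literal string "null" when the field is missing.
  if [ -z "$key" ] || [ "$key" = "null" ]; then
    echo "missing"
    return 1
  fi
  # The %.8s precision truncates the output to at most 8 characters.
  printf 'present (%.8s...)\n' "$key"
}
```

A typical call would be check_oatda_key "$OATDA_API_KEY", pointing the user to https://oatda.com when it returns non-zero.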

Model Mapping

User says	Provider	Model
whisper, whisper-1, openai whisper (default)	openai	whisper-1
transcription, speech to text, stt	openai	whisper-1

Default: openai / whisper-1 if no model specified.

If the user provides provider/model format directly (for example openai/whisper-1), split on /.

⚠️ Models change over time. If a model ID fails, query oatda-list-models with ?type=audio first.
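A minimal sketch of the mapping and the provider/model split, assuming exact lowercase matches for the phrases in the table. The function name resolve_audio_model and the choice to reject unknown phrases are illustrative assumptions, not behavior defined by the skill.

```shell
# Hypothetical helper: print "provider model" for a user phrase.
resolve_audio_model() {
  case "$1" in
    */*)
      # provider/model given directly: split on the first slash
      printf '%s %s\n' "${1%%/*}" "${1#*/}"
      ;;
    ""|whisper|whisper-1|"openai whisper"|transcription|"speech to text"|stt)
      # every alias in the table, and an empty request, map to the default
      printf 'openai whisper-1\n'
      ;;
    *)
      # unknown phrase: let the caller decide (for example, query oatda-list-models)
      return 1
      ;;
  esac
}
```

The two words of output can then be passed as the provider and model fields of the request.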

Input Preparation

The transcription endpoint supports:

multipart/form-data with a local file upload
JSON with a base64 data URL in file
JSON with file_base64 for providers that support direct base64 payloads

Maximum audio file size is 25MB.

For local files, prefer multipart upload because it is simpler and avoids large JSON bodies.
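Checking the size locally before uploading avoids a wasted request. The sketch below assumes that "25MB" means 25 × 1024 × 1024 bytes, which the source does not specify; the function and variable names are hypothetical.

```shell
# Assumption: the 25MB limit is interpreted as 25 MiB.
MAX_AUDIO_BYTES=$((25 * 1024 * 1024))

# Hypothetical helper: returns 0 if the file fits, 1 if it is too large,
# and 2 if it does not exist.
audio_size_ok() {
  [ -f "$1" ] || return 2
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(($(wc -c < "$1")))
  [ "$size" -le "$MAX_AUDIO_BYTES" ]
}
```

If the check fails with status 1, split the recording before sending it.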

Discovering Audio Model Parameters

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X GET "https://oatda.com/api/v1/llm/models?type=audio" \
  -H "Authorization: Bearer $OATDA_API_KEY" | jq '.audio_models[] | {id, supported_params}'

Look for:

audio_modes containing transcription
supported response_format values
optional timestamp, diarization, or streaming support

API Call (multipart)

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -F "provider=<PROVIDER>" \
  -F "model=<MODEL>" \
  -F "file=@<AUDIO_FILE>" \
  -F "response_format=json"

Alternative API Call (base64 JSON)

# Stream the base64 payload through stdin: command-line arguments are size-limited,
# so passing the data URL via --arg or -d "..." fails for all but very small files.
# base64 reading stdin plus tr works with both GNU and BSD/macOS base64.
export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
base64 < audio.mp3 | tr -d '\n' | \
jq -R -s -c --arg provider "<PROVIDER>" --arg model "<MODEL>" \
  '{provider: $provider, model: $model, file: ("data:audio/mpeg;base64," + .), response_format: "json"}' | \
curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  --data-binary @-
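The data URL in the call above hardcodes audio/mpeg, which only fits MP3 input. As a sketch, the MIME type could be derived from the file extension. The extension-to-type table uses standard MIME types, but which formats the endpoint actually accepts is an assumption, and both function names are hypothetical.

```shell
# Hypothetical helper: map a file extension to a standard audio MIME type.
audio_mime_type() {
  case "$1" in
    *.mp3)  echo "audio/mpeg" ;;
    *.wav)  echo "audio/wav" ;;
    *.m4a)  echo "audio/mp4" ;;
    *.ogg)  echo "audio/ogg" ;;
    *.flac) echo "audio/flac" ;;
    *.webm) echo "audio/webm" ;;
    *)      return 1 ;;   # unknown extension: refuse rather than guess
  esac
}

# Hypothetical helper: write a base64 data URL for the file to stdout.
audio_data_url() {
  mime=$(audio_mime_type "$1") || return 1
  printf 'data:%s;base64,' "$mime"
  # Reading from stdin and stripping newlines works with GNU and BSD base64.
  base64 < "$1" | tr -d '\n'
}
```

Because the output goes to stdout, it can be piped straight into jq instead of being stored in a shell variable.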

Common Parameters

language: ISO-639-1 language code like en, de, fr
prompt: Context for names, acronyms, or domain-specific terms
response_format: json, text, srt, verbose_json, vtt, or diarized_json
temperature: 0 to 1
timestamp_granularities: word and/or segment
chunking_strategy: auto
hotwords: Provider-specific keyword hints
stream: true if supported by the selected model
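Optional parameters such as language and prompt can be appended to the multipart call only when they are set. The sketch below assumes the endpoint accepts them as extra form fields alongside provider and model. The function name and the DRY_RUN switch are hypothetical and exist only to make the argument list inspectable.

```shell
# Hypothetical wrapper: transcribe_with_options FILE [LANGUAGE] [PROMPT]
transcribe_with_options() {
  file="$1"; language="${2:-}"; prompt="${3:-}"
  set -- -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
    -F "provider=openai" -F "model=whisper-1" \
    -F "file=@$file" -F "response_format=json"
  # --form-string sends the value literally, so a prompt that starts
  # with @ or < is not treated as a file reference by curl.
  if [ -n "$language" ]; then set -- "$@" --form-string "language=$language"; fi
  if [ -n "$prompt" ]; then set -- "$@" --form-string "prompt=$prompt"; fi
  if [ -n "${DRY_RUN:-}" ]; then
    # Print one argument per line; the API key is deliberately left out.
    printf '%s\n' "$@"
  else
    curl -H "Authorization: Bearer ${OATDA_API_KEY:-}" "$@"
  fi
}
```

Running it with DRY_RUN=1 shows the request that would be sent without contacting the API or printing the key.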

Response Format

The API returns JSON like:

{
  "text": "The transcribed text...",
  "language": "en",
  "duration": 42.5,
  "segments": [],
  "words": [],
  "costs": {
    "inputCost": 0,
    "outputCost": 0.0001,
    "totalCost": 0.0001,
    "currency": "USD"
  }
}

Present the text field to the user. Include subtitles, segments, or words if the requested format includes them.
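Extracting the text field from a saved response could look like the sketch below. It assumes the response was written to a file and that jq is installed, as the skill already requires; the function name is hypothetical.

```shell
# Hypothetical helper: print the transcript from a saved JSON response.
print_transcript() {
  # -r drops the JSON quotes; -e makes jq exit non-zero when the filter
  # produces no output, which happens here if .text is missing or null.
  jq -er '.text // empty' "$1" || {
    echo "no text field in $1" >&2
    return 1
  }
}
```

The same pattern applies to .segments or .words when the requested format includes them.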

Error Handling

HTTP Status	Meaning	Action
401	Invalid API key	Tell user to check their key
402	Insufficient credits	Tell user to check balance
400	Bad request / model not supported	Check model or file format and query `oatda-list-models` with `type=audio`
413	File too large	Keep audio under 25MB or split it
429	Rate limited or monthly cap	Wait briefly and retry once
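The table above translates directly into a case statement. In this sketch the function name and the wording for statuses outside the table are assumptions; the listed actions are taken from the table.

```shell
# Hypothetical helper: map an HTTP status code to the action from the table.
oatda_error_action() {
  case "$1" in
    2??) echo "ok" ;;
    400) echo "check model or file format; query oatda-list-models with type=audio" ;;
    401) echo "ask the user to check their API key" ;;
    402) echo "ask the user to check their balance" ;;
    413) echo "keep audio under 25MB or split it" ;;
    429) echo "wait briefly and retry once" ;;
    *)   echo "unexpected status $1" ;;
  esac
}
```

The status code can be captured by adding -o response.json -w '%{http_code}' to the curl call, which writes the body to a file and prints only the code.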

Example

export OATDA_API_KEY="${OATDA_API_KEY:-$(cat ~/.oatda/credentials.json 2>/dev/null | jq -r '.profiles[.defaultProfile].apiKey' 2>/dev/null)}" && \
curl -s -X POST "https://oatda.com/api/v1/llm/transcriptions" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -F "provider=openai" \
  -F "model=whisper-1" \
  -F "file=@<AUDIO_FILE>" \
  -F "response_format=json"

Notes

Endpoint: /api/v1/llm/transcriptions
Prefer multipart upload for local files
Use response_format=srt or vtt for subtitles
Use language to improve recognition when source language is known
Equivalent capability name: transcribe_audio
Related skills: oatda-generate-speech, oatda-translate-audio, oatda-list-models

Usage Guidance

This skill appears coherent and limited in scope, but before installing:

1) Confirm you trust oatda.com: audio you send is transmitted to that third party.
2) Store and use a dedicated OATDA_API_KEY with minimal privileges; do not reuse high-privilege keys.
3) Verify the contents and permissions of ~/.oatda/credentials.json; the skill reads that file to obtain the API key.
4) Be careful with sensitive audio (personal data, secrets), because it is sent to an external service for transcription.
5) SKILL.md tries to avoid printing the full API key, but agents can still expose secrets through logs or mistakes. Consider limiting logging, and rotate keys if they may have been exposed.

Capability Analysis

Type: OpenClaw Skill
Name: oatda-transcribe-audio
Version: 1.0.1

The skill is a standard integration for the OATDA audio transcription service. It uses curl and jq to interact with the oatda.com API, handling audio files via multipart uploads or base64 encoding. It includes appropriate logic for API key resolution from environment variables or a local configuration file (~/.oatda/credentials.json) and follows safe practices by advising against printing full secrets. No evidence of malicious intent, data exfiltration to unauthorized domains, or obfuscated code was found.

Capability Tags

requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

Name/description (transcribe audio via OATDA) align with requested resources: curl, jq, OATDA_API_KEY, and ~/.oatda/credentials.json. Those are expected for an instruction-only wrapper around a remote transcription API.

✓ Instruction Scope

SKILL.md only instructs the agent to resolve the OATDA API key (from env or the declared ~/.oatda/credentials.json), call OATDA endpoints, and format/handle transcription responses. It does not direct the agent to read unrelated files, scan system state, or transmit data to destinations other than oatda.com.

✓ Install Mechanism

No install spec — instruction-only. Nothing is downloaded or written to disk by the skill itself, which minimizes install-time risk.

✓ Credentials

Only a single provider credential (OATDA_API_KEY) and a local credentials path are required. This is proportionate for a service that forwards audio to a third-party API. Required binaries (curl, jq) are standard for the described curl/jq examples.

✓ Persistence & Privilege

always is false and the skill does not request persistent or elevated privileges. It only reads a declared per-user config path and an API key; it does not modify other skills or system-wide settings.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install oatda-transcribe-audio
After installation, invoke the skill by name or use /oatda-transcribe-audio
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

Fix: replaced with correct OpenClaw skill format

v1.0.0

Initial release: Speech-to-text transcription via OATDA unified audio API

Metadata

Slug oatda-transcribe-audio

Version 1.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Oatda Transcribe Audio?

Transcribe audio to text using OATDA's unified audio API. Triggers when the user wants speech-to-text, transcription of meetings, podcasts, voice notes, subt... It is an AI Agent Skill for Claude Code / OpenClaw, with 33 downloads so far.

How do I install Oatda Transcribe Audio?

Run "/install oatda-transcribe-audio" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Oatda Transcribe Audio free?

Yes, Oatda Transcribe Audio is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Oatda Transcribe Audio support?

Oatda Transcribe Audio is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Oatda Transcribe Audio?

It is built and maintained by devcsde (@devcsde); the current version is v1.0.1.
