audio-transcribe-summarize

Author: q1lin570 · GitHub ↗ · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 238
Favorites: 0
Current installs: 0
Versions: 2
Install in OpenClaw
/install audio-transcribe-summarize
Description
Transcribe audio/video files to text and generate structured summaries using SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take not...
Usage Instructions (SKILL.md)

Audio/Video Transcription & Summarization

Transcribe audio/video files using the SenseASR API (api.senseaudio.cn), then summarize the content into structured notes.

{baseDir} refers to this skill's directory.

Prerequisites

  • Environment variable SENSEAUDIO_API_KEY configured (get your key at https://senseaudio.cn/platform/api-key)
  • Python 3.8+ with requests installed
  • For large files (>10MB): ffmpeg installed for splitting (macOS: brew install ffmpeg; Windows: download from ffmpeg.org and add it to PATH; Linux: apt install ffmpeg)
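
The prerequisites above can be verified before running anything. The following is a minimal sketch, not part of the skill itself; the function name `check_prerequisites` is hypothetical, and it only checks the three items listed (environment variable, `requests`, ffmpeg on PATH) plus the Python version.

```python
import importlib.util
import os
import shutil
import sys


def check_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []

    # The skill documentation asks for Python 3.8 or newer.
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")

    # The API key is read from this environment variable.
    if not env.get("SENSEAUDIO_API_KEY"):
        problems.append("SENSEAUDIO_API_KEY is not set")

    # The transcription script depends on the third-party requests package.
    if importlib.util.find_spec("requests") is None:
        problems.append("Python package 'requests' is not installed")

    # ffmpeg is only needed when a file exceeds 10MB and must be split.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH (needed only for files > 10MB)")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Passing a custom `env` mapping makes the check easy to exercise without touching the real environment.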

Quick Start

  1. Run the transcription script:
python {baseDir}/scripts/transcribe.py <audio_file> [--model sense-asr-pro] [--language zh] [--speakers] [--sentiment] [--translate en]
  2. The script outputs a transcript .txt file alongside the source file
  3. Read the transcript and generate a summary (see Summary Format below)

Workflow

Step 1: Assess the Audio File

Check file size and format:

  • Supported formats: wav, mp3, ogg, flac, aac, m4a, mp4
  • Max file size per request: 10MB
  • If file > 10MB, the script auto-splits using ffmpeg
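
The checks above can be sketched in a few lines. This is an illustrative helper, not the code in `scripts/transcribe.py`; the name `assess_audio_file` is hypothetical, and treating "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import os

# Formats listed in Step 1.
SUPPORTED_FORMATS = {"wav", "mp3", "ogg", "flac", "aac", "m4a", "mp4"}

# Assumption: "10MB" is interpreted as 10 MiB; the API may count differently.
MAX_REQUEST_BYTES = 10 * 1024 * 1024


def assess_audio_file(path):
    """Return (is_supported, needs_split) for the file at path."""
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    is_supported = extension in SUPPORTED_FORMATS
    needs_split = os.path.getsize(path) > MAX_REQUEST_BYTES
    return is_supported, needs_split
```

A file that reports `needs_split` as true is the case where ffmpeg must be available, since the script splits it before upload.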

Step 2: Choose the Right Model

Model | Use When
sense-asr-lite | Quick batch transcription, simple audio, cost-sensitive
sense-asr | General transcription, need speaker separation or timestamps
sense-asr-pro | High accuracy needed: meetings, interviews, complex audio
sense-asr-deepthink | Noisy audio, dialects, heavy jargon, speech-to-clean-text

Default to sense-asr-pro for best quality.
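
One way to encode the model table as a decision helper is sketched below. The function `choose_model` and its parameters are hypothetical, and the precedence between rules is an assumption; only the model names and the sense-asr-pro default come from the table above.

```python
def choose_model(noisy_or_jargon=False, needs_speakers_or_timestamps=False,
                 cost_sensitive=False):
    """Pick a SenseASR model name; rule order is an assumption, not documented."""
    # Noisy audio, dialects, or heavy jargon: the table points to deepthink.
    if noisy_or_jargon:
        return "sense-asr-deepthink"
    # Cost-sensitive work that still needs speakers or timestamps.
    if cost_sensitive and needs_speakers_or_timestamps:
        return "sense-asr"
    # Cost-sensitive, simple audio.
    if cost_sensitive:
        return "sense-asr-lite"
    # Documented default for best quality.
    return "sense-asr-pro"
```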

Step 3: Transcribe

Run the transcription script. Key options:

# Basic transcription
python {baseDir}/scripts/transcribe.py recording.mp3

# Meeting with multiple speakers + emotion
python {baseDir}/scripts/transcribe.py meeting.wav \
  --model sense-asr-pro \
  --speakers --max-speakers 4 \
  --sentiment \
  --timestamps segment

# Transcribe and translate to English
python {baseDir}/scripts/transcribe.py lecture.mp3 \
  --model sense-asr \
  --translate en
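
When the script is invoked from Python rather than a shell, the same options can be assembled as an argument list. The sketch below uses only the flags shown in this document; the helper name `build_transcribe_command` is hypothetical, and the accepted values for each flag should be checked against the script itself.

```python
import sys


def build_transcribe_command(script_path, audio_path, model="sense-asr-pro",
                             language=None, speakers=False, max_speakers=None,
                             sentiment=False, timestamps=None, translate=None):
    """Build an argv list for transcribe.py from the documented flags."""
    command = [sys.executable, script_path, audio_path, "--model", model]
    if language:
        command += ["--language", language]
    if speakers:
        command.append("--speakers")
        # --max-speakers only makes sense together with --speakers.
        if max_speakers is not None:
            command += ["--max-speakers", str(max_speakers)]
    if sentiment:
        command.append("--sentiment")
    if timestamps:
        command += ["--timestamps", timestamps]
    if translate:
        command += ["--translate", translate]
    return command
```

The resulting list can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting problems with unusual file names.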

Step 4: Summarize

After transcription, read the transcript file and produce a summary using the format below.

Summary Format

Generate summaries in this structure:

# [Title - inferred from content]

**Source**: filename.mp3
**Duration**: X min Y sec
**Date**: YYYY-MM-DD
**Speakers**: [if speaker diarization was used]

## Key Points
- Point 1
- Point 2
- ...

## Detailed Summary
[2-4 paragraph summary of the content organized by topic/chronology]

## Action Items
- [ ] Action item 1 (assigned to Speaker X, if applicable)
- [ ] Action item 2

## Notable Quotes
> "Direct quote from transcript" — Speaker X, [timestamp if available]

## Full Transcript
<details>
<summary>Click to expand full transcript</summary>

[Full transcript text here, with speaker labels and timestamps if available]

</details>

Adapt the template based on content type:

  • Meeting: emphasize action items, decisions, speaker contributions
  • Lecture/Talk: emphasize key concepts, learning points, structure
  • Interview: emphasize Q&A pairs, key responses
  • Podcast: emphasize topics discussed, interesting insights
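
The header portion of the template can be filled in mechanically once the metadata is known. The sketch below is illustrative only; `format_duration` and `render_summary_header` are hypothetical names, and the remaining sections (key points, action items, quotes) still have to be written from the transcript.

```python
def format_duration(total_seconds):
    """Format a duration in seconds as 'X min Y sec', matching the template."""
    minutes, seconds = divmod(int(total_seconds), 60)
    return f"{minutes} min {seconds} sec"


def render_summary_header(title, source, duration_seconds, date, speakers=None):
    """Render the title and metadata lines of the summary template."""
    lines = [
        f"# {title}",
        "",
        f"**Source**: {source}",
        f"**Duration**: {format_duration(duration_seconds)}",
        f"**Date**: {date}",
    ]
    # The Speakers line is only included when diarization was used.
    if speakers:
        lines.append(f"**Speakers**: {', '.join(speakers)}")
    return "\n".join(lines)
```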

API Reference

For full SenseASR API parameters and response formats, see api-reference.md.

Security Recommendations
This skill appears to do what it claims (send audio to SenseAudio and produce transcripts/summaries), but note two things before installing/using it: (1) It requires a SENSEAUDIO_API_KEY (the SKILL.md and script require it) even though the registry metadata omitted that — make sure you supply a key and understand where it will be stored. (2) All audio is uploaded to https://api.senseaudio.cn, so transcripts and possibly speaker/emotion metadata are sent to a third party — consider privacy/confidentiality and cost. If you proceed, verify the API host, only use a dedicated API key with appropriate permissions/quota, run the script in an isolated environment if the audio is sensitive, and confirm the registry metadata is corrected or ask the publisher why the API key was not declared.
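
The advice to verify the API host can be applied as a small guard before any upload. This is a sketch under the assumption that the endpoint URL is available as a string; the function name `is_expected_endpoint` is hypothetical, and only the host name comes from this page.

```python
from urllib.parse import urlsplit

# Host named in the skill documentation.
EXPECTED_HOST = "api.senseaudio.cn"


def is_expected_endpoint(url):
    """True only for HTTPS URLs whose host is exactly the expected API host."""
    parts = urlsplit(url)
    # Compare the parsed host, not a substring, so look-alike domains fail.
    return parts.scheme == "https" and parts.hostname == EXPECTED_HOST
```

Comparing the parsed hostname rather than searching the string rejects look-alike domains that merely contain the expected name.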
Functional Analysis
Type: OpenClaw Skill
Name: audio-transcribe-summarize
Version: 1.0.1

The skill provides a legitimate utility for transcribing and summarizing audio/video files using the SenseAudio ASR API (api.senseaudio.cn). The Python script `scripts/transcribe.py` correctly handles file splitting via `ffmpeg` and communicates with the API as described in the documentation. No evidence of data exfiltration, malicious execution, or prompt injection was found; the code follows best practices such as using argument lists in `subprocess.run` to prevent shell injection.
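
The argument-list practice mentioned above can be illustrated with a duration probe. This is not the code from `scripts/transcribe.py`; the function names are hypothetical, and the ffprobe options shown are standard ones chosen for the example.

```python
import subprocess


def build_ffprobe_duration_command(path):
    """Return an argv list; the path stays one argument whatever it contains."""
    return ["ffprobe", "-v", "error", "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1", path]


def probe_duration_seconds(path):
    """Run ffprobe without a shell and parse the duration it prints."""
    result = subprocess.run(build_ffprobe_duration_command(path),
                            capture_output=True, text=True, check=True)
    return float(result.stdout.strip())
```

Because no shell is involved, characters such as `;` or spaces in a file name are passed to ffprobe literally instead of being interpreted.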
Capability Assessment
Purpose & Capability
The skill's name/description (transcribe & summarize using SenseAudio) align with the included code and API reference. However the registry metadata declared no required environment variables while SKILL.md and scripts/transcribe.py clearly require a SENSEAUDIO_API_KEY — an inconsistency between declared requirements and actual needs.
Instruction Scope
SKILL.md instructs the agent to run the included Python script, which uploads audio to api.senseaudio.cn and then writes local transcript (.txt/.json) files. The instructions and script operate within the stated purpose and do not attempt to read unrelated system files or additional environment variables beyond SENSEAUDIO_API_KEY. They also call ffmpeg/ffprobe via subprocess, which is expected for splitting large audio files.
Install Mechanism
There is no install spec (instruction-only with an included script). No packages are downloaded at install time. The risk surface is limited to running the provided Python script and any subprocesses it spawns (ffmpeg).
Credentials
The script requires SENSEAUDIO_API_KEY (used in Authorization header) but the registry metadata did not declare this environment variable. Requesting an API key for the remote ASR service is proportional to the functionality, but the metadata omission is misleading and could cause users to miss a sensitive requirement. Other environment access is minimal (PATH lookups for ffmpeg).
Persistence & Privilege
The skill is not always-enabled and is user-invocable. It does not request elevated or persistent platform privileges and does not modify other skills or system-wide configuration. Autonomous invocation is allowed by default but is not combined with other high-risk patterns here.
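How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install audio-transcribe-summarize
  3. After installation, invoke the Skill by name or trigger it with /audio-transcribe-summarize
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output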
Version History
v1.0.1
- Removed the `.env` file from the repository.
- Updated setup instructions: now require configuring the `SENSEAUDIO_API_KEY` environment variable instead of using a `.env` file.
- Prerequisites section now provides OS-specific installation steps for ffmpeg.
- Dependency on `python-dotenv` is no longer mentioned; only `requests` is required.
- Maintains existing workflow and summary guidelines.
v1.0.0
- Initial release of audio-transcribe-summarize skill.
- Transcribes audio/video files to text using the SenseAudio ASR API.
- Supports automatic splitting of large files and multiple audio formats.
- Generates structured summaries tailored for meetings, lectures, interviews, and podcasts.
- Provides customizable transcription options, including speaker separation, sentiment analysis, and translation.
- Includes a markdown-based summary template for consistent and readable output.
Metadata
Slug audio-transcribe-summarize
Version 1.0.1
License MIT-0
Total installs 0
Current installs 0
Version count 2
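FAQ

What is audio-transcribe-summarize?

Transcribe audio/video files to text and generate structured summaries using SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take not... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 238 total downloads to date.

How do I install audio-transcribe-summarize?

Run the command "/install audio-transcribe-summarize" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the skill requires the SENSEAUDIO_API_KEY environment variable at run time (see Prerequisites).

Is audio-transcribe-summarize free?

Yes. audio-transcribe-summarize is free under the MIT-0 license and can be freely downloaded, installed, and used. Calls to the SenseAudio API that it relies on may carry their own cost.

Which platforms does audio-transcribe-summarize support?

audio-transcribe-summarize is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed audio-transcribe-summarize?

It is developed and maintained by q1lin570 (@q1lin570); the current version is v1.0.1.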
