
Burmese Audio Understanding

Author: thelapyae · v1.2.2 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 182 · Favorites: 0 · Current installs: 0 · Versions: 4
Install in OpenClaw
/install burmese-audio-understanding
Description
High-accuracy Burmese audio transcription using Gemini 3.1 Flash Preview.
Usage Instructions (SKILL.md)

Burmese Audio Understanding Skill

This skill allows you to transcribe Burmese audio (voice notes, speech) directly into Burmese text using your own Google Gemini API key. It uses the official Google GenAI SDK for secure and reliable file handling.

Required Environment Variables

  • GEMINI_API_KEY: Required. Set your Google Gemini API key to allow the skill to access transcription services.

Usage

Ensure GEMINI_API_KEY is set in your environment, then run:

node scripts/transcribe-direct.js /path/to/my-audio.ogg
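
Before running the command above, it can help to confirm that both inputs are in place. The following is a minimal sketch of such a pre-flight check; the `preflight` helper is hypothetical and is not taken from `scripts/transcribe-direct.js`, which may validate its inputs differently or not at all.

```javascript
// Hypothetical helper for illustration; not part of the published skill.
const fs = require('node:fs');

/**
 * Returns a list of human-readable problems that would prevent a run.
 * An empty list means the API key and the audio path both look usable.
 *
 * @param {object} env  - environment variables, e.g. process.env
 * @param {string[]} argv - argument vector, e.g. process.argv
 */
function preflight(env, argv) {
  const problems = [];

  // The skill requires GEMINI_API_KEY to be set and non-empty.
  if (!env.GEMINI_API_KEY || env.GEMINI_API_KEY.trim() === '') {
    problems.push('GEMINI_API_KEY is not set');
  }

  // argv[2] is the first user-supplied argument: the audio file path.
  const audioPath = argv[2];
  if (!audioPath) {
    problems.push('usage: node scripts/transcribe-direct.js /path/to/my-audio.ogg');
  } else if (!fs.existsSync(audioPath)) {
    problems.push(`audio file not found: ${audioPath}`);
  }

  return problems;
}
```

A caller would typically run `preflight(process.env, process.argv)` at startup, print each returned problem, and exit with a non-zero status if the list is not empty.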

Features

  • Official SDK: Uses the official @google/genai SDK.
  • Improved Security: No shell commands (ffmpeg/child_process) used; file processing is handled via SDK file upload directly to Gemini.
  • Model: Uses gemini-3.1-flash-preview for high-quality audio transcription.

Security Notes

  • This skill sends audio data to Google Gemini API for transcription.
  • No data is stored locally after processing.
  • Requires a valid GEMINI_API_KEY with minimal permissions.

Prerequisites

  • Dependencies must be installed: npm install @google/genai.
Security Recommendations
This skill appears to do exactly what it says: upload a local audio file to Google Gemini for transcription. Before installing or running it: ensure you trust the @google/genai package version you'll install, protect the GEMINI_API_KEY (store it securely and grant only necessary permissions), be aware that audio data is transmitted to Google's servers (including any personally identifying content), and revoke the key if it is ever exposed. Also consider running npm installs in an isolated environment (container or VM) and confirm the package.json version matches the published registry artifact if version provenance matters to you.
Functional Analysis
Type: OpenClaw Skill
Name: burmese-audio-understanding
Version: 1.2.2
The skill is a legitimate implementation for Burmese audio transcription using the official @google/genai SDK. The script 'scripts/transcribe-direct.js' follows documented procedures for uploading audio files to the Gemini API, generating content, and performing cleanup by deleting the uploaded file. There are no signs of data exfiltration, shell execution, or malicious prompt injection.
Capability Assessment
Purpose & Capability
Name/description, SKILL.md, package.json, and the provided script all focus on sending a local audio file to the Google Gemini API for transcription. The declared requirement (GEMINI_API_KEY) matches the stated integration with Google Gemini. Minor: package.json version (1.2.1) differs from registry metadata (1.2.2), but this is a packaging/version inconsistency rather than a functional mismatch.
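
If version provenance matters, the mismatch noted above can be detected mechanically. This is a small sketch under the assumption that the caller supplies the registry version by hand; the `versionMismatch` helper is hypothetical and does not query any registry.

```javascript
// Hypothetical helper for illustration; it only compares two strings.

/**
 * Compares the "version" field of a package.json document with the version
 * advertised by the registry. Returns null when they agree, otherwise a
 * short description of the mismatch.
 *
 * @param {string} packageJsonText - raw contents of package.json
 * @param {string} registryVersion - version shown in the registry metadata
 */
function versionMismatch(packageJsonText, registryVersion) {
  const { version } = JSON.parse(packageJsonText);
  if (version === registryVersion) {
    return null;
  }
  return `package.json says ${version}, registry says ${registryVersion}`;
}
```

For this skill, passing a package.json that declares 1.2.1 together with the registry version 1.2.2 would report the inconsistency described in the assessment.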
Instruction Scope
Runtime instructions are limited and specific: set GEMINI_API_KEY, run the Node script on a local audio file. The script uploads the file via the SDK, requests a transcription, prints it, and deletes the uploaded file. It does not reference unrelated files, extra environment variables, or external endpoints beyond the Google Gemini API.
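
The upload, transcribe, print, and delete sequence described above can be sketched roughly as follows. This is not the skill's actual script: the `transcribe` function, the prompt text, and the default MIME type are assumptions for illustration. The client is passed in as a parameter so the flow can be read and exercised without network access; in real use it would be a `GoogleGenAI` instance from `@google/genai`.

```javascript
// Sketch only; the real scripts/transcribe-direct.js may differ.
// `ai` is expected to be a client from the official SDK, for example:
//   const { GoogleGenAI } = require('@google/genai');
//   const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const MODEL = 'gemini-3.1-flash-preview'; // model name as stated in SKILL.md

/**
 * Uploads a local audio file, requests a transcription, and always deletes
 * the uploaded copy afterwards.
 *
 * @param {object} ai - GoogleGenAI-compatible client (files + models APIs)
 * @param {string} audioPath - path to the local audio file
 * @param {string} mimeType - assumed default; adjust for other formats
 * @returns {Promise<string>} the transcription text
 */
async function transcribe(ai, audioPath, mimeType = 'audio/ogg') {
  // 1. Upload through the SDK (no ffmpeg, no child_process).
  const uploaded = await ai.files.upload({
    file: audioPath,
    config: { mimeType },
  });

  try {
    // 2. Ask the model to transcribe the uploaded file.
    const response = await ai.models.generateContent({
      model: MODEL,
      contents: [
        {
          role: 'user',
          parts: [
            { fileData: { fileUri: uploaded.uri, mimeType: uploaded.mimeType } },
            // Hypothetical prompt; the skill's real wording is not shown.
            { text: 'Transcribe this Burmese audio into Burmese text.' },
          ],
        },
      ],
    });
    return response.text;
  } finally {
    // 3. Remove the uploaded file even if transcription failed.
    await ai.files.delete({ name: uploaded.name });
  }
}
```

Placing the delete call in `finally` is one way to honor the cleanup step even when the model call throws; a caller would then print the returned text with `console.log`.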
Install Mechanism
There is no install spec; dependencies are standard npm (the SKILL.md requests running `npm install @google/genai`). Installing a single official SDK dependency is proportional to the task. No downloads from untrusted URLs or archive extraction are present.
Credentials
Only GEMINI_API_KEY is required and is appropriate for calling the Google Gemini API. No unrelated secrets or system credential paths are requested.
Persistence & Privilege
The skill is not always-enabled, does not request persistent elevated privileges, and does not modify other skills or system-wide agent settings.
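How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install burmese-audio-understanding
  3. Once installation completes, invoke the skill by name or trigger it with /burmese-audio-understanding
  4. Provide the inputs described in the skill's parameter notes to receive structured output.
Version History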
v1.2.2
- Added new Security Notes section detailing how audio data is handled and privacy considerations.
- Clarified that no data is stored locally after processing.
- Stated API key requirements and minimal permission guidance.
v1.2.1
- Added a "Required Environment Variables" section clarifying that `GEMINI_API_KEY` must be set for transcription services.
- Usage instructions and features remain unchanged.
v1.2.0
- Updated to use Gemini 3.1 Flash Preview for higher transcription accuracy.
- Switched to the official @google/genai SDK for secure file handling.
- Removed reliance on shell commands and ffmpeg; all audio processing is managed through the SDK.
- Skill name and description updated to reflect improved model and capabilities.
v1.1.0
- Added support for transcribing Burmese audio directly to text using Gemini 2.0 Flash.
- New script: `transcribe-direct.js` for audio file transcription.
- Now requires setting the `GEMINI_API_KEY` environment variable.
- Updated prerequisites: ffmpeg must be installed, and `@google/generative-ai` must be installed via npm.
Metadata
Slug: burmese-audio-understanding
Version: 1.2.2
License: MIT-0
Total installs: 0
Current installs: 0
Historical versions: 4
FAQ

What is Burmese Audio Understanding?

High-accuracy Burmese audio transcription using Gemini 3.1 Flash Preview. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 182 times to date.

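How do I install Burmese Audio Understanding?

Run the command "/install burmese-audio-understanding" in the OpenClaw or Claude Code chat box to install it in one step. To run the skill you still need to set GEMINI_API_KEY and install the @google/genai dependency, as described in SKILL.md.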

Is Burmese Audio Understanding free?

Yes. Burmese Audio Understanding is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Burmese Audio Understanding support?

Burmese Audio Understanding is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Burmese Audio Understanding?

It is developed and maintained by thelapyae (@thelapyae). The current version is v1.2.2.
