← Back to Skills Marketplace
yzwu2017

fun-voice-type

by Yuzhong WU · GitHub ↗ · v2.0.0 · MIT-0
cross-platform ⚠ suspicious
150
Downloads
0
Stars
1
Active Installs
4
Versions
Install in OpenClaw
/install fun-voice-type
Description
一个语音输入法插件。它基于阿里云FunASR实时语音识别技术,允许用户通过长按快捷键(Right Option键)直接将语音转换为文字并“打”在当前光标所在的任何输入框中。此外,还能将语音翻译为多种语言(例:中英日韩)。
README (SKILL.md)

激活条件

触发场景 说明
请求语音转译 "实时录音转写"、"语音转文字"、"实时语音翻译"
功能咨询 "怎么用语音打字?"
效率需求 "我不方便打字"、"帮我记录这段话"

核心功能

  • 长按即说:将鼠标光标点击到任何你想输入文字的地方,长按 Right Option \x3Ckbd>⌥\x3C/kbd> 开始录音,松开自动完成。
  • 全场景兼容:无缝支持浏览器、文档编辑器、IM 聊天软件等任何 macOS 标准输入控件。
  • 多语种兼容:支持多语种输入,以及翻译功能(点击fun-voice-type图标选择目标语种)。

环境依赖

1. 系统库依赖

由于使用了 pyaudio,你需要先在系统中安装portaudio以及python依赖:

brew install portaudio
pip install dashscope pynput pyaudio pystray

2. 设置 DashScope API Key

为了安全起见,建议将API Key设置为环境变量:

export DASHSCOPE_API_KEY='你的API_KEY'

如果还没有API Key,建议访问 阿里云 DashScope 控制台,申请并获取API Key。

安装与运行

运行脚本,fun-voice-type将显示为Mac菜单栏右上角的小图标:

nohup python fun-voice-type.py > /dev/null 2>&1

此时长按右Option即可实现语音输入功能。

权限授予

由于该 Skill 需要监听全局键盘按键并模拟键盘输入,在不同系统下需要额外权限:

macOS

  • 辅助功能 (Accessibility):前往 系统设置 -> 隐私与安全性 -> 辅助功能,将你运行脚本的终端(如 Terminal, iTerm2 或 VSCode)勾选开启。
  • 麦克风 (Microphone):首次运行时,系统会弹出麦克风权限请求,请点击允许。
  • 输入监听 (Input Monitoring):同样在隐私设置中确保终端有权监听键盘。

版本: 2.0.0 日期: 2026-03-21

Usage Guidance
What to consider before installing: - The code will send microphone audio (via FunASR) and recognized text to DashScope cloud services, and may send recognized text to the qwen-plus model for translation. Do not use it for sensitive audio/text unless you trust DashScope and your API key. - You must set DASHSCOPE_API_KEY in your environment for full functionality, but that env var is not declared in the skill registry metadata — verify this discrepancy with the publisher before providing a real API key. - The script requires Accessibility/Input Monitoring permission for the terminal you run it from and will simulate typing into whatever input has focus. Granting those permissions to a terminal is powerful: consider running in a dedicated account or VM, and avoid using the tool while focused on password fields, banking apps, or other sensitive inputs. - The PKG is small and readable; if you can, review the included script yourself (or have someone you trust do so). Confirm network destinations (dashscope endpoints) and consider monitoring outbound network traffic the first time you run it. - If you need higher assurance: ask the publisher for a verified source/homepage, or request that the registry metadata be updated to declare DASHSCOPE_API_KEY as a required env var and to provide a signed release or provenance information.
Capability Analysis
Type: OpenClaw Skill Name: fun-voice-type Version: 2.0.0 The skill is a legitimate voice-to-text utility that uses Alibaba's FunASR and Qwen LLM for transcription and translation. It requires standard macOS permissions (Microphone, Accessibility) to capture audio and simulate keyboard input, which are necessary for its core functionality. The code in fun-voice-type.py is transparent, uses environment variables for API keys, and lacks any indicators of data exfiltration or malicious execution.
Capability Assessment
Purpose & Capability
The name/description (voice input + translation via FunASR) matches the code and instructions: it records microphone audio, sends frames to DashScope FunASR, optionally sends recognized text to DashScope Generation (qwen-plus) for translation, and types results into the active input. This capability set is coherent for the stated purpose.
Instruction Scope
SKILL.md instructs installing portaudio and several Python packages and to set DASHSCOPE_API_KEY; it also asks the user to grant Accessibility/Input Monitoring and Microphone to the terminal used to run the script. The runtime instructions and code access the environment variable DASHSCOPE_API_KEY, but the package metadata declared 'Required env vars: none' — the instructions therefore access an env var not declared in metadata.
Install Mechanism
There is no install spec (instruction-only with an included script). Required system and Python deps are documented in SKILL.md. No downloads from untrusted URLs or archive extraction are used.
Credentials
The code requires a DashScope API key (DASHSCOPE_API_KEY) to call ASR and generation APIs, which is appropriate for cloud ASR/LLM usage — but the registry metadata does not declare this required env var or a primary credential. The missing declaration is an inconsistency the user should note. No other unrelated credentials are requested.
Persistence & Privilege
The skill is not always-enabled and does not request elevated platform privileges beyond the macOS accessibility/input monitoring grants required to listen to global keys and simulate typing. It does simulate keystrokes into any focused input (expected for an input method), which gives it potential to inject or exfiltrate sensitive text if used with sensitive inputs.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install fun-voice-type
  3. After installation, invoke the skill by name or use /fun-voice-type
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.0.0
Version 2.0.0 introduces multi-language support and translation features. - Adds the ability to translate speech input into multiple languages (e.g., Chinese, English, Japanese, Korean).
v1.0.2
- Update Icon and Usage Explanation.
v1.0.1
- 更新激活条件,更多关于语音输入和效率类需求场景的描述。 - 优化描述语,强调 Right Option(⌥)为触发键及全场景兼容性。 - 新增“全场景兼容”功能说明,明确支持主流输入环境。 - 小幅完善使用说明和控件键提示,提升文档清晰度。 - 更新版本号及日期。
v1.0.0
fun-voice-type 1.0.0 - 首次发布 - 提供基于阿里云FunASR实时语音识别的语音输入,支持长按右Option键将语音快速转换为文字并自动输入到当前光标处 - 集成菜单栏小图标方便退出。 - 详细说明环境依赖、API Key 配置及 macOS 权限设置
Metadata
Slug fun-voice-type
Version 2.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 4
Frequently Asked Questions

What is fun-voice-type?

一个语音输入法插件。它基于阿里云FunASR实时语音识别技术,允许用户通过长按快捷键(Right Option键)直接将语音转换为文字并“打”在当前光标所在的任何输入框中。此外,还能将语音翻译为多种语言(例:中英日韩)。 It is an AI Agent Skill for Claude Code / OpenClaw, with 150 downloads so far.

How do I install fun-voice-type?

Run "/install fun-voice-type" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is fun-voice-type free?

Yes, fun-voice-type is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does fun-voice-type support?

fun-voice-type is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created fun-voice-type?

It is built and maintained by Yuzhong WU (@yzwu2017); the current version is v2.0.0.

💬 Comments