Total downloads: 328
Favorites: 0
Current installs: 0
Versions: 3
Install in OpenClaw
/install bailian-studio
Description
Call Aliyun Bailian via DashScope; support OCR, TTS, text-to-image and image-to-image.
Usage instructions (SKILL.md)
Bailian Studio
Use DashScope for OCR, TTS, text-to-image, and image-to-image.
Requirements
- Python 3
- dashscope (>=1.24.0)
- oss2
- requests
- ffmpeg (needed for TTS playback, which uses ffplay)
Install:
pip install -r requirements.txt
Config
API Key (priority order):
- DASHSCOPE_API_KEY env
- secrets/bailian.env
OSS (priority order):
- OSS_ACCESS_KEY, OSS_SECRET_KEY, OSS_BUCKET, OSS_ENDPOINT, OSS_REGION env
- secrets/bailian.env
Example secrets/bailian.env:
DASHSCOPE_API_KEY=sk-xxx
DASHSCOPE_BASE_URL=https://dashscope.aliyuncs.com/api/v1
# Optional TTS settings (leave empty to use the defaults)
BAILIAN_TTS_MODEL=qwen3-tts-flash
BAILIAN_TTS_VOICE=
BAILIAN_TTS_SAMPLE_RATE=16000
OSS_ACCESS_KEY=ak-xxx
OSS_SECRET_KEY=sk-xxx
OSS_BUCKET=your-bucket
OSS_ENDPOINT=oss-cn-beijing.aliyuncs.com
OSS_REGION=cn-beijing
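The priority order above (environment variable first, then secrets/bailian.env) could be sketched roughly as follows. This is an illustration only, not the skill's actual scripts/env.py: the helper names parse_env_text and resolve_setting are hypothetical, and treating an empty value as "unset" is an assumption.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_setting(name, environ=None, env_file="secrets/bailian.env"):
    """Return the value from the environment first, then from the env file.

    Hypothetical helper: empty values are treated as unset (an assumption).
    """
    environ = os.environ if environ is None else environ
    if environ.get(name):
        return environ[name]
    path = Path(env_file)
    if path.is_file():
        file_values = parse_env_text(path.read_text(encoding="utf-8"))
        return file_values.get(name) or None
    return None
```

With this shape, a value exported in the shell always wins over the file, which lets the secrets file be left out of a repository entirely.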
Defaults:
- Region/base URL: Beijing (https://dashscope.aliyuncs.com/api/v1)
- Image model: qwen-image-2.0-pro
- Output dir: tmp/bailian-studio/
- Output format: PNG
Usage
TTS (speak)
python3 {baseDir}/scripts/tts_speak.py --text "你好"
OCR (text)
From local image (uploads to OSS):
python3 {baseDir}/scripts/ocr_text.py --image /path/to.png
From URL:
python3 {baseDir}/scripts/ocr_text.py --url https://example.com/image.png
Image generate (text-to-image)
python3 {baseDir}/scripts/image_generate.py \
--prompt "一只坐在云端的橘猫" \
--width 1024 \
--height 1024
Image generate (image-to-image)
Local image:
python3 {baseDir}/scripts/image_generate.py \
--prompt "改成赛博朋克风格" \
--image /path/to/reference.png \
--width 1024 \
--height 1024
URL image:
python3 {baseDir}/scripts/image_generate.py \
--prompt "改成水彩插画风格" \
--image https://example.com/reference.png \
--width 1024 \
--height 1024
stdin prompt
echo "一只会发光的鲸鱼漂浮在夜空" | python3 {baseDir}/scripts/image_generate.py
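How a CLI might accept the prompt either from --prompt or from piped stdin can be sketched as below. The function name resolve_prompt and the rule that an explicit --prompt takes precedence are assumptions for illustration; the actual image_generate.py may behave differently.

```python
import sys


def resolve_prompt(prompt_arg, stdin=None):
    """Prefer an explicit --prompt value; otherwise read the prompt from stdin.

    Hypothetical helper, not taken from the skill's source.
    """
    if prompt_arg and prompt_arg.strip():
        return prompt_arg.strip()
    stream = sys.stdin if stdin is None else stdin
    # An interactive terminal has no piped input to read.
    if stream.isatty():
        raise ValueError("no prompt given: pass --prompt or pipe text on stdin")
    text = stream.read().strip()
    if not text:
        raise ValueError("prompt read from stdin is empty")
    return text
```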
Behavior
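- Local reference image: uploaded to OSS first, then passed to DashScope
- URL reference image: passed through to DashScope unchanged
- One image is generated per run by default
- On success, the saved path is printed to stdout
- If the file name already exists, the output is renamed automatically
- On failure, an error message is printed and the exit code is non-zero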
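Two of the behaviors above, routing a reference image by whether it is a URL and avoiding file name collisions, could look roughly like this. It is a minimal sketch under assumptions: the helper names are hypothetical, and the "-1", "-2" suffix scheme is invented for illustration since the source does not say how the rename is done.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_remote_reference(ref):
    """True for http(s) URLs, which would be passed through without an upload."""
    return urlparse(ref).scheme in ("http", "https")


def unique_output_path(directory, stem, suffix=".png"):
    """Return a path in `directory` that does not exist yet.

    Appends -1, -2, ... to the stem until the name is free (assumed scheme).
    """
    directory = Path(directory)
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```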
Security recommendations
This skill's code appears coherent with its advertised features, but there are important mismatches and privacy implications to check before installing:
- Credentials: The package metadata claims no required env vars, but the code requires DASHSCOPE_API_KEY and a full OSS credential set (ACCESS_KEY, SECRET_KEY, BUCKET, ENDPOINT, REGION). These are mandatory for many flows and will raise runtime errors if missing. Confirm you are comfortable providing those secrets.
- Data flow: Local reference images are uploaded to your configured OSS bucket by scripts/oss_upload.py and the code constructs public URLs. That means any local files you pass will be uploaded (and potentially publicly addressable depending on bucket ACLs). Avoid supplying private images unless you trust the bucket configuration.
- Execution requirements: TTS playback uses ffplay (ffmpeg). If you don't want playback, use the --output option to save WAV instead.
- Source provenance: The skill's registry metadata omitted required env vars and primary credential. That may be a packaging oversight, but it also makes it easy to miss the need for sensitive keys. Verify the author/source before adding credentials — prefer setting DASHSCOPE_API_KEY and OSS credentials via environment variables rather than leaving a secrets/bailian.env file in the repository.
- Mitigations: Run the code in an isolated environment, inspect/modify scripts if you want uploads to go to a private location, and test with throwaway credentials or a test OSS bucket first. If you need the skill but don't want it to upload anything, pass only URL references to avoid local uploads.
Functional analysis
Type: OpenClaw Skill
Name: bailian-studio
Version: 0.2.0
The bailian-studio skill bundle provides legitimate integration with Aliyun's Bailian (DashScope) services for OCR, Text-to-Speech, and image generation. The code follows standard practices for API interaction, using the official 'dashscope' and 'oss2' SDKs, and handles credentials through environment variables or a local configuration file (scripts/env.py). No evidence of data exfiltration, malicious command execution, or prompt injection was found; the use of subprocess.run in scripts/tts_speak.py is limited to playing audio via ffplay with safe argument handling.
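The "safe argument handling" noted for the ffplay call usually means passing an argument list to subprocess.run without a shell. A minimal sketch of that pattern follows; the function names are hypothetical and the exact flags used by scripts/tts_speak.py are not shown in the source, so the ones here (-nodisp, -autoexit, -loglevel) are an assumption.

```python
import shutil
import subprocess


def build_ffplay_command(audio_path):
    """Build the argument list for playing one file with ffplay, without a shell."""
    return ["ffplay", "-nodisp", "-autoexit", "-loglevel", "error", str(audio_path)]


def play_audio(audio_path):
    """Play the file if ffplay is on PATH; return False when it is unavailable."""
    if shutil.which("ffplay") is None:
        return False
    # A list argv with the default shell=False means the path is never
    # interpreted by a shell, so spaces or ';' in it are passed literally.
    subprocess.run(build_ffplay_command(audio_path), check=True)
    return True
```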
Capability assessment
Purpose & Capability
The skill's name/description (Bailian Studio via DashScope for OCR/TTS/image generation) matches the contained scripts: image_generate.py, ocr_text.py, tts_speak.py and an OSS uploader. Requiring OSS credentials is coherent because local reference images are uploaded to OSS before being sent to DashScope. However the registry metadata claims "no required env vars / primary credential", which is inconsistent with the code that requires DASHSCOPE_API_KEY and OSS_* values.
Instruction Scope
SKILL.md instructs the agent to read environment variables or a secrets/bailian.env file, install listed Python deps, and run the included CLIs. The runtime instructions and code are scoped to the declared features: contacting DashScope API, uploading to configured OSS, downloading generation results, and playing/writing WAV for TTS. The scripts do not attempt to read arbitrary unrelated system files or send data to unexpected third-party endpoints beyond DashScope, OSS endpoints, and user-supplied image URLs.
Install Mechanism
There is no install spec — this is instruction + bundled code. Dependencies are standard Python packages listed in requirements.txt (dashscope, oss2, requests). No remote downloads or archive extraction are used. The skill does require an external binary (ffplay/ffmpeg) for TTS playback per README/SKILL.md; the registry metadata didn't declare binary requirements.
Credentials
The code requires sensitive credentials: DASHSCOPE_API_KEY (mandatory via get_dashscope_key, raises if missing) and OSS_ACCESS_KEY, OSS_SECRET_KEY, OSS_BUCKET, OSS_ENDPOINT, OSS_REGION (get_oss_config raises if any missing). Those are appropriate for the skill's functionality, but the registry metadata declares no required env vars or primary credential — a mismatch that could mislead users. Also note that giving OSS credentials allows the skill to upload local files (potentially sensitive) to the configured bucket and produce public URLs.
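Because the registry metadata does not declare these variables, a quick preflight check before running any script can help. The sketch below is illustrative only: missing_settings is a hypothetical helper, not the skill's get_dashscope_key or get_oss_config, and the idea that OSS values are needed only for local-file flows is an assumption.

```python
REQUIRED_OSS_KEYS = (
    "OSS_ACCESS_KEY",
    "OSS_SECRET_KEY",
    "OSS_BUCKET",
    "OSS_ENDPOINT",
    "OSS_REGION",
)


def missing_settings(settings, need_oss=True):
    """Return the names of required settings that are absent or empty.

    `settings` is any mapping, for example os.environ. Set need_oss=False
    for flows assumed not to upload local files (URL-only inputs).
    """
    required = ["DASHSCOPE_API_KEY"]
    if need_oss:
        required.extend(REQUIRED_OSS_KEYS)
    return [name for name in required if not settings.get(name)]
```

Calling missing_settings(os.environ) and printing the result gives one readable list of what is still unset, rather than one runtime error at a time.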
Persistence & Privilege
The skill does not request permanent 'always' inclusion, does not modify other skills or system-wide settings, and has no install hooks. Autonomous invocation remains possible (platform default) but there is no extra persistence or privilege escalation in the package itself.
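How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install bailian-studio
- Once installed, invoke the Skill by name or trigger it with /bailian-studio
- Provide the inputs described in the Skill's parameter documentation to get structured output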
Version history
v0.2.0
Add qwen-image-2.0-pro text-to-image and image-to-image support, local PNG output, requirements, docs, and tests
v0.0.2
Add TTS speak via Bailian; playback with ffplay; docs updated.
v0.0.1
Initial release of bailian-studio.
- First feature: OCR text extraction via DashScope.
- Supports both local images (uploaded to OSS) and image URLs.
- Simple configuration via environment variables or secrets file.
- Python dependencies: dashscope (>=1.22.2), oss2.
- Usage scripts and example configuration included.
FAQ
What is Bailian Studio?
Call Aliyun Bailian via DashScope; support OCR, TTS, text-to-image and image-to-image. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 328 downloads to date.
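How do I install Bailian Studio?
Run the command "/install bailian-studio" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra setup, but the scripts still require DASHSCOPE_API_KEY and the OSS credentials described under Config.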
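Is Bailian Studio free?
Yes. Bailian Studio is free and released under the MIT-0 license; you may download, install, and use it freely.
Which platforms does Bailian Studio support?
Bailian Studio is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Bailian Studio?
It is developed and maintained by yab (@yab). The current version is v0.2.0.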