Description

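Handles Bilibili video subtitle problems: speech-to-text, subtitle download, and content analysis. Built in response to real errors in Bilibili's subtitle system, it provides a complete solution.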

README (SKILL.md)

🎬 Bilibili Video Transcription Expert

Name: Bilibili Video Transcriber
Author: adolescen-he

Handles Bilibili video subtitle problems, with speech-to-text, subtitle download, and content analysis.

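📋 Features

✅ Core features

Smart subtitle handling: detects the state of Bilibili's subtitle system automatically and picks the best approach
Speech-to-text: high-accuracy speech recognition with Whisper models
China mirror support: uses China-based mirrors automatically to work around network problems
Error handling: detects subtitle-association errors automatically and falls back to speech-to-text
Batch processing: processes multiple Bilibili videos in one run

🔧 Technical highlights

Bypasses Bilibili's subtitle system: works on the audio directly, avoiding subtitle-association errors
Multiple models: choose among Whisper base/small/medium
Cookie management: cookie-file management with automatic refresh
Progress display: live download and transcription progress
Result validation: automatically checks that the transcript is relevant to the video title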

🚀 Quick start

1. Install dependencies

# Install the skill package
clawhub install bilibili-transcriber-pro

# Or install the dependencies manually
pip install bilibili-api requests pydub faster-whisper

2. Configure the cookie

# Create the cookie file
echo "SESSDATA=xxx; bili_jct=xxx; buvid3=xxx; DedeUserID=xxx" > ~/.bilibili_cookie.txt
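
The cookie file above holds a single `name=value; name=value` line. A minimal sketch of how such a line might be parsed and checked, assuming that format; `parse_cookie_line`, `missing_fields`, and the `REQUIRED` tuple are hypothetical and not part of the skill's API:

```python
def parse_cookie_line(line: str) -> dict[str, str]:
    """Split a 'k1=v1; k2=v2' cookie line into a dict (hypothetical helper)."""
    cookies: dict[str, str] = {}
    for part in line.strip().split(";"):
        part = part.strip()
        if not part or "=" not in part:
            continue  # skip empty or malformed fragments
        # Split on the first '=' only, so values may themselves contain '='.
        name, _, value = part.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


# Assumption: these two fields are the minimum worth checking for.
REQUIRED = ("SESSDATA", "bili_jct")


def missing_fields(cookies: dict[str, str]) -> list[str]:
    """Return the required cookie names that are absent or empty."""
    return [name for name in REQUIRED if not cookies.get(name)]
```

Running `missing_fields(parse_cookie_line(text))` before a long download is a cheap way to catch a truncated or mistyped cookie file early.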

3. Basic usage

# Process a single video
bilibili-transcribe BV1txQGByERW

# Specify a cookie file
bilibili-transcribe BV1txQGByERW --cookie ~/.bilibili_cookie.txt

# Batch processing
bilibili-transcribe --batch bv_list.txt

📖 Detailed usage

Command-line tool

# Show help
bilibili-transcribe --help

# Process a video and save the results
bilibili-transcribe BV1txQGByERW --output ./results

# Use a specific model
bilibili-transcribe BV1txQGByERW --model medium

# Download audio only
bilibili-transcribe BV1txQGByERW --audio-only

# Check subtitle status
bilibili-transcribe BV1txQGByERW --check-only

Python API

from bilibili_transcriber import BilibiliTranscriber

# Initialize
transcriber = BilibiliTranscriber(
    cookie_file="~/.bilibili_cookie.txt",
    model="base",
    use_china_mirror=True
)

# Process a video
result = transcriber.process(
    bvid="BV1txQGByERW",
    output_dir="./output"
)

# Batch processing
results = transcriber.process_batch(
    bvids=["BV1txQGByERW", "BV1xxxxxxx"],
    output_dir="./batch_output"
)

🛠️ Configuration options

Configuration file `~/.config/bilibili_transcriber/config.yaml`

# Cookie settings
cookie:
  file: "~/.bilibili_cookie.txt"
  auto_refresh: true
  refresh_interval: 86400  # 24 hours

# Model settings
model:
  name: "base"  # base/small/medium
  device: "cpu"  # cpu/cuda
  compute_type: "int8"
  language: "zh"

# Network settings
network:
  hf_endpoint: "https://hf-mirror.com"
  timeout: 30
  retry_times: 3

# Output settings
output:
  default_dir: "./bilibili_transcripts"
  save_audio: true
  save_subtitles: true
  format: "txt"  # txt/json/markdown

# Validation settings
validation:
  keyword_match_threshold: 0.3
  min_transcript_length: 50
  check_duration_match: true
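
How the tool combines this file with its built-in defaults is not documented here. One common approach is a recursive overlay, sketched below on the assumption that the YAML has already been loaded into a plain dict; `DEFAULTS` and `merge_config` are hypothetical names, and only two of the sections above are reproduced:

```python
import copy

# Subset of the documented defaults, reproduced for illustration only.
DEFAULTS = {
    "model": {"name": "base", "device": "cpu", "compute_type": "int8", "language": "zh"},
    "network": {"hf_endpoint": "https://hf-mirror.com", "timeout": 30, "retry_times": 3},
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay `overrides` on a deep copy of `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright.
            merged[key] = copy.deepcopy(value)
    return merged


user_config = {"model": {"name": "medium"}, "network": {"timeout": 60}}
config = merge_config(DEFAULTS, user_config)
```

With this approach a user file only needs to list the keys it changes; everything else keeps its default value, and `DEFAULTS` itself is never mutated.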

📊 Output formats

1. Text format (`transcript.txt`)

[0.00s -> 3.90s] 兄弟们HermesAgent刚刚发布了更新4.13
[3.90s -> 5.76s] 那么这一次最大的一个升级呢
[5.76s -> 9.00s] 是它带来了本地的外部控制面板
...
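
The `[start -> end] text` layout shown above is easy to read back for post-processing. A minimal sketch, assuming each segment occupies one line in exactly this layout; `Segment` and `parse_transcript` are hypothetical names, not part of the skill's API:

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


# Matches lines such as "[0.00s -> 3.90s] some text".
LINE_RE = re.compile(r"^\[(\d+(?:\.\d+)?)s -> (\d+(?:\.\d+)?)s\]\s*(.*)$")


def parse_transcript(text: str) -> list[Segment]:
    """Parse transcript.txt content into a list of timed segments."""
    segments = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the pattern (e.g. "...") are ignored
            start, end, body = match.groups()
            segments.append(Segment(float(start), float(end), body))
    return segments
```

From the parsed segments it is straightforward to compute total speech time, search by timestamp, or convert to another subtitle format.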

2. JSON format (`transcript.json`)

{
  "video_info": {
    "bvid": "BV1txQGByERW",
    "title": "HermesAgent突然上WebUI了！这一波，体验直接拉满",
    "duration": 210,
    "up": "磊哥聊AI"
  },
  "transcript": [
    {
      "start": 0.0,
      "end": 3.9,
      "text": "兄弟们HermesAgent刚刚发布了更新4.13",
      "confidence": 0.95
    },
    ...
  ],
  "metadata": {
    "model": "base",
    "language": "zh",
    "processing_time": 45.2
  }
}
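
The per-segment `confidence` field makes it possible to filter the JSON output before using it. A minimal sketch, assuming the field names shown above; `render_segments` is a hypothetical helper, and the sample document is shortened, made-up data rather than real tool output:

```python
import json


def render_segments(doc: dict, min_confidence: float = 0.0) -> list[str]:
    """Format JSON segments as text lines, dropping low-confidence ones."""
    lines = []
    for seg in doc.get("transcript", []):
        # Segments without a confidence value are kept (treated as 1.0).
        if seg.get("confidence", 1.0) < min_confidence:
            continue
        lines.append(f"[{seg['start']:.2f}s -> {seg['end']:.2f}s] {seg['text']}")
    return lines


# Made-up sample in the documented shape, for illustration only.
sample = json.loads("""
{
  "video_info": {"bvid": "BV1txQGByERW", "duration": 210},
  "transcript": [
    {"start": 0.0, "end": 3.9, "text": "first segment", "confidence": 0.95},
    {"start": 3.9, "end": 5.76, "text": "second segment", "confidence": 0.40}
  ]
}
""")

for line in render_segments(sample, min_confidence=0.5):
    print(line)
```

The threshold of 0.5 is arbitrary; a sensible value depends on the model and the audio quality.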

3. Markdown format (`summary.md`)

# HermesAgent突然上WebUI了！这一波，体验直接拉满

**视频信息**
- BV号: BV1txQGByERW
- 时长: 210秒
- UP主: 磊哥聊AI
- 处理时间: 2026-04-15 08:16:00

**核心内容**
1. HermesAgent 4.13版本发布
2. 新增本地WebUI控制面板
3. 支持中英文界面
4. 提供状态监控、会话管理等功能

**完整转录**
[0.00s -> 3.90s] 兄弟们HermesAgent刚刚发布了更新4.13
...

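🔍 Advanced features

1. Subtitle validation

# Validate subtitle accuracy automatically
# (assumes SubtitleValidator is already imported and that transcript_text
# and video_title are defined)
validator = SubtitleValidator()
result = validator.validate(
    transcript=transcript_text,
    video_title=video_title,
    keywords=["HermesAgent", "WebUI", "控制面板"]
)

if result["is_valid"]:
    print(f"✅ Subtitle validation passed: {result['match_rate']:.1%} match rate")
else:
    print(f"⚠️ Subtitles may be wrong: {result['match_rate']:.1%} match rate")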
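
The validator's internals are not shown in this README. As a rough illustration of what a keyword match rate could mean, the sketch below counts the fraction of keywords found in the transcript and applies the `keyword_match_threshold` (0.3) and `min_transcript_length` (50) values from the configuration file. The functions are hypothetical and are not the skill's `SubtitleValidator`:

```python
def keyword_match_rate(transcript: str, keywords: list[str]) -> float:
    """Fraction of keywords that occur in the transcript (case-insensitive)."""
    if not keywords:
        return 0.0
    haystack = transcript.lower()
    hits = sum(1 for kw in keywords if kw.lower() in haystack)
    return hits / len(keywords)


def validate(transcript: str, keywords: list[str],
             threshold: float = 0.3, min_length: int = 50) -> dict:
    """Accept a transcript that is long enough and mentions enough keywords."""
    rate = keyword_match_rate(transcript, keywords)
    return {
        "match_rate": rate,
        "is_valid": rate >= threshold and len(transcript) >= min_length,
    }


text = "HermesAgent released an update with a local WebUI dashboard for monitoring."
result = validate(text, ["HermesAgent", "WebUI", "控制面板"])
print(f"{result['match_rate']:.1%}", result["is_valid"])
```

Plain substring matching is a deliberate simplification: it needs no word segmentation, which matters for Chinese text, but it cannot tell a keyword from a longer word that merely contains it.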

2. Batch processing

# Create the video list file
echo "BV1txQGByERW" > bv_list.txt
echo "BV1xxxxxxx" >> bv_list.txt

# Batch processing
bilibili-transcribe --batch bv_list.txt --parallel 3
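
What `--parallel 3` does internally is not documented here. A bounded worker pool is one plausible reading, sketched below with the standard library; `read_bvids`, `run_batch`, and the `worker` callable are hypothetical, and skipping `#` comment lines in the list file is an assumption rather than documented behavior:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def read_bvids(text: str) -> list[str]:
    """One BV ID per line; blank lines and '#' comments are skipped (assumption)."""
    ids = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ids.append(line)
    return ids


def run_batch(bvids: list[str], worker: Callable[[str], object],
              parallel: int = 3) -> dict[str, object]:
    """Run `worker` on each BV ID with at most `parallel` concurrent tasks.

    A failure in one video is recorded and does not stop the others.
    """
    results: dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = {pool.submit(worker, bvid): bvid for bvid in bvids}
        for future in as_completed(futures):
            bvid = futures[future]
            try:
                results[bvid] = future.result()
            except Exception as exc:  # keep going; report the error per video
                results[bvid] = exc
    return results
```

Threads suit the download phase, which is I/O-bound. Transcription is CPU- or GPU-bound, so a real implementation might cap that stage separately.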

3. Result analysis

from bilibili_transcriber.analyzer import TranscriptAnalyzer

analyzer = TranscriptAnalyzer()
analysis = analyzer.analyze(transcript_text)

print(f"Total duration: {analysis['duration']} s")
print(f"Segments: {analysis['segment_count']}")
print(f"Keywords: {analysis['top_keywords']}")
print(f"Summary: {analysis['summary']}")

⚙️ Troubleshooting

Common problems

1. Expired cookie

# Fetch a fresh cookie
bilibili-transcribe --update-cookie

# Set the cookie manually
export BILIBILI_COOKIE="SESSDATA=xxx; bili_jct=xxx"

2. Network problems

# Use a proxy
bilibili-transcribe BV1txQGByERW --proxy http://127.0.0.1:7890

# Switch mirror
bilibili-transcribe BV1txQGByERW --mirror https://mirror.example.com

3. Model download failure

# Use a local model
bilibili-transcribe BV1txQGByERW --model-path ./local_models/

# Skip the model download check
bilibili-transcribe BV1txQGByERW --skip-model-check

Debug mode

# Enable verbose logging
bilibili-transcribe BV1txQGByERW --verbose

# Debug mode
bilibili-transcribe BV1txQGByERW --debug

# Keep intermediate files
bilibili-transcribe BV1txQGByERW --keep-temp

📈 Performance tuning

1. Caching

# Enable caching
transcriber = BilibiliTranscriber(
    use_cache=True,
    cache_dir="~/.cache/bilibili_transcriber",
    cache_ttl=3600  # 1 hour
)
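
How the built-in cache decides that an entry is stale is not documented here. The sketch below shows one way a TTL cache like this could work, assuming one JSON file per BV ID and expiry based on the file's modification time; `FileCache` is a hypothetical class, not the skill's implementation:

```python
import json
import time
from pathlib import Path
from typing import Optional


class FileCache:
    """Tiny file-backed cache with time-to-live expiry (illustrative only)."""

    def __init__(self, cache_dir: str, ttl: float = 3600):
        self.dir = Path(cache_dir).expanduser()  # resolve a leading "~"
        self.dir.mkdir(parents=True, exist_ok=True)
        self.ttl = ttl

    def _path(self, key: str) -> Path:
        return self.dir / f"{key}.json"

    def get(self, key: str) -> Optional[dict]:
        path = self._path(key)
        if not path.exists():
            return None
        if time.time() - path.stat().st_mtime > self.ttl:
            path.unlink()  # entry is older than the TTL: drop it
            return None
        return json.loads(path.read_text(encoding="utf-8"))

    def set(self, key: str, value: dict) -> None:
        payload = json.dumps(value, ensure_ascii=False)
        self._path(key).write_text(payload, encoding="utf-8")
```

A typical call pattern is `cache.get(bvid)` first, then transcribe and `cache.set(bvid, result)` only on a miss.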

2. Parallel processing

# Process several videos in parallel
bilibili-transcribe --batch bv_list.txt --parallel 4

# Set the number of threads
bilibili-transcribe BV1txQGByERW --threads 2

3. Resource limits

# Limit memory usage
bilibili-transcribe BV1txQGByERW --max-memory 2G

# Limit CPU usage
bilibili-transcribe BV1txQGByERW --cpu-limit 50%

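🔗 Integration examples

1. Integrating with OpenClaw

from openclaw.skills import bilibili_transcriber

# Note: the `skill` decorator is not imported in this snippet; import it
# from wherever your OpenClaw setup provides it.
@skill("bilibili-transcribe")
def handle_bilibili_transcribe(request):
    """Handle a Bilibili video transcription request."""
    bvid = request.params.get("bvid")

    # Run the transcription
    result = bilibili_transcriber.process(bvid)

    # Return the result
    return {
        "success": True,
        "data": result
    }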

2. Automated workflow

# workflow.yaml
name: Bilibili video processing pipeline
steps:
  - name: Download video
    action: bilibili-transcribe
    params:
      bvid: "{{ input.bvid }}"
      output: "./raw"

  - name: Content analysis
    action: analyze-transcript
    params:
      input: "./raw/transcript.txt"
      output: "./analysis"

  - name: Generate report
    action: generate-report
    params:
      analysis: "./analysis"
      template: "video_report.md"

📚 Use cases

Case 1: Transcribing technical tutorials

# Transcribe an AI tutorial
bilibili-transcribe BV1txQGByERW --output ./ai_tutorials

# Generate study notes
bilibili-transcribe BV1txQGByERW --format markdown --template study_note.md

Case 2: Content analysis

# Analyze the content of several videos
from bilibili_transcriber import BatchAnalyzer

analyzer = BatchAnalyzer()
results = analyzer.analyze_batch(
    bvids=["BV1txQGByERW", "BV1xxxxxxx"],
    analysis_types=["keywords", "summary", "sentiment"]
)

# Generate a comparison report
report = analyzer.generate_comparison_report(results)

Case 3: Automated monitoring

# Watch a specific uploader (UP主) for new videos
# (process_new_video is a user-supplied callback, not defined here)
from bilibili_transcriber.monitor import VideoMonitor

monitor = VideoMonitor(
    up_mid="12345678",  # uploader ID
    check_interval=3600,  # check once per hour
    callback=process_new_video
)

monitor.start()
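
The README does not describe how `VideoMonitor` recognizes a new upload. A polling monitor usually comes down to remembering which IDs it has already seen, as in the sketch below. `NewVideoDetector` and the injected `fetch` callable are hypothetical stand-ins, and no real Bilibili API call is made:

```python
from typing import Callable, Iterable


class NewVideoDetector:
    """Remember which BV IDs have been seen and report only new ones."""

    def __init__(self, fetch: Callable[[], Iterable[str]],
                 callback: Callable[[str], None]):
        self.fetch = fetch        # hypothetical: returns the uploader's latest BV IDs
        self.callback = callback  # called once per newly seen BV ID
        self.seen: set[str] = set()
        self.primed = False

    def poll(self) -> list[str]:
        latest = list(self.fetch())
        if not self.primed:
            # The first poll only records a baseline, so uploads that
            # already existed are not reprocessed.
            self.seen.update(latest)
            self.primed = True
            return []
        new = []
        for bvid in latest:
            if bvid not in self.seen:
                self.seen.add(bvid)
                self.callback(bvid)
                new.append(bvid)
        return new
```

A scheduler would call `poll()` and then sleep for `check_interval` seconds in a loop. Persisting `seen` to disk would let the detector survive restarts without reprocessing.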

🧪 Testing

Unit tests

# Run the tests
python -m pytest tests/

# Test specific features
python -m pytest tests/test_download.py
python -m pytest tests/test_transcribe.py

Integration tests

# Test the full flow
python -m pytest tests/integration/test_full_flow.py

# Use a test cookie
BILIBILI_TEST_COOKIE="test_cookie" python -m pytest

📄 License

MIT License

🤝 Contributing

Fork the project
Create a feature branch
Commit your changes
Open a pull request

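📞 Support

Bug reports: GitHub Issues
Documentation: https://github.com/yourname/bilibili-transcriber-pro
Discussion: Discord / WeChat group

Built from hands-on experience to solve errors in Bilibili's subtitle system; stable and reliable!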

Usage Guidance

This skill appears to be what it claims: a Bilibili video transcriber that downloads audio and runs a local Whisper model. Before installing or running it, consider:

1) The tool expects you to supply Bilibili cookies (SESSDATA, bili_jct, etc.); these are sensitive session credentials. Only paste/store cookies you trust and keep the cookie file permissions restricted.
2) The code and docs reference a custom mirror (https://hf-mirror.com) for model downloads; verify you trust that mirror before allowing model downloads from it (or configure an official Hugging Face endpoint).
3) The installer will create config and cookie files in your home directory and may create a symlink under ~/.local/bin; review setup.py if you want different locations.
4) Notification options (SMTP/webhook) exist in the config; do not populate them with real credentials or endpoints unless you trust the code and destination.
5) If you need higher assurance, review the included Python files (bilibili_transcriber.py, cli.py, setup.py) yourself or run them in a sandboxed environment; consider using a throwaway Bilibili account or API token rather than your main account if possible.
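
The advice to keep the cookie file's permissions restricted can be followed with the standard library alone. A minimal sketch for POSIX systems, where `write_private_file` is a hypothetical helper and not part of the skill:

```python
import os
import stat
from pathlib import Path


def write_private_file(path: str, content: str) -> Path:
    """Write `content` to `path`, readable and writable by the owner only."""
    target = Path(path).expanduser()
    # Create with mode 0o600 so a new file is never briefly world-readable.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    # A file that already existed keeps its old mode, so tighten it explicitly.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target
```

For example, `write_private_file("~/.bilibili_cookie.txt", cookie_line)` leaves the file with mode `600`. On Windows, `os.chmod` only toggles the read-only flag, so this sketch does not provide the same protection there.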

Capability Analysis

Type: OpenClaw Skill
Name: bilibili-video-transcriber
Version: 1.0.0

The skill bundle is a well-structured tool designed to transcribe Bilibili videos using the Whisper speech-to-text model. It includes a core processing module (bilibili_transcriber.py), a command-line interface (cli.py), and a comprehensive installation script (setup.py) that configures local environment settings and dependencies. The tool handles sensitive Bilibili authentication cookies stored in ~/.bilibili_cookie.txt, which is a standard requirement for accessing high-quality media on the platform. While the configuration (config.yaml) contains placeholders for webhooks and email notifications, there is no evidence of implementation logic for data exfiltration or unauthorized remote communication. The installation of a CLI symlink and the use of a common Hugging Face mirror (hf-mirror.com) are standard practices for this type of utility, and no malicious prompt injection or obfuscation was found.

Capability Assessment

✓ Purpose & Capability

Name and description (B站视频转录、字幕下载/分析) align with the included code, CLI, and dependencies (bilibili-api, faster-whisper, pydub, requests). Required binaries (python3 or ffmpeg) are reasonable for media download/processing and local Whisper usage.

ℹ Instruction Scope

SKILL.md and README instruct the agent/user to read/write a Bilibili cookie file and to use environment variables (BILIBILI_COOKIE, HF_ENDPOINT, HTTP_PROXY) and local paths (~/.bilibili_cookie.txt, ~/.config/...). Those actions are expected for authenticated Bilibili API access and model/mirror configuration, but the runtime instructions do involve handling sensitive credentials (browser cookies) and creating config files in the user's home directory.

✓ Install Mechanism

There is no remote download/install spec in the registry metadata (instruction-only skill). The repository contains normal packaging and setup scripts that install Python dependencies via pip. No evidence of downloads from untrusted shorteners or personal IPs; the only non-official host referenced is a domestic mirror (https://hf-mirror.com) used for Hugging Face endpoints.

ℹ Credentials

The skill does not declare required environment variables in the registry metadata, but the code and docs use BILIBILI_COOKIE and allow exporting HF_ENDPOINT/HTTP_PROXY. Asking for Bilibili cookies is proportionate to the stated functionality, but these are sensitive credentials and the mismatch between declared requirements (none) and the actual use of env vars should be highlighted to users.

✓ Persistence & Privilege

The package creates config files and may create a cookie file template, symlinks (~/.local/bin) and output directories under the user's home — standard for CLI tools. always: false and normal autonomous invocation settings. No code attempts to modify other skills or system-wide agent settings.

Version History

v1.0.0

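Initial release of bilibili-video-transcriber:
- Speech-to-text, subtitle download, and content analysis for Bilibili videos, working around common errors in the official subtitle system.
- Automatically detects and switches to the best transcription approach, with a choice of Whisper models.
- Supports China-based mirrors, cookie management, live progress, and result validation.
- Flexible CLI and Python API with batch processing and multiple output formats (txt/json/markdown).
- Advanced features including subtitle validation, content analysis, performance tuning, and troubleshooting for common problems.
- Complete documentation, configuration guide, test cases, and integration examples.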

Metadata

Slug bilibili-video-transcriber

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Bilibili Video Transcriber?

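Bilibili Video Transcriber handles Bilibili video subtitle problems: speech-to-text, subtitle download, and content analysis. Built in response to real errors in Bilibili's subtitle system, it provides a complete solution. It is an AI Agent Skill for Claude Code / OpenClaw, with 68 downloads so far.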

How do I install Bilibili Video Transcriber?

Run "/install bilibili-video-transcriber" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Bilibili Video Transcriber free?

Yes, Bilibili Video Transcriber is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Bilibili Video Transcriber support?

Bilibili Video Transcriber is cross-platform and runs anywhere OpenClaw / Claude Code is available (linux, darwin, win32).

Who created Bilibili Video Transcriber?

It is built and maintained by adolescen-he (@adolescen-he); the current version is v1.0.0.
