# Videolink To Article

Extract subtitles from a Bilibili or YouTube video URL and process them into a clean, structured Markdown transcript. Handles tool installation, subtitle download, metadata extraction, and a deterministic-then-interpretive cleanup pipeline that preserves the speaker's original wording.

## When To Use

Trigger on Bilibili (`bilibili.com/video/BV...`, `b23.tv/...`) or YouTube (`youtube.com/watch?v=...`, `youtu.be/...`) URLs combined with requests for transcripts, articles, subtitle extraction, or restructuring spoken content into Markdown. Do not trigger for: download-only requests, audio-only ASR (no subtitles available; use a Whisper-based skill), or platforms other than Bilibili and YouTube.

## Routing

| URL pattern | Tool |
|---|---|
| `bilibili.com/video/BV...`, `b23.tv/...` | BBDown |
| `youtube.com/watch?v=...`, `youtu.be/...` | yt-dlp |
| Other | Decline |

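The routing table can be read as a small classifier. The sketch below is one possible reading, not part of the skill's scripts: the function name `route_url` and the exact host and path checks are assumptions, and it accepts only the four URL shapes the table lists.

```python
from urllib.parse import parse_qs, urlparse


def route_url(url: str) -> str:
    """Return "BBDown", "yt-dlp", or "decline" for a video URL."""
    # Tolerate URLs pasted without a scheme, e.g. "youtu.be/<id>".
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    path = parsed.path

    def is_host(domain: str) -> bool:
        # Match the domain itself or any subdomain (www., m., ...).
        return host == domain or host.endswith("." + domain)

    if is_host("bilibili.com") and path.startswith("/video/BV"):
        return "BBDown"
    if is_host("b23.tv") and len(path) > 1:
        return "BBDown"
    if is_host("youtube.com") and path == "/watch" and "v" in parse_qs(parsed.query):
        return "yt-dlp"
    if is_host("youtu.be") and len(path) > 1:
        return "yt-dlp"
    return "decline"
```

Anything that returns `"decline"` corresponds to the Other row of the table.
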
## Directory Conventions

| Placeholder | Purpose | Lifetime |
|---|---|---|
| `<TOOLS_DIR>` | Persistent binary cache (BBDown, yt-dlp, helper scripts). | Cross-session. |
| `<WORK_DIR>` | Per-task outputs (SRT, metadata.json, transcript). | Single task. |

Choosing `<TOOLS_DIR>` (apply in order, stop at first match):
- Reuse the agent's existing binary cache directory (`binaries/`, `bin/`, `tools/` under per-user data) as `<that-dir>/videolink-to-article/`.
- Otherwise use the OS default. Windows: `~/bin/videolink-to-article/`; macOS: `~/Library/Application Support/videolink-to-article/bin/`; Linux: `~/.local/bin/videolink-to-article/`.
- Never use a workspace folder, system temp, or cache-style dir: tools must survive across sessions.
- After install, register `<TOOLS_DIR>` on the user's persistent PATH (see `references/install_tools.md` § Persisting tools to PATH). Without this, every new session re-runs the install probe.

`<WORK_DIR>` is per-task. Reasonable picks: workspace-relative `output/` or `transcripts/`, or a task-named subfolder under temp. Safe to clean up after delivery.

---

## Step 1: Verify Tools

For Bilibili check `BBDown.exe`; for YouTube check `yt-dlp.exe`. Probe order: system PATH → `<TOOLS_DIR>/<exe>` → any path the user has explicitly mentioned. If missing, install per `references/install_tools.md` (use the standalone `yt-dlp.exe`, not a venv shim) and end the install with PATH registration.

BBDown does a startup check for ffmpeg. If ffmpeg is missing, pass `--ffmpeg-path "<any-existing-exe>"` (e.g. `<TOOLS_DIR>/yt-dlp.exe`). Subtitle-only mode never actually invokes ffmpeg, but BBDown refuses to start unless the path resolves.

YouTube from Mainland China usually requires `--proxy http://127.0.0.1:<port>`.

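The probe order in this step (system PATH, then `<TOOLS_DIR>/<exe>`, then user-mentioned paths) could be sketched as follows. The helper name `find_tool` and its parameters are hypothetical; the skill's own probe logic lives in `references/install_tools.md` and may differ. A `None` result is the signal to run the install procedure.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def find_tool(exe_name: str, tools_dir: Path,
              user_paths: Iterable[Path] = ()) -> Optional[Path]:
    """Return the first existing location of exe_name, or None if missing."""
    # 1. System PATH.
    on_path = shutil.which(exe_name)
    if on_path:
        return Path(on_path)
    # 2. The persistent tools directory.
    candidate = tools_dir / exe_name
    if candidate.is_file():
        return candidate
    # 3. Paths the user mentioned: either the file itself or its folder.
    for hint in user_paths:
        hint = Path(hint)
        if hint.is_file() and hint.name.lower() == exe_name.lower():
            return hint
        if (hint / exe_name).is_file():
            return hint / exe_name
    return None
```
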
---

## Step 2: Identify Video (fetch metadata)

```bash
python scripts/fetch_metadata.py "<URL>" --bbdown "<bbdown.exe>" --ytdlp "<yt-dlp.exe>" --ffmpeg-path "<any-existing-exe>"
```

Save stdout JSON to `<WORK_DIR>/<video_id>/metadata.json`. Fields: `platform`, `video_id`, `title`, `uploader`, `uploader_url`, `duration_seconds`, `publish_date`, `url`. Notes:
- Bilibili `uploader` may be `null` (BBDown's `--only-show-info` doesn't print the UP name); extract it from the title or ask the user.
- Exit `3` = title parse failed → upstream tool's output format changed.

Use `<video_id>` to name the per-task subdirectory under `<WORK_DIR>`.

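Saving the fetcher's output can include a quick sanity check on the field list above. This is a minimal sketch, assuming every listed field appears as a key in the JSON (with `null` when unknown); the function name `save_metadata` is hypothetical and not one of the skill's scripts.

```python
import json
from pathlib import Path

REQUIRED_FIELDS = (
    "platform", "video_id", "title", "uploader",
    "uploader_url", "duration_seconds", "publish_date", "url",
)


def save_metadata(stdout_text: str, work_dir: Path) -> Path:
    """Validate the fetcher's JSON and write <work_dir>/<video_id>/metadata.json."""
    meta = json.loads(stdout_text)
    missing = [name for name in REQUIRED_FIELDS if name not in meta]
    if missing:
        raise ValueError(f"metadata is missing fields: {missing}")
    if not meta["video_id"]:
        raise ValueError("video_id is empty; cannot name the task directory")
    # uploader may legitimately be null for Bilibili; it is filled in later.
    task_dir = work_dir / str(meta["video_id"])
    task_dir.mkdir(parents=True, exist_ok=True)
    out_path = task_dir / "metadata.json"
    out_path.write_text(json.dumps(meta, ensure_ascii=False, indent=2),
                        encoding="utf-8")
    return out_path
```
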
---

## Step 3: Probe Available Subtitles

Always list before downloading.

**Bilibili:**

```powershell
BBDown.exe "<URL>" --only-show-info --sub-only --skip-ai false `
  --ffmpeg-path "<any-exe>" *> bbdown_info.log 2>&1
Get-Content bbdown_info.log -Encoding UTF8
```

> ⚠️ **`--skip-ai false` is required** every time: BBDown's default skips AI subtitles silently. The double negative is intentional; verify the argument is present.

If the log shows "需要登录" ("login required"), run `BBDown.exe login` first (see `references/auth.md`).

**YouTube:**

```bash
yt-dlp --list-subs --skip-download "<URL>"
```

If the command fails with an authentication error (e.g. "Sign in to confirm you're not a bot", "Sign in to confirm your age", HTTP 403), follow this auth resolution flow **in order**, stopping at the first success:

1. **Try `--cookies-from-browser`** (fastest if it works):
   ```powershell
   yt-dlp --cookies-from-browser edge --list-subs --skip-download "<URL>"
   ```
   If this fails with `Could not copy Chrome cookie database`, close all browser instances and retry. If it fails with `Failed to decrypt with DPAPI` (App-Bound Encryption on Chromium ≥ v127), **skip to step 2**: this error is unrecoverable.

2. **Ask the user for a `cookies.txt` file.** This is the most reliable method. Present the following guidance to the user:

   > YouTube requires a logged-in session to fetch subtitles, but reading the browser cookies automatically failed.
   >
   > Please export a `cookies.txt` file as follows:
   > 1. Install the Chrome extension [Get cookies.txt LOCALLY](https://chromewebstore.google.com/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc) (open source, runs locally, no telemetry).
   > 2. Open the target YouTube video page in the browser (make sure you are logged in).
   > 3. Click the extension icon → **Export** and save `cookies.txt` locally.
   > 4. Tell me the path to the cookies.txt file.

   Once the user provides the file path, pass it to yt-dlp:
   ```bash
   yt-dlp --cookies "<cookies.txt_path>" --list-subs --skip-download "<URL>"
   ```

   Save the cookies path for reuse in Step 4.

Three caption kinds, ranked by quality:

| Entry | Meaning | Quality |
|---|---|---|
| `Available subtitles` | Human-uploaded | Highest |
| `Available automatic captions` → `<lang>-orig` | Source-language ASR | High (no translation step) |
| `Available automatic captions` → `<lang>` | Auto-translation of source ASR | Lower (machine translation) |

**Picking rule.** Prefer human captions; otherwise prefer `<lang>-orig`; only fall back to translated auto-captions when neither exists. Trap: if you see e.g. `zh-Hans` and `en-orig` but no `zh-Hans-orig`, the speaker is **not** speaking Chinese; `zh-Hans` is a translation of the English ASR. Always download `<lang>-orig` and work from it; do not pre-emptively prefer a translated track because it matches the user's interface language.

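The picking rule can be expressed as a short function over the two lists that `--list-subs` prints. This is a sketch under assumptions: the language codes are taken as already parsed into `manual` and `auto`, the name `pick_caption_track` is hypothetical, and the tie-break among several human tracks (prefer the one matching the `-orig` language) is a guess at the intent rather than a stated rule.

```python
from typing import List, Optional, Tuple


def pick_caption_track(manual: List[str],
                       auto: List[str]) -> Optional[Tuple[str, str]]:
    """Return (kind, lang_code) for the best track, or None if nothing is listed."""
    # An "-orig" automatic track reveals the spoken language.
    orig = [code for code in auto if code.endswith("-orig")]
    source_lang = orig[0][: -len("-orig")] if orig else None

    if manual:
        # Assumed tie-break: prefer the human track in the spoken language.
        for code in manual:
            if source_lang and code.split("-")[0] == source_lang.split("-")[0]:
                return ("manual", code)
        return ("manual", manual[0])
    if orig:
        return ("auto-orig", orig[0])
    if auto:
        # Last resort: a machine translation of the source ASR.
        return ("auto-translated", auto[0])
    return None
```
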
---

## Step 4: Download Subtitles

Layout: `<WORK_DIR>/<video_id>/{metadata.json, *.srt, sentences.txt}`.

**Bilibili:**

```powershell
BBDown.exe "<URL>" --sub-only --skip-ai false `
  --work-dir "<WORK_DIR>/<video_id>" --ffmpeg-path "<any-exe>"
```

Output: `<videoTitle>.<lang>.srt` (e.g. `*.ai-zh.srt`, `*.ai-en.srt`).

**YouTube:**

```bash
yt-dlp --write-subs --write-auto-subs \
  --sub-langs "<source-lang>-orig,<source-lang>" \
  --convert-subs srt --skip-download --ignore-no-formats-error \
  -o "<WORK_DIR>/<video_id>/%(id)s.%(ext)s" "<URL>"
```

If Step 3 resolved auth via `--cookies-from-browser`, add the same flag here. If a `cookies.txt` file was provided, add `--cookies "<cookies.txt_path>"`. Example with cookies:

```bash
yt-dlp --cookies "<cookies.txt_path>" \
  --write-subs --write-auto-subs \
  --sub-langs "<source-lang>-orig,<source-lang>" \
  --convert-subs srt --skip-download --ignore-no-formats-error \
  -o "<WORK_DIR>/<video_id>/%(id)s.%(ext)s" "<URL>"
```

Replace `<source-lang>` with the actual language code from Step 3; do not hardcode a default. The `-o` template uses `%(id)s` (not `%(title)s`) on purpose: titles vary in punctuation and special characters, and yt-dlp's title sanitization differs from BBDown's. Using `id` keeps file naming uniform across both platforms, and the human-readable title prefix is added later by Step 8 (`rename_with_title.py`). **`--ignore-no-formats-error` is required**: without it, yt-dlp aborts on `Requested format is not available` *before* writing queued subtitle files (common on bot-flagged sessions). Add `--proxy http://127.0.0.1:<port>` when blocked.

**Fallback for srv3.** If yt-dlp writes `*.srv3` instead of `*.srt` (its converter pipeline can't always handle srv3 without ffmpeg), run `python scripts/srv3_to_text.py <input.srv3> <out_basename>` to produce `.srt` and `.txt` directly. Stdlib-only.

For auth-required videos (member-only, age-restricted, "Sign in to confirm you're not a bot"), see `references/auth.md`.

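When the YouTube download is driven from a script, the optional flags (cookies, proxy) are easier to keep straight if the argument list is built in one place. The sketch below only assembles the command shown in this step; the function name `build_ytdlp_subtitle_args` and its parameters are hypothetical, and the resulting list is meant to be handed to something like `subprocess.run(args, check=True)`.

```python
from typing import List, Optional


def build_ytdlp_subtitle_args(url: str, source_lang: str, out_dir: str,
                              cookies_file: Optional[str] = None,
                              cookies_browser: Optional[str] = None,
                              proxy: Optional[str] = None) -> List[str]:
    """Assemble the subtitle-only yt-dlp command as an argument list."""
    if not source_lang:
        # No hardcoded default: the code must come from the Step 3 listing.
        raise ValueError("source_lang must come from the Step 3 listing")
    args = ["yt-dlp"]
    # Reuse whichever auth method succeeded in Step 3.
    if cookies_file:
        args += ["--cookies", cookies_file]
    elif cookies_browser:
        args += ["--cookies-from-browser", cookies_browser]
    if proxy:
        args += ["--proxy", proxy]
    args += [
        "--write-subs", "--write-auto-subs",
        "--sub-langs", f"{source_lang}-orig,{source_lang}",
        "--convert-subs", "srt",
        "--skip-download", "--ignore-no-formats-error",
        "-o", f"{out_dir}/%(id)s.%(ext)s",
        url,
    ]
    return args
```
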
---

## Step 5: Validate the SRT

```powershell
Get-Content "<file>.srt" -Encoding UTF8 -TotalCount 30
```

| Symptom | Cause | Action |
|---|---|---|
| 0-byte file | Lang not actually available | Retry Step 3/4 with another lang code |
| Timestamps all `00:00:00,000` | Corrupt subtitle | Re-download |
| Garbled / mojibake | Wrong encoding | Re-read with UTF-8 |
| Body is `[音乐]` / `[Music]` only | No real captions | Stop; tell the user |
| Segment count ≪ video length | Truncated | Retry |

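Most rows of the symptom table can be checked mechanically before reading the file by eye. The following is a rough sketch, not one of the skill's scripts: `diagnose_srt` is a hypothetical name, the one-segment-per-minute threshold is an arbitrary assumption, and a clean UTF-8 decode does not rule out every kind of mojibake.

```python
import re
from typing import List, Optional

TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")
BRACKET_ONLY = re.compile(r"^\s*\[[^\]]*\]\s*$")


def diagnose_srt(raw: bytes, duration_seconds: Optional[float] = None) -> List[str]:
    """Return the symptoms found in an SRT file's bytes (empty list = looks fine)."""
    if not raw:
        return ["0-byte file"]
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return ["not valid UTF-8 (garbled)"]
    stamps = TIMESTAMP.findall(text)
    if not stamps:
        return ["no parseable segments"]
    problems = []
    if all(start == end == "00:00:00,000" for start, end in stamps):
        problems.append("all timestamps are zero")
    # Keep only caption text: drop blank lines, cue numbers, and timestamp lines.
    body = [ln for ln in text.splitlines()
            if ln.strip() and not ln.strip().isdigit() and "-->" not in ln]
    if body and all(BRACKET_ONLY.match(ln) for ln in body):
        problems.append("only bracketed markers such as [Music]")
    # Assumed threshold: fewer than one segment per minute looks truncated.
    if duration_seconds and len(stamps) < duration_seconds / 60:
        problems.append("segment count far below video length")
    return problems
```
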
---

## Step 6: Normalize SRT (deterministic)

```bash
python scripts/srt_to_sentences.py <input.srt> <WORK_DIR>/<video_id>/sentences.txt [--target-len 60] [--max-len 120]
```

Defaults work for typical conversational video. Dense expository → raise `--max-len` (e.g. 180). Staccato delivery → lower `--target-len` (e.g. 40). Exit `2` = empty / no parseable segments → SRT is invalid; revisit Step 5.

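To make the two length parameters concrete, here is one plausible way segments could be merged into sentence-like lines. This is an illustrative sketch only: the actual algorithm in `srt_to_sentences.py` is not shown in this document and may differ, `merge_segments` is a hypothetical name, and joining with a space suits space-delimited languages but not Chinese or Japanese.

```python
import re
from typing import List

SENTENCE_END = re.compile(r"[.!?。！？…]$")


def merge_segments(segments: List[str], target_len: int = 60,
                   max_len: int = 120) -> List[str]:
    """Greedily merge caption segments into sentence-like lines."""
    lines: List[str] = []
    current = ""
    for seg in segments:
        seg = seg.strip()
        if not seg:
            continue
        # Flush first if adding this segment would exceed the hard cap.
        if current and len(current) + 1 + len(seg) > max_len:
            lines.append(current)
            current = ""
        current = f"{current} {seg}" if current else seg
        # Close the line once it is long enough and ends like a sentence.
        if len(current) >= target_len and SENTENCE_END.search(current):
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return lines
```
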
---

## Step 7: Interpretive Cleanup & Restructure

### Output language (hard rule)

**The transcript stays in the source language of the video.** English video → English transcript; Japanese → Japanese; Mandarin → Mandarin. **Do not silently translate** to the user's interface language, the agent's response language, or any other locale.

Translate only when the user **explicitly** asks (e.g. "整理成中文文稿", meaning "turn this into a Chinese transcript", or "translate to English"). Even then, produce the source-language version first as the primary deliverable; offer translation as an additional output.

This rule overrides any default-language hint injected into the agent's runtime context.

### Cleanup checklist

1. **Preserve original wording.** Do not paraphrase. The transcript should read like the speaker's own words, with errors fixed and filler removed.
2. **Fix ASR errors.** Scan the full transcript once, build a per-video glossary (cross-reference video title / description / channel name / web search), then apply it consistently. Apply a correction only when the error is unambiguous and the correct form is verifiable. Methodology + worked example: `references/cleanup_guide.md`.
3. **Remove filler sparingly.** Read § Words to KEEP first: many seemingly-filler words carry the speaker's stance. The lists are organized by language; consult the section matching the source language.
4. **Add lightweight headings.** Insert `##` / `###` only at obvious topic transitions the speaker explicitly signals ("第一个是…" ("The first one is…"), "Let's talk about…", "Moving on to…"). Headings stay in the source language, short, descriptive. **No structural signals → no headings.** Output continuous paragraphs.
5. **Final structure.** Use `metadata.json` from Step 2 to populate the header. Header label set follows the source language (Chinese labels for Chinese videos, English for English, etc.). Templates: `references/cleanup_guide.md` § Output Skeleton Template.

### What the deliverable must NOT contain

- "整理说明" ("cleanup notes") / "Changes applied" tables
- Footnote citations for ASR-corrected passages (`[^1]: 出自...`)
- Confidence/uncertainty annotations or `[?...]` placeholders
- Tool-internal notes about the cleanup process
- Any content the speaker did not say

When a fragment cannot be cleaned with confidence, **trim** it rather than guess or annotate. Full rule list: `references/cleanup_guide.md` § What NOT To Do.

---

## Step 8: Rename final transcript & cleanup

After Step 7 has written `transcript.md` (and any `transcript_*.md` translations), run:

```bash
python scripts/rename_with_title.py <WORK_DIR>/<video_id>
```

This renames the final transcript file(s) to include the sanitized video title as a prefix, while keeping `<video_id>` as a stable suffix. Result format: `<sanitized_title>__<video_id>.md`. The script is idempotent and safe to re-run.

**Cleanup intermediate files.** After renaming, delete all intermediate artifacts (`.srt`, `.srv3`, `sentences.txt`, `.info.json`, `metadata.json`, etc.); only the final renamed transcript(s) should remain in the output directory. The agent should deliver only the final transcript file to the user.

```powershell
# Example cleanup (Windows PowerShell): keep only the renamed transcript(s)
Get-ChildItem "<WORK_DIR>/<video_id>" -File | Where-Object {
    $_.Name -notlike "*__*.md"
} | Remove-Item -Force
```

**Why rename only the transcript.** Since intermediate files (SRT, sentences.txt, metadata.json) are cleaned up and not delivered, the rename logic only needs to handle the final `transcript.md` and any translation variants (`transcript_*.md`). This keeps the output directory clean with just the deliverable(s).

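The `<sanitized_title>__<video_id>.md` format can be illustrated with a small helper. The sanitization rules below (replace characters that Windows forbids in file names, collapse whitespace, cap the length at 80) are assumptions made for this sketch; the rules actually used by `rename_with_title.py` are not described in this document, and `final_transcript_name` is a hypothetical name.

```python
import re
import unicodedata

# Characters that are invalid in Windows file names, plus control characters.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def final_transcript_name(title: str, video_id: str,
                          max_title_len: int = 80) -> str:
    """Build "<sanitized_title>__<video_id>.md" from metadata fields."""
    cleaned = unicodedata.normalize("NFC", title)
    cleaned = INVALID_CHARS.sub(" ", cleaned)
    # Collapse runs of whitespace and drop leading/trailing spaces and dots.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    cleaned = cleaned[:max_title_len].rstrip(" .")
    if not cleaned:
        cleaned = "untitled"
    return f"{cleaned}__{video_id}.md"
```
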
---

## Troubleshooting

`references/troubleshooting.md` is a recovery matrix organized by failure category (tools & environment, network & download, authentication, subtitle availability, encoding & display). Always retry once before reporting transient network issues.

---

## Resources

**scripts/**
- `fetch_metadata.py`: unified Bilibili/YouTube metadata fetcher (JSON to stdout). Exit 1 (URL invalid) / 2 (tool missing) / 3 (parse failed).
- `srt_to_sentences.py`: deterministic SRT preprocessor; merges short segments into sentence-like lines. Exit 2 on empty/invalid input.
- `srv3_to_text.py`: fallback converter when yt-dlp's `--convert-subs srt` leaves you with raw srv3 XML. Stdlib-only. Exit 2 on empty input.
- `rename_with_title.py`: Step 8 helper; renames the final transcript file(s) with a sanitized video title prefix. Stdlib-only, idempotent. Exit 2 (metadata.json missing) / 3 (metadata.json missing fields).

**references/**
- `install_tools.md`: install procedures for BBDown / yt-dlp (Windows + POSIX), GitHub mirrors, ffmpeg workaround, PATH registration.
- `auth.md`: authentication flows (BBDown QR-code login, yt-dlp `--cookies-from-browser`, manual `cookies.txt` export) and App-Bound Encryption pitfalls.
- `cleanup_guide.md`: methodology for Step 7 (ASR error identification, Words to KEEP vs filler, sectioning heuristics, output skeleton).
- `worked_example.md`: one full walkthrough on a real video. Read once when learning; skip when you just need rules.
- `troubleshooting.md`: recovery matrix.
---

videolink-to-article is an AI Agent Skill for Claude Code / OpenClaw, built and maintained by HQWQF (@hqwqf). The current version is v1.0.0, licensed under MIT-0.