lsa03

Amber Url To Markdown

by Amber03 · GitHub ↗ · v4.0.3 · MIT-0
cross-platform ⚠ suspicious
216 Downloads · 1 Star · 1 Active Install · 9 Versions
Install in OpenClaw
/install amber-url-to-markdown
Description
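Smart URL-to-Markdown tool (V4.0 extensible architecture). **Supports an auto-trigger Hook**: when a user sends a URL, the skill fetches the content and converts it to Markdown. It uses an extensible, per-site handler architecture and supports Doubao, WeChat Official Accounts, Zhihu, Juejin, and other sites.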
README (SKILL.md)

Amber Url to Markdown

Smart URL-to-Markdown tool that supports 7+ site types, including WeChat Official Accounts, Zhihu, Juejin, CSDN, GitHub, and Medium.

🚀 Auto-trigger (new in V3.1)

After installing the skill, enable the Hook to get fully automatic triggering.

Enable the auto-trigger Hook

# 1. List the available Hooks
openclaw hooks list

# 2. Enable the url-auto-fetch Hook
openclaw hooks enable url-auto-fetch

# 3. Check the Hook status
openclaw hooks check

Once the Hook is enabled, the skill fetches automatically when a user sends either of the following:

  1. A bare URL message

    https://mp.weixin.qq.com/s/xxx
    
  2. A URL plus an intent keyword

    帮我把这篇文章转成 Markdown:https://mp.weixin.qq.com/s/xxx   ("Convert this article to Markdown: ...")
    解析这个链接:https://zhuanlan.zhihu.com/p/xxx   ("Parse this link: ...")
    

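How the Hook works

  • Listens for the message:received event
  • Detects URLs in the message
  • Checks whether the message is a bare URL or contains an intent keyword
  • Calls the amber_url_to_markdown.py script
  • Runs asynchronously, so message handling is not blocked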
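
The trigger decision described above can be sketched in Python as follows. This is a minimal illustration, not the Hook's actual code: the function name `should_trigger` and the URL pattern are assumptions, and the keyword list is the one this README gives for manual triggering.

```python
import re
from typing import Optional

# Intent keywords as listed in this README; how the real Hook matches them is an assumption.
INTENT_KEYWORDS = (
    "解析", "转换", "转成", "转为", "生成", "抓取", "爬取", "下载",
    "markdown", "md", "文章", "内容",
)

# Deliberately simple URL pattern (hypothetical; the real Hook may use a different one).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def should_trigger(message: str) -> Optional[str]:
    """Return the first URL if the message should trigger a fetch, else None."""
    match = URL_RE.search(message)
    if match is None:
        return None
    url = match.group(0)
    # Everything in the message other than the URL itself.
    remainder = (message[:match.start()] + message[match.end():]).strip()
    if not remainder:
        return url  # bare URL message
    lowered = remainder.lower()
    if any(keyword in lowered for keyword in INTENT_KEYWORDS):
        return url  # URL plus an intent keyword
    return None
```

Substring matching on short keywords such as `md` is permissive; a stricter matcher would reduce accidental triggers.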

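🎯 Manual trigger (AI invocation)

If the Hook is not enabled, the AI invokes this skill on its own when it detects either of the following:

  1. Bare URL message - the user sent only a URL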

    https://mp.weixin.qq.com/s/xxx
    
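  2. URL + intent keyword - the message contains a URL and any of these keywords:

    • 解析 (parse), 转换 (convert), 转成 (convert into), 转为 (turn into), 生成 (generate), 抓取 (fetch), 爬取 (crawl), 下载 (download)
    • markdown, md, 文章 (article), 内容 (content)
    帮我把这篇文章转成 Markdown:https://mp.weixin.qq.com/s/xxx   ("Convert this article to Markdown: ...")
    解析这个链接:https://zhuanlan.zhihu.com/p/xxx   ("Parse this link: ...")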
    

AI behavior convention: when either condition above is detected, automatically run python3 scripts/amber_url_to_markdown.py <URL> and report the result.

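Core features

  • Smart detection: identifies the link type and picks the best fetching strategy
  • Multi-site support: WeChat Official Accounts, Zhihu, Juejin, CSDN, GitHub, Medium, and more
  • Image download: downloads all images to a local images directory
  • Full formatting: preserves Markdown structure, including headings, lists, and code blocks
  • Automatic fallback: three fetching strategies, tried in order, to improve the success rate (Playwright → Scrapling → API)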

📦 Installation

1. Install the skill

# Install from ClawHub
clawhub install amber-url-to-markdown

# Or clone manually
git clone https://github.com/OrangeViolin/amber-url-to-markdown.git

2. Install dependencies

# Install the core Python libraries
pip install playwright beautifulsoup4 markdownify requests scrapling html2text

# Install the Playwright browser (required)
playwright install chromium

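3. Enable the auto-trigger Hook (optional but recommended)

# List the available Hooks
openclaw hooks list

# Enable the url-auto-fetch Hook
openclaw hooks enable url-auto-fetch

# Restart the Gateway
openclaw gateway restart

With the Hook enabled, URLs sent by users are fetched automatically, with no manual AI invocation needed.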

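Usage

Feishu chat (recommended)

The AI detects and fetches links automatically. Either of the following triggers it:

  1. Bare URL message (detected automatically by the AI):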

    https://mp.weixin.qq.com/s/xxx
    
  2. URL + a statement of intent (more explicit):

    帮我把这篇文章转成 Markdown:https://mp.weixin.qq.com/s/xxx   ("Convert this article to Markdown: ...")
    解析这个链接:https://zhuanlan.zhihu.com/p/xxx   ("Parse this link: ...")
    

AI behavior convention: when a bare URL or a URL plus an intent keyword is detected, automatically run the script and report the result.

Command line

python3 scripts/amber_url_to_markdown.py <URL>

Python usage

from amber_url_to_markdown import fetch_url_to_markdown

# Fetch the page and convert it; the result includes the path of the saved file
result = fetch_url_to_markdown("https://mp.weixin.qq.com/s/xxx")
print(f"File saved: {result['file']}")

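Supported sites

| Site | Type | Status |
| --- | --- | --- |
| WeChat Official Accounts | wechat | ✅ Fully supported |
| Zhihu | zhihu | ✅ Supported |
| Juejin | juejin | ✅ Supported |
| CSDN | csdn | ✅ Supported |
| GitHub | github | ✅ Supported |
| Medium | medium | ✅ Supported |
| Generic web pages | general | ✅ Supported |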
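
As an illustration of how a URL might be mapped to one of the type labels listed above, here is a hedged sketch. The function name `classify_url` and the hostname mapping are assumptions: only mp.weixin.qq.com and zhuanlan.zhihu.com appear in this README, and the other hostnames are guesses at each site's usual domain. The skill's real handlers may classify differently.

```python
from urllib.parse import urlparse

# Hostname suffix -> type label. The labels come from the list above; the
# hostnames other than mp.weixin.qq.com and zhihu.com are assumptions.
SITE_TYPES = {
    "mp.weixin.qq.com": "wechat",
    "zhihu.com": "zhihu",
    "juejin.cn": "juejin",
    "csdn.net": "csdn",
    "github.com": "github",
    "medium.com": "medium",
}


def classify_url(url: str) -> str:
    """Return the site type for a URL, falling back to 'general'."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, site_type in SITE_TYPES.items():
        # Match the domain itself or any subdomain, but not look-alike hosts.
        if host == suffix or host.endswith("." + suffix):
            return site_type
    return "general"
```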

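Output directory

New directory layout:

/root/openclaw/urltomarkdown/
├── Article Title 1.md              # MD files are saved directly in the root directory
├── Article Title 2.md
└── images/
    └── knowledge_YYYYMMDD_HHMMSS/  # images are grouped by timestamp
        ├── img_001.jpg
        └── img_002.jpg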

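Naming rules:

  • MD file: {article title}.md - saved directly in the /root/openclaw/urltomarkdown/ root directory
  • Image directory: images/knowledge_{timestamp}/ - timestamp format: YYYYMMDD_HHMMSS
  • Image file: img_{index:03d}.jpg (e.g. img_001.jpg)
  • Image reference in the Markdown: images/knowledge_{timestamp}/img_001.jpg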
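
The naming rules above can be expressed as a small path-planning helper. This is a sketch under assumptions: `plan_output` is a hypothetical name, the title is used verbatim (the README does not say how titles are sanitized), and nothing is written to disk.

```python
from datetime import datetime
from pathlib import PurePosixPath

OUTPUT_ROOT = PurePosixPath("/root/openclaw/urltomarkdown")


def plan_output(title: str, image_count: int, now: datetime) -> dict:
    """Compute the Markdown path and relative image references for one article."""
    stamp = now.strftime("%Y%m%d_%H%M%S")  # YYYYMMDD_HHMMSS
    image_dir = PurePosixPath("images") / f"knowledge_{stamp}"
    # Image indexes start at 1 and are zero-padded to three digits.
    image_refs = [str(image_dir / f"img_{i:03d}.jpg") for i in range(1, image_count + 1)]
    return {
        "markdown_file": str(OUTPUT_ROOT / f"{title}.md"),
        "image_refs": image_refs,  # relative paths, as referenced from the Markdown
    }
```

For example, `plan_output("Article", 2, datetime(2024, 1, 2, 3, 4, 5))` yields image references under `images/knowledge_20240102_030405/`.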

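Advantages:

  • MD files are kept in one place, which makes them easy to find and organize
  • Images are grouped per article, which avoids name collisions
  • The images directory is created only for articles that contain images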

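Fallback strategy

  1. Playwright (preferred) - headless browser; supports all sites; most stable
  2. Scrapling (fallback) - fast and lightweight; supports all sites
  3. Third-party API (last resort) - supports WeChat Official Accounts only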
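
The ordered fallback above can be sketched as a generic chain that tries each fetcher in turn. This is an illustration only: `fetch_with_fallback` and the `Fetcher` type are hypothetical names, and the real Playwright, Scrapling, and API fetchers are represented by whatever callables the caller supplies.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes a URL and returns HTML, or None/"" when it gets nothing usable.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, fetchers: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) pair in order; return (name, html) from the first success."""
    errors = []
    for name, fetch in fetchers:
        try:
            html = fetch(url)
        except Exception as exc:  # a failing strategy must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if html:
            return name, html
        errors.append(f"{name}: empty result")
    raise RuntimeError("all fetchers failed: " + "; ".join(errors))
```

Returning the name of the strategy that succeeded makes it easy to log which fallback level was reached.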

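Technical highlights

  • Uses a Playwright headless browser to simulate real visits
  • Parses HTML with BeautifulSoup
  • Converts to standard Markdown with markdownify
  • Smart wait strategy (networkidle) to avoid hangs
  • Detailed log output to help with troubleshooting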

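Notes

  • A stable network connection is required
  • Some sites require login; for example, Tencent Docs and Yuque are not supported
  • Images are downloaded locally by default and referenced with relative paths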
Usage Guidance
This skill is functionally consistent with a URL→Markdown scraper, but has several operational and privacy concerns. Before installing or enabling the auto-trigger Hook:

  1. Review the included scripts (especially fetcher/handlers and the Hook handler) in a safe environment to ensure nothing sends data to unexpected remote endpoints.
  2. Avoid pasting cookies or other sensitive tokens into code; prefer using a browser persistent context created via manual login if needed, and store any credentials securely (not hard-coded).
  3. Restrict the Hook triggers (enable only pure-URL or limit allowed domains) so it doesn't automatically fetch internal or private links.
  4. Install and run Playwright and all dependencies in an isolated/sandboxed environment (container or VM), and inspect the files written to /root/openclaw/urltomarkdown and the doubao_user_data directory.
  5. If you plan to enable automatic Hook execution, test with non-sensitive, public URLs first and monitor logs/output.

These actions will reduce the risk of accidental credential exposure or unwanted automatic scraping.
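
One way to limit the Hook to allowed domains, as recommended above, is a hostname allowlist checked before any fetch. The sketch below is an assumption about how such a check could look; `is_fetch_allowed` and `ALLOWED_DOMAINS` are hypothetical names and are not part of the skill.

```python
from urllib.parse import urlparse

# Example allowlist (hypothetical); populate it with the domains you actually trust.
ALLOWED_DOMAINS = frozenset({"mp.weixin.qq.com", "zhuanlan.zhihu.com"})


def is_fetch_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Allow only http(s) URLs whose host is an allowlisted domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # rejects file://, ftp://, and similar schemes
    host = (parsed.hostname or "").lower()
    if not host:
        return False
    return any(host == domain or host.endswith("." + domain) for domain in allowed)
```

Because internal hosts and literal IP addresses are not on the allowlist, links such as `http://192.168.1.10/admin` are rejected without a separate private-network check.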
Capability Analysis
Type: OpenClaw Skill
Name: amber-url-to-markdown
Version: 4.0.3

The skill bundle is classified as suspicious due to a shell injection vulnerability in 'hooks/url-auto-fetch/handler.ts', where user-supplied URLs are inadequately sanitized before being executed via 'child_process.exec'. Additionally, 'scripts/amber_url_to_markdown.py' contains a fallback feature that transmits URLs to an external third-party API (down.mptext.top), which could lead to unintentional data disclosure. These issues represent significant security and privacy risks, although clear evidence of malicious intent is absent.
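
The injection risk described above comes from interpolating a URL into a shell command string. A common mitigation is to pass the URL as a single argument without invoking a shell. The sketch below shows the idea in Python; `build_command` and `run_fetch` are hypothetical helpers, not the skill's code. The actual Hook is written in TypeScript, where the analogous change would be to use `child_process.execFile` with an argument array instead of `child_process.exec`.

```python
import subprocess
from typing import List
from urllib.parse import urlparse

SCRIPT = "scripts/amber_url_to_markdown.py"  # script path as given in the README


def build_command(url: str) -> List[str]:
    """Validate the URL and return an argv list; the URL stays one argument."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"refusing to fetch non-http(s) URL: {url!r}")
    return ["python3", SCRIPT, url]


def run_fetch(url: str, timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run the script without a shell, so metacharacters in the URL are never interpreted."""
    return subprocess.run(
        build_command(url),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
```

Since no shell parses the command, a URL containing `;`, `$(...)`, or backticks reaches the script as literal text.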
Capability Assessment
Purpose & Capability
The skill's name/description (URL → Markdown, auto-trigger Hook) matches the code and docs: handlers, fetchers, parser, and a Hook handler are present. However, metadata claims 'no required binaries/env/config', while SKILL.md and hook docs clearly require python3, various Python packages, and Playwright (chromium). The manifest omission (no required binaries/config paths) is an inconsistency to be aware of.
Instruction Scope
Runtime instructions and Hook docs instruct the agent/system to listen to message:received, detect URLs, and asynchronously run python3 scripts that fetch pages, download images, persist browser contexts, and save output under /root/openclaw/... . The Hook description explicitly documents using child_process.exec to invoke scripts. The skill also documents manual cookie injection for protected sites (doubao), which instructs the user to copy full Cookie headers into code — this is sensitive behavior and expands scope to handling secrets. The Hook is configured to trigger on all messages matching URL patterns/keywords unless users customize whitelists.
Install Mechanism
No install spec is provided to the platform (instruction-only), which means nothing is automatically written during install — lower platform install risk. However, the repository includes many Python scripts and the SKILL.md requires installing Python packages and Playwright and running 'playwright install chromium'. That imposes significant runtime dependencies that must be installed manually; the lack of declared required binaries in registry metadata is a mismatch.
Credentials
The skill declares no required env vars or config paths, but the code/docs reference writing persistent browser context and cookies to /root/openclaw/skills/.../doubao_user_data/ and outputs to /root/openclaw/urltomarkdown/. The DOUBAO_SETUP instructs users to paste full Cookie header values directly into script headers (sensitive credentials). Although no cloud credentials are requested, the practice of storing session cookies and instructing users to inject them into scripts is disproportionate and risky for secret handling.
Persistence & Privilege
always:false and no special platform privileges are requested. The skill uses the Hooks system to run asynchronously on message events; autonomous invocation is allowed by default and this skill's Hook design increases its blast radius (it can automatically fetch arbitrary URLs from messages). The skill writes files and persistent browser state under /root/openclaw; that is normal for a local scraping tool but users should be aware that data (including cookies) will be stored on disk.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install amber-url-to-markdown
  3. After installation, invoke the skill by name or use /amber-url-to-markdown
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v4.0.3
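Code review found no issues; officially released to ClawHub. Confirmed that the Hook auto-trigger works correctly and that all dependencies and configuration are correct.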
v4.0.2
No file changes detected in this version. - No functional or documentation changes. - Version incremented for maintenance or procedural reasons only.
v3.2.3
Amber Url to Markdown 3.2.3 - Updated documentation in SKILL.md, removing sections on author, license, and tags for a cleaner overview. - Minor content reorganization in documentation; core functionality and usage remain unchanged. - No user-facing logic changes in extraction or conversion scripts.
v3.2.2
- Improved scripts/parser.py logic (details not specified). - Updated _meta.json metadata. - No changes to the user-facing SKILL.md documentation in this version.
v3.2.1
Amber URL to Markdown 3.2.1 - Major refactor introducing a new extensible handler-based architecture (V4) under scripts/handlers/ - Added support for Doubao site in addition to WeChat, Zhihu, Juejin, and other platforms - New documentation files: setup, limitations, release notes, and architectural notes for V4 - Main script and modules updated to integrate the modular handler structure - Keeps original features (auto-trigger via Hook, multi-site markdown conversion, local image download) with improved extensibility
v3.2.0
Added the url-auto-fetch Hook, enabling true automatic triggering.
v3.1.0
Version 3.1.0 introduces major refactoring and optimizations. - Added modular core scripts: async_fetcher.py, config.py, fetcher.py, pagination.py, parser.py, and utils.py for better maintainability and extensibility. - Introduced automated testing with tests/test_amber_url_to_markdown.py and a test runner script. - Added Python packaging files (pyproject.toml, requirements.txt) for improved dependency management. - Enhanced documentation and change logs with new markdown files. - Refactored and optimized main and third-party scripts for performance and reliability improvements.
v2.2.0
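Amber Url to Markdown v2.2.0 - Added an AI trigger convention: when a user sends a bare URL or a URL plus an intent keyword, the skill runs automatically and reports the result. - Clarified the trigger conditions and auto-execution rules for easier integration into chat and automation environments. - Improved the output directory layout: Markdown files are stored together, images are grouped by timestamp, and reference paths were adjusted. - Fully updated the documentation with trigger scenarios, output conventions, and usage.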
v2.1.0
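Amber_Url_to_Markdown v2.1.0 - Added Markdown content fetching for CSDN and Medium. - Improved the dependency notes and installation guide. - Reorganized the documentation to spell out the output directory and the fallback fetching strategy. - Supports both command-line and Python invocation. - Highlights local image download and full preservation of Markdown formatting.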
Metadata
Slug amber-url-to-markdown
Version 4.0.3
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 9
Frequently Asked Questions

What is Amber Url To Markdown?

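A smart URL-to-Markdown tool (V4.0 extensible architecture). **It supports an auto-trigger Hook**: when a user sends a URL, the skill fetches the content and converts it to Markdown. It uses an extensible, per-site handler architecture and supports Doubao, WeChat Official Accounts, Zhihu, Juejin, and other sites. It is an AI Agent Skill for Claude Code / OpenClaw, with 216 downloads so far.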

How do I install Amber Url To Markdown?

Run "/install amber-url-to-markdown" in the OpenClaw or Claude Code chat to install the skill. Per the README, the Python dependencies and Playwright's Chromium browser must then be installed manually.

Is Amber Url To Markdown free?

Yes, Amber Url To Markdown is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Amber Url To Markdown support?

Amber Url To Markdown is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Amber Url To Markdown?

It is built and maintained by Amber03 (@lsa03); the current version is v4.0.3.
