Web to Markdown

Name: Web to Markdown
Author: chdlc

By Christian de la Cruz · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install web-to-md

Description

Extracts readable markdown from user-provided URLs via a deterministic fallback chain (markdown.new → r.jina.ai). Use when the user supplies specific URLs an...

Usage (SKILL.md)

Web to Markdown

Deterministic, console-first extraction workflow for user-provided URLs. Enforces a fixed fallback chain to maximize content quality without open-ended browsing.

When to Use

The user provides one or more specific URLs.
The task requires reading, extracting, summarizing, or analyzing those URLs.
A deterministic fallback order is preferred over open-ended browsing.

Do not use for open-ended web discovery unless the user explicitly asks for discovery first.

Fallback Chain

For each URL, attempt in order. Stop at the first sufficient result.

1. markdown.new (AI mode)

curl -s "https://markdown.new/{URL}?method=ai"

2. markdown.new (Auto mode)

Only if step 1 is insufficient or timed out:

curl -s "https://markdown.new/{URL}?method=auto"

3. r.jina.ai (Browser engine)

Only if steps 1–2 are insufficient or timed out:

curl -s "https://r.jina.ai/{URL}" -H "X-Engine: browser"

4. Agent tools (last resort)

If all three prefixes fail, report the failure and fall back to the agent's own extraction tools. This is outside the skill's chain — acknowledge it as a fallback.
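
The chain above can be sketched as a small POSIX shell helper. This is a minimal sketch, not the skill's own implementation: the function names `step_url` and `fetch_with_fallback`, the 30-second `--max-time` value, and the plain 1,200-character check standing in for the full quality gate are all assumptions made for illustration. Only the three endpoints, the `method` parameter, and the `X-Engine: browser` header come from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print the request URL for chain step 1, 2, or 3.
step_url() {
  case "$1" in
    1) printf '%s\n' "https://markdown.new/$2?method=ai" ;;
    2) printf '%s\n' "https://markdown.new/$2?method=auto" ;;
    3) printf '%s\n' "https://r.jina.ai/$2" ;;
    *) return 1 ;;
  esac
}

# Hypothetical driver: try each step in order and stop at the first
# result that passes the stand-in length check.
fetch_with_fallback() {
  url=$1
  for step in 1 2 3; do
    target=$(step_url "$step" "$url") || return 1
    if [ "$step" -eq 3 ]; then
      body=$(curl -s --max-time 30 -H "X-Engine: browser" "$target")
    else
      body=$(curl -s --max-time 30 "$target")
    fi
    # Assumed stand-in for the quality gate: at least 1200 characters.
    if [ "${#body}" -ge 1200 ]; then
      printf '%s\n' "$body"
      return 0
    fi
    echo "step $step insufficient for $url, trying next fallback" >&2
  done
  echo "all three prefixes failed for $url; fall back to agent tools" >&2
  return 1
}
```

A call such as `fetch_with_fallback "https://example.com/article"` would print the first sufficient result to standard output and log each fallback transition to standard error.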

Quality Gate

After each step, content is insufficient when any condition is true:

Main article or body text is missing
Content is clearly truncated
Output is mostly navigation, boilerplate, placeholders, or login walls
Useful text is too short for the task
Important sections requested by the user are absent

Rule of thumb: An article page that yields fewer than ~1,200 useful characters is almost certainly truncated. Naturally short pages (announcements, status updates) may be legitimately brief, so use judgment.
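
The length heuristic can be approximated in a few lines of shell. This sketch covers only the "too short" condition; the other conditions (missing body, boilerplate, login walls) still need judgment. The function name `is_long_enough` and the choice to collapse whitespace runs before counting are assumptions, not part of the skill.

```shell
# Hypothetical check: succeed (exit 0) when the text has at least MIN
# characters after whitespace runs are collapsed. MIN defaults to 1200.
is_long_enough() {
  text=$1
  min=${2:-1200}
  # Collapse runs of whitespace so padding does not inflate the count.
  compact=$(printf '%s' "$text" | tr -s '[:space:]' ' ')
  [ "${#compact}" -ge "$min" ]
}
```

Passing a smaller second argument, for example `is_long_enough "$body" 200`, is one way to relax the threshold for pages that are expected to be short.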

URL Handling

Preserve the protocol when present.
Ensure the URL is shell-safe and quoted in all curl commands.
Process each URL independently when multiple are provided.
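
These rules might look as follows in shell. This is a sketch under stated assumptions: `has_protocol` and `process_urls` are hypothetical names, and because the rules above do not say what to do when the protocol is absent, the sketch only flags such URLs instead of guessing a scheme.

```shell
# Hypothetical predicate: succeed when the URL starts with http:// or https://.
has_protocol() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical loop: handle each URL independently. Every expansion is
# double-quoted so characters such as & and ? are not interpreted by the shell.
process_urls() {
  for u in "$@"; do
    if has_protocol "$u"; then
      printf 'ok\t%s\n' "$u"
    else
      printf 'no-protocol\t%s\n' "$u"
    fi
  done
}
```

Iterating over `"$@"` keeps each argument intact even when a URL contains spaces or shell metacharacters.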

Provenance Reporting

Report exactly one final source label per extracted URL in your response:

Label	When
`markdown.new:ai`	method=ai was sufficient
`markdown.new:auto`	method=auto was sufficient (ai failed)
`r.jina.ai`	r.jina.ai was sufficient (both markdown.new failed)
`agent-tools`	All three prefixes failed; agent used own tools
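
The table can be encoded as a lookup from the step that produced the accepted result to its label. The function name `provenance_label` and the numbering of agent tools as step 4 are assumptions for illustration; the label strings are taken from the table.

```shell
# Hypothetical mapping: step number (1-4) -> provenance label from the table.
provenance_label() {
  case "$1" in
    1) echo "markdown.new:ai" ;;
    2) echo "markdown.new:auto" ;;
    3) echo "r.jina.ai" ;;
    4) echo "agent-tools" ;;
    *) return 1 ;;
  esac
}
```

A report line could then be produced with something like `printf '%s [source: %s]\n' "$url" "$(provenance_label 2)"`.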

Workflow

Scope gate — Only process URLs explicitly provided by the user. If discovery is needed, use web search first and confirm candidate URLs before extraction.
Normalize — Quote URLs, preserve protocol.
Extract — Run the fallback chain per URL.
Quality gate — Check each result against the insufficiency conditions.
Continue — Use the richest sufficient source for the task.
Report — Include provenance labels in the final response.

Best Practices

Keep extraction deterministic — explicit fallback transitions, state why each happened.
Prefer reproducible commands with quoted URLs.
Conservative timeout handling: continue immediately to the next fallback when blocked.
Preserve source traceability via provenance labels.
Avoid tool-specific assumptions beyond curl and standard HTTP endpoints.

Edge Cases

Page blocks automated access: Skip to next fallback immediately.
Multiple URLs: Apply the same sequence to each independently.
Naturally short pages: Accept shorter content when it satisfies the request.
All prefixes fail: Report failure clearly, then use agent tools as last resort.

Common Pitfalls

Output format must be markdown. If any level returns raw HTML or another format, it breaks the contract. Test each level independently.
Don't skip testing lower fallback levels just because the top level works. A chain is only as reliable as its weakest link.
Quality is subjective — the 1,200-char heuristic is a guideline, not a hard rule. Apply judgment for short-form content.

Verification Checklist

curl is installed (which curl)
Extraction starts with markdown.new?method=ai
method=auto is tried only after ai fails
r.jina.ai is tried only after both markdown.new attempts fail
All three prefixes failing → report + fall back to agent tools
Quality checks include: missing body, truncation, boilerplate, too-short content
Final response includes provenance label per URL

Security Recommendations

Install this only if you are comfortable with provided URLs being sent to markdown.new or r.jina.ai. Avoid using it with private intranet links, authenticated pages, signed links, URLs containing tokens or secrets, or links whose existence should remain confidential.
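
One way to act on this advice is a pre-flight check that runs before any URL is sent to a third-party service. The sketch below is illustrative only: `looks_sensitive` is a hypothetical helper, the pattern list is an assumption and far from exhaustive, and a URL that passes it is not thereby guaranteed safe to share.

```shell
# Hypothetical pre-flight check: succeed (exit 0) when the URL matches a
# pattern suggesting it should not be sent to an external service.
looks_sensitive() {
  case "$1" in
    # Loopback and common private address ranges (assumed, not exhaustive).
    *://localhost*|*://127.*|*://10.*|*://192.168.*) return 0 ;;
    # Query parameters that often carry secrets or signatures (assumed).
    *token=*|*signature=*|*api_key=*|*password=*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller might skip extraction and ask the user for confirmation whenever `looks_sensitive "$url"` succeeds.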

Capability Assessment

✓ Purpose & Capability

The stated purpose is URL-to-markdown extraction, and the artifact's capabilities are limited to curl requests against markdown.new, r.jina.ai, and last-resort agent extraction tools.

ℹ Instruction Scope

The workflow is scoped to URLs explicitly provided by the user and includes provenance reporting, but it does not add a clear warning or consent step for sensitive, private, authenticated, internal, or token-bearing URLs.

✓ Install Mechanism

The package contains only SKILL.md and declares curl as a required binary; there are no executable scripts, dependency installs, setup hooks, or hidden files in the artifact.

ℹ Credentials

Network use is expected and proportionate for web extraction, but the URLs are sent to third-party services and may reveal confidential link contents or query parameters.

✓ Persistence & Privilege

No persistence, background execution, credential access, privilege escalation, local indexing, file mutation, or destructive behavior is present.

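How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install web-to-md
Once installed, invoke the skill by name or trigger it with /web-to-md.
Provide the inputs described in the skill's parameter documentation to receive structured output.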

Version History

v1.0.0

Initial release. Deterministic fallback chain: markdown.new → r.jina.ai → agent tools.

Metadata

Slug web-to-md

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

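FAQ

What is Web to Markdown?

Extracts readable markdown from user-provided URLs via a deterministic fallback chain (markdown.new → r.jina.ai). Use when the user supplies specific URLs an... It is an AI agent skill plugin for Claude Code / OpenClaw, with 56 total downloads to date.

How do I install Web to Markdown?

Run the command "/install web-to-md" in the OpenClaw or Claude Code chat box to install it in one step. No additional configuration is required.

Is Web to Markdown free?

Yes. Web to Markdown is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Web to Markdown support?

Web to Markdown is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Web to Markdown?

It is developed and maintained by Christian de la Cruz (@chdlc). The current version is v1.0.0.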
