Firecrawl Scrape Cn

by yang1002378395-cmyk · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
827 Downloads · 0 Stars · 1 Active Installs · 1 Versions
Install in OpenClaw
/install firecrawl-scrape-cn
Description
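Extract clean Markdown content from any URL, including JS-rendered SPAs. Use this Skill when the user provides a URL and wants its content, or says "scrape", "scrape a web page", "fetch the page", "extract from URL", or "read the web page". Supports JS-rendered pages and multiple concurrent URLs, and returns LLM-optimized Markdown.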
README (SKILL.md)

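Firecrawl Web Scraping

Scrape one or more URLs and return clean, LLM-optimized Markdown. Multiple URLs are scraped concurrently.

When to Use

  • You have a specific URL and want its content
  • The page is static or JS-rendered (SPA)
  • Step 2 of the workflow escalation pattern: search → scrape → map → crawl → interact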

Quick Start

# Basic Markdown extraction
firecrawl scrape "<url>" -o .firecrawl/page.md

# Main content only, no navigation or footer
firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md

# Wait for JS rendering before scraping
firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md

# Multiple URLs (each is saved to .firecrawl/)
firecrawl scrape https://example.com https://example.com/blog https://example.com/docs

# Get Markdown and links
firecrawl scrape "<url>" --format markdown,links -o .firecrawl/page.json

# Ask a question about the page content
firecrawl scrape "https://example.com/pricing" --query "What is the enterprise plan price?"

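Options

Option                     Description
-f, --format <formats>     Output formats: markdown, html, rawHtml, links, screenshot, json
-Q, --query <prompt>       Ask a question about the page content (5 credits)
-H                         Include HTTP headers in the output
--only-main-content        Strip navigation, footer, and sidebar; main content only
--wait-for <ms>            Wait for JS rendering before scraping
--include-tags <tags>      Include only these HTML tags
--exclude-tags <tags>      Exclude these HTML tags
-o, --output <path>        Output file path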

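Tips

  • Prefer a plain scrape over --query. Scrape to a file, then use grep, head, or read the Markdown directly; you can search and reason over the full content yourself. Use --query only when you want a single targeted answer without saving the page (it costs an extra 5 credits).
  • Try scrape before interact. Scrape handles both static pages and JS-rendered SPAs. Escalate to interact only when you need interaction (clicks, form filling, pagination).
  • Multiple URLs are scraped concurrently. Check your concurrency limit with firecrawl --status.
  • A single format outputs raw content. Multiple formats (e.g. --format markdown,links) output JSON.
  • Always quote URLs: the shell interprets ? and & as special characters.
  • Naming convention: .firecrawl/{site}-{path}.md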
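
The naming, quoting, and output-format tips above can be combined into a small helper. The following is a minimal sketch, not part of the firecrawl CLI: the function names firecrawl_outfile and firecrawl_ext are hypothetical, and the extensions chosen for single formats other than markdown are an assumption, since the source only states that multiple formats produce JSON.

```shell
#!/bin/sh
# Hypothetical helpers for this sketch; they are NOT firecrawl commands.

# Map a URL to .firecrawl/{site}-{path}.<ext> per the naming convention.
# The body runs in a subshell "( ... )" so its variables stay local.
firecrawl_outfile() (
  url=$1
  ext=${2:-md}
  rest=${url#*://}        # drop the scheme (https://)
  rest=${rest%%\?*}       # drop the query string
  rest=${rest%%\#*}       # drop the fragment
  site=${rest%%/*}        # host part
  case $rest in
    */*) path=${rest#*/} ;;
    *)   path= ;;
  esac
  path=${path%/}                              # drop a trailing slash
  path=$(printf '%s' "$path" | tr '/' '-')    # remaining slashes become dashes
  if [ -n "$path" ]; then
    printf '.firecrawl/%s-%s.%s\n' "$site" "$path" "$ext"
  else
    printf '.firecrawl/%s.%s\n' "$site" "$ext"
  fi
)

# Pick a file extension for a --format value.
# Multiple formats produce JSON (from the tips); the rest is an assumption.
firecrawl_ext() {
  case $1 in
    *,*|json)     echo json ;;
    markdown)     echo md ;;
    html|rawHtml) echo html ;;
    *)            echo txt ;;
  esac
}

# Single quotes keep ? and & away from the shell.
url='https://example.com/docs/api?lang=en&page=2'
out=$(firecrawl_outfile "$url" "$(firecrawl_ext markdown)")
echo "$out"   # prints .firecrawl/example.com-docs-api.md

# With the CLI installed, the scrape-then-search workflow would look like:
#   firecrawl scrape "$url" --only-main-content -o "$out"
#   grep -n -i 'rate limit' "$out" | head -n 20
```

The scrape and grep lines are left commented out so the sketch runs without the CLI or network access; 'rate limit' is only a placeholder search term.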

See Also


Chinese translation | Original: firecrawl-scrape version: "1.0.0"

Usage Guidance
This skill is coherent for scraping web pages into Markdown, but before installing or invoking it:
  1. Prefer a vetted binary. If you must use 'npx firecrawl', understand that npx will download and run code from the npm registry at runtime; only run it if you trust the package and maintainer.
  2. Consider pre-installing the firecrawl CLI from a trusted release (and verify signatures/checksums) instead of using npx.
  3. Scraped content will be written to .firecrawl/ in the agent environment; review filesystem permissions and any sensitive data that might be captured.
  4. Be mindful of legal and robots.txt/crawling rules for target sites, and avoid providing credentials to the scraper unless necessary.
  5. If you need higher assurance, request the skill source or an install spec pointing to an official release host (GitHub release or package registry info) before enabling.
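
One way to act on the guidance about preferring a pre-installed binary and verifying checksums is sketched below. It is an illustration under stated assumptions: verify_sha256, pick_firecrawl, and the ALLOW_NPX variable are made up for this example, sha256sum is assumed to be available (GNU coreutils; on macOS, shasum -a 256 is the usual substitute), and the expected checksum would have to come from a release source you trust.

```shell
#!/bin/sh
# Hypothetical helpers for this sketch; names and ALLOW_NPX are assumptions.

# Succeed only if the file's SHA-256 matches the published value.
verify_sha256() (
  file=$1
  expected=$2
  actual=$(sha256sum "$file" 2>/dev/null | cut -d ' ' -f 1)
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
)

# Prefer an installed firecrawl binary; fall back to npx only on explicit opt-in,
# because npx downloads and runs code from the npm registry at runtime.
pick_firecrawl() {
  if command -v firecrawl >/dev/null 2>&1; then
    echo "firecrawl"
  elif [ "${ALLOW_NPX:-0}" = "1" ]; then
    echo "npx firecrawl"
  else
    echo "firecrawl CLI not found; install a trusted release or set ALLOW_NPX=1" >&2
    return 1
  fi
}

# Example usage (commented out; the file name and checksum are placeholders):
#   verify_sha256 ./firecrawl-release.tgz "<published sha256>" || exit 1
#   cli=$(pick_firecrawl) || exit 1
#   $cli scrape "https://example.com" -o .firecrawl/example.com.md
```

Making the npx fallback opt-in keeps the runtime download a deliberate choice rather than a silent default.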
Capability Assessment
Purpose & Capability
Name/description describe scraping arbitrary URLs (including JS-rendered SPA) into Markdown. The SKILL.md only asks the agent to invoke a 'firecrawl' CLI (or 'npx firecrawl'), save outputs to .firecrawl/, and optionally query results — these are appropriate for the stated purpose.
Instruction Scope
Runtime instructions are limited to running the firecrawl CLI with flags, writing output files (e.g., .firecrawl/page.md), and optionally asking a query. The docs do not instruct the agent to read unrelated system files, environment variables, or exfiltrate agent state. They advise quoting URLs and using grep/head, which is expected.
Install Mechanism
There is no install spec (instruction-only), which is low-risk, but the SKILL.md and allowed-tools explicitly reference 'npx firecrawl'. Using npx will fetch and execute a package from the npm registry at runtime — a legitimate delivery method for CLI tools but a moderate risk if the package/source is not verified. The skill does not declare an authoritative package source or checksum.
Credentials
The skill requests no environment variables, no credentials, and no config paths. This is proportionate for a web-scraping CLI that operates against arbitrary URLs provided by the user.
Persistence & Privilege
always:false and normal agent invocation; the skill writes its own output files under .firecrawl/ which is reasonable. It does not request permanent system-wide presence or modify other skills' configs.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install firecrawl-scrape-cn
  3. After installation, invoke the skill by name or use /firecrawl-scrape-cn
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
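firecrawl-scrape-cn v1.0.0
  • Initial release: Chinese documentation and usage instructions for the firecrawl scrape tool.
  • Supports extracting clean Markdown from any URL, including JS-rendered pages and concurrent multi-URL scraping.
  • Lists the main usage patterns, options, and common use cases to help Chinese-speaking users get started quickly.
  • Provides reference links to related firecrawl skills, along with usage tips.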
Metadata
Slug firecrawl-scrape-cn
Version 1.0.0
License MIT-0
All-time Installs 3
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is Firecrawl Scrape Cn?

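Firecrawl Scrape Cn extracts clean Markdown content from any URL, including JS-rendered SPAs. Use this Skill when the user provides a URL and wants its content, or says "scrape", "scrape a web page", "fetch the page", "extract from URL", or "read the web page". It supports JS-rendered pages and multiple concurrent URLs, and returns LLM-optimized Markdown. It is an AI Agent Skill for Claude Code / OpenClaw, with 827 downloads so far.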

How do I install Firecrawl Scrape Cn?

Run "/install firecrawl-scrape-cn" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Firecrawl Scrape Cn free?

Yes, Firecrawl Scrape Cn is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Firecrawl Scrape Cn support?

Firecrawl Scrape Cn is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Firecrawl Scrape Cn?

It is built and maintained by yang1002378395-cmyk (@yang1002378395-cmyk); the current version is v1.0.0.
