saddamtechie

Firecrawl Local

Author: SaddamTechie · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 157
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install firecrawl-local
Description
Use this skill whenever you need to scrape web pages, crawl websites, or map site structure using a self-hosted Firecrawl instance. Triggers on requests to e...
Usage (SKILL.md)

Firecrawl Local Skill

Self-hosted Firecrawl integration using the v1 REST API. The script checks connectivity first, then runs the scrape, crawl, or map request, and polls asynchronous crawl jobs automatically until they finish.

Setup (one-time)

mkdir -p ~/.openclaw/skills/firecrawl-local
cp scripts/run.sh ~/.openclaw/skills/firecrawl-local/run.sh
chmod +x ~/.openclaw/skills/firecrawl-local/run.sh

The script ships as scripts/run.sh in this skill folder; run the commands above from that folder to copy it into place.

Prerequisites: curl and jq installed, and Firecrawl running at localhost:3002 (or at the URL set in FIRECRAWL_LOCAL_URL).

Optional env vars:

export FIRECRAWL_LOCAL_URL="http://localhost:3002"  # default
export FIRECRAWL_API_KEY="fc-your-key"              # only needed if auth enabled
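
A minimal sketch of how a wrapper script might resolve these two variables. It is not taken from run.sh: the helper names `firecrawl_base_url` and `firecrawl_auth_header` are hypothetical, and the Bearer scheme follows the security review later on this page.

```shell
# Hypothetical helpers for illustration; the real run.sh may differ.

# Print the base URL, falling back to the documented default.
firecrawl_base_url() {
  printf '%s\n' "${FIRECRAWL_LOCAL_URL:-http://localhost:3002}"
}

# Print an Authorization header only when an API key is configured.
# Prints nothing otherwise, so callers can skip the header entirely.
firecrawl_auth_header() {
  if [ -n "${FIRECRAWL_API_KEY:-}" ]; then
    printf 'Authorization: Bearer %s\n' "$FIRECRAWL_API_KEY"
  fi
}

# Example use (commented out because it needs a running instance):
#   auth=$(firecrawl_auth_header)
#   curl -s ${auth:+-H "$auth"} "$(firecrawl_base_url)/health"
```

The `${auth:+-H "$auth"}` expansion adds the `-H` option only when the header string is non-empty, which keeps unauthenticated local setups working without an empty header.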

Commands

Default — scrape a single page (URL only, no subcommand needed)

firecrawl-local https://docs.example.com/api

Scrape — explicit, with format options

firecrawl-local scrape https://docs.example.com/api
firecrawl-local scrape https://docs.example.com/api --formats markdown,html

Map — discover all URLs on a site

firecrawl-local map https://docs.example.com
firecrawl-local map https://docs.example.com --limit 200

Crawl — bulk extract multiple pages (async, auto-polled)

firecrawl-local crawl https://docs.example.com
firecrawl-local crawl https://docs.example.com --limit 30 --max-depth 2
firecrawl-local crawl https://docs.example.com --include /docs --exclude /blog

Agent Instructions

When to use each command

| Goal | Command |
| --- | --- |
| Get content from one URL (quickest) | firecrawl-local <url> |
| Discover what pages exist | map |
| Get content from one URL with format control | scrape |
| Ingest an entire docs site | crawl |
| RAG pipeline ingestion | map → targeted scrape or crawl |

Optimal workflows

Documentation RAG pipeline:

1. map https://docs.example.com          → get full URL list
2. scrape <specific key pages>           → targeted extraction
3. Pass markdown to embedding pipeline

Full site ingestion:

1. crawl https://docs.example.com --limit 50 --max-depth 3
2. Results are auto-polled and returned under data as a JSON array of {url, markdown, metadata} objects
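
The documentation RAG workflow above could be scripted roughly as follows. This is a sketch under assumptions: it assumes jq is installed and that the command prints the JSON shapes this document describes (`links` for map, `data.markdown` for scrape). The `FIRECRAWL_CMD` override and the function names are hypothetical.

```shell
# Sketch of the map -> filter -> scrape workflow; not part of run.sh.
FIRECRAWL_CMD="${FIRECRAWL_CMD:-firecrawl-local}"   # hypothetical override hook

# Print every mapped URL that starts with the given prefix, one per line.
list_pages() (
  site="$1"; prefix="$2"
  "$FIRECRAWL_CMD" map "$site" \
    | jq -r --arg p "$prefix" '.links[] | select(startswith($p))'
)

# Scrape each selected page and emit one compact {url, markdown} object per line.
ingest_pages() (
  site="$1"; prefix="$2"
  list_pages "$site" "$prefix" | while IFS= read -r url; do
    # </dev/null keeps the scrape command from consuming the URL list on stdin.
    "$FIRECRAWL_CMD" scrape "$url" --formats markdown </dev/null \
      | jq -c --arg u "$url" '{url: $u, markdown: .data.markdown}'
  done
)

# Example use (needs a running Firecrawl instance):
#   ingest_pages https://docs.example.com https://docs.example.com/api > pages.jsonl
```

Emitting one JSON object per line makes the output easy to stream into an embedding pipeline without loading the whole site into memory.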

Parameters

| Flag | Applies to | Description |
| --- | --- | --- |
| --limit N | map, crawl | Max pages (default: 50 for crawl, 500 for map) |
| --max-depth N | crawl | How deep to follow links (default: 2) |
| --include /path | crawl | Only crawl URLs matching this path prefix |
| --exclude /path | crawl | Skip URLs matching this path prefix |
| --formats list | scrape | Comma-separated: markdown, html, rawHtml, links |
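
To illustrate how flags like these can become a request body safely, here is a sketch that builds the JSON with jq. The flag-to-field mapping (`limit`, `maxDepth`, `formats`) is an assumption based on the Firecrawl v1 API field names, not a copy of run.sh, and the function names are hypothetical.

```shell
# Sketch only: field names assume the Firecrawl v1 request schema.

# Build a crawl request body. Values are passed with --arg/--argjson, so
# quotes or backslashes in the URL cannot break out of the JSON string.
build_crawl_body() (
  url="$1"; limit="${2:-50}"; depth="${3:-2}"   # defaults from the table above
  jq -n -c \
    --arg url "$url" \
    --argjson limit "$limit" \
    --argjson depth "$depth" \
    '{url: $url, limit: $limit, maxDepth: $depth}'
)

# Build a scrape request body from a comma-separated format list.
build_scrape_body() (
  url="$1"; formats="${2:-markdown}"
  jq -n -c --arg url "$url" --arg f "$formats" \
    '{url: $url, formats: ($f | split(","))}'
)

# Example use:
#   build_crawl_body https://docs.example.com 30 2
#   build_scrape_body https://docs.example.com/api markdown,html
```

Because `--argjson` parses its value as JSON, a non-numeric `--limit` makes jq fail instead of silently producing a malformed request.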

Reading the output

  • scrape: Returns {success, data: {markdown, html, metadata}}
  • map: Returns {success, links: [...]}
  • crawl: Returns {success, data: [{url, markdown, metadata}, ...]} ← after polling completes
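
A small sketch of consuming the scrape shape listed above with jq. The function name `extract_markdown` and its exit codes are hypothetical. It assumes the response arrives on stdin and treats a false `success` flag and an empty markdown field as separate failures.

```shell
# Read a scrape response on stdin and print its markdown.
# Exit status: 0 on success, 1 if success is not true, 2 if markdown is empty.
extract_markdown() (
  response=$(cat)
  ok=$(printf '%s' "$response" | jq -r '.success')
  if [ "$ok" != "true" ]; then
    echo "scrape failed" >&2
    exit 1
  fi
  # "// \"\"" substitutes an empty string when the field is missing or null.
  md=$(printf '%s' "$response" | jq -r '.data.markdown // ""')
  if [ -z "$md" ]; then
    echo "empty markdown field" >&2
    exit 2
  fi
  printf '%s\n' "$md"
)

# Example use (needs a running Firecrawl instance):
#   firecrawl-local scrape https://docs.example.com/api | extract_markdown
```

Distinct exit codes let a calling agent decide whether to retry with another format or to check whether the site blocks bots.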

Failure signals and fixes

| Error | Cause | Fix |
| --- | --- | --- |
| Local Firecrawl unavailable | Service not running | Start Firecrawl, check port 3002 |
| success: false | Bad URL or blocked | Check URL is reachable, try --formats html |
| Empty markdown field | JS-rendered page | Firecrawl handles most JS; check whether the site blocks bots |
| Crawl times out | Site is large | Reduce --limit or --max-depth |

Script reference

See scripts/run.sh for the full implementation. Key design decisions:

  • Health check uses /health endpoint with 3s timeout
  • Auth header only sent when FIRECRAWL_API_KEY is set
  • Crawl polling retries every 5s up to 60 attempts (5 minutes)
  • All parameters are passed via jq to prevent shell injection in JSON
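
The health check and polling decisions listed above might look roughly like this. It is a sketch, not the contents of run.sh: the helper names are hypothetical, the status values `completed` and `failed` and the `/v1/crawl/<id>` status path are assumed from the Firecrawl v1 API, and the auth header is left out for brevity.

```shell
# Sketch of the health check and polling loop; helper names are hypothetical.
BASE_URL="${FIRECRAWL_LOCAL_URL:-http://localhost:3002}"
POLL_INTERVAL="${POLL_INTERVAL:-5}"           # seconds between attempts
POLL_MAX_ATTEMPTS="${POLL_MAX_ATTEMPTS:-60}"  # 60 x 5s = 5 minutes

# Succeed only if /health answers within 3 seconds.
check_health() {
  curl -sf --max-time 3 "$BASE_URL/health" >/dev/null
}

# Print the status string of a crawl job (assumed: "scraping", "completed", "failed").
crawl_status() {
  curl -sf "$BASE_URL/v1/crawl/$1" | jq -r '.status'
}

# Poll until the job completes (0), fails (1), or the attempt budget runs out (2).
wait_for_crawl() (
  id="$1"; attempt=1
  while [ "$attempt" -le "$POLL_MAX_ATTEMPTS" ]; do
    status=$(crawl_status "$id") || status="unreachable"
    case "$status" in
      completed) exit 0 ;;
      failed)    echo "crawl $id failed" >&2; exit 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$POLL_INTERVAL"
  done
  echo "crawl $id timed out after $POLL_MAX_ATTEMPTS attempts" >&2
  exit 2
)

# Example use (needs a running Firecrawl instance and a real job id):
#   check_health && wait_for_crawl "$job_id"
```

Keeping the interval and attempt count in variables makes the five-minute budget easy to shorten for small sites or to extend for large ones.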
Security recommendations
This skill appears to implement a legitimate local Firecrawl client, but the manifest omits practical requirements. Before installing:

  1. Inspect run.sh yourself (it's included) and only install if you trust it.
  2. Ensure curl and jq are installed.
  3. Confirm whether you need to set FIRECRAWL_LOCAL_URL, and avoid pointing it to untrusted remote hosts.
  4. If your Firecrawl uses auth, store FIRECRAWL_API_KEY securely (do not paste it into untrusted places).
  5. Ask the publisher/registry to update metadata to list required binaries and optional env vars so automated checks don't miss them.

If you cannot verify the publisher or don't want to risk misconfiguration (e.g., accidentally pointing the skill at a remote endpoint), treat it as untrusted.
Functional analysis
Type: OpenClaw Skill
Name: firecrawl-local
Version: 1.0.0

The firecrawl-local skill is a legitimate integration for a self-hosted Firecrawl instance, providing web scraping and crawling capabilities. The shell script (run.sh) follows security best practices by using jq for safe JSON construction to prevent injection and limits its network activity to the user-configured local service (defaulting to localhost:3002).
Capability assessment
Purpose & Capability
The name/description (integrate with a self-hosted Firecrawl) aligns with the included run.sh and the SKILL.md: the script performs health checks and calls /v1/map, /v1/scrape, and /v1/crawl as expected. However, the registry metadata declares no required binaries or env vars, while the SKILL.md and run.sh require curl and jq and optionally use FIRECRAWL_LOCAL_URL and FIRECRAWL_API_KEY. This omission leaves the manifest inconsistent with what the skill actually needs.
Instruction Scope
The runtime instructions and script stay within scope: they only interact with the Firecrawl HTTP API (default localhost:3002), perform polling, and output JSON. They do not attempt to read arbitrary system files or access other services. Note: because FIRECRAWL_LOCAL_URL can be set, the script can be pointed at a remote host (which is a legitimate feature but increases risk if misconfigured).
Install Mechanism
There is no automated install spec (instruction-only + a supplied run.sh). That is low risk from hidden downloads. The SKILL.md asks the user to copy run.sh into ~/.openclaw/skills/... manually and mark it executable; this is reasonable but requires the user to perform the file write/permission step themselves.
Credentials
The script uses optional env vars FIRECRAWL_LOCAL_URL and FIRECRAWL_API_KEY and expects curl/jq to be present, but the skill's registry metadata lists none of these requirements. The FIRECRAWL_API_KEY (if set) is sent as a Bearer token to the target service — this is appropriate for auth but the missing declaration in metadata means automated permission/credential reviews might miss it.
Persistence & Privilege
The skill is not marked always:true, does not request persistent system-wide changes, and contains no code that modifies other skills or global agent settings. It requires the user to place the script in their skills directory manually.
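How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install firecrawl-local
  3. After installation, invoke the skill by name or trigger it with /firecrawl-local.
  4. Provide the inputs described in the skill's parameter documentation to get structured output.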
Version history
v1.0.0
  • Initial release of Firecrawl Local skill for web scraping and site crawling with a self-hosted Firecrawl instance.
  • Supports commands for single-page scraping, site mapping, and async multi-page crawling with format and filtering options.
  • Automatically detects Firecrawl availability and handles crawl polling.
  • Easy command-line integration with robust parameterization (URL filtering, limits, depth, output format).
  • Clear agent guidance for documentation ingestion and RAG pipeline workflows.
Metadata
Slug firecrawl-local
Version 1.0.0
License MIT-0
Total installs 1
Current installs 0
Version count 1
FAQ

What is Firecrawl Local?

Use this skill whenever you need to scrape web pages, crawl websites, or map site structure using a self-hosted Firecrawl instance. Triggers on requests to e... It is an AI agent skill plugin for Claude Code / OpenClaw, with 157 total downloads so far.

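How do I install Firecrawl Local?

Run the command "/install firecrawl-local" in the OpenClaw or Claude Code chat box to install it in one step. Per the setup notes in SKILL.md, the skill also needs curl, jq, and a running Firecrawl instance, and run.sh must be copied into place.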

Is Firecrawl Local free?

Yes. Firecrawl Local is completely free under the MIT-0 license and can be downloaded, installed, and used freely.

Which platforms does Firecrawl Local support?

Firecrawl Local is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Firecrawl Local?

It is developed and maintained by SaddamTechie (@saddamtechie). The current version is v1.0.0.
