← 返回 Skills 市场
66
总下载
0
收藏
1
当前安装
1
版本数
在 OpenClaw 中安装
/install doc-scraper
功能描述
Documentation extraction and indexing. Extracts information from markdown files and syncs to workspace-db. Works alongside workspace-db which handles synchro...
使用说明 (SKILL.md)
Doc Scraper
Dokumentations-Extraktion und Indexierung - arbeitet mit workspace-db zusammen.
Zusammenspiel mit workspace-db
| Skill | Aufgabe | Datenbank |
|---|---|---|
| workspace-db | Synchronisation & Organisation | docs.db |
| doc-scraper (dieser) | Informationsextraktion | Nutzt docs.db |
Aufgaben
1. Markdown-Extraktion
// Extrahiert aus SKILL.md:
// - Name, Version, Beschreibung
// - Nutzungsbeispiele
// - Konfigurationsoptionen
const docInfo = await docScraper.extractMarkdown({
file: "skills/my-skill/SKILL.md",
extract: ["title", "description", "usage", "config"]
});
2. Indexierung in docs.db
// Speichert extrahierte Daten in docs.db
// (workspace-db verwaltet die DB)
await docScraper.index({
source: "skills/my-skill/SKILL.md",
data: docInfo,
tags: ["skill", "api"]
});
3. Auto-Update bei Änderungen
# Überwacht .md Dateien
# Extrahiert bei Änderung neu
# Aktualisiert docs.db
doc-scraper watch --dir skills/ --ext .md
Extraktions-Templates
Skill-Dokumentation
# Aus SKILL.md extrahiert:
name: "skill-name"
description: "Beschreibung"
version: "1.0.0"
category: "database"
usage_examples:
- command: "openclaw skill"
result: "..."
API-Dokumentation
# Aus API.md extrahiert:
endpoints:
- path: "/api/v1/search"
method: "GET"
params:
- query: string
response: json
System-Dokumentation
# Aus SYSTEM.md extrahiert:
components:
- databases:
- docs.db
- tree.db
cron_jobs:
- db-maintainer: "*/30"
Workflow
skill.md geändert
↓
doc-scraper erkennt Änderung
↓
Extrahiert: name, desc, usage, config
↓
Speichert in docs.db
↓
workspace-db synchronisiert
Nutzung
Einmalig
doc-scraper index --dir skills/ --recursive
doc-scraper index --dir docs/ --ext .md
Watch-Modus
# Kontinuierlich überwachen
doc-scraper watch --dir workspace/
# Einzelne Datei
doc-scraper watch --file README.md
Suche
# Direkt in extrahierten Daten suchen
doc-scraper search --query "database"
doc-scraper search --tag "api" --format json
Integration mit workspace-db
// doc-scraper extrahiert
// workspace-db speichert/organisiert
const extracted = await docScraper.extract('skills/my/SKILL.md');
// Übergabe an workspace-db
await workspaceDb.syncDocument({
id: extracted.name,
category: extracted.category,
data: extracted,
source_file: 'skills/my/SKILL.md'
});
Konfiguration
{
"doc-scraper": {
"watch_dirs": ["skills/", "docs/"],
"extensions": [".md", ".mdx"],
"extract_headers": ["##", "###"],
"auto_index": true,
"workspace_db_integration": true
}
}
Links
安全使用建议
This skill is instruction-only and does not include the doc-scraper implementation it describes. Before installing or enabling it: 1) Ask the author for the implementation or an install spec (how is doc-scraper provided?). 2) Verify workspace-db: where docs.db is stored and whether any credentials/network endpoints are used. 3) Limit scanning scope—do not allow the skill to watch your entire workspace; specify allowed directories only. 4) Prefer manual invocation until you can review the underlying code; avoid granting always:true or leaving autonomous long-running watchers enabled. 5) If you must run it, run in a sandboxed/least-privilege environment and audit the files that get indexed to ensure no secrets are captured.
功能分析
Type: OpenClaw Skill
Name: doc-scraper
Version: 1.0.0
The doc-scraper skill is designed for extracting metadata and indexing documentation from Markdown files into a workspace database. The files (SKILL.md, package.json, and _meta.json) contain functional instructions, usage examples, and configuration templates that align strictly with its stated purpose of documentation management. No indicators of malicious intent, data exfiltration, or prompt injection were found.
能力评估
Purpose & Capability
SKILL.md describes a CLI (doc-scraper) and JS API (docScraper.extract*/workspaceDb.sync*) and shows commands like doc-scraper watch --dir, but the skill bundle has no install spec, no binaries, and no code implementing those commands. That mismatch (declared purpose vs. provided artifacts) is incoherent: either the author assumed external tools exist on the host or omitted the implementation.
Instruction Scope
Instructions instruct recursively scanning and watching directories (skills/, docs/, workspace/) and indexing arbitrary markdown files into docs.db. While consistent with a documentation indexer, this scope can capture sensitive files (README, config snippets, credentials if present in docs) and the SKILL.md gives the agent discretion to extract various headers and configs without constraints. There are no limits or filters described.
Install Mechanism
No install spec is provided (instruction-only), which is lowest-risk from a supply-chain perspective—but it's unexpected given the CLI/API usage in the instructions. If the agent tries to run the named CLI and it does not exist, behavior will depend on the agent's error-handling; if the CLI does exist in the environment (from elsewhere), the skill will call it without presenting its code here.
Credentials
The skill declares no required environment variables or credentials (which is reasonable), but it explicitly integrates with workspace-db and writes to docs.db. That integration may implicitly require workspace-db credentials or network access (not declared). The absence of declared credentials makes it unclear how workspace sync is authenticated—this is an information gap worth clarifying.
Persistence & Privilege
always is false (good). However, the SKILL.md encourages long-running watch processes (doc-scraper watch) which would give the agent persistent access to file-system changes while running. Autonomous invocation is allowed by default; combined with watch-style behavior this increases blast radius if the agent is permitted to run watchers without restrictions.
如何使用
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install doc-scraper - 安装完成后,直接呼叫该 Skill 的名称或使用
/doc-scraper触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
- Initial release of doc-scraper for documentation extraction and indexing.
- Extracts structured information from markdown files, including SKILL.md, API.md, and SYSTEM.md.
- Indexes extracted data into docs.db and integrates with workspace-db for synchronization and organization.
- Supports auto-update: watches for markdown file changes and updates docs.db automatically.
- Provides CLI commands for indexing, watching, and searching documentation.
- Offers flexible configuration options for extraction and integration.
元数据
常见问题
Doc Scraper 是什么?
Documentation extraction and indexing. Extracts information from markdown files and syncs to workspace-db. Works alongside workspace-db which handles synchro... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 66 次。
如何安装 Doc Scraper?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install doc-scraper」即可一键安装,无需额外配置。
Doc Scraper 是免费的吗?
是的,Doc Scraper 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。
Doc Scraper 支持哪些平台?
Doc Scraper 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 Doc Scraper?
由 KikiKari(@kikikari)开发并维护,当前版本 v1.0.0。
推荐 Skills