chiefsegundo

Boof

Author: chiefsegundo · GitHub · v4.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 1007
Favorites: 0
Current installs: 7
Versions: 2
Install in OpenClaw
/install boof
Description
Convert PDFs and documents to markdown, index them locally for RAG retrieval, and analyze them token-efficiently. Use when asked to: read/analyze/summarize a...
Usage (SKILL.md)

Boof 🍑

Local-first document processing: PDF → markdown → RAG index → token-efficient analysis.

Documents stay local. Only relevant chunks go to the LLM. Maximum knowledge absorption, minimum token burn.

Powered by opendataloader-pdf — #1 in PDF parsing benchmarks (0.90 overall, 0.93 table accuracy). CPU-only, no GPU required.

Quick Reference

Convert + index a document

bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf

Convert with custom collection name

bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf --collection my-project

Query indexed content

qmd query "your question" -c collection-name

Core Workflow

  1. Boof it: Run boof.sh on a PDF. This converts it to markdown via opendataloader-pdf (local Java engine, no API, no GPU) and indexes it into QMD for semantic search.

  2. Query it: Use qmd query to retrieve only the relevant chunks. Send those chunks to the LLM — not the entire document.

  3. Analyze it: The LLM sees focused, relevant excerpts. No wasted tokens, no lost-in-the-middle problems.
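
The first two steps above can be chained in a small wrapper. The following is a minimal sketch and not part of the skill: the function name boof_then_query is hypothetical, and it assumes SKILL_DIR is exported as an environment variable pointing at the skill directory (standing in for the {SKILL_DIR} placeholder) and that QMD_BIN falls back to its documented default.

```shell
# Sketch: convert + index one PDF, then retrieve only the relevant chunks.
# boof_then_query and the SKILL_DIR variable are illustrative assumptions.
boof_then_query() {
  local pdf="$1" collection="$2" question="$3"
  local qmd="${QMD_BIN:-$HOME/.bun/bin/qmd}"

  if [ -z "${SKILL_DIR:-}" ]; then
    echo "SKILL_DIR is not set" >&2
    return 2
  fi

  # Step 1: convert to markdown and index into the named collection.
  bash "$SKILL_DIR/scripts/boof.sh" "$pdf" --collection "$collection" || return 1

  # Step 2: fetch only the chunks that match the question.
  "$qmd" query "$question" -c "$collection"
}
```

A call such as boof_then_query paper.pdf my-project "your question" prints the retrieved chunks, which are then the only text passed to the LLM in step 3.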

When to Use Each Approach

"Analyze this specific aspect of the paper" → Boof + query (cheapest, most focused)

"Summarize this entire document" → Boof, then read the markdown section by section. Summarize each section individually, then merge summaries. See advanced-usage.md.

"Compare findings across multiple papers" → Boof all papers into one collection, then query across them.

"Find where the paper discusses X"qmd search "X" -c collection for exact match, qmd query "X" -c collection for semantic match.

Output Location

Converted markdown files are saved to knowledge/boofed/ by default (override with --output-dir).

Setup

If boof.sh reports missing dependencies, see setup-guide.md for installation instructions (Java + opendataloader-pdf + QMD).

Environment

  • ODL_ENV — Path to opendataloader-pdf Python venv (default: ~/.openclaw/tools/odl-env)
  • QMD_BIN — Path to qmd binary (default: ~/.bun/bin/qmd)
  • BOOF_OUTPUT_DIR — Default output directory (default: ~/.openclaw/workspace/knowledge/boofed)
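
The defaults listed above follow the usual "use the variable if set, otherwise fall back" pattern. The snippet below is a sketch of that pattern using shell parameter expansion; it is not taken from boof.sh, and the function name resolve_boof_env is hypothetical.

```shell
# Sketch: apply the documented defaults unless the caller has set a value.
# resolve_boof_env is an illustrative name, not a function from the skill.
resolve_boof_env() {
  ODL_ENV="${ODL_ENV:-$HOME/.openclaw/tools/odl-env}"
  QMD_BIN="${QMD_BIN:-$HOME/.bun/bin/qmd}"
  BOOF_OUTPUT_DIR="${BOOF_OUTPUT_DIR:-$HOME/.openclaw/workspace/knowledge/boofed}"
  export ODL_ENV QMD_BIN BOOF_OUTPUT_DIR
}
```

Setting, for example, BOOF_OUTPUT_DIR before the call keeps that value, while the unset variables receive the documented defaults.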
Security recommendations
Boof appears internally consistent with its stated purpose, but review these practical points before installing:
  • The script runs locally and will execute Python/Java on your machine and write markdown to the specified output directory; only run it on documents you authorize.
  • Installing opendataloader-pdf and QMD requires network access and will download packages and (on first run) QMD models (~1–2GB). Verify you trust the opendataloader-pdf and QMD sources (inspect their repos or package pages) before installing.
  • The setup uses bun to install QMD from a GitHub URL. Prefer installing in isolated environments (venv, container, or VM) if processing sensitive documents.
  • No credentials are requested by the skill, but ensure you set a safe BOOF_OUTPUT_DIR if you do not want converted files stored under your home/workspace.
  • If you need higher assurance, run the boof.sh commands manually in an isolated venv and review the opendataloader-pdf/QMD behavior during first-run model downloads.
Functional analysis
Type: OpenClaw Skill
Name: boof
Version: 4.0.0
The 'boof' skill bundle provides legitimate PDF-to-Markdown conversion and RAG indexing using local tools like opendataloader-pdf and qmd. However, the script 'scripts/boof.sh' contains a code injection vulnerability where the '$INPUT_FILE' shell variable is interpolated directly into a Python script within a heredoc. This could allow arbitrary Python code execution if the agent is tasked with processing a file with a specially crafted filename (e.g., containing quotes and Python commands). While the tool's behavior aligns with its stated purpose and no intentional malice was found, this high-risk implementation flaw warrants a suspicious classification.
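A common way to avoid this class of bug is to quote the heredoc delimiter, so the shell expands nothing inside the Python source, and to pass the filename as a command-line argument instead. The sketch below illustrates that pattern only; it is not the actual content of boof.sh, the function name print_input_path is hypothetical, and the print call stands in for whatever conversion call the real script makes.

```shell
# Unsafe pattern as described in the analysis (shown as comments only):
# with an unquoted delimiter the shell expands $INPUT_FILE inside the
# heredoc, so the filename becomes part of the Python source.
#   python3 <<PY
#   some_call("$INPUT_FILE")
#   PY

# Safer sketch: the quoted delimiter ('PY') disables shell expansion, and
# the filename reaches Python as an argument rather than as source text.
print_input_path() {
  local input_file="$1"
  python3 - "$input_file" <<'PY'
import sys

# sys.argv[1] is plain data here; it is never parsed as Python code.
input_file = sys.argv[1]
print(input_file)  # placeholder for the real conversion call
PY
}
```

With this shape, a filename containing quotes or Python statements is printed back verbatim instead of being executed.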
Capability assessment
Purpose & Capability
The name/description (PDF→markdown→RAG) matches the included files and script. Required tools (Java, Python venv with opendataloader-pdf, and QMD) are exactly what you would expect for local conversion and local semantic indexing. No unrelated binaries or credentials are requested.
Instruction Scope
The SKILL.md and scripts instruct the agent to run a local shell script that (a) runs opendataloader-pdf inside a venv to convert the provided file, (b) indexes the resulting markdown with qmd, and (c) writes output to a local directory. This stays within the stated purpose. Note: the setup and first-run steps will download QMD models (~1–2GB) and require network access when installing packages (pip / bun / qmd); logs are filtered in the script output which reduces noise but also hides some informational lines. The skill does not reference or exfiltrate unrelated system files or environment variables.
Install Mechanism
There is no automated install spec in the skill bundle; setup instructions tell the user to install Java, pip-install opendataloader-pdf into a venv, and install QMD via bun from a GitHub URL. These are standard, traceable sources (PyPI/GitHub/bun). No arbitrary binary downloads or extract-from-unknown-URLs are present in the bundle.
Credentials
No secrets or cloud credentials are required. Declared environment variables (ODL_ENV, QMD_BIN, BOOF_OUTPUT_DIR) are path/configuration variables appropriate for the task. The skill does not request unrelated tokens or passwords.
Persistence & Privilege
always is false and the skill does not modify other skills or system-wide configs. It writes converted files under a local output directory (default under the workspace) which is proportional to its function.
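How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install boof
  3. Once installation finishes, invoke the Skill by name or trigger it with /boof.
  4. Provide the required inputs according to the Skill's parameter description to get structured output.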
Version history
v4.0.0
Swap Marker (GPU, slow, flat text) for opendataloader-pdf (CPU-only, #1 benchmark, proper Markdown tables). Faster, lighter, better output quality. Requires Java 11+ instead of Python ML stack.
v1.0.0
Initial release
Metadata
Slug: boof
Version: 4.0.0
License: MIT-0
Total installs: 7
Current installs: 7
Version count: 2
FAQ

What is Boof?

Convert PDFs and documents to markdown, index them locally for RAG retrieval, and analyze them token-efficiently. Use when asked to: read/analyze/summarize a... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has 1007 total downloads so far.

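How do I install Boof?

Run the command "/install boof" in the OpenClaw or Claude Code chat box to install the skill in one step. The skill itself needs no further configuration, but if boof.sh reports missing dependencies (Java, opendataloader-pdf, QMD), follow setup-guide.md.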

Is Boof free?

Yes. Boof is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does Boof support?

Boof is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Boof?

Boof is developed and maintained by chiefsegundo (@chiefsegundo); the current version is v4.0.0.
