Boof

by chiefsegundo · GitHub ↗ · v4.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 1007 · Stars: 0 · Active Installs: 7 · Versions: 2
Install in OpenClaw
/install boof
Description
Convert PDFs and documents to markdown, index them locally for RAG retrieval, and analyze them token-efficiently. Use when asked to: read/analyze/summarize a...
README (SKILL.md)

Boof 🍑

Local-first document processing: PDF → markdown → RAG index → token-efficient analysis.

Documents stay local. Only relevant chunks go to the LLM. Maximum knowledge absorption, minimum token burn.

Powered by opendataloader-pdf — #1 in PDF parsing benchmarks (0.90 overall, 0.93 table accuracy). CPU-only, no GPU required.

Quick Reference

Convert + index a document

bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf

Convert with custom collection name

bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf --collection my-project

Query indexed content

qmd query "your question" -c collection-name

Core Workflow

  1. Boof it: Run boof.sh on a PDF. This converts it to markdown via opendataloader-pdf (local Java engine, no API, no GPU) and indexes it into QMD for semantic search.

  2. Query it: Use qmd query to retrieve only the relevant chunks. Send those chunks to the LLM — not the entire document.

  3. Analyze it: The LLM sees focused, relevant excerpts. No wasted tokens, no lost-in-the-middle problems.
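The three steps above can be chained in a small wrapper. This is only a sketch: the function name `boof_and_query` is hypothetical, `SKILL_DIR` is assumed to be an environment variable pointing at the installed skill (the listing only shows it as a `{SKILL_DIR}` placeholder), and the `boof.sh` and `qmd query` invocations are copied from the Quick Reference.

```shell
# Hypothetical wrapper around the documented commands; not part of the skill.
SKILL_DIR="${SKILL_DIR:-.}"                 # assumption: location of the skill
QMD_BIN="${QMD_BIN:-$HOME/.bun/bin/qmd}"    # documented default for qmd

boof_and_query() {
  pdf="$1"; collection="$2"; question="$3"

  # Fail early if the input file does not exist.
  if [ ! -f "$pdf" ]; then
    echo "no such file: $pdf" >&2
    return 1
  fi

  # Step 1: convert to markdown and index into the named collection.
  bash "$SKILL_DIR/scripts/boof.sh" "$pdf" --collection "$collection" || return 1

  # Step 2: retrieve only the relevant chunks; step 3 is to pass this
  # output to the LLM in place of the full document.
  "$QMD_BIN" query "$question" -c "$collection"
}
```

A call might look like `boof_and_query paper.pdf my-project "What sample size was used?"`, with the printed chunks becoming the context for the analysis step.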

When to Use Each Approach

"Analyze this specific aspect of the paper" → Boof + query (cheapest, most focused)

"Summarize this entire document" → Boof, then read the markdown section by section. Summarize each section individually, then merge summaries. See advanced-usage.md.

"Compare findings across multiple papers" → Boof all papers into one collection, then query across them.

"Find where the paper discusses X" → qmd search "X" -c collection for exact match, qmd query "X" -c collection for semantic match.

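For the "summarize this entire document" case, one way to get section-sized pieces is to split the converted markdown at its headings. The sketch below is an assumption about how that could be done, not the method described in advanced-usage.md; `split_sections` and the `section-NNN.md` naming are hypothetical. It treats every line starting with `#` characters followed by a space as a heading, so a `# comment` inside a code block in the markdown would also start a new piece.

```shell
# Hypothetical helper: split a markdown file into one file per heading.
split_sections() {
  md="$1"; outdir="$2"
  mkdir -p "$outdir"
  awk -v dir="$outdir" '
    # A heading line starts a new section; close the previous output file
    # so long documents do not exhaust open file handles.
    /^#+ / { if (file != "") close(file); n++ }
    # Text before the first heading lands in section-000.md.
    { file = sprintf("%s/section-%03d.md", dir, n); print > file }
  ' "$md"
}
```

Each resulting file can then be summarized on its own and the summaries merged, as described above. For example, `split_sections knowledge/boofed/document.md sections` (the input filename here is illustrative).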
Output Location

Converted markdown files are saved to knowledge/boofed/ by default (override with --output-dir).

Setup

If boof.sh reports missing dependencies, see setup-guide.md for installation instructions (Java + opendataloader-pdf + QMD).

Environment

  • ODL_ENV — Path to opendataloader-pdf Python venv (default: ~/.openclaw/tools/odl-env)
  • QMD_BIN — Path to qmd binary (default: ~/.bun/bin/qmd)
  • BOOF_OUTPUT_DIR — Default output directory (default: ~/.openclaw/workspace/knowledge/boofed)
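As an illustration of how these variables and their documented defaults fit together, the following sketch resolves each one with shell parameter expansion. Whether boof.sh resolves them exactly this way is an assumption; only the variable names and default paths come from the list above.

```shell
# Use the caller's value when set, otherwise fall back to the documented default.
ODL_ENV="${ODL_ENV:-$HOME/.openclaw/tools/odl-env}"
QMD_BIN="${QMD_BIN:-$HOME/.bun/bin/qmd}"
BOOF_OUTPUT_DIR="${BOOF_OUTPUT_DIR:-$HOME/.openclaw/workspace/knowledge/boofed}"
export ODL_ENV QMD_BIN BOOF_OUTPUT_DIR

# Show the effective configuration before running boof.sh.
printf 'ODL_ENV=%s\nQMD_BIN=%s\nBOOF_OUTPUT_DIR=%s\n' \
  "$ODL_ENV" "$QMD_BIN" "$BOOF_OUTPUT_DIR"
```

To keep converted files out of the home directory for a single run, a value can be supplied inline, for example `BOOF_OUTPUT_DIR=/tmp/boofed bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf`.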
Usage Guidance
Boof appears internally consistent with its stated purpose, but review these practical points before installing:
  • The script runs locally and will execute Python/Java on your machine and write markdown to the specified output directory; only run it on documents you authorize.
  • Installing opendataloader-pdf and QMD requires network access and will download packages and (on first run) QMD models (~1–2GB). Verify you trust the opendataloader-pdf and QMD sources (inspect their repos or package pages) before installing.
  • The setup uses bun to install QMD from a GitHub URL; prefer installing in isolated environments (venv, container, or VM) if processing sensitive documents.
  • No credentials are requested by the skill, but ensure you set a safe BOOF_OUTPUT_DIR if you do not want converted files stored under your home/workspace.
  • If you need higher assurance, run the boof.sh commands manually in an isolated venv and review the opendataloader-pdf/QMD behavior during first-run model downloads.
Capability Analysis
Type: OpenClaw Skill
Name: boof
Version: 4.0.0

The 'boof' skill bundle provides legitimate PDF-to-Markdown conversion and RAG indexing using local tools like opendataloader-pdf and qmd. However, the script 'scripts/boof.sh' contains a code injection vulnerability where the '$INPUT_FILE' shell variable is interpolated directly into a Python script within a heredoc. This could allow arbitrary Python code execution if the agent is tasked with processing a file with a specially crafted filename (e.g., containing quotes and Python commands). While the tool's behavior aligns with its stated purpose and no intentional malice was found, this high-risk implementation flaw warrants a suspicious classification.
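The usual remedy for this class of bug is to pass the filename to Python as an argument and quote the heredoc delimiter, so the shell never splices the value into the Python source. The actual boof.sh code is not reproduced in this listing, so the following is only a sketch of that pattern; the function name `describe_input` is hypothetical and `python3` is assumed to be on the PATH.

```shell
# Hypothetical sketch of the safe pattern; not the code from boof.sh.
describe_input() {
  input_file="$1"
  # The quoted delimiter ('PY') disables shell expansion inside the heredoc,
  # and the filename travels separately as sys.argv[1].
  python3 - "$input_file" <<'PY'
import sys
from pathlib import Path

path = Path(sys.argv[1])  # the filename arrives as data, never as code
print(path.name)
PY
}

# A hostile-looking filename is printed verbatim rather than executed.
describe_input 'report"); import os; os.system("echo pwned"); ("x.pdf'
```

With an unquoted delimiter and `"$INPUT_FILE"` written inside the heredoc body, the same filename would instead be parsed as Python statements, which is the behavior the analysis above flags.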
Capability Assessment
Purpose & Capability
The name/description (PDF→markdown→RAG) matches the included files and script. Required tools (Java, Python venv with opendataloader-pdf, and QMD) are exactly what you would expect for local conversion and local semantic indexing. No unrelated binaries or credentials are requested.
Instruction Scope
The SKILL.md and scripts instruct the agent to run a local shell script that (a) runs opendataloader-pdf inside a venv to convert the provided file, (b) indexes the resulting markdown with qmd, and (c) writes output to a local directory. This stays within the stated purpose. Note: the setup and first-run steps will download QMD models (~1–2GB) and require network access when installing packages (pip / bun / qmd); logs are filtered in the script output which reduces noise but also hides some informational lines. The skill does not reference or exfiltrate unrelated system files or environment variables.
Install Mechanism
There is no automated install spec in the skill bundle; setup instructions tell the user to install Java, pip-install opendataloader-pdf into a venv, and install QMD via bun from a GitHub URL. These are standard, traceable sources (PyPI/GitHub/bun). No arbitrary binary downloads or extract-from-unknown-URLs are present in the bundle.
Credentials
No secrets or cloud credentials are required. Declared environment variables (ODL_ENV, QMD_BIN, BOOF_OUTPUT_DIR) are path/configuration variables appropriate for the task. The skill does not request unrelated tokens or passwords.
Persistence & Privilege
always is false and the skill does not modify other skills or system-wide configs. It writes converted files under a local output directory (default under the workspace) which is proportional to its function.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install boof
  3. After installation, invoke the skill by name or use /boof
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v4.0.0
Swap Marker (GPU, slow, flat text) for opendataloader-pdf (CPU-only, #1 benchmark, proper Markdown tables). Faster, lighter, better output quality. Requires Java 11+ instead of Python ML stack.
v1.0.0
Initial release
Metadata
Slug boof
Version 4.0.0
License MIT-0
All-time Installs 7
Active Installs 7
Total Versions 2
Frequently Asked Questions

What is Boof?

Boof is an AI Agent Skill for Claude Code / OpenClaw that converts PDFs and documents to markdown, indexes them locally for RAG retrieval, and analyzes them token-efficiently. It has 1007 downloads so far.

How do I install Boof?

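Run "/install boof" in the OpenClaw or Claude Code chat to install the skill. Its dependencies (Java, opendataloader-pdf, and QMD) are not installed automatically; if boof.sh reports them missing, follow setup-guide.md.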

Is Boof free?

Yes, Boof is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Boof support?

Boof is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Boof?

It is built and maintained by chiefsegundo (@chiefsegundo); the current version is v4.0.0.
