Pdf2word Skills

by scottkiss · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 203 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install pdf2word-skills
Description
Convert scanned PDF documents into Word text documents using a free, local OCR engine or a remote API.
README (SKILL.md)

PDF to Word Converter

A skill to extract text from scanned PDF documents and convert them into reusable Word (.docx) files using the free, local docr OCR engine.

Prerequisites

  1. Initialize the OCR engine by downloading the binaries:
    bash scripts/install.sh
    
  2. Install the required Python dependencies:
    pip install -r scripts/requirements.txt
    

Usage

Run the Python script, passing the input PDF file and the desired output .docx file path. Any additional standard docr arguments (such as engine preferences) can be appended after the two paths.

python scripts/pdf2word.py <input.pdf> <output.docx> [docr_args...]

Examples

Convert a single file with the default local engine:

python scripts/pdf2word.py sample.pdf sample_output.docx
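
To convert several files, the single-file command can be wrapped in a loop. The sketch below assumes it is run from the skill's root directory so that scripts/pdf2word.py resolves. The helper names build_command and convert_all are illustrative and not part of the skill.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = Path("scripts/pdf2word.py")  # path taken from the usage line above


def build_command(pdf_path, out_dir, extra_args=()):
    """Return the argv list for converting one PDF (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    docx_path = Path(out_dir) / (pdf_path.stem + ".docx")
    return [sys.executable, str(SCRIPT), str(pdf_path), str(docx_path), *extra_args]


def convert_all(pdf_dir, out_dir, extra_args=()):
    """Convert every *.pdf in pdf_dir, one subprocess call per file."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_command(pdf_path, out_dir, extra_args), check=True)
```

For example, convert_all("scans", "converted", ["-engine", "gemini"]) would pass the same engine argument to every conversion.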

Using Other API Engines

By default, the script uses the local RapidOCR engine. The underlying docr tool also supports other engines like the Google Gemini API for potentially higher recognition accuracy on complex layouts.

To use Gemini, first configure your API key:

mkdir -p ~/.ocr
echo "gemini_api_key=your_gemini_key" > ~/.ocr/config

Then pass the -engine gemini argument to the script:

python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini

If your document has tables, you can force Gemini to output them in Markdown format so the script can parse them into native Word tables:

python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini -prompt "Extract all text and preserve tables in Markdown format using | symbols."
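
How the script parses Markdown tables is not documented here. As a rough illustration only, pipe-delimited rows can be split into cells and written into a table with python-docx. The function names below are hypothetical, and this is not the skill's actual parser.

```python
def parse_markdown_table(lines):
    """Split pipe-delimited Markdown rows into lists of cell strings."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        # Drop the outer pipes, then split on the inner ones.
        inner = line[1:-1] if line.endswith("|") and len(line) > 1 else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    return rows


def add_table(document, rows):
    """Append rows to a python-docx Document as a native table."""
    if not rows:
        return None
    n_cols = max(len(row) for row in rows)
    table = document.add_table(rows=len(rows), cols=n_cols)
    for i, row in enumerate(rows):
        for j, text in enumerate(row):
            table.cell(i, j).text = text
    return table
```

With python-docx installed, add_table(Document(), rows) followed by document.save(...) would write the table to a .docx file. Escaped pipes inside cells are not handled by this sketch.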

How it Works

  1. The script calls docr, which uses the specified OCR model (RapidOCR by default) to read text from the scanned PDF.
  2. The extracted text is temporarily stored.
  3. The python-docx library is used to read the temporary text and construct a formatted Word document.
  4. Temporary files are cleaned up automatically.
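
The four steps can be sketched roughly as follows. The OCR step is passed in as a callable because the exact docr command line is not shown in this README. The names convert, run_ocr, and build_docx are hypothetical, and the real script may be organized differently.

```python
import os
import tempfile


def convert(pdf_path, docx_path, run_ocr, build_docx):
    """Run OCR into a temp file, build the .docx, then clean up.

    run_ocr(pdf_path, text_path) and build_docx(text, docx_path) are
    placeholders for steps 1 and 3; their real implementations are not
    shown in the README.
    """
    fd, text_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)  # only the path is needed; run_ocr writes the file
    try:
        run_ocr(pdf_path, text_path)  # step 1: OCR the scanned PDF
        with open(text_path, encoding="utf-8") as fh:
            text = fh.read()  # step 2: read back the temporary text
        build_docx(text, docx_path)  # step 3: construct the Word document
    finally:
        os.remove(text_path)  # step 4: clean up even if a step fails
```

A build_docx implementation based on python-docx could create a Document, call add_paragraph for each line of text, and save the result to docx_path.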
Usage Guidance
This skill appears to do what it claims: it downloads a docr binary, runs it on PDFs, and builds a .docx from the extracted text. Before installing or running it:
  1. Inspect the referenced GitHub repo/releases (https://github.com/scottkiss/doc-ocr) and verify the release and maintainer match your trust criteria; prefer checking a checksum or signed release if available.
  2. Run the install and conversion in a sandbox or VM if you will process sensitive documents, because the downloaded binary is third-party native code and could perform network activity.
  3. If you plan to use a remote engine (Gemini), understand that text may leave your machine and follow your organization's data-sharing policies; SKILL.md suggests storing the API key in ~/.ocr/config (this is optional but not declared elsewhere).
  4. On Windows there may be an executable extension mismatch (install creates docr.exe but the Python script looks for 'docr'); verify behavior on your platform before automating.
  5. If you need stronger assurance, request the upstream source code or a reproducible build of the binary, or replace the binary with a vetted OCR implementation.
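
The Windows extension mismatch in point 4 can be checked before automating. A minimal sketch, assuming the binary sits in a directory such as scripts/docr/ (the exact file layout is an assumption) and looking for both docr and docr.exe. find_docr is a hypothetical helper, not part of the skill.

```python
from pathlib import Path


def find_docr(bin_dir):
    """Return the first existing docr binary in bin_dir, or None."""
    for name in ("docr", "docr.exe"):
        candidate = Path(bin_dir) / name
        if candidate.is_file():
            return candidate
    return None
```

If find_docr returns a path ending in .exe while the script expects a plain docr, the mismatch described above applies to that platform.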
Capability Analysis
Type: OpenClaw Skill
Name: pdf2word-skills
Version: 1.0.0
The skill provides a legitimate utility for converting scanned PDF documents into Word files using the 'docr' OCR engine. It includes an installation script (scripts/install.sh) that downloads the necessary binary from a public GitHub repository and a Python script (scripts/pdf2word.py) that uses the 'python-docx' library to format the extracted text. The code is transparent, follows its stated purpose, and lacks any indicators of malicious intent, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
The name/description match the delivered assets: a Python script that calls a local 'docr' binary and uses python-docx to produce a .docx. The included install.sh downloads the expected OCR binary from a GitHub releases URL — this is consistent with providing a local OCR engine.
Instruction Scope
SKILL.md stays on task (install binary, pip deps, run script). It also documents optional use of remote engines (e.g., Gemini) and instructs creating ~/.ocr/config with a gemini_api_key. That config step is outside the skill directory and is not declared in required env/config fields; it's optional but relevant to user privacy and should be noted.
Install Mechanism
The install script downloads a single binary from a GitHub releases URL and writes it under scripts/docr/. Downloading from GitHub releases is a typical, low-risk mechanism compared with arbitrary IPs or paste sites. The script does not extract archives or run additional installers. However, the binary will be executed, so its provenance should be validated.
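One way to validate provenance is to compare the downloaded file's SHA-256 digest against a value published by the maintainer, if one is available. The sketch below uses only Python's standard library. No checksum is given in this listing, so the expected digest must come from a source you trust, and the function names are illustrative.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Running matches on the downloaded binary before its first execution gives a basic integrity check. It does not establish that the published binary itself is trustworthy.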
Credentials
No required environment variables are declared, and the Python script does not read secrets itself. However, SKILL.md asks users to store API keys in ~/.ocr/config for optional remote engines (Gemini). That is reasonable for optional remote OCR but is not declared in requires.env and should be considered a configuration that affects privacy/security for sensitive docs.
Persistence & Privilege
The skill does not request always:true, does not modify other skills, and only places the downloaded binary under the skill's scripts directory (and optionally asks the user to create ~/.ocr/config). There is no permanent elevated privilege requested.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf2word-skills
  3. After installation, invoke the skill by name or use /pdf2word-skills
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
- Initial release of pdf2word-skills.
- Converts scanned PDF documents to editable Word (.docx) files using a free, local OCR engine.
- Supports additional OCR engines through the underlying `docr` tool, including Google Gemini API.
- Provides options for handling tables and custom OCR arguments.
- Setup scripts and simple command-line usage instructions included.
Metadata
Slug pdf2word-skills
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Pdf2word Skills?

Pdf2word Skills converts scanned PDF documents into Word text documents using a free, local OCR engine or a remote API. It is an AI Agent Skill for Claude Code / OpenClaw, with 203 downloads so far.

How do I install Pdf2word Skills?

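Run "/install pdf2word-skills" in the OpenClaw or Claude Code chat to install it. Before first use, complete the steps under Prerequisites: run scripts/install.sh to download the OCR binary and install the Python dependencies from scripts/requirements.txt.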

Is Pdf2word Skills free?

Yes, Pdf2word Skills is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Pdf2word Skills support?

Pdf2word Skills is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Pdf2word Skills?

It is built and maintained by scottkiss (@scottkiss); the current version is v1.0.0.
