
Pdf2word Skills

Author: scottkiss · GitHub · v1.0.0 · MIT-0
Cross-platform · Security check passed
Total downloads: 203 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install pdf2word-skills
Description
Convert scanned PDF documents into Word text documents using a free, local OCR engine or a remote API.
Usage Instructions (SKILL.md)

PDF to Word Converter

A skill to extract text from scanned PDF documents and convert them into reusable Word (.docx) files using the free, local docr OCR engine.

Prerequisites

  1. Initialize the OCR engine by downloading the binaries:
    bash scripts/install.sh
    
  2. Install the required Python dependencies:
    pip install -r scripts/requirements.txt
    

Usage

Run the Python script, passing the input PDF path and the desired output .docx path. You can also append any additional standard docr arguments (such as the engine selection).

python scripts/pdf2word.py <input.pdf> <output.docx> [docr_args...]

Examples

Convert a single file with the default local engine:

python scripts/pdf2word.py sample.pdf sample_output.docx
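
When several files need converting, the documented command line can be assembled from Python. The following is a minimal sketch, not part of the skill: `build_command` and `batch_commands` are hypothetical helper names, and the only thing taken from this README is the argument order (input, output, then optional docr arguments).

```python
import sys
from pathlib import Path


def build_command(input_pdf, output_docx, docr_args=()):
    """Return the argv list for one conversion, mirroring the documented syntax."""
    return [
        sys.executable,         # run with the current Python interpreter
        "scripts/pdf2word.py",  # script path as given in this README
        str(input_pdf),
        str(output_docx),
        *docr_args,             # optional extra docr arguments, appended unchanged
    ]


def batch_commands(pdf_dir, out_dir, docr_args=()):
    """Build one command per *.pdf file in pdf_dir, sorted by file name (hypothetical helper)."""
    out_dir = Path(out_dir)
    return [
        build_command(pdf, out_dir / (pdf.stem + ".docx"), docr_args)
        for pdf in sorted(Path(pdf_dir).glob("*.pdf"))
    ]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` from the skill's root directory, so that a failed conversion raises instead of passing silently.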

Using Other API Engines

By default, the script uses the local RapidOCR engine. The underlying docr tool also supports other engines like the Google Gemini API for potentially higher recognition accuracy on complex layouts.

To use Gemini, first configure your API key:

mkdir -p ~/.ocr
echo "gemini_api_key=your_gemini_key" > ~/.ocr/config

Then pass the -engine gemini argument to the script:

python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini

If your document has tables, you can use the -prompt argument to ask Gemini to output them in Markdown format, so that the script can parse them into native Word tables:

python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini -prompt "Extract all text and preserve tables in Markdown format using | symbols."
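
To illustrate what parsing Markdown tables can involve, here is a minimal sketch of turning pipe-delimited lines into rows of cells. It is an assumption about the general technique, not the parser used by scripts/pdf2word.py; `parse_markdown_table` is a hypothetical name, and escaped pipes inside cells are not handled.

```python
def parse_markdown_table(lines):
    """Turn pipe-delimited Markdown table lines into a list of rows of cell strings."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table line
        # Drop exactly one leading pipe and, if present, one trailing pipe.
        inner = line[1:-1] if len(line) > 1 and line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all("-" in cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows
```

For the input lines `| Item | Qty |`, `|------|-----|`, `| Pen | 2 |`, this returns `[["Item", "Qty"], ["Pen", "2"]]`. Rows in that shape map naturally onto a python-docx table created with `Document.add_table(rows=..., cols=...)`.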

How it Works

  1. The script calls docr, which uses the specified OCR model (RapidOCR by default) to read text from the scanned PDF.
  2. The extracted text is temporarily stored.
  3. The python-docx library is used to read the temporary text and construct a formatted Word document.
  4. Temporary files are cleaned up automatically.
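
The four steps above can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: `run_ocr` and `write_docx` are hypothetical callables standing in for the docr invocation and the python-docx step, and the temporary file name is made up.

```python
import tempfile
from pathlib import Path


def convert(input_pdf, output_docx, run_ocr, write_docx):
    """Run OCR into a temporary text file, then build the .docx from it.

    run_ocr(pdf_path, text_path) and write_docx(text, docx_path) are
    hypothetical callables standing in for docr and python-docx.
    """
    # The temporary directory and its contents are deleted when the block
    # exits, even if OCR or document construction raises (step 4).
    with tempfile.TemporaryDirectory() as tmp:
        text_path = Path(tmp) / "ocr_output.txt"
        run_ocr(Path(input_pdf), text_path)           # steps 1 and 2: OCR, store text
        text = text_path.read_text(encoding="utf-8")  # step 3: read the stored text
        write_docx(text, Path(output_docx))           # step 3: build the document
    return Path(output_docx)
```

Using a context manager for the temporary directory keeps cleanup automatic without an explicit try/finally.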
Security Recommendations
This skill appears to do what it claims: it downloads a docr binary, runs it on PDFs, and builds a .docx from the extracted text. Before installing or running it:
  1. Inspect the referenced GitHub repo/releases (https://github.com/scottkiss/doc-ocr) and verify that the release and maintainer match your trust criteria; prefer checking a checksum or signed release if available.
  2. Run the install and conversion in a sandbox or VM if you will process sensitive documents, because the downloaded binary is third-party native code and could perform network activity.
  3. If you plan to use a remote engine (Gemini), understand that text may leave your machine, and follow your organization's data-sharing policies. SKILL.md suggests storing the API key in ~/.ocr/config (this is optional but not declared elsewhere).
  4. On Windows there may be an executable extension mismatch (the install step creates docr.exe but the Python script looks for 'docr'); verify behavior on your platform before automating.
  5. If you need stronger assurance, request the upstream source code or a reproducible build of the binary, or replace the binary with a vetted OCR implementation.
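
The Windows extension mismatch mentioned above can be checked with a small platform-aware lookup. This is a minimal sketch, assuming the binary sits directly in the install directory; `docr_binary_path` is a hypothetical helper and is not part of the skill.

```python
import platform
from pathlib import Path


def docr_binary_path(install_dir, system=None):
    """Return the expected docr executable path for the given OS name.

    system defaults to platform.system(), which is "Windows" on Windows.
    """
    system = system or platform.system()
    name = "docr.exe" if system == "Windows" else "docr"
    return Path(install_dir) / name
```

Calling `docr_binary_path("scripts/docr").exists()` after running the install script is one way to confirm the binary is where a caller expects it before automating conversions.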
Functional Analysis
Type: OpenClaw Skill · Name: pdf2word-skills · Version: 1.0.0
The skill provides a legitimate utility for converting scanned PDF documents into Word files using the 'docr' OCR engine. It includes an installation script (scripts/install.sh) that downloads the necessary binary from a public GitHub repository and a Python script (scripts/pdf2word.py) that uses the 'python-docx' library to format the extracted text. The code is transparent, follows its stated purpose, and lacks any indicators of malicious intent, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
The name/description match the delivered assets: a Python script that calls a local 'docr' binary and uses python-docx to produce a .docx. The included install.sh downloads the expected OCR binary from a GitHub releases URL — this is consistent with providing a local OCR engine.
Instruction Scope
SKILL.md stays on task (install binary, pip deps, run script). It also documents optional use of remote engines (e.g., Gemini) and instructs creating ~/.ocr/config with a gemini_api_key. That config step is outside the skill directory and is not declared in required env/config fields; it's optional but relevant to user privacy and should be noted.
Install Mechanism
The install script downloads a single binary from a GitHub releases URL and writes it under scripts/docr/. Downloading from GitHub releases is a typical, low-risk mechanism compared with arbitrary IPs or paste sites. The script does not extract archives or run additional installers. However, the binary will be executed, so its provenance should be validated.
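
One way to validate provenance is to compare the downloaded file's SHA-256 digest against a value published by the maintainer. The sketch below assumes such a checksum is available, which this page does not confirm; `sha256_of` and `verify_binary` are hypothetical helper names.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_binary(path, expected_hex):
    """Return True if the file's digest equals the expected hex digest."""
    # expected_hex is assumed to come from a trusted source such as a release page.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Running this check before the first execution of the binary means a tampered or truncated download is caught before any native code runs.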
Credentials
No required environment variables are declared, and the Python script does not read secrets itself. However, SKILL.md asks users to store API keys in ~/.ocr/config for optional remote engines (Gemini). That is reasonable for optional remote OCR but is not declared in requires.env and should be considered a configuration that affects privacy/security for sensitive docs.
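
Because the key file lives outside the skill directory, it is worth restricting who can read it. The following minimal sketch writes the same `gemini_api_key=...` line that SKILL.md shows, with owner-only permissions; `write_ocr_config` and its `home` parameter are hypothetical, and the permission bits have limited effect on Windows.

```python
import os
from pathlib import Path


def write_ocr_config(api_key, home=None):
    """Write gemini_api_key to <home>/.ocr/config, readable only by the owner."""
    config_dir = Path(home or Path.home()) / ".ocr"
    config_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    config_path = config_dir / "config"
    # Creating the file with mode 0o600 avoids a window where it is world-readable.
    fd = os.open(config_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(f"gemini_api_key={api_key}\n")
    os.chmod(config_path, 0o600)  # also tighten a file that already existed
    return config_path
```

Like the shell redirect in SKILL.md, this overwrites any existing config file, so other settings stored there would need to be preserved separately.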
Persistence & Privilege
The skill does not request always:true, does not modify other skills, and only places the downloaded binary under the skill's scripts directory (and optionally asks the user to create ~/.ocr/config). There is no permanent elevated privilege requested.
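How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install pdf2word-skills
  3. Once installation finishes, invoke the Skill by name or trigger it with /pdf2word-skills.
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.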
Version History
v1.0.0
- Initial release of pdf2word-skills.
- Converts scanned PDF documents to editable Word (.docx) files using a free, local OCR engine.
- Supports additional OCR engines through the underlying `docr` tool, including the Google Gemini API.
- Provides options for handling tables and custom OCR arguments.
- Setup scripts and simple command-line usage instructions included.
Metadata
Slug: pdf2word-skills
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
FAQ

What is Pdf2word Skills?

Pdf2word Skills converts scanned PDF documents into Word text documents using a free, local OCR engine or a remote API. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 203 total downloads to date.

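How do I install Pdf2word Skills?

Run the command /install pdf2word-skills in the OpenClaw or Claude Code chat box to install it in one step. Before the first conversion, complete the Prerequisites in SKILL.md (download the OCR binary and install the Python dependencies).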

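Is Pdf2word Skills free?

Yes. Pdf2word Skills is completely free under the MIT-0 license and can be downloaded, installed, and used freely.

Which platforms does Pdf2word Skills support?

Pdf2word Skills is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.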

Who develops Pdf2word Skills?

It is developed and maintained by scottkiss (@scottkiss). The current version is v1.0.0.
