CamScanner-Pdf2Markdown

Author: CamScanner-AI · v1.0.0 · MIT-0
cross-platform · ✓ Security check passed
Total downloads: 91 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install camscanner-pdf2markdown-office
Description
Use CamScanner to convert PDF documents to Markdown format, powered by a high-precision document parsing engine that intelligently decomposes paragraphs, pre...
Usage (SKILL.md)

CamScanner PDF to Markdown

Overview

CamScanner provides a high-precision document parsing engine that converts PDF documents to Markdown format. It splits the document into paragraphs, recognizes tables and other element types, and outputs structured results in reading order, which helps large language models interpret the document content accurately. The workflow is a three-step pipeline: upload the PDF, convert it, then download the result.

When to Use

  • User wants to convert a PDF to Markdown
  • User wants to extract text/content from a PDF as Markdown
  • User has a PDF and needs it as Markdown for further editing or processing

Privacy & Data

Important: Privacy & Data Flow Notice

  • Third-party service: This skill sends your files to CamScanner's official servers (ai-tools.camscanner.com) for processing.
  • Data retention: CamScanner servers process your files in real time. Files are not permanently stored on the server.
  • Local files: Output files are saved to your local filesystem at the path you specify.

API Reference

Base URL: https://ai-tools.camscanner.com

Supported Conversions

source_type | target_type | Output
pdf         | md          | .md

Step 1: Upload PDF

BASE="https://ai-tools.camscanner.com"

IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
  -H "Content-Type: application/octet-stream" \
  --data-binary "@/path/to/document.pdf" | jq -r '.tool_result.data.file_id')

Response:

{
  "code": 200,
  "tool": "upload_file",
  "tool_result": {
    "success": true,
    "data": {
      "file_id": "file_1741857600_ab12cd34ef56",
      "size": 24576
    }
  }
}
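
Because the upload sends the raw bytes of whatever path it is given, a local check before Step 1 can catch a wrong file early. The following is a minimal sketch, assuming it is enough to test for the `%PDF-` signature that PDF files begin with. `is_pdf` is a hypothetical helper introduced here for illustration; it is not part of the CamScanner API.

```shell
# is_pdf: hypothetical helper, not part of the CamScanner API.
# Returns 0 when the file exists and its first five bytes are "%PDF-".
is_pdf() {
  [ -f "$1" ] || return 1
  # tr strips NUL bytes so the shell never sees them in the substitution.
  [ "$(head -c 5 "$1" 2>/dev/null | tr -d '\000')" = "%PDF-" ]
}

# Example use before Step 1:
#   is_pdf "/path/to/document.pdf" || { echo "Not a PDF" >&2; exit 1; }
```

This only inspects the file header locally; it does not guarantee that the service will accept the document.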

Step 2: Convert PDF to Markdown

OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
  | jq -r '.tool_result.data.file_id')

Response:

{
  "code": 200,
  "tool": "convert_pdf",
  "tool_result": {
    "success": true,
    "data": {
      "file_id": "file_1741857701_9988aabbccdd",
      "target_type": "md"
    }
  }
}

Step 3: Download Result

curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$OUT_FILE_ID\"}" \
  -o /path/to/output.md

Critical: The response_mode=raw query parameter is required to receive the raw file contents. Without it, the endpoint returns a JSON response instead of the Markdown file.
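
One way to notice a missing `response_mode=raw` is to inspect the saved file after the download. The sketch below assumes that a JSON reply would contain the `tool_result` key seen in the Step 1 and Step 2 responses. `check_download` is a hypothetical helper, and the check is a heuristic rather than documented API behavior.

```shell
# check_download: hypothetical helper, not part of the CamScanner API.
# Returns 0 when the file is non-empty and does not look like a JSON envelope.
check_download() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "Download failed: $file is missing or empty" >&2
    return 1
  fi
  # Assumption: JSON replies carry a "tool_result" key near the start.
  if head -c 512 "$file" | grep -q '"tool_result"'; then
    echo "Got JSON instead of Markdown: was response_mode=raw set?" >&2
    return 1
  fi
  return 0
}

# Example use after Step 3:
#   check_download /path/to/output.md || exit 1
```

A Markdown document that happens to quote `"tool_result"` in its first 512 bytes would be flagged by mistake, so treat a failure as a prompt to look at the file.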

Quick Reference: Complete Pipeline

BASE="https://ai-tools.camscanner.com"
INPUT_PDF="/path/to/document.pdf"
OUTPUT_FILE="/path/to/output.md"

# Upload
IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
  -H "Content-Type: application/octet-stream" \
  --data-binary "@$INPUT_PDF" | jq -r '.tool_result.data.file_id')

# Convert
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
  | jq -r '.tool_result.data.file_id')

# Download
curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$OUT_FILE_ID\"}" \
  -o "$OUTPUT_FILE"

Common Mistakes

Mistake | Fix
Forgetting response_mode=raw on download | Always append ?response_mode=raw to the download URL
Wrong Content-Type on upload | Upload uses application/octet-stream, not multipart/form-data
Using GET instead of POST | All three endpoints use POST
Missing source_type in convert request | Always include "source_type": "pdf"
Missing output_mode in convert request | Always include "output_mode": "file_id" to get a downloadable file_id
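
The last two mistakes are easy to make when the convert body is assembled by hand with escaped quotes. As a sketch of one alternative, the body can be built with jq, which is already required for the pipeline. `convert_payload` is a hypothetical helper name; the field names and values are the ones shown in Step 2.

```shell
# convert_payload: hypothetical helper that prints the Step 2 request body.
# jq does the JSON quoting, so no manual \" escaping is needed.
convert_payload() {
  jq -nc --arg file_id "$1" \
    '{file_id: $file_id, source_type: "pdf", target_type: "md", output_mode: "file_id"}'
}

# Example use in Step 2:
#   curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
#     -H "Content-Type: application/json" \
#     -d "$(convert_payload "$IN_FILE_ID")"
```

Keeping the fixed fields in one place means source_type and output_mode cannot be dropped from an individual request.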

Error Handling

Check each step before proceeding:

# After upload: jq -r prints "null" when the file_id field is missing
if [ -z "$IN_FILE_ID" ] || [ "$IN_FILE_ID" = "null" ]; then
  echo "Upload failed" >&2; exit 1
fi

# After convert: same check on the converted file's ID
if [ -z "$OUT_FILE_ID" ] || [ "$OUT_FILE_ID" = "null" ]; then
  echo "Conversion failed" >&2; exit 1
fi
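
The checks above only test whether a file_id came back. A slightly stricter sketch also looks at the `tool_result.success` flag shown in the sample responses. It assumes that a failed call either sets that flag to false or omits these fields; the error format is not documented here, and `extract_file_id` is a hypothetical helper.

```shell
# extract_file_id: hypothetical helper, not part of the CamScanner API.
# Reads a JSON response on stdin and prints data.file_id only when
# tool_result.success is true. Exits non-zero when nothing is printed
# (jq -e) or when the input is not valid JSON.
extract_file_id() {
  jq -er 'select(.tool_result.success == true) | .tool_result.data.file_id // empty'
}

# Example use in Step 1, replacing the plain jq -r call:
#   IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
#     -H "Content-Type: application/octet-stream" \
#     --data-binary "@$INPUT_PDF" | extract_file_id) \
#     || { echo "Upload failed" >&2; exit 1; }
```

Because the helper is the last command in the pipeline, its exit status becomes the status of the assignment, so the `||` branch runs on any failure.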
Safe Usage Recommendations
This skill behaves as advertised: it uploads a local PDF to a CamScanner endpoint, converts it, and downloads a Markdown file. Before installing/using it: (1) confirm the endpoint (ai-tools.camscanner.com) is the legitimate service you intend to use; (2) avoid uploading highly sensitive documents if you have privacy concerns — the skill sends files off your machine and the SKILL.md's claim about retention cannot be independently verified; (3) ensure curl and jq are available on the system; (4) inspect network activity or use test documents if you want to validate behavior first. If you need offline conversion for sensitive files, consider local tools instead.
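
Point (3) can be checked up front with a short preflight. This is a minimal sketch using the standard `command -v` lookup; `require_cmd` is a hypothetical helper name.

```shell
# require_cmd: hypothetical helper. Returns 1 and names the first
# command from its arguments that is not found on PATH.
require_cmd() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "Missing required command: $cmd" >&2
      return 1
    fi
  done
}

# Example use at the top of a script:
#   require_cmd curl jq || exit 1
```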
Capability Assessment
Purpose & Capability
Name/description match the actions in SKILL.md: upload a PDF to a remote CamScanner endpoint, request a conversion, and download the .md result. Required binaries (curl, jq) are appropriate for the provided shell examples and no unrelated credentials or config paths are requested.
Instruction Scope
Instructions are narrowly scoped to uploading the specified local PDF, invoking conversion endpoints, and saving the result locally. However, the workflow explicitly sends the user's file to a remote service (ai-tools.camscanner.com); that is expected for this skill but is a privacy/data‑exfiltration surface the user should be aware of. The SKILL.md also asserts files are not permanently stored on the server — this claim is not verifiable from the skill itself.
Install Mechanism
This is an instruction-only skill with no install spec and no code files. That minimizes attack surface (nothing is written to disk by the skill itself).
Credentials
The skill requests no environment variables, credentials, or config paths beyond requiring curl/jq on PATH. No excessive secrets or unrelated service credentials are requested.
Persistence & Privilege
The skill is not always-included and does not request any elevated or persistent platform privileges. Autonomous invocation is enabled by default (platform normal) but the skill does not request special persistence or modify other skills.
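How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install camscanner-pdf2markdown-office
  3. Once installed, invoke the Skill by name or trigger it with /camscanner-pdf2markdown-office
  4. Provide the required inputs according to the Skill's parameter description to get structured output.
Version History
v1.0.0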
Major update: The skill is no longer deprecated and now provides full documentation for converting PDF files to Markdown using CamScanner.
  • Fully replaces deprecated notice with a detailed skill overview and usage instructions.
  • Adds step-by-step API guide to upload, convert, and download PDF-to-Markdown results.
  • Highlights privacy and data handling practices.
  • Includes a quick reference command pipeline and troubleshooting tips for common mistakes.
  • Specifies required tools (curl, jq) and integration details.
Metadata
Slug: camscanner-pdf2markdown-office
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 1
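Frequently Asked Questions

What is CamScanner-Pdf2Markdown?

Use CamScanner to convert PDF documents to Markdown format, powered by a high-precision document parsing engine that intelligently decomposes paragraphs, pre... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 91 total downloads to date.

How do I install CamScanner-Pdf2Markdown?

Run the command "/install camscanner-pdf2markdown-office" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is CamScanner-Pdf2Markdown free?

Yes. CamScanner-Pdf2Markdown is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does CamScanner-Pdf2Markdown support?

CamScanner-Pdf2Markdown is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed CamScanner-Pdf2Markdown?

It is developed and maintained by CamScanner-AI (@camscanner-ai); the current version is v1.0.0.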
