CamScanner-Pdf2Markdown
/install camscanner-pdf2markdown-office
CamScanner PDF to Markdown
Overview
CamScanner provides a high-precision document parsing engine that converts PDF documents to Markdown format. It intelligently decomposes document paragraphs, precisely recognizes tables and multiple element types, and outputs structured results in reading order — empowering large language models to accurately understand document content. The workflow is a 3-step pipeline: upload the PDF, convert it, then download the result.
When to Use
- User wants to convert a PDF to Markdown
- User wants to extract text/content from a PDF as Markdown
- User has a PDF and needs it as Markdown for further editing or processing
Privacy & Data
Important: Privacy & Data Flow Notice
- Third-party service: This skill sends your files to CamScanner's official servers (
ai-tools.camscanner.com) for processing.- Data retention: CamScanner servers process your files in real-time. Files are not permanently stored on the server.
- Local files: Output files are saved to your local filesystem at the path you specify.
API Reference
Base URL: https://ai-tools.camscanner.com
Supported Conversions
| source_type | target_type | Output |
|---|---|---|
| md | .md |
Step 1: Upload PDF
BASE="https://ai-tools.camscanner.com"
IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
-H "Content-Type: application/octet-stream" \
--data-binary "@/path/to/document.pdf" | jq -r '.tool_result.data.file_id')
Response:
{
"code": 200,
"tool": "upload_file",
"tool_result": {
"success": true,
"data": {
"file_id": "file_1741857600_ab12cd34ef56",
"size": 24576
}
}
}
Step 2: Convert PDF to Markdown
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
| jq -r '.tool_result.data.file_id')
Response:
{
"code": 200,
"tool": "convert_pdf",
"tool_result": {
"success": true,
"data": {
"file_id": "file_1741857701_9988aabbccdd",
"target_type": "md"
}
}
}
Step 3: Download Result
curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$OUT_FILE_ID\"}" \
-o /path/to/output.md
Critical: The response_mode=raw query parameter is required to get the binary file. Without it, the response is JSON.
Quick Reference: Complete Pipeline
BASE="https://ai-tools.camscanner.com"
INPUT_PDF="/path/to/document.pdf"
OUTPUT_FILE="/path/to/output.md"
# Upload
IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
-H "Content-Type: application/octet-stream" \
--data-binary "@$INPUT_PDF" | jq -r '.tool_result.data.file_id')
# Convert
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
| jq -r '.tool_result.data.file_id')
# Download
curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$OUT_FILE_ID\"}" \
-o "$OUTPUT_FILE"
Common Mistakes
| Mistake | Fix |
|---|---|
Forgetting response_mode=raw on download |
Always append ?response_mode=raw to the download URL |
| Wrong Content-Type on upload | Upload uses application/octet-stream, not multipart/form-data |
| Using GET instead of POST | All three endpoints use POST |
Missing source_type in convert request |
Always include "source_type": "pdf" |
Missing output_mode in convert request |
Always include "output_mode": "file_id" to get a downloadable file_id |
Error Handling
Check each step before proceeding:
# After upload
if [ -z "$IN_FILE_ID" ] || [ "$IN_FILE_ID" = "null" ]; then
echo "Upload failed"; exit 1
fi
# After convert
if [ -z "$OUT_FILE_ID" ] || [ "$OUT_FILE_ID" = "null" ]; then
echo "Conversion failed"; exit 1
fi
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install camscanner-pdf2markdown-office - 安装完成后,直接呼叫该 Skill 的名称或使用
/camscanner-pdf2markdown-office触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
CamScanner-Pdf2Markdown 是什么?
Use CamScanner to convert PDF documents to Markdown format, powered by a high-precision document parsing engine that intelligently decomposes paragraphs, pre... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 91 次。
如何安装 CamScanner-Pdf2Markdown?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install camscanner-pdf2markdown-office」即可一键安装,无需额外配置。
CamScanner-Pdf2Markdown 是免费的吗?
是的,CamScanner-Pdf2Markdown 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。
CamScanner-Pdf2Markdown 支持哪些平台?
CamScanner-Pdf2Markdown 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 CamScanner-Pdf2Markdown?
由 CamScanner-AI(@camscanner-ai)开发并维护,当前版本 v1.0.0。