Description

Document format converter. Use when user needs to convert between Word, PDF, Markdown, HTML formats. Supports docx-to-pdf, md-to-pdf, md-to-docx, html-to-pdf...

README (SKILL.md)

Document Converter

Name: Doc Converter
Author: tobewin

High-fidelity document format converter supporting Word, PDF, Markdown, and HTML formats.

Features

📄 Word → PDF: Convert .docx to PDF
📝 Markdown → PDF: Convert .md to PDF
📝 Markdown → Word: Convert .md to .docx
🌐 HTML → PDF: Convert .html to PDF
🎨 High Fidelity: Preserve formatting, fonts, images
🌍 Multi-Language: Chinese, English, etc.
✅ Cross-Platform: Windows, macOS, Linux

Supported Conversions

From	To	Method
.docx	.pdf	python-docx + fpdf2
.md	.pdf	markdown + fpdf2
.md	.docx	markdown + python-docx
.html	.pdf	pdfkit/wkhtmltopdf
.docx	.md	python-docx extraction

Trigger Conditions

"转换文档格式" / "Convert document format"
"Word转PDF" / "Convert Word to PDF"
"Markdown转Word" / "Convert Markdown to Word"
"HTML转PDF" / "Convert HTML to PDF"
"doc-converter"

Step 1: Understand Requirements

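Please provide the following information:

Source file path:
Target format: (pdf/docx/md)
Special requirements: (preserve formatting / compression / watermark, etc.)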

Step 2: Conversion Scripts

Word → PDF

python3 << 'PYEOF'
import os
import sys
from docx import Document
from fpdf import FPDF

class DocxToPdf:
    def __init__(self):
        self.pdf = FPDF()
        self.pdf.set_auto_page_break(auto=True, margin=15)
    
    def convert(self, docx_path, output_path):
        """Convert Word document to PDF"""
        
        doc = Document(docx_path)
        self.pdf.add_page()
        
        for para in doc.paragraphs:
            if not para.text.strip():
                continue
            
            # Determine style
            style = para.style.name
            
            if 'Heading 1' in style:
                self.pdf.set_font('Helvetica', 'B', 18)
                self.pdf.ln(5)
                self.pdf.multi_cell(0, 10, para.text)
                self.pdf.ln(3)
            elif 'Heading 2' in style:
                self.pdf.set_font('Helvetica', 'B', 14)
                self.pdf.ln(3)
                self.pdf.multi_cell(0, 8, para.text)
                self.pdf.ln(2)
            elif 'Heading 3' in style:
                self.pdf.set_font('Helvetica', 'B', 12)
                self.pdf.ln(2)
                self.pdf.multi_cell(0, 7, para.text)
                self.pdf.ln(2)
            else:
                self.pdf.set_font('Helvetica', '', 11)
                self.pdf.multi_cell(0, 6, para.text)
                self.pdf.ln(2)
        
        # Handle tables (appended after all paragraphs, not at their original position)
        for table in doc.tables:
            self.pdf.ln(5)
            self._add_table(table)
        
        self.pdf.output(output_path)
        return output_path
    
    def _add_table(self, table):
        """Add table to PDF"""
        # Get column widths
        num_cols = len(table.columns)
        col_width = 190 / num_cols
        
        # Header
        self.pdf.set_font('Helvetica', 'B', 10)
        self.pdf.set_fill_color(49, 130, 206)
        self.pdf.set_text_color(255, 255, 255)
        
        for cell in table.rows[0].cells:
            self.pdf.cell(col_width, 8, cell.text, 1, 0, 'C', True)
        self.pdf.ln()
        
        # Data rows
        self.pdf.set_font('Helvetica', '', 9)
        self.pdf.set_text_color(0, 0, 0)
        
        for row in table.rows[1:]:
            for cell in row.cells:
                self.pdf.cell(col_width, 7, cell.text, 1, 0, 'L')
            self.pdf.ln()

# Convert
converter = DocxToPdf()
converter.convert('input.docx', 'output.pdf')
PYEOF

Markdown → PDF

python3 << 'PYEOF'
import os
import markdown
from fpdf import FPDF

class MarkdownToPdf:
    def __init__(self):
        self.pdf = FPDF()
        self.pdf.set_auto_page_break(auto=True, margin=15)
    
    def convert(self, md_path, output_path):
        """Convert Markdown to PDF"""
        
        with open(md_path, 'r', encoding='utf-8') as f:
            content = f.read()
        
        # Parse markdown
        lines = content.split('\n')
        in_code_block = False
        
        self.pdf.add_page()
        
        for raw_line in lines:
            line = raw_line.strip()
            
            # Code fences toggle code-block mode; the fence itself is not rendered
            if line.startswith('```'):
                in_code_block = not in_code_block
                continue
            
            # Code block content is rendered in a monospace font
            if in_code_block:
                if line:
                    self.pdf.set_font('Courier', '', 9)
                    self.pdf.multi_cell(0, 5, raw_line.rstrip())
                    self.pdf.ln(0)  # return the cursor to the left margin
                else:
                    self.pdf.ln(5)
                continue
            
            if not line:
                self.pdf.ln(3)
                continue
            
            # Headers
            if line.startswith('# '):
                self.pdf.set_font('Helvetica', 'B', 20)
                self.pdf.ln(5)
                self.pdf.multi_cell(0, 10, line[2:])
                self.pdf.ln(3)
            elif line.startswith('## '):
                self.pdf.set_font('Helvetica', 'B', 16)
                self.pdf.ln(4)
                self.pdf.multi_cell(0, 8, line[3:])
                self.pdf.ln(2)
            elif line.startswith('### '):
                self.pdf.set_font('Helvetica', 'B', 14)
                self.pdf.ln(3)
                self.pdf.multi_cell(0, 7, line[4:])
                self.pdf.ln(2)
            # Lists (ln() after multi_cell moves the cursor back to the left margin)
            elif line.startswith('- ') or line.startswith('* '):
                self.pdf.set_font('Helvetica', '', 11)
                self.pdf.cell(10, 6, '', 0, 0)
                self.pdf.multi_cell(0, 6, line[2:])
                self.pdf.ln(1)
            elif line.startswith('1. '):
                self.pdf.set_font('Helvetica', '', 11)
                self.pdf.cell(10, 6, '', 0, 0)
                self.pdf.multi_cell(0, 6, line[3:])
                self.pdf.ln(1)
            # Regular text (table rows and blockquotes are skipped)
            elif not line.startswith('|') and not line.startswith('>'):
                self.pdf.set_font('Helvetica', '', 11)
                self.pdf.multi_cell(0, 6, line)
                self.pdf.ln(1)
        
        self.pdf.output(output_path)
        return output_path

# Convert
converter = MarkdownToPdf()
converter.convert('input.md', 'output.pdf')
PYEOF

Markdown → Word

python3 << 'PYEOF'
import os
import markdown
from docx import Document
from docx.shared import Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH

class MarkdownToDocx:
    def __init__(self):
        self.doc = Document()
    
    def convert(self, md_path, output_path):
        """Convert Markdown to Word"""
        
        with open(md_path, 'r', encoding='utf-8') as f:
            content = f.read()
        
        lines = content.split('\n')
        in_code_block = False
        
        for line in lines:
            line = line.strip()
            
            if not line:
                continue
            
            # Code blocks
            if line.startswith('```'):
                in_code_block = not in_code_block
                continue
            
            if in_code_block:
                p = self.doc.add_paragraph()
                run = p.add_run(line)
                run.font.name = 'Courier New'
                run.font.size = Pt(9)
                continue
            
            # Headers
            if line.startswith('# '):
                self.doc.add_heading(line[2:], level=1)
            elif line.startswith('## '):
                self.doc.add_heading(line[3:], level=2)
            elif line.startswith('### '):
                self.doc.add_heading(line[4:], level=3)
            # Lists
            elif line.startswith('- ') or line.startswith('* '):
                self.doc.add_paragraph(line[2:], style='List Bullet')
            elif line.startswith('1. '):
                self.doc.add_paragraph(line[3:], style='List Number')
            # Bold (italic markers are not handled)
            elif '**' in line:
                p = self.doc.add_paragraph()
                parts = line.split('**')
                for i, part in enumerate(parts):
                    if part:
                        run = p.add_run(part)
                        if i % 2 == 1:
                            run.bold = True
            # Regular text
            else:
                self.doc.add_paragraph(line)
        
        self.doc.save(output_path)
        return output_path

# Convert
converter = MarkdownToDocx()
converter.convert('input.md', 'output.docx')
PYEOF
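
The bold handling in the script above relies on a small trick: after splitting a line on `**`, the odd-indexed parts are the ones that sat between a pair of markers. A minimal standalone sketch of that step follows; the helper name `split_bold` is hypothetical and not part of the skill.

```python
def split_bold(line):
    """Split a line on '**' markers into (text, is_bold) segments."""
    segments = []
    for i, part in enumerate(line.split('**')):
        if part:
            # Odd-indexed parts were enclosed by a pair of '**' markers
            segments.append((part, i % 2 == 1))
    return segments


print(split_bold('Use **fpdf2** for PDF output'))
# [('Use ', False), ('fpdf2', True), (' for PDF output', False)]
```

Note that an unbalanced marker (for example `a **b`) would mark everything after it as bold, so input with stray asterisks may need extra checks.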

HTML → PDF

python3 << 'PYEOF'
import os
import subprocess

class HtmlToPdf:
    def convert(self, html_path, output_path):
        """Convert HTML to PDF using wkhtmltopdf"""
        
        # Check if wkhtmltopdf is installed
        try:
            subprocess.run(['wkhtmltopdf', '--version'], 
                         capture_output=True, check=True)
        except FileNotFoundError:
            print("❌ wkhtmltopdf not installed")
            print("Install: brew install wkhtmltopdf (macOS)")
            print("         sudo apt install wkhtmltopdf (Linux)")
            return None
        
        # Convert
        cmd = [
            'wkhtmltopdf',
            '--enable-local-file-access',
            '--page-size', 'A4',
            '--margin-top', '20mm',
            '--margin-bottom', '20mm',
            '--margin-left', '20mm',
            '--margin-right', '20mm',
            html_path,
            output_path
        ]
        
        result = subprocess.run(cmd, capture_output=True, text=True)
        
        if result.returncode == 0:
            return output_path
        else:
            print(f"❌ Conversion failed: {result.stderr}")
            return None

# Convert
converter = HtmlToPdf()
converter.convert('input.html', 'output.pdf')
PYEOF
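
Word → Markdown

The Supported Conversions table lists .docx → .md via python-docx extraction, but no script for it appears among the scripts above. The following is a minimal sketch of what such an extraction might look like, assuming Word's built-in style names ('Heading 1', 'List Bullet', 'List Number'). The function names `paragraph_to_markdown` and `docx_to_markdown` are hypothetical, and tables and images are not handled.

```python
def paragraph_to_markdown(style_name, text):
    """Map one paragraph (style name + text) to a Markdown line."""
    text = text.strip()
    if not text:
        return ''
    # Built-in heading styles are named 'Heading 1', 'Heading 2', ...
    if style_name.startswith('Heading '):
        level = style_name.split()[-1]
        if level.isdigit():
            return '#' * min(max(int(level), 1), 6) + ' ' + text
    if style_name.startswith('List Bullet'):
        return '- ' + text
    if style_name.startswith('List Number'):
        return '1. ' + text
    return text


def docx_to_markdown(docx_path, output_path):
    """Extract paragraphs from a .docx file and write them as Markdown."""
    from docx import Document  # python-docx, imported lazily

    doc = Document(docx_path)
    lines = [paragraph_to_markdown(p.style.name, p.text) for p in doc.paragraphs]
    with open(output_path, 'w', encoding='utf-8') as f:
        # Blank line between blocks; empty paragraphs are dropped
        f.write('\n\n'.join(line for line in lines if line))
        f.write('\n')
    return output_path


print(paragraph_to_markdown('Heading 2', 'Overview'))
```

Keeping the style mapping in a pure function makes it easy to check without a .docx file at hand; only `docx_to_markdown` needs python-docx installed.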

Auto-Detection

import os

def detect_conversion_type(source_path, target_path):
    """Detect conversion type from file extensions"""
    
    source_ext = os.path.splitext(source_path)[1].lower()
    target_ext = os.path.splitext(target_path)[1].lower()
    
    conversions = {
        ('.docx', '.pdf'): 'docx_to_pdf',
        ('.md', '.pdf'): 'md_to_pdf',
        ('.md', '.docx'): 'md_to_docx',
        ('.html', '.pdf'): 'html_to_pdf',
        ('.docx', '.md'): 'docx_to_md',
    }
    
    return conversions.get((source_ext, target_ext))
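
The detection function only returns a type name; the source does not show how that name is turned into a call. One possible dispatch step is sketched below, assuming a hypothetical `handlers` dict that maps type names to callables taking `(source_path, target_path)`. The detection logic is repeated so the sketch runs on its own.

```python
import os

CONVERSIONS = {
    ('.docx', '.pdf'): 'docx_to_pdf',
    ('.md', '.pdf'): 'md_to_pdf',
    ('.md', '.docx'): 'md_to_docx',
    ('.html', '.pdf'): 'html_to_pdf',
    ('.docx', '.md'): 'docx_to_md',
}


def detect_conversion_type(source_path, target_path):
    """Detect conversion type from file extensions (case-insensitive)."""
    source_ext = os.path.splitext(source_path)[1].lower()
    target_ext = os.path.splitext(target_path)[1].lower()
    return CONVERSIONS.get((source_ext, target_ext))


def convert(source_path, target_path, handlers):
    """Look up the conversion type and call the matching handler.

    `handlers` is a hypothetical registry, e.g.
    {'md_to_pdf': lambda s, t: MarkdownToPdf().convert(s, t)}.
    """
    kind = detect_conversion_type(source_path, target_path)
    if kind is None:
        raise ValueError(f'Unsupported conversion: {source_path} -> {target_path}')
    if kind not in handlers:
        raise NotImplementedError(f'No handler registered for {kind}')
    return handlers[kind](source_path, target_path)


# Stub handler for illustration; a real one would call a converter class
print(convert('notes.md', 'notes.docx', {'md_to_docx': lambda s, t: t}))
```

Raising on an unknown extension pair, rather than silently returning None, surfaces unsupported requests early.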

Quality Settings

PDF Quality

PDF_SETTINGS = {
    'high': {
        'dpi': 300,
        'quality': 95,
        'compress': False
    },
    'medium': {
        'dpi': 150,
        'quality': 85,
        'compress': True
    },
    'low': {
        'dpi': 72,
        'quality': 70,
        'compress': True
    }
}
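
None of the scripts shown above reads these presets. As one possible use, and assuming the HTML → PDF path, they could be translated into wkhtmltopdf's `--dpi`, `--image-quality`, and `--no-pdf-compression` options. The helper name `wkhtmltopdf_quality_args` and the fallback to 'medium' are assumptions made for this sketch.

```python
PDF_SETTINGS = {
    'high': {'dpi': 300, 'quality': 95, 'compress': False},
    'medium': {'dpi': 150, 'quality': 85, 'compress': True},
    'low': {'dpi': 72, 'quality': 70, 'compress': True},
}


def wkhtmltopdf_quality_args(level='medium'):
    """Translate a quality preset into wkhtmltopdf command-line flags."""
    # Unknown levels fall back to 'medium' (an assumption, not from the source)
    settings = PDF_SETTINGS.get(level, PDF_SETTINGS['medium'])
    args = [
        '--dpi', str(settings['dpi']),
        '--image-quality', str(settings['quality']),
    ]
    if not settings['compress']:
        # wkhtmltopdf compresses by default; this flag turns it off
        args.append('--no-pdf-compression')
    return args


print(wkhtmltopdf_quality_args('high'))
# ['--dpi', '300', '--image-quality', '95', '--no-pdf-compression']
```

The returned list could be spliced into the `cmd` list of the HTML → PDF script before the input and output paths.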

Security Notes

✅ No network calls or external endpoints
✅ No credentials or API keys required
✅ Local file processing only
✅ Open source dependencies
✅ No data uploaded to external servers

Notes

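Word → PDF requires python-docx and fpdf2
HTML → PDF requires the wkhtmltopdf system dependency
The best way to preserve formatting is to use the same fonts
Chinese support requires a Unicode font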
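
The note on Unicode fonts matters because the PDF scripts select the built-in Helvetica font, which cannot encode Chinese text. A minimal sketch of registering a TrueType font with fpdf2 follows, assuming a recent fpdf2 release where `add_font` takes the family, style, and file path. The candidate font paths and the helper names are hypothetical; adjust them to fonts actually installed.

```python
import os

# Hypothetical candidate locations; none is guaranteed to exist on a given system
FONT_CANDIDATES = [
    '/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf',
    '/System/Library/Fonts/Supplemental/Arial Unicode.ttf',
    'C:/Windows/Fonts/simhei.ttf',
]


def find_font(candidates=FONT_CANDIDATES):
    """Return the first font path that exists on disk, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def new_unicode_pdf(font_path, size=11):
    """Create an FPDF object with a Unicode TrueType font selected."""
    from fpdf import FPDF  # fpdf2, imported lazily

    pdf = FPDF()
    pdf.set_auto_page_break(auto=True, margin=15)
    # 'UnicodeFont' is an arbitrary family name chosen for this sketch
    pdf.add_font('UnicodeFont', '', font_path)
    pdf.set_font('UnicodeFont', size=size)
    return pdf


font = find_font()
print(font if font else 'No candidate font found; pass a font path explicitly')
```

Only the regular style is registered here, so headings that request the 'B' style would need a bold font file registered under the same family name as well.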

Usage Guidance

This skill appears coherent for converting documents and only requires Python and some pip packages. Before installing or running: (1) be aware the SKILL runs Python code you should trust — it will read and write local files you specify, so avoid passing sensitive system paths; (2) the skill will likely prompt you for file paths and options and may install Python packages via pip into the agent environment — ensure you are OK with those packages; (3) HTML→PDF may need wkhtmltopdf installed separately; (4) if you want extra caution, inspect the full SKILL.md content or copy the conversion scripts and run them in a controlled environment rather than giving the agent direct execution rights.

Capability Analysis

Type: OpenClaw Skill
Name: document-converter-pro
Version: 1.0.1

The skill bundle provides legitimate document conversion functionality between Word, PDF, Markdown, and HTML formats. The Python scripts in SKILL.md use standard, well-known libraries (python-docx, fpdf2, markdown) and follow safe practices, such as using list-based arguments in subprocess.run to prevent shell injection. No evidence of data exfiltration, malicious execution, or prompt injection was found.

Capability Assessment

✓ Purpose & Capability

Name/description (document conversions) align with the included scripts and referenced Python libraries (python-docx, fpdf2, markdown, optional wkhtmltopdf). Required binary is only python3, which is appropriate for the provided Python-based converters.

✓ Instruction Scope

SKILL.md contains explicit conversion scripts and prompts the user for source file path, target format, and options. The scripts read local input files and write local output files; they do not reference unrelated system files, environment secrets, or external endpoints. No broad or vague instructions requesting extra system context are present.

✓ Install Mechanism

This is an instruction-only skill with no install spec or remote downloads. It lists pip dependencies in SKILL.md (pip install python-docx fpdf2 markdown) and mentions an optional system tool (wkhtmltopdf). That is proportional to the functionality and does not involve untrusted URLs or archive extraction.

✓ Credentials

No environment variables or credentials are requested. The dependencies and system tools listed are consistent with the conversion tasks; there are no unrelated secret-like environment requirements.

✓ Persistence & Privilege

Skill does not request always: true or any special persistent privileges. It is user-invocable and allows normal autonomous invocation, which is the platform default and acceptable here.

Version History

v1.0.1

Fixed the dependency declaration: removed node and standardized on python-docx and fpdf2

v1.0.0

Professional document format converter: supports conversion between Word, PDF, Markdown, and HTML with high-fidelity output

Metadata

Slug document-converter-pro

Version 1.0.1

License MIT-0

All-time Installs 3

Active Installs 2

Total Versions 2

Frequently Asked Questions

What is Doc Converter?

Document format converter. Use when user needs to convert between Word, PDF, Markdown, HTML formats. Supports docx-to-pdf, md-to-pdf, md-to-docx, html-to-pdf... It is an AI Agent Skill for Claude Code / OpenClaw, with 374 downloads so far.

How do I install Doc Converter?

Run "/install document-converter-pro" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Doc Converter free?

Yes, Doc Converter is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Doc Converter support?

Doc Converter is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Doc Converter?

It is built and maintained by ToBeWin (@tobewin); the current version is v1.0.1.
