Office → Markdown Skill

Name: Office → Markdown Skill
Author: naimalarain13

by Naimal Salahuddin · GitHub ↗ · v1.0.1 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install office-to-markdown

Description

Converts office automation documents — PDF, PPTX, DOCX, XLSX, CSV — into clean, readable Markdown. Use this skill when a user explicitly asks to convert, ext...

README (SKILL.md)

Office → Markdown Skill

Convert any uploaded office document to clean Markdown. All conversion logic lives in scripts/ — load only the script you need.

Security notes

Dependencies are installed into an isolated temp directory (/tmp/office_md_deps/) and pinned to reviewed versions. The system Python environment is not modified.

For scanned or image-only content, pages are sent to Anthropic's vision API. Always ask the user for confirmation before enabling vision (see Workflow step 3).

Script Reference

Format	Extensions	Script
PDF (text + scanned/image)	`.pdf`	`scripts/pdf-to-md.py`
PowerPoint	`.pptx`, `.ppt`	`scripts/pptx-to-md.py`
Word	`.docx`, `.doc`	`scripts/docx-to-md.py`
Excel	`.xlsx`, `.xls`	`scripts/xlsx-to-md.py`
CSV	`.csv`	`scripts/csv-to-md.py`
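The table above can be read as a simple extension-to-script lookup. The following is a minimal sketch of that lookup, not the skill's own code: `SCRIPT_BY_EXTENSION` and `pick_script` are hypothetical names introduced here for illustration, and only the extensions and script paths come from the table.

```python
from pathlib import Path

# Extension -> script mapping, copied from the Script Reference table.
SCRIPT_BY_EXTENSION = {
    ".pdf": "scripts/pdf-to-md.py",
    ".pptx": "scripts/pptx-to-md.py",
    ".ppt": "scripts/pptx-to-md.py",
    ".docx": "scripts/docx-to-md.py",
    ".doc": "scripts/docx-to-md.py",
    ".xlsx": "scripts/xlsx-to-md.py",
    ".xls": "scripts/xlsx-to-md.py",
    ".csv": "scripts/csv-to-md.py",
}


def pick_script(input_path: str) -> str:
    """Return the script for a file, or raise ValueError if unsupported.

    Hypothetical helper: the matching is case-insensitive on the suffix.
    """
    ext = Path(input_path).suffix.lower()
    try:
        return SCRIPT_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext or '(none)'}") from None
```

Lower-casing the suffix means an upload named `Report.PDF` still resolves to the PDF script, and anything outside the table fails loudly instead of being passed to the wrong converter.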

Workflow

1. Confirm conversion intent

Only proceed if the user has explicitly asked to convert, extract, or export the document to Markdown. A bare file upload without a conversion request is not sufficient to trigger this skill.

2. Run the matching script (text-only pass first)

python scripts/<script-name>.py \
  /mnt/user-data/uploads/<input-file> \
  /mnt/user-data/outputs/<stem>.md

Each script installs its own pinned dependencies into /tmp/office_md_deps/ on first run (isolated from the system Python environment).
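One common way to get this kind of isolation is `pip install --target`, followed by adding the target directory to `sys.path`. The sketch below assumes that approach; the README does not say how the scripts actually do it, and `pip_install_command`, `ensure_deps`, and the requirement strings passed to them are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Directory named in this README; everything else here is an assumption.
DEPS_DIR = Path("/tmp/office_md_deps")


def pip_install_command(requirements: list[str], target: Path = DEPS_DIR) -> list[str]:
    """Build a pip command that installs pinned packages into `target` only."""
    return [sys.executable, "-m", "pip", "install", "--quiet",
            "--target", str(target), *requirements]


def ensure_deps(requirements: list[str], target: Path = DEPS_DIR) -> None:
    """Install on first run (empty or missing dir), then make the dir importable."""
    if not target.exists() or not any(target.iterdir()):
        target.mkdir(parents=True, exist_ok=True)
        subprocess.run(pip_install_command(requirements, target), check=True)
    if str(target) not in sys.path:
        # Prepend so the pinned copies win over any system-wide versions.
        sys.path.insert(0, str(target))
```

Because packages land in the target directory and are only visible through `sys.path` for the running process, the system site-packages stay untouched. Pinned requirements would be passed as `name==version` strings.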

3. Vision consent — REQUIRED before image extraction

If the script output indicates image-only pages were detected (or the document is known to be scanned), stop and ask the user:

"This document has N image-only page(s) that cannot be extracted without sending them to Anthropic's vision API. Page images will be transmitted externally for OCR. Would you like to proceed with vision extraction?"

Only if the user confirms, re-run with the --allow-vision flag:

python scripts/<script-name>.py \
  /mnt/user-data/uploads/<input-file> \
  /mnt/user-data/outputs/<stem>.md \
  --allow-vision

If the user declines, save the text-only result and note which pages were skipped.
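The consent gate in steps 2 and 3 could be sketched as follows. This is an illustration under assumptions, not the skill's implementation: `build_command`, `consent_prompt`, and `rerun_command` are hypothetical helpers, while the paths, the prompt wording, and the `--allow-vision` flag are taken from this README.

```python
from pathlib import Path
from typing import Optional

UPLOADS = Path("/mnt/user-data/uploads")
OUTPUTS = Path("/mnt/user-data/outputs")


def build_command(script: str, input_name: str, allow_vision: bool = False) -> list[str]:
    """Build the conversion command; the output name is <stem>.md."""
    input_path = UPLOADS / input_name
    output_path = OUTPUTS / f"{input_path.stem}.md"
    cmd = ["python", script, str(input_path), str(output_path)]
    if allow_vision:
        cmd.append("--allow-vision")
    return cmd


def consent_prompt(n_pages: int) -> str:
    """Fill the page count into the confirmation question from step 3."""
    return (
        f"This document has {n_pages} image-only page(s) that cannot be "
        "extracted without sending them to Anthropic's vision API. Page images "
        "will be transmitted externally for OCR. Would you like to proceed "
        "with vision extraction?"
    )


def rerun_command(script: str, input_name: str,
                  image_only_pages: int, user_confirmed: bool) -> Optional[list[str]]:
    """Return the vision re-run command, or None to keep the text-only result."""
    if image_only_pages > 0 and user_confirmed:
        return build_command(script, input_name, allow_vision=True)
    return None
```

The flag is appended only when both conditions hold, so a declined or unanswered prompt can never result in page images leaving the machine.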

4. Present the file

Use present_files with the output .md path, then give a brief summary:

File type and page/slide/sheet count
Whether vision was used and for how many pages (or that it was skipped)

How vision works (PDF / PPTX / DOCX)

Each script uses a two-pass strategy:

Text pass — extract text normally (fast, no API call, always runs)
Vision pass — only runs when --allow-vision is passed AND pages had no extractable text; those pages are rendered and sent to the Claude vision API
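The two-pass strategy can be illustrated with a small function that works on already-extracted page text. This is a sketch under assumptions: `two_pass`, the `vision_ocr` callback, and the HTML-comment placeholder for skipped pages are all hypothetical, and no real API call is made here.

```python
from typing import Callable, Optional


def two_pass(page_texts: list[str], allow_vision: bool,
             vision_ocr: Optional[Callable[[int], str]] = None
             ) -> tuple[list[str], list[int]]:
    """Return (markdown_pages, skipped_page_numbers).

    page_texts holds the text-pass result per page; an empty or
    whitespace-only entry marks an image-only page.
    """
    pages: list[str] = []
    skipped: list[int] = []
    for number, text in enumerate(page_texts, start=1):
        if text.strip():
            # Text pass succeeded: never send this page anywhere.
            pages.append(text.strip())
        elif allow_vision and vision_ocr is not None:
            # Vision pass: only for pages with no extractable text.
            pages.append(vision_ocr(number))
        else:
            # Hypothetical inline note for a skipped page.
            pages.append(f"<!-- page {number}: image-only, skipped -->")
            skipped.append(number)
    return pages, skipped
```

Passing the OCR step in as a callback keeps the decision logic testable without network access; the returned `skipped` list is what the final summary would report.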

Edge Cases

Situation	Behaviour
Fully scanned PDF	All pages flagged for vision; user confirmation required
Mixed PDF (some text, some images)	Only image pages flagged; user confirmation required
User declines vision	Text-only `.md` is saved; skipped pages are noted inline
Password-protected file	Script exits with a clear error message
Very large PDF (50+ image pages)	Script adds 0.3s sleep between vision calls
Image too large (>4MB base64)	Reduce DPI: edit `dpi=150` → `dpi=100` in `pdf-to-md.py`
Encoding errors in CSV	Script auto-retries with `latin-1`
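Two of the rows above lend themselves to a short illustration: the `latin-1` retry for CSV files and the base64 size check for images. The sketch below assumes the first attempt is UTF-8 and that "4MB" means 4 × 1024 × 1024 bytes; the README states neither, and all function names are hypothetical.

```python
import base64
import csv
import io

# Assumption: the 4MB limit from the table is read as 4 MiB of base64 text.
MAX_BASE64_BYTES = 4 * 1024 * 1024


def decode_csv(raw: bytes) -> str:
    """Try UTF-8 first (assumed), then fall back to latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")  # latin-1 can decode any byte sequence


def csv_to_markdown(raw: bytes) -> str:
    """Render CSV bytes as a pipe table; the first row becomes the header."""
    rows = list(csv.reader(io.StringIO(decode_csv(raw), newline="")))
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(row: list[str]) -> str:
        # Escape pipes and pad short rows so every line has `width` cells.
        cells = [c.replace("|", "\\|") for c in row] + [""] * (width - len(row))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


def exceeds_base64_limit(image_bytes: bytes, limit: int = MAX_BASE64_BYTES) -> bool:
    """True if the base64-encoded image would be larger than `limit` bytes."""
    return len(base64.b64encode(image_bytes)) > limit
```

Since base64 grows data by roughly a third, an image of about 3 MiB already reaches the assumed limit, which is why lowering the render DPI is the suggested remedy.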

Usage Guidance

This skill is reasonable to install if you are comfortable with runtime Python package installation. Do not approve the --allow-vision/OCR path for confidential scanned documents unless you are comfortable sending those page images to Anthropic.

Capability Analysis

Type: OpenClaw Skill
Name: office-to-markdown
Version: 1.0.1

The office-to-markdown skill is a well-documented tool for converting various document formats (PDF, DOCX, PPTX, XLSX, CSV) into Markdown. It uses a two-pass strategy for text extraction and optional OCR via the Anthropic Vision API, which explicitly requires user confirmation as per the SKILL.md instructions. The Python scripts (e.g., pdf-to-md.py, docx-to-md.py) manage their own dependencies by installing pinned versions into an isolated temporary directory (/tmp/office_md_deps/), and the external network calls to api.anthropic.com are strictly aligned with the stated purpose of document processing.

Capability Assessment

✓ Purpose & Capability

The artifacts consistently implement conversion of uploaded PDF, PPTX, DOCX, XLSX, and CSV files into Markdown; no unrelated account, credential, or system-management behavior is shown.

✓ Instruction Scope

The workflow requires an explicit user conversion request and separately requires confirmation before using vision/OCR for image-only content.

ℹ Install Mechanism

There is no install spec, but the scripts install pinned Python dependencies into /tmp/office_md_deps at runtime. This is disclosed and purpose-aligned, but users should be aware it relies on external package installation.

ℹ Credentials

The scripts read user-provided documents and write Markdown outputs. Optional vision mode may transmit document page images externally to Anthropic, which is proportionate for OCR but privacy-sensitive.

✓ Persistence & Privilege

No credentials, privileged paths, scheduled jobs, or background persistence are shown; the only persistent-looking state is a temporary dependency cache.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install office-to-markdown
After installation, invoke the skill by name or use /office-to-markdown
Upload a document and explicitly ask for it to be converted to Markdown; the skill returns the .md file with a brief summary

Version History

v1.0.1

- Clarifies that the skill should trigger only on explicit conversion requests, not on bare file uploads.
- Adds a required step for user confirmation before transmitting images to the vision API.
- Documents new security practices: scripts install dependencies into an isolated temp directory, not the system Python.
- Adds workflow enforcement for vision extraction: prompt the user if image-only pages are detected, and skip vision if the user declines.
- Updates the description and edge cases to reflect these consent and security changes.

v1.0.0

- Initial release.
- Converts PDF, DOCX, PPTX, XLSX, and CSV files to clean, readable Markdown.
- Supports both text-based and scanned/image-based documents using Claude vision.
- Automatic format detection and matching script execution for each file type.
- Outputs Markdown and gives summary with file type, count info, and any vision usage.
- Handles edge cases: password protection, mixed content, large files, encoding errors.

Metadata

Slug office-to-markdown

Version 1.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Office → Markdown Skill?

Converts office automation documents — PDF, PPTX, DOCX, XLSX, CSV — into clean, readable Markdown. Use this skill when a user explicitly asks to convert, ext... It is an AI Agent Skill for Claude Code / OpenClaw, with 54 downloads so far.

How do I install Office → Markdown Skill?

Run "/install office-to-markdown" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Office → Markdown Skill free?

Yes, Office → Markdown Skill is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Office → Markdown Skill support?

Office → Markdown Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Office → Markdown Skill?

It is built and maintained by Naimal Salahuddin (@naimalarain13); the current version is v1.0.1.
