Extract Tables From Pdf

Name: Extract Tables From Pdf
Author: mzlzyca

by mzlzyCA · GitHub ↗ · v0.4.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 253

Install in OpenClaw

/install extract-tables-from-pdf

Description

Extract tables from PDF documents using MinerU's table detection engine. Identifies and extracts structured table data from both native and scanned PDFs. Fea...

README (SKILL.md)

Extract Tables From Pdf

Convert and extract content from .pdf using MinerU (mineru-open-api).

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Extract tables from PDF (requires token)
mineru-open-api extract report.pdf -o ./out/

# With explicit table flag and OCR for scanned docs
mineru-open-api extract scanned.pdf --ocr --table -o ./out/
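When several PDFs need processing, the Quick Start command can be wrapped in a loop. The following is a minimal sketch, not part of the skill: extract_all and the MINERU_CLI variable are hypothetical names introduced here, and it assumes the CLI exits with a non-zero status on failure.

```shell
#!/bin/sh
# Sketch: run the documented `extract` command once per PDF in a directory.
# extract_all and MINERU_CLI are hypothetical names, not part of mineru-open-api.

MINERU_CLI="${MINERU_CLI:-mineru-open-api}"

extract_all() {
    in_dir="$1"      # directory containing .pdf files
    out_root="$2"    # one sub-directory per input is created here
    failed=0
    for pdf in "$in_dir"/*.pdf; do
        # With no matches the glob stays literal; skip it.
        [ -e "$pdf" ] || continue
        name=$(basename "$pdf" .pdf)
        mkdir -p "$out_root/$name"
        # Same invocation as the Quick Start, one output directory per input.
        # Assumption: a non-zero exit status signals a failed extraction.
        if ! "$MINERU_CLI" extract "$pdf" -o "$out_root/$name/"; then
            echo "failed: $pdf" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every file was processed.
    [ "$failed" -eq 0 ]
}

# Usage: extract_all ./pdfs ./out
```

Writing each input to its own output directory keeps results from different PDFs apart; whether the CLI would otherwise overwrite files is not stated in the README.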

Authentication

Token required for extract and crawl:

mineru-open-api auth            # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
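Before running an extraction from a script, it can help to check that the CLI and the token are in place. This is a rough sketch under one assumption: the token is supplied through the MINERU_TOKEN environment variable. If the token was configured with `mineru-open-api auth` instead, the variable may legitimately be unset and the second check would report a false alarm. check_mineru_ready is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: preflight check for the environment-variable authentication path.
# check_mineru_ready is a hypothetical helper, not part of mineru-open-api.

check_mineru_ready() {
    cli="${1:-mineru-open-api}"   # binary name to look for on PATH
    if ! command -v "$cli" >/dev/null 2>&1; then
        echo "$cli not found on PATH; see the Install section" >&2
        return 1
    fi
    # Only meaningful when the token is passed via the environment.
    if [ -z "${MINERU_TOKEN:-}" ]; then
        echo "MINERU_TOKEN is not set; export it or run '$cli auth'" >&2
        return 2
    fi
    return 0
}

# Usage: check_mineru_ready && mineru-open-api extract report.pdf -o ./out/
```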

Capabilities

Supports local files and URLs
Requires token (mineru-open-api auth or MINERU_TOKEN env)
Supported input: .pdf
Language hint with --language (default: ch, use en for English)
Page range with --pages (where applicable)
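The options listed above can be combined in a single call. The sketch below assembles the argument list from the flags named in this README (--table, -o, --language, --pages). extract_tables and MINERU_CLI are hypothetical names, and the --pages value is passed through unchanged because its exact syntax is not documented here.

```shell
#!/bin/sh
# Sketch: build an `extract` invocation from optional settings.
# extract_tables and MINERU_CLI are hypothetical names for illustration.

MINERU_CLI="${MINERU_CLI:-mineru-open-api}"

extract_tables() {
    input="$1"       # local .pdf path or URL
    out_dir="$2"     # directory passed to -o
    lang="${3:-}"    # e.g. en; empty keeps the CLI default (ch)
    pages="${4:-}"   # passed through as-is; value syntax is an assumption

    # Rebuild the positional parameters as the CLI argument list.
    set -- extract "$input" --table -o "$out_dir"
    if [ -n "$lang" ]; then
        set -- "$@" --language "$lang"
    fi
    if [ -n "$pages" ]; then
        set -- "$@" --pages "$pages"
    fi
    "$MINERU_CLI" "$@"
}

# Usage (the URL is a placeholder):
# extract_tables https://example.com/report.pdf ./out/ en
```

Setting MINERU_CLI=echo prints the assembled command instead of running it, which is a cheap way to review the arguments first.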

Notes

Table recognition requires the extract command with a token; flash-extract does NOT support tables. The --table flag is enabled by default, so passing it explicitly is optional.
Output goes to stdout by default; use -o <dir> to save to file
Binary formats (docx) require -o flag (cannot stream to stdout)
All progress/status messages go to stderr
MinerU is an open-source project by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
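Because results go to stdout and progress messages go to stderr, the two streams can be captured separately. A minimal sketch, assuming only the stream behaviour described in the notes above; run_split and the file names are hypothetical, and the format of the stdout content is not specified in this README.

```shell
#!/bin/sh
# Sketch: send a command's stdout to a result file and its stderr to a log.
# run_split is a hypothetical helper; it works for any command.

run_split() {
    result_file="$1"
    log_file="$2"
    shift 2
    # Remaining arguments are the command to run.
    "$@" >"$result_file" 2>"$log_file"
}

# Usage:
# run_split tables.out extract.log mineru-open-api extract report.pdf
```

The helper returns the exit status of the wrapped command, so callers can still detect a failed run and then inspect the log.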

Usage Guidance

This skill appears to be what it says: a wrapper around the mineru-open-api CLI that requires a MINERU_TOKEN. Before installing or using it:
1) Confirm mineru.net and the GitHub repo look legitimate and review their privacy/security docs.
2) Assume PDFs you process may be uploaded to MinerU servers; do not send sensitive or regulated data unless you trust the service or have an on-prem/self-hosted alternative.
3) Prefer installing in an isolated environment (container or VM) and inspect the npm/go package source if you require higher assurance.
4) Limit and rotate the MINERU_TOKEN and avoid storing it in shared shells.
5) If you need purely local-only processing, verify the CLI actually supports local-only mode or find an offline tool.
If you want me to, I can fetch the mineru-open-api npm package or GitHub repo and highlight any concerning code or publish scripts before you install.
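One way to avoid leaving the token in a shared shell is to load it from a private file for a single command only. This is an illustrative sketch, not something the skill provides: with_mineru_token and the token file path are hypothetical, and it assumes the CLI reads MINERU_TOKEN from its environment as described under Authentication.

```shell
#!/bin/sh
# Sketch: expose MINERU_TOKEN to one command without exporting it in the
# calling shell. with_mineru_token is a hypothetical helper name.

with_mineru_token() {
    token_file="$1"   # file holding the token, ideally with mode 600
    shift
    if [ ! -r "$token_file" ]; then
        echo "cannot read token file: $token_file" >&2
        return 1
    fi
    # The subshell keeps the exported variable out of the parent shell.
    (
        MINERU_TOKEN=$(cat "$token_file")
        export MINERU_TOKEN
        "$@"
    )
}

# Usage (the path is a placeholder):
# with_mineru_token ~/.config/mineru-token mineru-open-api extract report.pdf -o ./out/
```

Reading the token inside a subshell means it never appears in shell history or in the parent environment, though it is still visible to the child process that receives it.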

Capability Analysis

Type: OpenClaw Skill
Name: extract-tables-from-pdf
Version: 0.4.0

The skill is a legitimate wrapper for the MinerU document intelligence engine (developed by Shanghai AI Lab) used to extract tables from PDF files. It utilizes the 'mineru-open-api' CLI tool and requires a standard API token (MINERU_TOKEN) for its cloud-based extraction services. No evidence of malicious intent, data exfiltration beyond the stated purpose, or prompt injection was found in SKILL.md or _meta.json.

Capability Assessment

✓ Purpose & Capability

Name/description match what is required: the skill requires the mineru-open-api binary and a MINERU_TOKEN, which are exactly what a MinerU-based PDF table extractor would need.

ℹ Instruction Scope

SKILL.md instructs the agent to run mineru-open-api commands, authenticate with MINERU_TOKEN, and operate on local files or URLs. This is within scope, but the instructions imply the CLI will use the token to contact MinerU services — meaning PDF contents may be transmitted to an external service; that is expected but important for privacy.

✓ Install Mechanism

Installers are standard: npm package and go install from a GitHub repo. Both are reasonable for a CLI tool. Note: installing npm packages can run lifecycle scripts, so review the package or install in a controlled environment if you need extra safety.

✓ Credentials

Only a single service credential (MINERU_TOKEN) is required and is declared as the primary credential. That is proportional to the described remote-API usage. The skill does not request unrelated credentials or system paths.

✓ Persistence & Privilege

Skill does not request always:true or elevated platform persistence. It is user-invocable and can run autonomously (platform default), which is normal for skills of this type.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install extract-tables-from-pdf
After installation, invoke the skill by name or use /extract-tables-from-pdf
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.4.0

SEO: expand description for better ClawHub vector search discovery

v0.3.0

Rollback to original version

v0.2.1

SEO optimization v0.2.1

v0.2.0

SEO optimization v0.2.0

v1.0.1

Fix: declare MINERU_TOKEN credential in metadata

v1.0.0

Extract Tables from PDF - extract tables from PDF documents using MinerU. Use when a PDF contains da

Metadata

Slug extract-tables-from-pdf

Version 0.4.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 6

Frequently Asked Questions

What is Extract Tables From Pdf?

Extract tables from PDF documents using MinerU's table detection engine. Identifies and extracts structured table data from both native and scanned PDFs. Fea... It is an AI Agent Skill for Claude Code / OpenClaw, with 253 downloads so far.

How do I install Extract Tables From Pdf?

Run "/install extract-tables-from-pdf" in the OpenClaw or Claude Code chat to install the skill. The skill also needs the mineru-open-api CLI and a MinerU token (see Install and Authentication).

Is Extract Tables From Pdf free?

Yes, Extract Tables From Pdf itself is free and licensed under MIT-0; you can download, install and use it at no cost. Extraction runs through the MinerU service, which requires a token.

Which platforms does Extract Tables From Pdf support?

Extract Tables From Pdf is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Extract Tables From Pdf?

It is built and maintained by mzlzyCA (@mzlzyca); the current version is v0.4.0.
