xmind-doc-parser
by Maglanyulan · GitHub · v1.0.1 · MIT-0
135 Downloads · 1 Star · 0 Active Installs · 2 Versions
Install in OpenClaw
/install xmind-doc-parser
Description
Parse documents in 18+ formats using Baidu API to extract text, tables, layout, OCR scanned images, and produce document chunks for RAG.
Usage Guidance
What to consider before installing:
- Metadata mismatch: The registry says no env vars required, but SKILL.md and the included script require BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY. Ask the author to fix the metadata or update the registry entry before trusting the skill.
- Credentials: The skill needs your Baidu API key/secret (reasonable for this purpose). Avoid placing these secrets in a global config if you care about limiting access: do not paste keys into ~/.openclaw/openclaw.json unless you accept that other skills or users with access to that file might use them. If you must store keys, restrict file permissions (e.g., chmod 600) and consider a per-skill or per-agent secret store.
- Data exfil/privacy: The skill sends documents (base64 or public URLs) to Baidu's cloud endpoints. Do not send sensitive, confidential, or internal-only documents or internal URLs. If you pass file_url, be aware the remote service will fetch that URL (potentially exposing internal endpoints to Baidu).
- Operational limits: The skill documents file-size, QPS and polling limits — confirm these match your expected usage and billing/quotas in your Baidu account.
- Verify behavior: Review the included script (it calls only Baidu endpoints and has no obfuscated code). Test with non-sensitive sample files and a limited/ephemeral API key to confirm behavior before using real data.
- Remediation suggestions: Ask the maintainer to update registry metadata to declare required env vars and any config path usage; prefer guidance for using per-skill secrets rather than editing a global openclaw.json; add explicit warnings about sending sensitive data to a third-party cloud.
Given the clear mismatch between metadata and instructions, and the recommendation to store credentials globally, treat this skill as suspicious rather than outright malicious. Require the fixes or mitigations above before trusting it with sensitive data.
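The internal-URL concern above can be checked before a file_url is ever handed to the skill. The following is a minimal sketch, assuming Python; the helper name is_public_url is hypothetical and not part of the skill. It accepts only http(s) URLs whose host resolves exclusively to globally routable addresses.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host resolves to public addresses.

    Hypothetical pre-flight helper; it is not part of the skill or of Baidu's API.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # treat unresolvable hosts as unsafe
    for info in infos:
        # Drop any IPv6 zone suffix ("%eth0") before parsing the address.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        if not address.is_global:
            return False  # loopback, private, link-local, and similar ranges
    return True
```

This is a pre-flight check only. The remote service resolves the host itself, so the check does not replace network-level controls.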
Capability Assessment
Purpose & Capability
The code and SKILL.md implement a Baidu Document Parser client, which matches the skill description. However, the registry metadata claims no required environment variables or config paths, while SKILL.md and the reference documentation clearly require BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY and suggest editing ~/.openclaw/openclaw.json. The metadata and the documentation contradict each other, and the metadata should be corrected.
Instruction Scope
Runtime instructions are focused on document parsing and polling Baidu's APIs (expected). But ancillary documentation instructs editing the global OpenClaw config file (~/.openclaw/openclaw.json) and restarting the gateway to inject credentials — this references a system path outside the skill's declared scope and effectively centralizes credentials for other skills, increasing blast radius.
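For readers unfamiliar with the submit-then-poll pattern mentioned above, here is a generic sketch in Python. It is an assumption-laden illustration, not the skill's implementation: fetch_status is a hypothetical callable, and the status names "success" and "failed" are placeholders rather than Baidu's actual values. The real intervals and limits are the ones documented in SKILL.md.

```python
import time


def poll_until_done(fetch_status, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() until it reports a terminal state or attempts run out.

    fetch_status is a hypothetical callable returning a dict with a "status"
    key; the status names used here are placeholders, not Baidu's real values.
    """
    for attempt in range(1, max_attempts + 1):
        result = fetch_status()
        if result.get("status") in ("success", "failed"):
            return result
        if attempt < max_attempts:
            sleep(interval)  # wait before asking again, to stay within QPS limits
    raise TimeoutError(f"task still pending after {max_attempts} attempts")
```

Bounding the number of attempts keeps a stuck task from consuming quota indefinitely.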
Install Mechanism
There is no install spec or remote download; the skill is instruction-only with an included Python script. No external archives or unknown URLs are fetched by the installer. The client uses standard requests to call Baidu endpoints (expected).
Credentials
SKILL.md and the Python client require BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY, which is proportionate for calling Baidu's API. However, the registry metadata declares 'required env vars: none' and 'required config paths: none', a clear inconsistency. The reference documentation also encourages placing these secrets in the global OpenClaw config, which would expose them to other skills.
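One way to keep the two secrets out of a shared config file is to read them from the process environment and fail early when they are absent. A minimal sketch, assuming Python; the function names are hypothetical and only the two variable names come from SKILL.md.

```python
import os

# Variable names as documented in SKILL.md.
REQUIRED_VARS = ("BAIDU_DOC_AI_API_KEY", "BAIDU_DOC_AI_SECRET_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def load_baidu_credentials(env=None):
    """Return the credentials as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = missing_credentials(env)
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict as env makes the check easy to exercise without touching real secrets.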
Persistence & Privilege
The skill does not request always:true and does not modify other skills. However, the reference documentation instructs the operator to place API keys in the shared ~/.openclaw/openclaw.json and restart the gateway. This is a persistent, administrative credential placement that increases exposure if other skills or users can read that file.
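If keys do end up in a file such as ~/.openclaw/openclaw.json, its permissions can be audited and tightened to the equivalent of chmod 600. A minimal sketch, assuming Python on a POSIX system; the helper names are hypothetical.

```python
import os
import stat


def is_owner_only(path) -> bool:
    """True if the file grants no permissions to group or others (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict_to_owner(path) -> None:
    """Set the file to owner read/write only, the equivalent of chmod 600."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

For example, calling is_owner_only(os.path.expanduser("~/.openclaw/openclaw.json")) reports whether other local users could read the file. It does not stop other skills running as the same user from reading it.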
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install xmind-doc-parser
- After installation, invoke the skill by name or use /xmind-doc-parser
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
- Skill renamed from "xmind-doc-parser" to "baidu-doc-parser" to accurately reflect functionality.
- Now parses documents using Baidu Document Parser API, supporting 18+ formats (PDF, Word, Excel, PowerPoint, images, and more).
- Offers comprehensive extraction: text, tables, layout analysis, OCR for scanned docs, and document chunking for RAG.
- Enhanced documentation: details on API parameters, environment setup, file/format/language support, error codes, and usage examples.
- Adds command-line script for easy testing and reference links to official resources.
v1.0.0
- Initial release of the baidu-doc-parser skill.
- Supports parsing and extracting text, tables, and layout from 18+ document formats (PDF, Word, Excel, PPT, images, etc.) via Baidu Document Parser API.
- Includes OCR for scanned documents and multi-language support (20+ languages).
- Provides options for document chunking (RAG), formula recognition, chart analysis, table merging, and more.
- Comprehensive usage instructions, API parameters, return structure, error handling, and polling strategy documented in SKILL.md.
Frequently Asked Questions
What is xmind-doc-parser?
Parse documents in 18+ formats using Baidu API to extract text, tables, layout, OCR scanned images, and produce document chunks for RAG. It is an AI Agent Skill for Claude Code / OpenClaw, with 135 downloads so far.
How do I install xmind-doc-parser?
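Run "/install xmind-doc-parser" in the OpenClaw or Claude Code chat to install it in one step. Before the skill can call Baidu's API, the BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY environment variables must also be set.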
Is xmind-doc-parser free?
Yes, the skill itself is free and licensed under MIT-0, so you can download, install, and use it at no cost. Calls to Baidu's API count against the billing and quotas of your own Baidu account.
Which platforms does xmind-doc-parser support?
xmind-doc-parser is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created xmind-doc-parser?
It is built and maintained by Maglanyulan (@maglanyulan); the current version is v1.0.1.