overdue-lin

pdf-translate-skill

by Zexun Lin · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads 79
Stars 0
Active Installs 0
Versions 1
Install in OpenClaw
/install pdf-translate-skill
Description
Translate PDF documents or arXiv papers preserving formatting by extracting text and images, translating content, and generating a reconstructed LaTeX-based...
Usage Guidance
This skill is coherent with its purpose but requires significant local tooling and performs network downloads and disk writes. Before using it:
  1. Note the SKILL.md requires Python (pymupdf, pillow), curl/wget, and a LaTeX installation (XeLaTeX/TeX Live or MiKTeX). The registry metadata omits these, so install them yourself or in a container.
  2. Review the scripts (they are included) if you are concerned: they call curl/wget, extract tar.gz archives, and run xelatex/pdflatex via subprocess. Run them in a sandbox (container or VM) if you want to limit risk.
  3. The skill downloads from arxiv.org only; do not run it on arbitrary/untrusted URLs without inspection.
  4. No credentials are requested, but the tool writes files and executes system binaries, so ensure you trust the environment and have disk space.
If you want more assurance, run the scripts on sample documents inside an isolated environment first.
Capability Analysis
Type: OpenClaw Skill
Name: pdf-translate-skill
Version: 1.0.0
The skill bundle is classified as suspicious due to a Path Traversal vulnerability (Zip Slip) in `scripts/download_arxiv_source.py` caused by the unsafe use of `tarfile.extractall()` on downloaded archives. While the skill's functionality for translating PDFs and arXiv papers appears legitimate, it utilizes high-risk operations such as downloading remote content via `curl`/`wget` and executing system binaries for LaTeX compilation in `scripts/compile_latex.py`. These features, combined with the lack of sanitization for archive members, present a significant security risk, although no evidence of intentional malice or data exfiltration was found.
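The path traversal issue described above is usually mitigated by validating every archive member before anything is written to disk. The following is a minimal sketch of that idea, not the skill's actual code: the function name `safe_extract` and its policy (reject any member that resolves outside the destination, skip links and device nodes) are assumptions made for illustration.

```python
import os
import shutil
import tarfile


def safe_extract(archive_path, dest_dir):
    """Extract only regular files and directories, refusing path traversal.

    Hypothetical helper: raises ValueError before writing anything if any
    member would resolve to a location outside dest_dir.
    """
    dest_root = os.path.realpath(dest_dir)
    extracted = []
    with tarfile.open(archive_path, "r:*") as tar:
        members = tar.getmembers()

        # Pass 1: validate every member path before touching the disk.
        for member in members:
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.name!r}")

        # Pass 2: write directories and regular files only.
        os.makedirs(dest_root, exist_ok=True)
        for member in members:
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if member.isdir():
                os.makedirs(target, exist_ok=True)
            elif member.isfile():
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                extracted.append(target)
            # Symlinks, hard links, and device nodes are skipped on purpose.
    return extracted
```

On Python 3.12 and later, `tar.extractall(dest_dir, filter="data")` applies comparable checks from the standard library and may be a simpler fix.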
Capability Assessment
Purpose & Capability
The skill's name and description (translate PDFs / arXiv papers and produce LaTeX/PDF) match the included scripts and references. The code files implement PDF→images, image extraction, arXiv source download, and LaTeX compilation, all of which are needed for the stated functionality. One inconsistency: the registry metadata lists no required binaries or environment variables, but SKILL.md and the scripts clearly require system tools (curl/wget, XeLaTeX/pdfLaTeX) and Python packages. This appears to be an omission in the metadata rather than malicious misdirection.
Instruction Scope
Runtime instructions and scripts perform expected actions: detecting arXiv IDs/URLs, downloading arXiv e-print archives, extracting .tex files, converting PDF pages to images, extracting embedded images, translating TeX content conceptually (the SKILL.md describes translation rules), and compiling LaTeX via xelatex/pdflatex. The scripts read and write files in local directories and call external commands (curl/wget/xelatex), all consistent with the stated tasks. They do not reference or exfiltrate unrelated system files, nor do they require credentials. The instruction to 'use the agent's multilingual capabilities' implies that translation happens locally in the agent workflow (no external translation API is invoked).
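The first step listed above, detecting arXiv IDs/URLs, can be sketched as follows. This is an illustration rather than the skill's own detection logic: the function name `detect_arxiv_id`, the accepted URL prefixes, and the ID pattern (new-style `2301.12345` and old-style `hep-th/9901001`, each with an optional version suffix) are assumptions.

```python
import re
from urllib.parse import urlparse

# New-style IDs (YYMM.NNNNN) or old-style IDs (archive/YYMMNNN), optional vN.
ARXIV_ID = re.compile(
    r"(?:\d{4}\.\d{4,5}|[a-z][a-z\-]*(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?"
)


def detect_arxiv_id(user_input):
    """Return the arXiv ID found in user_input, or None for anything else."""
    text = user_input.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https"):
        # Only arxiv.org URLs are accepted; other hosts are not arXiv input.
        if parsed.hostname not in ("arxiv.org", "www.arxiv.org"):
            return None
        for prefix in ("/abs/", "/pdf/", "/e-print/"):
            if parsed.path.startswith(prefix):
                candidate = parsed.path[len(prefix):]
                if candidate.endswith(".pdf"):
                    candidate = candidate[:-4]
                return candidate if ARXIV_ID.fullmatch(candidate) else None
        return None
    # Bare IDs, optionally written with an "arXiv:" prefix.
    if text.lower().startswith("arxiv:"):
        text = text[len("arxiv:"):]
    return text if ARXIV_ID.fullmatch(text) else None
```

A `None` result would correspond to treating the input as a local PDF path.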
Install Mechanism
There is no install spec (instruction-only install) which minimizes automated code installation risk. However, the SKILL.md lists several manual prerequisites (Python packages, XeLaTeX/TeX Live or MiKTeX, curl/wget). The code uses subprocess calls to system binaries. Because installation is manual, the user must install large toolchains (TeX) themselves — this is expected for LaTeX compilation but worth noting as a non-trivial dependency.
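Because the prerequisites are installed manually, it can help to check for them before invoking the skill. The sketch below is hypothetical and not part of the skill: `missing_prerequisites` is an invented helper, and treating curl/wget and xelatex/pdflatex as interchangeable alternatives is an assumption. The import names `fitz` (for pymupdf) and `PIL` (for pillow) are the ones those packages conventionally expose.

```python
import importlib.util
import shutil

# Assumed grouping: at least one binary from each tuple must be on PATH.
REQUIRED_BINARIES = [("curl", "wget"), ("xelatex", "pdflatex")]
# Import names, not package names: pymupdf -> fitz, pillow -> PIL.
REQUIRED_MODULES = ["fitz", "PIL"]


def missing_prerequisites(binary_groups=REQUIRED_BINARIES, modules=REQUIRED_MODULES):
    """Return human-readable descriptions of prerequisites that were not found."""
    missing = []
    for group in binary_groups:
        if not any(shutil.which(name) for name in group):
            missing.append(" or ".join(group))
    for module in modules:
        if importlib.util.find_spec(module) is None:
            missing.append(f"python module {module}")
    return missing
```

An empty list means every checked tool and module was found in the current environment.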
Credentials
The skill declares no environment variables or credentials, and none are required by the scripts. Network access is used only to fetch arXiv e-prints (https://arxiv.org/e-print/{id}) via curl/wget/urllib which is appropriate for the arXiv download feature. No secrets, keys, or unrelated service credentials are requested.
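A credential-free fetch of the e-print URL pattern quoted above might look like this with `urllib`. It is a sketch only: `eprint_url`, `fetch_eprint`, the `opener` parameter (added so the function can be exercised without network access), and the User-Agent string are all assumptions, not the skill's code.

```python
import shutil
import urllib.request

# URL pattern taken from the review text above.
EPRINT_URL = "https://arxiv.org/e-print/{id}"


def eprint_url(arxiv_id):
    """Build the e-print download URL for an arXiv ID."""
    return EPRINT_URL.format(id=arxiv_id)


def fetch_eprint(arxiv_id, dest_path, opener=urllib.request.urlopen, timeout=60):
    """Download the e-print archive to dest_path without sending credentials."""
    request = urllib.request.Request(
        eprint_url(arxiv_id),
        headers={"User-Agent": "pdf-translate-sketch/0.1"},  # hypothetical value
    )
    with opener(request, timeout=timeout) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

The downloaded file is an archive of untrusted content, so it should be inspected or extracted defensively before use.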
Persistence & Privilege
The skill is not always-enabled, is user-invocable, and does not attempt to modify other skills or agent-wide configuration. It runs local file operations and external commands in the working directories only (no system-wide changes are performed by the scripts).
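Keeping an external command confined to its working directory typically means passing arguments as a list (no shell), setting `cwd`, and bounding the run time. The sketch below illustrates this for LaTeX compilation; `build_compile_command` and `compile_latex` are hypothetical names, and the engine allow-list and timeout are assumptions rather than the behavior of `scripts/compile_latex.py`.

```python
import subprocess
from pathlib import Path

ALLOWED_ENGINES = ("xelatex", "pdflatex")


def build_compile_command(tex_file, engine="xelatex"):
    """Return the argument list for one non-interactive LaTeX run."""
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    # Only the file name is passed; the process runs inside the file's folder.
    return [engine, "-interaction=nonstopmode", "-halt-on-error", Path(tex_file).name]


def compile_latex(tex_file, engine="xelatex", timeout=300):
    """Run the engine in the .tex file's directory; return (success, log output)."""
    tex_path = Path(tex_file).resolve()
    result = subprocess.run(
        build_compile_command(tex_path, engine),
        cwd=tex_path.parent,      # outputs stay next to the source file
        capture_output=True,
        encoding="utf-8",
        errors="replace",         # LaTeX logs are not always valid UTF-8
        timeout=timeout,
        check=False,
    )
    return result.returncode == 0, result.stdout
```

Documents with cross-references or bibliographies generally need more than one run, which this sketch does not attempt.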
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf-translate-skill
  3. After installation, invoke the skill by name or use /pdf-translate-skill
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of pdf-translator skill.
- Translates PDF documents while preserving original formatting, including layout and embedded images.
- Supports two modes:
  - arXiv Mode: Accepts arXiv ID/URL, downloads source TeX, translates content, and compiles back to PDF with high fidelity.
  - Local PDF Mode: Processes local PDF files by converting pages to images, analyzing layout, extracting and translating text, and regenerating the document with LaTeX.
- Automatically detects and chooses the best mode depending on user input and arXiv source availability, with fallback if needed.
- Maintains structure of figures, math, citations, and bibliography in translated documents.
- Provides detailed instructions for prerequisites and installation.
Metadata
Slug pdf-translate-skill
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is pdf-translate-skill?

Translate PDF documents or arXiv papers preserving formatting by extracting text and images, translating content, and generating a reconstructed LaTeX-based... It is an AI Agent Skill for Claude Code / OpenClaw, with 79 downloads so far.

How do I install pdf-translate-skill?

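Run "/install pdf-translate-skill" in the OpenClaw or Claude Code chat to install it in one step. The skill's prerequisites (Python with pymupdf and pillow, curl/wget, and a LaTeX installation such as XeLaTeX via TeX Live or MiKTeX) are not installed automatically and must be set up separately.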

Is pdf-translate-skill free?

Yes, pdf-translate-skill is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does pdf-translate-skill support?

pdf-translate-skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created pdf-translate-skill?

It is built and maintained by Zexun Lin (@overdue-lin); the current version is v1.0.0.
