Visible Text Extractor
by wunianze666-netizen · v1.2.0 · MIT-0
153 Downloads · 0 Stars · 0 Active Installs · 4 Versions
Install in OpenClaw
/install visible-text-extractor
Description
Extract and reconstruct as much visible text as possible from webpage URLs, article pages, screenshots, long images, image directories, and GIFs. Use when th...
Usage Guidance
This skill appears to do what it says, but before running it you should:
- Confirm the required runtime tools exist (node, ffmpeg, Python, the local OCR stack referenced at /root/.openclaw/.../ocr-local, and any virtualenv). The package metadata currently does not declare these dependencies, so expect runtime errors otherwise.
- Be aware that it downloads arbitrary images/frames referenced by pages and writes temp files. If pages contain internal URLs, this could cause server-side requests to your internal network (SSRF-like behavior).
- Note that the scripts optionally send the produced docx to Feishu when a receive-id is supplied. Do not provide a receive-id unless you trust the destination.
- Review or supply trusted implementations for the referenced external scripts (ocr.js, feishu_file_sender.py), since the skill delegates OCR and delivery to them.
- Run the skill in an isolated environment (or sandbox) if you plan to process sensitive pages.
If the publisher updates the package metadata to list required binaries and external script dependencies explicitly (and documents the optional Feishu delivery clearly), and if you verify the referenced local scripts are trusted, this assessment could be upgraded to benign.
Capability Analysis
Type: OpenClaw Skill
Name: visible-text-extractor
Version: 1.2.0
The skill bundle provides extensive capabilities for web scraping, browser automation (Playwright), and OCR, which involve high-risk behaviors such as arbitrary network requests and shell command execution via subprocess. It includes a feature to send extracted documents to external Feishu recipients using a separate local skill (feishu-file-sender). While these capabilities are aligned with the stated purpose of extracting and delivering text from various media, the broad access to the network, file system, and local command execution environment warrants a suspicious classification under the provided criteria. Key files include extract_visible_text.py for web/OCR logic and build_authorized_capture_docx.py for the delivery pipeline.
Capability Assessment
Purpose & Capability
The skill claims to extract visible text from pages/images, and the included scripts implement that. However, the package metadata lists no required binaries or credentials, while the scripts clearly call node, ffmpeg, and an external/local OCR stack (paths like /root/.openclaw/.../ocr-local/scripts/ocr.js and /root/.openclaw/venvs/ocrstack/bin/python). The absence of declared runtime requirements is an inconsistency: a consumer needs those dependencies to run the skill, yet nothing in the metadata says so.
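A preflight check along these lines could surface the undeclared dependencies before the skill is run. This is a minimal sketch, not part of the skill: the helper names missing_binaries and missing_paths are hypothetical, and the binary list is taken from the assessment above rather than from any package metadata.

```python
import shutil
from pathlib import Path

# Binaries the assessment says the scripts call. This list is an assumption
# drawn from the review text, not from the package metadata.
REQUIRED_BINARIES = ["node", "ffmpeg"]

# The only interpreter path the review quotes in full. The ocr.js path is
# elided in the review, so it is deliberately not guessed here.
REQUIRED_PATHS = ["/root/.openclaw/venvs/ocrstack/bin/python"]


def missing_binaries(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def missing_paths(paths):
    """Return the paths that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).exists()]
```

Calling missing_binaries(REQUIRED_BINARIES) + missing_paths(REQUIRED_PATHS) returns an empty list when everything checked is present; any names it does return are what to install or supply before invoking the skill.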
Instruction Scope
SKILL.md and USAGE.md instruct the agent to download pages/images, render pages via a browser fallback, extract GIF frames, run OCR, and produce docx/JSON/markdown. That scope matches the stated purpose. Some scripts also reference other local skill scripts (feishu sender, ocr-local) and absolute workspace paths and will download arbitrary image URLs discovered in pages — expected for this task but worth noting because it increases the runtime network surface (and may access internal-only URLs if present).
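One way to reduce the internal-URL exposure described above is to filter discovered image URLs before downloading them. The following is a sketch under stated assumptions: is_public_url is a hypothetical helper, not something the skill ships, and it only covers the initial request.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _resolve(host):
    """Resolve a hostname to the set of IP address strings it maps to."""
    infos = socket.getaddrinfo(host, None)
    return {info[4][0] for info in infos}


def is_public_url(url, resolve=_resolve):
    """Return True only for http(s) URLs whose host resolves to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolve(parts.hostname)
    except OSError:
        return False  # unresolvable hosts are rejected rather than fetched
    if not addresses:
        return False
    for raw in addresses:
        ip = ipaddress.ip_address(raw.split("%")[0])  # drop any IPv6 zone id
        if not ip.is_global:
            return False  # loopback, private, link-local, and similar ranges
    return True
```

A check like this does not cover redirects or DNS answers that change between the check and the fetch, so each redirect target would need the same test, and a sandboxed network remains the stronger control.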
Install Mechanism
There is no install spec (instruction-only with bundled scripts) so nothing is fetched at install time. Runtime, however, depends on external binaries and other skill scripts (node, ffmpeg, local OCR scripts). The lack of an explicit install section or dependency declaration is the main issue, not the install mechanism itself.
Credentials
The skill declares no required env vars/credentials, but several scripts can invoke a Feishu file-sender script and will send a generated docx if a user-supplied --send-feishu-receive-id is passed. The code will also invoke external local tools under absolute paths. Requiring no credentials is coherent for read-only extraction, but the optional remote-send behavior and implicit dependencies on other local skill code are not documented in metadata and increase the risk of unintended data sharing or failure due to missing components.
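Because delivery is triggered purely by the --send-feishu-receive-id flag, a caller could vet the argument list before handing it to the scripts. The sketch below is illustrative only: feishu_receive_id and check_delivery are hypothetical helpers, the allowlist is an assumption, and since the review does not say whether the flag takes its value as a separate argument or after "=", both forms are handled.

```python
FEISHU_FLAG = "--send-feishu-receive-id"  # flag name as quoted in the review


def feishu_receive_id(argv):
    """Return the receive-id requested in argv, or None if delivery is off."""
    for i, arg in enumerate(argv):
        if arg == FEISHU_FLAG:
            # Value passed as the next argument; empty string if it is missing.
            return argv[i + 1] if i + 1 < len(argv) else ""
        if arg.startswith(FEISHU_FLAG + "="):
            return arg.split("=", 1)[1]
    return None


def check_delivery(argv, allowed_ids):
    """Raise ValueError if argv requests delivery to an untrusted receive-id."""
    receive_id = feishu_receive_id(argv)
    if receive_id is not None and receive_id not in allowed_ids:
        raise ValueError(f"Refusing to send to untrusted receive-id: {receive_id!r}")
    return argv
```

With an empty allowlist this rejects every delivery request while leaving extraction-only invocations untouched, which matches the guidance to supply a receive-id only for destinations you trust.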
Persistence & Privilege
always is false and the skill does not request permanent inclusion or modify other skills' configs. It writes temporary files and output artifacts in specified output paths, which is expected for this workflow.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install visible-text-extractor
- After installation, invoke the skill by name or use /visible-text-extractor
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.2.0
Visible Text Extractor 1.0.0 – Initial release
- Initial public release with support for webpage, article, screenshot, long image, GIF, and image directory text extraction.
- Adds dedicated scripts for WeChat article order reconstruction, high-accuracy OCR, and multi-stage cleanup.
- Introduces specialized pipelines for clean, human-readable output (markdown/Word/JSON).
- Provides workflow and reference documentation for usage, publishing, and release notes.
- Includes multiple extraction and deliverable pipelines, especially tailored for WeChat articles and complex, image-heavy sources.
v1.1.1
Refine reading-order reconstruction guidance, strengthen deliverable quality targets, and improve the skill description around original article flow and user-facing comfort.
v1.1.0
Improve WeChat reading-order reconstruction, stabilize OCR fallback speed, and tighten deliverable quality for cleaner Word output.
v1.0.0
Initial polished public release. Added OCR cleanup, section reconstruction, WeChat handling, troubleshooting docs, and a one-step deliverable pipeline that outputs raw JSON, clean JSON, clean markdown, and Word documents.
Frequently Asked Questions
What is Visible Text Extractor?
Extract and reconstruct as much visible text as possible from webpage URLs, article pages, screenshots, long images, image directories, and GIFs. Use when th... It is an AI Agent Skill for Claude Code / OpenClaw, with 153 downloads so far.
How do I install Visible Text Extractor?
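Run "/install visible-text-extractor" in the OpenClaw or Claude Code chat to install it in one step. Nothing is fetched at install time, but the scripts need node, ffmpeg, Python, and a local OCR stack at runtime, and the package metadata does not declare any of them.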
Is Visible Text Extractor free?
Yes, Visible Text Extractor is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Visible Text Extractor support?
Visible Text Extractor runs anywhere OpenClaw / Claude Code is available, provided the runtime tools its scripts call are installed.
Who created Visible Text Extractor?
It is built and maintained by wunianze666-netizen (@wunianze666-netizen); the current version is v1.2.0.