
Visible Text Extractor

Name: Visible Text Extractor
Author: wunianze666-netizen

by wunianze666-netizen · v1.2.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 153

Install in OpenClaw

/install visible-text-extractor

Description

Extract and reconstruct as much visible text as possible from webpage URLs, article pages, screenshots, long images, image directories, and GIFs. Use when th...

Usage Guidance

This skill appears to do what it says, but before running it you should:

(1) Confirm that the required runtime tools exist: node, ffmpeg, Python, the local OCR stack referenced at /root/.openclaw/.../ocr-local, and any virtualenv. The package metadata does not currently declare these dependencies, so expect runtime errors if any are missing.
(2) Be aware that it downloads arbitrary images and frames referenced by pages and writes temp files. If a page contains internal URLs, this could cause server-side requests to your internal network (SSRF-like behavior).
(3) Note that the scripts optionally send the produced docx to Feishu when a receive-id is supplied. Do not provide a receive-id unless you trust the destination.
(4) Review, or supply trusted implementations of, the referenced external scripts (ocr.js, feishu_file_sender.py), since the skill delegates OCR and delivery to them.
(5) Run the skill in an isolated environment (or sandbox) if you plan to process sensitive pages.

If the publisher updates the package metadata to list the required binaries and external script dependencies explicitly (and documents the optional Feishu delivery clearly), and if you verify that the referenced local scripts are trusted, this assessment could be upgraded to benign.

Capability Analysis

Type: OpenClaw Skill
Name: visible-text-extractor
Version: 1.2.0

The skill bundle provides extensive capabilities for web scraping, browser automation (Playwright), and OCR, which involve high-risk behaviors such as arbitrary network requests and shell command execution via subprocess. It includes a feature to send extracted documents to external Feishu recipients using a separate local skill (feishu-file-sender). While these capabilities are aligned with the stated purpose of extracting and delivering text from various media, the broad access to the network, file system, and local command execution environment warrants a suspicious classification under the provided criteria. Key files include extract_visible_text.py for web/OCR logic and build_authorized_capture_docx.py for the delivery pipeline.

Capability Assessment

⚠ Purpose & Capability

The skill claims to extract visible text from pages and images, and the included scripts implement that. However, the package metadata lists no required binaries or credentials, while the scripts clearly call node, ffmpeg, and an external/local OCR stack (paths like /root/.openclaw/.../ocr-local/scripts/ocr.js and /root/.openclaw/venvs/ocrstack/bin/python). The absence of declared runtime requirements is an inconsistency: a consumer needs those dependencies to run the skill at all.
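Because the metadata declares no dependencies, a consumer may want to check for them before the first run. The following is a minimal sketch, not part of the skill: the function name missing_tools and the REQUIRED_TOOLS tuple are hypothetical, and only the node and ffmpeg names come from the assessment above. The elided OCR paths are deliberately left out because their full locations are not given.

```python
import shutil

# Tool names taken from the assessment above; extend the tuple as needed.
REQUIRED_TOOLS = ("node", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```

The which parameter is injectable only so the check can be exercised without the real binaries installed.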

ℹ Instruction Scope

SKILL.md and USAGE.md instruct the agent to download pages/images, render pages via a browser fallback, extract GIF frames, run OCR, and produce docx/JSON/markdown. That scope matches the stated purpose. Some scripts also reference other local skill scripts (Feishu sender, ocr-local) and absolute workspace paths, and they will download arbitrary image URLs discovered in pages. This is expected for the task but worth noting, because it increases the runtime network surface and may reach internal-only URLs if a page contains them.
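One common way to reduce the internal-URL exposure described above is to filter discovered URLs before fetching them. The sketch below is an illustration under stated assumptions, not the skill's own code: is_public_url and resolve_host are hypothetical names, and the check relies on Python's standard ipaddress module. It does not cover redirects or DNS rebinding, which would need additional handling at fetch time.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolve_host(host):
    """Resolve a hostname to the set of IP address strings it maps to."""
    infos = socket.getaddrinfo(host, None)
    return {info[4][0] for info in infos}


def _is_global(address):
    """True if `address` parses as an IP and is globally routable."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        return False


def is_public_url(url, resolver=resolve_host):
    """Return True only if every address behind `url` is globally routable."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    try:
        # IP literals are checked directly, without a DNS lookup.
        addresses = {str(ipaddress.ip_address(host))}
    except ValueError:
        try:
            addresses = resolver(host)
        except (OSError, UnicodeError):
            return False
    return bool(addresses) and all(_is_global(a) for a in addresses)
```

A caller would skip any image URL for which is_public_url returns False, so loopback, private-range, and link-local targets are never requested.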

✓ Install Mechanism

There is no install spec (instruction-only with bundled scripts) so nothing is fetched at install time. Runtime, however, depends on external binaries and other skill scripts (node, ffmpeg, local OCR scripts). The lack of an explicit install section or dependency declaration is the main issue, not the install mechanism itself.

⚠ Credentials

The skill declares no required env vars/credentials, but several scripts can invoke a Feishu file-sender script and will send a generated docx if a user-supplied --send-feishu-receive-id is passed. The code will also invoke external local tools under absolute paths. Requiring no credentials is coherent for read-only extraction, but the optional remote-send behavior and implicit dependencies on other local skill code are not documented in metadata and increase the risk of unintended data sharing or failure due to missing components.
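The opt-in nature of the remote send can be pictured with a small argument-parsing sketch. This is a hypothetical wrapper, not the skill's actual script: only the --send-feishu-receive-id flag name comes from the assessment, while build_parser, planned_steps, and the step labels are invented for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Hypothetical wrapper around the extraction pipeline."
    )
    parser.add_argument("source", help="URL or path to extract text from")
    # Flag name taken from the assessment; a default of None means "do not send".
    parser.add_argument(
        "--send-feishu-receive-id",
        default=None,
        help="Feishu recipient; omit to keep all output local",
    )
    return parser


def planned_steps(argv):
    """Return the ordered steps the wrapper would run for the given arguments."""
    args = build_parser().parse_args(argv)
    steps = ["extract", "build_docx"]
    if args.send_feishu_receive_id:
        steps.append("send_to_feishu")
    return steps
```

With no flag, planned_steps returns only the local steps; the delivery step appears only when a receive-id is passed explicitly, which is the behavior a reviewer would want to confirm in the real scripts.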

✓ Persistence & Privilege

The always flag is false, and the skill does not request permanent inclusion or modify other skills' configs. It writes temporary files and output artifacts to the specified output paths, which is expected for this workflow.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install visible-text-extractor
After installation, invoke the skill by name or use /visible-text-extractor
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.2.0

Visible Text Extractor 1.0.0: Initial release
- Initial public release with support for webpage, article, screenshot, long image, GIF, and image directory text extraction.
- Adds dedicated scripts for WeChat article order reconstruction, high-accuracy OCR, and multi-stage cleanup.
- Introduces specialized pipelines for clean, human-readable output (markdown/Word/JSON).
- Provides workflow and reference documentation for usage, publishing, and release notes.
- Includes multiple extraction and deliverable pipelines, especially tailored for WeChat articles and complex, image-heavy sources.

v1.1.1

Refine reading-order reconstruction guidance, strengthen deliverable quality targets, and improve the skill description around original article flow and user-facing comfort.

v1.1.0

Improve WeChat reading-order reconstruction, stabilize OCR fallback speed, and tighten deliverable quality for cleaner Word output.

v1.0.0

Initial polished public release. Added OCR cleanup, section reconstruction, WeChat handling, troubleshooting docs, and a one-step deliverable pipeline that outputs raw JSON, clean JSON, clean markdown, and Word documents.

Metadata

Slug: visible-text-extractor
Version: 1.2.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 4

Frequently Asked Questions

What is Visible Text Extractor?

Extract and reconstruct as much visible text as possible from webpage URLs, article pages, screenshots, long images, image directories, and GIFs. Use when th... It is an AI Agent Skill for Claude Code / OpenClaw, with 153 downloads so far.

How do I install Visible Text Extractor?

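Run "/install visible-text-extractor" in the OpenClaw or Claude Code chat to install it in one step. The install itself fetches nothing extra, but the scripts depend at runtime on node, ffmpeg, Python, and a local OCR stack, none of which are declared in the package metadata, so make sure they are present before use.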

Is Visible Text Extractor free?

Yes, Visible Text Extractor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Visible Text Extractor support?

Visible Text Extractor is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Visible Text Extractor?

It is built and maintained by wunianze666-netizen (@wunianze666-netizen); the current version is v1.2.0.
