goog

tra-extract-text

by Jay · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Downloads 208
Stars 0
Active Installs 0
Versions 1
Install in OpenClaw
/install tra-extract-text
Description
Extract readable text, markdown, HTML, JSON, or XML content from web pages using the trafilatura CLI tool with optional metadata and output formatting.
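As an illustration of the description above, the sketch below builds a single trafilatura CLI call from Python. The helper names (build_command, extract) and the example URL are hypothetical and not part of the skill. The flags used (-u, --output-format, --with-metadata) are assumed from recent trafilatura releases; the "html" output value in particular may need a newer version, so confirm with "trafilatura --help" before relying on it.

```python
import shutil
import subprocess

# Formats named in the description, mapped to values assumed to be accepted
# by trafilatura's --output-format option (verify with `trafilatura --help`).
FORMATS = {
    "text": "txt",
    "markdown": "markdown",
    "html": "html",
    "json": "json",
    "xml": "xml",
}


def build_command(url, fmt="markdown", with_metadata=False):
    """Return the argv list for one trafilatura CLI call (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = ["trafilatura", "-u", url, "--output-format", FORMATS[fmt]]
    if with_metadata:
        # Ask the CLI to include metadata such as title, author, and date.
        cmd.append("--with-metadata")
    return cmd


def extract(url, fmt="markdown", with_metadata=False):
    """Run the CLI and return its stdout, or None if trafilatura is not installed."""
    if shutil.which("trafilatura") is None:
        return None
    result = subprocess.run(
        build_command(url, fmt, with_metadata),
        capture_output=True,
        text=True,
        check=True,
        timeout=60,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; nothing is downloaded here.
    print(build_command("https://example.org/article", "markdown", with_metadata=True))
```

Passing the arguments as a list rather than a shell string keeps a user-supplied URL from being interpreted by the shell.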
Usage Guidance
This skill is coherent and simply documents how to install and use the trafilatura CLI. Before installing, consider:
  1. 'pip install trafilatura' will fetch code from PyPI. Verify the package's reputation and optionally pin a specific version.
  2. There is no homepage/source repo listed in the skill metadata. If you care about provenance, check PyPI and the project's source to confirm authenticity.
  3. Run the tool in a sandboxed environment if you are concerned about executing third-party code.
  4. Be cautious when extracting content from internal or sensitive URLs (this can expose internal data to the agent environment).
  5. Ensure you respect site terms/robots and copyright when scraping.
If you want stricter controls, add an explicit install spec (trusted package source and pinned version) or pre-install trafilatura in a controlled environment rather than letting the agent run pip at runtime.
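The caution about internal or sensitive URLs can be partly automated with a pre-check before a URL is handed to the tool. The following is a minimal sketch using only the Python standard library; the function name looks_public and the list of rejected hostname suffixes are assumptions for illustration, not something the skill provides. It does not resolve DNS, so a public-looking hostname that points at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Hostname suffixes treated as internal (an illustrative, incomplete list).
INTERNAL_SUFFIXES = (".local", ".internal", ".localhost")


def looks_public(url):
    """Rough pre-check: True only for http(s) URLs that do not obviously point inward."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. The name could still resolve to a private
        # address; this sketch does not perform DNS resolution.
        return True
    # Rejects loopback, private, and link-local ranges.
    return addr.is_global


if __name__ == "__main__":
    for candidate in ("https://example.org/post", "http://127.0.0.1:8080/admin"):
        print(candidate, looks_public(candidate))
```

A check like this reduces accidental exposure but is not a substitute for running the tool in a sandbox with restricted network access.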
Capability Analysis
Type: OpenClaw Skill
Name: tra-extract-text
Version: 1.0.0
The skill bundle provides standard instructions and examples for using the legitimate 'trafilatura' Python library and CLI tool to extract text from web pages. The SKILL.md file contains typical usage patterns (markdown, text, and metadata extraction) and lacks any indicators of malicious intent, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
The name and description (extract web page text/markdown/HTML/JSON/XML) match the SKILL.md, which documents using the trafilatura CLI and its options. There are no unrelated credentials, binaries, or config paths requested.
Instruction Scope
Runtime instructions are narrowly scoped to installing trafilatura and running the trafilatura CLI against user-provided URLs; they do not ask the agent to read unrelated files, environment variables, or to transmit results to any unexpected external endpoint.
Install Mechanism
There is no formal install spec, but SKILL.md instructs running 'pip install trafilatura' (PyPI). This is expected for a Python CLI tool but means the agent or user will download code from PyPI at install time — a moderate, expected risk. The skill does not pin a version or point to an authoritative homepage/source repo.
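Because the skill does not pin a version, one way to tighten this is to pre-install trafilatura at a version you have reviewed and verify it before use. The sketch below checks the installed version with the standard library's importlib.metadata; the helper name matches_pin and the "X.Y.Z" placeholder are hypothetical, and the version to pin is left for the operator to choose.

```python
from importlib.metadata import PackageNotFoundError, version


def matches_pin(package, expected_version):
    """Return True if `package` is installed at exactly `expected_version`."""
    try:
        return version(package) == expected_version
    except PackageNotFoundError:
        # The package is not installed in this environment at all.
        return False


if __name__ == "__main__":
    # "X.Y.Z" is a placeholder; substitute the release you have vetted.
    if not matches_pin("trafilatura", "X.Y.Z"):
        print("trafilatura is missing or not at the pinned version")
```

The same pin can be applied at install time with pip's "package==version" syntax, so the check above should only fail if the environment has drifted.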
Credentials
The skill requests no environment variables, secrets, or config paths. That is proportionate to its stated purpose.
Persistence & Privilege
The skill is instruction-only, not always-enabled, and does not request system-wide changes or persistent privileges. Autonomous invocation is permitted (platform default) but not combined with other concerning privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install tra-extract-text
  3. After installation, invoke the skill by name or use /tra-extract-text
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of tra-extract-text.
- Extracts readable text, markdown, or raw HTML from web pages using the trafilatura CLI tool.
- Supports output in multiple formats: Markdown, plain text, HTML, JSON, and XML.
- Includes options for adding metadata (title, author, date) to extracted content.
- Simple command-line interface for extracting, formatting, and saving web content.
Metadata
Slug tra-extract-text
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is tra-extract-text?

tra-extract-text extracts readable text, markdown, HTML, JSON, or XML content from web pages using the trafilatura CLI tool, with optional metadata and output formatting. It is an AI Agent Skill for Claude Code / OpenClaw, with 208 downloads so far.

How do I install tra-extract-text?

Run "/install tra-extract-text" in the OpenClaw or Claude Code chat to install the skill in one step. The trafilatura tool itself is installed separately: SKILL.md instructs running 'pip install trafilatura'.

Is tra-extract-text free?

Yes, tra-extract-text is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does tra-extract-text support?

tra-extract-text is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created tra-extract-text?

It is built and maintained by Jay (@goog); the current version is v1.0.0.
