astonysh

DeepReader

by Tony Li · GitHub ↗ · v0.1.0
cross-platform ⚠ suspicious
823 Downloads · 2 Stars · 6 Active Installs · 1 Version
Install in OpenClaw
/install deepreader-skill
Description
The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown — zero API keys required. Use when you n...
README (SKILL.md)

DeepReader

The default web content reader for OpenClaw agents. Automatically detects URLs in messages, fetches content using specialized parsers, and saves clean Markdown with YAML frontmatter to agent memory.

Use when

  1. A user shares a tweet, thread, or X article and you need to read its content
  2. A user shares a Reddit post and you need the discussion + top comments
  3. A user shares a YouTube video and you need the transcript
  4. A user shares any blog, article, or documentation URL and you need the text
  5. You need to batch-read multiple URLs from a single message

Supported sources

| Source | Method | API Key? |
| --- | --- | --- |
| Twitter / X | FxTwitter API + Nitter fallback | None |
| Reddit | .json suffix API | None |
| YouTube | youtube-transcript-api | None |
| Any URL | Trafilatura + BeautifulSoup | None |
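
The Reddit row relies on the fact that appending .json to a post URL returns the post and its comments as JSON. A minimal sketch of that URL rewrite follows; `reddit_json_url` is a hypothetical helper name for illustration, not a function confirmed to exist in the skill.

```python
from urllib.parse import urlsplit, urlunsplit


def reddit_json_url(post_url: str) -> str:
    """Return the .json variant of a Reddit post URL (hypothetical helper)."""
    parts = urlsplit(post_url)
    # Drop the trailing slash so the suffix attaches to the last path segment.
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Query string and fragment are discarded; scheme and host are kept as-is.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

The resulting URL can then be fetched with any HTTP client; Reddit generally expects a descriptive User-Agent header on such requests.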

Usage

from deepreader_skill import run

# Automatic — triggered when message contains URLs
result = run("Check this out: https://x.com/user/status/123456")

# Reddit post with comments
result = run("https://www.reddit.com/r/python/comments/abc123/my_post/")

# YouTube transcript
result = run("https://youtube.com/watch?v=dQw4w9WgXcQ")

# Any webpage
result = run("https://example.com/blog/interesting-article")

# Multiple URLs at once
result = run("""
  https://x.com/user/status/123456
  https://www.reddit.com/r/MachineLearning/comments/xyz789/
  https://example.com/article
""")

Output

Content is saved as .md files with structured YAML frontmatter:

---
title: "Tweet by @user"
source_url: "https://x.com/user/status/123456"
domain: "x.com"
parser: "twitter"
ingested_at: "2026-02-16T12:00:00Z"
content_hash: "sha256:..."
word_count: 350
---
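
As an illustration of how a frontmatter block like the one above could be produced, here is a minimal sketch. The function name `build_frontmatter` and its parameters are assumptions made for this example; the skill's actual implementation may differ.

```python
import hashlib
import json
from datetime import datetime, timezone
from urllib.parse import urlsplit


def build_frontmatter(title, source_url, parser, body, now=None):
    """Render a YAML frontmatter block for ingested content (hypothetical helper)."""
    # Normalize to UTC so the trailing "Z" is accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    fields = [
        ("title", title),
        ("source_url", source_url),
        ("domain", urlsplit(source_url).hostname or ""),
        ("parser", parser),
        ("ingested_at", now.strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("content_hash", f"sha256:{digest}"),
    ]
    lines = ["---"]
    # json.dumps yields double-quoted, escaped strings, which are valid YAML scalars.
    lines += [f"{key}: {json.dumps(value, ensure_ascii=False)}" for key, value in fields]
    # word_count is a plain whitespace-split count of the body text.
    lines.append(f"word_count: {len(body.split())}")
    lines.append("---")
    return "\n".join(lines)
```

Hashing the body rather than the URL means two fetches of the same page can be compared for content changes.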

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| DEEPREEDER_MEMORY_PATH | ../../memory/inbox/ | Where to save ingested content |
| DEEPREEDER_LOG_LEVEL | INFO | Logging verbosity |
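
A minimal sketch of how these two variables could be read with their documented defaults is shown below. The `load_config` function is hypothetical, and the Capability Assessment on this page reports that the shipped code does not read these variables explicitly, so treat this as an illustration of the intended behavior rather than the skill's code.

```python
import logging
import os
from pathlib import Path

# Accepted level names; anything else falls back to INFO.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def load_config(env=None):
    """Return (memory_path, log_level) from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # Variable names are spelled exactly as documented in the table above.
    memory_path = Path(env.get("DEEPREEDER_MEMORY_PATH", "../../memory/inbox/"))
    level_name = env.get("DEEPREEDER_LOG_LEVEL", "INFO").upper()
    return memory_path, _LEVELS.get(level_name, logging.INFO)
```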

How it works

URL detected → is Twitter/X?  → FxTwitter API → Nitter fallback
             → is Reddit?     → .json suffix API
             → is YouTube?    → youtube-transcript-api
             → otherwise      → Trafilatura (generic)

Triggers automatically when any message contains https:// or http://.
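
The detection and routing flow above can be sketched roughly as follows. The regular expression, the host sets, and the function names `extract_urls` and `pick_parser` are assumptions for illustration; the skill's real matching rules are not shown on this page.

```python
import re
from urllib.parse import urlsplit

# Assumed pattern: an http(s) scheme followed by non-space, non-quote characters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

# Assumed host sets; the actual parsers may recognize more domains.
TWITTER_HOSTS = {"x.com", "twitter.com"}
REDDIT_HOSTS = {"reddit.com", "old.reddit.com"}
YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "youtu.be"}


def extract_urls(message: str) -> list[str]:
    """Find URLs in a message, trimming trailing sentence punctuation."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(message)]


def pick_parser(url: str) -> str:
    """Map a URL to a parser name, falling back to the generic parser."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in TWITTER_HOSTS:
        return "twitter"
    if host in REDDIT_HOSTS:
        return "reddit"
    if host in YOUTUBE_HOSTS:
        return "youtube"
    return "generic"
```

Matching on the parsed hostname rather than on a substring of the URL avoids misrouting links such as https://example.com/?ref=x.com.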

Usage Guidance
This skill appears to implement a real web content reader, but exercise caution before enabling it broadly. Key points to consider:

  - SSRF / unrestricted fetches: The skill will attempt to download any URL it detects (generic fallback fetches arbitrary hosts). If the agent runs in a networked environment with access to internal resources (localhost, internal metadata endpoints, cloud IMDS, private services), maliciously crafted messages or links could cause the agent to connect to those endpoints. Restricting the skill to isolated execution environments or adding a URL allowlist/blocklist is recommended.
  - Automatic triggering: The manifest triggers on any message containing "http(s)://". If you want manual control, disable the automatic trigger or require explicit user invocation.
  - Storage: Fetched content is written to the agent's memory directory (default ../../memory/inbox/). Confirm that storing external content there is acceptable and that sensitive data won't be leaked to downstream components that read agent memory.
  - Dependencies & deployment: The package imports non-stdlib libraries (trafilatura, bs4, youtube_transcript_api, requests). There is no install spec, so ensure required dependencies are installed in a controlled way before use.
  - Minor red flags: Several typos/inconsistencies ("DeepReeder"/"DEEPREEDER") and mismatches between SKILL.md and code suggest the package may be lightly maintained; review the code before trusting it in production.

If you plan to use it: run the skill in a sandboxed environment with constrained network egress, review/limit which domains are fetchable, audit requirements.txt and install dependencies from trusted sources, and consider disabling automatic URL-triggering until you add domain/host protections.
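
One way to limit which domains are fetchable is a simple allowlist check before any request is made. The sketch below is an assumption about how such a check might look; `ALLOWED_DOMAINS` and `host_allowed` are hypothetical names, and the skill itself ships no such list.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; adjust to the domains you actually want fetched.
ALLOWED_DOMAINS = frozenset({"x.com", "reddit.com", "youtube.com"})


def host_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Return True only if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # The "." prefix prevents look-alike hosts such as evilreddit.com from matching.
    return any(host == domain or host.endswith("." + domain) for domain in allowed)
```

An allowlist is stricter than a blocklist: anything not named is refused, which also disables the generic "any URL" parser unless its target domains are added explicitly.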
Capability Analysis
Package: DeepReader (xpi)
Version: 1.0.0
Description: The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown, zero API keys. Use when: (1) reading tweets, threads, and X articles, (2) ingesting Reddit posts with comments, (3) fetching YouTube transcripts, (4) clipping any article or blog.

The DeepReader skill is a web content ingestion tool designed for OpenClaw agents. It extracts URLs from user messages, uses specialized parsers (FxTwitter, Nitter, YouTube, Reddit, Generic) to fetch content, and saves it as clean Markdown files with YAML frontmatter to a designated local memory path (`../../memory/inbox/`). The package utilizes standard and well-known libraries for web scraping, content extraction, and data processing (e.g., `requests`, `trafilatura`, `beautifulsoup4`, `youtube-transcript-api`).

While web scraping inherently involves making network requests to user-provided URLs, which could pose risks like SSRF or DoS, the implementation includes safeguards such as URL validation, filename sanitization, request timeouts, and specific User-Agent headers. There is no evidence of malicious activity, data exfiltration to unauthorized destinations, or arbitrary code execution. File writing operations are confined to an expected local memory directory, and filename generation includes sanitization to prevent path traversal. The code logic is transparent and aligns with the stated purpose of the skill.
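
To make the filename sanitization point concrete, here is a minimal sketch of the general technique. It is not the skill's actual code; `safe_filename` and `resolve_in_inbox` are hypothetical names chosen for this example.

```python
import re
from pathlib import Path


def safe_filename(title: str, max_len: int = 80) -> str:
    """Reduce an arbitrary title to a flat .md filename with no path separators."""
    # Replace every run of characters outside a conservative set with one hyphen.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", title).strip("-.")
    return (slug[:max_len] or "untitled") + ".md"


def resolve_in_inbox(inbox: Path, title: str) -> Path:
    """Build the target path and verify that it stays inside the inbox directory."""
    root = inbox.resolve()
    target = (root / safe_filename(title)).resolve()
    # Defense in depth: refuse anything that resolves outside the inbox.
    if root not in target.parents:
        raise ValueError(f"refusing to write outside {root}")
    return target
```

Stripping separators handles the common case, while the resolved-path check guards against anything the character filter might miss.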
Capability Assessment
Purpose & Capability
The code and manifest match the described purpose: parsers for X/Twitter (FxTwitter + Nitter), Reddit (.json), YouTube transcripts, and generic webpages using trafilatura/BeautifulSoup. However, SKILL.md and other text contain typos/inconsistent names (e.g., "DEEPREEDER" / "DeepReeder") and the repo includes Python modules despite an earlier statement that the skill is instruction-only. The presence of a requirements.txt but no install spec is an implementation mismatch.
Instruction Scope
The skill triggers on any message containing 'http(s)://' and will attempt to fetch every detected URL (GenericParser will fetch arbitrary domains). There is no domain allowlist, no internal-host blocking, and no explicit SSRF protections. It writes the fetched content into agent memory. This broad, automatic URL-fetching behavior is the primary security concern (SSRF/data exposure, untrusted fetches).
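
A minimal sketch of the internal-host blocking described as missing here: resolve the hostname and reject loopback, private, link-local, and similar addresses before fetching. The function name `is_fetchable` is hypothetical, and this is an illustration under stated assumptions, not a complete SSRF defense.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_fetchable(url: str) -> bool:
    """Return False for non-HTTP schemes and for hosts resolving to internal addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False  # Unresolvable hosts are rejected.
    for info in infos:
        # Strip any IPv6 scope id (e.g. "%eth0") before parsing.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
            return False
    return bool(infos)
```

A check like this only covers the first hop. Redirects need to be re-validated, and because the HTTP client resolves the name again, a hostile DNS server could answer differently the second time; pinning the connection to the vetted address or enforcing egress rules at the network layer closes that gap.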
Install Mechanism
There is no install spec (instruction-only in metadata), yet the package contains Python code that imports external libraries (trafilatura, bs4, requests, youtube_transcript_api). Without an install step the runtime may lack required dependencies, causing failures; the lack of an installation mechanism is an operational inconsistency but not itself malicious.
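
Because there is no install step, a quick pre-flight check can reveal which imports would fail at runtime. The sketch below uses the module names listed above; `missing_modules` is a hypothetical helper, not part of the skill.

```python
from importlib.util import find_spec

# Import names taken from the assessment above.
REQUIRED_MODULES = ("trafilatura", "bs4", "requests", "youtube_transcript_api")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Note that the import name bs4 corresponds to the beautifulsoup4 package and youtube_transcript_api to youtube-transcript-api when installing.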
Credentials
The skill does not request credentials or secrets (requires.env empty), which is appropriate. SKILL.md documents two environment variables (DEEPREEDER_MEMORY_PATH, DEEPREEDER_LOG_LEVEL) but the code does not read these explicitly and the variable name is misspelled relative to the skill name — an inconsistent configuration story that could confuse administrators.
Persistence & Privilege
The skill saves fetched content to a memory directory (default ../../memory/inbox/). It is not forced-always, but it is user-invocable and the manifest declares a message trigger that causes automatic invocation when messages contain URLs. Autonomous invocation combined with unrestricted fetching and writing to agent memory increases blast radius (SSRF, local data accumulation).
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install deepreader-skill
  3. After installation, invoke the skill by name or use /deepreader-skill
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Initial release of DeepReader, the default web content reader for OpenClaw agents:

  - Automatically detects and reads X (Twitter), Reddit, YouTube, and general web URLs in messages.
  - Fetches content using specialized parsers for each source, requiring no API keys.
  - Outputs clean Markdown with YAML frontmatter, ready for agent memory ingestion.
  - Supports batch-reading of multiple URLs in a single message.
  - Includes configurable options for memory path and logging level.
Metadata
Slug deepreader-skill
Version 0.1.0
License
All-time Installs 6
Active Installs 6
Total Versions 1
Frequently Asked Questions

What is DeepReader?

The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown — zero API keys required. Use when you n... It is an AI Agent Skill for Claude Code / OpenClaw, with 823 downloads so far.

How do I install DeepReader?

Run "/install deepreader-skill" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is DeepReader free?

Yes, DeepReader is completely free (open-source). You can download, install and use it at no cost.

Which platforms does DeepReader support?

DeepReader is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created DeepReader?

It is built and maintained by Tony Li (@astonysh); the current version is v0.1.0.
