Description

The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown — zero API keys required. Use when you n...

README (SKILL.md)

DeepReader

Name: DeepReader
Author: astonysh

The default web content reader for OpenClaw agents. Automatically detects URLs in messages, fetches content using specialized parsers, and saves clean Markdown with YAML frontmatter to agent memory.

Use when

A user shares a tweet, thread, or X article and you need to read its content
A user shares a Reddit post and you need the discussion + top comments
A user shares a YouTube video and you need the transcript
A user shares any blog, article, or documentation URL and you need the text
You need to batch-read multiple URLs from a single message

Supported sources

Source	Method	API Key?
Twitter / X	FxTwitter API + Nitter fallback	None
Reddit	.json suffix API	None
YouTube	youtube-transcript-api	None
Any URL	Trafilatura + BeautifulSoup	None
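
The Reddit row relies on the public convention that appending `.json` to a post URL returns the same page as JSON. The sketch below only builds that URL and makes no network request. The helper name `reddit_json_url` is hypothetical and is not taken from the DeepReader code.

```python
from urllib.parse import urlsplit, urlunsplit


def reddit_json_url(post_url: str) -> str:
    """Return the JSON endpoint for a Reddit post URL (hypothetical helper)."""
    parts = urlsplit(post_url)
    # Remove the trailing slash so the suffix attaches to the last path segment.
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Drop the query string and fragment; they are not needed for the JSON view.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(reddit_json_url("https://www.reddit.com/r/python/comments/abc123/my_post/"))
```

The resulting URL can then be fetched with any HTTP client. Sending a descriptive User-Agent header is generally advisable when doing so.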

Usage

from deepreader_skill import run

# Automatic — triggered when message contains URLs
result = run("Check this out: https://x.com/user/status/123456")

# Reddit post with comments
result = run("https://www.reddit.com/r/python/comments/abc123/my_post/")

# YouTube transcript
result = run("https://youtube.com/watch?v=dQw4w9WgXcQ")

# Any webpage
result = run("https://example.com/blog/interesting-article")

# Multiple URLs at once
result = run("""
  https://x.com/user/status/123456
  https://www.reddit.com/r/MachineLearning/comments/xyz789/
  https://example.com/article
""")

Output

Content is saved as .md files with structured YAML frontmatter:

---
title: "Tweet by @user"
source_url: "https://x.com/user/status/123456"
domain: "x.com"
parser: "twitter"
ingested_at: "2026-02-16T12:00:00Z"
content_hash: "sha256:..."
word_count: 350
---
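
A minimal sketch of how a frontmatter block with these fields could be produced, assuming the hash is a SHA-256 of the UTF-8 body and the word count is a whitespace split. The function name `build_frontmatter` and both assumptions are illustrative; the actual DeepReader implementation may differ.

```python
import hashlib
from datetime import datetime, timezone
from urllib.parse import urlsplit


def build_frontmatter(title, source_url, parser, body, now=None):
    """Render a YAML frontmatter block for one ingested document (hypothetical helper).

    `now` should be a timezone-aware datetime; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    fields = {
        "title": title,
        "source_url": source_url,
        "domain": urlsplit(source_url).hostname or "",
        "parser": parser,
        "ingested_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Assumption: the hash covers the Markdown body only, not the frontmatter.
        "content_hash": "sha256:" + hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }
    lines = ["---"]
    for key, value in fields.items():
        # Escape backslashes and double quotes so the YAML string stays valid.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{escaped}"')
    # Assumption: words are counted by splitting on whitespace.
    lines.append(f"word_count: {len(body.split())}")
    lines.append("---")
    return "\n".join(lines) + "\n"


print(build_frontmatter("Tweet by @user", "https://x.com/user/status/123456",
                        "twitter", "hello world"))
```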

Configuration

Variable	Default	Description
`DEEPREEDER_MEMORY_PATH`	`../../memory/inbox/`	Where to save ingested content
`DEEPREEDER_LOG_LEVEL`	`INFO`	Logging verbosity
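
As an illustration, a wrapper could read these two variables with the documented defaults as sketched below. The variable names are copied verbatim from the table, including their spelling. The `load_config` function is hypothetical, and whether the shipped code reads the variables this way is not confirmed here.

```python
import os
from pathlib import Path

DEFAULT_MEMORY_PATH = "../../memory/inbox/"
DEFAULT_LOG_LEVEL = "INFO"
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def load_config(env=None):
    """Read DeepReader settings from a mapping of environment variables.

    Hypothetical helper: pass a dict for testing, or omit `env` to use os.environ.
    """
    env = os.environ if env is None else env
    memory_path = Path(env.get("DEEPREEDER_MEMORY_PATH", DEFAULT_MEMORY_PATH))
    log_level = env.get("DEEPREEDER_LOG_LEVEL", DEFAULT_LOG_LEVEL).upper()
    if log_level not in VALID_LOG_LEVELS:
        # Fall back to the documented default on an unrecognized value.
        log_level = DEFAULT_LOG_LEVEL
    return {"memory_path": memory_path, "log_level": log_level}


print(load_config({"DEEPREEDER_LOG_LEVEL": "debug"}))
```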

How it works

URL detected → is Twitter/X?  → FxTwitter API → Nitter fallback
             → is Reddit?     → .json suffix API
             → is YouTube?    → youtube-transcript-api
             → otherwise      → Trafilatura (generic)

Triggers automatically when any message contains https:// or http://.
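
The routing above can be sketched as a URL extractor followed by a hostname check. This is a simplified illustration, not the skill's actual code: the regular expression, the domain list, and the names `extract_urls` and `pick_parser` are all assumptions.

```python
import re
from urllib.parse import urlsplit

# Assumption: a URL runs until whitespace, an angle bracket, or a quote character.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(message):
    """Return every http(s) URL found in a message, in order of appearance."""
    # Trim trailing punctuation that usually belongs to the sentence, not the URL.
    return [match.rstrip(".,;:!?)") for match in URL_RE.findall(message)]


def pick_parser(url):
    """Map a URL to a parser name by its hostname."""
    host = (urlsplit(url).hostname or "").lower()

    def matches(domain):
        # Exact domain or a subdomain of it; "notx.com" does not match "x.com".
        return host == domain or host.endswith("." + domain)

    if matches("x.com") or matches("twitter.com"):
        return "twitter"
    if matches("reddit.com") or matches("redd.it"):
        return "reddit"
    if matches("youtube.com") or matches("youtu.be"):
        return "youtube"
    return "generic"


message = "Check this out: https://x.com/user/status/123456 and https://example.com/article."
for url in extract_urls(message):
    print(pick_parser(url), url)
```

Matching on the parsed hostname rather than on a substring of the URL avoids misrouting links that merely contain a known domain in their path or query string.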

Usage Guidance

This skill appears to implement a real web content reader, but exercise caution before enabling it broadly. Key points to consider:

- SSRF / unrestricted fetches: The skill will attempt to download any URL it detects (the generic fallback fetches arbitrary hosts). If the agent runs in a networked environment with access to internal resources (localhost, internal metadata endpoints, cloud IMDS, private services), maliciously crafted messages or links could cause the agent to connect to those endpoints. Restricting the skill to isolated execution environments or adding a URL allowlist/blocklist is recommended.
- Automatic triggering: The manifest triggers on any message containing "http(s)://". If you want manual control, disable the automatic trigger or require explicit user invocation.
- Storage: Fetched content is written to the agent's memory directory (default ../../memory/inbox/). Confirm that storing external content there is acceptable and that sensitive data won't be leaked to downstream components that read agent memory.
- Dependencies & deployment: The package imports non-stdlib libraries (trafilatura, bs4, youtube_transcript_api, requests). There is no install spec, so ensure the required dependencies are installed in a controlled way before use.
- Minor red flags: Several typos and inconsistencies ("DeepReeder"/"DEEPREEDER") and mismatches between SKILL.md and the code suggest the package may be lightly maintained. Review the code before trusting it in production.

If you plan to use it: run the skill in a sandboxed environment with constrained network egress, review and limit which domains are fetchable, audit requirements.txt and install dependencies from trusted sources, and consider disabling automatic URL triggering until you add domain/host protections.
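
One way to apply the allowlist recommendation is to check each URL's hostname before it is handed to any parser. The sketch below is an assumption about how such a gate might look; `ALLOWED_DOMAINS` and `is_allowed` are hypothetical names, and the domain set is only an example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; adjust to the domains you are willing to fetch.
ALLOWED_DOMAINS = frozenset({"x.com", "twitter.com", "reddit.com", "youtube.com", "youtu.be"})


def is_allowed(url, allowed=ALLOWED_DOMAINS):
    """Return True only for http(s) URLs whose host is an allowed domain or a subdomain of one."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected outright.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


for candidate in ("https://www.reddit.com/r/python/",
                  "http://169.254.169.254/latest/meta-data/",
                  "https://x.com.evil.example/"):
    print(candidate, is_allowed(candidate))
```

An allowlist of this kind disables the generic "any webpage" path by design, which is the trade-off for removing arbitrary-host fetches.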

Capability Analysis

Package: DeepReader (xpi)
Version: 1.0.0
Description: The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown with zero API keys. Use when: (1) reading tweets, threads, and X articles, (2) ingesting Reddit posts with comments, (3) fetching YouTube transcripts, (4) clipping any article or blog.

The DeepReader skill is a web content ingestion tool designed for OpenClaw agents. It extracts URLs from user messages, uses specialized parsers (FxTwitter, Nitter, YouTube, Reddit, Generic) to fetch content, and saves it as clean Markdown files with YAML frontmatter to a designated local memory path (`../../memory/inbox/`). The package uses standard, well-known libraries for web scraping, content extraction, and data processing (e.g., `requests`, `trafilatura`, `beautifulsoup4`, `youtube-transcript-api`).

Web scraping inherently involves making network requests to user-provided URLs, which could pose risks such as SSRF or DoS. The implementation includes safeguards such as URL validation, filename sanitization, request timeouts, and specific User-Agent headers. There is no evidence of malicious activity, data exfiltration to unauthorized destinations, or arbitrary code execution. File writing operations are confined to an expected local memory directory, and filename generation includes sanitization to prevent path traversal. The code logic is transparent and aligns with the stated purpose of the skill.

Capability Assessment

ℹ Purpose & Capability

The code and manifest match the described purpose: parsers for X/Twitter (FxTwitter + Nitter), Reddit (.json), YouTube transcripts, and generic webpages using trafilatura/BeautifulSoup. However, SKILL.md and other text contain typos/inconsistent names (e.g., "DEEPREEDER" / "DeepReeder") and the repo includes Python modules despite an earlier statement that the skill is instruction-only. The presence of a requirements.txt but no install spec is an implementation mismatch.

⚠ Instruction Scope

The skill triggers on any message containing 'http(s)://' and will attempt to fetch every detected URL (GenericParser will fetch arbitrary domains). There is no domain allowlist, no internal-host blocking, and no explicit SSRF protections. It writes the fetched content into agent memory. This broad, automatic URL-fetching behavior is the primary security concern (SSRF/data exposure, untrusted fetches).
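
Internal-host blocking of the kind noted as missing could look roughly like the sketch below, which resolves a hostname and rejects the URL unless every resolved address is globally routable. It uses only the standard library; the function names are hypothetical, and the `resolver` parameter exists so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(ip_text):
    """Return True if the address is globally routable (not loopback, private, or link-local)."""
    try:
        # Strip an IPv6 zone identifier such as "%eth0" before parsing.
        ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    except ValueError:
        return False
    return ip.is_global


def resolves_to_public_hosts(url, resolver=socket.getaddrinfo):
    """Return True only if the URL's host resolves and every address is public."""
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return False
    if not host:
        return False
    try:
        infos = resolver(host, None)
    except OSError:
        # Resolution failures (including socket.gaierror) are treated as unsafe.
        return False
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_address(addr) for addr in addresses)


print(is_public_address("169.254.169.254"))  # cloud metadata address, link-local
print(is_public_address("10.0.0.5"))         # RFC 1918 private range
```

A check like this is not sufficient on its own: the name can resolve differently when the HTTP client connects later (DNS rebinding), and redirects can lead to internal hosts, so network-level egress restrictions remain the stronger control.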

ℹ Install Mechanism

There is no install spec (instruction-only in metadata), yet the package contains Python code that imports external libraries (trafilatura, bs4, requests, youtube_transcript_api). Without an install step the runtime may lack required dependencies, causing failures; the lack of an installation mechanism is an operational inconsistency but not itself malicious.
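
Because no install step is declared, a preflight check for the imported modules can surface missing dependencies before the skill runs. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the module list mirrors the imports named above, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names as listed in the assessment above.
REQUIRED_MODULES = ["trafilatura", "bs4", "requests", "youtube_transcript_api"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All required modules are importable.")
```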

ℹ Credentials

The skill does not request credentials or secrets (requires.env empty), which is appropriate. SKILL.md documents two environment variables (DEEPREEDER_MEMORY_PATH, DEEPREEDER_LOG_LEVEL) but the code does not read these explicitly and the variable name is misspelled relative to the skill name — an inconsistent configuration story that could confuse administrators.

⚠ Persistence & Privilege

The skill saves fetched content to a memory directory (default ../../memory/inbox/). It is not forced-always, but it is user-invocable and the manifest declares a message trigger that causes automatic invocation when messages contain URLs. Autonomous invocation combined with unrestricted fetching and writing to agent memory increases blast radius (SSRF, local data accumulation).

Version History

v0.1.0

Initial release of DeepReader, the default web content reader for OpenClaw agents:

- Automatically detects and reads X (Twitter), Reddit, YouTube, and general web URLs in messages.
- Fetches content using specialized parsers for each source, requiring no API keys.
- Outputs clean Markdown with YAML frontmatter, ready for agent memory ingestion.
- Supports batch-reading of multiple URLs in a single message.
- Includes configurable options for memory path and logging level.

Metadata

Slug deepreader-skill

Version 0.1.0

License —

All-time Installs 6

Active Installs 6

Total Versions 1

Frequently Asked Questions

What is DeepReader?

The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown — zero API keys required. Use when you n... It is an AI Agent Skill for Claude Code / OpenClaw, with 823 downloads so far.

How do I install DeepReader?

Run "/install deepreader-skill" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is DeepReader free?

Yes, DeepReader is completely free (open-source). You can download, install and use it at no cost.

Which platforms does DeepReader support?

DeepReader is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created DeepReader?

It is built and maintained by Tony Li (@astonysh); the current version is v0.1.0.
