
DeepReader

Name: DeepReader
Author: astonysh

By Tony Li · GitHub · v0.1.0

cross-platform ⚠ suspicious

823 total downloads

Install in OpenClaw

/install deepreader-skill

Description

The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown — zero API keys required. Use when you n...

Usage instructions (SKILL.md)

DeepReader

The default web content reader for OpenClaw agents. Automatically detects URLs in messages, fetches content using specialized parsers, and saves clean Markdown with YAML frontmatter to agent memory.

Use when

A user shares a tweet, thread, or X article and you need to read its content
A user shares a Reddit post and you need the discussion + top comments
A user shares a YouTube video and you need the transcript
A user shares any blog, article, or documentation URL and you need the text
You need to batch-read multiple URLs from a single message
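Batch reading depends on pulling every URL out of a message first. The following is a minimal sketch of that step, not the skill's actual implementation; the `extract_urls` name and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical pattern: an http:// or https:// prefix followed by
# everything up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(message: str) -> list[str]:
    """Return the unique URLs found in a message, in order of appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(message):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.rstrip(".,;:!?)\"'")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Deduplicating while preserving order keeps a repeated link from being fetched twice in the same batch.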

Supported sources

Source	Method	API Key?
Twitter / X	FxTwitter API + Nitter fallback	None
Reddit	.json suffix API	None
YouTube	youtube-transcript-api	None
Any URL	Trafilatura + BeautifulSoup	None
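The "`.json` suffix" method in the table refers to Reddit serving a JSON view of a post when `.json` is appended to its path. A small sketch of the URL rewrite is shown here; `reddit_json_url` is a hypothetical helper, and how the skill itself builds the URL may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def reddit_json_url(post_url: str) -> str:
    """Turn a Reddit post URL into its .json equivalent."""
    parts = urlsplit(post_url)
    path = parts.path.rstrip("/")  # the suffix goes on the last path segment
    if not path.endswith(".json"):
        path += ".json"
    # Keep the query string, drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

For example, `reddit_json_url("https://www.reddit.com/r/python/comments/abc123/my_post/")` yields `https://www.reddit.com/r/python/comments/abc123/my_post.json`.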

Usage

from deepreader_skill import run

# Automatic — triggered when message contains URLs
result = run("Check this out: https://x.com/user/status/123456")

# Reddit post with comments
result = run("https://www.reddit.com/r/python/comments/abc123/my_post/")

# YouTube transcript
result = run("https://youtube.com/watch?v=dQw4w9WgXcQ")

# Any webpage
result = run("https://example.com/blog/interesting-article")

# Multiple URLs at once
result = run("""
  https://x.com/user/status/123456
  https://www.reddit.com/r/MachineLearning/comments/xyz789/
  https://example.com/article
""")

Output

Content is saved as .md files with structured YAML frontmatter:

---
title: "Tweet by @user"
source_url: "https://x.com/user/status/123456"
domain: "x.com"
parser: "twitter"
ingested_at: "2026-02-16T12:00:00Z"
content_hash: "sha256:..."
word_count: 350
---
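To make the frontmatter fields concrete, here is one way such a block could be produced. This is a sketch under assumptions: `build_frontmatter` is a hypothetical function, and the choices to hash the UTF-8 body and to count whitespace-separated words are guesses at how `content_hash` and `word_count` are derived.

```python
import hashlib
import json
from datetime import datetime, timezone
from urllib.parse import urlsplit


def build_frontmatter(title, source_url, parser, body, now=None):
    """Render a YAML frontmatter block with the fields shown above."""
    now = now or datetime.now(timezone.utc)  # expected to be a UTC datetime
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    fields = [
        ("title", title),
        ("source_url", source_url),
        ("domain", urlsplit(source_url).hostname or ""),
        ("parser", parser),
        ("ingested_at", now.strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("content_hash", "sha256:" + digest),
        ("word_count", len(body.split())),
    ]
    # json.dumps produces double-quoted strings and bare integers,
    # both of which are also valid YAML scalars.
    lines = [f"{key}: {json.dumps(value, ensure_ascii=False)}" for key, value in fields]
    return "---\n" + "\n".join(lines) + "\n---\n"
```

A content hash of this kind can be used to skip re-saving a page whose text has not changed since the last ingestion.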

Configuration

Variable	Default	Description
`DEEPREEDER_MEMORY_PATH`	`../../memory/inbox/`	Where to save ingested content
`DEEPREEDER_LOG_LEVEL`	`INFO`	Logging verbosity
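The two variables could be read as in the sketch below, which falls back to the documented defaults. The variable names are spelled exactly as in the table above (`DEEPREEDER_...`); `load_config` is a hypothetical helper, and whether the packaged code reads these variables at all should be confirmed against the source.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read the documented settings, falling back to their defaults."""
    env = os.environ if env is None else env
    return {
        # Names are copied verbatim from the configuration table.
        "memory_path": Path(env.get("DEEPREEDER_MEMORY_PATH", "../../memory/inbox/")),
        "log_level": env.get("DEEPREEDER_LOG_LEVEL", "INFO").upper(),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.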

How it works

URL detected → is Twitter/X?  → FxTwitter API → Nitter fallback
             → is Reddit?     → .json suffix API
             → is YouTube?    → youtube-transcript-api
             → otherwise      → Trafilatura (generic)

Triggers automatically when any message contains https:// or http://.
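The routing above amounts to a dispatch on the URL's hostname. A minimal sketch follows; the host sets, the `pick_parser` name, and the parser labels other than `"twitter"` (which appears in the sample frontmatter) are assumptions rather than the skill's real identifiers.

```python
from urllib.parse import urlsplit

# Hypothetical host lists; the real skill may recognise more domains.
TWITTER_HOSTS = {"x.com", "twitter.com"}
REDDIT_HOSTS = {"reddit.com", "old.reddit.com"}
YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "youtu.be"}


def pick_parser(url: str) -> str:
    """Choose a parser label from the URL's hostname."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in TWITTER_HOSTS:
        return "twitter"
    if host in REDDIT_HOSTS:
        return "reddit"
    if host in YOUTUBE_HOSTS:
        return "youtube"
    return "generic"  # everything else goes to the generic extractor
```

Matching on the parsed hostname rather than on a substring of the URL avoids misrouting links such as `https://example.com/?ref=x.com`.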

Security recommendations

This skill appears to implement a real web content reader, but exercise caution before enabling it broadly. Key points to consider:

- SSRF / unrestricted fetches: The skill will attempt to download any URL it detects (generic fallback fetches arbitrary hosts). If the agent runs in a networked environment with access to internal resources (localhost, internal metadata endpoints, cloud IMDS, private services), maliciously crafted messages or links could cause the agent to connect to those endpoints. Restricting the skill to isolated execution environments or adding a URL allowlist/blocklist is recommended.
- Automatic triggering: The manifest triggers on any message containing "http(s)://". If you want manual control, disable the automatic trigger or require explicit user invocation.
- Storage: Fetched content is written to the agent's memory directory (default ../../memory/inbox/). Confirm that storing external content there is acceptable and that sensitive data won't be leaked to downstream components that read agent memory.
- Dependencies & deployment: The package imports non-stdlib libraries (trafilatura, bs4, youtube_transcript_api, requests). There is no install spec, so ensure required dependencies are installed in a controlled way before use.
- Minor red flags: Several typos/inconsistencies ("DeepReeder"/"DEEPREEDER") and mismatches between SKILL.md and code suggest the package may be lightly maintained. Review the code before trusting it in production.

If you plan to use it: run the skill in a sandboxed environment with constrained network egress, review/limit which domains are fetchable, audit requirements.txt and install dependencies from trusted sources, and consider disabling automatic URL-triggering until you add domain/host protections.
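One way to add the internal-host protection recommended here is to resolve the hostname and refuse anything that is not a globally routable address. This is a sketch of a pre-fetch check, not part of the skill; `is_safe_to_fetch` is a hypothetical name.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_to_fetch(url: str) -> bool:
    """Reject non-HTTP(S) URLs and hosts that resolve to non-public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are not fetched
    for info in infos:
        # Strip any IPv6 zone suffix (e.g. "%eth0") before parsing.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        # is_global is False for loopback, private, link-local
        # (including 169.254.169.254), and other reserved ranges.
        if not address.is_global:
            return False
    return True
```

A check like this does not pin the resolved address, so a host could still resolve differently at fetch time (DNS rebinding), and HTTP redirects would need the same validation. Constrained network egress remains the stronger control.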

Functional analysis

Package: DeepReader (xpi)
Version: 1.0.0
Description: The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown, with zero API keys. Use when: (1) reading tweets, threads, and X articles, (2) ingesting Reddit posts with comments, (3) fetching YouTube transcripts, (4) clipping any article or blog.

The DeepReader skill is a web content ingestion tool designed for OpenClaw agents. It extracts URLs from user messages, uses specialized parsers (FxTwitter, Nitter, YouTube, Reddit, Generic) to fetch content, and saves it as clean Markdown files with YAML frontmatter to a designated local memory path (`../../memory/inbox/`). The package utilizes standard and well-known libraries for web scraping, content extraction, and data processing (e.g., `requests`, `trafilatura`, `beautifulsoup4`, `youtube-transcript-api`).

While web scraping inherently involves making network requests to user-provided URLs, which could pose risks like SSRF or DoS, the implementation includes safeguards such as URL validation, filename sanitization, request timeouts, and specific User-Agent headers. There is no evidence of malicious activity, data exfiltration to unauthorized destinations, or arbitrary code execution. File writing operations are confined to an expected local memory directory, and filename generation includes sanitization to prevent path traversal. The code logic is transparent and aligns with the stated purpose of the skill.
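Filename sanitization of the kind the analysis describes can be illustrated as follows. This is a sketch only: `safe_filename` and `resolve_in_inbox` are hypothetical names, and the skill's real sanitization rules are not shown in this listing.

```python
import re
from pathlib import Path


def safe_filename(title: str, max_len: int = 80) -> str:
    """Reduce a title to lowercase letters, digits, and hyphens."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return (slug[:max_len].rstrip("-") or "untitled") + ".md"


def resolve_in_inbox(inbox: Path, title: str) -> Path:
    """Build the target path and confirm it stays inside the inbox."""
    inbox = inbox.resolve()
    target = (inbox / safe_filename(title)).resolve()
    if inbox not in target.parents:
        raise ValueError(f"refusing to write outside {inbox}")
    return target
```

Because path separators and dots never survive the slug step, the containment check acts as a second line of defence rather than the primary one.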

Capability assessment

ℹ Purpose & Capability

The code and manifest match the described purpose: parsers for X/Twitter (FxTwitter + Nitter), Reddit (.json), YouTube transcripts, and generic webpages using trafilatura/BeautifulSoup. However, SKILL.md and other text contain typos/inconsistent names (e.g., "DEEPREEDER" / "DeepReeder") and the repo includes Python modules despite an earlier statement that the skill is instruction-only. The presence of a requirements.txt but no install spec is an implementation mismatch.

⚠ Instruction Scope

The skill triggers on any message containing 'http(s)://' and will attempt to fetch every detected URL (GenericParser will fetch arbitrary domains). There is no domain allowlist, no internal-host blocking, and no explicit SSRF protections. It writes the fetched content into agent memory. This broad, automatic URL-fetching behavior is the primary security concern (SSRF/data exposure, untrusted fetches).
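A domain allowlist of the kind noted as missing could look like the sketch below. The `ALLOWED_DOMAINS` set and `is_allowed` function are illustrative assumptions; an operator would choose their own list.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist covering only the specialised sources.
ALLOWED_DOMAINS = {"x.com", "twitter.com", "reddit.com", "youtube.com", "youtu.be"}


def is_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # Compare whole labels so "evilreddit.com" does not match "reddit.com".
    return any(host == domain or host.endswith("." + domain) for domain in allowed)
```

Applying the check before any parser runs means the generic fallback never sees hosts outside the list.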

ℹ Install Mechanism

There is no install spec (instruction-only in metadata), yet the package contains Python code that imports external libraries (trafilatura, bs4, requests, youtube_transcript_api). Without an install step the runtime may lack required dependencies, causing failures; the lack of an installation mechanism is an operational inconsistency but not itself malicious.
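Since no install step is declared, a pre-flight check for the imported modules can surface missing dependencies before the skill is invoked. The sketch below uses the module names listed in this assessment; `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Top-level module names taken from the assessment above.
REQUIRED_MODULES = ["trafilatura", "bs4", "requests", "youtube_transcript_api"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the modules that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]
```

An empty result means every module is importable; otherwise the returned names indicate what still needs to be installed from a trusted source.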

ℹ Credentials

The skill does not request credentials or secrets (requires.env empty), which is appropriate. SKILL.md documents two environment variables (DEEPREEDER_MEMORY_PATH, DEEPREEDER_LOG_LEVEL) but the code does not read these explicitly and the variable name is misspelled relative to the skill name — an inconsistent configuration story that could confuse administrators.

⚠ Persistence & Privilege

The skill saves fetched content to a memory directory (default ../../memory/inbox/). It is not forced-always, but it is user-invocable and the manifest declares a message trigger that causes automatic invocation when messages contain URLs. Autonomous invocation combined with unrestricted fetching and writing to agent memory increases blast radius (SSRF, local data accumulation).

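How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install deepreader-skill
Once installed, call the Skill by name or trigger it with /deepreader-skill
Provide the required inputs according to the Skill's parameter description to get structured output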

Version history

v0.1.0

Initial release of DeepReader, the default web content reader for OpenClaw agents:

- Automatically detects and reads X (Twitter), Reddit, YouTube, and general web URLs in messages.
- Fetches content using specialized parsers for each source, requiring no API keys.
- Outputs clean Markdown with YAML frontmatter, ready for agent memory ingestion.
- Supports batch-reading of multiple URLs in a single message.
- Includes configurable options for memory path and logging level.

Metadata

Slug deepreader-skill

Version 0.1.0

License: not listed

Total installs 6

Current installs 6

Versions released 1

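FAQ

What is DeepReader?

The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown, with zero API keys required. Use when you n... It is an AI agent Skill plugin for Claude Code / OpenClaw, with 823 total downloads to date.

How do I install DeepReader?

Run the command "/install deepreader-skill" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is DeepReader free?

Yes. DeepReader is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does DeepReader support?

DeepReader is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops DeepReader?

It is developed and maintained by Tony Li (@astonysh). The current version is v0.1.0.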
