Novel Scraper
by yuzhihui886 · GitHub · v1.6.0 · MIT-0
608 Downloads · 3 Stars · 2 Active Installs · 9 Versions
Install in OpenClaw
/install novel-scraper
Description
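An intelligent novel-scraping tool that supports automatic page turning, completion of paginated chapters, and automatic chapter-number parsing. It uses curl and BeautifulSoup to scrape novel sites such as Biquge (笔趣阁) and outputs formatted TXT files. By default, every 10 chapters are merged into one document so that output files are not scattered. It automatically detects paginated chapters and fetches the remaining pages, and it skips non-story content (author's notes, giveaway announcements, etc.). Use when: scraping web novel chapters, batch...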
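The default merging behavior described above can be illustrated with a minimal sketch. The function names `group_chapters` and `merged_filename`, and the file-naming pattern, are hypothetical and not taken from the skill's scripts; only the group size of 10 comes from the description.

```python
from typing import List, Sequence


def group_chapters(chapters: Sequence[str], size: int = 10) -> List[List[str]]:
    """Split an ordered chapter list into consecutive groups of `size`.

    The last group may be shorter when the total is not a multiple of `size`.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(chapters[i:i + size]) for i in range(0, len(chapters), size)]


def merged_filename(title: str, first: int, last: int) -> str:
    """Build a name for one merged TXT file (hypothetical naming scheme)."""
    return f"{title}_{first:04d}-{last:04d}.txt"


if __name__ == "__main__":
    chapters = [f"chapter {n}" for n in range(1, 24)]
    for index, group in enumerate(group_chapters(chapters)):
        first = index * 10 + 1
        last = first + len(group) - 1
        print(merged_filename("demo", first, last), len(group))
```

With 23 chapters this yields two full groups and one group of 3, so a partial final file is expected rather than an error.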
Usage Guidance
This skill appears to do what it claims: fetch HTML with curl, parse with BeautifulSoup, cache to /tmp, and save TXT files under ~/.openclaw/workspace/novels. Before installing: (1) be aware it will make outbound HTTP(S) requests to whatever URLs you pass (and can fetch internal endpoints if you instruct it to) — run it in a network-restricted or sandboxed environment if you are concerned about SSRF or unintended crawling; (2) it writes state and cache files under your home directory and /tmp — back up or review those files if needed; (3) scraping copyrighted content may violate terms of service or law in your jurisdiction — ensure you have permission to scrape sites; (4) the code uses subprocess.run to call curl and writes files, so review or run it in an isolated environment if you want to be extra cautious. If you do not want the agent to call this skill autonomously, disable autonomous invocation in your agent settings.
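For readers concerned about point (1), one possible pre-flight check is sketched below. It is not part of the skill; `is_internal_target` is a hypothetical helper that only recognizes IP literals and `localhost`, so a DNS name that resolves to an internal address would still pass. Sandboxing or network restrictions remain the more reliable control.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_target(url: str) -> bool:
    """Return True when the URL obviously points at an internal endpoint.

    Assumption: only IP literals and localhost names are checked here.
    A complete check would also resolve DNS names before deciding.
    """
    host = urlsplit(url).hostname
    if host is None:
        return True  # no host could be parsed: treat as unsafe
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name, not an IP literal
    return addr.is_private or addr.is_loopback or addr.is_link_local


if __name__ == "__main__":
    for candidate in ("http://127.0.0.1:8080/admin", "https://example.com/book/1"):
        print(candidate, is_internal_target(candidate))
```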
Capability Analysis
Type: OpenClaw Skill
Name: novel-scraper
Version: 1.6.0
The novel-scraper skill bundle is a legitimate tool for fetching and formatting web novels from sites like bqquge.com. The scripts (scraper.py, scraper_v5.py, fetch_catalog.py) use standard tools like curl and BeautifulSoup to extract content, and include features for memory monitoring, caching, and chapter merging. The code demonstrates security awareness by implementing checks against shell injection in URLs and avoids dangerous practices like shell=True in subprocess calls. No evidence of data exfiltration, persistence, or prompt injection was found.
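The pattern described above (validating the URL, then calling curl without `shell=True`) might look roughly like the following sketch. The names `validate_url`, `build_curl_command`, and `fetch_html`, the forbidden-character set, and the chosen curl options are assumptions for illustration; the skill's actual checks are not shown in this listing.

```python
import subprocess
from urllib.parse import urlsplit

# Assumed, deliberately strict set of characters rejected in URLs.
FORBIDDEN_CHARS = set("`$|<>\\\"' \t\r\n")


def validate_url(url: str) -> str:
    """Reject URLs that are not plain http(s) or that contain suspicious characters."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    if any(ch in FORBIDDEN_CHARS for ch in url):
        raise ValueError("URL contains a forbidden character")
    return url


def build_curl_command(url: str, timeout: int = 20) -> list:
    """Build an argument list; no shell ever parses the URL."""
    return ["curl", "--silent", "--show-error", "--location",
            "--max-time", str(timeout), validate_url(url)]


def fetch_html(url: str) -> str:
    """Run curl as a child process (shell=False is the default for a list)."""
    result = subprocess.run(build_curl_command(url), capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

Passing the arguments as a list means the URL reaches curl as a single argument, so shell metacharacters are never interpreted even if validation misses one.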
Capability Assessment
Purpose & Capability
Name/description (novel scraping, pagination, merging) align with the included scripts and configs: scripts perform HTML fetches, parse chapter numbers, handle pagination, cache to /tmp and ~/.openclaw, and save TXT files. Hardcoded site config and catalog logic for bqquge.com match the stated purpose.
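As an illustration of what chapter-number parsing can involve, the sketch below extracts a number from titles such as "第12章" or "第一百零五章". The regular expression, the numeral tables, and the function names are assumptions, not the skill's code, and the converter only handles numerals below ten thousand.

```python
import re
from typing import Optional

_DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
           "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
_UNITS = {"十": 10, "百": 100, "千": 1000}

# Matches "第<number>章" (also 节 or 回), with Arabic or Chinese numerals.
_CHAPTER_RE = re.compile(r"第\s*([0-9]+|[零〇一二两三四五六七八九十百千]+)\s*[章节回]")


def chinese_to_int(text: str) -> int:
    """Convert a Chinese numeral below 10000 (e.g. 一百零五) to an int."""
    total = 0
    current = 0
    for ch in text:
        if ch in _DIGITS:
            current = _DIGITS[ch]
        elif ch in _UNITS:
            # A bare unit such as the leading 十 in 十二 means 1 * 10.
            total += (current or 1) * _UNITS[ch]
            current = 0
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return total + current


def parse_chapter_number(title: str) -> Optional[int]:
    """Return the chapter number found in a title, or None if there is none."""
    match = _CHAPTER_RE.search(title)
    if match is None:
        return None
    token = match.group(1)
    return int(token) if token.isascii() else chinese_to_int(token)
```

Titles without a chapter marker, such as an author's note, return `None`, which a caller could use as one signal for skipping non-story pages.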
Instruction Scope
SKILL.md directs running the provided Python scripts in the skill workspace and saving outputs to ~/.openclaw/workspace/novels. The runtime instructions and scripts only reference local files, site HTML, and curl calls to target URLs; they do not attempt to read unrelated system credentials, other skills' config, or exfiltrate data to third-party endpoints.
Install Mechanism
No install spec is present (instruction-only skill). The package includes Python scripts and a small requirements.txt (beautifulsoup4). There are no downloads from untrusted URLs or archive extraction steps in the manifest.
Credentials
The skill requires no environment variables or external credentials. It writes state/cache under ~/.openclaw and /tmp and creates log files; these file operations are proportional to a scraper. No SECRET/TOKEN/PASSWORD env vars are requested.
Persistence & Privilege
always is false and the skill does not modify other skills or global agent configuration. It persists its own state under its skill directory and /tmp, which is expected for resumable scraping.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install novel-scraper
- After installation, invoke the skill by name or use /novel-scraper
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.6.0
Code quality fixes: removed unused imports and variables, fixed f-string issues, and now passes all ruff check checks
v1.5.0
Fixed an index-slicing bug, added the --chapters parameter, and cleaned up unused scripts
v1.2.3
novel-scraper 1.2.3
- No file changes detected in this version.
- Documentation, features, and setup remain unchanged.
v1.2.2
- Removed the documentation file references/learnings.md.
- No user-facing functional changes.
- Documentation and usage remain consistent with the previous version.
v1.2.1
novel-scraper 1.2.1
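- Streamlined and updated the documentation; removed script line-count descriptions to keep it concise.
- Deleted state/progress.json to simplify state management.
- Core features and command options are unchanged; documentation readability is improved.
- No new dependencies in requirements.txt; the dependency notes remain consistent.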
v1.2.0
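v1.2.0: Claude Code Debug fixes (5 issues) - fixed a duplicate blacklist definition / added a memory-release method / fixed a variable being overwritten / added URL safety validation / optimized the frontmatter - full test verification (110 chapters scraped successfully)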
v1.1.0
novel-scraper v1.1.0
- Added README.md and references/learnings.md for improved documentation and reference tracking.
- Enhanced scripts: `scraper.py` and `merge_cache.py` expanded with more functionality; new `package_skill.py` script added for packaging.
- Updated dependency management and instructions in SKILL.md and requirements.txt.
- Removed obsolete state/progress.json to streamline state handling.
- Refined site configuration options in configs/sites.json.
v1.0.1
novel-scraper 1.0.1
- No code or documentation changes detected in this version.
- All features, usage, and documentation remain unchanged from the previous version.
v1.0.0
novel-scraper 1.0.0 – Initial Release
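- First release of a lightweight novel-scraping tool with automatic page turning, session reuse, memory monitoring, and more
- Scrapes chapters from sites such as Biquge (笔趣阁), downloads them in batches, and exports formatted TXT files
- Optimized for low-memory servers, with resumable downloads and a caching system
- Supports custom site selectors, making it easy to configure additional novel sites
- Provides detailed command-line parameters and usage examples for a quick start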
Frequently Asked Questions
What is Novel Scraper?
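An intelligent novel-scraping tool that supports automatic page turning, completion of paginated chapters, and automatic chapter-number parsing. It uses curl and BeautifulSoup to scrape novel sites such as Biquge (笔趣阁) and outputs formatted TXT files. By default, every 10 chapters are merged into one document so that output files are not scattered. It automatically detects paginated chapters and fetches the remaining pages, and it skips non-story content (author's notes, giveaway announcements, etc.). Use when: scraping web novel chapters, batch... It is an AI Agent Skill for Claude Code / OpenClaw, with 608 downloads so far.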
How do I install Novel Scraper?
Run "/install novel-scraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Novel Scraper free?
Yes, Novel Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Novel Scraper support?
Novel Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Novel Scraper?
It is built and maintained by yuzhihui886 (@yuzhihui886); the current version is v1.6.0.