donatasdecodo

Decodo Web Scraper

Author: DonatasDecodo · GitHub · v1.1.0
cross-platform ✓ Security check passed
Total downloads: 692
Favorites: 1
Current installs: 1
Versions: 1
Install in OpenClaw
/install decodo-scraper-skill
Description
Search Google, scrape web pages, Amazon product pages, YouTube subtitles, or Reddit (post/subreddit) using the Decodo Scraper OpenClaw Skill.
Usage (SKILL.md)

Decodo Scraper OpenClaw Skill

Use this skill to search Google, scrape any URL, parse Amazon product and search pages, fetch YouTube subtitles, or read Reddit posts and subreddits via the Decodo Web Scraping API. Output depends on the target: Search returns a JSON object of result sections; Scrape URL returns plain Markdown; Amazon product and Amazon search return parsed product-page or search results as JSON (Amazon search takes --query); YouTube subtitles returns the transcript/subtitles; Reddit post and Reddit subreddit return post or listing content as JSON.

Authentication: Set DECODO_AUTH_TOKEN (Basic auth token from Decodo Dashboard → Scraping APIs) in your environment or in a .env file in the repo root.
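
The lookup order described above (environment first, then a .env file) can be sketched as follows. This is a minimal standard-library illustration that assumes a plain KEY=VALUE .env format; the helper name load_token is hypothetical, and the real tools/scrape.py may load the file differently (its requirements list python-dotenv).

```python
import os
from pathlib import Path


def load_token(env_path=".env", name="DECODO_AUTH_TOKEN"):
    """Return the token from the environment, falling back to a .env file.

    Hypothetical helper for illustration; not taken from tools/scrape.py.
    """
    token = os.environ.get(name)
    if token:
        return token

    path = Path(env_path)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Split on the first "=" only, so "=" padding in the value survives.
            key, _, value = line.partition("=")
            if key.strip() == name:
                return value.strip().strip("'\"")

    raise RuntimeError(f"{name} is not set in the environment or in {env_path}")
```

Failing early with a clear message keeps a missing token from surfacing later as an opaque authentication error.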

Errors: On failure the script writes a JSON error to stderr and exits with code 1.
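
A caller can rely on that contract (non-zero exit code, JSON on stderr) when wrapping the script. The sketch below is one possible wrapper, not part of the skill: run_cli and ScrapeError are hypothetical names, and because the page does not document the fields of the error object, the wrapper returns whatever JSON it finds without interpreting it.

```python
import json
import subprocess


class ScrapeError(RuntimeError):
    """Raised when the command exits non-zero; .detail holds the parsed error."""

    def __init__(self, returncode, detail):
        super().__init__(f"command failed with exit code {returncode}: {detail}")
        self.returncode = returncode
        self.detail = detail


def run_cli(argv):
    """Run a command and return its stdout, raising ScrapeError on failure."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        try:
            detail = json.loads(proc.stderr)
        except json.JSONDecodeError:
            # Fall back to the raw text if stderr is not valid JSON.
            detail = proc.stderr.strip()
        raise ScrapeError(proc.returncode, detail)
    return proc.stdout
```

For example, run_cli(["python3", "tools/scrape.py", "--target", "universal", "--url", "https://example.com"]) would return the page Markdown on success and raise ScrapeError otherwise.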


Tools

1. Search Google

Use this to find URLs, answers, or structured search results. The API returns a JSON object whose results key contains several sections (not all may be present for every query):

| Section | Description |
| --- | --- |
| organic | Main search results (titles, links, snippets). |
| ai_overviews | AI-generated overviews or summaries when Google shows them. |
| paid | Paid/sponsored results (ads). |
| related_questions | “People also ask”-style questions and answers. |
| related_searches | Suggested related search queries. |
| discussions_and_forums | Forum or discussion results (e.g. Reddit, Stack Exchange). |

The script outputs only the inner results object (these sections); pagination info (page, last_visible_page, parse_status_code) is not included.
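
Since not every section appears for every query, a consumer should check which ones are present before reading them. The sketch below does that using only the section names listed above; the helper present_sections and the sample data are made up for illustration, and the fields inside each item are not documented on this page.

```python
import json

# Section names taken from the list of result sections above.
KNOWN_SECTIONS = (
    "organic",
    "ai_overviews",
    "paid",
    "related_questions",
    "related_searches",
    "discussions_and_forums",
)


def present_sections(results):
    """Return the known section names that are present and non-empty."""
    return [name for name in KNOWN_SECTIONS if results.get(name)]


# Made-up sample standing in for the script's stdout; item fields are illustrative.
sample_stdout = json.dumps({
    "organic": [{"title": "Example", "url": "https://example.com"}],
    "related_searches": [],
})
print(present_sections(json.loads(sample_stdout)))  # prints ['organic']
```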

Command:

python3 tools/scrape.py --target google_search --query "your search query"

Examples:

python3 tools/scrape.py --target google_search --query "best laptops 2025"
python3 tools/scrape.py --target google_search --query "python requests tutorial"

Optional: --geo us or --locale en for location/language.


2. Scrape URL

Use this to get the content of a specific web page. By default the API returns content as Markdown (cleaner for LLMs and lower token usage).

Command:

python3 tools/scrape.py --target universal --url "https://example.com"

Examples:

python3 tools/scrape.py --target universal --url "https://example.com"
python3 tools/scrape.py --target universal --url "https://news.ycombinator.com/"

3. Amazon product page

Use this to get parsed data from an Amazon product (or other Amazon) page. Pass the product page URL as --url. The script sends parse: true and outputs the inner results object (e.g. ads, product details, etc.).

Command:

python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/PRODUCT_ID"

Examples:

python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/B09H74FXNW"
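
As a rough picture of what "sends parse: true" could mean for the request body, here is a sketch. The function build_payload is hypothetical, and the field names target, url, and query are assumed to mirror the CLI flags; only the parse flag for the amazon target is stated on this page, so treat the rest as an assumption rather than the script's actual payload.

```python
def build_payload(target, url=None, query=None):
    """Build a request body for one scrape call (assumed field names)."""
    if (url is None) == (query is None):
        raise ValueError("pass exactly one of url or query")

    payload = {"target": target}
    if url is not None:
        payload["url"] = url
    else:
        payload["query"] = query

    # The page states that the script sends parse: true for Amazon pages.
    if target == "amazon":
        payload["parse"] = True
    return payload


print(build_payload("amazon", url="https://www.amazon.com/dp/B09H74FXNW"))
```

Building the body as a dictionary and letting the HTTP client serialize it keeps user input out of any shell command line.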

4. Amazon search

Use this to search Amazon and get parsed results (search results list, delivery_postcode, etc.). Pass the search query as --query.

Command:

python3 tools/scrape.py --target amazon_search --query "your search query"

Examples:

python3 tools/scrape.py --target amazon_search --query "laptop"

5. YouTube subtitles

Use this to get subtitles/transcript for a YouTube video. Pass the video ID (e.g. from youtube.com/watch?v=VIDEO_ID) as --query.

Command:

python3 tools/scrape.py --target youtube_subtitles --query "VIDEO_ID"

Examples:

python3 tools/scrape.py --target youtube_subtitles --query "dFu9aKJoqGg"
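
If you only have a full watch URL, the video ID is the value of its v query parameter. A small sketch using the standard library follows; extract_video_id is a hypothetical helper, and it handles only youtube.com/watch?v=... URLs, not short links or embed URLs.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the v= query parameter from a youtube.com/watch URL."""
    parsed = urlparse(url)
    values = parse_qs(parsed.query).get("v")
    if not values:
        raise ValueError(f"no v= parameter found in {url!r}")
    return values[0]


print(extract_video_id("https://www.youtube.com/watch?v=dFu9aKJoqGg"))  # prints dFu9aKJoqGg
```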

6. Reddit post

Use this to get the content of a Reddit post (thread). Pass the full post URL as --url.

Command:

python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/SUBREDDIT/comments/ID/..."

Examples:

python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/nba/comments/17jrqc5/serious_next_day_thread_postgame_discussion/"

7. Reddit subreddit

Use this to get the listing (posts) of a Reddit subreddit. Pass the subreddit URL as --url.

Command:

python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/SUBREDDIT/"

Examples:

python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/nba/"

Summary

| Action | Target | Argument | Example command |
| --- | --- | --- | --- |
| Search | google_search | --query | python3 tools/scrape.py --target google_search --query "laptop" |
| Scrape page | universal | --url | python3 tools/scrape.py --target universal --url "https://example.com" |
| Amazon product | amazon | --url | python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/B09H74FXNW" |
| Amazon search | amazon_search | --query | python3 tools/scrape.py --target amazon_search --query "laptop" |
| YouTube subtitles | youtube_subtitles | --query | python3 tools/scrape.py --target youtube_subtitles --query "dFu9aKJoqGg" |
| Reddit post | reddit_post | --url | python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/nba/comments/17jrqc5/..." |
| Reddit subreddit | reddit_subreddit | --url | python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/nba/" |

Output: Search → JSON (sections). Scrape URL → markdown. Amazon / Amazon search → JSON (results e.g. ads, product info, delivery_postcode). YouTube → transcript. Reddit → JSON (content).
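
The target-to-argument mapping in the summary can be captured in a small lookup so that a caller never pairs a target with the wrong flag. This is an illustrative sketch; build_command is a hypothetical helper and not part of the skill.

```python
# Target -> argument flag, copied from the summary above.
TARGET_FLAGS = {
    "google_search": "--query",
    "universal": "--url",
    "amazon": "--url",
    "amazon_search": "--query",
    "youtube_subtitles": "--query",
    "reddit_post": "--url",
    "reddit_subreddit": "--url",
}


def build_command(target, value):
    """Return the argv list for one scrape.py invocation."""
    try:
        flag = TARGET_FLAGS[target]
    except KeyError:
        raise ValueError(f"unknown target: {target!r}") from None
    return ["python3", "tools/scrape.py", "--target", target, flag, value]


print(build_command("amazon_search", "laptop"))
```

Returning an argument list rather than a single string means the value can be passed to subprocess.run without shell quoting.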

Security recommendations
This skill appears to do what it says: it calls Decodo's scraping API and returns results. Before installing, confirm the following:
1. DECODO_AUTH_TOKEN is required (SKILL.md and tools/scrape.py use it) even though the registry metadata omits it. Only provide a token from Decodo's dashboard and store it securely (e.g., local .env, secret manager).
2. The tool makes outbound requests to https://scraper-api.decodo.com. Ensure your environment/network policy allows that and that you trust Decodo.
3. Scraping content may have legal/ToS implications for the sites you target (Amazon, Google, Reddit, YouTube). Ensure you have the right to scrape and use the scraped data.
4. If you need stronger assurance, verify the repo origin (the README points to a GitHub repo) and check that the hosted homepage and dashboard domain match your expectations before providing credentials.
Functional analysis
Type: OpenClaw Skill
Name: decodo-scraper-skill
Version: 1.1.0

The OpenClaw skill 'decodo-scraper' is designed to interact with the Decodo Web Scraping API. It uses a `DECODO_AUTH_TOKEN` to authenticate with `scraper-api.decodo.com` and sends user-provided queries/URLs to this service. The `SKILL.md` file provides clear instructions without any prompt injection attempts. The `tools/scrape.py` script uses `argparse` for input and constructs JSON payloads for `requests.post`, preventing shell injection. There is no evidence of unauthorized data exfiltration, malicious execution, persistence mechanisms, or obfuscation. The behavior is fully aligned with its stated purpose.
Capability assessment
Purpose & Capability
Name/description, README, SKILL.md and the included tools/scrape.py are all aligned: the skill calls Decodo's scraping API for Google, universal URLs, Amazon, YouTube, and Reddit. No unrelated capabilities or credentials are requested by the code.
Instruction Scope
SKILL.md and the CLI script focus only on building requests to Decodo's scraper API. The runtime instructions ask the agent to set DECODO_AUTH_TOKEN or put it in a .env file; the script reads only that token and does not direct the agent to read other system files or exfiltrate additional environment variables.
Install Mechanism
This is an instruction-only skill with a small Python helper and a requirements.txt (requests, python-dotenv). There is no download-from-arbitrary-URL or install script; risk from install mechanism is low.
Credentials
The skill legitimately requires a single DECODO_AUTH_TOKEN (Basic auth) to call Decodo's API, which is proportionate. However, the registry metadata at the top claims 'Required env vars: none' while SKILL.md and the code require DECODO_AUTH_TOKEN; this metadata mismatch should be fixed. Treat the token as sensitive and only provide one obtained from Decodo's dashboard.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system configs, and does not attempt privileged or persistent system changes. It uses the agent process to make outbound HTTPS calls to scraper-api.decodo.com.
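How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install decodo-scraper-skill
  3. Once installation completes, invoke the skill by name or trigger it with /decodo-scraper-skill
  4. Provide the inputs required by the skill's parameter documentation to get the output.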
Version history
v1.1.0
- Added detailed documentation for all supported scraping targets: Google Search, web pages, Amazon product and search pages, YouTube subtitles, Reddit posts, and Reddit subreddits.
- Clarified required environment variable (`DECODO_AUTH_TOKEN`) and setup instructions for authentication.
- Provided example usage commands and described output formats for each tool.
- Summarized actions and arguments in a new summary table for easier reference.
- Improved error handling documentation, specifying JSON output on failure.
Metadata
Slug decodo-scraper-skill
Version 1.1.0
License
Total installs 1
Current installs 1
Versions 1
FAQ

What is Decodo Web Scraper?

Search Google, scrape web pages, Amazon product pages, YouTube subtitles, or Reddit (post/subreddit) using the Decodo Scraper OpenClaw Skill. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 692 total downloads to date.

How do I install Decodo Web Scraper?

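Run the command "/install decodo-scraper-skill" in the OpenClaw or Claude Code chat box to install it in one step. Before first use, set DECODO_AUTH_TOKEN in your environment or in a .env file, as described under Authentication in SKILL.md.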

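Is Decodo Web Scraper free?

Yes. Decodo Web Scraper is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Decodo Web Scraper support?

Decodo Web Scraper is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Decodo Web Scraper?

It is developed and maintained by DonatasDecodo (@donatasdecodo); the current version is v1.1.0.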
