
Decodo Web Scraper

Author: DonatasDecodo · GitHub · v1.1.0

cross-platform ✓ Security check passed

Total downloads: 692

Install in OpenClaw

/install decodo-scraper-skill

Description

Search Google, scrape web pages, Amazon product pages, YouTube subtitles, or Reddit (post/subreddit) using the Decodo Scraper OpenClaw Skill.

Usage instructions (SKILL.md)

Decodo Scraper OpenClaw Skill

Use this skill to search Google, scrape any URL, fetch Amazon product and search data, get YouTube subtitles, or read Reddit posts and subreddit listings via the Decodo Web Scraping API. Output by tool: Search returns a JSON object of result sections; Scrape URL returns plain Markdown; Amazon product and Amazon search return parsed product-page or search results (JSON); YouTube subtitles returns the transcript; Reddit post and Reddit subreddit return post or listing content (JSON). Amazon search takes its search terms via --query.

Authentication: Set DECODO_AUTH_TOKEN (Basic auth token from Decodo Dashboard → Scraping APIs) in your environment or in a .env file in the repo root.
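The lookup order described above (environment first, then a .env file) can be sketched with the standard library alone. The helper name `load_decodo_token` and the simple `KEY=VALUE` parsing are assumptions made for illustration; the skill's own script may resolve the token differently.

```python
import os
from pathlib import Path
from typing import Optional


def load_decodo_token(env_file: str = ".env") -> Optional[str]:
    """Return DECODO_AUTH_TOKEN from the environment, else from a .env file.

    Hypothetical helper: the real script may load the token differently.
    """
    token = os.environ.get("DECODO_AUTH_TOKEN")
    if token:
        return token

    path = Path(env_file)
    if not path.is_file():
        return None

    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so Base64 padding in the value survives.
        key, _, value = line.partition("=")
        if key.strip() == "DECODO_AUTH_TOKEN":
            # Drop optional surrounding quotes.
            return value.strip().strip("'\"") or None
    return None
```

Checking for the token before running a tool lets a caller report a missing credential up front instead of waiting for the API call to fail.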

Errors: On failure the script writes a JSON error to stderr and exits with code 1.
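A caller can rely on that contract (non-zero exit code, JSON on stderr) to tell success from failure. The wrapper below is a minimal sketch; `run_tool` is a hypothetical name, and since the exact shape of the error object is not documented here, the code only parses it and makes no assumptions about its keys.

```python
import json
import subprocess
from typing import List, Tuple, Union


def run_tool(argv: List[str]) -> Tuple[bool, Union[str, dict]]:
    """Run a command; return (True, stdout) on success or (False, error)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode == 0:
        return True, proc.stdout
    try:
        error = json.loads(proc.stderr)
    except json.JSONDecodeError:
        # Fall back to the raw text if stderr is not valid JSON.
        error = {"error": proc.stderr.strip()}
    return False, error


# Example call (requires the skill's script and a valid token):
# ok, result = run_tool(["python3", "tools/scrape.py",
#                        "--target", "universal", "--url", "https://example.com"])
```

Passing the arguments as a list rather than a single shell string keeps queries and URLs from being interpreted by a shell.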

Tools

1. Search Google

Use this to find URLs, answers, or structured search results. The API returns a JSON object whose results key contains several sections (not all may be present for every query):

Section	Description
`organic`	Main search results (titles, links, snippets).
`ai_overviews`	AI-generated overviews or summaries when Google shows them.
`paid`	Paid/sponsored results (ads).
`related_questions`	“People also ask”–style questions and answers.
`related_searches`	Suggested related search queries.
`discussions_and_forums`	Forum or discussion results (e.g. Reddit, Stack Exchange).

The script outputs only the inner results object (these sections); pagination info (page, last_visible_page, parse_status_code) is not included.

Command:

python3 tools/scrape.py --target google_search --query "your search query"

Examples:

python3 tools/scrape.py --target google_search --query "best laptops 2025"
python3 tools/scrape.py --target google_search --query "python requests tutorial"

Optional: --geo us or --locale en for location/language.
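Because not every section is present for every query, it can help to check what came back before reading from it. The sketch below takes the script's JSON output as a string and reports which of the sections listed above are present. `summarize_sections` is a hypothetical helper, and the assumption that sections are usually lists of entries is mine; the structure of individual entries is not assumed.

```python
import json
from typing import Dict

# Section names as listed in the table above.
KNOWN_SECTIONS = [
    "organic",
    "ai_overviews",
    "paid",
    "related_questions",
    "related_searches",
    "discussions_and_forums",
]


def summarize_sections(raw_output: str) -> Dict[str, int]:
    """Map each known section present in the output to its entry count."""
    results = json.loads(raw_output)
    summary: Dict[str, int] = {}
    for name in KNOWN_SECTIONS:
        section = results.get(name)
        if section is None:
            continue
        # Assumption: sections are lists; anything else counts as one entry.
        summary[name] = len(section) if isinstance(section, list) else 1
    return summary
```

A caller might use the summary to fall back from `ai_overviews` to `organic` when the former is missing.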

2. Scrape URL

Use this to get the content of a specific web page. By default the API returns content as Markdown (cleaner for LLMs and lower token usage).

Command:

python3 tools/scrape.py --target universal --url "https://example.com"

Examples:

python3 tools/scrape.py --target universal --url "https://example.com"
python3 tools/scrape.py --target universal --url "https://news.ycombinator.com/"

3. Amazon product page

Use this to get parsed data from an Amazon product page (or another Amazon page). Pass the page URL as --url. The script sends parse: true and outputs the inner results object (e.g. ads and product details).

Command:

python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/PRODUCT_ID"

Examples:

python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/B09H74FXNW"

4. Amazon search

Use this to search Amazon and get parsed results (search results list, delivery_postcode, etc.). Pass the search query as --query.

Command:

python3 tools/scrape.py --target amazon_search --query "your search query"

Examples:

python3 tools/scrape.py --target amazon_search --query "laptop"

5. YouTube subtitles

Use this to get subtitles/transcript for a YouTube video. Pass the video ID (e.g. from youtube.com/watch?v=VIDEO_ID) as --query.

Command:

python3 tools/scrape.py --target youtube_subtitles --query "VIDEO_ID"

Examples:

python3 tools/scrape.py --target youtube_subtitles --query "dFu9aKJoqGg"
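Since the tool expects a bare video ID rather than a URL, a small helper can pull the ID out of a link first. This is a sketch using only the standard library; `extract_video_id` is a hypothetical name, and it handles only the `watch?v=` form mentioned above plus `youtu.be` short links.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Standard form: https://www.youtube.com/watch?v=VIDEO_ID
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        return None

    # Short form: https://youtu.be/VIDEO_ID
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    return None
```

The returned ID can then be passed as the `--query` value for the `youtube_subtitles` target.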

6. Reddit post

Use this to get the content of a Reddit post (thread). Pass the full post URL as --url.

Command:

python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/SUBREDDIT/comments/ID/..."

Examples:

python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/nba/comments/17jrqc5/serious_next_day_thread_postgame_discussion/"

7. Reddit subreddit

Use this to get the listing (posts) of a Reddit subreddit. Pass the subreddit URL as --url.

Command:

python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/SUBREDDIT/"

Examples:

python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/nba/"

Summary

Action	Target	Argument	Example command
Search	`google_search`	`--query`	`python3 tools/scrape.py --target google_search --query "laptop"`
Scrape page	`universal`	`--url`	`python3 tools/scrape.py --target universal --url "https://example.com"`
Amazon product	`amazon`	`--url`	`python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/B09H74FXNW"`
Amazon search	`amazon_search`	`--query`	`python3 tools/scrape.py --target amazon_search --query "laptop"`
YouTube subtitles	`youtube_subtitles`	`--query`	`python3 tools/scrape.py --target youtube_subtitles --query "dFu9aKJoqGg"`
Reddit post	`reddit_post`	`--url`	`python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/nba/comments/17jrqc5/..."`
Reddit subreddit	`reddit_subreddit`	`--url`	`python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/nba/"`

Output: Search → JSON (sections). Scrape URL → markdown. Amazon / Amazon search → JSON (results e.g. ads, product info, delivery_postcode). YouTube → transcript. Reddit → JSON (content).
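The summary table maps each target to exactly one argument flag, which makes it easy to build the command line programmatically. The sketch below encodes that mapping; `TARGET_FLAGS` and `build_command` are hypothetical names, while the targets, flags, and script path are taken from the table.

```python
from typing import Dict, List

# Target -> argument flag, copied from the summary table.
TARGET_FLAGS: Dict[str, str] = {
    "google_search": "--query",
    "universal": "--url",
    "amazon": "--url",
    "amazon_search": "--query",
    "youtube_subtitles": "--query",
    "reddit_post": "--url",
    "reddit_subreddit": "--url",
}


def build_command(target: str, value: str) -> List[str]:
    """Return the argv list for one scrape.py invocation."""
    if target not in TARGET_FLAGS:
        raise ValueError(f"Unknown target: {target!r}")
    return [
        "python3",
        "tools/scrape.py",
        "--target",
        target,
        TARGET_FLAGS[target],
        value,
    ]
```

The result is a list suitable for `subprocess.run`, so the value needs no shell quoting.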

Security recommendations

This skill appears to do what it says: it calls Decodo's scraping API and returns results. Before installing, confirm the following:
1) DECODO_AUTH_TOKEN is required (SKILL.md and tools/scrape.py both use it), even though the registry metadata omits it. Only provide a token from Decodo's dashboard and store it securely (e.g., a local .env file or a secret manager).
2) The tool makes outbound requests to https://scraper-api.decodo.com. Ensure your environment or network policy allows that and that you trust Decodo.
3) Scraping may have legal or terms-of-service implications for the sites you target (Amazon, Google, Reddit, YouTube). Ensure you have the right to scrape and to use the scraped data.
4) If you need stronger assurance, verify the repo origin (the README points to a GitHub repo) and check that the hosted homepage and dashboard domain match your expectations before providing credentials.

Functional analysis

Type: OpenClaw Skill
Name: decodo-scraper-skill
Version: 1.1.0

The OpenClaw skill 'decodo-scraper' is designed to interact with the Decodo Web Scraping API. It uses a `DECODO_AUTH_TOKEN` to authenticate with `scraper-api.decodo.com` and sends user-provided queries/URLs to this service. The `SKILL.md` file provides clear instructions and contains no prompt injection attempts. The `tools/scrape.py` script uses `argparse` for input and constructs JSON payloads for `requests.post`, so user input is never passed to a shell, which prevents shell injection. There is no evidence of unauthorized data exfiltration, malicious execution, persistence mechanisms, or obfuscation. The behavior is fully aligned with its stated purpose.
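To illustrate why a JSON payload avoids shell injection, the sketch below builds a request body and headers without sending anything. The payload field names (`target`, `query`, `url`, `parse`) and the `Authorization: Basic ...` header format are assumptions based on the description in this listing, not a copy of the real script, and `build_request` is a hypothetical name.

```python
import json
from typing import Dict, Tuple

# Targets that take --url; all others take --query (per the summary table).
URL_TARGETS = {"universal", "amazon", "reddit_post", "reddit_subreddit"}
# This listing only states that parse: true is sent for the amazon target.
PARSED_TARGETS = {"amazon"}


def build_request(target: str, value: str, token: str) -> Tuple[Dict[str, str], str]:
    """Return (headers, JSON body) for one request; nothing is sent."""
    key = "url" if target in URL_TARGETS else "query"
    payload = {"target": target, key: value}
    if target in PARSED_TARGETS:
        payload["parse"] = True
    headers = {
        # Assumed header format for a Basic auth token.
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    # The value is serialized as a JSON string, so shell metacharacters
    # such as ";" or "$()" are treated as plain data.
    return headers, json.dumps(payload)
```

A query like `laptop; rm -rf /` ends up as an ordinary string inside the JSON body and is never interpreted as a command.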

Capability assessment

✓ Purpose & Capability

Name/description, README, SKILL.md and the included tools/scrape.py are all aligned: the skill calls Decodo's scraping API for Google, universal URLs, Amazon, YouTube, and Reddit. No unrelated capabilities or credentials are requested by the code.

✓ Instruction Scope

SKILL.md and the CLI script focus only on building requests to Decodo's scraper API. The runtime instructions ask the agent to set DECODO_AUTH_TOKEN or put it in a .env file; the script reads only that token and does not direct the agent to read other system files or exfiltrate additional environment variables.

✓ Install Mechanism

This is an instruction-only skill with a small Python helper and a requirements.txt (requests, python-dotenv). There is no download-from-arbitrary-URL or install script; risk from install mechanism is low.

ℹ Credentials

The skill legitimately requires a single DECODO_AUTH_TOKEN (Basic auth) to call Decodo's API, which is proportionate. However, the registry metadata at the top claims 'Required env vars: none' while SKILL.md and the code require DECODO_AUTH_TOKEN; this metadata mismatch should be fixed. Treat the token as sensitive and only provide one obtained from Decodo's dashboard.

✓ Persistence & Privilege

The skill does not request always:true, does not modify other skills or system configs, and does not attempt privileged or persistent system changes. It uses the agent process to make outbound HTTPS calls to scraper-api.decodo.com.

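How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install decodo-scraper-skill
Once installed, invoke the Skill by name or trigger it with /decodo-scraper-skill
Provide the inputs described in the Skill's parameter documentation to get structured output.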

Version history

v1.1.0

- Added detailed documentation for all supported scraping targets: Google Search, web pages, Amazon product and search pages, YouTube subtitles, Reddit posts, and Reddit subreddits.
- Clarified required environment variable (`DECODO_AUTH_TOKEN`) and setup instructions for authentication.
- Provided example usage commands and described output formats for each tool.
- Summarized actions and arguments in a new summary table for easier reference.
- Improved error handling documentation, specifying JSON output on failure.

Metadata

Slug: decodo-scraper-skill

Version: 1.1.0

License: not specified

Total installs: 1

Current installs: 1

Number of versions: 1
