Description

This skill should be used when the user asks to "scrape Google News", "get news articles", "search for news", "extract news data", "monitor news topics", "ge...

Usage (SKILL.md)

Google News Scraper with Apify

Name: Apify Google News Scraper
Author: futurizerush

Search and extract news articles from Google News with full article content, enriched descriptions, and metadata. Supports region and language filtering.

Actor: futurizerush/google-news-scraper

Prerequisites

Set APIFY_API_TOKEN in environment. Get a token at console.apify.com/account/integrations.
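
The examples in this document pass the token as a ?token= query parameter. The Apify API also accepts the token in an Authorization: Bearer header, which keeps the secret out of request URLs. A minimal sketch, assuming a hypothetical helper named auth_headers (not part of the skill):

```python
import os


def auth_headers(env=os.environ):
    """Build request headers carrying the Apify token.

    auth_headers is an illustrative helper name, not part of the skill.
    """
    token = env.get("APIFY_API_TOKEN", "").strip()
    if not token:
        # Fail early with a pointer to where a token can be created.
        raise RuntimeError(
            "APIFY_API_TOKEN is not set; create a token at "
            "console.apify.com/account/integrations"
        )
    return {"Authorization": f"Bearer {token}"}
```

It could then be used as requests.post(url, json=payload, headers=auth_headers()) in place of appending ?token=... to each URL.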

Execution Flow

Apify runs are asynchronous. Every request follows 3 steps:

Step 1: Start a run -- POST to the actor API, receive a run ID and dataset ID
Step 2: Poll until done -- GET the run status, wait for SUCCEEDED
Step 3: Fetch results -- GET the dataset items (returns a JSON array)

Typical run time: 30-90 seconds depending on query count and article enrichment.

Input Parameters

Parameter	Type	Required	Description
`searchQueries`	array of strings	Yes	Search queries (e.g. `["AI"]`, `["climate change"]`)
`region`	string	No	Region code. Default: `"us"`. Examples: `"us"`, `"tw"`, `"jp"`
`language`	string	No	Language code. Default: `"en"`. Examples: `"en"`, `"zh-TW"`, `"ja"`
`dateFilter`	string	No	Time range: `"1h"`, `"1d"`, `"1w"`, `"1m"`, or `""` (any time). Default: `""`
`maxResults`	integer	No	Max articles per query. Default: 20. Min: 10
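
The constraints in the table above can be checked locally before a run is started, which avoids spending a request on an input the actor would reject. This is a sketch only: build_input and ALLOWED_DATE_FILTERS are hypothetical names, and the rule that at least one non-empty query is needed is an assumption inferred from searchQueries being required.

```python
ALLOWED_DATE_FILTERS = {"", "1h", "1d", "1w", "1m"}


def build_input(queries, region="us", language="en", date_filter="", max_results=20):
    """Return an actor input dict, rejecting values the parameter table rules out."""
    if isinstance(queries, str):
        queries = [queries]  # searchQueries must be an array of strings
    queries = [q.strip() for q in queries if q and q.strip()]
    if not queries:
        # Assumption: a required array implies at least one usable entry.
        raise ValueError("searchQueries needs at least one non-empty string")
    if date_filter not in ALLOWED_DATE_FILTERS:
        raise ValueError(f"dateFilter must be one of {sorted(ALLOWED_DATE_FILTERS)}")
    if max_results < 10:
        raise ValueError("maxResults must be >= 10")
    return {
        "searchQueries": queries,
        "region": region,
        "language": language,
        "dateFilter": date_filter,
        "maxResults": max_results,
    }
```

For example, build_input("AI", date_filter="1d", max_results=10) produces the same payload as the first Python example below.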

Complete Example (Python)

import requests, os, time

TOKEN = os.environ["APIFY_API_TOKEN"]
BASE = "https://api.apify.com/v2"

# Step 1: Start the run
response = requests.post(
    f"{BASE}/acts/futurizerush~google-news-scraper/runs?token={TOKEN}",
    json={
        "searchQueries": ["AI"],
        "region": "us",
        "language": "en",
        "dateFilter": "1d",
        "maxResults": 10,
    },
)
response.raise_for_status()
run = response.json()["data"]
run_id = run["id"]
dataset_id = run["defaultDatasetId"]

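# Step 2: Poll until the run reaches a terminal state
while True:
    status_resp = requests.get(f"{BASE}/actor-runs/{run_id}?token={TOKEN}")
    status_resp.raise_for_status()  # surface HTTP errors instead of failing on a missing key
    status = status_resp.json()["data"]["status"]
    if status == "SUCCEEDED":
        break
    if status in ("FAILED", "ABORTED", "TIMED-OUT"):
        raise RuntimeError(f"Run failed: {status}")
    time.sleep(5)  # run is still in progress; wait before polling again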

# Step 3: Fetch results (JSON array)
items = requests.get(
    f"{BASE}/datasets/{dataset_id}/items?token={TOKEN}"
).json()
for article in items:
    print(f"[{article['source']}] {article['title']}")
    print(f"  URL: {article['articleUrl']}")
    print(f"  Published: {article['pubDate']}")
    if article.get("enrichedDescription"):
        print(f"  Summary: {article['enrichedDescription'][:100]}")
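
The polling loop in the example above has no upper bound, so it would wait indefinitely if a run never reached a terminal state. One way to bound it is sketched here. wait_for_run is a hypothetical helper, and the 300-second default is an assumption chosen to sit above the 30-90 second typical run time. The status lookup is passed in as a callable so the loop can be exercised without network access.

```python
import time

TERMINAL_FAILURES = {"FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(get_status, timeout_s=300, interval_s=5,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until SUCCEEDED, a failure state, or the deadline."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "SUCCEEDED":
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"Run failed: {status}")
        if clock() >= deadline:
            # Give up locally; the run itself may still be in progress on Apify.
            raise TimeoutError(f"Run still {status} after {timeout_s}s")
        sleep(interval_s)
```

With the variables from the example above, get_status could be a small function that performs the GET on f"{BASE}/actor-runs/{run_id}" and returns the ["data"]["status"] value.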

Taiwan news in Chinese

requests.post(
    f"{BASE}/acts/futurizerush~google-news-scraper/runs?token={TOKEN}",
    json={
        "searchQueries": ["台灣"],
        "region": "tw",
        "language": "zh-TW",
        "dateFilter": "1d",
        "maxResults": 10,
    },
)

Multiple queries

requests.post(
    f"{BASE}/acts/futurizerush~google-news-scraper/runs?token={TOKEN}",
    json={
        "searchQueries": ["AI", "climate", "crypto"],
        "region": "us",
        "dateFilter": "1w",
        "maxResults": 10,
    },
)
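
When several queries share one run, the dataset holds the articles for all of them together. Each item carries a searchQuery field (see Output Format), so the results can be split back out per query. A minimal sketch, with group_by_query as a hypothetical helper name:

```python
from collections import defaultdict


def group_by_query(items):
    """Group dataset items by the search query that produced them."""
    grouped = defaultdict(list)
    for article in items:
        # Items without a searchQuery field are collected under "".
        grouped[article.get("searchQuery", "")].append(article)
    return dict(grouped)
```

Applied to the items list fetched in Step 3, this returns a dict such as {"AI": [...], "climate": [...], "crypto": [...]}.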

Complete Example (bash)

# Step 1: Start the run
RUN_RESPONSE=$(curl -s -X POST \
  "https://api.apify.com/v2/acts/futurizerush~google-news-scraper/runs?token=$APIFY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchQueries": ["AI"], "region": "us", "language": "en", "dateFilter": "1d", "maxResults": 10}')

RUN_ID=$(echo "$RUN_RESPONSE" | jq -r '.data.id')
DATASET_ID=$(echo "$RUN_RESPONSE" | jq -r '.data.defaultDatasetId')

# Step 2: Poll until done
while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?token=$APIFY_API_TOKEN" \
    | jq -r '.data.status')
  [ "$STATUS" = "SUCCEEDED" ] && break
  # Treat every terminal failure state as an error, including TIMED-OUT
  case "$STATUS" in
    FAILED|ABORTED|TIMED-OUT) echo "Failed: $STATUS" >&2; exit 1 ;;
  esac
  sleep 5
done

# Step 3: Fetch results
curl -s "https://api.apify.com/v2/datasets/$DATASET_ID/items?token=$APIFY_API_TOKEN" | jq '.'

Output Format

Each item in the results array (field names verified from real API output on 2026-04-11):

{
  "title": "Vance, Bessent questioned tech giants on AI security...",
  "articleUrl": "https://www.cnbc.com/2026/04/10/...",
  "googleNewsUrl": "https://news.google.com/rss/articles/...",
  "pubDate": "Fri, 10 Apr 2026 20:06:08 GMT",
  "timestamp": "2026-04-10T20:06:08.000Z",
  "source": "CNBC",
  "websiteName": "CNBC",
  "websiteUrl": "https://www.cnbc.com",
  "imageUrl": "https://image.cnbcfm.com/...",
  "description": "Raw RSS description with related headlines...",
  "enrichedDescription": "Bessent and Fed Chair Jerome Powell separately met with...",
  "excerpt": "Bessent and Fed Chair Jerome Powell separately met with...",
  "articleContent": {
    "content": "Full article text (truncated to ~2000 chars)...",
    "characterCount": 2000,
    "tokenCount": 325
  },
  "enrichmentTime": 9848,
  "guid": "unique-article-id",
  "searchQuery": "AI",
  "region": "us",
  "language": "en",
  "scrapedAt": "2026-04-11T06:06:59.618Z"
}

Note: Field names use camelCase. The articleContent object contains the article text (truncated to about 2000 characters), its character count, and its token count. Use enrichedDescription or excerpt for summaries.
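
Since the Python example guards enrichedDescription with article.get(...), a summary field may be absent or empty on some items. The sketch below picks the first usable text in a fixed order. pick_summary is a hypothetical helper, and the fallback order (enrichedDescription, then excerpt, then the start of articleContent.content) is an assumption rather than documented actor behavior.

```python
def pick_summary(article, max_chars=200):
    """Return a short summary for one result item, or "" if none is available."""
    for field in ("enrichedDescription", "excerpt"):
        text = (article.get(field) or "").strip()
        if text:
            return text[:max_chars]
    # Fall back to the beginning of the (already truncated) article text.
    content = (article.get("articleContent") or {}).get("content") or ""
    return content.strip()[:max_chars]
```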

Error Handling

Error	Cause	Fix
401 Unauthorized	Invalid or missing API token	Check `APIFY_API_TOKEN`
`invalid-input`: "must be >= 10"	`maxResults` below minimum	Set `maxResults` to at least 10
No results	Query too specific or region has no news	Broaden the query or try a different region
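
The table above can be turned into a small lookup that maps a failed response to its suggested fix. This is a sketch under a stated assumption: that error responses carry a JSON body shaped like {"error": {"type": ..., "message": ...}}. explain_error is a hypothetical helper name.

```python
def explain_error(status_code, body):
    """Map an HTTP status and parsed JSON error body to a suggested fix."""
    # Assumed body shape: {"error": {"type": "...", "message": "..."}}
    error = (body or {}).get("error") or {}
    err_type = error.get("type", "")
    message = error.get("message", "")
    if status_code == 401:
        return "Check APIFY_API_TOKEN"
    if err_type == "invalid-input" and ">= 10" in message:
        return "Set maxResults to at least 10"
    # Unknown error: pass the server message through unchanged.
    return message or f"HTTP {status_code}"
```

An empty result list is not an HTTP error, so the "No results" row would be handled separately by checking whether the fetched items list is empty.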

Tips

Use dateFilter: "1h" for real-time news monitoring and alerting.
Use dateFilter: "1d" for daily news digests.
articleContent.content provides the article text, truncated to about 2000 characters -- useful for summarization.
enrichedDescription is a cleaner summary than description (which contains raw RSS data with related headlines).
timestamp is ISO 8601 format, easier to parse than pubDate.
Multiple search queries run in a single actor execution.
No Google News login or API key is required.
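
To illustrate the tip about timestamp versus pubDate, both fields can be parsed with the Python standard library. The function names below are hypothetical, and the sample values are taken from the Output Format example.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_timestamp(value):
    """Parse the ISO 8601 timestamp field into a timezone-aware datetime."""
    # Swapping the trailing "Z" for an explicit offset keeps this working on
    # Python versions before 3.11, where fromisoformat() rejects "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_pub_date(value):
    """Parse the RFC 2822 style pubDate field into a timezone-aware datetime."""
    return parsedate_to_datetime(value)


ts = parse_timestamp("2026-04-10T20:06:08.000Z")
pub = parse_pub_date("Fri, 10 Apr 2026 20:06:08 GMT")
print(ts == pub)  # both fields describe the same instant
```

Either field works once parsed; timestamp needs only one standard call, and its plain string form already sorts chronologically when all values share the same format.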

Security Recommendations

Before installing or enabling this skill:

(1) Note that SKILL.md requires an Apify API token but the registry metadata does not list it. Treat that as a metadata bug and assume you will need to provide APIFY_API_TOKEN.
(2) Only provide an Apify token if you trust the actor and its owner. Verify the actor name (futurizerush/google-news-scraper) on Apify and ideally use a token scoped to a dedicated, minimal Apify account.
(3) Ask the skill publisher to update the registry metadata to declare APIFY_API_TOKEN as a required credential so that automated tooling and reviewers can see the dependency.
(4) If you have doubts about the owner (source unknown, no homepage), avoid supplying your main Apify credentials and consider running the actor manually in a sandboxed account to inspect its outputs first.

Functional Analysis

Type: OpenClaw Skill
Name: apify-google-news
Version: 0.1.1

The skill bundle provides legitimate instructions and code examples (Python and Bash) for interacting with the Apify Google News Scraper actor. It uses standard API patterns (POST to start, polling for status, GET for results) and correctly handles the API token via environment variables without any signs of data exfiltration, malicious execution, or prompt injection.

Capability Tags

crypto

Capability Assessment

ℹ Purpose & Capability

The name/description (Apify Google News Scraper) matches the runtime instructions: the SKILL.md instructs calling the Apify actor futurizerush/google-news-scraper and fetching dataset items from api.apify.com. That functionality is coherent with the stated purpose. However, the registry metadata lists no required environment variables or primary credential while the SKILL.md explicitly requires APIFY_API_TOKEN — this mismatch is unexpected.

✓ Instruction Scope

The instructions are focused: they show how to start an Apify actor run, poll for completion, and fetch dataset items from https://api.apify.com. They only reference an API token (APIFY_API_TOKEN) and standard network calls; they do not ask the agent to read unrelated files, system paths, or other credentials. No unexpected external endpoints are used beyond Apify.

✓ Install Mechanism

This is an instruction-only skill with no install spec and no code files, so nothing is written to disk or downloaded by the skill itself. That is the lowest-risk install mechanism.

⚠ Credentials

The SKILL.md requires APIFY_API_TOKEN (sensitive credential) for API access, but the registry metadata declares no required env vars or primary credential. The token request is legitimate for Apify usage, but the metadata omission is misleading and could cause automated reviewers or users to miss that a secret is needed. The skill should declare APIFY_API_TOKEN as a required credential/primaryEnv.

✓ Persistence & Privilege

The skill does not request persistent/always-on inclusion and does not modify other skills or agent-wide configs. Autonomous invocation is allowed (platform default) but is not combined with other high-privilege requests here.

Version History

v0.1.1

Add discovery tags for SEO

v0.1.0

Initial release. Search and extract news articles from Google News with full content. All output fields verified.

Metadata

Slug apify-google-news

Version 0.1.1

License MIT-0

Total installs 0

Current installs 0

Number of versions 2

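FAQ

What is Apify Google News Scraper?

This skill should be used when the user asks to "scrape Google News", "get news articles", "search for news", "extract news data", "monitor news topics", "ge... It is an AI agent skill plugin for Claude Code / OpenClaw, with 104 downloads to date.

How do I install Apify Google News Scraper?

Run the command "/install apify-google-news" in an OpenClaw or Claude Code chat to install it in one step. The install itself needs no additional configuration, but running the skill requires APIFY_API_TOKEN to be set (see Prerequisites).

Is Apify Google News Scraper free?

Yes. Apify Google News Scraper is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Apify Google News Scraper support?

Apify Google News Scraper is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Apify Google News Scraper?

It is developed and maintained by Futurize Rush (@futurizerush). The current version is v0.1.1.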
