lgx-00

Google Index Checker

by GuangxianLiu · GitHub ↗ · v1.1.0 · MIT-0
cross-platform ✓ Security Clean
169 downloads · 0 stars · 0 active installs · 2 versions
Install in OpenClaw
/install google-index-checker
Description
Check Google indexed page count for any domain using the "site:" search operator in Chrome Remote Debugging Protocol (CDP on localhost:9222). Use when the us...
README (SKILL.md)

Google Index Checker

Check the number of indexed pages for any domain(s) using Google's site: search operator via Chrome Remote Debugging Protocol (CDP) on localhost:9222.

Prerequisites

  • Chrome running with --remote-debugging-port=9222
  • Node.js with ws npm package available at /tmp/wsclient/node_modules/ws
    • Install once: npm install ws --prefix /tmp/wsclient
    • If not found, install before starting

Connection Info

  • CDP HTTP endpoint: http://localhost:9222/json
  • Important: Use localhost:9222, not 127.0.0.1:9222. In the setup this skill targets, Chrome listens on IPv6 ::1 rather than IPv4 127.0.0.1
  • All browser tabs share the same cookie/login session (same Chrome profile)
  • After each task, close all tabs and clean up /tmp/wsclient
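
A small wrapper around Node's http module keeps the connection details in one place. This is a sketch only: cdpOptions and cdpRequest are illustrative names, and the wrapper assumes the endpoint returns JSON (as /json and /json/new do), so it is not suitable for endpoints that answer with plain text.

```javascript
const http = require('http');

// Build request options for the CDP HTTP endpoint.
// The hostname is 'localhost' rather than 127.0.0.1, following the note above.
function cdpOptions(path, method = 'GET') {
  return { hostname: 'localhost', port: 9222, path, method };
}

// Resolve with the parsed JSON body; reject on connection errors,
// non-2xx status codes, or a body that is not valid JSON.
function cdpRequest(path, method = 'GET') {
  return new Promise((resolve, reject) => {
    const req = http.request(cdpOptions(path, method), res => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', chunk => { body += chunk; });
      res.on('end', () => {
        if (res.statusCode < 200 || res.statusCode >= 300) {
          reject(new Error('CDP HTTP ' + res.statusCode + ' for ' + path));
          return;
        }
        try { resolve(JSON.parse(body)); } catch (err) { reject(err); }
      });
    });
    req.on('error', reject);
    req.end();
  });
}
```

With this in place, `cdpRequest('/json')` lists the current targets and `cdpRequest('/json/new', 'PUT')` opens a tab.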

Instructions

Step 1: Parse user input

Extract the domain(s) to check. Accept:

  • Single: example.com, www.example.com, https://example.com
  • Multiple: comma-separated, space-separated, or line-by-line
  • Normalize: strip protocol and trailing slashes

Step 2: Prepare browser connection

  1. Install ws if needed: npm install ws --prefix /tmp/wsclient
  2. Create one CDP tab: PUT http://localhost:9222/json/new
  3. Save the webSocketDebuggerUrl from the response

Step 3: Query each domain (reuse same tab)

For each domain, use Page.navigate in the same tab (do NOT create new tabs):

  1. Page.navigate to https://www.google.com/search?q=site:{domain}
  2. Wait for Page.loadEventFired + 3 seconds
  3. Runtime.evaluate the expression document.getElementById('result-stats')?.textContent
  4. Parse the count from text like "找到约 12,700 条结果" (the Chinese-language Google UI string for "About 12,700 results") using the regex /找到约 ([\d,]+) 条结果/
  5. Strip commas → integer
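
The regex above matches only the Chinese-language results page. A parser that tries several patterns might look like the sketch below. The English pattern is an assumption about how the same line reads in an English-language UI ("About 12,700 results"), and parseCount is a hypothetical name; other locales would need their own entries.

```javascript
// Patterns are tried in order; the first capture group is the count with commas.
const COUNT_PATTERNS = [
  /找到约 ([\d,]+) 条结果/,     // Chinese UI, as used in this skill
  /About ([\d,]+) results?/i,   // assumed English UI wording
];

// Returns the count as an integer, or null when no pattern matches.
function parseCount(text) {
  if (!text) return null;
  for (const pattern of COUNT_PATTERNS) {
    const match = text.match(pattern);
    if (match) return parseInt(match[1].replace(/,/g, ''), 10);
  }
  return null;
}
```

Returning null instead of a sentinel string keeps "no count found" distinguishable from a real number when results are tabulated later.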

Step 4: Present results

Single domain

**{domain}** has approximately **{count}** pages indexed by Google.

Multiple domains

## Google Index Coverage Report ({date})

| Domain | Indexed Pages | Notes |
|--------|--------------|-------|
| example.com | 13,200 | — |
| example.org | 8,500 | — |
| example.net | 1,200 | — |

Data source: Google `site:` search operator (approximate values)
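
Both output templates can be produced from the parsed counts with a couple of small formatters. This is an illustrative sketch: formatSingle and formatReport are hypothetical names, and rows are assumed to be objects of the form {domain, count, note}, with count set to null when parsing failed.

```javascript
// Format an integer with thousands separators, e.g. 13200 -> "13,200".
function formatCount(n) {
  return n.toLocaleString('en-US');
}

function formatSingle(domain, count) {
  return `**${domain}** has approximately **${formatCount(count)}** pages indexed by Google.`;
}

function formatReport(rows, date) {
  // \u2014 is the dash placeholder used in the template's Notes column
  const tableRows = rows.map(r =>
    `| ${r.domain} | ${r.count === null ? 'n/a' : formatCount(r.count)} | ${r.note || '\u2014'} |`);
  return [
    `## Google Index Coverage Report (${date})`,
    '',
    '| Domain | Indexed Pages | Notes |',
    '|--------|--------------|-------|',
    ...tableRows,
    '',
    'Data source: Google `site:` search operator (approximate values)',
  ].join('\n');
}
```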

Step 5: Clean up

  1. Close the tab: DELETE http://localhost:9222/json/close/{targetId}
  2. Verify: GET http://localhost:9222/json should no longer list the closed target (it returns [] when no other tabs are open)
  3. Remove temp package: rm -rf /tmp/wsclient

CDP Script Template (copy-paste ready)

const WebSocket = require('/tmp/wsclient/node_modules/ws');
const http = require('http');

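// Send one CDP command and resolve with the response that carries the same id.
function cdpSend(ws, id, method, params) {
  return new Promise(resolve => {
    const handler = data => {
      const msg = JSON.parse(data);
      if (msg.id === id) {
        ws.off('message', handler); // detach so listeners do not pile up across calls
        resolve(msg);
      }
    };
    ws.on('message', handler);
    ws.send(JSON.stringify({id, method, params}));
  });
}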

function extractCount(text) {
  if (!text) return 'NOT_FOUND';
  const m = text.match(/找到约 ([\d,]+) 条结果/);
  return m ? m[1].replace(/,/g, '') : 'PARSE_ERROR:' + text;
}

async function main() {
  // 1. Create one tab
  const target = await new Promise((resolve, reject) => {
    const req = http.request({hostname: 'localhost', port: 9222, path: '/json/new', method: 'PUT'}, res => {
      let d = ''; res.on('data', c => d += c); res.on('end', () => resolve(JSON.parse(d)));
    });
    req.on('error', reject); req.end();
  });

  // 2. Connect WebSocket
  const ws = new WebSocket(target.webSocketDebuggerUrl);
  await new Promise(r => ws.on('open', r));
  await cdpSend(ws, 1, 'Page.enable', {});
  await cdpSend(ws, 2, 'Runtime.enable', {});

  // 3. Loop through domains
  const domains = [['Name', 'example.com']]; // Replace with actual domains
  for (const [name, domain] of domains) {
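    // Register the load listener before navigating so the event cannot be missed
    const loaded = new Promise(resolve => {
      const onLoad = data => {
        const msg = JSON.parse(data);
        if (msg.method === 'Page.loadEventFired') {
          ws.off('message', onLoad); // detach so listeners do not accumulate per domain
          resolve();
        }
      };
      ws.on('message', onLoad);
    });
    await cdpSend(ws, 10, 'Page.navigate', {url: 'https://www.google.com/search?q=' + encodeURIComponent('site:' + domain)});
    await loaded;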
    await new Promise(r => setTimeout(r, 3000));
    const r = await cdpSend(ws, 11, 'Runtime.evaluate', {expression: "document.getElementById('result-stats')?.textContent || 'NOT_FOUND'"});
    console.log(name + '|' + domain + '|' + extractCount(r.result.result.value));
  }

  // 4. Cleanup
  ws.close();
  http.request({hostname: 'localhost', port: 9222, path: '/json/close/' + target.id, method: 'DELETE'}, () => {}).end();
  await new Promise(r => setTimeout(r, 1000));
  
  // 5. Verify tabs closed
  const remaining = await new Promise((resolve, reject) => {
    const req = http.request({hostname: 'localhost', port: 9222, path: '/json', method: 'GET'}, res => {
      let d = ''; res.on('data', c => d += c); res.on('end', () => resolve(JSON.parse(d)));
    });
    req.on('error', reject); req.end();
  });
  console.log('Tabs remaining:', remaining.length);
  process.exit(0);
}

main().catch(e => { console.error(e); process.exit(1); });

Edge Cases

| Problem | Solution |
|---------|----------|
| #result-stats not found | Try div[id^=result] or document.body.innerText |
| Google CAPTCHA | Take screenshot, stop, report to user |
| 0 results | Check if site is new or blocked by robots.txt |
| localhost:9222 returns 404 | Chrome not started with --remote-debugging-port=9222 |
| Tabs accumulate | Always close tab after use, verify with GET /json |
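
Several of these cases can be told apart from a single Runtime.evaluate call that returns the page URL, the stats text, and a slice of the body text. The sketch below is illustrative only. It assumes that Google's CAPTCHA interstitial is served under a /sorry/ path and mentions "unusual traffic", and that an empty English results page says "did not match any documents"; none of these strings come from the skill itself, and other locales will differ.

```javascript
// Expression to pass to Runtime.evaluate (with returnByValue not needed,
// since the result is a JSON string). Uses the fallbacks from the table above.
const PROBE_EXPRESSION = `JSON.stringify({
  url: location.href,
  statsText: (document.getElementById('result-stats') ||
              document.querySelector('div[id^=result]') || {}).textContent || '',
  bodyText: document.body ? document.body.innerText.slice(0, 2000) : ''
})`;

// Classify a probe result into one of the edge cases.
function classifyResultPage({ url, statsText, bodyText }) {
  const body = bodyText || '';
  if (/\/sorry\//.test(url || '') || /unusual traffic/i.test(body)) return 'captcha';
  if (statsText) return 'ok';
  if (/did not match any documents/i.test(body)) return 'zero_results';
  return 'stats_missing';
}
```

On a 'captcha' result the run should stop, as the table says; CDP's Page.captureScreenshot can supply the screenshot for the report.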

Important Notes

  • The site: operator returns approximate values, not exact counts
  • Results vary between Google data centers
  • For precise data, use Google Search Console
  • One tab, sequential navigation — do NOT create new tabs per domain
Usage Guidance
This skill is coherent for checking Google-index counts by controlling a local Chrome instance, but it operates inside your Chrome profile and can therefore access the browser's authenticated session (cookies, logged-in accounts). Before using:
  1. Run Chrome with remote debugging on a dedicated/ephemeral profile or in an isolated environment (container or disposable profile) to avoid exposing personal accounts.
  2. Inspect the /tmp/wsclient contents and prefer installing packages from a trusted environment (or use a local copy of 'ws' you control).
  3. Ensure localhost:9222 is bound only to loopback and not exposed to your network.
  4. Be aware that the SKILL.md's JavaScript is a template: operators or automated agents could alter it to navigate to other sites or execute arbitrary JS, so only run code you trust.
If you cannot run Chrome in an isolated profile, do not use this skill.
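
One way to follow the first recommendation is to start Chrome against a throwaway profile directory. The sketch below only builds the directory and the argument list. The helper names are hypothetical, and the location of the Chrome binary is left to the reader because it varies by platform.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a fresh, empty profile directory under the system temp dir.
function makeProfileDir() {
  return fs.mkdtempSync(path.join(os.tmpdir(), 'index-checker-profile-'));
}

// Command-line switches for a Chrome instance that exposes CDP on port 9222
// and keeps its cookies and logins in the given disposable profile.
function isolatedChromeArgs(profileDir) {
  return [
    '--remote-debugging-port=9222',
    '--user-data-dir=' + profileDir,
    '--no-first-run',
  ];
}

// Usage (chromeBinary is an assumption; set it to your local Chrome path):
//   require('child_process').spawn(chromeBinary, isolatedChromeArgs(makeProfileDir()));
```

Deleting the profile directory after the run removes any session state the checks created.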
Capability Analysis
Type: OpenClaw Skill
Name: google-index-checker
Version: 1.1.0
The skill is designed to check Google search indexing counts using the Chrome Remote Debugging Protocol (CDP). It uses a Node.js script (found in SKILL.md) to interact with a local Chrome instance on port 9222, navigate to Google Search, and parse result statistics. The instructions include explicit cleanup steps, such as closing browser tabs and removing temporary files in /tmp/wsclient. While it utilizes powerful capabilities like browser automation and package installation, these are strictly aligned with the stated purpose and lack indicators of malicious intent or data exfiltration.
Capability Assessment
Purpose & Capability
Name/description match the instructions: the skill drives Chrome via the local CDP to perform Google site: searches and parse result counts. Requested artifacts (Node/ws, localhost CDP) are coherent with this purpose.
Instruction Scope
Instructions stay within the declared task (navigate to Google search pages and read #result-stats). However, using an existing Chrome profile means the CDP-controlled tab runs in the user's authenticated browser session (cookies, logged-in accounts). While the skill only instructs navigation to google.com, CDP allows arbitrary navigation and JS execution in the browsing context — the instructions rely on the operator to not deviate. The SKILL.md also instructs installing npm packages under /tmp and removing them; it does not access other system files or env vars.
Install Mechanism
No formal install spec; the SKILL.md asks to install the npm 'ws' package into /tmp/wsclient if missing. This is a standard npm install (moderate trust requirement). Using /tmp is reasonable for temporary tooling, but users should verify the package source and contents before executing arbitrary scripts that require it.
Credentials
The skill requests no environment variables, secrets, or config paths. The only notable resource is the local Chrome instance (CDP at localhost:9222), which is necessary for the described functionality.
Persistence & Privilege
Skill is instruction-only, has no install that persists beyond /tmp, and does not request always:true or any persistent privileges. It instructs cleanup of the temporary package and tab closure.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install google-index-checker
  3. After installation, invoke the skill by name or use /google-index-checker
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.1.0
Fix connection address (localhost, not 127.0.0.1); add tab reuse pattern (Page.navigate in the same tab), full CDP script template, cleanup steps, and edge case handling. Prerequisites now include the ws npm package.
v1.0.0
Initial release: check Google indexed page count for single or multiple domains using site: operator
Metadata
Slug: google-index-checker
Version: 1.1.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 2
Frequently Asked Questions

What is Google Index Checker?

Check Google indexed page count for any domain using the "site:" search operator in Chrome Remote Debugging Protocol (CDP on localhost:9222). Use when the us... It is an AI Agent Skill for Claude Code / OpenClaw, with 169 downloads so far.

How do I install Google Index Checker?

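Run "/install google-index-checker" in the OpenClaw or Claude Code chat to install it in one step. Running the skill still requires the prerequisites listed in the README: Chrome started with --remote-debugging-port=9222 and Node.js with the ws package.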

Is Google Index Checker free?

Yes, Google Index Checker is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Google Index Checker support?

Google Index Checker is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Google Index Checker?

It is built and maintained by GuangxianLiu (@lgx-00); the current version is v1.1.0.
