Browser Collector

by wjl1004 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 78 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install browser-collector
Description
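A browser automation and data collection framework. Supports Playwright browser control, DdddOcr captcha recognition, and financial data collection from EastMoney (东方财富), Xueqiu (雪球), and AKShare. Includes anti-scraping countermeasures, a User-Agent pool, and proxy support.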
README (SKILL.md)

browser-collector Skill

Version: 1.0.0
Positioning: browser automation + data collection framework
Dependencies: Playwright, DdddOcr, OpenCV

Quick Start

from collectors import EastMoneyCollector

em = EastMoneyCollector()
result = em.get_limit_up(limit=10)

Collectors

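| Collector | Method | Description |
| --- | --- | --- |
| eastmoney | get_limit_up / get_limit_down | Limit-up / limit-down stocks |
| eastmoney | get_stock_quote | Individual stock quotes |
| eastmoney | get_top_money_flow | Industry money flow |
| akshare | collect_stock_quote | AKShare stock quotes |
| akshare | collect_index | Index quotes |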
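
Calling several collector methods back to back can send requests to the upstream sites in quick succession. The following is a minimal sketch of a client-side throttle that spaces calls apart. The `Throttle` class is hypothetical and not part of browser-collector, and it assumes the collector methods are ordinary blocking callables, which the listing does not confirm.

```python
import time
from typing import Any, Callable, Optional


class Throttle:
    """Enforce a minimum interval (in seconds) between successive calls.

    Hypothetical helper for illustration; not part of browser-collector.
    """

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # start time of the previous call

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Wait until min_interval has elapsed since the last call, then run func."""
        if self._last is not None:
            wait = self.min_interval - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        return func(*args, **kwargs)


# Possible usage with the collector from the quick start (left commented out
# because it needs the skill's own package on the import path):
#
#   from collectors import EastMoneyCollector
#   em = EastMoneyCollector()
#   throttle = Throttle(min_interval=2.0)
#   ups = throttle.call(em.get_limit_up, limit=10)
#   downs = throttle.call(em.get_limit_down, limit=10)
```

The clock and sleep functions are injectable so the spacing logic can be checked without real delays. The 2-second interval is an arbitrary placeholder, not a value recommended by any of the data sources.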

Browser Mode

from collectors import EastMoneyCollector
em = EastMoneyCollector()
result = em.get_limit_up_browser(direction="up", limit=10)

CLI

python collectors/cli.py list
python collectors/cli.py collect eastmoney limit-up --limit 10
Usage Guidance
This package appears to do what it says: Playwright-based scraping, OCR for captchas, and collectors for EastMoney/Xueqiu/AKShare. Before installing, consider:
  1. Dependency checklist: you will need to pip-install Playwright, run 'playwright install' (to get browser binaries), and install DdddOcr/OpenCV (and optionally pytesseract).
  2. Legal/ethics: scraping financial sites can violate terms of service; use responsibly and respect rate limits.
  3. Privacy: login cookies are persisted to ~/.openclaw/cookies; don't store sensitive account credentials unless you understand the implications.
  4. Isolation: run in a virtualenv/container to avoid dependency/version conflicts and to contain network activity.
  5. Review core/config.py (included) for logging, proxy, or telemetry settings before use.
If you need higher assurance, request a review of the remaining truncated files (core/config.py, collectors/cli.py, and any omitted code) to confirm there are no hidden network callbacks or telemetry hooks.
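
As a first pass at the review suggested above, one option is to list every host that appears in a URL literal in the skill's Python files. This is only a rough sketch: the `find_hosts` function is hypothetical, it sees only hard-coded `http://` and `https://` strings, and it cannot detect endpoints that are assembled at runtime or read from configuration.

```python
import re
from pathlib import Path
from typing import Dict, Set
from urllib.parse import urlsplit

# Matches http(s) URLs up to the first whitespace, quote, or bracket character.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def find_hosts(root: str) -> Dict[str, Set[str]]:
    """Map each host found in a URL literal to the files that mention it."""
    hosts: Dict[str, Set[str]] = {}
    root_path = Path(root)
    for path in sorted(root_path.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for url in URL_PATTERN.findall(text):
            try:
                host = urlsplit(url).hostname
            except ValueError:
                continue  # malformed URL literal; skip it
            if host:
                relative = path.relative_to(root_path).as_posix()
                hosts.setdefault(host, set()).add(relative)
    return hosts


# Possible usage, assuming the skill has been unpacked into ./browser-collector
# (a placeholder path, not one documented by the listing):
#
#   for host, files in sorted(find_hosts("browser-collector").items()):
#       print(host, sorted(files))
```

Hosts that do not belong to the advertised data sources are the ones worth reading around in the source.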
Capability Analysis
Type: OpenClaw Skill · Name: browser-collector · Version: 1.0.0
The skill bundle provides a comprehensive framework for financial data scraping and browser automation, but it includes several high-risk security practices. Specifically, 'browser/playwright.py' initializes the browser with '--disable-web-security' and '--allow-running-insecure-content', which significantly weakens the browser's sandbox and security model. Additionally, 'browser/login.py' stores sensitive authentication cookies in plaintext within the user's home directory (~/.openclaw/cookies/). While these features appear intended to facilitate scraping and session persistence on platforms like Xueqiu and EastMoney, they constitute significant security vulnerabilities that could be leveraged if the agent navigates to untrusted sites.
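
One partial mitigation for the plaintext cookie store is to restrict its file permissions to the owning user. The sketch below is an assumption-laden illustration, not part of the skill: `restrict_cookie_dir` is a hypothetical helper, it relies on POSIX permission bits (on Windows `chmod` only toggles the read-only flag), and it does not encrypt anything, so the cookies stay readable to any process running as the same user.

```python
import os
import stat
from pathlib import Path
from typing import List


def restrict_cookie_dir(cookie_dir: Path) -> List[Path]:
    """Set directories to 0o700 and files to 0o600; return the paths changed."""
    changed: List[Path] = []
    if not cookie_dir.is_dir():
        return changed  # nothing to do if the store has not been created yet
    # Collect targets up front so permission changes do not affect traversal.
    targets = [cookie_dir] + sorted(cookie_dir.rglob("*"))
    for path in targets:
        if path.is_symlink():
            continue  # do not follow or modify symlink targets
        wanted = 0o700 if path.is_dir() else 0o600
        current = stat.S_IMODE(path.stat().st_mode)
        if current != wanted:
            path.chmod(wanted)
            changed.append(path)
    return changed


# Possible usage against the location named in the analysis above:
#
#   restrict_cookie_dir(Path(os.path.expanduser("~/.openclaw/cookies")))
```

This addresses only exposure to other local accounts. The relaxed browser flags are a separate issue that permissions do not touch.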
Capability Assessment
Purpose & Capability
Name/description promise Playwright-driven scraping, OCR for captchas, and built-in collectors for EastMoney, Xueqiu and AKShare; the repository contains Playwright control, captcha solver, login manager, and collectors for those exact sources. No unrelated credentials or surprising binaries are required.
Instruction Scope
SKILL.md contains usage examples and a CLI invocation that match the code. The runtime instructions and code only reference site APIs and local cookie storage; there are no instructions to read arbitrary unrelated files or to exfiltrate data to unknown endpoints.
Install Mechanism
There is no install spec (instruction-only skill) which reduces install risk, but the code has non-trivial Python runtime dependencies (playwright, ddddocr, opencv, akshare, possibly pytesseract). Installing Playwright also requires browser binaries (playwright install) — the SKILL.md lists dependencies but does not provide an automated, auditable install step. This is not a security red flag by itself, but users should be prepared to install large/privileged dependencies.
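
Since there is no automated install step, a quick pre-flight check can report which of the listed dependencies are not yet importable. This is a minimal sketch: the import-name to pip-name mapping is an assumption based on how these libraries are commonly published (for example `cv2` from `opencv-python`), and the skill may pin different distributions or versions.

```python
from importlib.util import find_spec
from typing import Dict, List

# Import name -> assumed pip package name (not confirmed by the skill itself).
REQUIRED: Dict[str, str] = {
    "playwright": "playwright",
    "ddddocr": "ddddocr",
    "cv2": "opencv-python",
    "akshare": "akshare",
}
OPTIONAL: Dict[str, str] = {
    "pytesseract": "pytesseract",
}


def missing_packages(modules: Dict[str, str]) -> List[str]:
    """Return pip names for top-level modules that cannot be found."""
    return [pip_name for module, pip_name in modules.items() if find_spec(module) is None]


# Possible usage:
#
#   missing = missing_packages(REQUIRED)
#   if missing:
#       print("pip install " + " ".join(missing))
#   print("optional, not installed:", missing_packages(OPTIONAL))
```

`find_spec` only locates a top-level module without importing it, so the check is cheap. It does not verify that Playwright's browser binaries are present; those still require running `playwright install`.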
Credentials
The skill does not declare required environment variables or external credentials. It does persist cookies under ~/.openclaw/cookies to support login flows (expected for a collector that can use authenticated APIs). No unrelated secrets or config paths are requested.
Persistence & Privilege
Flags show normal defaults (always: false, model invocation allowed). The skill writes cookies to its own folder under the user's home, which is a reasonable behavior for a login manager and does not modify other skills or system-wide settings.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install browser-collector
  3. After installation, invoke the skill by name or use /browser-collector
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of browser-collector:
- Provides browser automation and data collection framework.
- Supports Playwright automation, DdddOcr captcha recognition, and collection of financial data from EastMoney, Xueqiu, and AKShare.
- Includes anti-crawling measures, user agent pool, and proxy support.
- Offers both Python interface and CLI for data collection tasks.
- Collectors available for stock quotes, limit up/down stocks, industry money flows, and index data.
Metadata
Slug browser-collector
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Browser Collector?

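Browser Collector is a browser automation and data collection framework. It supports Playwright browser control, DdddOcr captcha recognition, and financial data collection from EastMoney, Xueqiu, and AKShare, with anti-scraping countermeasures, a User-Agent pool, and proxy support. It is an AI Agent Skill for Claude Code / OpenClaw, with 78 downloads so far.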

How do I install Browser Collector?

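Run "/install browser-collector" in the OpenClaw or Claude Code chat to install the skill in one step. The Python dependencies (Playwright and its browser binaries, DdddOcr, OpenCV) are not installed automatically and must be set up separately.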

Is Browser Collector free?

Yes, Browser Collector is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Browser Collector support?

Browser Collector is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Browser Collector?

It is built and maintained by wjl1004 (@wjl1004); the current version is v1.0.0.
