beipian261

闲鱼数据抓取 (Xianyu Data Scraper)

by beipian261 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads 423
Stars 0
Active Installs 1
Versions 1
Install in OpenClaw
/install xianyu-data-grabber
Description
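Xianyu (闲鱼) data-scraping skill. Uses Playwright and OCR to get past anti-scraping measures and collect Xianyu listing data (title, price, number of "want" clicks, etc.), then automatically uploads screenshots and data to a Gitee repository. Supports batch keyword search, competitor analysis, and market research.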
Usage Guidance
Key things to consider before installing or running this skill:
- Credentials: The skill asks for a Gitee personal access token and (optionally) a site login cookie. Only provide a token with the minimal scopes required (projects/repo write) and consider creating a dedicated repository/account for uploads. Avoid providing long-lived credentials you use elsewhere.
- Inspect install scripts: Do not run curl | bash on an untrusted URL. Inspect install.sh, uploader.sh and cron-setup.sh locally before executing. The included install.sh will install system packages, pip/npm modules and attempt to add cron jobs; these require elevated privileges and change system state.
- Cron/persistence: The skill's installer will (by default) add scheduled tasks that run scraping and uploads automatically. If you do not want autonomous recurring scraping, do not install the crontab entries or remove them after install.
- Private data and file permissions: The docs claim token/cookie are stored with 600 permissions, but the install script does not enforce this. After creating the config file, set permissions (chmod 600 ~/.openclaw/workspace/.xianyu-grabber-config.json) and consider restricting who can read the workspace directory.
- Upload destination: The uploader will push screenshots and data to a Gitee repo. Confirm the uploader.sh targets your intended repo/owner and that you trust that destination. If you prefer not to upload, leave uploadToGitee=false or avoid supplying a token.
- Run in isolation for first test: If possible, run the skill in a disposable VM or container so you can observe network activity, filesystem changes, and cron modifications before trusting it on a production host.
- Review truncated/omitted files: The repository included many scripts; some files were truncated in the review. Inspect any additional scripts (uploader.sh, run.sh, update.sh) for unexpected network endpoints, hardcoded URLs, or behaviour before full deployment.
If you want, I can: (a) summarize the contents of uploader.sh/run.sh/update.sh and any omitted files to check for hidden endpoints or unexpected behavior; (b) show the exact crontab entries and recommend safer alternatives; or (c) suggest a minimal manual installation checklist to reduce risk.
Capability Analysis
Type: OpenClaw Skill
Name: xianyu-data-grabber
Version: 1.0.0
The skill bundle implements a complex data scraping and reporting tool with several high-risk behaviors. Most notably, 'update.sh' provides a remote code execution mechanism by downloading and overwriting local scripts with code from a Gitee repository. Additionally, 'install.sh' and 'cron-setup.sh' perform system-level modifications, including installing packages via apt-get/yum and establishing persistence through crontab entries. The skill also requires sensitive credentials (GITEE_TOKEN and XIANFU_COOKIE), which are stored in a local JSON file and used in potentially insecure ways, such as embedding the token in a Git URL within 'uploader.sh'. While these features support the stated purpose of an automated scraper, the combination of remote updates, persistence, and high-privilege requirements constitutes a significant security risk.
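The token-in-URL concern can be checked mechanically. The following is a minimal sketch and is not taken from the skill's code: `has_embedded_credentials` and `strip_credentials` are hypothetical helper names, and the sample URL with its placeholder token is made up for illustration. It relies only on Python's standard `urllib.parse` to detect and drop the userinfo part of a remote URL.

```python
from urllib.parse import urlsplit, urlunsplit


def has_embedded_credentials(url: str) -> bool:
    """Return True if the URL carries a username or password (userinfo)."""
    parts = urlsplit(url)
    return bool(parts.username or parts.password)


def strip_credentials(url: str) -> str:
    """Return the URL without userinfo, keeping scheme, host, port and path.

    Assumption: hosts are plain names or IPv4 addresses; IPv6 literals
    would need their brackets restored and are not handled here.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    netloc = f"{host}:{parts.port}" if parts.port else host
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Illustrative URL only; the token is a placeholder, not a real secret.
sample = "https://oauth2:EXAMPLE_TOKEN@gitee.com/owner/repo.git"
print(has_embedded_credentials(sample))
print(strip_credentials(sample))
```

A check like this could be run over the remote URLs found in a script under review, or used to redact URLs before they are logged.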
Capability Assessment
Purpose & Capability
Name/description match the code: Playwright + Tesseract OCR + uploader scripts implement scraping, OCR, reporting and Gitee upload. One minor metadata inconsistency: the registry 'Required binaries' in the provided metadata shows placeholders ([object Object]) while SKILL.md and the code clearly require node, python3, tesseract and playwright.
Instruction Scope
SKILL.md and scripts instruct the agent/user to store a Gitee personal access token and an (optional) Xianyu (闲鱼) login cookie in a local config file and to run scripts that will launch headless browsers, take full-page screenshots, perform OCR, and (optionally) upload data and screenshots to Gitee. Those actions are expected for the stated purpose, but the instructions also: (a) ask the user to place sensitive tokens/cookies in a file in the home workspace (install.sh does not enforce any permission setting), (b) include explicit commands to configure cron entries for repeated autonomous scraping and upload, and (c) suggest, in INSTALL.md, piping a remote install.sh via curl|bash (a high-risk pattern). The skill does not instruct reading system credentials or unrelated config files, but the automatic scheduling and remote-install suggestions broaden the operational scope and risk.
Install Mechanism
The package contains an install.sh that performs system package installs (apt-get/yum), pip/npm installs, npx playwright install, and attempts to install cron jobs into the system crontab. INSTALL.md also advertises a curl -sL https://raw.githubusercontent.com/your-username/xianyu-data-grabber/main/install.sh | bash pattern. Although raw.githubusercontent.com is a common release host, piping an external script to bash is risky unless the URL is a verified official release; here the URL uses a placeholder 'your-username' which is ambiguous. The included install.sh will modify system state (packages, crontab) and requires elevated permissions to succeed—this increases risk and requires the user to review the install script carefully before execution.
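Reviewing the install script before running it can be partly automated. The sketch below assumes the script has already been downloaded to disk and read into a string; `RISKY_PATTERNS` and `flag_risky_lines` are hypothetical names, the patterns are an illustrative starting set rather than a complete audit, and the sample script is invented (it is not the skill's actual install.sh).

```python
import re

# Hypothetical patterns chosen for illustration; extend them for a real review.
RISKY_PATTERNS = {
    "package install": re.compile(r"\b(apt-get|yum|pip3?|npm)\s+(-\S+\s+)*install\b"),
    "crontab change": re.compile(r"\bcrontab\b"),
    "pipe to shell": re.compile(r"\b(curl|wget)\b[^|\n]*\|\s*(sudo\s+)?(ba)?sh\b"),
    "elevated privileges": re.compile(r"\bsudo\b"),
}


def flag_risky_lines(script_text):
    """Return (line_number, label, line) tuples for lines matching a pattern."""
    findings = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments (including the shebang)
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(stripped):
                findings.append((number, label, stripped))
    return findings


# Invented example script, used only to exercise the function.
sample_script = """#!/bin/bash
sudo apt-get install -y tesseract-ocr
echo "done"
(crontab -l; echo "0 9 * * * run.sh") | crontab -
"""

for number, label, line in flag_risky_lines(sample_script):
    print(f"line {number}: {label}: {line}")
```

A scan like this only surfaces lines worth reading; it does not replace reading the script in full.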
Credentials
The skill requests a Gitee token (for repository file create/update) and an optional site login cookie (to improve scraping success). Those credentials are proportionate to the claimed functionality. However, the cookie and token are sensitive; the skill's docs claim storing them with permission 600, but the shipped install.sh does not explicitly set secure permissions on the config file. Also the registry metadata listing of required env vars is malformed (placeholders), which is an inconsistency to be aware of.
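Because install.sh does not set the permissions itself, the config file can be created with owner-only access from the start. This is a minimal sketch for POSIX systems: `write_private_json` and `is_private` are hypothetical helpers, and the `giteeToken` key is an assumed name for illustration (only `uploadToGitee` appears in the review above). The demo writes to a temporary directory rather than the real config path.

```python
import json
import os
import stat
import tempfile


def write_private_json(path, data):
    """Write data as JSON to path, readable and writable by the owner only."""
    # Create with mode 0o600 so a new file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    # The mode passed to os.open only applies to newly created files,
    # so tighten permissions explicitly in case the file already existed.
    os.chmod(path, 0o600)


def is_private(path):
    """Return True if group and others have no permissions on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


with tempfile.TemporaryDirectory() as workdir:
    config_path = os.path.join(workdir, "config.json")
    # "giteeToken" is an assumed key name; the value is a placeholder.
    write_private_json(config_path, {"giteeToken": "PLACEHOLDER", "uploadToGitee": False})
    print(is_private(config_path))
```

The same `is_private` check could be pointed at an existing config file to confirm that the documented 600 permissions are actually in effect.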
Persistence & Privilege
The skill itself is not marked always:true, but the provided install scripts (install.sh, cron-setup.sh) create and install cron jobs that will run scraping, report generation and uploads on a schedule (daily/weekly). That gives the skill ongoing system presence and the ability to perform repeated network access and uploads. This level of persistence is expected for a scheduler-based scraper but is a material privilege: if you install, the system will run periodic automated scraping and (if configured) upload to Gitee without further prompts. Review/consent should be explicit before enabling.
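To see what the installer scheduled, the current crontab can be inspected for entries that mention the skill. The sketch below works on crontab text (for example, the output of `crontab -l`) and never modifies the system; `find_cron_entries` and `remove_cron_entries` are hypothetical helpers, and the sample entries, paths and schedules are invented, since the review does not list the actual crontab lines.

```python
def find_cron_entries(crontab_text, marker):
    """Return active (non-comment) crontab lines that mention marker."""
    matches = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and marker in stripped:
            matches.append(stripped)
    return matches


def remove_cron_entries(crontab_text, marker):
    """Return the crontab text with active lines mentioning marker dropped."""
    kept = [
        line for line in crontab_text.splitlines()
        if line.strip().startswith("#") or marker not in line
    ]
    return "\n".join(kept) + "\n" if kept else ""


# Invented crontab content, used only to exercise the functions.
sample_crontab = """# m h dom mon dow command
0 9 * * * /home/user/xianyu-data-grabber/run.sh
30 2 * * 0 /usr/bin/backup.sh
"""

print(find_cron_entries(sample_crontab, "xianyu-data-grabber"))
print(remove_cron_entries(sample_crontab, "xianyu-data-grabber"), end="")
```

Keeping the functions pure makes it possible to review the cleaned text before deciding whether to install it as the new crontab.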
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install xianyu-data-grabber
  3. After installation, invoke the skill by name or use /xianyu-data-grabber
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release - 60+ keywords, enhanced OCR, visualization, smart recommendations, scheduled tasks, automatic updates
Metadata
Slug xianyu-data-grabber
Version 1.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is 闲鱼数据抓取?

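闲鱼数据抓取 is a Xianyu data-scraping skill. It uses Playwright and OCR to get past anti-scraping measures and collect Xianyu listing data (title, price, number of "want" clicks, etc.), then automatically uploads screenshots and data to a Gitee repository. It supports batch keyword search, competitor analysis, and market research. It is an AI Agent Skill for Claude Code / OpenClaw, with 423 downloads so far.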

How do I install 闲鱼数据抓取?

Run "/install xianyu-data-grabber" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is 闲鱼数据抓取 free?

Yes, 闲鱼数据抓取 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 闲鱼数据抓取 support?

闲鱼数据抓取 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 闲鱼数据抓取?

It is built and maintained by beipian261 (@beipian261); the current version is v1.0.0.
