Xianyu Data Grabber (闲鱼数据抓取)
Author: beipian261 · v1.0.0 · MIT-0
Total downloads: 423 · Favorites: 0 · Current installs: 1 · Versions: 1
Install in OpenClaw
/install xianyu-data-grabber
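Description
A Xianyu (闲鱼) data-scraping skill. It uses Playwright and OCR to get past anti-scraping measures, collects listing data (title, price, number of interested buyers, and so on), and automatically uploads screenshots and data to a Gitee repository. It supports batch keyword search, competitor analysis, and market research.
Security recommendations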
Key things to consider before installing or running this skill:
- Credentials: The skill asks for a Gitee personal access token and (optionally) a site login cookie. Only provide a token with the minimal scopes required (projects/repo write) and consider creating a dedicated repository/account for uploads. Avoid providing long-lived credentials you use elsewhere.
- Inspect install scripts: Do not run curl | bash on an untrusted URL. Inspect install.sh, uploader.sh and cron-setup.sh locally before executing. The included install.sh will install system packages, pip/npm modules and attempt to add cron jobs—these require elevated privileges and change system state.
- Cron/persistence: The skill's installer will (by default) add scheduled tasks that run scraping and uploads automatically. If you do not want autonomous recurring scraping, do not install the crontab entries or remove them after install.
- Private data and file permissions: The docs claim token/cookie are stored with 600 permissions, but the install script does not enforce this. After creating the config file, set permissions (chmod 600 ~/.openclaw/workspace/.xianyu-grabber-config.json) and consider restricting who can read the workspace directory.
- Upload destination: The uploader will push screenshots and data to a Gitee repo. Confirm the uploader.sh targets your intended repo/owner and that you trust that destination. If you prefer not to upload, leave uploadToGitee=false or avoid supplying a token.
- Run in isolation for first test: If possible, run the skill in a disposable VM or container so you can observe network activity, filesystem changes, and cron modifications before trusting it on a production host.
- Review truncated/omitted files: The repository included many scripts; some files were truncated in the review. Inspect any additional scripts (uploader.sh, run.sh, update.sh) for unexpected network endpoints, hardcoded URLs, or behaviour before full deployment.
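One way to carry out the endpoint review in the last item is to list every URL literal a script contains before running anything. The following is a minimal sketch, not part of the skill: find_urls and the sample script text are hypothetical, and the regular expression is a simple heuristic that will miss URLs assembled at run time.

```python
import re

# Heuristic pattern: http(s) URLs up to the first whitespace, quote, or closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def find_urls(script_text: str) -> list[str]:
    """Return the distinct URLs in script_text, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in URL_PATTERN.finditer(script_text):
        seen.setdefault(match.group(0), None)
    return list(seen)


if __name__ == "__main__":
    # Hypothetical script content, used only to illustrate the output.
    sample = '''
    curl -sL https://example.com/update.sh -o update.sh
    git push "https://gitee.com/owner/repo.git" main
    '''
    for url in find_urls(sample):
        print(url)
```

Running this over uploader.sh, run.sh and update.sh gives a short list of hosts to compare against the destinations you expect; anything outside that list deserves a closer read.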
Functional analysis
Type: OpenClaw Skill
Name: xianyu-data-grabber
Version: 1.0.0
The skill bundle implements a complex data scraping and reporting tool with several high-risk behaviors. Most notably, 'update.sh' provides a remote code execution mechanism by downloading and overwriting local scripts with code from a Gitee repository. Additionally, 'install.sh' and 'cron-setup.sh' perform system-level modifications, including installing packages via apt-get/yum and establishing persistence through crontab entries. The skill also requires sensitive credentials (GITEE_TOKEN and XIANFU_COOKIE), which are stored in a local JSON file and used in potentially insecure ways, such as embedding the token in a Git URL within 'uploader.sh'. While these features support the stated purpose of an automated scraper, the combination of remote updates, persistence, and high-privilege requirements constitutes a significant security risk.
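The token-in-URL pattern noted above can be checked mechanically. The following is a minimal sketch, assuming remote URLs of the common https://user:token@host/path form; has_embedded_credentials and redact_url are hypothetical helper names, not functions from the skill.

```python
from urllib.parse import urlsplit, urlunsplit


def has_embedded_credentials(url: str) -> bool:
    """True if the URL carries a username or password before the host."""
    parts = urlsplit(url)
    return parts.username is not None or parts.password is not None


def redact_url(url: str) -> str:
    """Return the URL with any userinfo replaced by a placeholder."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"***@{host}", parts.path, parts.query, parts.fragment))
```

A URL that fails this check can leave the token in the repository's Git configuration and in logs, which is why supplying the token through a Git credential helper is generally the safer choice.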
Capability assessment
Purpose & Capability
Name/description match the code: Playwright + Tesseract OCR + uploader scripts implement scraping, OCR, reporting and Gitee upload. One minor metadata inconsistency: the registry 'Required binaries' in the provided metadata shows placeholders ([object Object]) while SKILL.md and the code clearly require node, python3, tesseract and playwright.
Instruction Scope
SKILL.md and the scripts instruct the agent or user to store a Gitee personal access token and an optional Xianyu (闲鱼) login cookie in a local config file and to run scripts that launch headless browsers, take full-page screenshots, perform OCR, and (optionally) upload data and screenshots to Gitee. Those actions are expected for the stated purpose, but the instructions also: (a) ask the user to place sensitive tokens and cookies in a file in the home workspace (install.sh does not enforce any permission setting), (b) include explicit commands to configure cron entries for repeated autonomous scraping and upload, and (c) suggest, in INSTALL.md, piping a remote install.sh via curl | bash (a high-risk pattern). The skill does not instruct reading system credentials or unrelated config files, but the automatic scheduling and remote-install suggestions broaden the operational scope and risk.
Install Mechanism
The package contains an install.sh that performs system package installs (apt-get/yum), pip/npm installs, npx playwright install, and attempts to install cron jobs into the system crontab. INSTALL.md also advertises a curl -sL https://raw.githubusercontent.com/your-username/xianyu-data-grabber/main/install.sh | bash pattern. Although raw.githubusercontent.com is a common release host, piping an external script to bash is risky unless the URL is a verified official release; here the URL uses a placeholder 'your-username' which is ambiguous. The included install.sh will modify system state (packages, crontab) and requires elevated permissions to succeed—this increases risk and requires the user to review the install script carefully before execution.
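A safer workflow is to save install.sh to a local file and read it before running it. As a rough aid for that review, the sketch below flags lines that install packages, touch crontab, pipe a download into a shell, or use sudo. The pattern list and the name flag_risky_lines are assumptions made for illustration; they are not exhaustive and do not replace reading the script.

```python
import re

# Hypothetical checklist of patterns that change system state or run remote code.
RISKY_PATTERNS = {
    "package install": re.compile(r"\b(apt-get|yum)\s+(-\S+\s+)*install\b"),
    "crontab change": re.compile(r"\bcrontab\b"),
    "pipe to shell": re.compile(r"\b(curl|wget)\b.*\|\s*(sudo\s+)?(ba)?sh\b"),
    "privilege escalation": re.compile(r"\bsudo\b"),
}


def flag_risky_lines(script_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, label, line) for each line matching a risky pattern."""
    findings = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(stripped):
                findings.append((number, label, stripped))
    return findings
```

Each finding points at a line to read in context; an empty result only means none of these particular patterns matched.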
Credentials
The skill requests a Gitee token (for repository file create/update) and an optional site login cookie (to improve scraping success). Those credentials are proportionate to the claimed functionality. However, the cookie and token are sensitive; the skill's docs claim storing them with permission 600, but the shipped install.sh does not explicitly set secure permissions on the config file. Also the registry metadata listing of required env vars is malformed (placeholders), which is an inconsistency to be aware of.
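Because the installer does not set permissions itself, the config file can be created with owner-only access from the start. The following is a minimal sketch for POSIX systems; write_private_config and is_private are hypothetical helpers, and the only config key used (uploadToGitee) is the one named in this review.

```python
import json
import os
import stat


def write_private_config(path: str, config: dict) -> None:
    """Write config as JSON to path, readable and writable by the owner only."""
    # Create with mode 0o600 so the file is never world-readable, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    # The mode argument above is ignored if the file already existed, so enforce it.
    os.chmod(path, 0o600)


def is_private(path: str) -> bool:
    """True if no group or other permission bits are set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

For the path mentioned in this review, the call would be write_private_config(os.path.expanduser("~/.openclaw/workspace/.xianyu-grabber-config.json"), {...}), and is_private can be re-run later to confirm that nothing has loosened the mode.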
Persistence & Privilege
The skill itself is not marked always:true, but the provided install scripts (install.sh, cron-setup.sh) create and install cron jobs that will run scraping, report generation and uploads on a schedule (daily/weekly). That gives the skill ongoing system presence and the ability to perform repeated network access and uploads. This level of persistence is expected for a scheduler-based scraper but is a material privilege: if you install, the system will run periodic automated scraping and (if configured) upload to Gitee without further prompts. Review/consent should be explicit before enabling.
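To see whether scheduled jobs were actually added, the current user's crontab can be filtered for entries that mention the skill. The following is a minimal sketch: find_cron_entries and current_user_crontab are hypothetical helpers, and the search term "xianyu" is an assumption about how the installed paths are named.

```python
import subprocess


def find_cron_entries(crontab_text: str, needle: str) -> list[str]:
    """Return active (non-comment) crontab lines that mention needle."""
    matches = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and needle in stripped:
            matches.append(stripped)
    return matches


def current_user_crontab() -> str:
    """Return the output of `crontab -l`, or an empty string if unavailable."""
    try:
        result = subprocess.run(["crontab", "-l"], capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return ""  # crontab is not installed on this host
    return result.stdout if result.returncode == 0 else ""


if __name__ == "__main__":
    # "xianyu" is an assumed search term; adjust it to the paths the installer actually uses.
    for entry in find_cron_entries(current_user_crontab(), "xianyu"):
        print(entry)
```

This covers only the invoking user's crontab; system-wide locations such as /etc/crontab and /etc/cron.d need a separate look.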
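How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install xianyu-data-grabber
- Once installation completes, invoke the Skill by name or trigger it with /xianyu-data-grabber
- Provide the inputs described in the Skill's parameter documentation to get structured output.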
Version history
v1.0.0
Initial release: 60+ keywords, enhanced OCR, visualization, smart recommendations, scheduled tasks, automatic updates
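FAQ
What is Xianyu Data Grabber?
A Xianyu (闲鱼) data-scraping skill. It uses Playwright and OCR to get past anti-scraping measures, collects listing data (title, price, number of interested buyers, and so on), and automatically uploads screenshots and data to a Gitee repository. It supports batch keyword search, competitor analysis, and market research. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 423 times to date.
How do I install Xianyu Data Grabber?
Run the command "/install xianyu-data-grabber" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is Xianyu Data Grabber free?
Yes. Xianyu Data Grabber is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does Xianyu Data Grabber support?
Xianyu Data Grabber is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Xianyu Data Grabber?
It is developed and maintained by beipian261 (@beipian261); the current version is v1.0.0.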