jinwangmok

Gmail Link Archiver

Author: 목진왕 · GitHub · v1.1.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 125
Favorites: 0
Current installs: 1
Versions: 3
Install in OpenClaw
/install gmail-link-archiver
Description
Connects to Gmail via IMAP, filters emails by subject prefix keyword in a specified mailbox, crawls links found in filtered emails using Playwright (to bypas...
Usage (SKILL.md)

Gmail Link Archiver

Archive web content from your email links. This skill connects to Gmail via IMAP, filters emails by a subject prefix keyword, crawls every link using Playwright (headless Chromium), converts pages to Markdown, and saves them to your OpenClaw workspace.

Quick Start

1. Install dependencies (one-time)

bash references/setup.sh

This automatically installs:

  • playwright (Python) + Chromium browser binary
  • html2text for HTML→Markdown conversion

2. First run — interactive setup

python3 references/gmail_link_archiver.py

The first run will prompt you for:

Setting          Description                                     Default
IMAP server      Gmail IMAP host                                 imap.gmail.com
IMAP port        SSL port                                        993
Gmail address    Your full email address                         (none)
App password     Gmail App Password (NOT your regular password)  (none)
Default mailbox  IMAP folder to search                           INBOX
Subject prefix   Filter emails whose subject starts with this    (none)
Workspace path   Where to save Markdown files                    ~/openclaw-workspace/mail-archive

Credentials are saved locally to ~/.config/gmail-link-archiver/config.json with 0600 permissions. They are sent only to the IMAP server and are never logged.
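
The storage behavior described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the path and the 0600/0700 modes come from this document, while `save_config`, `load_config`, and the config keys are hypothetical names.

```python
import json
import os
from pathlib import Path

# Path taken from the text above; function and key names are hypothetical.
DEFAULT_CONFIG_PATH = Path.home() / ".config" / "gmail-link-archiver" / "config.json"


def save_config(config: dict, path: Path = DEFAULT_CONFIG_PATH) -> None:
    """Write the config as JSON, accessible to the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    os.chmod(path.parent, 0o700)  # config directory: owner only
    # Create the file with mode 0600 up front so it is never briefly readable by others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    os.chmod(path, 0o600)  # also tighten a file that already existed


def load_config(path: Path = DEFAULT_CONFIG_PATH) -> dict:
    """Read the saved config back."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Passing the mode to `os.open` rather than calling `chmod` afterwards avoids a short window in which a freshly created file would carry the default umask permissions.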

Gmail App Password: You need to generate an App Password at https://myaccount.google.com/apppasswords (requires 2FA enabled).

3. Subsequent runs

After the first setup, subsequent runs will read credentials from the saved config:

# Use saved config defaults
python3 references/gmail_link_archiver.py

# Override mailbox and prefix on the fly
python3 references/gmail_link_archiver.py --mailbox "INBOX" --subject-prefix "[Newsletter]"

# Save to a different workspace
python3 references/gmail_link_archiver.py --workspace ~/my-archive

# Limit number of links to crawl
python3 references/gmail_link_archiver.py --max-links 10

# Re-run the setup interview
python3 references/gmail_link_archiver.py --reconfigure

How It Works

  1. Connect — Authenticates to Gmail via IMAP SSL
  2. Filter — Searches the specified mailbox for emails matching the subject prefix
  3. Extract — Parses email bodies (HTML + plain text) to find HTTP/HTTPS links
  4. Crawl — Opens each link in headless Chromium via Playwright (bypasses bot detection, renders JavaScript)
  5. Convert — Transforms the crawled HTML into clean Markdown with metadata headers
  6. Save — Writes each Markdown file to the workspace directory
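
Steps 1 and 2 could look roughly like the sketch below, which uses only the standard library. It is an assumption about how such a filter might be written, not the skill's implementation; `fetch_matching`, `quote_imap`, and the other helper names are hypothetical, and the server-side search assumes an ASCII prefix.

```python
import email
import imaplib
from email.header import decode_header, make_header


def decode_subject(raw_subject) -> str:
    """Decode an RFC 2047 encoded Subject header into readable text."""
    return str(make_header(decode_header(raw_subject)))


def subject_has_prefix(raw_subject, prefix: str) -> bool:
    """Case-sensitive check that the decoded subject starts with prefix."""
    return decode_subject(raw_subject).startswith(prefix)


def quote_imap(value: str) -> str:
    """Wrap a value as an IMAP quoted string."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def fetch_matching(host, port, user, app_password, mailbox, prefix):
    """Return raw message bytes for mails whose subject starts with prefix."""
    matches = []
    with imaplib.IMAP4_SSL(host, port) as imap:
        imap.login(user, app_password)
        imap.select(quote_imap(mailbox), readonly=True)
        # SUBJECT is a server-side substring search, so it only narrows the set.
        # Assumption: ASCII prefix; other prefixes would need a CHARSET argument.
        typ, data = imap.search(None, "SUBJECT", quote_imap(prefix))
        if typ != "OK":
            return matches
        for num in data[0].split():
            # BODY.PEEK[] fetches the full message without marking it as read.
            typ, parts = imap.fetch(num, "(BODY.PEEK[])")
            if typ != "OK" or not parts or not isinstance(parts[0], tuple):
                continue
            raw = parts[0][1]
            subject = email.message_from_bytes(raw).get("Subject", "")
            if subject_has_prefix(subject, prefix):
                matches.append(raw)
    return matches
```

The client-side `startswith` check is what turns the server's substring match into a true prefix match, so a reply such as "Re: [ReadLater] ..." would be skipped.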

Pipeline Diagram

Gmail IMAP ──► Filter by Subject ──► Extract Links
                                          │
                                          ▼
                         Playwright + Chromium (headless)
                                          │
                                          ▼
                              HTML → Markdown (html2text)
                                          │
                                          ▼
                           Save to OpenClaw Workspace
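
The middle of the pipeline (extract, crawl, convert) might be sketched as below. The Playwright and html2text calls are the libraries' public sync APIs, but the function names, the `networkidle` wait, and the 30-second timeout are illustrative assumptions rather than the skill's actual settings.

```python
import re
from email import message_from_bytes
from email.policy import default as default_policy
from html.parser import HTMLParser

# Rough URL pattern for plain-text bodies; stops at whitespace, quotes and brackets.
URL_RE = re.compile(r"""https?://[^\s<>"')\]]+""")


class HrefCollector(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def extract_links(raw_message: bytes) -> list:
    """Return unique http(s) links from HTML and plain-text parts, in order."""
    msg = message_from_bytes(raw_message, policy=default_policy)
    found = []
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "text/html":
            collector = HrefCollector()
            collector.feed(part.get_content())
            found.extend(collector.hrefs)
        elif ctype == "text/plain":
            # Drop sentence punctuation that the regex may have swallowed.
            found.extend(u.rstrip(".,;:!?") for u in URL_RE.findall(part.get_content()))
    links = []
    for url in (u.strip() for u in found):
        if url.lower().startswith(("http://", "https://")) and url not in links:
            links.append(url)
    return links


def crawl_to_markdown(url: str, timeout_ms: int = 30000) -> str:
    """Render url in headless Chromium and return the page as Markdown."""
    # Imported lazily so link extraction works without these packages installed.
    import html2text
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            html = page.content()
        finally:
            browser.close()

    converter = html2text.HTML2Text()
    converter.body_width = 0  # do not hard-wrap lines
    return converter.handle(html)
```

A caller would typically cap the result with something like `extract_links(raw)[:max_links]` before crawling, which is one plausible reading of the `--max-links` option.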

CLI Reference

usage: gmail_link_archiver.py [-h] [--mailbox MAILBOX]
                               [--subject-prefix PREFIX]
                               [--workspace PATH]
                               [--max-links N]
                               [--reconfigure]

Options:
  --mailbox, -m        IMAP mailbox to search (default: from config)
  --subject-prefix, -s Subject prefix to filter emails
  --workspace, -w      Directory to save Markdown files
  --max-links          Max number of links to crawl (default: 50)
  --reconfigure        Re-run the setup interview
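
For readers who want to see how this interface maps onto code, here is a minimal `argparse` sketch that reproduces the options listed above. The flag names, short forms, and the default of 50 come from this reference; the function name and the use of `None` to mean "fall back to the saved config" are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(prog="gmail_link_archiver.py")
    # None means "not given on the command line"; the caller would then
    # fall back to the saved config (assumed behavior).
    parser.add_argument("--mailbox", "-m", metavar="MAILBOX", default=None,
                        help="IMAP mailbox to search (default: from config)")
    parser.add_argument("--subject-prefix", "-s", metavar="PREFIX", default=None,
                        help="Subject prefix to filter emails")
    parser.add_argument("--workspace", "-w", metavar="PATH", default=None,
                        help="Directory to save Markdown files")
    parser.add_argument("--max-links", metavar="N", type=int, default=50,
                        help="Max number of links to crawl (default: 50)")
    parser.add_argument("--reconfigure", action="store_true",
                        help="Re-run the setup interview")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```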

Output Format

Each crawled page is saved as a Markdown file with YAML frontmatter:

---
source: https://example.com/article
crawled_at: 2026-03-27T12:00:00Z
---

# Article Title

Article content converted to clean Markdown...

Files are named using a sanitized version of the URL plus a short hash for uniqueness.
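
One way to produce such files is sketched below. The frontmatter fields match the example above; the exact sanitizing rule, the SHA-256 hash, the 8-character digest length, and all function names are assumptions, since this document does not specify them.

```python
import hashlib
import re
from datetime import datetime, timezone
from pathlib import Path


def filename_for(url: str, max_stem: int = 80) -> str:
    """Sanitized URL plus a short hash, e.g. example.com_article_<8 hex>.md."""
    stem = re.sub(r"^https?://", "", url)
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", stem).strip("_")[:max_stem]
    # The hash is taken over the full URL, so URLs that sanitize to the
    # same stem still get different file names.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{stem}_{digest}.md"


def render_document(url: str, markdown: str, crawled_at: datetime) -> str:
    """Prepend the YAML frontmatter block to the converted Markdown."""
    stamp = crawled_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"---\nsource: {url}\ncrawled_at: {stamp}\n---\n\n{markdown.strip()}\n"


def save_page(workspace: Path, url: str, markdown: str) -> Path:
    """Write one crawled page into the workspace and return its path."""
    workspace.mkdir(parents=True, exist_ok=True)
    target = workspace / filename_for(url)
    text = render_document(url, markdown, datetime.now(timezone.utc))
    target.write_text(text, encoding="utf-8")
    return target
```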

Example Usage with Claude

Ask Claude to run the archiver:

"Run the Gmail Link Archiver to crawl links from my emails with subject starting with '[ReadLater]'"

Claude will execute:

python3 references/gmail_link_archiver.py --subject-prefix "[ReadLater]"

Or to set up fresh:

"Set up the Gmail Link Archiver with my credentials"

python3 references/gmail_link_archiver.py --reconfigure

Troubleshooting

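"App password" rejected?

  • Use a Gmail App Password, not your regular Google password
  • Generate one at https://myaccount.google.com/apppasswords (requires 2FA enabled)
  • Make sure IMAP is enabled for the Gmail account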

Playwright/Chromium issues?

# Reinstall Chromium
python3 -m playwright install chromium
# Install system dependencies (Linux)
sudo python3 -m playwright install-deps chromium

No emails found?

  • Check the mailbox name (use INBOX, [Gmail]/All Mail, etc.)
  • Verify the subject prefix matches exactly (case-sensitive)
  • Try a broader prefix
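
If the mailbox name is in doubt, listing the folders the server actually exposes can help. The sketch below is a hypothetical diagnostic helper, not part of the skill; it uses the standard `imaplib` `list()` call and assumes the usual `(flags) "delimiter" name` response shape. Non-ASCII folder names come back in IMAP's modified UTF-7 and are not decoded here.

```python
import imaplib
import re

# Matches LIST response lines such as: (\HasNoChildren) "/" "INBOX"
LIST_RE = re.compile(r'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)')


def parse_list_line(line: bytes) -> str:
    """Extract the mailbox name from one IMAP LIST response line."""
    match = LIST_RE.match(line.decode("utf-8", errors="replace"))
    if match is None:
        raise ValueError(f"unrecognized LIST line: {line!r}")
    return match.group("name").strip().strip('"')


def list_mailboxes(host, port, user, app_password):
    """Log in and return the mailbox names reported by the server."""
    with imaplib.IMAP4_SSL(host, port) as imap:
        imap.login(user, app_password)
        typ, lines = imap.list()
        if typ != "OK":
            raise RuntimeError("LIST command failed")
        # Skip entries that are not plain response lines.
        return [parse_list_line(line) for line in lines if isinstance(line, bytes)]
```

Any name returned this way can be passed to `--mailbox` as is.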

Permission denied on config file?

chmod 600 ~/.config/gmail-link-archiver/config.json

Security

  • Credentials are stored locally at ~/.config/gmail-link-archiver/config.json
  • File permissions are set to 0600 (owner read/write only)
  • Credentials are never transmitted anywhere except to the IMAP server
  • Credentials are never logged or printed to stdout
  • Use Gmail App Passwords (not your main Google password)
  • The config directory has 0700 permissions

Requirements

  • Python 3.8+
  • Linux (Ubuntu/Debian) for MVP
  • Gmail account with IMAP enabled and App Password
  • Internet connection for IMAP and web crawling
Security recommendations
This skill appears to do what it says, but review these points before installing:
  • It requires an IMAP App Password for your Gmail account. Use a Google App Password (not your main account password) and enable 2FA. Consider revoking the App Password when you no longer need the skill.
  • Credentials are stored as JSON at ~/.config/gmail-link-archiver/config.json with 0600 permissions. If you prefer, run the script each time without saving, or store the config in an encrypted location.
  • The crawler opens every extracted link in headless Chromium and executes page JavaScript. That is necessary to capture JS-rendered content, but it causes additional network traffic and may load third-party resources. Run it in an isolated environment if you are concerned about exposure.
  • The setup script creates a Python virtualenv and downloads Playwright and the Chromium binary. Make sure you trust these standard tools and that your disk space and network policy allow the download.
  • For additional assurance, inspect references/gmail_link_archiver.py yourself (it is included) or run it in a sandboxed account or container.
Overall, this skill is internally coherent with its stated purpose. The main residual risks are the local storage of sensitive credentials and the expected effects of automated browsing of arbitrary links.
Capability assessment
Purpose & Capability
Name/description (archive links from Gmail via IMAP and crawl them with Playwright) matches the included files and declared requirements: python3, a setup script that installs Playwright/html2text, and a Python script that logs into IMAP, extracts links, crawls pages, converts to Markdown, and saves files. No unrelated binaries or credentials are requested.
Instruction Scope
SKILL.md and the Python code instruct the agent to collect Gmail IMAP credentials (App Password), read mailboxes, extract links, and open those links in headless Chromium via Playwright. That is within the described scope, but crawling executes remote JavaScript and network requests for each link — a powerful capability that can fetch third-party resources and run remote code in a headless browser, which is expected but potentially impactful.
Install Mechanism
No packaged install spec in registry; installation is via the provided setup.sh which creates a virtualenv and uses pip to install playwright and html2text, then runs 'playwright install chromium' to download browser binaries. This is a standard approach for Playwright but involves downloading and installing browser binaries (network action). The install sources are normal PyPI/playwright tooling and not an arbitrary unknown URL.
Credentials
The skill asks the user for an IMAP host, account, and an App Password (2FA App Password). Those are proportionate to Gmail IMAP access. The script saves credentials in plaintext JSON at ~/.config/gmail-link-archiver/config.json (file mode 0600). Storing an app-specific password locally is convenient but sensitive—no unrelated secrets or system credentials are requested.
Persistence & Privilege
The skill is not marked 'always' and does not attempt to modify other skills or system-wide agent settings. It writes its own config in the user's home config directory and creates a local virtualenv under the references folder for dependencies — expected for a local CLI tool.
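How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install gmail-link-archiver
  3. Once installed, invoke the skill by name or trigger it with /gmail-link-archiver
  4. Provide the inputs described in the skill's parameter documentation to get structured output
Version history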
v1.1.0
  • Added a version field to SKILL.md and updated the documentation to reflect version 1.1.0.
  • Made minor maintenance updates to documentation and source files.
  • No major functional changes to the codebase; primarily a metadata and docs update.
v1.0.1
Fix pip install for PEP 668 externally-managed environments
v1.0.0
Initial release
Metadata
Slug: gmail-link-archiver
Version: 1.1.0
License: MIT-0
Total installs: 1
Current installs: 1
Versions: 3
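FAQ

What is Gmail Link Archiver?

Connects to Gmail via IMAP, filters emails by subject prefix keyword in a specified mailbox, crawls links found in filtered emails using Playwright (to bypas... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 125 total downloads so far.

How do I install Gmail Link Archiver?

Run the command "/install gmail-link-archiver" in the OpenClaw or Claude Code chat box to install it in one step. Installing the skill itself needs no extra configuration; its dependencies and Gmail credentials are then set up as described under Quick Start.

Is Gmail Link Archiver free?

Yes. Gmail Link Archiver is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Gmail Link Archiver support?

Gmail Link Archiver is listed as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed. Note that the Requirements section of SKILL.md names Linux (Ubuntu/Debian) as the MVP target.

Who develops Gmail Link Archiver?

It is developed and maintained by 목진왕 (@jinwangmok). The current version is v1.1.0.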
