/install content-parser
When to Use
- User provides a URL and wants to extract/read its content
- Another skill needs to parse source material from a URL before generation
- User says "parse this URL", "extract content from this link"
- User says "解析链接", "提取内容"
When NOT to Use
- User already has text content and doesn't need URL parsing
- User wants to generate audio/video content (not content extraction)
- User wants to read a local file (use standard file reading tools)
Purpose
Extract and normalize content from URLs across supported platforms. Returns structured data including content body, metadata, and references. Useful as a preprocessing step for content generation skills or standalone content extraction.
Hard Constraints
- No shell scripts. Construct curl commands from the API reference files listed in Resources
- Always read
shared/authentication.mdfor API key and headers - Follow
shared/common-patterns.mdfor polling, errors, and interaction patterns - URL must be a valid HTTP(S) URL
- Always read config following
shared/config-pattern.mdbefore any interaction - Never save files to
~/Downloads/or.listenhub/— save to the current working directory
\x3CHARD-GATE> Use the AskUserQuestion tool for every multiple-choice step — do NOT print options as plain text. Ask one question at a time. Wait for the user's answer before proceeding to the next step. After collecting URL and options, confirm with the user before calling the extraction API. \x3C/HARD-GATE>
Step -1: API Key Check
Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
Step 0: Config Setup
Follow shared/config-pattern.md Step 0.
If file doesn't exist — ask location, then create immediately:
mkdir -p ".listenhub/content-parser"
echo '{"autoDownload":true}' > ".listenhub/content-parser/config.json"
CONFIG_PATH=".listenhub/content-parser/config.json"
# (or $HOME/.listenhub/content-parser/config.json for global)
Then run Setup Flow below.
If file exists — read config, display summary, and confirm:
当前配置 (content-parser):
自动下载:{是 / 否}
Ask: "使用已保存的配置?" → 确认,直接继续 / 重新配置
Setup Flow (first run or reconfigure)
- autoDownload: "自动保存提取的内容到当前目录?"
- "是(推荐)" →
autoDownload: true - "否" →
autoDownload: false
- "是(推荐)" →
Save immediately:
NEW_CONFIG=$(echo "$CONFIG" | jq --argjson dl {true/false} '. + {"autoDownload": $dl}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Interaction Flow
Step 1: URL Input
Free text input. Ask the user:
What URL would you like to extract content from?
Step 2: Options (optional)
Ask if the user wants to configure extraction options:
Question: "Do you want to configure extraction options?"
Options:
- "No, use defaults" — Extract with default settings
- "Yes, configure options" — Set summarize, maxLength, or Twitter tweet count
If "Yes", ask follow-up questions:
- Summarize: "Generate a summary of the content?" (Yes/No)
- Max Length: "Set maximum content length?" (Free text, e.g., "5000")
- Twitter count (only if URL is Twitter/X profile): "How many tweets to fetch?" (1-100, default 20)
Step 3: Confirm & Extract
Summarize:
Ready to extract content:
URL: {url}
Options: {summarize: true, maxLength: 5000, twitter.count: 50} / default
Proceed?
Wait for explicit confirmation before calling the API.
Workflow
-
Validate URL: Must be HTTP(S). Normalize if needed (see
references/supported-platforms.md) -
Build request body:
{ "source": { "type": "url", "uri": "{url}" }, "options": { "summarize": true/false, "maxLength": 5000, "twitter": { "count": 50 } } }Omit
optionsif user chose defaults. -
Submit (foreground):
POST /v1/content/extract→ extracttaskId -
Tell the user extraction is in progress
-
Poll (background): Run the following exact bash command with
run_in_background: trueandtimeout: 300000. Note: status field is.data.status(notprocessStatus), interval is 5s, values areprocessing/completed/failed:TASK_ID="\x3Cid-from-step-3>" for i in $(seq 1 60); do RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \ -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null) STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"') case "$STATUS" in completed) echo "$RESULT"; exit 0 ;; failed) echo "FAILED: $RESULT" >&2; exit 1 ;; *) sleep 5 ;; esac done echo "TIMEOUT" >&2; exit 2 -
When notified, download and present result:
If
autoDownloadistrue:- Write
{taskId}-extracted.mdto the current directory — full extracted content in markdown - Write
{taskId}-extracted.jsonto the current directory — full raw API response data
echo "$CONTENT_MD" > "${TASK_ID}-extracted.md" echo "$RESULT" > "${TASK_ID}-extracted.json"Present:
内容提取完成! 来源:{url} 标题:{metadata.title} 长度:~{character count} 字符 消耗积分:{credits} 已保存到当前目录: {taskId}-extracted.md {taskId}-extracted.json - Write
-
Show a preview of the extracted content (first ~500 chars)
-
Offer to use content in another skill (e.g.
/podcast,/tts)
Estimated time: 10-30 seconds depending on content size and platform.
API Reference
- Content extract:
shared/api-content-extract.md - Supported platforms:
references/supported-platforms.md - Polling:
shared/common-patterns.md§ Async Polling - Error handling:
shared/common-patterns.md§ Error Handling - Config pattern:
shared/config-pattern.md
Example
User: "Parse this article: https://en.wikipedia.org/wiki/Topology"
Agent workflow:
- URL:
https://en.wikipedia.org/wiki/Topology - Options: defaults (omit options)
- Submit extraction
curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "url",
"uri": "https://en.wikipedia.org/wiki/Topology"
}
}'
- Poll until complete:
curl -sS "https://api.marswave.ai/openapi/v1/content/extract/69a7dac700cf95938f86d9bb" \
-H "Authorization: Bearer $LISTENHUB_API_KEY"
- Present extracted content preview and offer next actions.
User: "Extract recent tweets from @elonmusk, get 50 tweets"
Agent workflow:
- URL:
https://x.com/elonmusk - Options:
{"twitter": {"count": 50}} - Submit extraction
curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "url",
"uri": "https://x.com/elonmusk"
},
"options": {
"twitter": {
"count": 50
}
}
}'
- Poll until complete, present results.
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install content-parser - 安装完成后,直接呼叫该 Skill 的名称或使用
/content-parser触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
Content Parser 是什么?
Extract and parse content from URLs. Triggers on: user provides a URL to extract content from, another skill needs to parse source material, "parse this URL"... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 288 次。
如何安装 Content Parser?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install content-parser」即可一键安装,无需额外配置。
Content Parser 是免费的吗?
是的,Content Parser 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。
Content Parser 支持哪些平台?
Content Parser 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 Content Parser?
由 0xFango(@0xfango)开发并维护,当前版本 v0.1.0。