HackerNews Extract

by guoqiao · GitHub ↗ · v0.1.5
Platforms: darwin, linux, win32 · ⚠ suspicious
Downloads: 2730
Stars: 3
Active Installs: 6
Versions: 6
Install in OpenClaw
/install hn-extract
Description
Extract a HackerNews post (article + comments) into single clean Markdown for quick reading or LLM input.
README (SKILL.md)

HackerNews Extract

Extract a HackerNews post (article + comments) into single clean Markdown for quick reading or LLM input.

see Examples

What it does

  • Accepts a HackerNews ID or URL.
  • Downloads the linked article HTML, then cleans and formats it.
  • Fetches the HackerNews post metadata and comments.
  • Outputs a readable combined Markdown file with the original article, threaded comments, and key metadata.
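
As an illustration of the first step, here is a minimal sketch of how a bare ID or an item URL could be normalized to a numeric ID. The function name `parse_hn_id` is hypothetical and is not taken from `hn-extract.py`; the actual script may handle its input differently.

```python
from urllib.parse import parse_qs, urlparse


def parse_hn_id(value: str) -> int:
    """Return the numeric item id from a bare id or a HackerNews item URL.

    Hypothetical helper for illustration only.
    """
    value = value.strip()
    # Case 1: the input is already a bare numeric id.
    if value.isdigit():
        return int(value)
    # Case 2: the input is an item URL; read the "id" query parameter.
    parsed = urlparse(value)
    if parsed.netloc == "news.ycombinator.com":
        ids = parse_qs(parsed.query).get("id", [])
        if ids and ids[0].isdigit():
            return int(ids[0])
    raise ValueError(f"not a HackerNews id or item URL: {value!r}")
```

Both example inputs from the usage section, `46861313` and `https://news.ycombinator.com/item?id=46861313`, resolve to the same integer under this sketch.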

Requirements

  • uv installed and in PATH.

Install

No installation is needed beyond having uv. When you run the script, uv automatically installs its dependencies into a dedicated venv.
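
For readers unfamiliar with how uv resolves dependencies for a single-file script, the sketch below shows the inline metadata header format (PEP 723) that `uv run --script` reads. The dependency names come from the capability analysis on this page; the `requires-python` bound and the function `python_version_label` are assumptions for illustration, not the contents of the real script.

```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10"   # assumed bound, not confirmed by the source
# dependencies = [
#     "requests",
#     "trafilatura",
# ]
# ///
import sys


def python_version_label() -> str:
    """Report the interpreter uv selected for this run (illustrative only)."""
    return f"Python {sys.version_info.major}.{sys.version_info.minor}"


# A real script would import its declared dependencies and do its work here.
print(python_version_label())
```

On the first run uv creates an isolated environment with the listed packages; later runs can reuse it.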

Usage Workflow (Mandatory for Agents)

When an agent is asked to extract a HackerNews post:

  1. Run the script with an output path: uv run --script ${baseDir}/hn-extract.py <input> -o /tmp/hn-<id>.md.
  2. Send ONE combined message: Upload the file and ask the question in the same tool call. Use the message tool (action=send, filePath="/tmp/hn-<id>.md", message="Extraction complete. Do you want me to summarize it?").
  3. Do not output the full text or a summary directly in the chat unless specifically requested.
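
The workflow above can be sketched as a small helper that prepares the command for step 1 and the message parameters for step 2. The helper name `plan_extraction` is hypothetical, and how the parameters reach the message tool depends on the agent runtime; only the parameter names and values are taken from the steps above.

```python
from pathlib import PurePosixPath


def plan_extraction(base_dir: str, hn_input: str, item_id: int):
    """Build the step 1 command and the step 2 message parameters.

    Hypothetical helper; it only assembles values and runs nothing.
    """
    output_path = PurePosixPath("/tmp") / f"hn-{item_id}.md"
    script_path = PurePosixPath(base_dir) / "hn-extract.py"
    # Step 1: argument list suitable for subprocess.run(command, check=True).
    command = ["uv", "run", "--script", str(script_path), hn_input, "-o", str(output_path)]
    # Step 2: one combined message that uploads the file and asks the question.
    message = {
        "action": "send",
        "filePath": str(output_path),
        "message": "Extraction complete. Do you want me to summarize it?",
    }
    return command, message
```

Keeping the file path in one place means the upload in step 2 always refers to the file written in step 1.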

Usage

# run as uv script
uv run --script ${baseDir}/hn-extract.py <hn-id|hn-url|path/to/item.json> [-o path/to/output.md]

# Examples
uv run --script ${baseDir}/hn-extract.py 46861313 -o /tmp/output.md
uv run --script ${baseDir}/hn-extract.py "https://news.ycombinator.com/item?id=46861313"
  • Omit -o to print to stdout.
  • Directories for -o are created automatically.
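
The two output behaviors noted above could be implemented roughly as follows. This is a sketch under the assumption that the script holds the final Markdown as a string; `write_output` is a hypothetical name and not the script's actual function.

```python
import sys
from pathlib import Path
from typing import Optional


def write_output(markdown: str, output: Optional[str] = None) -> None:
    """Write Markdown to a file, or to stdout when no path is given."""
    if output is None:
        # No -o given: print to stdout.
        sys.stdout.write(markdown)
        return
    path = Path(output)
    # Create any missing parent directories before writing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
```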

Notes

  • Retries are enabled for HTTP fetches.
  • Comments are indented by thread depth.
  • Sites that require authentication or block scraping may still fail.
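
To make the depth-based indentation concrete, here is a minimal sketch of rendering a nested comment tree as an indented Markdown list. It assumes each comment is a dict with `author`, `text`, and `children` keys, as in the item responses of the Algolia HN API; the function name `render_comments` and the exact output format are illustrative, not the script's own.

```python
from typing import Any


def render_comments(children: list[dict[str, Any]], depth: int = 0, indent: str = "    ") -> list[str]:
    """Flatten a comment tree into lines indented by thread depth."""
    lines: list[str] = []
    for comment in children:
        author = comment.get("author") or "[deleted]"
        # The text is used as-is here; a real tool would also clean any HTML in it.
        text = (comment.get("text") or "").strip()
        lines.append(f"{indent * depth}- {author}: {text}")
        # Replies are rendered one level deeper than their parent.
        lines.extend(render_comments(comment.get("children") or [], depth + 1, indent))
    return lines
```

Joining the returned lines with newlines yields a nested Markdown list in which each reply sits one indentation level below its parent.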
Usage Guidance
This skill appears to do what it says: fetch HN metadata and the linked article, clean it, and produce markdown. It requires 'uv' and will make outbound HTTP requests to hn.algolia.com and whatever article URLs are linked (expected behavior). The main concern is the registry flag always: true, which makes the skill available in every agent run and increases exposure without an obvious reason. Before installing, consider:
  1. Do you want this skill force-enabled for all agents? If not, ask the publisher to remove always: true or set it to user-invocable only.
  2. Are you comfortable with the skill making outbound requests and writing files (temporary paths like /tmp)?
  3. If you plan to pass local file paths, be aware the script will read those files.
If anything about the always: true setting or network/file behavior worries you, do not install until the author justifies the permanent inclusion or you review/modify the code yourself.
Capability Analysis
Type: OpenClaw Skill
Name: hn-extract
Version: 0.1.5

The skill bundle is benign. The `SKILL.md` provides clear instructions for the AI agent to run the Python script, save output to `/tmp`, and then upload the file with a follow-up question, all aligned with the stated purpose and lacking any prompt injection attempts for malicious actions. The `hn-extract.py` script uses standard Python libraries (`requests`, `trafilatura`) to fetch data from the HackerNews API and linked articles, clean HTML, and format it into Markdown. It performs network requests and file system writes (to the specified output path or `/tmp`), which are necessary for its functionality, without any evidence of data exfiltration to unauthorized endpoints, malicious execution, persistence, or obfuscation.
Capability Assessment
Purpose & Capability
Name/description align with the code and instructions: the script fetches HN metadata from hn.algolia.com, downloads the linked article HTML, cleans it, and writes combined Markdown. The only required binary is 'uv', which the SKILL.md and shebang explain is used to run the script and manage Python deps.
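
For reviewers who want to see what the metadata request amounts to, the sketch below fetches one item from the public Algolia HN API with a simple retry loop. It uses only the standard library, whereas the script itself is reported to use `requests`, so this is a stand-in technique rather than the script's code. The names `item_url`, `http_get_json`, and `fetch_item`, as well as the retry count and backoff, are assumptions.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Any, Callable

ALGOLIA_ITEM_URL = "https://hn.algolia.com/api/v1/items/{item_id}"


def item_url(item_id: int) -> str:
    """Return the Algolia HN API URL for one item."""
    return ALGOLIA_ITEM_URL.format(item_id=item_id)


def http_get_json(url: str, timeout: float = 10.0) -> Any:
    """Perform one GET request and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_item(item_id: int, get: Callable[[str], Any] = http_get_json,
               attempts: int = 3, delay: float = 1.0) -> Any:
    """Fetch an item, retrying on network errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return get(item_url(item_id))
        except (urllib.error.URLError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            time.sleep(delay * (2 ** attempt))
```

The `get` parameter exists so the retry logic can be exercised without network access, for example by passing a function that fails a fixed number of times before returning a value.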
Instruction Scope
SKILL.md gives explicit runtime steps: run the Python script via 'uv', then upload the generated file in a single message tool call and optionally ask whether to summarize. These instructions are scoped to delivering the extracted file to the user/agent. Note: the script fetches arbitrary article URLs (expected for this tool) and can read an input .json file if provided — both are consistent with the stated purpose but mean the skill performs outbound network requests and can read local JSON files passed as inputs.
Install Mechanism
No install spec is provided beyond requiring 'uv' on PATH. The script declares its Python dependencies in header comments (uv will install them into a venv at run time). This is low-risk compared to arbitrary remote installers or embedded binary downloads.
Credentials
The skill requests no environment variables, no credentials, and no config paths. That is appropriate for an extractor that operates with public HN APIs and by fetching public article URLs.
Persistence & Privilege
The registry flag always: true forces the skill to be included in every agent run. The SKILL.md gives no justification for it to be always active; an extraction tool does not normally require permanent inclusion. always: true increases the attack surface and should be questioned or removed unless a clear reason is provided.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install hn-extract
  3. After installation, invoke the skill by name or use /hn-extract
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.5
  • Updated documentation to clarify accepted inputs: now specifies support for only HackerNews ID or URL, removing reference to saved Algolia JSON files.
  • Examples and usage instructions streamlined to match supported input types.
  • Notes section updated to remove mention of article retrieval using trafilatura and Algolia specifics.
v0.1.4
  • No user-facing changes in this version.
  • Internal code updates only; documentation remains unchanged.
v0.1.3
  • No user-facing changes in this release; internal changes only.
  • Documentation, usage, and workflow descriptions remain unchanged.
v0.1.2
  • Clarified workflow instructions for agents: agents must upload the file and send the follow-up message in a single tool call.
  • Updated example link from a single output file to an examples directory.
  • Minor language updates for clarity and consistency in documentation.
v0.1.1
  • Added a mandatory workflow section for agents, specifying script execution, file upload, and user confirmation steps.
  • Clarified that agents should not output the full text or a summary in chat unless requested.
  • Minor clarification to the notes on article fetching with `trafilatura`.
v0.1.0
Initial release of hn-extract.
  • Extracts HackerNews articles and comments into clean Markdown for easy reading or LLM input.
  • Accepts HackerNews post IDs, URLs, or saved Algolia JSON files as input.
  • Scrapes and formats the main article and fetches complete comment threads, combining them with story metadata.
  • Outputs as a readable Markdown file or to stdout.
  • Handles dependencies automatically using `uv`; only `uv` needs to be pre-installed.
  • Supports robust HTTP fetching with retries and creates output directories as needed.
Metadata
Slug hn-extract
Version 0.1.5
License
All-time Installs 6
Active Installs 6
Total Versions 6
Frequently Asked Questions

What is HackerNews Extract?

Extract a HackerNews post (article + comments) into single clean Markdown for quick reading or LLM input. It is an AI Agent Skill for Claude Code / OpenClaw, with 2730 downloads so far.

How do I install HackerNews Extract?

Run "/install hn-extract" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is HackerNews Extract free?

Yes, HackerNews Extract is completely free (open-source). You can download, install and use it at no cost.

Which platforms does HackerNews Extract support?

HackerNews Extract is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux, win32).

Who created HackerNews Extract?

It is built and maintained by guoqiao (@guoqiao); the current version is v0.1.5.
