Telegram PDF Scraper
by koppakanagaharsha-lang · GitHub ↗ · v1.0.0 · MIT-0
104 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install telegram-pdf-scraper
Description
Automatically downloads and organizes PDFs from specified Telegram channels via Telegram Web, creating folders based on message headers.
Usage Guidance
This skill is not clearly malicious, but there are several inconsistencies and operational risks you should consider before installing or running it:
- Documentation vs implementation mismatch: The README says to use your existing Chrome login and claims the skill operates on Telegram 'Document' objects, but the code launches a Playwright browser with its own profile directory ('./openclaw_chrome_profile') and relies on link/text heuristics. Expect to be prompted to log in inside the Playwright-opened window; being logged in under your real Chrome profile is not enough.
- Persistent local profile: The skill writes a browser profile directory (cookies, session data) and downloaded files to disk. If you care about account separation, do not use a machine/profile that contains other sensitive accounts; review or sandbox that folder before reuse.
- Buggy/overbroad URL filter: The code treats any href containing 'http://' or 'https://' as 'malicious' and blocks it. This rule is broad enough that it is likely to block legitimate Telegram downloads, while links that use any other scheme pass through unchecked. The heuristic can therefore both prevent successful downloads and create a false sense of 'safety.'
- No explicit install steps: The code depends on Playwright which normally requires installing browser binaries (e.g., running 'playwright install'). Verify how the platform will install dependencies and whether additional manual steps are required.
- Audit the code before running: If you plan to run it, inspect the repo and consider running it in an isolated environment (VM or throwaway user account) so the created profile and downloads are contained. Confirm the download target directory and profile path are acceptable, and be prepared to delete './openclaw_chrome_profile' after use.
If you want to proceed, consider asking the maintainer to fix the documented behavior (clarify login flow), remove the overly broad URL blocking rule, and add an explicit install guide for Playwright. If you cannot inspect or sandbox the skill, treat it as potentially unsafe for sensitive environments.
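As an illustration of what a narrower replacement for the URL blocking rule might look like, here is a minimal sketch that parses each href and checks it against an allowlist instead of matching substrings. It is not the skill's actual code: the function name `is_allowed_href` and the contents of `ALLOWED_HOSTS` are assumptions chosen for the example, and the right set of hosts would need to be confirmed by auditing what Telegram Web really serves.

```python
from urllib.parse import urlsplit

# Assumption: the only host treated as "internal". Adjust after auditing.
ALLOWED_HOSTS = {"web.telegram.org"}


def is_allowed_href(href: str) -> bool:
    """Return True for relative links and HTTPS links to an allowlisted host."""
    href = href.strip()
    if not href:
        return False
    try:
        parts = urlsplit(href)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme == "blob":
        # blob:https://host/uuid -> validate the embedded origin instead
        return is_allowed_href(parts.path)
    if not parts.scheme and not parts.netloc:
        return True  # relative link, resolves against the current page
    if parts.scheme not in ("https", ""):
        return False  # rejects http:, javascript:, data:, and so on
    return (parts.hostname or "") in ALLOWED_HOSTS
```

Comparing the parsed hostname exactly, rather than searching the string, means a look-alike such as `web.telegram.org.evil.example` is rejected while a same-site link is accepted.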
Capability Analysis
Type: OpenClaw Skill
Name: telegram-pdf-scraper
Version: 1.0.0
The skill automates the downloading and organization of PDF files from Telegram Web using Playwright. It includes a 'Safety Filter' in `main.py` that explicitly blocks external URLs (including all http/https links) to ensure only internal Telegram document objects are processed. The code uses a local persistent Chrome profile to maintain the user's session and implements filename sanitization to prevent filesystem errors or path traversal.
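The filename sanitization mentioned above could take roughly the following shape. This is a hedged sketch, not the code in `main.py`: the helper names `sanitize_name` and `safe_target`, the 150-character limit, and the `"untitled"` fallback are all assumptions made for illustration. It requires Python 3.9+ for `Path.is_relative_to`.

```python
import re
from pathlib import Path

# Path separators, characters invalid on common filesystems, control chars.
_UNSAFE = re.compile(r'[\\/<>:"|?*\x00-\x1f]')


def sanitize_name(raw: str, fallback: str = "untitled") -> str:
    """Turn arbitrary message text into a single safe path component."""
    name = _UNSAFE.sub("_", raw)
    name = name[:150].strip(" .")  # no leading/trailing dots or spaces
    return name or fallback


def safe_target(download_dir: Path, header: str, filename: str) -> Path:
    """Build download_dir/<header>/<filename> and verify it stays inside."""
    base = download_dir.resolve()
    target = (base / sanitize_name(header) / sanitize_name(filename)).resolve()
    if not target.is_relative_to(base):  # second guard against traversal
        raise ValueError(f"refusing to write outside {base}")
    return target
```

For example, `safe_target(Path("downloads"), "Unit 1/2", "../../notes.pdf")` yields a path ending in `Unit 1_2/_.._notes.pdf` under the resolved `downloads` directory, because every separator is replaced before the path is joined.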
Capability Assessment
Purpose & Capability
Overall purpose (scraping PDFs from Telegram Web) aligns with the code using browser automation (Playwright). Requesting no env vars/credentials is appropriate. However, the SKILL.md claims it interacts only with native Telegram Document objects and asks the user to be logged in to Chrome, whereas the code actually uses Playwright's Chromium persistent context and inspects anchor (<a>) elements and message text heuristics rather than explicit Document APIs. This mismatch suggests either sloppy documentation or implementation drift.
Instruction Scope
SKILL.md instructs the user to log into Telegram Web in their Chrome browser, but the code launches a Playwright Chromium instance with a local profile directory ('./openclaw_chrome_profile') and instructs the user to log into that Playwright-opened window. The documentation is therefore misleading. The code reads chat DOM, creates local folders and files, and will persist a browser profile locally. It also uses message text and anchor href/text heuristics (not robust 'Document' object handling) to decide what to download. There are no explicit exfiltration endpoints, but the skill will interact with Telegram Web via an automated browser and write files and a profile to disk.
Install Mechanism
No platform install spec was provided, but a requirements.txt (playwright>=1.40.0) is included. Playwright is a reasonable dependency for browser automation, but installing Playwright typically involves extra steps (downloading browser binaries). The skill does not declare how those steps run. Lack of an explicit install script is a documentation/operational gap that could confuse users or lead to unexpected manual install steps.
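Until the install steps are documented, a small preflight check can at least tell you whether the Python package is importable before the skill runs. The sketch below is an assumption about how one might do that, not part of the skill; the function name `playwright_setup_hints` is hypothetical. It cannot detect whether browser binaries are present, so it always includes the `playwright install chromium` reminder.

```python
import importlib.util


def playwright_setup_hints(module_name: str = "playwright") -> list:
    """Return manual setup steps that still appear to be needed."""
    hints = []
    if importlib.util.find_spec(module_name) is None:
        # The Python package itself is missing.
        hints.append("pip install -r requirements.txt")
    # An importable package does not prove the browser binaries exist,
    # so this reminder is always included.
    hints.append("playwright install chromium")
    return hints


if __name__ == "__main__":
    for step in playwright_setup_hints():
        print("You may need to run:", step)
```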
Credentials
The skill requests no environment variables, credentials, or external tokens — appropriate for a local scraper. It does create and persist a local browser profile directory ('./openclaw_chrome_profile') and writes downloads to the specified download directory; this local filesystem access is proportional to its purpose but is persistent and should be noted by the user.
Persistence & Privilege
The skill persists a browser profile directory in the working folder and saves downloaded files to disk. While 'always' is false and the skill does not modify other system settings, persisting a profile (cookies, sessions) across runs can be sensitive — it stores authentication state locally in './openclaw_chrome_profile'. Users should be aware of that persistent credential-like data.
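If you do not want the stored session to outlive a run, the profile directory can be deleted afterwards. The following is a minimal sketch under the assumption that the profile lives at './openclaw_chrome_profile' relative to the working directory, as reported above; the helper name `remove_profile` is hypothetical.

```python
import shutil
from pathlib import Path

# Path as reported in the analysis; relative to the current working directory.
PROFILE_DIR = Path("./openclaw_chrome_profile")


def remove_profile(profile_dir: Path = PROFILE_DIR) -> bool:
    """Delete the persisted browser profile; return True if it was removed."""
    if not profile_dir.is_dir():
        return False  # nothing to clean up
    shutil.rmtree(profile_dir)  # removes cookies, session data, cache
    return True
```

Deleting the directory means the next run will ask you to log in again. Make sure the automated browser is closed first, since it may still hold files in the profile open.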
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install telegram-pdf-scraper
- After installation, invoke the skill by name or use /telegram-pdf-scraper
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0 (latest)
Frequently Asked Questions
What is Telegram PDF Scraper?
Automatically downloads and organizes PDFs from specified Telegram channels via Telegram Web, creating folders based on message headers. It is an AI Agent Skill for Claude Code / OpenClaw, with 104 downloads so far.
How do I install Telegram PDF Scraper?
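Run "/install telegram-pdf-scraper" in the OpenClaw or Claude Code chat to install the skill. Note that the skill depends on Playwright, which normally also requires downloading browser binaries (e.g., running 'playwright install'); the skill does not document how that step is handled, so additional manual setup may be needed.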
Is Telegram PDF Scraper free?
Yes, Telegram PDF Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Telegram PDF Scraper support?
Telegram PDF Scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Telegram PDF Scraper?
It is built and maintained by koppakanagaharsha-lang (@koppakanagaharsha-lang); the current version is v1.0.0.