/install clickhouse-github-forensics
ClickHouse GitHub Forensics
Query 10+ billion GitHub events for security investigations.
Author: Rufio @ Permiso Security
Use Case: Built during the Trivy supply chain compromise investigation (March 2026)
Quick Start
curl -s "https://play.clickhouse.com/?user=play" \
--data "SELECT ... FROM github_events WHERE ... FORMAT PrettyCompact"
- Endpoint:
https://play.clickhouse.com/?user=play - Table:
github_events - Auth: None required (public read-only)
- Freshness: Near real-time (~minutes behind)
- Volume: 10+ billion events
Key Columns
| Column | Type | Use |
|---|---|---|
created_at |
DateTime | Event timestamp |
event_type |
Enum | PushEvent, CreateEvent, DeleteEvent, ReleaseEvent, etc. |
actor_login |
String | GitHub username |
repo_name |
String | owner/repo format |
ref |
String | Branch/tag name (e.g., refs/heads/main, 0.33.0) |
ref_type |
Enum | branch, tag, repository, none |
action |
Enum | published, created, opened, closed, etc. |
For full schema (29 columns): see references/schema.md
Common Investigation Patterns
1. Actor Timeline (Who did what, when?)
SELECT created_at, event_type, repo_name, ref, action
FROM github_events
WHERE actor_login = 'TARGET_ACCOUNT'
AND created_at >= '2026-03-01'
ORDER BY created_at
2. Repo Activity Window (What happened during incident?)
SELECT created_at, event_type, actor_login, ref, ref_type, action
FROM github_events
WHERE repo_name = 'owner/repo'
AND created_at >= 'START_TIME'
AND created_at \x3C= 'END_TIME'
ORDER BY created_at
3. Anomaly Detection (First-time repo access)
SELECT repo_name,
countIf(created_at \x3C 'ATTACK_DATE') as before,
countIf(created_at >= 'ATTACK_DATE') as during
FROM github_events
WHERE actor_login = 'SUSPECT_ACCOUNT'
AND created_at >= 'LOOKBACK_START'
GROUP BY repo_name
ORDER BY during DESC
4. Tag/Release Tampering
SELECT created_at, event_type, actor_login, ref, ref_type
FROM github_events
WHERE repo_name = 'owner/repo'
AND event_type IN ('CreateEvent', 'DeleteEvent', 'ReleaseEvent')
AND ref_type = 'tag'
ORDER BY created_at
5. Actor Profile (Is this account legitimate?)
SELECT toStartOfMonth(created_at) as month,
count() as events,
uniqExact(repo_name) as unique_repos
FROM github_events
WHERE actor_login = 'TARGET_ACCOUNT'
GROUP BY month
ORDER BY month
6. Org-Wide Activity (All repos in an org)
SELECT created_at, event_type, actor_login, repo_name, ref
FROM github_events
WHERE repo_name LIKE 'orgname/%'
AND created_at >= 'START_TIME'
ORDER BY created_at
7. New Accounts During Incident (Potential attacker alts)
SELECT actor_login, min(created_at) as first_ever, count() as events
FROM github_events
WHERE repo_name LIKE 'orgname/%'
GROUP BY actor_login
HAVING first_ever >= 'INCIDENT_START' AND first_ever \x3C= 'INCIDENT_END'
ORDER BY first_ever
8. Hourly Breakdown (Attack timeline)
SELECT toStartOfHour(created_at) as hour,
actor_login,
count() as events,
groupArray(distinct repo_name) as repos,
groupArray(distinct event_type) as types
FROM github_events
WHERE repo_name LIKE 'orgname/%'
AND created_at >= 'START_TIME'
GROUP BY hour, actor_login
ORDER BY hour
Event Types Reference
| Event | Significance |
|---|---|
PushEvent |
Code pushed to branch |
CreateEvent |
Branch/tag/repo created |
DeleteEvent |
Branch/tag deleted |
ReleaseEvent |
Release published/edited |
PullRequestEvent |
PR opened/closed/merged |
IssueCommentEvent |
Comment on issue |
ForkEvent |
Repo forked |
WatchEvent |
Repo starred |
Tips
- Output formats:
FORMAT PrettyCompactfor tables,FORMAT TabSeparatedfor parsing - macOS curl: Use
--datanot-dfor multi-line queries - Timestamps: Use UTC, format
YYYY-MM-DD HH:MM:SS - No payload JSON: Raw event payloads aren't available; use structured columns
- Bot accounts: Filter with
actor_login NOT IN ('github-actions[bot]', 'dependabot[bot]')
Security & Privacy
- Uses ClickHouse's public playground — all queries sent to
play.clickhouse.com - Data queried is GitHub's public event stream only
- No private repo data, credentials, or sensitive information is accessible
- Use responsibly: GitHub ToS prohibits scraping for spam or harassment
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install clickhouse-github-forensics - After installation, invoke the skill by name or use
/clickhouse-github-forensics - Provide required inputs per the skill's parameter spec and get structured output
What is ClickHouse GitHub Forensics?
Query GitHub event data via ClickHouse for supply chain investigations, actor profiling, and anomaly detection. Use when investigating GitHub-based attacks,... It is an AI Agent Skill for Claude Code / OpenClaw, with 163 downloads so far.
How do I install ClickHouse GitHub Forensics?
Run "/install clickhouse-github-forensics" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is ClickHouse GitHub Forensics free?
Yes, ClickHouse GitHub Forensics is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does ClickHouse GitHub Forensics support?
ClickHouse GitHub Forensics is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).
Who created ClickHouse GitHub Forensics?
It is built and maintained by 1aN0rmus (@1an0rmus); the current version is v1.0.0.