Description

Diagnose and fix excessive Postgres egress (network data transfer) in a codebase. Use when a user mentions high database bills, unexpected data transfer cost...

Usage instructions (SKILL.md)

Postgres Egress Optimizer

Name: Neon Postgres Egress Optimizer
Author: andrelandgraf

Guide the user through diagnosing and fixing application-side query patterns that cause excessive data transfer (egress) from their Postgres database. Most high egress bills come from the application fetching more data than it uses.

Step 1: Diagnose

Identify which queries transfer the most data. The primary tool is the pg_stat_statements extension.

Check if pg_stat_statements is available

SELECT 1 FROM pg_stat_statements LIMIT 1;

If this errors, the extension needs to be created:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

On Neon, pg_stat_statements is available by default, but you may still need to run this CREATE EXTENSION step in the database you are inspecting.

Handle empty stats

Stats are cleared when a Neon compute scales to zero and restarts. If the stats are empty or the compute recently woke up:

Reset the stats to start a clean measurement window: SELECT pg_stat_statements_reset();
Let the application run under representative traffic for at least an hour.
Return and run the diagnostic queries below.

If the user has stats from a production database, use those. If they have no access to production stats, proceed to Step 2 and analyze the codebase directly — code-level patterns are often sufficient to identify the worst offenders.

Diagnostic queries

Run these to identify the top egress contributors. Focus on queries that return many rows, return wide rows (JSONB, TEXT, BYTEA columns), or are called very frequently.

Queries returning the most total rows:

SELECT query, calls, rows AS total_rows, round(rows::numeric / calls, 1) AS avg_rows_per_call
FROM pg_stat_statements
WHERE calls > 0
ORDER BY rows DESC
LIMIT 10;

Queries returning the most rows per execution (poorly scoped SELECTs, missing pagination):

SELECT query, calls, rows AS total_rows, round(rows::numeric / calls, 1) AS avg_rows_per_call
FROM pg_stat_statements
WHERE calls > 0
ORDER BY avg_rows_per_call DESC
LIMIT 10;

Most frequently called queries (candidates for caching):

SELECT query, calls, rows AS total_rows, round(rows::numeric / calls, 1) AS avg_rows_per_call
FROM pg_stat_statements
WHERE calls > 0
ORDER BY calls DESC
LIMIT 10;

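Queries with the highest total execution time (not a direct egress measure, but helps identify problem queries during a spike). On Postgres 12 and earlier, the column is named total_time instead of total_exec_time: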

SELECT query, calls, rows AS total_rows,
  round(total_exec_time::numeric, 2) AS total_exec_time_ms
FROM pg_stat_statements
WHERE calls > 0
ORDER BY total_exec_time DESC
LIMIT 10;

Interpret the results

Rank findings by estimated egress impact:

High row count + wide rows = biggest egress. A query returning 1,000 rows where each row includes a 50KB JSONB column transfers ~50MB per call.
Extreme call frequency on even small queries adds up. A query called 50,000 times/day returning 10 rows each = 500,000 rows/day.
Cross-reference with the schema to identify which columns are wide. Look for JSONB, TEXT, BYTEA, and large VARCHAR columns.
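The ranking above can be sketched as a small calculation. This is only a rough illustration: pg_stat_statements reports rows, not bytes, so the `avg_row_bytes` field is an assumption you would estimate by hand from the schema. The `QueryStat` class and both function names are hypothetical helpers, and sizes use decimal units to match the approximate figures in the text.

```python
from dataclasses import dataclass
from typing import List

KB = 1_000      # decimal units, matching the rough figures in the text
MB = 1_000_000


@dataclass
class QueryStat:
    """One pg_stat_statements row plus a hand-estimated row width."""
    query: str
    calls: int
    total_rows: int
    avg_row_bytes: int  # assumption: estimated from the schema, not measured


def estimated_egress_bytes(stat: QueryStat) -> int:
    """Rough egress estimate: rows returned times estimated bytes per row."""
    return stat.total_rows * stat.avg_row_bytes


def rank_by_egress(stats: List[QueryStat]) -> List[QueryStat]:
    """Sort queries so the largest estimated egress comes first."""
    return sorted(stats, key=estimated_egress_bytes, reverse=True)


stats = [
    # 50,000 calls/day returning 10 narrow rows each (row width is a guess)
    QueryStat("SELECT id, name FROM categories", calls=50_000,
              total_rows=500_000, avg_row_bytes=40),
    # a single call returning 1,000 rows that each carry a 50KB JSONB column
    QueryStat("SELECT * FROM products", calls=1,
              total_rows=1_000, avg_row_bytes=50 * KB),
]

for s in rank_by_egress(stats):
    print(f"{estimated_egress_bytes(s) / MB:8.1f} MB  {s.query}")
```

Under these assumed widths, the single wide query outweighs the frequent narrow one, which is why row width needs to be considered alongside row counts.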

Step 2: Analyze codebase

For each query identified in Step 1, or for each database query in the codebase if no stats are available, check:

Does it select only the columns the response needs?
Does it return a bounded number of rows (LIMIT/pagination)?
Is it called frequently enough to benefit from caching?
Does it fetch raw data that gets aggregated in application code?
Does it use a JOIN that duplicates parent data across child rows?
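When no stats are available, a crude text scan can help shortlist queries for the first two checks. The sketch below is a heuristic only, not a SQL parser: the `flag_query` function and its finding labels are hypothetical, and it will flag legitimate queries too (for example, single-row lookups by primary key or aggregates that need no LIMIT), so every hit still needs a manual review.

```python
import re
from typing import List

SELECT_STAR = re.compile(r"\bselect\s+\*", re.IGNORECASE)
SELECT_STMT = re.compile(r"\bselect\b", re.IGNORECASE)
LIMIT_CLAUSE = re.compile(r"\blimit\b", re.IGNORECASE)


def flag_query(sql: str) -> List[str]:
    """Return heuristic findings for one SQL string found in the codebase."""
    findings = []
    if SELECT_STAR.search(sql):
        findings.append("selects all columns")
    if SELECT_STMT.search(sql) and not LIMIT_CLAUSE.search(sql):
        findings.append("no LIMIT")
    return findings


for sql in [
    "SELECT * FROM products",
    "SELECT id, name, price FROM products ORDER BY id LIMIT 50 OFFSET 0",
]:
    print(flag_query(sql), "<-", sql)
```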

Step 3: Fix

Apply the appropriate fix for each problem found. Below are the most common egress anti-patterns and how to fix them.

Unused columns (SELECT *)

Problem: The query fetches all columns but the application only uses a few. Large columns (JSONB blobs, TEXT fields) get transferred over the wire and discarded.

Before:

SELECT * FROM products;

After:

SELECT id, name, price, image_urls FROM products;

Missing pagination

Problem: A list endpoint returns all rows with no LIMIT. This is an unbounded egress risk — every new row in the table increases data transfer on every request. Flag this regardless of current table size.

This is easy to miss because the application may work fine with small datasets. At scale the cost grows quickly. For example, 10,000 rows at roughly 200 bytes each is about 2 MB per request, so an unpaginated endpoint called 100 times a day transfers about 200 MB per day.

Before:

SELECT id, name, price FROM products;

After:

SELECT id, name, price FROM products
ORDER BY id
LIMIT 50 OFFSET 0;

When adding pagination, check whether the consuming client already supports paginated responses. If not, pick sensible defaults and document the pagination parameters in the API.
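One way to pick sensible defaults is to clamp whatever the client sends before it reaches the query. The sketch below assumes hypothetical `page` and `page_size` request parameters and a hypothetical upper bound of 200 rows; the `%s` placeholders follow the style of common Python Postgres drivers and may differ in yours.

```python
from typing import Optional, Tuple

DEFAULT_PAGE_SIZE = 50   # matches the LIMIT 50 in the example above
MAX_PAGE_SIZE = 200      # hypothetical upper bound; tune per endpoint


def page_to_limit_offset(page: Optional[int] = None,
                         page_size: Optional[int] = None) -> Tuple[int, int]:
    """Turn client-supplied paging parameters into a bounded (limit, offset)."""
    size = DEFAULT_PAGE_SIZE if page_size is None else page_size
    size = max(1, min(size, MAX_PAGE_SIZE))       # never unbounded, never below 1
    page_number = 1 if page is None else max(1, page)
    return size, (page_number - 1) * size


limit, offset = page_to_limit_offset(page=3)
sql = "SELECT id, name, price FROM products ORDER BY id LIMIT %s OFFSET %s"
params = (limit, offset)   # pass these as query parameters, not via string formatting
print(sql, params)
```

Clamping on the server means a client that omits the parameters, or asks for a very large page, still receives a bounded result.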

High-frequency queries on static data

Problem: A query is called thousands of times per day but returns data that rarely changes. Every call transfers the same rows from the database. This pattern is only visible from pg_stat_statements — the code itself looks normal.

Look for queries with extremely high call counts relative to other queries. Common examples: configuration tables, category lists, feature flags, user role definitions.

Fix: Add a caching layer between the application and the database so it avoids hitting the database on every request.
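The caching layer can take many forms (an in-process cache, a shared cache service, HTTP caching). As a minimal sketch of the in-process variant, the hypothetical `TTLCache` class below reloads each key at most once per time-to-live window; `load_categories` stands in for a real database query, and the 300-second TTL is an arbitrary assumption.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache: each key is reloaded at most once per TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, load: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # fresh: no database round trip
        value = load()                 # missing or stale: query the database once
        self._entries[key] = (now, value)
        return value


db_calls = 0


def load_categories():
    """Hypothetical loader standing in for a real database query."""
    global db_calls
    db_calls += 1
    return ["books", "games", "music"]


cache = TTLCache(ttl_seconds=300)
for _ in range(1_000):
    categories = cache.get("categories", load_categories)
print("database calls:", db_calls)
```

The trade-off is staleness: a change to the underlying table can take up to one TTL to become visible, so choose the TTL based on how quickly updates need to appear.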

Application-side aggregation

Problem: The application fetches all rows from a table and then computes aggregates (averages, counts, sums, groupings) in application code. The full dataset transfers over the wire even though the result is a small summary.

Fix: Push the aggregation into SQL.

Before: The application fetches entire tables and aggregates in code with loops or .reduce().

After:

SELECT p.category_id,
       AVG(r.rating) AS avg_rating,
       COUNT(r.id) AS review_count
FROM reviews r
INNER JOIN products p ON r.product_id = p.id
GROUP BY p.category_id;
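To make the difference concrete, the sketch below runs both approaches against an in-memory SQLite database. SQLite is used only as a stand-in so the example is self-contained; the table contents are made up, and against Postgres the same comparison applies to rows sent over the network.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE reviews (id INTEGER PRIMARY KEY, product_id INTEGER, rating INTEGER);
    INSERT INTO products VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO reviews VALUES (1, 1, 5), (2, 1, 3), (3, 2, 4), (4, 3, 2);
""")

# Before: every review row is fetched, then aggregated in application code.
raw = conn.execute(
    "SELECT p.category_id, r.rating FROM reviews r "
    "INNER JOIN products p ON r.product_id = p.id"
).fetchall()
totals = {}
for category_id, rating in raw:
    total, count = totals.get(category_id, (0, 0))
    totals[category_id] = (total + rating, count + 1)
in_app = {cid: (total / count, count) for cid, (total, count) in totals.items()}

# After: the database aggregates and returns one row per category.
summary = conn.execute(
    "SELECT p.category_id, AVG(r.rating), COUNT(r.id) FROM reviews r "
    "INNER JOIN products p ON r.product_id = p.id "
    "GROUP BY p.category_id"
).fetchall()
in_sql = {cid: (avg, count) for cid, avg, count in summary}

print(len(raw), "rows fetched before;", len(summary), "rows fetched after")
```

Both approaches produce the same averages and counts, but the second returns one row per category rather than one row per review, so the gap widens as the reviews table grows.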

JOIN duplication

Problem: A JOIN between a wide parent table and a child table duplicates all parent columns across every child row. If a product has 200 reviews and the product row includes a 50KB JSONB column, the join sends that 50KB × 200 = ~10MB for a single request.

This is distinct from the SELECT * problem. Even if you select only needed columns, a JOIN still repeats the parent data for every child row. The fix is structural: avoid the join entirely.

Before:

SELECT * FROM products
LEFT JOIN reviews ON reviews.product_id = products.id
WHERE products.id = 1;

After (two separate queries):

SELECT id, name, price, description, image_urls FROM products WHERE id = 1;
SELECT id, user_name, rating, body FROM reviews WHERE product_id = 1;

Two queries instead of one JOIN. The product data is fetched once. The reviews are fetched once. No duplication.
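The saving can be estimated with simple arithmetic. In the sketch below, the 50KB parent row and 200 reviews come from the example above; the 500-byte review row is an assumed figure, the function names are hypothetical, and the estimate ignores protocol overhead and assumes at least one child row.

```python
KB = 1_000      # decimal units, matching the rough figures in the text
MB = 1_000_000


def join_bytes(parent_row_bytes: int, child_row_bytes: int, child_count: int) -> int:
    """Bytes sent when every child row carries a full copy of the parent row."""
    return (parent_row_bytes + child_row_bytes) * child_count


def split_bytes(parent_row_bytes: int, child_row_bytes: int, child_count: int) -> int:
    """Bytes sent when the parent is fetched once and the children separately."""
    return parent_row_bytes + child_row_bytes * child_count


joined = join_bytes(50 * KB, 500, 200)   # 500-byte review row is an assumption
split = split_bytes(50 * KB, 500, 200)

print(f"JOIN:        {joined / MB:.2f} MB")
print(f"two queries: {split / MB:.2f} MB")
```

The extra round trip for the second query is usually a small price compared with repeating the wide parent row hundreds of times.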

Step 4: Verify

After applying fixes:

Run existing tests to confirm nothing broke.
Check the responses — make sure the API still returns the same data shape. Column selection and pagination changes can break clients that depend on specific fields or full result sets.
Measure the improvement — if pg_stat_statements data is available, reset it (SELECT pg_stat_statements_reset();), let traffic run, then re-run the diagnostic queries to compare before and after.
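Because the before and after measurement windows rarely see the same amount of traffic, it can help to normalize by call count when comparing them. The sketch below uses made-up snapshot numbers and hypothetical helper names; it compares rows per call, which is only a proxy for egress, since pg_stat_statements does not report bytes.

```python
from typing import List, Tuple

Snapshot = List[Tuple[int, int]]  # (calls, rows) for each pg_stat_statements entry


def rows_per_call(snapshot: Snapshot) -> float:
    """Average rows returned per call across all queries in a snapshot."""
    total_calls = sum(calls for calls, _ in snapshot)
    total_rows = sum(rows for _, rows in snapshot)
    return total_rows / total_calls if total_calls else 0.0


def percent_reduction(before: Snapshot, after: Snapshot) -> float:
    """Percentage drop in rows per call from the first snapshot to the second."""
    base = rows_per_call(before)
    if base == 0:
        return 0.0
    return (base - rows_per_call(after)) / base * 100


# Hypothetical numbers: an unpaginated list query dominates the first window.
before = [(100, 1_000_000), (900, 9_000)]
after = [(120, 6_000), (1_080, 10_800)]

print(f"rows per call: {rows_per_call(before):.1f} -> {rows_per_call(after):.1f}")
print(f"reduction: {percent_reduction(before, after):.1f}%")
```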

This skill appears coherent for diagnosing Postgres egress, but exercise operational caution before running its SQL on production:
- Prefer providing read-only or monitoring replicas instead of full-production writable credentials.
- CREATE EXTENSION may require elevated privileges and may be disallowed in managed DBs; ask your DB admin first.
- pg_stat_statements_reset() clears historic statistics (it will disrupt existing monitoring windows); avoid running it if you need existing stats.
- If you cannot or will not share DB credentials, you can supply query outputs or anonymized samples instead, or let the skill focus on codebase analysis only.
- Be careful not to expose secrets when sharing code or schema; consider masking sensitive fields.

Overall this skill is internally consistent, but protect production systems and credentials when following its instructions.

Functional analysis

Type: OpenClaw Skill
Name: neon-postgres-egress-optimizer
Version: 1.0.0

The skill bundle provides standard diagnostic SQL queries and architectural advice for optimizing PostgreSQL network egress costs. It uses the legitimate 'pg_stat_statements' extension to identify inefficient query patterns and suggests common fixes like pagination and column selection, with no evidence of malicious intent or data exfiltration.

Capability assessment

✓ Purpose & Capability

The name/description (Postgres egress diagnosis & fixes) match the instructions: queries against pg_stat_statements, schema/code inspection, and query-level fixes. The skill does not ask for unrelated binaries, services, or environment variables.

ℹ Instruction Scope

The SKILL.md focuses on reading pg_stat_statements and analyzing code/query patterns, which is appropriate. It also suggests operations that change database state (CREATE EXTENSION IF NOT EXISTS pg_stat_statements; SELECT pg_stat_statements_reset();). Those are reasonable for gathering diagnostics but are state-changing and may require elevated privileges or affect monitoring — so they warrant caution before running on production systems.

✓ Install Mechanism

No install spec or downloaded code — instruction-only skill. Nothing is written to disk or installed by the skill itself.

✓ Credentials

The skill declares no required environment variables or credentials. Its operations require DB access in practice (the user or operator would need to supply connection credentials), which is proportional to the task. No unrelated secrets are requested by the skill metadata or instructions.

✓ Persistence & Privilege

The skill is not always-enabled, does not request persistent system presence, and does not modify other skills or system-wide agent settings. Autonomous invocation is allowed by platform default but not excessive for this purpose.

Version history

v1.0.0

- Initial release of neon-postgres-egress-optimizer.
- Guides users through diagnosing and resolving excessive Postgres egress (network transfer) issues.
- Provides step-by-step instructions using `pg_stat_statements` to identify high-egress queries.
- Details common anti-patterns (SELECT *, missing pagination, unnecessary JOINs) and offers practical fixes.
- Includes advice on codebase review and verification steps to ensure optimizations are successful.
- References official Neon documentation for further reading.

Metadata

Slug neon-postgres-egress-optimizer

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is Neon Postgres Egress Optimizer?

Diagnose and fix excessive Postgres egress (network data transfer) in a codebase. Use when a user mentions high database bills, unexpected data transfer cost... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 172 total downloads to date.

How do I install Neon Postgres Egress Optimizer?

Run the command "/install neon-postgres-egress-optimizer" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Neon Postgres Egress Optimizer free?

Yes. Neon Postgres Egress Optimizer is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Neon Postgres Egress Optimizer support?

Neon Postgres Egress Optimizer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Neon Postgres Egress Optimizer?

It is developed and maintained by Andre Landgraf (@andrelandgraf); the current version is v1.0.0.
