功能描述

Analyzes SQL queries and EXPLAIN outputs to identify performance issues, explain causes, and provide specific optimization suggestions with examples across m...

使用说明 (SKILL.md)

SQL Profiler ⚡

Name: SQL Profiler
Author: nerikko

Overview

The sql-profiler skill helps data engineers and developers identify and optimize performance bottlenecks in their SQL queries. It provides in-depth analysis of query plans, suggests specific optimizations, and explains complex database concepts in plain English, supporting various SQL dialects including Databricks SQL, PostgreSQL, Spark SQL, and ANSI SQL.

Features

Query Analysis: Detects common performance anti-patterns like missing indexes, full table scans, N+1 problems, inefficient joins, and improper use of functions.
EXPLAIN/EXPLAIN ANALYZE Interpretation: Translates verbose query plan outputs into actionable insights, highlighting the most expensive operations.
Optimization Suggestions: Offers concrete, actionable recommendations for query rewrites, index creation, partitioning strategies, and other performance enhancements.
Before/After Examples: Provides clear examples of inefficient queries and their optimized counterparts.
Multi-Dialect Support: Understands and profiles SQL for:
- Databricks SQL
- PostgreSQL
- Spark SQL
- ANSI SQL (general standard)
Performance Impact Estimation: Where possible, estimates the potential performance gains from applying suggested optimizations.

Usage

The sql-profiler skill is invoked with the /sql-profiler command, followed by a subcommand and relevant arguments.

`analyze` - Analyze a SQL query for performance issues

Analyzes a given SQL query and identifies potential performance bottlenecks.

Syntax: /sql-profiler analyze --query "SELECT * FROM my_table WHERE ..." [--dialect postgresql|databricks|sparksql|ansi] [--explain-output "ACTUAL EXPLAIN OUTPUT HERE"]

Arguments:

--query (required): The SQL query string to analyze.
--dialect (optional): The SQL dialect. Supported: postgresql, databricks, sparksql, ansi. Defaults to ansi if not specified.
--explain-output (optional): The output from EXPLAIN or EXPLAIN ANALYZE for the query. Providing this significantly improves the accuracy and depth of the analysis.

Example:

/sql-profiler analyze --query "SELECT customer_name, SUM(order_total) FROM orders GROUP BY customer_name ORDER BY SUM(order_total) DESC LIMIT 10;" --dialect postgresql

Example with EXPLAIN output:

/sql-profiler analyze --query "SELECT * FROM large_table WHERE created_at \x3C '2023-01-01' AND status = 'active';" --dialect databricks --explain-output "== Physical Plan ==
*(1) Project [id#123, created_at#124, status#125]
+- *(1) Filter (isnotnull(created_at#124) AND (created_at#124 \x3C 2023-01-01) AND isnotnull(status#125) AND (status#125 = active))
   +- *(1) FileScan csv [id#123, created_at#124, status#125] Batched: false, DataFilters: [isnotnull(created_at#124), (created_at#124 \x3C 2023-01-01), isnotnull(status#125), (status#125 = active)], Format: CSV, Location: InMemoryFileIndex[dbfs:/user/hive/warehouse/large_table], PartitionFilters: [], PushedFilters: [IsNotNull(created_at), LessThan(created_at,2023-01-01), IsNotNull(status), EqualTo(status,active)], ReadSchema: struct\x3Cid:string,created_at:timestamp,status:string>
"

Output: The command will return a detailed analysis including:

Identified Issues: A list of potential performance problems.
Explanation: A plain English explanation of why each issue is a problem.
Suggestions: Specific recommendations for optimizing the query.
Before/After: Code examples demonstrating the original and optimized query (if applicable).
Estimated Impact: A qualitative or quantitative estimate of performance improvement.

`explain-plan` - Interpret EXPLAIN/EXPLAIN ANALYZE output

Provides a human-readable interpretation of a raw EXPLAIN or EXPLAIN ANALYZE output.

Syntax: /sql-profiler explain-plan --output "RAW EXPLAIN OUTPUT HERE" [--dialect postgresql|databricks|sparksql|ansi]

Arguments:

--output (required): The full text output from an EXPLAIN or EXPLAIN ANALYZE command.
--dialect (optional): The SQL dialect the EXPLAIN output belongs to. Defaults to ansi.

Example:

/sql-profiler explain-plan --output "Aggregate  (cost=250.75..250.76 rows=1 width=36) (actual time=0.089..0.089 rows=1 loops=1)
  ->  Sort  (cost=250.75..250.76 rows=1 width=36) (actual time=0.088..0.088 rows=1 loops=1)
        Sort Key: (sum(orders.order_total)) DESC
        Sort Method: quicksort  Memory: 25kB
        ->  HashAggregate  (cost=250.72..250.73 rows=1 width=36) (actual time=0.082..0.082 rows=1 loops=1)
              Group Key: orders.customer_name
              Batches: 1  Memory Usage: 24kB
              ->  Seq Scan on orders  (cost=0.00..200.00 rows=10000 width=16) (actual time=0.003..0.024 rows=10000 loops=1)
" --dialect postgresql

Output: A plain English breakdown of the query plan, highlighting:

The most expensive operations (e.g., full table scans, sorts, hash joins).
Why these operations are costly.
Recommendations to reduce their impact.

`optimize` - Get specific optimization suggestions for a query

Directly asks for optimization suggestions for a query without full analysis, assuming common issues.

Syntax: /sql-profiler optimize --query "SELECT * FROM customers WHERE region = 'EMEA';" [--dialect postgresql|databricks|sparksql|ansi]

Arguments:

--query (required): The SQL query to optimize.
--dialect (optional): The SQL dialect. Defaults to ansi.

Example:

/sql-profiler optimize --query "SELECT o.order_id, c.customer_name FROM orders o JOIN customers c ON o.customer_id = c.customer_id WHERE c.registration_date \x3C '2022-01-01';" --dialect sparksql

Output:

Suggestions: A list of direct optimization tips, potentially including index recommendations, join order changes, or query rewrites.
Before/After: Code examples for rewrites.
Rationale: Brief explanation of why the suggestion improves performance.

Core Concepts Explained

Missing Indexes

Problem: When a database needs to find specific rows based on conditions in WHERE clauses, JOIN conditions, or ORDER BY clauses, it often performs a "full table scan" (reads every row). This is slow for large tables. Solution: Creating an index on the columns used in these conditions allows the database to quickly jump to the relevant rows, similar to using an index in a book. Impact: Can dramatically reduce query execution time, especially for large tables and selective queries.

Full Table Scans

Problem: Reading every single row in a table to find a small subset of data. This is inefficient. Solution: Often solved by adding appropriate indexes, partitioning large tables, or rewriting queries to filter earlier. Impact: Avoids unnecessary I/O and CPU usage, leading to faster queries.

N+1 Problem

Problem: Occurs when an application executes N additional queries for each result of an initial query. For example, fetching a list of users, then for each user, fetching their associated orders in separate queries. Solution: Use JOIN operations, IN clauses, or subqueries to fetch all related data in a single, more efficient query. In some ORMs, "eager loading" helps. Impact: Reduces the number of round trips to the database, significantly speeding up data retrieval.

Bad Joins (Cross Joins, Inefficient Join Order)

Problem:

Cross Join (unintentional): Occurs when join conditions are missing or incorrect, resulting in a Cartesian product (every row from table A joined with every row from table B). This generates huge result sets and can crash databases.
Inefficient Join Order: Databases try to optimize join order, but sometimes a specific order (e.g., filtering a large table before joining with another) is much more efficient. Solution:
Always specify explicit join conditions (ON clause).
Consider the cardinality and size of tables when joining. Filter small tables or highly selective conditions first.
Use EXPLAIN to see the join order and adjust if necessary (e.g., using hints, though generally not recommended unless absolutely needed). Impact: Correct joins prevent performance disasters and ensure data accuracy. Optimized join order can significantly reduce intermediate result set sizes.

Databricks SQL / Spark SQL Specifics

Databricks SQL and Spark SQL operate on a distributed architecture. Optimizations often involve:

Data Skew: Uneven distribution of data, causing some tasks to take much longer. Handled by salting keys, repartitioning, or using BROADCAST hints for small tables.
Shuffle Operations: Data movement across the network, which is expensive. Minimizing shuffles through efficient JOIN strategies, GROUP BY, and ORDER BY clauses is crucial.
Caching: Caching frequently accessed tables or intermediate results in memory can speed up subsequent queries.
File Formats: Using columnar formats like Parquet or Delta Lake is crucial for performance due to predicate pushdown and column pruning.
Z-Ordering: For Delta Lake, OPTIMIZE ... ZORDER BY helps colocate related data in the same set of files, reducing the amount of data read for queries with high-cardinality columns.

PostgreSQL Specifics

PostgreSQL is a powerful relational database. Key optimizations include:

Indexes: B-tree, Hash, GIN, GiST, BRIN indexes for various data types and query patterns.
VACUUM and ANALYZE: Regularly running these commands helps the query planner make accurate decisions and reclaims space from dead tuples.
WITH clauses (CTEs): Can improve readability but sometimes prevent the optimizer from pushing down predicates, leading to materialized CTEs that are less efficient.
Partitioning: For very large tables, declarative partitioning can greatly improve performance by allowing queries to scan only relevant partitions.
Prepared Statements: Reduce parsing overhead for frequently executed queries.

ANSI SQL (General Best Practices)

Select only necessary columns: Avoid SELECT *.
Filter early: Use WHERE clauses to reduce data before joins or aggregations.
Avoid functions in WHERE clauses on indexed columns: WHERE YEAR(date_column) = 2023 prevents index usage. Instead, WHERE date_column BETWEEN '2023-01-01' AND '2023-12-31'.
Understand UNION vs UNION ALL: UNION implies a distinct sort, UNION ALL does not. Use UNION ALL if duplicates are acceptable and performance is critical.
Subqueries vs. Joins: Often, JOINs are more performant than correlated subqueries.
Use LIMIT with ORDER BY: If you only need a few rows, combine LIMIT with an ORDER BY to return the most relevant data efficiently.

Disclaimer

The sql-profiler skill provides AI-driven suggestions. While highly effective, it's crucial to:

Test all optimizations: Apply suggestions to a staging environment and measure actual performance impact.
Understand your data: The best optimization depends on your specific data distribution, query patterns, and system architecture.
Consult documentation: Refer to the official documentation for your specific SQL dialect for advanced tuning.

This skill is a powerful assistant, not a replacement for human expertise and rigorous testing.

Contributing

Found a bug or have a suggestion? Open an issue or submit a pull request on the ClawHub repository.

License

MIT License.

安全使用建议

This skill is instruction-only and analyzes SQL text you paste; it does not connect to databases or request credentials. Before using it, avoid pasting sensitive data (production connection strings, PII, or secrets) into the query or EXPLAIN text. If you need live profiling that queries your database, prefer a tool that integrates securely with your DB and follow your organization’s credential-handling policies. Otherwise, this skill appears coherent and appropriate for offline/explained SQL analysis.

功能分析

Type: OpenClaw Skill Name: sql-profiler Version: 1.0.0 The sql-profiler skill is a purely documentation-based tool designed to guide an AI agent in analyzing SQL queries and EXPLAIN plan outputs for performance bottlenecks. It contains no executable code, network requests, or instructions to access sensitive system data, focusing entirely on providing optimization suggestions for dialects like PostgreSQL and Databricks SQL.

能力评估

✓ Purpose & Capability

The name and description (SQL analysis, EXPLAIN interpretation, optimization suggestions) match the SKILL.md, README, and package.json. No binaries, credentials, or external services are required, which is consistent for a skill that relies on user-provided queries and plans.

✓ Instruction Scope

The SKILL.md instructs the agent to accept queries and EXPLAIN text from the user and produce analysis/rewrite suggestions. It does not instruct the agent to read system files, fetch database credentials, or send data to third-party endpoints. Note: because it does not connect to databases itself, accuracy depends on the user pasting representative EXPLAIN/ANALYZE output.

✓ Install Mechanism

No install spec or code files that run at install time exist; this is instruction-only. The presence of README and package.json is informational only and there are no downloads or extracted archives.

✓ Credentials

No environment variables, credentials, or config paths are requested. That is proportionate for a tool that analyzes user-provided SQL text rather than connecting to databases.

✓ Persistence & Privilege

always is false and the skill is user-invocable. There is no indication it modifies other skills or requests permanent agent presence.

版本历史

v1.0.0

Initial release

元数据

Slug sql-profiler

版本 1.0.0

许可证 MIT-0

累计安装 0

当前安装数 0

历史版本数 1

常见问题

SQL Profiler 是什么？

Analyzes SQL queries and EXPLAIN outputs to identify performance issues, explain causes, and provide specific optimization suggestions with examples across m... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件，目前累计下载 242 次。

如何安装 SQL Profiler？

在 OpenClaw 或 Claude Code 对话框中运行命令「/install sql-profiler」即可一键安装，无需额外配置。

SQL Profiler 是免费的吗？

是的，SQL Profiler 完全免费，采用 MIT-0 许可证，可自由下载、安装和使用。

SQL Profiler 支持哪些平台？

SQL Profiler 跨平台运行，可在任意部署了 OpenClaw / Claude Code 的环境中使用（cross-platform）。

谁开发了 SQL Profiler？

由 Nerikko（@nerikko）开发并维护，当前版本 v1.0.0。

SQL Profiler

SQL Profiler ⚡

Overview

Features

Usage

analyze - Analyze a SQL query for performance issues

explain-plan - Interpret EXPLAIN/EXPLAIN ANALYZE output

optimize - Get specific optimization suggestions for a query

Core Concepts Explained

Missing Indexes

Full Table Scans

N+1 Problem

Bad Joins (Cross Joins, Inefficient Join Order)

Databricks SQL / Spark SQL Specifics

PostgreSQL Specifics

ANSI SQL (General Best Practices)

Disclaimer

Contributing

License

SQL Profiler 是什么？

如何安装 SQL Profiler？

SQL Profiler 是免费的吗？

SQL Profiler 支持哪些平台？

谁开发了 SQL Profiler？

💬 留言讨论

`analyze` - Analyze a SQL query for performance issues

`explain-plan` - Interpret EXPLAIN/EXPLAIN ANALYZE output

`optimize` - Get specific optimization suggestions for a query