API Rate Limiting

Author: wpank · GitHub · v1.0.0
Cross-platform · Security check passed
Install in OpenClaw
/install api-rate-limiting
Description
Rate limiting algorithms, implementation strategies, HTTP conventions, tiered limits, distributed patterns, and client-side handling. Use when protecting APIs from abuse, implementing usage tiers, or configuring gateway-level throttling.
Usage (SKILL.md)

Rate Limiting Patterns

Algorithms

| Algorithm | Accuracy | Burst Handling | Best For |
|---|---|---|---|
| Token Bucket | High | Allows controlled bursts | API rate limiting, traffic shaping |
| Leaky Bucket | High | Smooths bursts entirely | Steady-rate processing, queues |
| Fixed Window | Low | Allows edge bursts (2x) | Simple use cases, prototyping |
| Sliding Window Log | Very High | Precise control | Strict compliance, billing-critical |
| Sliding Window Counter | High | Good approximation | Production APIs (best tradeoff) |

Fixed window problem: if the window resets at 12:00, a user can send the full limit at 11:59 and again at 12:01, so twice the limit is accepted within two minutes. Sliding windows fix this.
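
To make the boundary burst concrete, here is a minimal sketch of a fixed-window counter. The class name `FixedWindowCounter`, the one-hour window, and the explicit `now` argument are assumptions chosen for illustration; they are not part of the skill.

```python
from collections import defaultdict


class FixedWindowCounter:
    """Minimal fixed-window limiter; timestamps are passed in explicitly."""

    def __init__(self, limit: int, window_sec: int):
        self.limit = limit
        self.window_sec = window_sec
        self.counts = defaultdict(int)  # window index -> request count

    def allow(self, now: float) -> bool:
        window = int(now // self.window_sec)
        if self.counts[window] >= self.limit:
            return False
        self.counts[window] += 1
        return True


limiter = FixedWindowCounter(limit=100, window_sec=3600)

# 100 requests one minute before the window boundary, 100 more one minute after it.
before = sum(limiter.allow(now=3600 - 60) for _ in range(100))
after = sum(limiter.allow(now=3600 + 60) for _ in range(100))

accepted_in_two_minutes = before + after  # twice the nominal hourly limit
```

Both batches are accepted because they fall into different windows, even though they arrive two minutes apart.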

Token Bucket

Bucket holds tokens up to capacity. Tokens refill at a fixed rate. Each request consumes one.

import time


class TokenBucket:
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.refill_rate = refill_rate  # tokens per second
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        # Refill lazily based on the time elapsed since the last call.
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        # Each request consumes one token.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
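
The refill arithmetic is easier to follow with a deterministic clock. The sketch below is a hypothetical variant, `ClockedTokenBucket`, that takes `now` as an argument instead of reading the system clock. The capacity of 5 and the rate of 1 token per second are arbitrary values picked for the walk-through.

```python
class ClockedTokenBucket:
    """Token bucket whose clock is injected, so refill behaviour is reproducible."""

    def __init__(self, capacity: int, refill_rate: float, now: float = 0.0):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_rate = refill_rate  # tokens per second
        self.last_refill = now

    def allow(self, now: float) -> bool:
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = ClockedTokenBucket(capacity=5, refill_rate=1.0)

burst = [bucket.allow(now=0.0) for _ in range(6)]  # five allowed, the sixth rejected
half_second = bucket.allow(now=0.5)                # only 0.5 tokens refilled so far
two_seconds = bucket.allow(now=2.0)                # 0.5 + 1.5 = 2.0 tokens available
```

The burst drains the bucket immediately; after that, requests are admitted only as fast as tokens refill.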

Sliding Window Counter

Hybrid of fixed window and sliding window log — weights the previous window's count by overlap percentage:

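import time


# get_count and increment_count are placeholders for the counter store (e.g. Redis).
def sliding_window_allow(key: str, limit: int, window_sec: int) -> bool:
    now = time.time()
    current_window = int(now // window_sec)
    # Fraction of the current window that has already elapsed (0.0 to 1.0).
    position_in_window = (now % window_sec) / window_sec

    prev_count = get_count(key, current_window - 1)
    curr_count = get_count(key, current_window)

    # Weight the previous window by the share of it still inside the sliding window.
    estimated = prev_count * (1 - position_in_window) + curr_count
    if estimated >= limit:
        return False
    increment_count(key, current_window)
    return True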
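
A worked example of the weighting may help. The sketch below backs the `get_count` and `increment_count` placeholders with an in-memory dictionary and passes `now` explicitly; both choices are assumptions made so the arithmetic can be checked by hand, not how the skill stores counters.

```python
from collections import defaultdict

_counts = defaultdict(int)  # (key, window index) -> count; hypothetical in-memory store


def get_count(key: str, window: int) -> int:
    return _counts[(key, window)]


def increment_count(key: str, window: int) -> None:
    _counts[(key, window)] += 1


def sliding_window_allow(key: str, limit: int, window_sec: int, now: float) -> bool:
    current_window = int(now // window_sec)
    position_in_window = (now % window_sec) / window_sec
    prev_count = get_count(key, current_window - 1)
    curr_count = get_count(key, current_window)
    estimated = prev_count * (1 - position_in_window) + curr_count
    if estimated >= limit:
        return False
    increment_count(key, current_window)
    return True


# Previous window (index 0) saw 80 requests; current window (index 1) has 30 so far.
_counts[("user:1", 0)] = 80
_counts[("user:1", 1)] = 30

# 25% into window 1: estimate = 80 * 0.75 + 30 = 90, which is under a limit of 100.
early = sliding_window_allow("user:1", limit=100, window_sec=60, now=75.0)

# Same instant with a limit of 90: estimate = 80 * 0.75 + 31 = 91, so it is rejected.
strict = sliding_window_allow("user:1", limit=90, window_sec=60, now=75.0)
```

Only two counters per key are needed, which is why this approach is cheaper than a full sliding window log.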

Implementation Options

| Approach | Scope | Best For |
|---|---|---|
| In-memory | Single server | Zero latency, no dependencies |
| Redis (INCR + EXPIRE) | Distributed | Multi-instance deployments |
| API Gateway | Edge | No code, built-in dashboards |
| Middleware | Per-service | Fine-grained per-user/endpoint control |

Use gateway-level limiting as outer defense + application-level for fine-grained control.


HTTP Headers

Always return rate limit info, even on successful requests:

RateLimit-Limit: 1000
RateLimit-Remaining: 742
RateLimit-Reset: 1625097600
Retry-After: 30

| Header | When to Include |
|---|---|
| RateLimit-Limit | Every response |
| RateLimit-Remaining | Every response |
| RateLimit-Reset | Every response |
| Retry-After | 429 responses only |

429 Response Body

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Maximum 1000 requests per hour.",
    "retry_after": 30,
    "limit": 1000,
    "reset_at": "2025-07-01T12:00:00Z"
  }
}

Never return 500 or 503 for rate limiting — 429 is the correct status code.
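
As a rough illustration of how the headers and the body fit together, the sketch below assembles a complete 429 response. The function name `build_429`, the epoch-seconds reset value, and the "per hour" wording in the message are assumptions taken from the examples above rather than a fixed contract.

```python
import json
from datetime import datetime, timezone


def build_429(limit: int, reset_epoch: int, now_epoch: int) -> tuple[int, dict[str, str], str]:
    """Return (status, headers, body) for a throttled request."""
    retry_after = max(0, reset_epoch - now_epoch)  # seconds until the window resets
    headers = {
        "Content-Type": "application/json",
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": "0",
        "RateLimit-Reset": str(reset_epoch),
        "Retry-After": str(retry_after),
    }
    reset_at = datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
    body = {
        "error": {
            "code": "rate_limit_exceeded",
            "message": f"Rate limit exceeded. Maximum {limit} requests per hour.",
            "retry_after": retry_after,
            "limit": limit,
            "reset_at": reset_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    }
    return 429, headers, json.dumps(body)


status, headers, body = build_429(limit=1000, reset_epoch=1625097600, now_epoch=1625097570)
```

Deriving `Retry-After` and `retry_after` from the same reset timestamp keeps the header and the body from disagreeing.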


Rate Limit Tiers

Apply limits at multiple granularities:

| Scope | Key | Example Limit | Purpose |
|---|---|---|---|
| Per-IP | Client IP | 100 req/min | Abuse prevention |
| Per-User | User ID | 1000 req/hr | Fair usage |
| Per-API-Key | API key | 5000 req/hr | Service-to-service |
| Per-Endpoint | Route + key | 60 req/min on /search | Protect expensive ops |

Tiered pricing:

| Tier | Rate Limit | Burst | Cost |
|---|---|---|---|
| Free | 100 req/hr | 10 | $0 |
| Pro | 5,000 req/hr | 100 | $49/mo |
| Enterprise | 100,000 req/hr | 2,000 | Custom |
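
One possible way to wire these tiers into a token bucket is to treat the burst column as the bucket capacity and the hourly limit as the sustained refill rate. That mapping is an interpretation, not something the table states, and the names `Tier`, `TIERS`, and `bucket_params` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    requests_per_hour: int
    burst: int

    @property
    def refill_rate(self) -> float:
        """Sustained rate in tokens per second."""
        return self.requests_per_hour / 3600


# Values copied from the tier table; keep them in configuration in a real system.
TIERS = {
    "free": Tier(requests_per_hour=100, burst=10),
    "pro": Tier(requests_per_hour=5_000, burst=100),
    "enterprise": Tier(requests_per_hour=100_000, burst=2_000),
}


def bucket_params(tier_name: str) -> tuple[int, float]:
    """Return (capacity, refill_rate) suitable for a token bucket."""
    tier = TIERS[tier_name]
    return tier.burst, tier.refill_rate
```

For example, `bucket_params("free")` yields a capacity of 10 tokens refilled at roughly 0.028 tokens per second.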

Evaluate from most specific to least specific: per-endpoint > per-user > per-IP.
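
The evaluation order can be sketched as a loop over scopes. Everything named here (`first_exceeded_scope`, the key layout, the `check` callback, the sample usage numbers) is a hypothetical illustration; a real limiter would plug in its own counter check.

```python
from typing import Callable, Optional

# A check takes a counter key and a limit and returns True if the request is allowed.
Check = Callable[[str, int], bool]


def first_exceeded_scope(
    route: str, user_id: str, ip: str, limits: dict[str, int], check: Check
) -> Optional[str]:
    """Return the name of the first scope that rejects the request, else None."""
    scopes = [
        ("per-endpoint", f"{route}:{user_id}"),  # most specific
        ("per-user", user_id),
        ("per-ip", ip),                          # least specific
    ]
    for scope, key in scopes:
        if not check(f"{scope}:{key}", limits[scope]):
            return scope
    return None


# Sample state: the user is at the per-user ceiling but well under the other limits.
usage = {"per-endpoint:/search:u1": 12, "per-user:u1": 1000, "per-ip:203.0.113.7": 3}
limits = {"per-endpoint": 60, "per-user": 1000, "per-ip": 100}


def under_limit(key: str, limit: int) -> bool:
    return usage.get(key, 0) < limit


rejected_by = first_exceeded_scope("/search", "u1", "203.0.113.7", limits, under_limit)
```

Here the request is rejected at the per-user scope. Note that this sketch only reads counters; if the check also increments, an earlier scope may be charged for a request that a later scope rejects.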


Distributed Rate Limiting

Redis-based pattern for consistent limiting across instances:

import time


def redis_rate_limit(redis, key: str, limit: int, window: int) -> bool:
    # Assumes a redis-py style client; its pipelines run as MULTI/EXEC by default.
    pipe = redis.pipeline()
    now = time.time()
    window_key = f"rl:{key}:{int(now // window)}"
    pipe.incr(window_key)
    pipe.expire(window_key, window * 2)  # let stale window keys expire on their own
    results = pipe.execute()
    return results[0] <= limit

Atomic Lua script (prevents race conditions):

local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local current = redis.call('INCR', key)
if current == 1 then
    redis.call('EXPIRE', key, window)
end
return current <= limit and 1 or 0

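Never do a separate GET followed by SET: concurrent requests can read the same count in the gap between the two commands, so increments are lost and more requests pass than the limit allows.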


API Gateway Configuration

NGINX:

http {
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    server {
        location /api/ {
            limit_req zone=api burst=20 nodelay;
            limit_req_status 429;
        }
    }
}

Kong:

plugins:
  - name: rate-limiting
    config:
      minute: 60
      hour: 1000
      policy: redis
      redis_host: redis.internal

Client-Side Handling

Clients must handle 429 gracefully:

async function fetchWithRetry(url: string, maxRetries = 3): Promise<Response> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url);
    if (res.status !== 429) return res;

    // Retry-After may be delta-seconds or an HTTP-date; only the numeric form is parsed here.
    const retryAfter = parseInt(res.headers.get('Retry-After') ?? '', 10);
    const delay = Number.isNaN(retryAfter)
      ? Math.random() * Math.min(1000 * 2 ** attempt, 30000) // exponential backoff with full jitter
      : retryAfter * 1000;
    await new Promise(r => setTimeout(r, delay));
  }
  throw new Error('Rate limit exceeded after retries');
}

  • Always respect Retry-After when present
  • Use exponential backoff with jitter when absent
  • Implement request queuing for batch operations
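
The first two rules above can be sketched as a single delay calculation in Python. The function name `retry_delay`, the 1-second base, the 30-second cap, and the choice of "full jitter" (a uniform draw between zero and the backoff ceiling) are assumptions for illustration.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Delta-seconds form of Retry-After: the server's value wins.
            return float(max(0, int(retry_after)))
        except ValueError:
            pass  # e.g. an HTTP-date; fall through to backoff
    rng = rng or random.Random()
    ceiling = min(cap, base * 2 ** attempt)
    return rng.uniform(0.0, ceiling)  # full jitter: anywhere in [0, ceiling]


server_says = retry_delay(attempt=0, retry_after="30")
backoff = retry_delay(attempt=2, rng=random.Random(0))  # somewhere in [0, 4] seconds
```

Jitter spreads retries from many clients over time so they do not all return at the same instant after a shared outage.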

Monitoring

Track these metrics:

  • Rate limit hit rate — % of requests returning 429 (alert if >5% sustained)
  • Near-limit warnings — requests where remaining < 10% of limit
  • Top offenders — keys/IPs hitting limits most frequently
  • Limit headroom — how close normal traffic is to the ceiling
  • False positives — legitimate users being rate limited
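
Several of these metrics can be derived from plain request logs. The sketch below assumes a hypothetical `RequestRecord` shape and computes the hit rate, near-limit count, and top offenders for one batch. It checks the 5% threshold against a single batch and does not model the "sustained" condition, which would need a time series.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RequestRecord:
    key: str        # API key, user ID, or IP
    status: int     # HTTP status returned
    remaining: int  # RateLimit-Remaining at response time
    limit: int      # RateLimit-Limit at response time


def summarize(records: list[RequestRecord], top_n: int = 3) -> dict:
    total = len(records)
    throttled = [r for r in records if r.status == 429]
    near_limit = [r for r in records if r.status != 429 and r.remaining < 0.1 * r.limit]
    hit_rate = len(throttled) / total if total else 0.0
    return {
        "hit_rate": hit_rate,
        "near_limit": len(near_limit),
        "top_offenders": Counter(r.key for r in throttled).most_common(top_n),
        "alert": hit_rate > 0.05,
    }


records = (
    [RequestRecord("a", 200, 500, 1000)] * 17
    + [RequestRecord("b", 200, 50, 1000)]
    + [RequestRecord("b", 429, 0, 1000)] * 2
)
summary = summarize(records)
```

With this sample, 2 of 20 requests were throttled, so the hit rate is 10% and the batch would trip the alert threshold.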

Anti-Patterns

| Anti-Pattern | Fix |
|---|---|
| Application-only limiting | Always combine with infrastructure-level limits |
| No retry guidance | Always include Retry-After header on 429 |
| Inconsistent limits | Same endpoint, same limits across services |
| No burst allowance | Allow controlled bursts for legitimate traffic |
| Silent dropping | Always return 429 so clients can distinguish from errors |
| Global single counter | Per-endpoint counters to protect expensive operations |
| Hard-coded limits | Use configuration, not code constants |

NEVER Do

  1. NEVER rate limit health check endpoints — monitoring systems will false-alarm
  2. NEVER use client-supplied identifiers as sole rate limit key — trivially spoofed
  3. NEVER return 200 OK when rate limiting — clients must know they were throttled
  4. NEVER set limits without measuring actual traffic first — you'll block legitimate users or set limits too high to matter
  5. NEVER share counters across unrelated tenants — noisy neighbor problem
  6. NEVER skip rate limiting on internal APIs — misbehaving internal services can take down shared infrastructure
  7. NEVER implement rate limiting without logging — you need visibility to tune limits and detect abuse
Security Usage Recommendations
This skill is an instruction-only reference about rate limiting and appears internally consistent. Before using: review any example install commands (the README's npx/copy examples) and avoid running unfamiliar scripts or downloads; if you adapt the snippets to production, ensure atomic operations for counters (use the shown Lua script or equivalent), secure your Redis/gateway endpoints, and validate header formats and limits against your privacy/security requirements.
Functional Analysis
Type: OpenClaw Skill
Name: api-rate-limiting
Version: 1.0.0

The skill bundle provides comprehensive documentation and code examples for implementing rate limiting patterns. All code snippets are illustrative and do not contain any malicious functions or system calls. The `SKILL.md` and `README.md` files are purely informational and educational, lacking any prompt injection attempts, instructions for data exfiltration, persistence, or other harmful behaviors. The installation instructions involve copying local files or using `npx add` from a public GitHub repository, which is the source of the skill itself, and does not introduce external malicious dependencies.
Capability Assessment
Purpose & Capability
Name/description match the content: SKILL.md contains algorithms, gateway examples (NGINX, Kong), Redis patterns, HTTP header guidance, client retry patterns and monitoring notes — all appropriate for a rate-limiting skill.
Instruction Scope
Runtime instructions and code snippets remain within the domain of rate limiting. The examples reference helper functions (get_count, increment_count) and Redis, which are reasonable placeholders; nothing in the instructions asks the agent to read unrelated files, exfiltrate secrets, or contact unexpected endpoints. The README includes example install copy commands (local paths) and an npx URL example (documentation only) but the skill itself is instruction-only.
Install Mechanism
No install spec or code files are present (instruction-only), so nothing is written to disk by the skill. The README documents manual copy commands and an 'npx add' example pointing at a GitHub tree; those are documentation instructions rather than an automated install spec. As always, verify any external install command before running it.
Credentials
The skill declares no required environment variables, credentials, or config paths. Example configs reference Redis hosts (e.g., redis.internal) which is appropriate for distributed rate limiting and consistent with the stated purpose.
Persistence & Privilege
Flags show default behavior (not always:true). There is no install, no persistent privileges requested, and the skill does not attempt to modify other skills or system settings.
How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install api-rate-limiting
  3. Once installed, invoke the Skill by name or trigger it with /api-rate-limiting
  4. Provide the required inputs according to the Skill's parameter description to get structured output
Version History
v1.0.0
  • Initial release featuring comprehensive documentation on rate limiting algorithms, implementation strategies, HTTP conventions, tiered limits, distributed patterns, and client-side handling.
  • Provides practical code samples for Token Bucket, Sliding Window Counter, Redis-based distributed rate limiting (Python and Lua), and API gateway configuration (NGINX, Kong).
  • Details recommended HTTP headers and response structures for communicating rate limits and 429 errors.
  • Covers best practices, anti-patterns, and critical "never do" guidelines for robust API rate limiting.
  • Includes monitoring recommendations and client retry logic examples.
Metadata
Slug api-rate-limiting
Version 1.0.0
License
Total installs 3
Current installs 1
Version count 1
FAQ

What is API Rate Limiting?

Rate limiting algorithms, implementation strategies, HTTP conventions, tiered limits, distributed patterns, and client-side handling. Use when protecting APIs from abuse, implementing usage tiers, or configuring gateway-level throttling. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1738 total downloads to date.

How do I install API Rate Limiting?

Run the command "/install api-rate-limiting" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is API Rate Limiting free?

Yes. API Rate Limiting is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does API Rate Limiting support?

API Rate Limiting is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed API Rate Limiting?

It is developed and maintained by wpank (@wpank); the current version is v1.0.0.
