← Back to Skills Marketplace
saschabuehrle

cascadeflow: Cost + Latency Reduction

by Sascha Buehrle · GitHub ↗ · v1.1.1 · MIT-0
cross-platform ⚠ suspicious
694
Downloads
3
Stars
1
Active Installs
11
Versions
Install in OpenClaw
/install cascadeflow
Description
OpenClaw-native domain cascading. Use when users need cost/latency reduction via cascading, domain-aware model assignment, OpenClaw-native event handling, an...
README (SKILL.md)

CascadeFlow: Cost + Latency Reduction | 17+ Domain-Aware Models + OpenClaw-Native Events

Use CascadeFlow as an OpenClaw provider to lower cost and latency via cascading. Assign up to 17 domain-specific models (for coding, web search, reasoning, and more), including OpenClaw-native event handling, and cascade between them (small model first, verifier when needed). Keep setup minimal, then verify with one health check and one chat call.

Why Use It

  • Reduce spend with drafter/verifier cascading.
  • Run 17+ domain-aware model assignments (code, reasoning, web-search, and more).
  • Support cascading with streaming and multi-step agent loops.
  • Handle OpenClaw-native event/domain signals for smarter model selection.

Security Defaults

  • Install from PyPI and verify package artifact before first run.
  • Keep the server bound to localhost by default.
  • Use explicit auth tokens for chat and stats endpoints (recommended for production).
  • Expose remote access only behind TLS/reverse proxy with strong tokens.
  • Use least-privilege provider keys (separate test keys from production keys).

How It Works

  1. OpenClaw sends requests to CascadeFlow through OpenAI-compatible /v1/chat/completions.
  2. CascadeFlow reads prompt context plus OpenClaw-native event/domain metadata (for example metadata.method, metadata.event, and channel/category hints).
  3. CascadeFlow selects a domain-aware drafter/verifier pair (small model first).
  4. If quality passes threshold, drafter answer is returned (cost/latency advantage).
  5. If quality fails threshold, verifier runs and final answer is upgraded.
  6. The same cascading behavior is supported for streaming and multi-step agent loops.

Advantages

  • Lower average cost by avoiding verifier calls when not needed.
  • Lower average latency for simple and medium tasks.
  • Better quality on hard tasks through verifier fallback.
  • Better operational handling through OpenClaw-native event/domain understanding.

Quick Start

Or ask your OpenClaw agent to set it up for you as an OpenClaw custom provider with OpenClaw-native events and domain understanding.

  1. Install and verify package source:
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade "cascadeflow[openclaw]>=0.7,\x3C0.8"
python -m pip show cascadeflow
python -m pip download --no-deps "cascadeflow[openclaw]>=0.7,\x3C0.8" -d /tmp/cascadeflow_pkg
python -m pip hash /tmp/cascadeflow_pkg/cascadeflow-*.whl

Optional variants:

python -m pip install --upgrade "cascadeflow[openclaw,anthropic]>=0.7,\x3C0.8"   # Anthropic-only preset
python -m pip install --upgrade "cascadeflow[openclaw,openai]>=0.7,\x3C0.8"      # OpenAI-only preset
python -m pip install --upgrade "cascadeflow[openclaw,providers]>=0.7,\x3C0.8"   # Mixed preset
  1. Pick preset + credentials:
  • Presets: examples/configs/anthropic-only.yaml, examples/configs/openai-only.yaml, examples/configs/mixed-anthropic-openai.yaml
  • Provider key(s): ANTHROPIC_API_KEY=... and/or OPENAI_API_KEY=... (required based on selected preset)
  • Service tokens: --auth-token ... and --stats-auth-token ... (recommended for production; use long random values)
  1. Start server (safe local default):
set -a; source .env; set +a
python3 -m cascadeflow.integrations.openclaw.openai_server \
  --host 127.0.0.1 --port 8084 \
  --config examples/configs/anthropic-only.yaml \
  --auth-token local-openclaw-token \
  --stats-auth-token local-stats-token

Optional harness activation (runtime in-loop policy controls):

# Observe first (recommended): log decisions, no blocking
python3 -m cascadeflow.integrations.openclaw.openai_server \
  --host 127.0.0.1 --port 8084 \
  --config examples/configs/anthropic-only.yaml \
  --harness-mode observe

# Enforce mode with limits
python3 -m cascadeflow.integrations.openclaw.openai_server \
  --host 127.0.0.1 --port 8084 \
  --config examples/configs/anthropic-only.yaml \
  --harness-mode enforce \
  --harness-budget 1.0 \
  --harness-max-tool-calls 12 \
  --harness-max-latency-ms 3500 \
  --harness-compliance strict
  1. Configure OpenClaw provider:
  • baseUrl: http://\x3Ccascadeflow-host>:8084/v1 (local default: http://127.0.0.1:8084/v1)
  • If remote: http://\x3Cserver-ip>:8084/v1 or https://\x3Cdomain>/v1 (TLS/reverse proxy)
  • api: openai-completions
  • model: cascadeflow
  • apiKey: same value as your --auth-token

Commands

  • /model cflow: default OpenClaw model switch using alias cflow.
  • /cascade: optional custom command (if configured in OpenClaw).
  • /cascade savings: optional custom subcommand for cost stats.
  • /cascade health: optional custom subcommand for service status.

Links

  • Full setup + configs: references/clawhub_publish_pack.md
  • Listing strategy: references/market_positioning.md
  • Official docs: https://github.com/lemony-ai/cascadeflow/blob/main/docs/guides/openclaw_provider.md
  • GitHub repository: https://github.com/lemony-ai/cascadeflow
Usage Guidance
This skill appears to be what it claims (a local CascadeFlow OpenClaw provider) but there are a few things to check before installing: 1) Listing metadata mismatch — the SKILL.md expects provider API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY) and auth tokens, yet the registry metadata claims no required env vars. Ask the publisher to correct the listing so required secrets are explicit. 2) Verify the PyPI package before running it: follow the SKILL.md advice to download the wheel, hash it, and inspect the source or the GitHub repo (https://github.com/lemony-ai/cascadeflow). Prefer installing in an isolated virtualenv or sandbox. 3) Use the suggested safe defaults: bind to localhost, require long random auth tokens for the /v1 endpoints, and put any remote deployments behind TLS + reverse proxy. 4) Limit provider API keys to least-privilege/test keys during evaluation; do not use production keys until you have reviewed package source and network behaviour. 5) If you intend to expose the service remotely, validate the stats/auth-token separation and confirm no unintended endpoints are exposed. If you want higher assurance, request an explicit listing update (declare required env vars) and/or a reproducible release artifact (signed release or pinned Git tag) before installing.
Capability Analysis
Type: OpenClaw Skill Name: cascadeflow Version: 1.1.1 CascadeFlow is a utility designed to optimize LLM costs and latency by acting as a cascading proxy (drafter/verifier model routing) for OpenClaw. The skill bundle provides clear instructions for the agent to install the 'cascadeflow' package from PyPI, set up a local server, and configure it as an OpenAI-compatible provider. It emphasizes security best practices, including localhost binding, mandatory auth tokens, and package hash verification (SKILL.md), and lacks any indicators of data exfiltration, obfuscation, or unauthorized persistence.
Capability Assessment
Purpose & Capability
Name and description claim cascading/drafter-verifier routing for cost/latency reduction. The SKILL.md exclusively documents installing a Python package (cascadeflow), configuring provider presets (OpenAI/Anthropic), and running a local server — all consistent with the stated purpose.
Instruction Scope
Runtime instructions focus on installing the cascadeflow package, starting a local OpenAI-compatible server, and configuring OpenClaw provider settings. The only in-scope data accesses described are reading OpenClaw payload metadata (metadata.method/event/channel) and provider keys, which are appropriate for domain-aware routing. The instructions do not request arbitrary host/system file reads or unrelated credentials.
Install Mechanism
Install is via pip from PyPI (python -m pip install "cascadeflow[openclaw]>=0.7,<0.8"). This is expected for a Python provider but carries the usual PyPI risks (supply-chain/backdoored package). The SKILL.md sensibly recommends verifying the wheel hash and keeping the server bound to localhost; that reduces risk but does not remove it. No download-from-arbitrary-URL or extract steps are present.
Credentials
SKILL.md and reference docs clearly require provider credentials (OPENAI_API_KEY, ANTHROPIC_API_KEY) and service tokens (auth-token, stats token). However the skill registry metadata shows "Required env vars: none" and "Primary credential: none" — an inconsistency. The credentials requested by the runtime are proportionate to the purpose, but the omission from listing metadata is a meaningful mismatch that could mislead users about what secrets they must supply.
Persistence & Privilege
The skill has no install spec and is instruction-only; it does not request always:true nor does it instruct modifying other skills or system-wide settings. It suggests running a local service and using auth tokens; this is normal for a provider integration and not excessive.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install cascadeflow
  3. After installation, invoke the skill by name or use /cascadeflow
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.1.1
CascadeFlow 1.1.1 - No code or documentation changes in this version. - All functionality and documentation remain unchanged from the previous release.
v1.1.0
CascadeFlow 1.0.4 - Added "Optional harness activation" section to Quick Start, documenting harness observe/enforce modes and runtime in-loop policy controls. - No code changes; documentation updated for improved setup and operational guidance.
v1.0.3
- No changes in this version; content remains the same as the previous release. Scanner was blocked
v1.0.2
- Improved installation instructions to include package artifact verification (using pip download and pip hash). - Updated security guidance: recommend explicit auth tokens for production; clarify provider key requirements per preset. - Minor clarifications to Quick Start steps and configuration instructions for better usability. - No code or file changes; documentation updates only.
v1.0.1
**Security guidance and setup documentation improved** - Added a new Security Defaults section with installation and authentication best practices. - Updated installation commands to recommend explicit version ranges and package verification steps. - Clarified required API/service tokens for safe local and remote deployment. - Expanded instructions to highlight provider credential separation and secure remote exposure. - Added link to the official GitHub repository for reference.
v1.0.0
CascadeFlow 0.7.5 - Updated documentation for clarity, better outlining cost/latency reduction, domain-aware assignments, and OpenClaw-native event handling. - Quick start and installation steps made more concise, with clearer preset and server setup instructions. - Added a summarized "How It Works" and "Advantages" section for easier understanding of the cascading flow. - New dedicated commands section describing available OpenClaw commands and custom `/cascade` subcommands. - Expanded links for full setup, listing strategy, and official documentation. - No code changes in this release—documentation improvements only.
v0.7.4
- Add guidance for using external hosts: users are now instructed to replace the default local server address with their own host/IP or domain if running the server remotely or behind a proxy/TLS. - No other changes.
v0.7.3
- Expanded support for 17+ domain-aware models, enabling fine-tuned cascading for coding, search, reasoning, and more. - Added OpenClaw-native event handling and domain understanding. - Updated installation instructions for quicker setup and clearer provider presets. - Enhanced documentation for OpenClaw integration, including fast start variants and minimal steps.
v0.7.2
- Added installation option clarifying recommended install for OpenClaw + providers versus minimal install. - Updated instructions to show how to install only OpenClaw extras separately from provider SDKs. - No functional changes to usage, configuration, or setup steps.
v0.7.1
CascadeFlow 0.7.1 Changelog - Streamlined skill description for clarity and rapid onboarding. - Added concise, copy-paste install and server startup steps. - Simplified preset selection and environment variable setup instructions. - Clarified provider setup with explicit OpenClaw configuration details and model alias guidance. - Grouped key integration benefits and security defaults under "What Users Get" and "Safe Defaults". - Pointed users to comprehensive documentation for advanced setup and publishing.
v0.7.0
## 0.7.0 - 2026-02-14 ------ - Initial ClawHub release of the CascadeFlow OpenClaw provider skill. - Added required `SKILL.md` with setup and publishing workflow. - Added `agents/openai.yaml` with display metadata and default prompt. - Added `references/clawhub_publish_pack.md` with copy-paste setup, config, security defaults, and validation checklist.
Metadata
Slug cascadeflow
Version 1.1.1
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 11
Frequently Asked Questions

What is cascadeflow: Cost + Latency Reduction?

OpenClaw-native domain cascading. Use when users need cost/latency reduction via cascading, domain-aware model assignment, OpenClaw-native event handling, an... It is an AI Agent Skill for Claude Code / OpenClaw, with 694 downloads so far.

How do I install cascadeflow: Cost + Latency Reduction?

Run "/install cascadeflow" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is cascadeflow: Cost + Latency Reduction free?

Yes, cascadeflow: Cost + Latency Reduction is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does cascadeflow: Cost + Latency Reduction support?

cascadeflow: Cost + Latency Reduction is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created cascadeflow: Cost + Latency Reduction?

It is built and maintained by Sascha Buehrle (@saschabuehrle); the current version is v1.1.1.

💬 Comments