CLIProxy Media

Author: bencoremans · GitHub · v1.0.3 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 309
Favorites: 0
Current installs: 0
Versions: 4
Install in OpenClaw
/install cliproxy-media
Description
Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI — a Claude Max proxy that routes requests through your subscription at zero extra cost. Use thi...
Usage instructions (SKILL.md)

cliproxy-media

Source: https://github.com/bencoremans/site/tree/main/skills/cliproxy-media

Analyze images and PDFs via CLIProxyAPI (Claude Max subscription, zero extra cost).

Setup

Set the endpoint to your CLIProxy instance:

export CLIPROXY_URL=http://your-host:8317/v1/messages

For Docker setups, replace your-host with your container hostname (e.g. cliproxyapi, localhost, or the container IP).

Quick start

# Analyze an image
python3 skills/cliproxy-media/scripts/analyze.py /path/to/image.jpg "What is in this image?"

# Read a PDF
python3 skills/cliproxy-media/scripts/analyze.py /path/to/document.pdf "Give a summary"

# Compare multiple images
python3 skills/cliproxy-media/scripts/analyze.py img1.jpg img2.jpg "Compare these images"

# With streaming (output appears immediately)
python3 skills/cliproxy-media/scripts/analyze.py --stream image.jpg "Describe in detail"

# With system prompt
python3 skills/cliproxy-media/scripts/analyze.py --system "You are a medical expert" scan.jpg "What do you see?"

# With higher token limit
python3 skills/cliproxy-media/scripts/analyze.py --max-tokens 4096 document.pdf "Extensive analysis"
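
The --stream flag above delivers output as server-sent events. As a rough sketch of what consuming such a stream might look like, the function below pulls text fragments out of Anthropic-style `content_block_delta` events. The function name `iter_text_deltas` and the sample lines are hypothetical and are not taken from analyze.py; the sketch assumes each event arrives as a single `data: {...}` line.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from an Anthropic-style SSE stream (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines and blank separators
        body = line[len("data:"):].strip()
        if not body:
            continue
        try:
            event = json.loads(body)
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON
        if not isinstance(event, dict):
            continue
        if event.get("type") != "content_block_delta":
            continue  # only delta events carry incremental text
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta.get("text", "")


# Illustrative sample stream, not captured from a real server.
sample = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}',
    "",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}}',
    'data: {"type": "message_stop"}',
]
print("".join(iter_text_deltas(sample)))
```

Yielding fragments one at a time lets a caller print each piece as it arrives, which is what makes streamed output appear immediately.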

What works ✅ / What doesn't ❌

✅ Supported file types

| Type | Format | Note |
| --- | --- | --- |
| Image | .jpg / .jpeg | Requires valid JPEG data |
| Image | .png | Fully supported |
| Image | .gif | Fully supported |
| Image | .webp | Fully supported |
| Document | .pdf | Base64-encoded, via document content type |
| Image via URL | http:// / https:// | Direct URL reference, no download needed |

Multiple files at once: Provide multiple paths before the question. Max ~100 per request (Anthropic limit).
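
To make the supported types above concrete, here is a minimal sketch of how file paths and URLs might be turned into Anthropic Messages content blocks. The helper names `to_content_block` and `build_content` are hypothetical, and this is not necessarily how analyze.py does it. The sketch treats every http(s) input as an image URL, since the document lists URL references for images only.

```python
import base64
from pathlib import Path
from typing import List

# Extension-to-media-type mapping for the formats listed as supported.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".pdf": "application/pdf",
}


def to_content_block(item: str) -> dict:
    """Build one content block from a local path or an image URL (hypothetical helper)."""
    if item.startswith(("http://", "https://")):
        # Assumption: URLs always point at images.
        return {"type": "image", "source": {"type": "url", "url": item}}
    path = Path(item)
    media_type = MEDIA_TYPES.get(path.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported file type: {path.suffix or item}")
    data = base64.standard_b64encode(path.read_bytes()).decode("ascii")
    # PDFs go in a `document` block; everything else here is an `image` block.
    block_type = "document" if media_type == "application/pdf" else "image"
    return {
        "type": block_type,
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def build_content(items: List[str], question: str) -> List[dict]:
    """Files first, question last, mirroring the command-line argument order."""
    return [to_content_block(i) for i in items] + [{"type": "text", "text": question}]
```

Rejecting unknown extensions before reading the file keeps unsupported inputs such as .docx from being encoded and uploaded at all.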

❌ Not supported

  • Office files (.docx, .xlsx, .pptx) — Workaround: convert to PDF
  • Audio (.mp3, .wav, .ogg) — Use Whisper for transcription
  • Video (.mp4, .mov, .avi) — Not supported by the model
  • Other document types (.txt, .html, .md as document) — Send text directly as a string

⚠️ System prompt warning

CLIProxyAPI accepts only the array notation for system prompts. The string notation is silently ignored — the model does not see it, but you also won't get an error message!

# ❌ DOES NOT WORK — ignored without error message
payload["system"] = "You are an expert."

# ✅ WORKS — always use array notation
payload["system"] = [{"type": "text", "text": "You are an expert."}]

The --system argument in analyze.py automatically uses the correct array notation.
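
If you build payloads yourself, one way to avoid the silent failure described above is to normalize every system prompt into array notation before sending. The helper name `normalize_system` is hypothetical; the payload fields follow the Anthropic Messages format.

```python
from typing import Union


def normalize_system(system: Union[str, list, None]) -> list:
    """Return the system prompt in array notation (hypothetical helper)."""
    if system is None or system == "":
        return []
    if isinstance(system, str):
        # Wrap a plain string in a single text block.
        return [{"type": "text", "text": system}]
    return list(system)  # assume it is already a list of blocks


payload = {"model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": []}
blocks = normalize_system("You are an expert.")
if blocks:
    # Only set the field when there is something to send.
    payload["system"] = blocks
```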

Configuration (env vars)

| Variable | Default | Description |
| --- | --- | --- |
| CLIPROXY_URL | http://localhost:8317/v1/messages | Full endpoint URL |
| CLIPROXY_MODEL | claude-sonnet-4-6 | Model to use |

Example:

export CLIPROXY_URL=http://localhost:8317/v1/messages
export CLIPROXY_MODEL=claude-opus-4-6
python3 skills/cliproxy-media/scripts/analyze.py image.jpg "question"

Additional options

--stream          Streaming output via SSE (output appears immediately)
--system TEXT     System prompt (automatically sent as array)
--max-tokens N    Maximum output tokens (default: 1024)
--model MODEL     Model override (overrides CLIPROXY_MODEL)
--url URL         Endpoint override (overrides CLIPROXY_URL)
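
The override order described above (command-line flag, then environment variable, then built-in default) can be sketched with argparse as follows. This is an illustration under assumptions, not the actual parser in analyze.py: the function name `parse_settings` and the positional `inputs` argument are hypothetical, while the flag names and defaults are taken from this document.

```python
import argparse
import os

DEFAULT_URL = "http://localhost:8317/v1/messages"
DEFAULT_MODEL = "claude-sonnet-4-6"


def parse_settings(argv, env=None):
    """Resolve settings: flag beats env var, env var beats default (hypothetical helper)."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Sketch of the analyze.py options")
    # The environment value becomes the default, so an explicit flag overrides it.
    parser.add_argument("--url", default=env.get("CLIPROXY_URL", DEFAULT_URL))
    parser.add_argument("--model", default=env.get("CLIPROXY_MODEL", DEFAULT_MODEL))
    parser.add_argument("--max-tokens", type=int, default=1024)
    parser.add_argument("--stream", action="store_true")
    parser.add_argument("--system", default=None)
    parser.add_argument("inputs", nargs="+", help="file paths or URLs, question last")
    return parser.parse_args(argv)


args = parse_settings(
    ["--model", "claude-opus-4-6", "image.jpg", "question"],
    env={"CLIPROXY_MODEL": "claude-sonnet-4-6"},
)
print(args.model, args.url, args.max_tokens)
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.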

Compatibility

This script works with any API that supports the Anthropic Messages format:

| Provider | Compatible | Note |
| --- | --- | --- |
| CLIProxyAPI | ✅ Yes | Primarily tested, system prompt array required |
| OpenRouter | ✅ Yes | Use Bearer token instead of x-api-key: dummy |
| LiteLLM | ✅ Yes | As proxy for Anthropic format |
| Anthropic direct | ✅ Yes | Use ANTHROPIC_API_KEY as x-api-key |

Note for non-CLIProxy endpoints: Some proxies do accept string notation for system prompts. Always use array notation for maximum compatibility.
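
The per-provider authentication notes above might translate into request headers roughly as follows. This is a sketch, not part of the shipped script: the function `build_headers`, the provider labels, and the `OPENROUTER_API_KEY` variable name are assumptions made for illustration, and LiteLLM is left out because its authentication depends on how the proxy is deployed.

```python
import os


def build_headers(provider: str, env=None) -> dict:
    """Pick auth headers for a provider label (hypothetical helper)."""
    env = os.environ if env is None else env
    headers = {
        "content-type": "application/json",
        "anthropic-version": "2023-06-01",
    }
    if provider == "cliproxy":
        # The proxy handles authentication itself, so a placeholder key is sent.
        headers["x-api-key"] = "dummy"
    elif provider == "anthropic":
        headers["x-api-key"] = env["ANTHROPIC_API_KEY"]
    elif provider == "openrouter":
        # Assumption: the token lives in OPENROUTER_API_KEY.
        headers["Authorization"] = "Bearer " + env["OPENROUTER_API_KEY"]
    else:
        raise ValueError(f"Unknown provider: {provider}")
    return headers


print(build_headers("cliproxy"))
```

Keeping real keys out of the CLIProxy branch means a credential is only attached when the endpoint is one that actually expects it.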

Known limitations of CLIProxyAPI

  • temperature and top_p may not be used at the same time (HTTP 400)
  • PDF as document with URL source does not work (Unable to download the file)
  • Only claude-sonnet-4-6 and claude-opus-4-6 available (haiku is deprecated)
  • inference_geo is always not_available in the response
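
The request-side limitations listed above could be checked locally before a payload is sent, which gives a clearer message than an HTTP 400 from the proxy. The function name `check_payload` is hypothetical, and the checks only cover what this document states.

```python
from typing import List

SUPPORTED_MODELS = {"claude-sonnet-4-6", "claude-opus-4-6"}


def check_payload(payload: dict) -> List[str]:
    """Return a list of problems found in a request payload (hypothetical helper)."""
    problems = []
    if "temperature" in payload and "top_p" in payload:
        problems.append("temperature and top_p cannot be combined")
    if payload.get("model") not in SUPPORTED_MODELS:
        problems.append("model is not one of: " + ", ".join(sorted(SUPPORTED_MODELS)))
    for message in payload.get("messages", []):
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain-text messages have no blocks to inspect
        for block in content or []:
            source = block.get("source", {})
            if block.get("type") == "document" and source.get("type") == "url":
                problems.append("PDF documents must be base64-encoded, not URL-sourced")
    return problems
```

Returning all problems at once, instead of raising on the first, lets a caller report every issue in a single pass.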

Direct Python API

If you want to call the script from your own Python code:

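import subprocess

# Run analyze.py as a child process and capture its output as text.
result = subprocess.run(
    ["python3", "skills/cliproxy-media/scripts/analyze.py", "image.jpg", "Describe this"],
    capture_output=True, text=True
)
if result.returncode != 0:
    # Surface the script's error output instead of printing an empty result.
    raise RuntimeError(result.stderr)
print(result.stdout)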

Or use the built-in exec tool:

exec: python3 skills/cliproxy-media/scripts/analyze.py /path/to/image.jpg "question"
Security recommendations
This skill reads whatever local files you pass to it and sends their base64 contents to the URL you configure (CLIPROXY_URL). Before installing or running it:

  • Only set CLIPROXY_URL to a proxy you control and trust. Treat that endpoint as capable of receiving any file you give the script.
  • Do not let an autonomous agent call this skill with unchecked file-path inputs, because it can be used to exfiltrate arbitrary local files.
  • The SKILL.md claims compatibility with Anthropic and OpenRouter, but the script hardcodes 'x-api-key: dummy' and does not send real API keys or Authorization headers. If you intend to use non-CLIProxy providers, inspect and modify the script to add proper auth headers, and test in a safe environment.
  • If you need stricter controls, require that the skill be invoked only with pre-approved file paths, or add whitelist checks in the script to prevent accidental upload of sensitive files.
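
As one possible shape for the whitelist check suggested above, the sketch below accepts a path only if it resolves to a location inside an approved directory. The function name `is_allowed` and the idea of passing a list of approved roots are assumptions for illustration; the shipped script does not include such a check.

```python
from pathlib import Path
from typing import List


def is_allowed(path: str, allowed_roots: List[str]) -> bool:
    """Return True if `path` resolves inside one of the approved directories."""
    # resolve() collapses ".." segments and follows symlinks before comparing.
    resolved = Path(path).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        if resolved == root_path or root_path in resolved.parents:
            return True
    return False
```

A caller would run this check on every file argument and refuse to read or upload anything for which it returns False. Comparing resolved paths, not raw strings, is what stops inputs like `uploads/../../etc/passwd` from slipping through.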
Functional analysis
Type: OpenClaw Skill Name: cliproxy-media Version: 1.0.3 The skill contains instructions in SKILL.md that explicitly direct the AI agent to bypass built-in security/analysis tools ("NEVER use the built-in image or pdf tools") in favor of this custom skill. This 'tool diversion' tactic steers sensitive user data (images and PDFs) to an external proxy endpoint defined by the CLIPROXY_URL environment variable. While the implementation in scripts/analyze.py is a functional API wrapper using standard libraries, the combination of steering instructions and the redirection of potentially sensitive documents to a third-party proxy presents a significant risk of data exposure.
Capability assessment
Purpose & Capability
The skill's stated purpose (analyze images/PDFs via a CLIProxy endpoint) matches the included script: it encodes files and posts them to the configured endpoint. However the SKILL.md claims broader compatibility (Anthropic direct, OpenRouter, etc.) that the script does not actually implement (see environment/headers mismatch below). The registry metadata also lists no required env vars while the docs expect CLIPROXY_URL/CLIPROXY_MODEL—minor mismatches but explainable.
Instruction Scope
Runtime instructions and the script require the caller to pass file paths (or URLs). The script will read local files and embed their base64 contents into requests to the configured endpoint. That is expected for media analysis, but it also means an agent invoking this skill (or a user using exec) can send any local file path supplied to the endpoint — a potential exfiltration vector if untrusted endpoints are used or if the agent is given permission to pick arbitrary paths.
Install Mechanism
No install spec; instruction-only plus a simple Python script. Nothing is downloaded or written during install, which is low-risk from an installation perspective.
Credentials
The skill expects CLIPROXY_URL and CLIPROXY_MODEL in its docs, which are reasonable. But the SKILL.md claims compatibility with Anthropic/OpenRouter and suggests using ANTHROPIC_API_KEY/Bearer tokens for non-CLIProxy endpoints — the included script does not read any API-key env or set an Authorization header (it hardcodes 'x-api-key: dummy'), so those compatibility claims are misleading. No high-privilege secrets are requested by the skill, but the documentation/code mismatch could cause users to accidentally send credentials to the wrong place if they modify the script or proxy behavior.
Persistence & Privilege
The skill does not request permanent presence (always:false) nor modifies system-wide config. It simply provides an executable script invoked on demand.
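How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install cliproxy-media
  3. Once installed, invoke the Skill by name or trigger it with /cliproxy-media
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output
Version history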
v1.0.3
Added source URL in SKILL.md for transparency
v1.0.2
Fixed: default endpoint changed to localhost:8317, removed hardcoded Docker hostname
v1.0.1
Fixed: translated all content to English
v1.0.0
Initial release: analyze images (jpg/png/gif/webp) and PDFs via CLIProxyAPI or any Anthropic-compatible proxy (OpenRouter, LiteLLM). Supports multi-image, streaming, system prompts, configurable endpoint/model via env vars.
Metadata
Slug cliproxy-media
Version 1.0.3
License MIT-0
Total installs 0
Current installs 0
Version count 4
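FAQ

What is CLIProxy Media?

Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI, a Claude Max proxy that routes requests through your subscription at zero extra cost. Use thi... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 309 total downloads so far.

How do I install CLIProxy Media?

Run the command "/install cliproxy-media" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is CLIProxy Media free?

Yes. CLIProxy Media is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does CLIProxy Media support?

CLIProxy Media is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops CLIProxy Media?

It is developed and maintained by bencoremans (@bencoremans); the current version is v1.0.3.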
