ASCII Vision
Fallback image viewer when vision models are unavailable (rate limited, model down, no provider configured, etc.). Converts images to ASCII art using ffmpeg + Python so you (or the agent) can identify visual content — shapes, brightness distribution, textures, and structure — without relying on any vision API.
Also includes color sampling via raw pixel extraction for basic hue identification, and edge detection for texture quantification.
When to Use
- image/vision_analyze returns rate limit, model unavailable, or timeout errors
- You need to quickly distinguish between similar-looking images ("is this a dark variant of the same composition?")
- The agent needs visual inspection but no vision provider is configured
- Debugging image generation output — check if an image was actually produced before sending it to the user
- Quantitatively comparing two images (brightness, edges, color)
How It Works
- ffmpeg scales the image to a low resolution (e.g. 60 columns) in grayscale, preserving aspect ratio
- ascii_viewer.py maps each pixel (0–255) to an ASCII character
- Optional --stats outputs brightness average, pixel distribution, and unique levels
- Optional --edges detects sharp transitions (edges) for texture quantification
- Color sampling via ffmpeg + xxd extracts RGB hex values from specific regions
Character Map
| Range | Char | Meaning |
|---|---|---|
| 0–25 | (space) | Pure black |
| 26–51 | . | Very dark |
| 52–76 | : | Dark |
| 77–102 | - | Mid-dark |
| 103–127 | = | Medium |
| 128–153 | + | Mid-light |
| 154–179 | * | Light |
| 180–204 | # | Very light |
| 205–229 | % | Near white |
| 230–255 | @ | Pure white |
Setup
The bundled script is at scripts/ascii_viewer.py. Reference it relative to the skill directory:
SCRIPT=scripts/ascii_viewer.py
It accepts optional --width (default: 60) for columns. When paired with ffmpeg's scale=W:-1, height is auto-detected from the pixel data, preserving aspect ratio without distortion.
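One way height auto-detection could work is to divide the number of bytes received by the width, since a grayscale rawvideo frame carries one byte per pixel. The sketch below illustrates that arithmetic under the assumption that the frame is tightly packed; infer_height is a hypothetical helper, not a function confirmed to exist in the bundled script.

```python
def infer_height(raw: bytes, width: int) -> int:
    """Rows in a grayscale frame that is `width` pixels wide (1 byte per pixel)."""
    if width <= 0:
        raise ValueError("width must be positive")
    height, leftover = divmod(len(raw), width)
    if leftover:
        # A remainder usually means --width differs from the ffmpeg scale width.
        raise ValueError(
            f"{len(raw)} bytes is not a whole number of {width}-pixel rows"
        )
    return height


# A 60x34 grayscale frame is 2040 bytes.
print(infer_height(bytes(60 * 34), 60))
```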
Requirements:
- ffmpeg (with rawvideo support; check with which ffmpeg)
- python3
Usage
Basic ASCII Conversion
# Default 60 columns (auto-height, aspect ratio preserved)
ffmpeg -y -i <image> -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py
# Custom width
ffmpeg -y -i <image> -vf "scale=80:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --width 80
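If the pipeline needs to be driven from Python rather than a shell, the same ffmpeg invocation can be wrapped with subprocess. This is a rough sketch under that assumption; gray_frame_command and read_gray_frame are hypothetical names, and the argument list simply mirrors the commands above.

```python
import subprocess


def gray_frame_command(image_path: str, width: int = 60) -> list:
    """Build the ffmpeg argument list used in the shell commands above."""
    return [
        "ffmpeg", "-y", "-i", image_path,
        "-vf", f"scale={width}:-1,format=gray",
        "-frames:v", "1", "-f", "rawvideo", "pipe:",
    ]


def read_gray_frame(image_path: str, width: int = 60) -> bytes:
    """Run ffmpeg and return the raw grayscale bytes (one byte per pixel)."""
    result = subprocess.run(
        gray_frame_command(image_path, width),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
        check=True,                 # raise if ffmpeg exits non-zero
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with image paths that contain spaces.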
With Statistics and Edge Detection
ffmpeg -y -i <image> -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --stats --edges
# Example output:
# brightness_avg=142/255
# bright_pixels=1200
# dark_pixels=800
# unique_levels=180
# edges_detected=400/3600
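The fields in the example output could be computed along the following lines. This is only a sketch: the source does not state how ascii_viewer.py defines bright, dark, or edge pixels, so the thresholds bright_min, dark_max, and edge_delta are assumptions, as is the choice to compare each pixel only with its right-hand neighbour.

```python
def frame_stats(pixels: bytes, width: int, bright_min: int = 170,
                dark_max: int = 85, edge_delta: int = 32) -> dict:
    """Summarize a row-major grayscale buffer.

    bright_min, dark_max and edge_delta are assumed thresholds, not values
    taken from ascii_viewer.py.
    """
    if width <= 0 or not pixels or len(pixels) % width:
        raise ValueError("buffer must be a non-empty multiple of width")
    # An "edge" here is a pixel whose right-hand neighbour differs sharply.
    edges = sum(
        1
        for i in range(len(pixels) - 1)
        if (i + 1) % width != 0  # skip pairs that wrap onto the next row
        and abs(pixels[i] - pixels[i + 1]) >= edge_delta
    )
    return {
        "brightness_avg": round(sum(pixels) / len(pixels)),
        "bright_pixels": sum(1 for p in pixels if p >= bright_min),
        "dark_pixels": sum(1 for p in pixels if p <= dark_max),
        "unique_levels": len(set(pixels)),
        "edges_detected": edges,
        "total_pixels": len(pixels),
    }


# Two rows of four pixels: a dark left half and a bright right half.
print(frame_stats(bytes([10, 10, 240, 240, 10, 10, 240, 240]), width=4))
```

With separate bright and dark thresholds, mid-tone pixels fall into neither bucket, which is consistent with the example output where the two counts do not add up to the total.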
Color Sampling (No Python Needed)
# Overall average color (RGB hex)
ffmpeg -y -i <image> -vf "scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p | head -c 6
# Specific region (e.g. bottom-center quarter)
ffmpeg -y -i <image> -vf "crop=iw/2:ih/4:iw/4:3*ih/4,scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p
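The six hex digits printed by xxd -p are the red, green, and blue bytes of the sampled pixel. If a rough hue name is wanted, a small helper can decode them. The sketch below is illustrative only: parse_rgb_hex and coarse_hue are hypothetical names, and the hue buckets and the saturation and value cut-offs are arbitrary choices, not part of the skill.

```python
import colorsys


def parse_rgb_hex(hex_text: str) -> tuple:
    """Turn 'rrggbb' (as printed by xxd -p) into an (r, g, b) tuple."""
    cleaned = hex_text.strip()
    if len(cleaned) < 6:
        raise ValueError(f"expected at least 6 hex digits, got {cleaned!r}")
    r, g, b = bytes.fromhex(cleaned[:6])
    return r, g, b


def coarse_hue(rgb: tuple) -> str:
    """Name the rough hue family; the bucket boundaries are illustrative."""
    r, g, b = (c / 255 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if v < 0.15:
        return "black"
    if s < 0.15:
        return "gray" if v < 0.85 else "white"
    names = ["red", "yellow", "green", "cyan", "blue", "magenta"]
    # Each family spans 60 degrees, centred on 0, 60, 120, and so on.
    return names[int((h * 360 + 30) // 60) % 6]


sample = parse_rgb_hex("3060c0\n")
print(sample, coarse_hue(sample))
```

A single averaged pixel is a coarse measurement, so the name is best treated as a hint about the dominant tint rather than a precise color reading.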
Batch Scan Multiple Images
for f in *.jpg; do
echo "=== $f ==="
ffmpeg -y -i "$f" -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --stats
echo ""
done
Recommended Widths
| Width | Use case |
|---|---|
| 40 | Quick scan, simple images |
| 60 | Balanced readability vs detail (default) |
| 80 | More detail, complex images |
| 120 | Maximum detail (may be too wide for chat) |
Interpreting the Output
Overall Brightness
- Many @%# → bright scene, well-lit
- Many .-: → dark scene, night-time
- Top-to-bottom gradient → directional lighting (lamp above, shadow below)
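Judging "many" by eye can be backed up with a quick character count. The sketch below tallies the bright and dark characters named in the list above; brightness_profile is a hypothetical helper, and spaces (pure black) count toward the total but toward neither group, which is an assumption made to match the list as written.

```python
from collections import Counter

BRIGHT_CHARS = set("#%@")
DARK_CHARS = set(".:-")


def brightness_profile(ascii_art: str) -> dict:
    """Share of bright and dark characters, ignoring line breaks."""
    counts = Counter(ascii_art.replace("\n", ""))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty ASCII input")
    bright = sum(n for ch, n in counts.items() if ch in BRIGHT_CHARS)
    dark = sum(n for ch, n in counts.items() if ch in DARK_CHARS)
    return {"bright_share": bright / total, "dark_share": dark / total}


art = "@@##\n%%..\n::--"
print(brightness_profile(art))
```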
Content Patterns
- Clusters of #%@ → bright objects, light sources, highlights
- Vertical/horizontal lines of -= → edges, furniture, structures
- Organized patterns with mixed brightness → text, diagrams, labeled elements
- Heavy texture (*#%@ intermixed) → detailed surfaces (fabric, foliage, textured objects)
- Flat bands with little variation → night scenes, skies, plain backgrounds
Distinguishing Image Types
- Bright top + textured center + dark bottom → product shot or figure with directional lighting
- Uniformly dark with sparse clusters → night scene, silhouettes
- Structured patterns with +=-:#%@ formations → technical diagram, text overlay
- Same scene as another but with more detail/texture in a zone → variant with more content/elements
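To find which zone differs between two variants of the same scene, one option is to split both grayscale frames into a grid and compare the cells. The following sketch uses mean brightness per cell as a crude proxy for "more content"; zone_means and changed_zones are hypothetical helpers, and the 3x3 grid and threshold of 20 are arbitrary assumptions.

```python
def zone_means(pixels: bytes, width: int, rows: int = 3, cols: int = 3) -> list:
    """Mean brightness of each cell in a rows x cols grid over the frame."""
    if width <= 0 or not pixels or len(pixels) % width:
        raise ValueError("buffer must be a non-empty multiple of width")
    height = len(pixels) // width
    sums = [[0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for y in range(height):
        zr = y * rows // height  # grid row this pixel row falls into
        for x in range(width):
            zc = x * cols // width  # grid column this pixel falls into
            sums[zr][zc] += pixels[y * width + x]
            counts[zr][zc] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(cols)]
        for r in range(rows)
    ]


def changed_zones(a: bytes, b: bytes, width: int, threshold: float = 20.0) -> list:
    """(row, col) cells whose mean brightness differs by more than threshold."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    za, zb = zone_means(a, width), zone_means(b, width)
    return [
        (r, c)
        for r in range(len(za))
        for c in range(len(za[0]))
        if abs(za[r][c] - zb[r][c]) > threshold
    ]


base = bytes([10] * 9)
variant = bytes([10] * 8 + [200])  # only the bottom-right pixel is brighter
print(changed_zones(base, variant, width=3))
```

Both frames need to be produced with the same scale width so that the grids line up.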
Color Analysis Integration
Pair ASCII structural data with RGB color samples for richer diagnosis:
IMG="$1"
# 1. Original dimensions
ffprobe -v error -select_streams v:0 -show_entries stream=width,height -of csv=p=0 "$IMG"
# 2. ASCII + stats + edges
ffmpeg -y -i "$IMG" -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --stats --edges
# 3. Color info
echo "Average color (RGB hex):"
ffmpeg -y -i "$IMG" -vf "scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p | head -c 6
echo "Bottom region color:"
ffmpeg -y -i "$IMG" -vf "crop=iw/2:ih/4:iw/4:3*ih/4,scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p
Limitations
ASCII art is a mechanical fallback — it does NOT replace a vision model.
| Detects | Does NOT detect |
|---|---|
| Overall brightness (light vs dark scene) | Semantic meaning (what the subject is) |
| Contrast between regions | Color (everything is grayscale without xxd) |
| Texture (smooth vs detailed surface) | Legible text (only knows "something is there") |
| Lighting gradients (top-down, side, etc.) | Faces, emotions, or expressions |
| Edges and sharp transitions | Specific objects (person, cat, mask) |
| Spatial distribution of content | Depth, perspective, or real dimensions |
Good for:
- Checking if a generated image actually has content vs being blank
- Distinguishing between two variants of the same composition
- Detecting if there's text/detail in a specific region
- Confirming an image exists before sending it to the user
- Getting RGB color data from image regions
Not good for:
- Reading text (signs, screenshots, memes)
- Color-critical analysis (xxd helps but is coarse)
- Identifying objects, people, or animals
- Images with very fine detail (< 2–3 pixels wide)
ASCII gives you structural data (brightness, texture, edges), not semantics. Like looking at a photo with your eyes closed — you can feel light and shadow, but you can't name what you see.
Common Pitfalls
- Brightness-only. You cannot distinguish red from blue if they have the same luminance — color information is lost (use xxd color sampling for that)
- Too-low width (e.g. 30) loses fine detail like small text. Use at least 60 when detail matters; 40 is only suitable for quick scans of simple images.
- Too-high width (e.g. 120+) produces ASCII that is illegible in a chat context — too wide to display cleanly.
- Smooth gradients render as solid bands of a single character. This is expected, not a bug.
- Not a vision replacement. ASCII art is a fallback when vision is unavailable, not a substitute. Always prefer the real tool when it works.
- ffmpeg not installed. Verify with which ffmpeg before attempting. Minimal Docker images may lack it.
- Manual height mismatch. If you specify --height manually, it must match the ffmpeg scale=W:H output row count, or the ASCII will be misaligned.
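The brightness-only pitfall is easy to demonstrate numerically. The sketch below uses the Rec. 601 luma weights as an approximation; the exact grayscale conversion ffmpeg applies depends on the input's color matrix and range, so treat the numbers as illustrative. The luma helper is hypothetical.

```python
def luma(rgb: tuple) -> int:
    """Approximate grayscale value using the Rec. 601 weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


dark_red = (97, 0, 0)
pure_blue = (0, 0, 255)
# Two very different hues land on the same gray level.
print(luma(dark_red), luma(pure_blue))
```

Because both colors reduce to the same gray level, they map to the same ASCII character, which is why color sampling is needed to tell them apart.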
Verification Checklist
- ffmpeg is installed (which ffmpeg)
- Script at scripts/ascii_viewer.py exists and is executable
- Image path exists and is a valid image file
- Width is appropriate for the level of detail needed (60 default)
- Use scale=W:-1 in ffmpeg to auto-preserve aspect ratio (or match --height if manual)
- Output shows recognizable patterns, not just noise
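The first few checklist items can be automated before running the pipeline. This is a minimal sketch, assuming the default script location from the Setup section; preflight is a hypothetical helper, and it only checks that files exist, not that the image decodes or that the script has its executable bit set.

```python
import os
import shutil


def preflight(image_path: str, script_path: str = "scripts/ascii_viewer.py") -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for tool in ("ffmpeg", "python3"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    if not os.path.isfile(script_path):
        problems.append(f"script missing: {script_path}")
    if not os.path.isfile(image_path):
        problems.append(f"image missing: {image_path}")
    return problems


for problem in preflight("example.jpg"):
    print(problem)
```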
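Installation
- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install ascii-vision
- Once installed, invoke the skill by name or trigger it with /ascii-vision
- Provide the inputs described in the skill's parameter notes to get structured output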
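What is ASCII Vision?
Fallback image viewer when vision models are unavailable. Converts images to ASCII art via ffmpeg + Python for brightness distribution, texture analysis, edg... It is an AI agent skill plugin for Claude Code / OpenClaw, with 147 downloads to date.
How do I install ASCII Vision?
Run the command "/install ascii-vision" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.
Is ASCII Vision free?
Yes. ASCII Vision is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does ASCII Vision support?
ASCII Vision is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops ASCII Vision?
It is developed and maintained by Christian de la Cruz (@chdlc); the current version is v1.2.1.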