ASCII Vision
/install ascii-vision
Fallback image viewer when vision models are unavailable (rate limited, model down, no provider configured, etc.). Converts images to ASCII art using ffmpeg + Python so you (or the agent) can identify visual content — shapes, brightness distribution, textures, and structure — without relying on any vision API.
Also includes color sampling via raw pixel extraction for basic hue identification, and edge detection for texture quantification.
When to Use
- image/vision_analyze returns rate limit, model unavailable, or timeout errors
- You need to quickly distinguish between similar-looking images ("is this a dark variant of the same composition?")
- The agent needs visual inspection but no vision provider is configured
- Debugging image generation output — check if an image was actually produced before sending it to the user
- Quantitatively comparing two images (brightness, edges, color)
How It Works
- ffmpeg scales the image to a low resolution (e.g. 60 columns) in grayscale, preserving aspect ratio
- ascii_viewer.py maps each pixel (0–255) to an ASCII character
- Optional --stats outputs brightness average, pixel distribution, and unique levels
- Optional --edges detects sharp transitions (edges) for texture quantification
- Color sampling via ffmpeg + xxd extracts RGB hex values from specific regions
Character Map
| Range | Char | Meaning |
|---|---|---|
| 0–25 | (space) | Pure black |
| 26–51 | . | Very dark |
| 52–76 | : | Dark |
| 77–102 | - | Mid-dark |
| 103–127 | = | Medium |
| 128–153 | + | Mid-light |
| 154–179 | * | Light |
| 180–204 | # | Very light |
| 205–229 | % | Near white |
| 230–255 | @ | Pure white |
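The table above can be expressed as a small lookup. This is a minimal sketch that assumes the ranges are applied exactly as listed; the function name pixel_to_char and the use of bisect are illustrative choices, not taken from the bundled script.

```python
from bisect import bisect_left

# Inclusive upper bound of each range in the character map table.
UPPER_BOUNDS = [25, 51, 76, 102, 127, 153, 179, 204, 229, 255]
# One character per range, darkest to brightest (the first is a space).
CHARS = " .:-=+*#%@"


def pixel_to_char(value: int) -> str:
    """Return the ASCII character for a grayscale value in 0..255.

    Hypothetical helper: mirrors the table, not the script's real code.
    """
    if not 0 <= value <= 255:
        raise ValueError(f"grayscale value out of range: {value}")
    # bisect_left finds the first range whose upper bound is >= value.
    return CHARS[bisect_left(UPPER_BOUNDS, value)]


if __name__ == "__main__":
    # A 0..255 ramp sampled every 16 levels, rendered as one line.
    print("".join(pixel_to_char(v) for v in range(0, 256, 16)))
```

Explicit bounds are used instead of a division formula because the listed ranges are not perfectly uniform, so a formula could disagree with the table at a boundary.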
Setup
The bundled script is at scripts/ascii_viewer.py. Reference it relative to the skill directory:
SCRIPT=scripts/ascii_viewer.py
It accepts optional --width (default: 60) for columns. When paired with ffmpeg's scale=W:-1, height is auto-detected from the pixel data, preserving aspect ratio without distortion.
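Height auto-detection can be pictured as dividing the byte count by the width, since ffmpeg's gray rawvideo output is one byte per pixel. The sketch below is an assumption about how that inference might work; split_rows is a hypothetical name and the bundled script may do this differently.

```python
def split_rows(data: bytes, width: int) -> list[bytes]:
    """Split raw 8-bit grayscale pixels into rows of `width` bytes.

    Hypothetical helper: the height is inferred as len(data) // width.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height, leftover = divmod(len(data), width)
    if height == 0 or leftover:
        # A remainder means the width does not match the ffmpeg scale filter.
        raise ValueError(
            f"{len(data)} bytes is not a whole number of {width}-pixel rows"
        )
    return [data[r * width:(r + 1) * width] for r in range(height)]


if __name__ == "__main__":
    rows = split_rows(bytes(range(12)), width=4)
    print(len(rows), "rows of", len(rows[0]), "pixels")
```

Rejecting a non-zero remainder catches the case where the --width passed to the script differs from the W used in scale=W:-1.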
Requirements:
- ffmpeg (with rawvideo support; check with which ffmpeg)
- python3
Usage
Basic ASCII Conversion
# Default 60 columns (auto-height, aspect ratio preserved)
ffmpeg -y -i <image> -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py
# Custom width
ffmpeg -y -i <image> -vf "scale=80:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --width 80
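If the pixels are needed inside a Python process rather than a shell pipeline, the same ffmpeg invocation can be driven through subprocess. This is a sketch only: build_gray_command and read_gray_pixels are hypothetical names, and the arguments simply mirror the shell commands above.

```python
import subprocess


def build_gray_command(image_path: str, width: int = 60) -> list[str]:
    """Build the ffmpeg argument list used in the shell examples above."""
    return [
        "ffmpeg", "-y", "-i", image_path,
        "-vf", f"scale={width}:-1,format=gray",
        "-frames:v", "1", "-f", "rawvideo", "pipe:",
    ]


def read_gray_pixels(image_path: str, width: int = 60) -> bytes:
    """Run ffmpeg and return raw 8-bit grayscale pixels (one byte each)."""
    result = subprocess.run(
        build_gray_command(image_path, width),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
        check=True,                 # raise if ffmpeg exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Prints the command without running it; no image is required.
    print(" ".join(build_gray_command("example.jpg", width=80)))
```

Passing the arguments as a list avoids shell quoting problems with image paths that contain spaces.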
With Statistics and Edge Detection
ffmpeg -y -i <image> -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --stats --edges
# Example output:
# brightness_avg=142/255
# bright_pixels=1200
# dark_pixels=800
# unique_levels=180
# edges_detected=400/3600
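To make the fields in the example output concrete, here is one way such numbers could be computed from the raw grayscale bytes. The thresholds (bright_min, dark_max, edge_delta) and the horizontal-neighbour edge rule are assumptions chosen for illustration; the bundled script's actual definitions are not documented here and may differ.

```python
def gray_stats(
    data: bytes,
    width: int,
    bright_min: int = 170,  # assumed cutoff for "bright" pixels
    dark_max: int = 85,     # assumed cutoff for "dark" pixels
    edge_delta: int = 32,   # assumed jump between neighbours that counts as an edge
) -> dict[str, int]:
    """Compute simple brightness and edge statistics for 8-bit gray pixels."""
    if not data or width <= 0 or len(data) % width:
        raise ValueError("data must be a whole number of rows of `width` pixels")
    edges = 0
    for start in range(0, len(data), width):
        row = data[start:start + width]
        # Count sharp jumps between horizontally adjacent pixels.
        edges += sum(1 for a, b in zip(row, row[1:]) if abs(a - b) >= edge_delta)
    return {
        "brightness_avg": sum(data) // len(data),
        "bright_pixels": sum(1 for v in data if v >= bright_min),
        "dark_pixels": sum(1 for v in data if v <= dark_max),
        "unique_levels": len(set(data)),
        "edges_detected": edges,
        "total_pixels": len(data),
    }


if __name__ == "__main__":
    # Two rows, each half black and half white.
    sample = bytes([0, 0, 255, 255, 0, 0, 255, 255])
    for key, value in gray_stats(sample, width=4).items():
        print(f"{key}={value}")
```

With separate bright and dark cutoffs, mid-tone pixels fall into neither bucket, which is consistent with the example output where the bright and dark counts do not add up to the total.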
Color Sampling (No Python Needed)
# Overall average color (RGB hex)
ffmpeg -y -i <image> -vf "scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p | head -c 6
# Specific region (e.g. bottom-center quarter)
ffmpeg -y -i <image> -vf "crop=iw/2:ih/4:iw/4:3*ih/4,scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p
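The six hex digits printed by xxd are easier to reason about once converted to RGB and a rough hue name. The helper below is a hypothetical post-processing step, not part of the skill; the saturation and value cutoffs and the six hue buckets are arbitrary assumptions for coarse labelling.

```python
import colorsys

HUE_NAMES = ["red", "yellow", "green", "cyan", "blue", "magenta"]


def describe_hex(hex_rgb: str) -> tuple[tuple[int, int, int], str]:
    """Convert 'rrggbb' (as printed by xxd -p) to an RGB tuple and a coarse name."""
    r, g, b = bytes.fromhex(hex_rgb.strip()[:6])
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if v < 0.15:          # assumed cutoff: too dark to have a visible hue
        name = "black"
    elif s < 0.15:        # assumed cutoff: too unsaturated to have a hue
        name = "white" if v > 0.85 else "gray"
    else:
        # Six 60-degree buckets centred on the primary and secondary hues.
        name = HUE_NAMES[int((h * 360 + 30) % 360 // 60)]
    return (r, g, b), name


if __name__ == "__main__":
    for sample in ("ff0000", "0000ff", "808080"):
        print(sample, describe_hex(sample))
```

A single averaged pixel hides variation within the region, so the label is only a hint about the dominant hue.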
Batch Scan Multiple Images
for f in *.jpg; do
echo "=== $f ==="
ffmpeg -y -i "$f" -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --stats
echo ""
done
Recommended Widths
| Width | Use case |
|---|---|
| 40 | Quick scan, simple images |
| 60 | Balanced readability vs detail (default) |
| 80 | More detail, complex images |
| 120 | Maximum detail (may be too wide for chat) |
Interpreting the Output
Overall Brightness
- Many @%# → bright scene, well-lit
- Many .-: → dark scene, night-time
- Top-to-bottom gradient → directional lighting (lamp above, shadow below)
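The same reading can be done numerically by comparing the mean brightness of the top and bottom of the image. This is a minimal sketch assuming raw 8-bit grayscale input; brightness_profile and the choice of thirds are illustrative, not something the skill provides.

```python
def brightness_profile(data: bytes, width: int) -> dict[str, float]:
    """Mean brightness of the whole image, its top third, and its bottom third."""
    if not data or width <= 0 or len(data) % width:
        raise ValueError("data must be a whole number of rows of `width` pixels")
    rows = [data[i:i + width] for i in range(0, len(data), width)]
    third = max(1, len(rows) // 3)  # at least one row per band

    def mean(block: list[bytes]) -> float:
        flat = b"".join(block)
        return sum(flat) / len(flat)

    return {
        "overall": mean(rows),
        "top": mean(rows[:third]),
        "bottom": mean(rows[-third:]),
    }


if __name__ == "__main__":
    # Three rows: light, medium, black (a top-to-bottom gradient).
    sample = bytes([200] * 4 + [100] * 4 + [0] * 4)
    print(brightness_profile(sample, width=4))
```

A large gap between top and bottom suggests directional lighting, while a low overall value with similar bands suggests a uniformly dark scene.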
Content Patterns
- Clusters of #%@ → bright objects, light sources, highlights
- Vertical/horizontal lines of -= → edges, furniture, structures
- Organized patterns with mixed brightness → text, diagrams, labeled elements
- Heavy texture (*#%@ intermixed) → detailed surfaces (fabric, foliage, textured objects)
- Flat bands with little variation → night scenes, skies, plain backgrounds
Distinguishing Image Types
- Bright top + textured center + dark bottom → product shot or figure with directional lighting
- Uniformly dark with sparse clusters → night scene, silhouettes
- Structured patterns with +=-:#%@ formations → technical diagram, text overlay
- Same scene as another but with more detail/texture in a zone → variant with more content/elements
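When two images are suspected to be variants of the same composition, a zone-by-zone comparison can show where they differ. The sketch below assumes both images were scaled to the same width and height, and it uses mean brightness as the simplest signal; the function names are hypothetical, and the same banding could be applied to edge counts instead.

```python
def zone_means(data: bytes, width: int, zones: int = 3) -> list[float]:
    """Mean brightness of `zones` horizontal bands, ordered top to bottom."""
    if not data or width <= 0 or len(data) % width:
        raise ValueError("data must be a whole number of rows of `width` pixels")
    height = len(data) // width
    if not 0 < zones <= height:
        raise ValueError("zones must be between 1 and the image height")
    means = []
    for z in range(zones):
        first_row = z * height // zones
        last_row = (z + 1) * height // zones
        band = data[first_row * width:last_row * width]
        means.append(sum(band) / len(band))
    return means


def compare_zones(a: bytes, b: bytes, width: int, zones: int = 3) -> list[float]:
    """Per-zone brightness change from image a to image b (positive = brighter)."""
    if len(a) != len(b):
        raise ValueError("both images must have the same dimensions")
    return [
        mb - ma
        for ma, mb in zip(zone_means(a, width, zones), zone_means(b, width, zones))
    ]


if __name__ == "__main__":
    base = bytes([100] * 12)
    variant = bytes([100] * 8 + [40] * 4)  # bottom band darkened
    print(compare_zones(base, variant, width=4))
```

Differences near zero in every zone suggest the same image; a large change confined to one zone points to a local variant rather than a different scene.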
Color Analysis Integration
Pair ASCII structural data with RGB color samples for richer diagnosis:
IMG="$1"
# 1. Original dimensions
ffprobe -v error -select_streams v:0 -show_entries stream=width,height -of csv=p=0 "$IMG"
# 2. ASCII + stats + edges
ffmpeg -y -i "$IMG" -vf "scale=60:-1,format=gray" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| python3 scripts/ascii_viewer.py --stats --edges
# 3. Color info
echo "Average color (RGB hex):"
ffmpeg -y -i "$IMG" -vf "scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p | head -c 6
echo "Bottom region color:"
ffmpeg -y -i "$IMG" -vf "crop=iw/2:ih/4:iw/4:3*ih/4,scale=1:1,format=rgb24" -frames:v 1 -f rawvideo pipe: 2>/dev/null \
| xxd -p
Limitations
ASCII art is a mechanical fallback — it does NOT replace a vision model.
| Detects | Does NOT detect |
|---|---|
| Overall brightness (light vs dark scene) | Semantic meaning (what the subject is) |
| Contrast between regions | Color (everything is grayscale without xxd) |
| Texture (smooth vs detailed surface) | Legible text (only knows "something is there") |
| Lighting gradients (top-down, side, etc.) | Faces, emotions, or expressions |
| Edges and sharp transitions | Specific objects (person, cat, mask) |
| Spatial distribution of content | Depth, perspective, or real dimensions |
Good for:
- Checking if a generated image actually has content vs being blank
- Distinguishing between two variants of the same composition
- Detecting if there's text/detail in a specific region
- Confirming an image exists before sending it to the user
- Getting RGB color data from image regions
Not good for:
- Reading text (signs, screenshots, memes)
- Color-critical analysis (xxd helps but is coarse)
- Identifying objects, people, or animals
- Images with very fine detail (< 2–3 pixels wide)
ASCII gives you structural data (brightness, texture, edges), not semantics. Like looking at a photo with your eyes closed — you can feel light and shadow, but you can't name what you see.
Common Pitfalls
- Brightness-only. You cannot distinguish red from blue if they have the same luminance — color information is lost (use xxd color sampling for that)
- Too-low width (e.g. 30) loses fine detail like small text. Use at least 60 when detail matters; 40 is only enough for a quick scan of simple images.
- Too-high width (e.g. 120+) produces ASCII that is illegible in a chat context — too wide to display cleanly.
- Smooth gradients render as solid bands of a single character. This is expected, not a bug.
- Not a vision replacement. ASCII art is a fallback when vision is unavailable, not a substitute. Always prefer the real tool when it works.
- ffmpeg not installed. Verify with which ffmpeg before attempting. Minimal Docker images may lack it.
- Manual height mismatch. If you specify --height manually, it must match the ffmpeg scale=W:H output row count, or the ASCII will be misaligned.
Verification Checklist
- ffmpeg is installed (which ffmpeg)
- Script at scripts/ascii_viewer.py exists and is executable
- Image path exists and is a valid image file
- Width is appropriate for the level of detail needed (60 default)
- Use scale=W:-1 in ffmpeg to auto-preserve aspect ratio (or match --height if manual)
- Output shows recognizable patterns, not just noise
Installation
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install ascii-vision
- After installation, invoke the skill by name or use /ascii-vision
- Provide required inputs per the skill's parameter spec and get structured output
What is ASCII Vision?
Fallback image viewer when vision models are unavailable. Converts images to ASCII art via ffmpeg + Python for brightness distribution, texture analysis, edg... It is an AI Agent Skill for Claude Code / OpenClaw, with 147 downloads so far.
How do I install ASCII Vision?
Run "/install ascii-vision" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is ASCII Vision free?
Yes, ASCII Vision is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does ASCII Vision support?
ASCII Vision is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created ASCII Vision?
It is built and maintained by Christian de la Cruz (@chdlc); the current version is v1.2.1.