hyuuuliu

video-overlay-cleanup-agent

Author: Haroldle · GitHub · v1.0.1 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 121 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install cleanup-video-overlay
Description
Use this skill when the user wants to clean a video or screen recording by removing overlays such as status bars, notification banners, floating controls, su...
Usage instructions (SKILL.md)

Video Overlay Cleanup

Repository: https://github.com/hyuuuliu/video-overlay-cleanup-skill

Overview

Use this skill to turn a raw video or screen recording into a cleaner version with fewer visible overlays.

This skill is a workflow skill, not a claim of magical ground-truth recovery. It is best for:

  • fixed overlays at stable positions
  • short-lived banners or floating UI with a known mask
  • cleaning screen recordings for editing or redistribution
  • preparing a frame-by-frame restoration job for Gemini Nano Banana 2

This skill should frame the task as video cleanup or overlay removal, not guaranteed factual restoration of hidden content.

Provider Support

This version supports one real model provider for generative frame repair:

  • Gemini Nano Banana 2 via the Gemini image generation and editing API

Before using gemini-nano-banana mode, the user must configure an API key:

export GEMINI_API_KEY='your_api_key_here'

Optional model override:

export GEMINI_MODEL='gemini-3.1-flash-image-preview'
export GEMINI_IMAGE_SIZE='1K'
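
As a rough illustration of how a wrapper might consume these variables, the sketch below reads them from an environment mapping and fails early when the key is missing. The function name read_gemini_settings and the returned dictionary keys are hypothetical, not taken from the skill's scripts, and the skill's own default model and image size are not assumed here.

```python
import os


def read_gemini_settings(env=None):
    """Collect Gemini settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before using gemini-nano-banana mode"
        )
    settings = {"api_key": api_key}
    # Optional overrides are passed on only when the user set them, so the
    # skill's own defaults (unknown to this sketch) stay in effect otherwise.
    for var, key in (("GEMINI_MODEL", "model"), ("GEMINI_IMAGE_SIZE", "image_size")):
        value = env.get(var, "").strip()
        if value:
            settings[key] = value
    return settings
```

Passing a plain dictionary instead of os.environ makes the check easy to exercise without touching the real environment.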

Runtime Requirements

Declare these explicitly when installing or reviewing the skill:

  • required binaries:
    • bash
    • ffmpeg
    • ffprobe
    • python3
  • required credential for Gemini mode:
    • GEMINI_API_KEY
  • optional Gemini env vars:
    • GEMINI_MODEL
    • GEMINI_IMAGE_SIZE
  • required Python packages for Gemini mode:
    • google-genai
    • Pillow

Read references/gemini-provider.md when the user wants the Gemini path.
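
The requirements above can be checked before a run. The following is a minimal preflight sketch, assuming that google-genai is importable as google.genai and Pillow as PIL; the helper names are hypothetical and the skill itself may perform this check differently or not at all.

```python
import importlib.util
import shutil

REQUIRED_BINARIES = ("bash", "ffmpeg", "ffprobe", "python3")
# Import names differ from package names: google-genai -> google.genai, Pillow -> PIL.
GEMINI_MODULES = {"google-genai": "google.genai", "Pillow": "PIL"}


def module_available(name):
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package (e.g. "google") is missing.
        return False


def missing_requirements(gemini_mode, which=shutil.which, has_module=module_available):
    """List missing binaries, plus missing Python packages when Gemini mode is on."""
    missing = [b for b in REQUIRED_BINARIES if which(b) is None]
    if gemini_mode:
        missing += [pkg for pkg, mod in GEMINI_MODULES.items() if not has_module(mod)]
    return missing
```

An empty list means the declared binaries (and, in Gemini mode, the declared packages) were found; the credential check is separate.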

Best-Fit Requests

Use this skill when the user asks for any of the following:

  • remove a status bar, top bar, bottom bar, subtitle strip, watermark, or floating control from a video
  • clean a phone screen recording before reuse or clipping
  • split a video into frames, repair masked regions, and rebuild it
  • prepare a repeatable FFmpeg pipeline for video overlay removal
  • use Gemini Nano Banana 2 to repair a masked overlay region frame by frame
  • generate masks or region specs for known overlay areas

Default Strategy

Choose the lightest path that can work:

  1. Fixed overlay, stable region: use ffmpeg with a generated mask and removelogo
  2. Fixed region but removelogo quality is not enough: extract frames and run Gemini Nano Banana 2 on the masked area
  3. Dynamic overlay: extract frames, create or propagate masks, run Gemini Nano Banana 2, then rebuild the video

Do not default to full-frame generative editing when a fixed-region or deterministic method is enough.

Workflow

1. Classify the overlay

Decide which of these cases applies:

  • fixed-edge-overlay: top status bar, bottom nav bar, subtitle strip, corner logo
  • fixed-box-overlay: stable watermark or floating widget in one area
  • dynamic-overlay: notification banner, moving sticker, transient floating control
  • unknown: the user has not yet provided enough detail to define a mask

If the overlay location is unclear, ask for the smallest missing detail needed, or propose a reasonable first mask and label it as a draft.
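
The default strategy and the four overlay classes can be combined into one small decision function. This is only a sketch: choose_mode is a hypothetical name, and the removelogo_good_enough flag stands in for a human or visual judgment that the skill does not automate as far as this document states.

```python
FIXED_KINDS = {"fixed-edge-overlay", "fixed-box-overlay"}


def choose_mode(overlay_kind, removelogo_good_enough=True):
    """Map an overlay class to the lightest cleanup mode that can work."""
    if overlay_kind in FIXED_KINDS:
        # Deterministic FFmpeg path first; fall back to Gemini only if needed.
        return "removelogo" if removelogo_good_enough else "gemini-nano-banana"
    if overlay_kind == "dynamic-overlay":
        return "gemini-nano-banana"
    if overlay_kind == "unknown":
        # No mode yet: ask for the missing detail or propose a draft mask.
        return None
    raise ValueError(f"unrecognized overlay kind: {overlay_kind!r}")
```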

2. Pick the path

For fixed-edge-overlay or fixed-box-overlay, prefer the built-in scripts:

  • scripts/make_mask.py
  • scripts/clean_video.sh --mode removelogo

For Gemini-based workflows, use:

  • scripts/clean_video.sh --mode gemini-nano-banana
  • scripts/gemini_nano_banana_edit.py

For low-level or custom provider workflows, use:

  • scripts/extract_frames.sh
  • scripts/restore_frames.py
  • scripts/rebuild_video.sh

Read references/pipeline.md for the end-to-end flow, references/mask-strategies.md for mask design guidance, and references/gemini-provider.md for Gemini-specific behavior.
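
For orientation, the extract and rebuild steps of a low-level workflow could look roughly like the FFmpeg invocations built below. These command lines are an assumption about what such a pipeline needs, not a copy of scripts/extract_frames.sh or scripts/rebuild_video.sh; the %06d.png frame naming, libx264 encoding, and function names are illustrative choices.

```python
def extract_frames_cmd(video, frames_dir):
    """Build an ffmpeg command that writes every frame as a numbered PNG."""
    return ["ffmpeg", "-i", video, f"{frames_dir}/%06d.png"]


def rebuild_video_cmd(frames_dir, source_video, fps, output):
    """Build an ffmpeg command that re-encodes frames and reuses the source audio."""
    return [
        "ffmpeg",
        "-framerate", str(fps),           # frame rate of the image sequence
        "-i", f"{frames_dir}/%06d.png",   # input 0: restored frames
        "-i", source_video,               # input 1: original file, for audio
        "-map", "0:v",                    # video from the frames
        "-map", "1:a?",                   # audio from the source, if it has any
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        "-c:a", "copy",
        output,
    ]
```

Each list can be passed to subprocess.run(cmd, check=True). The frame rate would normally come from ffprobe so the rebuilt video keeps the original timing.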

3. Build the mask carefully

Masks decide success. Keep these rules:

  • Mask only the overlay, not the whole frame
  • Prefer a slightly tight mask over an oversized one
  • For fixed bars, use presets or percentage-based regions
  • For dynamic overlays, keep the frame-edit scope local and explicit

Region format used by this skill is x:y:w:h. Each value may be pixels like 120 or a percentage like 8%.

Examples:

  • Top 7 percent: 0:0:100%:7%
  • Bottom strip: 0:92%:100%:8%
  • Corner watermark: 78%:3%:18%:12%
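
To make the x:y:w:h format concrete, here is a small sketch that resolves a region spec to pixel coordinates for a given frame size. The rounding and clamping behavior are assumptions of this sketch; scripts/make_mask.py may resolve percentages differently.

```python
def _to_pixels(token, full):
    """Convert '120' or '8%' to pixels relative to the full dimension."""
    token = token.strip()
    if token.endswith("%"):
        return round(float(token[:-1]) / 100.0 * full)
    return int(token)


def parse_region(spec, frame_width, frame_height):
    """Resolve an 'x:y:w:h' spec to an (x, y, w, h) pixel rectangle."""
    parts = spec.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected x:y:w:h, got {spec!r}")
    x = _to_pixels(parts[0], frame_width)
    y = _to_pixels(parts[1], frame_height)
    w = _to_pixels(parts[2], frame_width)
    h = _to_pixels(parts[3], frame_height)
    # Clamp so the rectangle never extends past the frame (sketch assumption).
    x = max(0, min(x, frame_width))
    y = max(0, min(y, frame_height))
    w = max(0, min(w, frame_width - x))
    h = max(0, min(h, frame_height - y))
    return x, y, w, h
```

For a 1080x1920 portrait recording, the bottom-strip example 0:92%:100%:8% resolves to the rectangle (0, 1766, 1080, 154) under these rounding rules.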

4. Run the cleanup path

Fast fixed-overlay path

Use when the overlay stays in one place.

bash scripts/clean_video.sh \
  --input /abs/path/input.mp4 \
  --output /abs/path/output.mp4 \
  --mode removelogo \
  --region '0:0:100%:7%'

You can also add presets:

bash scripts/clean_video.sh \
  --input /abs/path/input.mp4 \
  --output /abs/path/output.mp4 \
  --mode removelogo \
  --preset iphone-status-bar
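
Under the hood, a removelogo run comes down to applying FFmpeg's removelogo filter with a mask image that matches the video dimensions. The sketch below builds such a command; it is an assumption about the general shape of the call, not the exact command that clean_video.sh issues, and the function name is hypothetical.

```python
def removelogo_cmd(input_video, mask_png, output_video):
    """Build an ffmpeg command that applies the removelogo filter with a mask."""
    return [
        "ffmpeg",
        "-i", input_video,
        # The mask must have the same width and height as the video;
        # non-black pixels mark the overlay to remove.
        # Paths containing ':' or ',' would need filtergraph escaping.
        "-vf", f"removelogo=f={mask_png}",
        "-c:a", "copy",  # leave the audio stream untouched
        output_video,
    ]
```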

Gemini Nano Banana 2 path

Use when the user wants masked per-frame restoration.

export GEMINI_API_KEY='your_api_key_here'

bash scripts/clean_video.sh \
  --input /abs/path/input.mp4 \
  --output /abs/path/output.mp4 \
  --mode gemini-nano-banana \
  --region '0:0:100%:12%' \
  --image-size 1K

Optional model override:

bash scripts/clean_video.sh \
  --input /abs/path/input.mp4 \
  --output /abs/path/output.mp4 \
  --mode gemini-nano-banana \
  --region '0:0:100%:12%' \
  --model gemini-3.1-flash-image-preview \
  --image-size 2K

Custom frame-edit path

Use only when the user explicitly wants a custom editor command instead of Gemini.

bash scripts/clean_video.sh \
  --input /abs/path/input.mp4 \
  --output /abs/path/output.mp4 \
  --mode frame-edit \
  --region '0:0:100%:12%' \
  --editor-cmd 'my-editor --input {input} --mask {mask} --output {output}'

The editor command receives:

  • {input}: source frame path
  • {mask}: mask path
  • {output}: destination frame path
  • {index}: zero-based frame number
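
One way to expand these placeholders safely is to quote each value before substitution and then split the result into an argument list. This is a sketch under that assumption; build_editor_command is a hypothetical name, and restore_frames.py may construct its subprocess call differently.

```python
import re
import shlex

_PLACEHOLDER = re.compile(r"\{(input|mask|output|index)\}")


def build_editor_command(template, input_path, mask_path, output_path, index):
    """Fill the editor template and return an argv list for subprocess.run."""
    quoted = {
        "input": shlex.quote(input_path),
        "mask": shlex.quote(mask_path),
        "output": shlex.quote(output_path),
        "index": shlex.quote(str(index)),
    }
    # Single-pass substitution, so a path that happens to contain "{mask}"
    # is not expanded a second time.
    command = _PLACEHOLDER.sub(lambda m: quoted[m.group(1)], template)
    return shlex.split(command)
```

Because the values are quoted before splitting, a frame path containing spaces stays a single argument.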

5. Be explicit about limits

Always note when one of these is true:

  • the hidden content was never visible in any frame
  • the result is a visual reconstruction rather than factual recovery
  • the overlay is dynamic and the mask is approximate
  • frame-by-frame edits may need temporal stabilization beyond this first pass
  • Gemini may vary slightly frame to frame, especially on large masked regions
  • Gemini mode now reports frame count, base cost estimate, and cumulative token/spend progress while running
  • Gemini mode now preserves all unmasked pixels from the original frame at save time, so only masked regions are adopted from model output

Known Limitations

Keep the risk statement simple:

  • masked or covered areas are not restored reliably every time
  • the result is a plausible visual cleanup, not guaranteed true recovery
  • Gemini processing can be expensive, especially on longer videos or higher frame counts
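
Because cost scales with the number of frames, a back-of-the-envelope estimate is easy to compute before any API call. The sketch below assumes one request per frame at a flat price; price_per_frame is a placeholder the user must supply from the provider's current pricing, and the skill's own estimator may use token counts instead.

```python
import math


def estimate_budget(duration_seconds, fps, price_per_frame):
    """Return (frame_count, estimated_cost) for one request per frame."""
    frame_count = math.ceil(duration_seconds * fps)
    return frame_count, frame_count * price_per_frame
```

A 10-second clip at 30 fps is already 300 frames, which is why testing on short clips first keeps spend predictable.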

Current Reliability Notes

The current Gemini implementation uses several safeguards:

  • budget estimation before processing begins
  • cumulative token and spend reporting during processing
  • per-frame validation and retry
  • request-level retry for transient provider errors
  • masked compositing that keeps all unmasked pixels from the original frame

That last safeguard is important: if the model edits the hamster, cage, or other visible scene content outside the white mask, the saved output still keeps the original unmasked pixels. This improves visual stability, but it also means all useful edits must happen inside the supplied mask.
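
The masked compositing idea can be shown on plain nested lists of pixel values. This is a dependency-free sketch of the principle, not the code in gemini_nano_banana_edit.py; the threshold of 128 for deciding what counts as "white" is an assumption.

```python
def composite_masked(original, edited, mask, threshold=128):
    """Take `edited` pixels only where the mask is white; keep `original` elsewhere."""
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("original, edited, and mask must have the same height")
    result = []
    for orig_row, edit_row, mask_row in zip(original, edited, mask):
        if not (len(orig_row) == len(edit_row) == len(mask_row)):
            raise ValueError("rows must have the same width")
        result.append([
            edit_px if mask_px >= threshold else orig_px
            for orig_px, edit_px, mask_px in zip(orig_row, edit_row, mask_row)
        ])
    return result
```

With Pillow, Image.composite(edited, original, mask) expresses the same selection on real images, provided all three share the same size.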

Output Expectations

A good result includes:

  • the cleaned video path
  • the chosen method: removelogo, gemini-nano-banana, or frame-edit
  • the exact mask or presets used
  • whether GEMINI_API_KEY was required
  • the frame count that Gemini mode planned to process
  • the base estimated spend before Gemini calls began
  • the cumulative token and spend summary reported during processing
  • any important caveats about realism, consistency, or residual artifacts

When useful, also keep the work directory so the user can inspect:

  • extracted frames
  • generated mask files
  • rebuilt intermediate video
  • job manifest

Quality Bar

  • Prefer stable, repeatable cleanup over ambitious but noisy edits
  • Avoid editing untouched parts of the frame
  • Use deterministic FFmpeg paths first
  • Treat generative editing as a targeted fallback, not the default
  • Separate "cleaner-looking" from "factually restored"
  • Remind the user when a provider API key is required

Prompt Patterns

Good invocations:

  • "Use $video-overlay-cleanup to remove the top status bar from this screen recording."
  • "Use $video-overlay-cleanup with Gemini Nano Banana 2 to clean the top banner from this video."
  • "Use $video-overlay-cleanup to split this video into frames, clean the floating overlay, and rebuild it."

For implementation details, load:

Security recommendations
This skill appears internally coherent for cleaning video overlays, but consider the following before installing or running:

  1. GEMINI_API_KEY is required for per-frame generative edits. Use an account with appropriate billing limits, because per-frame API calls can be expensive.
  2. The frame-edit path executes an editor command for every frame. Ensure the editor_cmd you supply (or the provided Gemini wrapper) is trusted, because restore_frames.py runs that command via subprocess.
  3. The skill manipulates local files and will create and (by default) delete a workdir. Run it in a controlled directory and use --keep-workdir while testing.
  4. Ensure google-genai and Pillow are installed in your environment before using Gemini mode.
  5. Inspect the scripts (they are included) and test with short or clipped videos first to validate quality and cost.

If you need stricter isolation, run the skill in a sandboxed environment or with a limited API key and low concurrency.
Functional analysis
Type: OpenClaw Skill
Name: video-overlay-cleanup-agent
Version: 1.0.1

The skill bundle provides a legitimate workflow for video overlay removal using FFmpeg and the Gemini API. It includes a complete pipeline for frame extraction, mask generation, AI-powered frame editing, and video reconstruction (clean_video.sh, gemini_nano_banana_edit.py). The implementation follows security best practices by using shlex for safe shell command construction and includes a 'self-improvement' mechanism that tracks token usage and provides execution advice for subsequent runs. No evidence of data exfiltration, unauthorized network access, or malicious persistence was found; specific references to 'hamsters' in the AI prompts appear to be remnants of development testing.
Capability assessment
Purpose & Capability
Name, description, required binaries (ffmpeg/ffprobe/python3/bash) and the declared primary credential (GEMINI_API_KEY) match the workflow: FFmpeg for frame extraction/rebuild and an image-editing provider for per-frame generative repair. Declared Python packages (google-genai, Pillow) are exactly what the Gemini wrapper and image processing code use.
Instruction Scope
SKILL.md and the scripts only instruct the agent to read/write local video/frame files, call ffmpeg, and call the Gemini wrapper when requested. The instructions reference only the declared env vars and the skill's workdir files (mask, frames, usage, manifest). The frame-edit path does run external editor commands per-frame, which is expected for a pluggable editor workflow.
Install Mechanism
There is no install spec (instruction-only/packaged scripts). No external downloads or archive extraction are performed by the skill itself, so nothing opaque is pulled from the network during install.
Credentials
Only GEMINI_API_KEY (primaryEnv) is required for the Gemini editing path; optional GEMINI_MODEL and GEMINI_IMAGE_SIZE are declared. No unrelated credentials or system config paths are requested.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide settings. It writes local run artifacts and a persistent_learnings.json inside the skill tree, which is consistent with its stated behavior.
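How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install cleanup-video-overlay
  3. Once installed, invoke the skill by name or trigger it with /cleanup-video-overlay
  4. Provide the required inputs according to the skill's parameter description to get structured output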
Version history
v1.0.1
This update adds explicit runtime requirements and metadata for easier installation and compatibility checks.

  • Declares required binaries (bash, ffmpeg, ffprobe, python3) and Python packages (google-genai, Pillow) in metadata.
  • Documents environment variables needed for Gemini Nano Banana 2 editing, including a required API key.
  • Specifies homepage and tool permissions (Read, Write, Edit, Bash, Glob, Grep).
  • No code or workflow logic was changed.
v1.0.0
video-overlay-cleanup-agent 1.0.0: initial release with core video overlay removal and restoration workflow.

  • Supports cleaning videos or screen recordings by removing overlays (status bars, banners, controls, subtitle strips, watermarks).
  • Offers robust FFmpeg-based region masking and removelogo automation for fixed overlays.
  • Integrates frame-by-frame masked restoration via Gemini Nano Banana 2, with clear guidance for API setup and use.
  • Provides detailed mask design strategies and workflow branching for fixed vs. dynamic overlays.
  • Adds budget estimation, process transparency, and visual safeguards for generative edits.
  • Clearly communicates limitations and output expectations, focusing on plausible visual cleanup rather than guaranteed ground-truth recovery.
Metadata
Slug: cleanup-video-overlay
Version: 1.0.1
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 2
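Frequently asked questions

What is video-overlay-cleanup-agent?

Use this skill when the user wants to clean a video or screen recording by removing overlays such as status bars, notification banners, floating controls, su... It is an AI agent skill plugin for Claude Code / OpenClaw, with 121 total downloads to date.

How do I install video-overlay-cleanup-agent?

Run the command "/install cleanup-video-overlay" in the OpenClaw or Claude Code chat box to install it in one step. Gemini mode additionally requires GEMINI_API_KEY and the Python packages listed under Runtime Requirements.

Is video-overlay-cleanup-agent free?

Yes. video-overlay-cleanup-agent is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does video-overlay-cleanup-agent support?

video-overlay-cleanup-agent is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops video-overlay-cleanup-agent?

It is developed and maintained by Haroldle (@hyuuuliu). The current version is v1.0.1.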
