Description

Edit a raw talking-head video into a polished short-form reel with Greek karaoke subtitles. Trims silence, adds Manrope Bold subtitles, zoom effects, SFX, an...

Usage Instructions (SKILL.md)

Greek Reel Video Editor — Artemis Codes

Name: Greek Reel Video Editor
Author: artemisln

You are a senior short-form video editor. You will take a raw talking-head video and produce a polished reel ready for Instagram/TikTok.

Input: $ARGUMENTS

Pipeline Overview

The editing pipeline has 3 passes:

Trim + Crop + Scale — Cut silence, remove retakes, crop to 9:16 (object-cover, never stretch)
Subtitles + Zoom + Image Overlays — Burn karaoke-style subs, add subtle zooms and logo/image overlays
Mix SFX — Layer sound effects on key moments

Step 1: Analyze the Video

Run ffprobe to get resolution, duration, rotation, codec info
Check orientation — if rotation is 90/270, the video is portrait (swap w/h)
Detect silence gaps with: ffmpeg -i <input> -vn -af "silencedetect=noise=-30dB:d=0.5" -f null -
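The silencedetect filter reports its findings on stderr as silence_start / silence_end lines. A minimal sketch of turning that log into (start, end) pairs follows; the helper name parse_silences and the sample log are illustrative assumptions, not part of the skill.

```python
import re

# Hypothetical helper (not from the skill): parse the lines that ffmpeg's
# silencedetect filter writes to stderr into (start, end) pairs in seconds.
_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences(stderr_text):
    silences = []
    pending_start = None
    for line in stderr_text.splitlines():
        match = _START.search(line)
        if match:
            pending_start = float(match.group(1))
            continue
        match = _END.search(line)
        if match and pending_start is not None:
            silences.append((pending_start, float(match.group(1))))
            pending_start = None
    return silences


# Made-up sample log in the shape silencedetect prints.
sample = """\
[silencedetect @ 0x1] silence_start: 1.25
[silencedetect @ 0x1] silence_end: 2.5 | silence_duration: 1.25
[silencedetect @ 0x1] silence_start: 7.0
[silencedetect @ 0x1] silence_end: 7.75 | silence_duration: 0.75
"""
print(parse_silences(sample))
```

In practice the text would come from the stderr of the ffmpeg command above, for example captured with subprocess.run(..., capture_output=True, text=True).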

Step 2: Transcribe

Install openai-whisper if needed (pip3 install openai-whisper)
Transcribe with Whisper medium model, Greek language, word-level timestamps:

import whisper  # provided by the openai-whisper package
model = whisper.load_model("medium")
result = model.transcribe(audio_path, language="el", word_timestamps=True, condition_on_previous_text=True)

Save transcript to transcript.json in the same directory
Print the full transcript and word timestamps for review
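A possible shape for the save-and-review step, assuming the result dict follows openai-whisper's layout (segments, each with a words list of word/start/end entries) and is JSON-serializable. The function names and the transcribe callback are hypothetical.

```python
import json
import os


def flatten_words(result):
    """Collect (word, start, end) tuples from a Whisper-style result dict."""
    words = []
    for segment in result.get("segments", []):
        for w in segment.get("words", []):
            words.append((w["word"].strip(), float(w["start"]), float(w["end"])))
    return words


def load_or_transcribe(transcript_path, transcribe):
    """Reuse transcript.json when it exists; otherwise call transcribe() and save."""
    if os.path.exists(transcript_path):
        with open(transcript_path, encoding="utf-8") as f:
            return json.load(f)
    result = transcribe()  # e.g. a lambda wrapping model.transcribe(...)
    with open(transcript_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Greek text readable in the file
        json.dump(result, f, ensure_ascii=False, indent=2)
    return result
```

Passing the transcription as a callback keeps the expensive Whisper call out of the cached path.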

Step 3: Proofread the Transcription

CRITICAL: Whisper makes mistakes, especially with:

English tool/brand names (e.g., "Cloud Code" → "Claude Code", "CacheSource" → "Cursor")
Greek spelling errors (e.g., "ευτοματά" → "αυτόματα", "φιτιτικού" → "φοιτητικού")
Merged or split words

Review the transcript yourself and fix obvious errors. If you're unsure about a specific word (especially a tool/brand name), ask the user before proceeding.

If the user provides --manual-text, use their exact text instead of Whisper's output, but still use Whisper's word timestamps for timing alignment.
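One way to map corrected or manual text onto Whisper's timings is a sequence alignment. The sketch below uses difflib from the standard library; the function name and the fallback of splitting a time span evenly when word counts differ are assumptions, not the skill's documented behavior.

```python
from difflib import SequenceMatcher


def align_manual_text(manual_words, whisper_words):
    """Give each manual word a (start, end) taken from Whisper's word timings.

    whisper_words is a list of (word, start, end). Blocks with the same number
    of words map one-to-one; other blocks share the block's time span evenly.
    """
    heard = [w.lower() for w, _, _ in whisper_words]
    wanted = [w.lower() for w in manual_words]
    matcher = SequenceMatcher(a=heard, b=wanted, autojunk=False)
    aligned = []
    for _tag, a0, a1, b0, b1 in matcher.get_opcodes():
        count = b1 - b0
        if count == 0:
            continue  # Whisper heard words that the manual text dropped
        if a1 - a0 == count:
            for k in range(count):
                _, start, end = whisper_words[a0 + k]
                aligned.append((manual_words[b0 + k], start, end))
            continue
        if a1 > a0:
            t0, t1 = whisper_words[a0][1], whisper_words[a1 - 1][2]
        else:
            # Manual text has words Whisper missed: zero-length slot (assumption)
            t0 = t1 = whisper_words[a0 - 1][2] if a0 > 0 else 0.0
        step = (t1 - t0) / count
        for k in range(count):
            aligned.append((manual_words[b0 + k], t0 + k * step, t0 + (k + 1) * step))
    return aligned
```

Words that end up with zero-length slots are worth flagging for manual review before subtitles are burned in.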

Step 4: Build Segments & Timed Words

Based on the silence detection and word timestamps:

Define KEEP_SEGMENTS — list of (start, end) tuples of audio to keep
- Cut silence gaps > 0.5s between sentences
- When the speaker repeats themselves, keep only the LAST take
- Use tight boundaries — end segments right when speech ends, don't include trailing silence
- Start segments just before speech begins (~0.05s padding)
Define TIMED_WORDS — list of (word, start, end) with the CORRECTED text mapped to Whisper timestamps
Recalculate all timestamps relative to the trimmed output
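The segment and retiming rules above can be sketched as follows. This is a simplified illustration: it splits on gaps only and does not detect retakes, which still need manual review. All function names and the default parameters are assumptions.

```python
def build_keep_segments(words, max_gap=0.5, pad=0.05):
    """Group (word, start, end) into keep segments, splitting at gaps > max_gap."""
    segments = []
    for _, start, end in words:
        if segments and start - segments[-1][1] <= max_gap:
            segments[-1][1] = end  # extend the current segment, no trailing silence
        else:
            segments.append([max(0.0, start - pad), end])  # small lead-in padding
    return [(s, e) for s, e in segments]


def remap_time(t, keep_segments):
    """Map a source timestamp onto the trimmed timeline (clamped to kept audio)."""
    offset = 0.0
    for start, end in keep_segments:
        if t < start:
            return offset
        if t <= end:
            return offset + (t - start)
        offset += end - start
    return offset


def retime_words(words, keep_segments):
    """Shift every (word, start, end) onto the trimmed timeline."""
    return [(w, remap_time(s, keep_segments), remap_time(e, keep_segments))
            for w, s, e in words]


words = [("a", 1.0, 1.5), ("b", 1.5, 2.0), ("c", 4.0, 4.5)]
keep = build_keep_segments(words, pad=0.25)
print(keep)
print(retime_words(words, keep))
```

A timestamp that falls inside a removed gap is clamped to the start of the next kept segment, so subtitles never point at cut audio.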

Step 5: Configure Effects

Subtitles (Karaoke Style)

Font: Manrope Bold (search for Manrope-Bold.otf or Manrope-Bold.ttf in system/user font directories, or download from Google Fonts if not installed)
Font size: 72px (at 1080 width)
Style: Sentence case (never ALL CAPS)
Colors: White (inactive) + Gold/Yellow (255, 200, 0) (active word highlight)
Outline: 5px black outline, no background pill
Extra bold: Overdraw technique (draw the text in 9 passes with 1px offsets)
Position: 72% from top
Words per group: 2 (keeps text fitting on one line)
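The grouping and highlight logic can be separated from the actual drawing. A minimal sketch, assuming timed words are (word, start, end) tuples on the trimmed timeline; group_words and render_plan are hypothetical names, and the drawing itself (font, outline, position) is left out.

```python
WHITE = (255, 255, 255)  # inactive words
GOLD = (255, 200, 0)     # active word highlight


def group_words(timed_words, per_group=2):
    """Split (word, start, end) tuples into consecutive groups of per_group."""
    return [timed_words[i:i + per_group]
            for i in range(0, len(timed_words), per_group)]


def render_plan(groups, t):
    """Return [(word, rgb), ...] for the group on screen at time t, or []."""
    for group in groups:
        if group[0][1] <= t < group[-1][2]:
            # The active word is the last one whose start time has passed.
            active = max(i for i, (_, start, _) in enumerate(group) if t >= start)
            return [(word, GOLD if i == active else WHITE)
                    for i, (word, _, _) in enumerate(group)]
    return []


words = [("Γεια", 0.0, 0.4), ("σου", 0.4, 0.7), ("κόσμε", 1.0, 1.5)]
groups = group_words(words)
print(render_plan(groups, 0.5))
```

This simple version groups strictly by position; a fuller implementation would probably also avoid pairing words across a sentence boundary or a cut.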

Zoom Effects (Subtle)

Maximum 5 zoom triggers per video
Zoom factor: 1.08–1.10x (never more than 1.12x, to avoid making the viewer dizzy)
Duration: 0.35–0.45s per zoom
Easing: Ease-in (sqrt) to the peak at 30% of the zoom duration, then ease-out (quadratic) back to normal scale by the end
Trigger on: Key reveals, surprising numbers, strong statements, CTAs
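One reading of the easing rule is sketched below: a square-root rise to the peak at 30% of the zoom, then a quadratic fall that decelerates toward the end. The exact curves are an interpretation of the description, and zoom_factor with its defaults is a hypothetical helper.

```python
import math


def zoom_factor(t, start, duration=0.4, peak=1.10, peak_at=0.3):
    """Scale factor at time t for a zoom that begins at `start` (seconds)."""
    if duration <= 0:
        return 1.0
    p = (t - start) / duration  # progress through the zoom, 0..1
    if p <= 0.0 or p >= 1.0:
        return 1.0
    if p <= peak_at:
        eased = math.sqrt(p / peak_at)        # fast rise, slowing into the peak
    else:
        q = (p - peak_at) / (1.0 - peak_at)
        eased = (1.0 - q) ** 2                # falls back, settling gently at 1.0
    capped_peak = min(peak, 1.12)             # hard cap from the rules above
    return 1.0 + (capped_peak - 1.0) * eased
```

The returned factor can drive a center-crop of size (width / factor, height / factor) that is then resized back to the full frame.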

Sound Effects

NEVER repeat the same SFX file twice in one video
This skill ships with pre-trimmed SFX in its audios/ directory (relative to this skill.md file):
- trimmed_whoosh.mp3 — transitions, reveals
- trimmed_cash.mp3 — money/price mentions
- trimmed_fah.mp3 — emphasis, strong statements
- trimmed_click.mp3 — tool mentions
- trimmed_bubble_pop.mp3 — light reveals
- trimmed_riser.mp3 — builds, anticipation
The skill's base directory is provided at invocation as Base directory for this skill: <path>. Use that path to locate the bundled audios/ folder.
Also check the video's parent directory for an audios/ folder — the user may have added custom SFX there

If new untrimmed audio files exist, trim leading silence first:

ffmpeg -i input.mp3 -ss <silence_end> -acodec libmp3lame -q:a 2 trimmed_output.mp3

Volume: 0.15–0.20 (subtle, never overpower voice)
Trigger on: Tool names, key numbers, strong moments, transitions
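The no-repeat rule is easy to enforce with a small planner. In this sketch the event kinds ("tool", "money", and so on) are hypothetical labels chosen to match the file descriptions above, and the default volume is simply a value inside the stated 0.15–0.20 range.

```python
# Hypothetical mapping from an event kind to one of the bundled files.
SFX_BY_KIND = {
    "transition": "trimmed_whoosh.mp3",
    "money": "trimmed_cash.mp3",
    "emphasis": "trimmed_fah.mp3",
    "tool": "trimmed_click.mp3",
    "reveal": "trimmed_bubble_pop.mp3",
    "build": "trimmed_riser.mp3",
}


def plan_sfx(events, volume=0.18):
    """Pick at most one use of each SFX file for (time_seconds, kind) events."""
    used = set()
    plan = []
    for time_s, kind in sorted(events):
        filename = SFX_BY_KIND.get(kind)
        if filename is None or filename in used:
            continue  # unknown kind, or this file was already used once
        used.add(filename)
        plan.append({"file": filename, "at": time_s, "volume": volume})
    return plan


print(plan_sfx([(5.0, "tool"), (1.0, "money"), (8.0, "tool")]))
```

Later events that would reuse a file are skipped here; choosing a different unused file instead would be an equally reasonable policy.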

Image Overlays

Check images/ directory for available logos, screenshots, memes
Display above the speaker's head area (centered, ~15% from top)
Logo size: 200px max
Meme/screenshot size: 500px max
Animation: Pop-in (ease-out over first 15%) and pop-out (over last 15%)
Duration: 1.8–2.5s per image
Trigger on: When the speaker mentions the tool/concept the image represents
Each image triggers only once
Convert SVGs to PNG first if needed (use cairosvg)
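The pop animation and size caps reduce to two small functions. This is a sketch under assumptions: the quadratic ease-out curve and the mirrored pop-out are one plausible reading, and pop_scale / fit_within are hypothetical names.

```python
def pop_scale(t, start, duration=2.0, edge=0.15):
    """Scale multiplier in [0, 1]: grows over the first 15%, shrinks over the last 15%."""
    if duration <= 0:
        return 0.0
    p = (t - start) / duration
    if p < 0.0 or p > 1.0:
        return 0.0  # image is not on screen
    if p < edge:
        q = p / edge              # pop-in progress
    elif p > 1.0 - edge:
        q = (1.0 - p) / edge      # pop-out progress, mirrored
    else:
        return 1.0
    return 1.0 - (1.0 - q) ** 2   # quadratic ease-out (assumed curve)


def fit_within(width, height, max_side):
    """Shrink (never enlarge) so the longest side is at most max_side pixels."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, fit_within(w, h, 200) would size a logo and fit_within(w, h, 500) a meme or screenshot, with pop_scale applied on top for each frame.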

Step 6: Video Processing

Crop (Object-Cover, Never Stretch)

Target: 1080x1920 (9:16)
If --crop-top N is specified, remove N% from the top before fitting
Always crop to fit the target ratio (like CSS object-fit: cover), never scale-to-fit (which would stretch/distort)
Center the crop horizontally; for vertical, bias toward bottom-center (keep the speaker's face)
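The object-cover rule amounts to picking the largest 9:16 rectangle inside the source. A minimal sketch follows; the vertical_bias parameter (0.5 centers, 1.0 anchors at the bottom) and its default are assumptions, since the source does not quantify the bias.

```python
def cover_crop(src_w, src_h, target_w=1080, target_h=1920,
               crop_top_pct=0.0, vertical_bias=0.6):
    """Return (crop_w, crop_h, x, y) for an object-cover crop to the target ratio."""
    top_cut = int(src_h * crop_top_pct / 100.0)   # rows removed by --crop-top
    avail_h = src_h - top_cut
    if src_w * target_h > avail_h * target_w:
        # Source is too wide: keep the full height and trim the sides.
        crop_h = avail_h
        crop_w = avail_h * target_w // target_h
    else:
        # Source is too tall (or already 9:16): keep the full width.
        crop_w = src_w
        crop_h = src_w * target_h // target_w
    crop_w -= crop_w % 2   # even dimensions are safer for yuv420p encoders
    crop_h -= crop_h % 2
    x = (src_w - crop_w) // 2                          # centered horizontally
    y = top_cut + int((avail_h - crop_h) * vertical_bias)
    return crop_w, crop_h, x, y


print(cover_crop(1920, 1080))
```

The tuple is ordered to drop straight into an ffmpeg crop=w:h:x:y filter, followed by scale=1080:1920.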

Processing Pipeline (Python + ffmpeg + Pillow)

Pass 1: Trim + Crop + Scale (ffmpeg)

Build a complex filter: trim each segment, concat, crop to 9:16, scale to 1080x1920
Concat uses interleaved stream ordering: [v0][a0][v1][a1]...concat=n=N:v=1:a=1
Output: temp_trimmed.mp4 (libx264, crf 18, aac 192k, 30fps)
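A sketch of how the Pass 1 filter graph might be assembled as a string. The function names and stream labels (vcat, vout, aout) are illustrative; the ffmpeg filters and options used (trim, atrim, setpts, asetpts, concat, crop, scale) are standard.

```python
def build_trim_filter(keep_segments, crop, out_w=1080, out_h=1920):
    """Build a -filter_complex string: trim each segment, concat, crop, scale."""
    crop_w, crop_h, x, y = crop
    parts, labels = [], []
    for i, (start, end) in enumerate(keep_segments):
        parts.append(f"[0:v]trim=start={start:.3f}:end={end:.3f},"
                     f"setpts=PTS-STARTPTS[v{i}]")
        parts.append(f"[0:a]atrim=start={start:.3f}:end={end:.3f},"
                     f"asetpts=PTS-STARTPTS[a{i}]")
        labels.append(f"[v{i}][a{i}]")  # interleaved video/audio ordering
    n = len(keep_segments)
    parts.append(f"{''.join(labels)}concat=n={n}:v=1:a=1[vcat][aout]")
    parts.append(f"[vcat]crop={crop_w}:{crop_h}:{x}:{y},"
                 f"scale={out_w}:{out_h}[vout]")
    return ";".join(parts)


def build_pass1_command(input_path, filter_complex, output_path="temp_trimmed.mp4"):
    """Argument list for subprocess.run; encoder settings follow the text above."""
    return ["ffmpeg", "-y", "-i", input_path,
            "-filter_complex", filter_complex,
            "-map", "[vout]", "-map", "[aout]",
            "-c:v", "libx264", "-crf", "18", "-r", "30",
            "-c:a", "aac", "-b:a", "192k", output_path]


fc = build_trim_filter([(0.75, 2.0), (3.75, 4.5)], (606, 1080, 657, 0))
print(fc)
```

Passing the command as a list to subprocess.run avoids shell quoting problems with the brackets and semicolons in the filter string.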

Pass 2: Subtitles + Zoom + Images (Pillow frame-by-frame)

Decode trimmed video to raw RGBA frames via ffmpeg pipe
For each frame:
1. Apply zoom effect if active (center-crop + resize)
2. Composite image overlay if active (with pop animation)
3. Composite subtitle overlay
Encode back to mp4 via ffmpeg pipe
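The pipe plumbing for this pass can be sketched without the drawing code. Everything below is an assumption about how such a loop might be wired: the helper names are hypothetical, and copying the audio from the trimmed file during encoding is one possible choice.

```python
import io


def frame_size_bytes(width, height):
    return width * height * 4  # RGBA: 4 bytes per pixel


def decode_cmd(path):
    """ffmpeg arguments that write raw RGBA frames to stdout."""
    return ["ffmpeg", "-i", path, "-f", "rawvideo", "-pix_fmt", "rgba", "-"]


def encode_cmd(width, height, fps, audio_source, out_path):
    """ffmpeg arguments that read raw RGBA frames from stdin and reuse the audio."""
    return ["ffmpeg", "-y",
            "-f", "rawvideo", "-pix_fmt", "rgba",
            "-s", f"{width}x{height}", "-r", str(fps), "-i", "-",
            "-i", audio_source, "-map", "0:v", "-map", "1:a",
            "-c:v", "libx264", "-crf", "18", "-pix_fmt", "yuv420p",
            "-c:a", "copy", out_path]


def read_frames(stream, frame_bytes):
    """Yield complete frames from a binary stream; a short trailing read ends it."""
    while True:
        chunk = stream.read(frame_bytes)
        if len(chunk) < frame_bytes:
            break
        yield chunk


def zoom_box(width, height, factor):
    """Center-crop box (left, top, right, bottom) to resize back to full frame."""
    crop_w, crop_h = width / factor, height / factor
    left, top = (width - crop_w) / 2, (height - crop_h) / 2
    return round(left), round(top), round(left + crop_w), round(top + crop_h)


# Stand-in for a decoder pipe: three 2x2 frames plus a few stray bytes.
fake_pipe = io.BytesIO(b"\x00" * (frame_size_bytes(2, 2) * 3 + 5))
print(sum(1 for _ in read_frames(fake_pipe, frame_size_bytes(2, 2))))
```

With Pillow, each chunk could become an image via Image.frombytes("RGBA", (width, height), chunk), be cropped with the zoom_box result, resized back to (width, height), composited, and written to the encoder's stdin with tobytes().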

Pass 3: Mix SFX (ffmpeg)

Overlay all SFX using adelay + amix filter
Use normalize=0 to prevent volume pumping
Copy video stream, re-encode audio only
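A sketch of the adelay + amix graph for this pass, assuming a plan of {"file", "at", "volume"} entries (a hypothetical structure) and at least one effect. The duration=first option, which keeps the voice track's length, is an added assumption; normalize=0 comes from the text above.

```python
def build_sfx_filter(plan):
    """Build a -filter_complex string mixing delayed SFX under the voice track.

    Input 0 is the video with its voice audio; inputs 1..N are the SFX files
    in plan order. Expects a non-empty plan.
    """
    parts, labels = [], ["[0:a]"]
    for i, item in enumerate(plan, start=1):
        ms = int(round(item["at"] * 1000))  # adelay takes milliseconds
        # One delay per channel (left|right), then attenuate the effect.
        parts.append(f"[{i}:a]adelay={ms}|{ms},volume={item['volume']:.2f}[s{i}]")
        labels.append(f"[s{i}]")
    parts.append(f"{''.join(labels)}amix=inputs={len(labels)}"
                 f":duration=first:normalize=0[aout]")
    return ";".join(parts)


def build_pass3_command(video_path, plan, sfx_dir, out_path):
    """Argument list: copy the video stream, re-encode only the mixed audio."""
    cmd = ["ffmpeg", "-y", "-i", video_path]
    for item in plan:
        cmd += ["-i", f"{sfx_dir}/{item['file']}"]
    cmd += ["-filter_complex", build_sfx_filter(plan),
            "-map", "0:v", "-map", "[aout]",
            "-c:v", "copy", "-c:a", "aac", "-b:a", "192k", out_path]
    return cmd


plan = [{"file": "trimmed_click.mp3", "at": 1.5, "volume": 0.18}]
print(build_sfx_filter(plan))
```

The normalize option of amix requires a reasonably recent ffmpeg build; on older versions the filter will reject it.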

Output

Save as final_<name>.mp4 in the same directory as the input
Print summary: original duration → final duration, number of effects applied
Clean up temp files

Important Rules

Never stretch video — always crop to fit (object-cover behavior)
Proofread before burning subtitles — Whisper WILL get tool names wrong
Ask the user if unsure about a word, especially brand/tool names
Sentence case only — never ALL CAPS subtitles
No background pill behind subtitles — outline only
Unique SFX — never use the same sound file twice in one video
Subtle zooms — 1.08-1.10x max, 5 per video max
Tight cuts — trim silence aggressively, the reel should feel fast-paced
Cache transcript — if transcript.json exists, reuse it (skip re-transcription)
Keep the last take — when the speaker repeats, always keep the final version

Safe Usage Recommendations

This skill's editing steps are coherent for a local video editor, but there are notable inconsistencies you should resolve before using it:
1) The skill metadata claims no required binaries or bundled files, yet the instructions require ffmpeg/ffprobe, Python packages (openai-whisper, Pillow, cairosvg), and bundled SFX/Manrope font. Expect to manually install ffmpeg and pip packages.
2) The registry copy does not actually include the audios/ folder the README references; confirm whether the publisher intended to bundle those SFX or whether you must supply them.
3) The skill will read files from the video directory and search system font folders (or download fonts). If you have sensitive files in those locations, run the tool in a safe/sandboxed environment.
4) Verify the source (there's no homepage) and prefer installing from a trusted repository (e.g., the author's GitHub).
If you proceed, run the pip/ffmpeg commands yourself (inspect before running), and preview the transcript edits before the skill auto-applies them.

Functional Analysis

Type: OpenClaw Skill
Name: edit-greek-reel
Version: 1.0.0
The skill is a legitimate video editing tool designed to automate the creation of social media reels with Greek subtitles. It utilizes standard industry tools such as ffmpeg for video processing, OpenAI Whisper for transcription, and Pillow for frame-by-frame image manipulation. The instructions in skill.md are highly specific to the task, including logic for silence removal, 'last-take' retention, and subtitle styling. No indicators of data exfiltration, malicious execution, or prompt injection were found; the requested permissions (file system, shell execution) are strictly necessary for the stated purpose of video editing.

Capability Assessment

⚠ Purpose & Capability

The skill's stated purpose (edit a talking-head video with karaoke subtitles) matches the runtime instructions (ffmpeg, Whisper, Pillow, cairosvg). However, the declared metadata lists no required binaries, environment variables, or configs, while the SKILL.md and README clearly require ffmpeg, ffprobe, Python 3.10+, and Python packages (openai-whisper, Pillow, cairosvg), plus the Manrope font and bundled audios. This mismatch between declared requirements and actual instructions is an inconsistency.

ℹ Instruction Scope

Instructions ask the agent to run ffprobe/ffmpeg, run Whisper locally, read the video's parent directory and skill base directory for audios/images, and search system/user font directories (or download Manrope). Accessing local files and fonts is necessary for the stated task, but the skill also tells the user to auto-download font assets if missing — users should verify download sources. The SKILL.md assumes bundled audios/ files exist, but the provided package lacks them, which alters runtime behavior.

ℹ Install Mechanism

There is no declared install spec (instruction-only), but the README and SKILL.md instruct pip installs and expect ffmpeg to be present. Relying on user-run pip/ffmpeg is reasonable for this kind of tool, but the lack of an explicit install spec in the registry combined with instructions that fetch packages is a mild risk: the user will need to run external installs (pip) and potentially download fonts/assets.

ℹ Credentials

The skill requests no environment variables or credentials (good). It does, however, require filesystem access to the input video, the video's parent directory (images/, audios/), and system font directories — this is proportionate to the task but worth noting because the skill may read arbitrary local files during processing. No external endpoints or secrets are requested.

✓ Persistence & Privilege

The skill does not request always:true and does not ask to modify other skills or agent-wide config. Autonomous invocation is allowed (platform default) but there are no elevated persistence or privilege claims in the package.

Version History

v1.0.0

Initial release — AI-powered short-form video editor with Greek karaoke subtitles

Metadata

Slug edit-greek-reel

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is Greek Reel Video Editor?

Edit a raw talking-head video into a polished short-form reel with Greek karaoke subtitles. Trims silence, adds Manrope Bold subtitles, zoom effects, SFX, an... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 90 cumulative downloads to date.

How do I install Greek Reel Video Editor?

Run the command "/install edit-greek-reel" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Greek Reel Video Editor free?

Yes. Greek Reel Video Editor is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Greek Reel Video Editor support?

Greek Reel Video Editor is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Greek Reel Video Editor?

It is developed and maintained by Artemis Leonardou (@artemisln); the current version is v1.0.0.
