Description

Agentic AI video editor. One free-form `autonomous_edit` tool decomposes natural-language intents into a planned sequence of canonical edits; a deterministic...

README (SKILL.md)

OpenClaw AI Video Editor

Name: Adscene Video Editor
Author: brajendrak00068

A full agentic video editing surface. Send a prompt, get a planned, verified, executed edit — or call a known canonical action directly with structured params. Both modes flow through the same brain + safety gates as the in-product editor.

Endpoint

POST {ADSCENE_API_URL}/api/v1/misc/openclaw/v1/execute

Auth: Authorization: Bearer {ADSCENE_API_KEY}

Accepts either single-shot JSON (default) or SSE (Accept: text/event-stream or ?stream=true).

Request body:

{
  "tool": "autonomous_edit" | "\x3Callowlisted-tool>",
  "params": { ... },
  "project_id": "optional-project-id",
  "scene": { /* optional client scene; server-side committed scene wins if newer */ }
}

Two Modes

1. `autonomous_edit` — free-form prompt (primary)

Pass a natural-language description in params.prompt. The brain classifies the intent, decomposes into atomic steps, plans a DAG of canonical actions, executes through the safety gates, and verifies the result. Use this when the caller doesn't already know which low-level action(s) are needed.

curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": {
      "prompt": "Make this a TikTok-ready viral clip: vertical reframe, add bold captions, remove silences, and apply motion tracking to the speaker."
    },
    "project_id": "my-project"
  }'

Behind a single autonomous_edit call the agent can compose any of:

Read / inspect

Inspect the timeline structure, layer properties, and current scene state
Run computer-vision analysis on frames (object/face/scene detection)
Query the transcript with keyword, semantic, or timestamp-window search
Pull video intelligence (narrative peaks, speaker diarization, sentiment, pacing)
Search the asset gallery (videos, images, audio) by type, duration, name
Poll async job status; introspect property schemas

Structural editing

Insert / update / replace / delete layers (video, audio, text, image, shape, group, adjustment)
Trim, split, retime layers (slow-mo 0.5×, fast-forward 2×, freeze-frame)
Reposition on the timeline, sequence layers, snap to transcript
Heal timeline gaps, normalize audio, reconcile durations (pre-export safety pass)
Multi-step undo / redo

Visual editing

Color grading (brightness, contrast, saturation, hue, lift/gamma/gain, RGB curves)
Procedural VFX shaders: smoke, dust, fire, explosion, lightning, snow, glitch, scanlines, grain, glassmorphism, bokeh, lava, telomere/corrosion, portal
Chroma key (green / blue screen) with similarity, smoothness, spill controls; luma / alpha / depth masks
Geometric clip shapes (circle, dome, star, hexagon, …)
Crop (absolute or edge-based), 3D rotation + perspective
Glow, shadow, inner shadow, gradient fills, text gradients
Vertical reframe (9:16) and vertical-reframe montage
Split screen (top/bottom, left/right, picture-in-picture)
Branding overlays (logo / watermark) from gallery or AI-generated
Motion / face tracking with dynamic zoom-follow

Captions & text

Auto-generate captions from transcript
Style captions with built-in templates or an AI director that picks/generates a custom template at runtime
Curved text paths (circle, wave, custom SVG)
Per-word entrance / exit animations (typewriter, slide, fade, scale, rotate, bounce, flip, swing, elastic, blur, glitch, wave, plus matching exits)
Lottie animation playback control

Audio

Clean audio: remove silences, breaths, filler words; word-level mute or cut
Auto-ducking on speech detection (sidechain music vs voice)
Mix / normalize / denoise / EQ (bass boost, vocal clarity, warm, bright)
Sync external master audio to video (offset, mute camera audio)
Beat / kick-drum-synced cuts (provide beat_times or bpm)
Add SFX, generate music (mood / genre / BPM), generate voiceover (TTS or cloned voice)
Render waveform visualizers (bars, wave, circular)

Async generation

AI video / B-roll — duration + aspect ratio
AI images — single or batch at timestamps
AI music — prompt + duration + mood + genre + BPM
AI voiceover — TTS or cloned voice library
Auto-thumbnail extraction
Face blur (all faces or background-only)
Image edit (generative instruction-based)

High-level kits (each is a single canonical action that orchestrates many underlying edits)

APPLY_VIRAL_KIT — vertical reframe + captions + silence removal + motion tracking + emphasis
APPLY_CINEMATIC_DIRECTOR — energy analysis + dynamic zooms + cinematic color grade + mood-based camera moves
APPLY_EMPHASIS_SYSTEM — keyword detection + scaling / glow / pulse coordinated with captions
OPTIMIZE_PACING — filler-word + silence + low-energy segment removal for retention

Export

EXPORT_VIDEO — render to MP4 (resolution / codec / quality tier)
GENERATE_VIRAL_CLIPS — auto-segment short-form clips packaged as ZIP
GENERATE_MULTI_PLATFORM — TikTok + Reels + Shorts + YouTube + Instagram aspect ratios in one pass

2. Deterministic tool allowlist — structured params, no planning

When the caller already knows the action and has structured params, use one of these tool names. The request is dispatched directly to the canonical action (skips intent decomposition / planning entirely):

Tool	Canonical action	Use when
`read_scene`	`READ_SCENE`	Inspect the current timeline
`read_media`	`QUERY_ASSETS`	List gallery assets
`read_visual`	`READ_VISUAL`	Run CV analysis on frames
`query_transcript`	`QUERY_TRANSCRIPT`	Search transcript by text or timestamp
`scene_update`	`UPDATE_LAYER`	Mutate a known layer's properties
`scene_insert`	`CREATE_LAYER`	Add a video / audio / text / image / shape layer
`scene_timing`	`SCENE_TIMING`	Trim, retime, reposition a layer
`scene_mask`	`APPLY_MASK`	Apply a chroma / luma / alpha / depth mask
`chroma_key`	`APPLY_MASK`	Convenience for green-screen / blue-screen keying
`split_screen`	`SPLIT_SCREEN`	Grid layout: top-bottom, left-right, PIP
`caption_compose`	`GENERATE_CAPTIONS`	Generate captions from transcript
`media_treat`	`COLOR_GRADE`	Apply color correction
`scene_track`	`TRACK_MOTION`	Face / object tracking with zoom-follow
`clean_audio`	`CLEAN_AUDIO`	Silence / breath / filler removal
`audio_mix`, `audio_mixing`	`AUDIO_MIXING`	Ducking, normalize, denoise, EQ
`voiceover_add`	`GENERATE_VOICEOVER`	Generate voiceover (text + voice_id)
`music_generate`	`GENERATE_MUSIC`	Generate background music
`export_video`	`EXPORT_VIDEO`	Render to MP4

Any unknown tool name returns 400 UNKNOWN_TOOL with a hint pointing at autonomous_edit.

Example — direct layer update, no planning round-trip:

curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "scene_update",
    "params": { "layer_id": "layer_123", "opacity": 0.5, "rotation": 15 }
  }'

Auto-export follow-up

After any mutating tool call (i.e., not read_*/query_* and not export_video itself), if the scene was actually changed and the agent did not already queue an EXPORT_VIDEO, the route fires one automatically as a second run. The final response carries videoUrl (when ready) or jobId (for polling). Read-only and conversational autonomous_edit calls do NOT trigger auto-export.

Optional input parameters (parity with the in-product editor)

Pass any of these inside params (or at the top level of the body) to drive advanced features:

prompt — required for autonomous_edit; ignored for deterministic tools (structured params win)
workingMemory — durable working-memory snapshot. Re-send to resume after awaiting_approval
requirePlanApproval — true makes the agent stop after planning and emit awaiting_approval; resume with the same workingMemory + an approval prompt ("yes", "approve", "do it", …)
attachedImages — array of base64 screenshots / reference images
flaggedIssues — array of strings describing specific problems the user wants fixed
captionTemplatePreset, captionTemplateMode — style preset routing for caption generation
core_only (also accepted via ?core_only=true) — return a minimal scene shape (rendering-only, no debug fields)
assets — additional asset descriptors to make available to the agent

Response shapes

JSON (default)

{
  "type": "success" | "partial_success",
  "tool": "\x3Ctool-name>",
  "success": true,
  "status": "completed" | "failed" | "awaiting_approval",
  "scene": { /* updated scene */ },
  "reply": "Human-readable summary of what changed",
  "videoUrl": "https://.../output.mp4",
  "jobId": "task_...",
  "viral_clips": [ /* clip metadata if generated */ ],
  "zip_url": "https://.../clips.zip",
  "activeTasks": [ /* queued background jobs */ ],
  "pendingAsyncJobs": [ /* in-flight job status */ ],
  "workflowStepsDetailed": [ /* every executed step */ ],
  "workflowSummary": { "title": "...", "summary": "..." },
  "verificationPassed": true,
  "verificationIssues": [],
  "committedToProjectScene": true,
  "processingTime": 12.3,
  "message": "Same as reply",
  "workingMemory": { /* return this in the next call to resume approval-paused runs */ }
}

Failure response (HTTP 4xx/5xx):

{ "success": false, "error": "...", "code": "UNKNOWN_TOOL" | "MISSING_TOOL" | "EXECUTION_ERROR" }

SSE (`Accept: text/event-stream` or `?stream=true`)

The same event stream the in-product editor uses. Notable event types:

heartbeat — every 15s, keeps the connection alive
status — phase transitions (request_received, runtime_start, …)
mode_select — { mode: "qa" | "action" }
thinking, tool_call, tool_result — per-step reasoning visibility
background_job_completed — async job finished (B-roll, viral clips, …)
workflow_completed — main brain loop done, verification may continue
success / partial_success — final terminal payload (same shape as JSON above)
error — terminal failure

Async job lifecycle

Generation actions (generate_*, EXPORT_VIDEO) return immediately with a jobId in activeTasks / pendingAsyncJobs. Poll status with:

GET {ADSCENE_API_URL}/api/v1/misc/openclaw/v1/jobs/{jobId}
Authorization: Bearer {ADSCENE_API_KEY}

Response:

{
  "success": true,
  "jobId": "task_xxx",
  "status": "queued" | "processing" | "completed" | "failed",
  "progress": 0.74,
  "message": "Rendering frame 142 of 192",
  "result": { /* artifact URLs / clip metadata on completion */ },
  "error": null,
  "createdAt": "...",
  "updatedAt": "..."
}

To pull async-generated content into the timeline once jobs settle, the agent uses APPLY_PENDING internally — autonomous_edit callers don't need to manage this, but direct callers can issue an autonomous_edit prompt like "apply any pending generated content" to harvest.

Plan approval flow

If you pass requirePlanApproval: true, the agent stops after planning and the response carries status: "awaiting_approval" + a populated workingMemory. To proceed, call again with:

{
  "tool": "autonomous_edit",
  "params": {
    "prompt": "yes",
    "workingMemory": { /* the workingMemory from the previous response */ }
  }
}

Accepted approval phrases: yes, y, approve, approved, go, proceed, go ahead, do it, confirm.

Safety, verification, and limits

Every run flows through three deterministic gates (ActionPermissionGate, ArchitectureControlPlane, EditorSafetyPolicy). Destructive actions (CLEAR, mass deletes) require explicit confirmation params. Verification runs after execution and may trigger up to 2 repair loops; failures surface in verificationPassed: false + verificationIssues[]. Concurrent identical requests for the same (user, project, prompt, scene fingerprint) are deduplicated server-side.

Rate-limited per API key. Processing times vary: read-only ~1–3s, structural edits ~3–10s, async generation 30s–5min per artifact, viral-clip / multi-platform exports several minutes.

Supported formats

Video in: MP4, MOV, WebM (HTTP/HTTPS URLs, YouTube URLs, gallery IDs)
Image in: JPG, PNG, WebP
Audio in: MP3, WAV, M4A, AAC (or extracted from video)
Output: MP4 (export), ZIP (viral clips / multi-platform bundles)
Max video length: depends on plan; soft limit ~30 min for synchronous edits, async generation handles longer
Recommended resolution: 1080p or 4K; canvas is configurable per project

Example: end-to-end viral-clip generation

# 1) Kick off the viral-clip pipeline (auto-export follow-up queues rendering)
curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": {
      "prompt": "Generate 5 viral clips, 15-30 seconds each, focused on the most engaging moments. Add bold captions, vertical reframe, remove silences."
    },
    "project_id": "my-project"
  }' | tee /tmp/result.json | jq -r '.jobId // .activeTasks[0].intent.job_id'

# 2) Poll job status until done
JOB_ID=$(jq -r '.jobId // .activeTasks[0].intent.job_id' /tmp/result.json)
while true; do
  STATUS=$(curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
    -H "Authorization: Bearer $ADSCENE_API_KEY" | jq -r '.status')
  echo "Status: $STATUS"
  [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ] && break
  sleep 5
done

# 3) Fetch the final artifact URL(s)
curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" | jq '.result'

Usage Guidance

Review this skill before installing. Verify that the publisher fixes the README/SKILL mismatch and supplies any referenced code, do not fetch a missing handler.ts from an untrusted location, use a restricted API key, and test broad autonomous edits on non-critical video projects first.

Capability Tags

requires-sensitive-credentials

Capability Assessment

⚠ Purpose & Capability

The SKILL.md describes a broad autonomous video editor, while README.md describes a narrower background-replacement skill and a different API endpoint. That mismatch makes the actual installed capability unclear to a user.

ℹ Instruction Scope

The main autonomous_edit interface can translate a natural-language request into multiple project-mutating edits, including deletion and export. This is aligned with a video editor, but users should understand it is broad.

⚠ Install Mechanism

Registry data says this is instruction-only with no install spec and no code files, but README.md tells users to copy a handler.ts file and install npm dependencies. That creates a provenance and completeness gap.

ℹ Credentials

The ADSCENE_API_URL and ADSCENE_API_KEY requirements are expected for a remote provider API, but the bearer key should be treated as sensitive account/project authority.

ℹ Persistence & Privilege

No local persistence or background worker is shown, but edits appear to affect server-side project state via project_id and scene fields, which is expected for a video editor but should be scoped to intended projects.

Version History

v1.0.1

Endpoint moved to /api/v1/misc/openclaw/* to match ingress prefix routing

v1.0.0

Initial release — agentic AI video editor with autonomous_edit + deterministic tool allowlist + auto-export

Metadata

Slug adscene-video-editor

Version 1.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Adscene Video Editor?

Agentic AI video editor. One free-form `autonomous_edit` tool decomposes natural-language intents into a planned sequence of canonical edits; a deterministic... It is an AI Agent Skill for Claude Code / OpenClaw, with 98 downloads so far.

How do I install Adscene Video Editor?

Run "/install adscene-video-editor" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Adscene Video Editor free?

Yes, Adscene Video Editor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Adscene Video Editor support?

Adscene Video Editor is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Adscene Video Editor?

It is built and maintained by brajendrak00068 (@brajendrak00068); the current version is v1.0.1.

More Skills

Adscene Video Editor