jianshuo

Wjs Syncing Multicam

by Jian Shuo Wang · GitHub ↗ · v0.1.0 · MIT-0
cross-platform ✓ Security Clean
46 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install wjs-syncing-multicam
Description
Use when the user has 2+ video / audio recordings of the same event captured by different devices (cameras, phones, separate audio recorders) and wants them...
README (SKILL.md)

wjs-syncing-multicam

Compute a single time offset for each multi-source recording of the same event using audio cross-correlation, and emit a .sync.json sidecar next to each original. Originals are never modified, copied, or re-encoded. Downstream tools use -itsoffset to apply the offset at consume time.

Design principle — sidecar over re-encode

Earlier versions of this skill produced *_synced.MOV files by trimming + re-encoding to bake the offset into the file. We removed that:

  • Disk: a 75-min 4K shoot from 3 cameras is 60+ GB. Re-encoded synced copies double that for no information gain.
  • Quality: every re-encode is lossy. The originals are the source of truth; sidecars are reversible metadata.
  • Speed: *_synced.MOV generation took 10+ min per file on Apple Silicon; sidecar emission takes seconds.
  • Composability: any downstream tool (autoedit.py, NLE import, ffmpeg one-liners) reads the sidecar and applies the offset itself. No tool-specific file format lock-in.

When NOT to use

  • Single-camera footage — nothing to sync to. For splitting one source into clips, use video-segmentation.
  • Sources already aligned in an NLE timeline — don't fight the editor.
  • For the auto-edit / cut / PiP rendering step that comes AFTER sync, use wjs-editing-multicam (consumes these sidecars).

Why envelope-based, not raw waveform

Raw PCM cross-correlation gives weak peaks and false matches when the two mics have different gain / room response — i.e., almost always with a secondary cam. The log-energy envelope captures dialogue and music dynamics, which both mics hear regardless of frequency response. Don't skip the envelope step — it's the entire reason this skill is robust at low SNR.

Algorithm

  1. Extract mono PCM at 8 kHz, 16-bit from each input.
  2. Log-energy envelope at 100 Hz (10 ms hop, 50 ms window). High-pass with a 2nd-order Butterworth, 0.05 Hz cutoff, filtfilt — removes slow drift and gain offsets.
  3. FFT cross-correlate envelopes end-to-end → coarse offset (~10 ms).
  4. Refine at sample level with a 60 s probe from B near the coarse-aligned position in A, ±2 s search window, parabolic peak interpolation.
  5. Multi-probe drift check — repeat step 4 every ~3 min. Linear fit delta(t) = slope·t + intercept reveals real clock drift (5–50 ppm typical). Use the midpoint-canonical offset (slope · midpoint + intercept) so residual error is symmetric around zero.
  6. Compute overlap window in the reference timeline: overlap = [max(0, delta), min(ref_dur, delta + src_dur)].
  7. Emit a .sync.json sidecar next to every input. No file is copied, trimmed, or re-encoded. The reference input gets a sidecar too (with delta_seconds: 0) so downstream code can treat all inputs uniformly.
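Steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the code in scripts/sync.py: it assumes the 8 kHz mono PCM from step 1 is already loaded as a NumPy array, and the function and constant names (log_energy_envelope, coarse_offset_seconds, PCM_RATE, ENV_RATE) are hypothetical.

```python
import numpy as np
from scipy.signal import butter, filtfilt

PCM_RATE = 8000               # Hz, mono PCM rate from step 1
ENV_RATE = 100                # Hz, envelope rate from step 2
HOP = PCM_RATE // ENV_RATE    # 80 samples = 10 ms
WINDOW = 5 * HOP              # 400 samples = 50 ms


def log_energy_envelope(pcm: np.ndarray) -> np.ndarray:
    """Log-energy envelope at 100 Hz, high-passed at 0.05 Hz (hypothetical helper)."""
    x = pcm.astype(np.float64)
    n_hops = len(x) // HOP
    # Energy of each 10 ms hop.
    hop_energy = np.square(x[: n_hops * HOP]).reshape(n_hops, HOP).sum(axis=1)
    # A 50 ms window is five consecutive 10 ms hops.
    energy = np.convolve(hop_energy, np.ones(WINDOW // HOP), mode="valid")
    env = np.log(energy + 1e-9)
    # 2nd-order Butterworth high-pass, applied forward and backward (zero phase).
    b, a = butter(2, 0.05, btype="highpass", fs=ENV_RATE)
    return filtfilt(b, a, env)


def coarse_offset_seconds(env_ref: np.ndarray, env_src: np.ndarray) -> float:
    """Position of the source's t=0 in the reference timeline, to about 10 ms."""
    n = len(env_ref) + len(env_src) - 1
    nfft = 1 << (n - 1).bit_length()          # next power of two, avoids wrap-around
    spec = np.fft.rfft(env_ref, nfft) * np.conj(np.fft.rfft(env_src, nfft))
    corr = np.fft.irfft(spec, nfft)           # corr[k] = sum_n ref[n + k] * src[n]
    lag = int(np.argmax(corr))
    if lag >= len(env_ref):                   # upper indices hold negative lags
        lag -= nfft
    return lag / ENV_RATE
```

A positive result means the source starts after the reference, matching the sign convention of delta_seconds. The constant gain difference between two mics becomes a constant offset in the log domain, which the high-pass removes before correlation.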

scripts/sync.py is the implementation. Note: the current script still emits _synced.MOV files alongside the sidecar — that path is deprecated; the sidecar is the only authoritative output.
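The sub-sample refinement in step 4 and the drift fit in step 5 reduce to two small formulas. The sketch below shows them in plain Python; the helper names are hypothetical, and it assumes the "midpoint" is the middle of the time span covered by the probes, which the text does not spell out.

```python
def parabolic_peak(corr, k):
    """Refine an integer peak index k using its two neighbours.

    Fits a parabola through (k-1, k, k+1) and returns the vertex position.
    """
    y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:          # flat neighbourhood, nothing to refine
        return float(k)
    return k + 0.5 * (y0 - y2) / denom


def fit_drift(probe_times, probe_deltas):
    """Least-squares line delta(t) = slope * t + intercept over all probes."""
    n = len(probe_times)
    mean_t = sum(probe_times) / n
    mean_d = sum(probe_deltas) / n
    var_t = sum((t - mean_t) ** 2 for t in probe_times)
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(probe_times, probe_deltas))
    slope = cov / var_t
    return slope, mean_d - slope * mean_t


def midpoint_offset(slope, intercept, t_start, t_end):
    """Offset evaluated at the midpoint, so residual error is symmetric."""
    return slope * 0.5 * (t_start + t_end) + intercept
```

With a slope of 1.8e-5 over a 4500 s span, the offset at the two ends differs from the midpoint value by about ±40 ms, which is why the midpoint is the least-bad single number for consumers that ignore drift.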

Sidecar schema (<input>.sync.json)

One sidecar per original input, written next to it. Pure JSON, no comments in-file — the field reference below is canonical.

{
  "_about": "Sync metadata for cam_b.MOV. Apply via ffmpeg -itsoffset. See wjs-syncing-multicam SKILL.md for full schema.",
  "schema_version": 1,
  "source": "cam_b.MOV",
  "reference": "cam_a.MOV",
  "delta_seconds": 12.345,
  "drift_slope": 1.8e-5,
  "overlap_in_reference": [12.345, 4512.180],
  "overlap_in_source":    [0.000,   4499.835],
  "verification": {
    "median_residual_ms": 4.2,
    "residual_spread_ms": 11.8,
    "probe_count": 24
  }
}

Field reference

| Field | Type | Meaning |
| --- | --- | --- |
| _about | string | Human-readable one-liner. Includes pointer back to this SKILL.md. Always present. |
| schema_version | int | Bumps on any breaking change to this schema. Current: 1. |
| source | string | Filename of the original this sidecar describes. Relative to the sidecar's directory. Never points to a re-encoded file. |
| reference | string | The input whose timeline we're aligned to. Reference's own sidecar lists itself here. |
| delta_seconds | float | The source's t=0 expressed in the reference's timeline. If positive, source starts after reference; pass to ffmpeg as -itsoffset <delta>. Can be negative (source starts before reference, e.g. early-rolling camera). |
| drift_slope | float | Linear clock-drift slope (dimensionless, ~10⁻⁵). 0.0 means no measurable drift. Downstream applies atempo = 1 + drift_slope to the source ONLY for sync-sound / long-form lip-sync; for camera-cut editing, ignore it. |
| overlap_in_reference | [start, end] (seconds) | The window during which both source and reference have coverage, expressed in the reference's timeline. Use this to trim outputs to mutually-valid time ranges. |
| overlap_in_source | [start, end] (seconds) | Same window expressed in the source's local timeline. overlap_in_reference[0] - delta_seconds = overlap_in_source[0]. |
| verification | object | Output of running verify.py; drives a "did sync converge?" gate. median_residual_ms should be a few ms; residual_spread_ms > 1 frame at delivery fps means drift correction was needed but skipped. |
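The two overlap fields follow directly from delta_seconds and the two durations. A minimal sketch, using the formula from step 6 of the algorithm; the function name overlap_windows is hypothetical and the durations are assumed to come from ffprobe or similar:

```python
def overlap_windows(delta, ref_dur, src_dur):
    """Return (overlap_in_reference, overlap_in_source), or None if disjoint.

    delta is the source's t=0 in the reference timeline (may be negative).
    """
    start_ref = max(0.0, delta)
    end_ref = min(ref_dur, delta + src_dur)
    if end_ref <= start_ref:
        return None                       # the recordings never overlap
    in_reference = [start_ref, end_ref]
    # Subtracting delta converts reference time to source-local time.
    in_source = [start_ref - delta, end_ref - delta]
    return in_reference, in_source
```

For a source that starts 3 s before the reference (delta = -3.0), the window in the reference starts at 0.0 and the window in the source starts at 3.0.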

How downstream consumes the sidecar

-itsoffset is a per-input option in ffmpeg: it applies to the next -i that follows it, so it must come BEFORE the -i of the input it shifts. Always read the source's delta_seconds from the sidecar:

# Play cam_b aligned to cam_a's timeline
ffmpeg -itsoffset "$(jq -r .delta_seconds cam_b.MOV.sync.json)" -i cam_b.MOV \
       -i cam_a.MOV \
       -filter_complex "[0:v][1:v]hstack" out.mp4

# Trim to mutual overlap window (read from cam_b.MOV.sync.json)
START=$(jq -r '.overlap_in_source[0]' cam_b.MOV.sync.json)
DUR=$(jq -r '.overlap_in_source[1] - .overlap_in_source[0]' cam_b.MOV.sync.json)
ffmpeg -ss "$START" -i cam_b.MOV -t "$DUR" ...

For wjs-editing-multicam, the EDL builder in autoedit.py ingests every <input>.sync.json automatically; you don't compose these flags by hand.
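For consumers written in Python, the same flags can be assembled from parsed sidecar dictionaries. This is a sketch only: ffmpeg_input_args and side_by_side_command are hypothetical helpers, not part of autoedit.py, and the command list is built but not executed here.

```python
from pathlib import Path


def ffmpeg_input_args(meta: dict, sidecar_dir: Path) -> list[str]:
    """Input arguments for one source; -itsoffset precedes its -i."""
    source = Path(sidecar_dir) / meta["source"]
    # Fixed-point formatting avoids exponent notation for very small offsets.
    offset = f"{float(meta['delta_seconds']):.6f}"
    return ["-itsoffset", offset, "-i", str(source)]


def side_by_side_command(meta_src: dict, meta_ref: dict,
                         sidecar_dir: Path, out_path: str) -> list[str]:
    """Full argv for a horizontally stacked preview of source and reference."""
    return [
        "ffmpeg",
        *ffmpeg_input_args(meta_src, sidecar_dir),
        *ffmpeg_input_args(meta_ref, sidecar_dir),
        "-filter_complex", "[0:v][1:v]hstack",
        out_path,
    ]
```

Passing the list to subprocess.run (without shell=True) avoids quoting problems with filenames that contain spaces.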

Partial-coverage clips

Common case — main cams cover 75 min, a Riverside / phone / lavalier recorder only covers the middle 30 min. scripts/sync_partial.py REF.MOV NEW.mp4:

  1. Cross-correlates the new input against the reference.
  2. Finds where the new clip's t=0 sits in the reference timeline (delta_seconds may be large, e.g. 1842.5).
  3. Writes the sidecar — that's it. No black padding, no audio padding, no re-encode. overlap_in_reference tells consumers exactly when this input has coverage; outside that window, fall back to the main cams.

The --audio-only flag now only tells downstream consumers that this source has no video stream; there is no encoding step left to skip.
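A consumer can use overlap_in_reference to decide which inputs are usable at a given moment. The sketch below assumes sidecars have already been parsed into dictionaries; sources_covering is a hypothetical name, and the half-open interval is a design choice of this example, not something the schema specifies.

```python
def sources_covering(t_ref, sidecars):
    """Names of all sources that have coverage at reference time t_ref."""
    covering = []
    for meta in sidecars:
        start, end = meta["overlap_in_reference"]
        if start <= t_ref < end:          # half-open: [start, end)
            covering.append(meta["source"])
    return covering
```

When the partial clip is not in the returned list, the caller falls back to the main cams.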

When to skip drift correction

For camera-cut editing (the common case), ±25 ms residual across an hour is below human perception: consumers can ignore drift_slope (treat it as 0.0) and use only the midpoint delta_seconds.

For sync-sound / lip-sync at long durations (>30 min and verification.residual_spread_ms > 40), downstream applies atempo = 1 + drift_slope to the source. Source files are still not modified — the atempo filter runs at consume time.
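The decision rule above can be written as a small predicate. This is a sketch under stated assumptions: atempo_for is a hypothetical name, the caller says whether the job is sync-sound, and "duration" is taken to be the length of the overlap window, which the text does not define precisely.

```python
def atempo_for(meta, sync_sound):
    """Return the atempo factor to apply at consume time, or None to skip drift."""
    start, end = meta["overlap_in_reference"]
    duration_s = end - start
    spread_ms = meta.get("verification", {}).get("residual_spread_ms", 0.0)
    if sync_sound and duration_s > 30 * 60 and spread_ms > 40.0:
        return 1.0 + meta.get("drift_slope", 0.0)
    return None                           # camera-cut editing: midpoint delta only
```

Returning None rather than 1.0 lets the caller omit the filter entirely instead of inserting a no-op.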

Verification (always run)

scripts/verify.py REF.MOV SRC.MOV SRC.sync.json re-extracts audio from BOTH originals (with -itsoffset applied to the source per the sidecar) and runs multi-probe correlation again. Writes results back into the sidecar's verification field.

Pass criteria: median_residual_ms < 15 and residual_spread_ms < 1 frame at delivery fps (for example, 40 ms at 25 fps). On failure, retry with drift correction enabled.
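As a gate in a pipeline, the pass criteria amount to two comparisons. A minimal sketch, assuming the verification object has the shape shown in the schema; sync_passed is a hypothetical helper, not a function exported by verify.py.

```python
def sync_passed(verification, delivery_fps):
    """Apply the two pass criteria to a sidecar's verification object."""
    frame_ms = 1000.0 / delivery_fps      # one frame at delivery fps, in ms
    return (verification["median_residual_ms"] < 15.0
            and verification["residual_spread_ms"] < frame_ms)
```

The same sidecar can pass at one delivery frame rate and fail at a higher one, because the spread threshold shrinks as the frame duration does.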

Common pitfalls

  • Raw waveform cross-correlation gives false peaks under low SNR. Always envelope first — this is not a tunable, it's the entire premise.
  • -itsoffset is an input option and only affects the -i that follows it, so for sync-correctness it must come before that input's -i. ffmpeg -i src -itsoffset X is wrong; ffmpeg -itsoffset X -i src is right.
  • Sidecar paths must be relative to the sidecar file's directory, not the working directory of the consuming process. Resolve source / reference against Path(sidecar).parent.
  • Don't bake drift_slope into the sidecar's delta_seconds. They're separate fields for a reason — naive consumers can ignore drift, sync-sound consumers can apply it. Mixing them loses information.
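The path-resolution pitfall is easy to avoid with pathlib. The sketch below uses hypothetical helper names and assumes the `<input>.sync.json` naming shown in the ffmpeg examples (cam_b.MOV gives cam_b.MOV.sync.json); it does not touch the filesystem.

```python
from pathlib import Path


def sidecar_path_for(media_path):
    """Sidecar sits next to the original, with .sync.json appended to its name."""
    media = Path(media_path)
    return media.with_name(media.name + ".sync.json")


def resolve_media_paths(sidecar_path, meta):
    """Resolve source and reference against the sidecar's own directory."""
    base = Path(sidecar_path).parent      # not the process working directory
    return base / meta["source"], base / meta["reference"]
```

Resolving against the sidecar's parent keeps a shoot folder relocatable: moving the whole directory does not invalidate any sidecar.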
Usage Guidance
Before installing, understand that this skill runs local ffmpeg/ffprobe processing on your selected media and writes sidecar files next to the originals. Verify the required tools and Python packages yourself, and check whether sync.py may still create any `_synced.MOV` outputs despite the sidecar-only design goal.
Capability Analysis
Type: OpenClaw Skill · Name: wjs-syncing-multicam · Version: 0.1.0
The skill is a legitimate utility for aligning multi-camera audio/video recordings using audio cross-correlation. It utilizes standard Python scientific libraries (numpy, scipy) and system tools (ffmpeg, ffprobe) to calculate time offsets and clock drift, storing the results in .sync.json sidecar files. The implementation in sync.py, sync_partial.py, and verify.py follows safe coding practices for subprocess execution and lacks any indicators of data exfiltration, persistence, or malicious prompt injection.
Capability Assessment
Purpose & Capability
The scripts and documentation mostly align with the stated purpose of extracting audio, computing sync offsets, and writing .sync.json sidecars, but SKILL.md also says the current sync.py still emits _synced.MOV files, which conflicts with the prominent sidecar-only claim.
Instruction Scope
The skill is user-invocable and the scripts operate on explicit media file paths; no artifact instructs the agent to override user intent, run persistently, or contact unrelated services.
Install Mechanism
The registry declares no install spec and no required binaries, while the included scripts depend on local ffmpeg/ffprobe plus Python packages such as numpy and scipy.
Credentials
Local command execution and media-file reading are proportionate to multicam sync, but users should expect local processing of their audio/video files and possible large output if the deprecated _synced.MOV path is still active.
Persistence & Privilege
No credentials, elevated privileges, background services, or self-persistence are shown; persistent effects are limited to expected sidecar JSON files and verification updates.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install wjs-syncing-multicam
  3. After installation, invoke the skill by name or use /wjs-syncing-multicam
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Initial release
Metadata
Slug wjs-syncing-multicam
Version 0.1.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Wjs Syncing Multicam?

Use when the user has 2+ video / audio recordings of the same event captured by different devices (cameras, phones, separate audio recorders) and wants them... It is an AI Agent Skill for Claude Code / OpenClaw, with 46 downloads so far.

How do I install Wjs Syncing Multicam?

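Run "/install wjs-syncing-multicam" in the OpenClaw or Claude Code chat to install it in one step. The registry declares no install spec, so make sure ffmpeg/ffprobe and the Python packages numpy and scipy are available locally before running the scripts.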

Is Wjs Syncing Multicam free?

Yes, Wjs Syncing Multicam is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Wjs Syncing Multicam support?

Wjs Syncing Multicam is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Wjs Syncing Multicam?

It is built and maintained by Jian Shuo Wang (@jianshuo); the current version is v0.1.0.
