Description

Analyze music/audio files locally without external APIs. Extract tempo, pocket/groove feel, pulse stability, swing proxy, section/repetition structure, key c...

README (SKILL.md)

Music Analysis (Local, No External APIs)

Name: Music Analysis
Author: adam-researchh

Primary tool: a full listen that combines snapshot analysis, structure, groove, harmonic tension, temporal mood mapping, and optional Whisper lyric alignment into one report.

1. Full Listen — primary / recommended

python3 skills/music-analysis/scripts/listen.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/listen.py track.mp3 --json
python3 skills/music-analysis/scripts/listen.py track.mp3 --out report.txt
python3 skills/music-analysis/scripts/listen.py track.mp3 --json --out report.json
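
When the full listen is driven from another Python program, the flag combinations above can be assembled as an argument list instead of a shell string. The sketch below is only an illustration: `build_listen_command` is a hypothetical helper, not part of the skill, and it assumes the script path and the `--json` / `--out` flags exactly as shown in the commands above.

```python
import sys
from typing import List, Optional

# Script path as it appears in the commands above (relative to the workspace root).
SCRIPT = "skills/music-analysis/scripts/listen.py"


def build_listen_command(audio_path: str,
                         as_json: bool = False,
                         out_path: Optional[str] = None) -> List[str]:
    """Build the argv list for a full-listen run (hypothetical helper)."""
    # Use the current interpreter rather than relying on "python3" being on PATH.
    cmd = [sys.executable, SCRIPT, audio_path]
    if as_json:
        cmd.append("--json")
    if out_path is not None:
        cmd += ["--out", out_path]
    return cmd


# Example (not executed here):
#   import subprocess
#   subprocess.run(build_listen_command("track.mp3", as_json=True,
#                                       out_path="report.json"), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.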

What it does in one pass:

Snapshot analysis: tempo, pulse stability, swing proxy, key clarity, harmonic tension, timbre, structure
Whisper lyric transcription and filtering first — keep only real lyric text, drop artifact tags like [MUSIC]
Temporal listen: windowed energy / mood / tension journey
Synthesis layer that aligns lyrics with peak / tension / quiet windows and lets the lyric layer override the final vibe when confidence is high
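
The lyric filtering step described above (keep real lyric text, drop artifact tags like [MUSIC]) could look roughly like the following. This is a minimal sketch and not the code in listen.py: the function name, the regular expressions, and the `min_tokens` default of 2 are all assumptions made for illustration.

```python
import re
from typing import List

# A segment that consists only of one bracketed or parenthesised tag, e.g. "[MUSIC]".
TAG_ONLY = re.compile(r"^\s*[\[(][^\])]*[\])]\s*$")
# A square-bracket tag embedded inside otherwise real text.
INLINE_TAG = re.compile(r"\[[^\]]*\]")


def filter_lyric_segments(segments: List[str], min_tokens: int = 2) -> List[str]:
    """Drop tag-only segments, strip inline tags, and discard very short leftovers."""
    kept = []
    for text in segments:
        if TAG_ONLY.match(text):
            continue  # pure artifact such as "[MUSIC]" or "(instrumental)"
        cleaned = " ".join(INLINE_TAG.sub(" ", text).split())
        # min_tokens is an assumed threshold; tune it for the material at hand.
        if len(cleaned.split()) >= min_tokens:
            kept.append(cleaned)
    return kept
```

For example, `filter_lyric_segments(["[MUSIC]", "hold me closer [APPLAUSE] tonight", "oh"])` returns `["hold me closer tonight"]`.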

Human-readable output structure

SNAPSHOT
- groove/pocket
- structure summary + repeated sections
- harmony (key clarity + tension)
- timbre descriptor tags
INSTRUMENT READ
- likely instrument palette (strong/likely/possible confidence)
- per-section instrument entrances and exits
- how instruments color the emotional feel
- written as natural language, not clinical data
TEMPORAL JOURNEY
- opening / middle / closing mood-energy-tension read
- peak / quietest / tensest moments
- mood journey and transition count
EMOTIONAL READ
- explainable emotion summary based on measured features
LYRICS
- Whisper segment count
- excerpt or graceful skip note
SYNTHESIS
- lyric-energy/tension alignment
- peak / tension / quiet lyric moments
ALIGNED TIMELINE
- per-window moments where transitions / lyrics / tension spikes occur

2. Snapshot Analysis — standalone

python3 skills/music-analysis/scripts/analyze_music.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/analyze_music.py track.mp3 --json

Reports:

tempo / pulse stability / pulse confidence / swing proxy / pocket
key estimate / key clarity / chroma entropy / harmonic change / tonal motion / tension
timbre descriptors (brightness, richness, low-end, contrast, dynamic range)
section labels (A/B/C...) and repeated material detection
explainable emotional read with reasons
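
To give a sense of what a metric like chroma entropy can mean, here is one common formulation: the Shannon entropy of a 12-bin pitch-class profile, normalised to the range 0 to 1. This is a sketch under that assumption and may differ from what analyze_music.py actually computes. The input could be, for instance, the time-averaged rows of a chromagram from `librosa.feature.chroma_cqt`.

```python
import math
from typing import Sequence


def chroma_entropy(chroma: Sequence[float]) -> float:
    """Normalised Shannon entropy of a non-negative 12-bin pitch-class profile."""
    if len(chroma) != 12:
        raise ValueError("expected 12 pitch-class bins")
    total = sum(chroma)
    if total <= 0:
        return 1.0  # assumption: treat silence as maximally ambiguous
    probs = [c / total for c in chroma]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    # Divide by log2(12) so that a perfectly flat profile scores 1.0.
    return entropy / math.log2(12)
```

A profile concentrated on one pitch class scores 0 (very clear tonal center), while a flat profile scores 1 (no pitch class stands out).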

3. Temporal Listen — standalone

python3 skills/music-analysis/scripts/temporal_listen.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/temporal_listen.py track.mp3 --json

Reports:

sliding-window timeline (4s windows, 2s hops)
energy contour
mood labels
harmonic tension + tonal motion
transition types (drop hits, pulls back, tightens harmonically, shifts color, evolves)
narrative arc (mountain / ascending / descending / plateau / wave)
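
The 4s window / 2s hop layout above means consecutive windows overlap by half. The sketch below only computes the window boundaries; `window_bounds` is a hypothetical helper, and dropping a trailing partial window is an assumption that temporal_listen.py may handle differently.

```python
from typing import List, Tuple


def window_bounds(duration_s: float,
                  window_s: float = 4.0,
                  hop_s: float = 2.0) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for each full sliding window."""
    if window_s <= 0 or hop_s <= 0:
        raise ValueError("window_s and hop_s must be positive")
    bounds = []
    index = 0
    # Multiply by the index instead of accumulating to avoid float drift.
    while index * hop_s + window_s <= duration_s:
        start = index * hop_s
        bounds.append((start, start + window_s))
        index += 1
    return bounds
```

A 10-second clip yields four windows starting at 0, 2, 4 and 6 seconds; a clip shorter than one window yields none.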

Interpretation rules

Structure labels are similarity labels, not verse/chorus claims.
Swing proxy is a feel estimate, not drummer-grade microtiming truth.
Emotion is explainable, derived from pulse + timbre + harmonic tension rather than a black-box mood guess.
Lyrics can override the final vibe when filtered Whisper text is confident and emotionally clear.
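
To illustrate the first rule, the sketch below assigns letters purely by similarity: a section gets an existing letter when its feature vector is close enough to the first section that received that letter, otherwise it gets a new one. Nothing here knows about verses or choruses. The cosine measure, the 0.9 threshold, and the function names are assumptions, not the skill's actual method.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_sections(features: List[Sequence[float]],
                   threshold: float = 0.9) -> List[str]:
    """Label sections A, B, C... by similarity to earlier exemplars."""
    exemplars: List[Sequence[float]] = []  # first vector seen for each letter
    labels: List[str] = []
    for vec in features:
        best_idx, best_sim = None, threshold
        for idx, exemplar in enumerate(exemplars):
            sim = cosine(vec, exemplar)
            if sim >= best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is None:  # nothing similar enough: open a new letter
            exemplars.append(vec)
            best_idx = len(exemplars) - 1
        # Sketch only: more than 26 distinct sections would run past "Z".
        labels.append(chr(ord("A") + best_idx))
    return labels
```

With toy two-dimensional features, `label_sections([[1, 0], [0, 1], [0.99, 0.05], [0, 1]])` returns `["A", "B", "A", "B"]`: the third section repeats the material of the first, whatever its musical role.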

Audio sourcing

The tool needs a real audio file on disk.

Direct file (mp3, wav, flac, ogg, m4a — anything ffmpeg/librosa can read)
YouTube / supported URLs: yt-dlp -x --audio-format mp3 -o "output.%(ext)s" "URL" (to search instead of passing a URL, use "ytsearch1:QUERY")

Whisper lyrics transcription

listen.py uses:

CLI: /opt/homebrew/bin/whisper-cli
Model: ~/.local/share/whisper-cpp/ggml-large-v3-turbo.bin
Preprocess: convert input to mono 16kHz WAV via ffmpeg
Fallback: skip gracefully if Whisper is missing or errors
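
The preprocess and fallback steps above can be sketched as follows. This is an assumption-laden illustration rather than the code in listen.py: the helper names are hypothetical, and the ffmpeg flags shown (`-ac 1` for mono, `-ar 16000` for 16 kHz, `-y` to overwrite) are one standard way to get the stated format.

```python
import os
from typing import List

# Paths as listed above; both are specific to a Homebrew / home-directory setup.
WHISPER_CLI = "/opt/homebrew/bin/whisper-cli"
WHISPER_MODEL = os.path.expanduser(
    "~/.local/share/whisper-cpp/ggml-large-v3-turbo.bin")


def ffmpeg_to_wav_command(src: str, dst: str) -> List[str]:
    """Argv list that converts src to a mono 16 kHz WAV at dst."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


def whisper_available(cli: str = WHISPER_CLI, model: str = WHISPER_MODEL) -> bool:
    """True only if the CLI is an executable file and the model file exists."""
    return (os.path.isfile(cli)
            and os.access(cli, os.X_OK)
            and os.path.isfile(model))


# Example flow (not executed here):
#   if whisper_available():
#       subprocess.run(ffmpeg_to_wav_command("track.mp3", "track_16k.wav"), check=True)
#   else:
#       ...  # skip transcription and continue with the audio-only report
```

Checking availability up front keeps the "skip gracefully" path explicit instead of relying on a failed subprocess call.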

Dependencies

Python:

librosa
numpy

System:

ffmpeg
ffprobe
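
A quick preflight check for the dependencies listed above might look like this. `check_dependencies` is a hypothetical helper, not something shipped with the skill; it only reports whether each Python module can be found and whether each binary is on PATH.

```python
import importlib.util
import shutil
from typing import Dict


def check_dependencies() -> Dict[str, bool]:
    """Report which Python modules and system binaries can be located."""
    status: Dict[str, bool] = {}
    for module in ("librosa", "numpy"):
        # find_spec returns None when a top-level module is not installed.
        status[module] = importlib.util.find_spec(module) is not None
    for binary in ("ffmpeg", "ffprobe"):
        # shutil.which returns None when the binary is not on PATH.
        status[binary] = shutil.which(binary) is not None
    return status


# Example:
#   missing = [name for name, ok in check_dependencies().items() if not ok]
#   if missing:
#       print("Missing:", ", ".join(missing))
```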

Workspace hygiene

Keep temporary audio files in a dedicated temp/output folder for the skill.
Avoid modifying unrelated project files while working on audio analysis tasks.

Usage Guidance

This skill appears to do what it claims: offline analysis of audio files using librosa and local tools. Before installing or running:

1) Make sure you trust the local binaries it calls (ffmpeg/ffprobe and optionally whisper-cli), because the scripts invoke them via subprocess.
2) The hardcoded whisper-cli path (/opt/homebrew/bin/whisper-cli) and the model path (~/.local/share/whisper-cpp/...) are macOS- and home-directory-specific. If Whisper or the model is missing, the code skips transcription; if they are present, be aware that Whisper models can be large.
3) The README suggests using yt-dlp to fetch YouTube audio. Following that workflow downloads data from the network.
4) No credentials are requested and there are no network callbacks in the code, but you should still review the included scripts yourself before running them on sensitive data.

For higher confidence, run the scripts in an isolated environment (a temporary folder or a container). If you do not want transcription to run, confirm whether whisper-cli and the model are present on the machine.

Capability Analysis

Type: OpenClaw Skill
Name: music-analysis
Version: 3.0.2

The music-analysis skill bundle is a legitimate tool for local audio processing, utilizing librosa and numpy for feature extraction (tempo, key, timbre) and whisper-cli for lyric transcription. The implementation in scripts like listen.py and analyze_music.py follows security best practices by using subprocess with argument lists to interact with ffmpeg and ffprobe, preventing shell injection. No evidence of data exfiltration, network communication, or malicious prompt injection was found; the hardcoded paths for Whisper components are environment-specific but not harmful.

Capability Assessment

✓ Purpose & Capability

Name/description (local music/audio analysis) align with the included Python scripts and declared dependencies (librosa, numpy, ffmpeg/ffprobe). The code implements tempo, timbre, structure, instrument detection and optional lyric alignment via a local Whisper CLI/model which fits the stated purpose. Minor note: SKILL.md suggests using yt-dlp for fetching YouTube audio (network use) as an optional audio sourcing method; that is outside the 'no external APIs' claim but presented as an explicit optional workflow.

✓ Instruction Scope

Runtime instructions and scripts operate on local audio files, run ffmpeg/ffprobe and (optionally) whisper-cli, and write analysis reports to disk if requested. The SKILL.md and scripts do not instruct reading unrelated system files, environment secrets, or posting data to remote endpoints. Whisper usage is optional and the code includes a fallback path if Whisper is missing.

✓ Install Mechanism

There is no install spec and requirements.txt only lists librosa and numpy. The skill relies on system binaries (ffmpeg/ffprobe) and optionally a locally installed whisper-cli and model file; nothing in the repository pulls code from arbitrary URLs or writes external installers. This is a low-risk install posture.

✓ Credentials

The skill declares no required environment variables or credentials. The code does reference concrete filesystem paths (a Homebrew whisper-cli path and a home-dir model path) but these are optional and not secrets. No credentials or sensitive environment access is requested.

✓ Persistence & Privilege

The skill is user-invocable, not always-included, and does not modify other skills or agent-wide configuration. It runs as a normal local tool and does not request elevated or persistent privileges.

Version History

v3.0.2

Public docs cleanup: replace internal sandbox/trading references with generic workspace hygiene guidance.

v3.0.1

Cleanup release: remove stale internal v2 reference, rename public output section to 'Instrument Read', and replace first-person/internal language in skill docs and generated output with neutral public wording.

v3.0.0

v3: Instrument detection — spectral profile matching for 11 instrument families, per-section arrangement tracking, natural language narrative ('WHAT I HEAR' section). Upright/electric bass tiebreaker. Joel directive: emotion over science.

v1.2.0

Release 1.2.0: improved full-listen workflow, lyric-aware synthesis, and local analysis tooling.

v1.0.1

Improve full-listen workflow, lyric-aware synthesis, and local analysis tooling.

v1.1.0

Bug fixes: ffprobe path resolution, whisper-cpp integration, min-tokens threshold

v1.0.0

Initial release: analyze_music.py (snapshot) + temporal_listen.py (journey). Pure librosa signal processing — BPM, key, energy, spectral, mood timelines, narrative arcs. No external APIs.

Metadata

Slug music-analysis

Version 3.0.2

License MIT-0

All-time Installs 5

Active Installs 5

Total Versions 7

Frequently Asked Questions

What is Music Analysis?

Music Analysis is an AI Agent Skill for Claude Code / OpenClaw that analyzes music/audio files locally without external APIs. It has 543 downloads so far.

How do I install Music Analysis?

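Run "/install music-analysis" in the OpenClaw or Claude Code chat to install the skill in one step. The analysis scripts themselves still need the items listed under Dependencies (librosa, numpy, ffmpeg, ffprobe) to be available locally.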

Is Music Analysis free?

Yes, Music Analysis is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Music Analysis support?

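Music Analysis runs anywhere OpenClaw / Claude Code is available. The optional Whisper lyric step uses a hardcoded Homebrew (macOS) whisper-cli path and is skipped when that binary or its model is missing.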

Who created Music Analysis?

It is built and maintained by Adam-Researchh (@adam-researchh); the current version is v3.0.2.
