Feature: sonic analysis provider by chrisuthe · Pull Request #3516 · music-assistant/server

chrisuthe · 2026-03-31T01:19:57Z

Sonic Analysis Provider

This PR adds a sonic analysis provider that extracts semantic audio features from PCM audio during playback using librosa and stores them as standard AudioAnalysisData fields.

What It Does

Processes audio in 10-second blocks with overlap to avoid STFT artifacts
Derives human-readable semantic descriptors from raw spectral features:

| Field | Derivation |
| --- | --- |
| `bpm` | Tempo estimation from onset envelope |
| `key` / `mode` | Krumhansl-Kessler profile correlation against chroma |
| `energy` | Normalized mean RMS |
| `danceability` | Onset regularity + tempo suitability |
| `loudness_integrated` / `loudness_range` | RMS-derived dB approximations |
| `brightness` | Spectral centroid normalized against Nyquist |
| `harmonic_complexity` | Shannon entropy of mean chroma vector |
| `roughness` | Spectral contrast range + flatness |
| `rhythmic_regularity` | Inter-onset interval coefficient of variation |
| `rms_energy_per_second` / `spectral_centroid_per_second` | Per-second time series |

Stores results via set_audio_analysis() as a plain AudioAnalysisData — no opaque blobs, no custom subclasses
Proposes 4 new upstream fields: brightness, harmonic_complexity, roughness, rhythmic_regularity
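
Three of the derivations in the table are simple enough to sketch in plain NumPy. The following is an illustration under stated assumptions, not the PR's `_derive_*` helpers: the function names, the scaling of entropy to 0..1, and the `1 / (1 + cv)` mapping for regularity are all choices made here for the example.

```python
import numpy as np


def brightness(centroid_hz: np.ndarray, sample_rate: int) -> float:
    """Mean spectral centroid as a fraction of the Nyquist frequency (0..1)."""
    nyquist = sample_rate / 2.0
    return float(np.clip(np.mean(centroid_hz) / nyquist, 0.0, 1.0))


def harmonic_complexity(mean_chroma: np.ndarray) -> float:
    """Shannon entropy of the mean chroma vector, scaled to 0..1 (assumed scaling)."""
    total = float(np.sum(mean_chroma))
    if total <= 0.0:
        return 0.0  # silent or empty input: no pitch content to measure
    p = mean_chroma / total
    p = p[p > 0]  # bins with zero weight contribute nothing to the entropy
    entropy_bits = float(-np.sum(p * np.log2(p)))
    return max(0.0, entropy_bits / float(np.log2(len(mean_chroma))))


def rhythmic_regularity(onset_times_s: np.ndarray) -> float:
    """Map the inter-onset-interval coefficient of variation to 0..1 (1 = steady)."""
    intervals = np.diff(onset_times_s)
    if len(intervals) < 2 or np.mean(intervals) <= 0:
        return 0.0  # too few onsets to judge regularity
    cv = np.std(intervals) / np.mean(intervals)
    return float(1.0 / (1.0 + cv))  # assumed mapping: cv = 0 gives 1.0
```

A perfectly even chroma vector scores the maximum complexity, a single active pitch class scores zero, and evenly spaced onsets score a regularity of 1.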

Architecture

The provider is a pure feature extraction + distillation layer. It does not store similarity vectors or compute distances — that responsibility belongs to the similarity plugin (separate stacked PR).

Both the provider and the similarity plugin depend on the shared AudioAnalysisData model contract. Any audio analysis provider that populates the same fields can feed the similarity plugin.
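
To make the idea of a shared contract concrete, here is a minimal sketch. `AnalysisRecord` is a hypothetical stand-in, not the real `AudioAnalysisData` model; its field types and the `missing_fields` helper are assumptions for illustration only. The point is that a consumer only needs to check which agreed-upon fields are populated, regardless of which provider filled them in.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AnalysisRecord:
    """Hypothetical stand-in for AudioAnalysisData; every field is optional."""

    bpm: float | None = None
    key: str | None = None
    mode: str | None = None
    energy: float | None = None
    danceability: float | None = None
    brightness: float | None = None
    harmonic_complexity: float | None = None
    roughness: float | None = None
    rhythmic_regularity: float | None = None


def missing_fields(record: AnalysisRecord, required: tuple[str, ...]) -> list[str]:
    """Return the names of required fields that a provider left unset (None)."""
    return [name for name in required if getattr(record, name) is None]
```

A consumer could call `missing_fields(record, ("energy", "bpm"))` and decide whether it has enough data to proceed.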

Code Organization

helpers.py — Pure feature extraction (extract_block_features, merge_block_features) and semantic derivation (collapse_to_analysis with private _derive_* helpers)
__init__.py — MA integration: PCM streaming, block accumulation, session management, _finalize() stores the result
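
The block accumulation step can be pictured with an offline sketch that slices a mono signal into overlapping 10-second blocks. This is not the provider's streaming code; the function name and the 0.5-second overlap are assumptions, since the PR states only that blocks overlap.

```python
from collections.abc import Iterator

import numpy as np


def iter_blocks(
    samples: np.ndarray,
    sample_rate: int,
    block_seconds: float = 10.0,
    overlap_seconds: float = 0.5,  # assumed value; the PR does not give one
) -> Iterator[np.ndarray]:
    """Yield blocks whose start positions advance by block length minus overlap."""
    block_len = int(block_seconds * sample_rate)
    hop = block_len - int(overlap_seconds * sample_rate)
    if hop <= 0:
        raise ValueError("overlap must be shorter than the block")
    for start in range(0, len(samples), hop):
        yield samples[start : start + block_len]
        if start + block_len >= len(samples):
            break  # the final (possibly shorter) block has been emitted
```

Each block repeats the tail of the previous one, so frames near a block edge are always analyzed with real neighboring samples on at least one side.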

Testing

test_helpers.py — Tests for block extraction, merging, and collapse_to_analysis (scalar ranges, determinism, noise vs sine differentiation)
test_provider_units.py — Tests for PCM byte conversion (16/24/32-bit, mono/stereo)
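
For readers unfamiliar with the PCM byte conversion being tested, a minimal sketch follows. It assumes little-endian signed integer PCM with interleaved channels and a buffer that holds whole frames; the function name is hypothetical and the PR's own conversion may differ (for example, if 32-bit input is float rather than integer).

```python
import numpy as np


def pcm_to_float_mono(data: bytes, bit_depth: int, channels: int) -> np.ndarray:
    """Decode little-endian signed integer PCM to mono float32 in [-1, 1)."""
    if bit_depth == 16:
        ints = np.frombuffer(data, dtype="<i2").astype(np.int32)
    elif bit_depth == 32:
        ints = np.frombuffer(data, dtype="<i4")
    elif bit_depth == 24:
        # NumPy has no 24-bit integer type, so assemble each sample from 3 bytes.
        raw = np.frombuffer(data, dtype=np.uint8).reshape(-1, 3).astype(np.int32)
        ints = raw[:, 0] | (raw[:, 1] << 8) | (raw[:, 2] << 16)
        ints = np.where(ints >= 1 << 23, ints - (1 << 24), ints)  # sign-extend
    else:
        raise ValueError(f"unsupported bit depth: {bit_depth}")
    scaled = ints.astype(np.float64) / float(1 << (bit_depth - 1))
    # Interleaved layout: one row per frame, one column per channel.
    frames = scaled.reshape(-1, channels)
    return frames.mean(axis=1).astype(np.float32)  # downmix by averaging
```

Averaging the channels is one simple downmix choice; a real provider might keep channels separate or weight them differently.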

Dependencies

Builds on top of the upstream audio_analysis_controller_provider branch (PR #3509, "Add Audio Analysis controller and Audio Analysis provider")
The sonic similarity plugin (fork PR chrisuthe/server#1, "Add Sonic Similarity plugin for audio-based track similarity") stacks on this branch

github-actions · 2026-03-31T01:21:01Z

🔒 Dependency Security Report

✅ No dependency changes detected in this PR.

… base class Provider extracts audio features from PCM streams using librosa and stores them as semantic AudioAnalysisData fields (BPM, key, mode, energy, danceability, brightness, harmonic_complexity, roughness, rhythmic_regularity, loudness, beats, duration, true_peak, wave_form).

Adapted to upstream AudioAnalysisProvider API:
- _start_analysis returns bool (replaces old start_analysis override)
- Uses streamdetails (not stream_details)
- Stores via mass.streams.audio_analysis.set_audio_analysis()

Empty frequency sets and flat chroma profiles produce harmless warnings during key detection. Now suppressed with targeted warning filters and NaN handling for zero-std correlations.
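
The zero-std case arises because Pearson correlation is undefined for a constant vector. The sketch below shows Krumhansl-Kessler key detection with that case handled by an explicit pre-check that returns `None`. That is a different technique from the warning filters and NaN handling the commit describes, and the function name and return shape are assumptions; only the two profile tables are the published Krumhansl-Kessler values.

```python
from __future__ import annotations

import numpy as np

# Krumhansl-Kessler probe-tone ratings, indexed from the tonic.
KK_MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
KK_MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def detect_key(mean_chroma: np.ndarray) -> tuple[str, str] | None:
    """Return the (key, mode) whose profile correlates best, or None if undefined."""
    # A flat (zero-variance) chroma would make np.corrcoef warn and return NaN,
    # so bail out before computing any correlation.
    if not np.all(np.isfinite(mean_chroma)) or np.std(mean_chroma) == 0:
        return None
    best: tuple[float, str, str] | None = None
    for mode, profile in (("major", KK_MAJOR), ("minor", KK_MINOR)):
        for tonic in range(12):
            # Rotate the profile so its tonic weight lands on this pitch class.
            r = float(np.corrcoef(mean_chroma, np.roll(profile, tonic))[0, 1])
            if best is None or r > best[0]:
                best = (r, PITCH_CLASSES[tonic], mode)
    return (best[1], best[2]) if best is not None else None
```

Returning `None` for a flat profile leaves the `key` and `mode` fields unset instead of reporting an arbitrary key.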

…provider

…scan

Override the AudioAnalysisProvider.analyze_file hook so upstream's AudioAnalysisController._run_background_scan can drive backfill through the generic provider-agnostic interface. Loads audio via librosa, runs block feature extraction and collapse, populates duration and true_peak.

Three cleanups in one commit:

1. Stop computing overlap fields in librosa:
     bpm <- overlaid by smart_fades (beat_this CNN)
     key, mode <- overlaid by smart_fades (S-KEY)
     danceability <- overlaid by clap_analysis (zero-shot, calibrated)
   These were quality-inferior to their overlay sources, and the overlay system guaranteed replacement at vector-assembly time. Computing them in librosa was wasted work; leaving their AudioAnalysisData fields None makes the architecture honest. An install must have the relevant overlay providers enabled or vectors won't assemble; the "no valid signatures found" diagnostic added in the previous commit tells the user exactly which fields are missing when that happens.

2. Remove dead-code feature extractions:
     librosa.feature.mfcc
     librosa.feature.tonnetz
     librosa.feature.spectral_rolloff
     librosa.feature.zero_crossing_rate
   These were extracted per block and stored on BlockFeatures but never read by collapse_to_analysis. Legacy from an earlier vector schema. Removing them saves roughly 100ms per 10s block of analyzed audio; for a typical 3-min track (18 blocks), that is ~1.8s less CPU per analysis.

3. Fix pre-existing stale field names in test_helpers.py:
     rms_energy_per_second -> rms_energy
     spectral_centroid_per_second -> spectral_centroid
   These referenced the pre-upstream-alignment field names and had been silently failing 2 tests since the AudioAnalysisData model was updated.

Net: -157/+69 lines in helpers.py, test surface shrunk to match. All 102 sonic_analysis + sonic_similarity tests pass.

extract_block_features previously called four librosa feature functions that each computed their own STFT internally: four redundant spectrograms per 10s block. All four (chroma_stft, spectral_contrast, spectral_centroid, spectral_flatness) share the same default n_fft=2048 / hop_length=512, so a single up-front STFT is the correct input for all of them via librosa's `S=` kwarg.

Verified byte-identical output to the old per-feature path (max abs diff = 0 on all four feature matrices). All 10 sonic_analysis tests pass unchanged.

Measured: 1.56x speedup on a 10s block (25ms -> 16ms). For a 3-min track (18 blocks), that's ~180ms saved per analyzed track. At a user's 12k-track library scale, ~36 minutes of CPU time per full background scan. Millions-of-tracks libraries benefit proportionally.

RMS and onset_strength are left unchanged: RMS is time-domain, and onset_strength uses a mel spectrogram with different parameters.

chrisuthe added enhancement new-feature and removed enhancement labels Mar 31, 2026

MacgyverH reviewed Mar 31, 2026


Comment thread music_assistant/providers/sonic_analysis/helpers.py Outdated

MarvinSchenkel reviewed Mar 31, 2026


Comment thread music_assistant/providers/sonic_analysis/__init__.py

chrisuthe force-pushed the task/sonic-analysis-provider branch 2 times, most recently from 481723c to 043220e Compare April 1, 2026 14:44

chrisuthe mentioned this pull request Apr 2, 2026

Add Sonic Similarity plugin for audio-based track similarity chrisuthe/server#1

Closed

chrisuthe changed the title ~~WIP: Feature: sonic analysis provider~~ Feature: sonic analysis provider Apr 2, 2026

chrisuthe force-pushed the task/sonic-analysis-provider branch from 729ed8a to 3d93152 Compare April 5, 2026 15:37

chrisuthe force-pushed the task/sonic-analysis-provider branch 2 times, most recently from b04676e to fa65f09 Compare April 21, 2026 16:24

chrisuthe added 2 commits April 21, 2026 11:28

fix(sonic_analysis): suppress librosa tuning and numpy corrcoef warnings

7f5d82d

Empty frequency sets and flat chroma profiles produce harmless warnings during key detection. Now suppressed with targeted warning filters and NaN handling for zero-std correlations.

chrisuthe force-pushed the task/sonic-analysis-provider branch from fa65f09 to 7f5d82d Compare April 21, 2026 16:28

chrisuthe added 4 commits April 22, 2026 09:02

Merge remote-tracking branch 'upstream/dev' into task/sonic-analysis-…

ed39664

…provider

chrisuthe closed this Apr 27, 2026

chrisuthe mentioned this pull request Apr 30, 2026

Stream PCM to audio analysis providers during background scan #3821

Merged

chrisuthe deleted the task/sonic-analysis-provider branch May 4, 2026 19:49

chrisuthe wants to merge 6 commits into music-assistant:dev from chrisuthe:task/sonic-analysis-provider

3 participants
