audio: Add FluidSynth MIDI renderer as an alternative to native audio renderer #1117

Closed
bassdr wants to merge 19 commits into Kenix3:main from bassdr:feature/midi-synth-fluidsynth

@bassdr bassdr commented May 30, 2026

Summary

Adds an optional soft-synth abstraction layer that lets a consumer replace the engine's native PCM synthesis with a MIDI-driven backend (FluidSynth). Zero impact when disabled — when no synth is installed MidiSynthManager::GetActiveSynth() returns nullptr and the audio thread is the exact original PCM path.

Requires the float pipeline PR (this branch is based on it). FluidSynth renders at device output rate and feeds in via SetMixSource, bypassing the resampler.

Based on #1116
Closes #1116
Closes #1106

Testing

You can load any SF2 file and configure some instruments to play it.
I personally used MuseScore General for my experiments. Grab the .sf2 version. FluidSynth also supports .sf3, so I can make the .sf3 file work later.

You can use this configuration to test; it sounds pretty good IMO:
fluidsynth_overrides.json

Core abstraction

  • IMidiSynth: pure-virtual interface — NoteOn, NoteOff, ProgramChange, PitchBend, ControlChange, Render
  • MidiSynthManager: thread-safe singleton. SetSynth() must not be called from the audio thread. GetActiveSynth() returns nullptr when no synth is installed.
  • FluidSynth: IMidiSynth implementation backed by libfluidsynth. Gated by -DENABLE_FLUIDSYNTH=ON (default OFF); FluidSynth.cpp is excluded from the build when the option
    is off, so there is no link-time dependency unless explicitly requested.
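
A minimal sketch of how the interface and the nullptr fallback described above could fit together. The method names come from the list above; every parameter list, the `SketchSynthManager` and `SilentSynth` classes, and the `RenderSynthIfInstalled` helper are hypothetical, and the manager here is a plain class rather than a singleton.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical signatures: the PR names these methods but does not show their parameters.
class IMidiSynth {
  public:
    virtual ~IMidiSynth() = default;
    virtual void NoteOn(int channel, int note, int velocity) = 0;
    virtual void NoteOff(int channel, int note) = 0;
    virtual void ProgramChange(int channel, int program) = 0;
    virtual void PitchBend(int channel, float semitones) = 0;
    virtual void ControlChange(int channel, int controller, int value) = 0;
    // Fills `frames` interleaved stereo frames.
    virtual void Render(float* out, int frames) = 0;
};

// Stand-in for MidiSynthManager (assumption: a mutex-guarded shared_ptr).
class SketchSynthManager {
  public:
    void SetSynth(std::shared_ptr<IMidiSynth> synth) {
        std::lock_guard<std::mutex> lock(mMutex);
        mSynth = std::move(synth);
    }
    std::shared_ptr<IMidiSynth> GetActiveSynth() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mSynth;
    }

  private:
    mutable std::mutex mMutex;
    std::shared_ptr<IMidiSynth> mSynth;
};

// A silent synth used only to exercise the manager.
class SilentSynth final : public IMidiSynth {
  public:
    void NoteOn(int, int, int) override {}
    void NoteOff(int, int) override {}
    void ProgramChange(int, int) override {}
    void PitchBend(int, float) override {}
    void ControlChange(int, int, int) override {}
    void Render(float* out, int frames) override {
        for (int i = 0; i < frames * 2; i++) {
            out[i] = 0.0f;
        }
    }
};

// Returns true when a synth rendered the buffer, false when the caller
// should stay on the native PCM path.
bool RenderSynthIfInstalled(const SketchSynthManager& manager, float* out, int frames) {
    assert(out != nullptr && frames >= 0);
    std::shared_ptr<IMidiSynth> synth = manager.GetActiveSynth();
    if (!synth) {
        return false;
    }
    synth->Render(out, frames);
    return true;
}
```

Holding a `shared_ptr` copy for the duration of the render keeps the synth alive even if `SetSynth(nullptr)` runs on another thread in the meantime. A real audio thread would likely want a cheaper handoff than a mutex.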

Graham-Smith volume curve (optional)

Also adds an optional volume-curve mode to FluidSynth.

FluidSynth(sampleRate, linearVelocity) — when linearVelocity=true, replaces three SF2 default modulators on GEN_ATTENUATION with versions that keep the concave negative
curve shape but halve the amount (960 cB → 480 cB). This lifts quiet voices ~6 dB at mid-range without flattening dynamics. Technique credited to ANMP (GPL-2, github.com/derselbst/ANMP).

Also raises synth.gain from FluidSynth's stock 0.2 to 1.0 unconditionally — at 0.2 the synth output peak is ~5× quieter than a typical PCM mix partner.

SetReverbParams and similar runtime effect methods

Lets callers swap reverb presets without rebuilding the synth. Safe to call any time after construction.

Support adding *.sf2 and *.sf3 files in mods

Adding SoundFonts to mods is now supported. They can be read directly from the *.o2r files. A *.json file with the same name as the SoundFont can also be added to ship a pre-configured instrument mapping alongside it.

Design notes

  • SetSynth(nullptr) gracefully uninstalls the active synth; the audio thread reverts to native PCM on the next buffer.
  • Render() is called from the audio thread and must be real-time safe. The FluidSynth implementation holds a mutex shared with all event methods (NoteOn, ProgramChange, etc.).
  • FluidSynth reads synth.sample-rate once at construction via new_fluid_synth() — fluid_synth_set_sample_rate() is deprecated in FluidSynth 2.x and silently ignored.
  • The linearVelocity name is kept (rather than renamed to grahamSmith) for CVar key stability in consuming projects.

@bassdr bassdr force-pushed the feature/midi-synth-fluidsynth branch from 3481b89 to 386b18d Compare June 1, 2026 14:20
@bassdr bassdr force-pushed the feature/midi-synth-fluidsynth branch 2 times, most recently from 335a87f to b8c4fcb Compare June 8, 2026 19:59
@bassdr bassdr force-pushed the feature/midi-synth-fluidsynth branch 2 times, most recently from e97b5e3 to b007e18 Compare June 14, 2026 00:56
bassdr and others added 19 commits June 17, 2026 20:45
Add AudioResampler, a polyphase windowed-sinc resampler supporting
arbitrary integer ratios. Designed for N64 audio upsampling from
32000 Hz to 48000 Hz (exact ratio 3/2, P=3 Q=2, 8 taps/phase,
Kaiser window beta=6, ~60 dB stopband attenuation).

- AudioSettings gains SourceSampleRate (default 0 = passthrough)
- AudioPlayer::Play() resamples transparently before DoPlay() when
  SourceSampleRate != SampleRate
- GetDesiredBuffered() scales from source rate to output rate so
  OTRAudio_Thread fill logic remains coherent
- Resample output uses a fixed std::array<int16_t, 16384> — no heap
  allocation on the audio hot path
- Default SampleRate changed from 44100 to 48000 Hz
- AudioPlayer destructor made virtual to fix UB in derived class dtors
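
The rate scaling mentioned for GetDesiredBuffered() can be illustrated with a small sketch. It reduces the two sample rates to the P/Q ratio quoted above (3/2 for 32000 Hz to 48000 Hz) and converts a frame count from the source rate to the output rate. The names `Ratio`, `ReduceRatio`, and `ScaleFramesToOutputRate` are hypothetical, and rounding up is an assumption rather than the PR's documented behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Output/source ratio in lowest terms: P output frames per Q source frames.
struct Ratio {
    uint32_t p;
    uint32_t q;
};

Ratio ReduceRatio(uint32_t sourceRate, uint32_t outputRate) {
    assert(sourceRate > 0 && outputRate > 0);
    const uint32_t g = std::gcd(outputRate, sourceRate);
    return { outputRate / g, sourceRate / g };
}

// Converts a frame count at the source rate to the equivalent count at the
// output rate, rounding up so a partial output frame is never lost.
uint64_t ScaleFramesToOutputRate(uint64_t sourceFrames, uint32_t sourceRate, uint32_t outputRate) {
    const Ratio r = ReduceRatio(sourceRate, outputRate);
    return (sourceFrames * r.p + r.q - 1) / r.q;
}
```

For example, 320 frames at 32000 Hz correspond to 480 frames at 48000 Hz.
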
AudioPlayer gains a parallel float-precision pipeline that consumers can
opt into via AudioSettings::UseFloatPipeline. The s16 path is unchanged
and remains the default, so existing libultraship consumers keep their
byte-exact contract; SoH flips to float when FluidSynth is enabled.

New entry points
- AudioPlayer::Play(const float*, size_t frames) alongside the legacy
  Play(const uint8_t*, size_t len). Each Play asserts it's called in
  the matching mode and drops the buffer with a warning otherwise.
- AudioPlayer::SetUseFloatPipeline(bool) — runtime mode switch via
  DoClose → flip → DoInit. Reverts on failure.
- AudioPlayer::SetMixSource(std::function<void(float*, int)>) — a
  secondary stereo source mixed in *after* the resampler so its
  contribution skips the rate-conversion step entirely. Returns false
  in s16 mode. AudioPlayer sums the source with a tanh-style soft-clip
  before any surround decode.
- audiobridge: AudioPlayerPlayFrame (legacy uint8_t) preserved;
  AudioPlayerPlayFrameF32 added for the float path.
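
As a rough illustration of the SetMixSource stage, the sketch below sums a secondary stereo source into an already-resampled float buffer and soft-clips the result. The callback type matches the signature quoted above, but reading its `int` argument as a frame count, using plain `std::tanh` as the "tanh-style" clip, and the `MixSecondarySource` helper itself are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Same shape as the SetMixSource callback; the int is assumed to be a frame count.
using MixSource = std::function<void(float*, int)>;

// Assumed soft-clip: plain tanh, which keeps the output strictly inside (-1, 1).
inline float SoftClip(float x) {
    return std::tanh(x);
}

// Sums the secondary source into `primary` (interleaved stereo, already at the
// output rate) and soft-clips each sample. `scratch` is caller-owned so it can
// be reserved once up front instead of allocating on the audio hot path.
void MixSecondarySource(float* primary, int frames, const MixSource& source,
                        std::vector<float>& scratch) {
    assert(primary != nullptr && frames >= 0);
    if (!source) {
        return; // no secondary source installed: buffer is left untouched
    }
    const std::size_t samples = static_cast<std::size_t>(frames) * 2;
    scratch.assign(samples, 0.0f);
    source(scratch.data(), frames);
    for (std::size_t i = 0; i < samples; i++) {
        primary[i] = SoftClip(primary[i] + scratch[i]);
    }
}
```

Because the source is summed after resampling, it has to render at the device output rate, which is what the FluidSynth backend is described as doing.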

DSP layer
- AudioResampler::Process gains a float overload that operates on
  interleaved float [-1, 1]. The original int16_t overload is preserved
  and wraps the float core with boundary conversions and clamping.
- SoundMatrixDecoder::Process likewise: native float overload plus a
  legacy uint8_t (s16) overload that converts at the boundaries. Both
  reuse the float-internal filter / phase / delay state.
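
The "boundary conversions and clamping" around a float core might look like the sketch below. The 32768 scale factor, the rounding mode, and the `ProcessS16` wrapper (which applies a per-sample core for brevity) are assumptions; the actual overloads operate on whole interleaved buffers with filter state.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// s16 -> float in [-1, 1).
inline float S16ToFloat(int16_t s) {
    return static_cast<float>(s) / 32768.0f;
}

// float -> s16 with rounding and clamping so out-of-range input saturates
// instead of wrapping around.
inline int16_t FloatToS16(float x) {
    const float scaled = std::round(x * 32768.0f);
    return static_cast<int16_t>(std::clamp(scaled, -32768.0f, 32767.0f));
}

// Legacy-style wrapper: convert in, run the float core, convert back out.
template <typename FloatCore>
void ProcessS16(const int16_t* in, int16_t* out, std::size_t count, FloatCore core) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < count; i++) {
        out[i] = FloatToS16(core(S16ToFloat(in[i])));
    }
}
```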

Backends
- SDL / PipeWire / WASAPI / CoreAudio DoInit reads
  IsUsingFloatPipeline() and configures the device format (F32 vs S16).
  Buffered() divides by the matching sample size.
- PipeWire's ring init, sample width in OnProcess, and underrun
  fade-out math all branch on the same flag.

AudioPlayer reorder in float mode
- Stages: resample stereo → optional MixSource sum + soft-clip →
  surround decode (matrix-5.1) → DoPlay. Lets a secondary source
  produced at GetSampleRate() bypass the resampler.
- Resampler channel count differs between modes: stereo (2) in float
  mode (mix + surround decode follow), GetNumOutputChannels() in s16
  mode (legacy decode-first order preserved). RebuildResampler() picks
  the right value on Init / SetUseFloatPipeline / SetAudioChannels.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…kend

Introduces a soft-synth abstraction layer for MIDI-driven synthesis:

- IMidiSynth: pure-virtual interface (NoteOn, NoteOff, ProgramChange,
  PitchBend, ControlChange, Render)
- MidiSynthManager: thread-safe singleton; when nullptr no synth is
  active and the native PCM pipeline is unchanged
- FluidSynth: IMidiSynth implementation backed by libfluidsynth,
  gated by -DENABLE_FLUIDSYNTH=ON; uses the float pipeline
  (SetMixSource) to render directly at device output rate

CMake: adds ENABLE_FLUIDSYNTH option (default OFF); FluidSynth.cpp is
excluded from the build when the option is off.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `bool linearVelocity` parameter to the FluidSynth constructor
(default false preserves stock SF2 behavior). When true,
InstallLinearVelocityModulators() runs once after new_fluid_synth() and
replaces three SF2 default modulators on GEN_ATTENUATION with versions
that keep the perceptual **concave NEGATIVE** shape but halve the
amount (960 cB → 480 cB), pulling the maximum attenuation from −96 dB
to −48 dB:

  1. NoteOn velocity → concave at 480 cB.
  2. CC7 (channel volume) → concave at 480 cB.
  3. CC11 (expression) → concave at 480 cB.

ANMP calls this the "Graham-Smith volume curve" —
`dB = 20·log10(x/127)` instead of the SF2 spec's `40·log10(x/127)`.
Lifts quiet voices ~6 dB at mid-range and more at the low end while
preserving the dynamics shape and the SF2's natural taper near the top
of each input range.
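
The two formulas quoted above can be compared numerically. This sketch evaluates both curves for a 7-bit input and reports the difference; the function names are hypothetical, and it uses the idealized log formulas rather than FluidSynth's internal curve tables.

```cpp
#include <cassert>
#include <cmath>

// Level in dB for a 7-bit input value. slopeDb is 40 for the SF2 default
// (960 cB) and 20 for the halved Graham-Smith amount (480 cB).
double CurveDb(int value, double slopeDb) {
    assert(value >= 1 && value <= 127);
    return slopeDb * std::log10(value / 127.0);
}

// How much louder the halved curve is than the SF2 default at a given input.
double GrahamSmithLiftDb(int value) {
    return CurveDb(value, 20.0) - CurveDb(value, 40.0);
}
```

At value 64 the default curve gives about -11.9 dB and the halved curve about -5.95 dB, so the lift is roughly 6 dB, consistent with the mid-range figure above. The lift grows toward low input values and falls to zero at 127.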

Also lift `synth.gain` from FluidSynth's stock 0.2 to 1.0
unconditionally — at 0.2 the synth's output peak is ~5× quieter than
the native PCM peak (~1.0) it gets mixed against in the additive
Point B path; the soft-clip in OTRGlobals handles brief over-budget
sums. This applies whether or not Graham-Smith is enabled — the
imbalance is structural.

Four implementation notes worth recording:

  1. `fluid_synth_add_default_mod(... OVERWRITE)` only swaps a default
     in place when fluid_mod_test_identity() matches every source flag
     (including the curve type CONCAVE/LINEAR/CONVEX). A first iteration
     of this code switched CC11 to linear and relied on OVERWRITE; the
     identity check failed, OVERWRITE silently degraded to "append",
     and the linear modulator stacked on top of the SF2's concave one
     — adding an extra ~18 dB of attenuation at typical mid-range
     CC11 values. Use fluid_synth_remove_default_mod followed by
     add_default_mod to make the intent explicit and the result
     correct regardless of flag matching.

  2. The same iteration tried to make CC11 truly linear under the
     theory that the translator's sqrt(velocity) curve should be the
     only nonlinear shaping in the chain. But linear NEGATIVE is much
     harsher in the mid-range than concave NEGATIVE (~50% vs ~13%
     attenuation at CC11=64) — every mid-velocity voice ended up
     ~10 dB quieter than stock. Keep concave.

  3. ANMP's own CC11 handling is a *removal* of the modulator
     (Dinosaur Planet uses CC11 for something else). We keep CC11
     active because the SoH translator drives loudness dynamics
     through it.

  4. The modulator install runs after new_fluid_synth() but before
     LoadSoundFont() — SF2 instrument-level modulators are layered
     on top of these defaults at load time, so the synth-level
     defaults have to be in place first.
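
The OVERWRITE pitfall from note 1 can be modelled without linking libfluidsynth. The sketch below is a toy model, not FluidSynth code: `Modulator`, `SameIdentity`, and `DefaultModList` are hypothetical stand-ins that reproduce the behavior described above, where an overwrite only replaces an entry whose identity (source, curve, destination) matches and otherwise appends.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Curve { Linear, Concave, Convex };

constexpr int kCc11Expression = 11; // MIDI CC number for expression
constexpr int kGenAttenuation = 48; // SF2 generator number for initialAttenuation

struct Modulator {
    int source;
    Curve curve;
    int dest;
    double amountCb; // amount in centibels
};

// Identity ignores the amount but includes the curve type, as described above.
bool SameIdentity(const Modulator& a, const Modulator& b) {
    return a.source == b.source && a.curve == b.curve && a.dest == b.dest;
}

class DefaultModList {
  public:
    // Models add-with-OVERWRITE: replace on identity match, otherwise append.
    void AddOverwrite(const Modulator& mod) {
        assert(mod.amountCb >= 0.0);
        for (Modulator& existing : mMods) {
            if (SameIdentity(existing, mod)) {
                existing.amountCb = mod.amountCb;
                return;
            }
        }
        mMods.push_back(mod); // no match: silently stacks a second modulator
    }
    // Models remove: drops every entry whose identity matches `mod`.
    void Remove(const Modulator& mod) {
        mMods.erase(std::remove_if(mMods.begin(), mMods.end(),
                                   [&](const Modulator& e) { return SameIdentity(e, mod); }),
                    mMods.end());
    }
    std::size_t Size() const {
        return mMods.size();
    }
    double TotalAmountCb() const {
        double total = 0.0;
        for (const Modulator& m : mMods) {
            total += m.amountCb;
        }
        return total;
    }

  private:
    std::vector<Modulator> mMods;
};
```

In this model, overwriting the stock concave CC11 modulator with a linear one leaves two entries, while removing the stock entry first and then adding leaves exactly one, whichever curve the replacement uses.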

Reimplemented from scratch (no submodule) with attribution to ANMP
(GPL-2, github.com/derselbst/ANMP), specifically
src/InputLibraryWrapper/FluidsynthWrapper.cpp around L300-333. ANMP's
game-specific CC overrides (CBFD/JFG IIR filter, Dinosaur Planet ADSR
CCs) are intentionally not ported.

The parameter, member, and method retain the historical "linearVelocity"
name for git blame continuity and to mirror the Shipwright-side CVar
key (`CVAR_AUDIO("FluidSynthLinearVelocity")`) which we don't want to
rename out from under saved user settings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a public SetReverbParams(roomsize, damping, width, level) method on
FluidSynth that calls fluid_synth_set_reverb_{roomsize,damp,width,level}
under the synth mutex. Safe to call any time after construction so
callers can swap reverb presets without rebuilding the synth — the
SoH-side Authentic/Enhanced mode switch uses this to apply console-era
or musically-curated reverb defaults at Apply time without paying the
LoadSoundFont cost.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a memory-backed sound-font load path alongside the existing
filesystem one. Consumers (e.g. SoH's modded synth packs distributed
inside .o2r archives) can hand the SF2 bytes directly to FluidSynth
without first extracting to a temp file.

- FluidSynth gains LoadSoundFontFromMemory(data, size). The buffer is
  copied into an instance-owned vector that lives as long as the
  sfont stays loaded, so FluidSynth can safely read from the memory
  during and after sfload.
- A custom fluid_sfloader is registered at construction with five
  callbacks (open/read/seek/tell/close) that read from a static
  in-flight pointer set during LoadSoundFontFromMemory. The sentinel
  path "mem://current" is what fluid_synth_sfload receives; the
  default filesystem loader rejects it and falls through to ours.
- LoadSoundFont (path) now unloads any prior sfont before loading the
  new one, mirroring the memory variant's lifecycle so the two paths
  behave consistently across reloads.
- LoadSoundFontFromMemory is documented as GUI-thread-only and takes
  the synth mutex; the static in-flight pointer is safe under that
  constraint.
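
A minimal sketch of the read/seek/tell part of such a memory-backed loader, written without libfluidsynth so it stands alone. The callback shapes loosely follow the handle-based style of FluidSynth's sfloader callbacks, but the exact types, the `kOk`/`kFailed` return codes, and the `MemoryStream` struct are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio> // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>

constexpr int kOk = 0;
constexpr int kFailed = -1;

// Cursor over a caller-owned byte buffer; the buffer must outlive the stream.
struct MemoryStream {
    const uint8_t* data;
    std::size_t size;
    std::size_t pos;
};

// Copies exactly `count` bytes or fails without moving the cursor.
int MemRead(void* dst, long long count, void* handle) {
    assert(handle != nullptr);
    auto* s = static_cast<MemoryStream*>(handle);
    if (count < 0 || static_cast<std::size_t>(count) > s->size - s->pos) {
        return kFailed;
    }
    std::memcpy(dst, s->data + s->pos, static_cast<std::size_t>(count));
    s->pos += static_cast<std::size_t>(count);
    return kOk;
}

// fseek-style seek, rejecting targets outside [0, size].
int MemSeek(void* handle, long long offset, int origin) {
    assert(handle != nullptr);
    auto* s = static_cast<MemoryStream*>(handle);
    long long base = 0;
    switch (origin) {
        case SEEK_SET: base = 0; break;
        case SEEK_CUR: base = static_cast<long long>(s->pos); break;
        case SEEK_END: base = static_cast<long long>(s->size); break;
        default: return kFailed;
    }
    const long long target = base + offset;
    if (target < 0 || target > static_cast<long long>(s->size)) {
        return kFailed;
    }
    s->pos = static_cast<std::size_t>(target);
    return kOk;
}

long long MemTell(void* handle) {
    assert(handle != nullptr);
    return static_cast<long long>(static_cast<MemoryStream*>(handle)->pos);
}
```

The copy into an instance-owned vector described above is what makes the "buffer must outlive the stream" requirement hold in the actual implementation.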

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- 64 MIDI channels (was 16) so the SoH translator can give each
  (engine font, instrument) pair its own MIDI channel and avoid
  per-pair effect-CC collisions.
- AddSoundFont / AddSoundFontFromMemory / ClearSoundFonts: stack
  multiple SF2s simultaneously. LoadSoundFont* keep their
  single-shot replace semantics as Clear + Add wrappers. Render
  guard now checks mSfontIds.empty() instead of a single id.
- ProgramSelect(channel, sfontId, bank, program): pins the channel
  to a specific loaded sfont so reverse-load-order priority can't
  shadow the caller's intent. Drum/melodic channel type is set
  before the select.
- EnumerateLoadedPresets / GetLoadedSfontIds: expose each loaded
  sfont's (bank, program, name) tuples so the bypass-table UI can
  show real SF2 preset names and label which pack a row resolves to.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add GetActiveVoiceCount() and GetPolyphonyLimit() on the IMidiSynth
interface so hosts can surface real-time synth load (FluidSynth backend
forwards to fluid_synth_get_active_voice_count / fluid_synth_get_polyphony).
mSynthMutex becomes mutable to let the new const accessors lock.
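
The reason the mutex has to become `mutable` is that a `const` member function cannot lock a non-mutable member. A small sketch, using a hypothetical `VoiceCounter` class in place of the real backend:

```cpp
#include <cassert>
#include <mutex>

class VoiceCounter {
  public:
    explicit VoiceCounter(int polyphony) : mPolyphony(polyphony) {
        assert(polyphony > 0);
    }
    void NoteOn() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mActive < mPolyphony) {
            mActive++;
        }
    }
    void NoteOff() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mActive > 0) {
            mActive--;
        }
    }
    // const accessors can still lock because mMutex is declared mutable.
    int GetActiveVoiceCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mActive;
    }
    int GetPolyphonyLimit() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mPolyphony;
    }

  private:
    mutable std::mutex mMutex;
    int mPolyphony;
    int mActive = 0;
};
```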

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Set the channel pitch-bend range through fluid_synth_pitch_wheel_sens
instead of the RPN CC dance (version-independent) with a verify log, and
suppress SF2-author-baked LFO-to-pitch per voice on NoteOn.

Add two generic IMidiSynth conveniences over the note/bend primitives:
- PitchBendFactor(channel, freqRatio): bend by a frequency ratio
  (1.0 = none), converting to semitones and forwarding to PitchBend,
  which owns the wheel-range clamp.
- NoteOnPitchFactor(channel, note, vel, freqRatio): apply the bend
  before NoteOn so a note can attack already bent.

These let the host pass an engine freqScale ratio straight through, with
the semitone conversion and range clamp living in one place (the synth).
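
The ratio-to-semitone conversion and the range clamp could be sketched as follows. The function names and the mapping onto a 14-bit wheel value centered at 8192 are assumptions; the PR only states that the ratio is converted to semitones and clamped to the wheel range.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// A frequency ratio of 2.0 is one octave, i.e. 12 semitones.
double RatioToSemitones(double freqRatio) {
    assert(freqRatio > 0.0);
    return 12.0 * std::log2(freqRatio);
}

// Maps a bend in semitones onto the 14-bit MIDI wheel (0..16383, center 8192),
// clamping bends beyond the configured range.
int SemitonesToWheel(double semitones, double rangeSemitones) {
    assert(rangeSemitones > 0.0);
    const double normalized = std::clamp(semitones / rangeSemitones, -1.0, 1.0);
    const long value = std::lround(8192.0 + normalized * 8192.0);
    return static_cast<int>(std::clamp(value, 0L, 16383L));
}

int PitchBendFactorToWheel(double freqRatio, double rangeSemitones) {
    return SemitonesToWheel(RatioToSemitones(freqRatio), rangeSemitones);
}
```

With a 12-semitone range, a ratio of 2.0 saturates at the top of the wheel and 0.5 at the bottom, and a ratio of 1.0 always maps to the center.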

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…iguration

The default of 256 is sized for a single synth, but in some situations (e.g. modded songs)
we can exceed that limit. Allow per-game configuration.

When the limit is exceeded, the error surfaces as "Ringbuffer full, increase synth.polyphony".
Dropped notes and/or stuck notes (a missed note-off) are also audible.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add IMidiSynth::SetMasterGain (default no-op) and a FluidSynth override that
forwards to fluid_synth_set_gain under the synth mutex, so the host can track a
global volume fader on the live synth without rebuilding it. Mirrors
FluidSynthConfig::gain, which sets the same knob at construction.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In float mode the source is always stereo, so any 6-channel output must be
matrix-upmixed. Previously only Matrix 5.1 was, leaving Raw 5.1 sending a
stereo buffer to a 6-channel device. Centralize the decoder lifecycle
(NeedsMatrixDecoder/EnsureMatrixDecoder: Matrix 5.1 always, Raw 5.1 only in
float mode) and key the float-path decode on output channel count. The s16
path is unchanged (Raw 5.1 still passes the engine's native 6 channels
through). The synth, summed into the stereo bus before the upmix, now reaches
all surround channels.
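
The decoder decision described above reduces to a small truth table. In this sketch the `ChannelMode` enum and `OutputChannels` helper are hypothetical; only the NeedsMatrixDecoder name and its rule (Matrix 5.1 always, Raw 5.1 only in float mode) come from the commit message.

```cpp
#include <cassert>

// Hypothetical channel-mode enum for illustration.
enum class ChannelMode { Stereo, Matrix51, Raw51 };

// Matrix 5.1 always upmixes. Raw 5.1 only needs the decoder in float mode,
// where the source is always stereo; in s16 mode the engine's native
// 6 channels pass straight through.
bool NeedsMatrixDecoder(ChannelMode mode, bool floatPipeline) {
    switch (mode) {
        case ChannelMode::Matrix51:
            return true;
        case ChannelMode::Raw51:
            return floatPipeline;
        case ChannelMode::Stereo:
            return false;
    }
    assert(false && "unknown channel mode");
    return false;
}

int OutputChannels(ChannelMode mode) {
    return mode == ChannelMode::Stereo ? 2 : 6;
}
```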

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
find_package(FluidSynth REQUIRED) only works where fluidsynth ships a CMake
config (e.g. Gentoo, vcpkg). Debian/Ubuntu and most distros ship only
pkg-config, so fall back to pkg_check_modules(fluidsynth) there.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fluid_synth_set_reverb_{roomsize,damp,width,level} were deprecated in FluidSynth
2.2.0; MSVC reports the resulting C4996 warning as an error. Use the _group_ variants with
fx_group=-1 (all groups).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
FluidSynth wrote directly to stderr, bypassing our log sinks and level

Install a fluid_set_log_function handler that forwards each message at the matching spdlog level.
FluidLogToShip returns a per-level fluid_log_function_t lambda and the
registration loop iterates the fluid_log_level enum, replacing the
single message-forwarding callback. Plus a clang-format sweep of the
file (early-return / switch-case bracing, log-arg wrapping).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bassdr bassdr force-pushed the feature/midi-synth-fluidsynth branch from 2c176ed to a8d6da8 Compare June 18, 2026 00:56
bassdr commented Jun 23, 2026

Author

Changes moved to HarbourMasters/Shipwright#6668, where they are now self-contained; they are no longer needed in LUS.

@bassdr bassdr closed this Jun 23, 2026