Fix MP3 output corruption #499

Merged
gabeschine merged 13 commits into rtl-airband:main from gabeschine:gs/fix-lame-usage
Mar 9, 2025

Conversation

@gabeschine
Collaborator

@gabeschine gabeschine commented Feb 27, 2025

Overview

This PR fixes issue #410, specifically two issues identified with MP3 outputs:

  • Missing initial header in MP3 files.
  • Bad headers distributed throughout files.

Details

  • Writes the "lametag" initial header just prior to file close.
  • Associates lame_t instance lifecycle with outputs, instead of channels. Every MP3 output now has its own MP3 encoding context.

Testing

  • Initial smoke-test
  • 24h on Rpi4 scanner box
  • Community testing by users that have reported associated issues

@SullivanChrisJ

When I built RTLSDR_Airband some time ago I cloned the repo and built it on my R-Pi. My git/github knowledge is limited so I could use a hint on how to get this version for testing, which I'm happy to do. Thanks/Chris.

@gabeschine
Collaborator Author

When I built RTLSDR_Airband some time ago I cloned the repo and built it on my R-Pi. My git/github knowledge is limited so I could use a hint on how to get this version for testing, which I'm happy to do. Thanks/Chris.

If you have the gh command line utility installed, the fastest thing would be:

gh pr checkout 499

Otherwise, you'll want to do something using just git:

git remote add gabeschine https://github.com/gabeschine/RTLSDR-Airband.git
git fetch gabeschine
git checkout gabeschine/gs/fix-lame-usage

^ (untested by me but should work)

@SullivanChrisJ

SullivanChrisJ commented Feb 27, 2025 via email

@gabeschine
Collaborator Author

Getting this error when I try to build. CMake Error at CMakeLists.txt:13 (message): Failed to detect RTL_AIRBAND_VERSION - "did not find a git root directory at /home/pi/.git and failed to extract a version from pi" Thanks, Chris

Ah, I just encountered this as well. Pushed a fix to the branch - can you make sure you have commit ed8b648?

@SullivanChrisJ

Yes. I have that as the HEAD, but I still have the same problem.

$ git log --oneline
ed8b648 (HEAD, gabeschine/gs/fix-lame-usage) teach scripts/find_version to work in git submodule
4306311 associate lame_t with outputs, not channels

@gabeschine
Collaborator Author

gabeschine commented Feb 28, 2025

Yes. I have that as the HEAD, but I still have the same problem.

Ok I'm not sure exactly what's going on. You might have to take a look at scripts/find_version - it's not a complicated script. It's looking for a file or directory called .git in your repo dir, and running the command git describe --tags --abbrev --dirty --always to grab the version. One of those operations is failing and I'm not sure why - let me know what you find.

@rough316

rough316 commented Mar 2, 2025

git describe --tags --abbrev --dirty --always

v3.2.1-176-ged8b648

I ran into this same error as @SullivanChrisJ

$cmake -DPLATFORM=rpiv2 -DNFM=ON -DPULSEAUDIO=OFF ../
CMake Error at CMakeLists.txt:13 (message):
Failed to detect RTL_AIRBAND_VERSION - "did not find a git root directory
at /home/pi/RTLSDR-Airband/.git and failed to extract a version from
RTLSDR-Airband"

@rough316

rough316 commented Mar 2, 2025

ed8b648

I got it working.

Change CMakeLists.txt to:

execute_process(
    COMMAND ${PROJECT_SOURCE_DIR}/scripts/find_version
    WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
    OUTPUT_VARIABLE RTL_AIRBAND_VERSION
    OUTPUT_STRIP_TRAILING_WHITESPACE
    ERROR_VARIABLE RTL_AIRBAND_VERSION_ERROR
    ERROR_STRIP_TRAILING_WHITESPACE
)

Change the ./scripts/find_version back to -d

if [ -d "${PROJECT_GIT_DIR_PATH}" ] ; then
    git describe --tags --abbrev --dirty --always
    exit 0
fi

She builds, am running, and mp3 seems to be working better, will test it more tomorrow.

…and into gs/fix-lame-usage

* 'gs/fix-lame-usage' of github.com:gabeschine/RTLSDR-Airband:
  teach scripts/find_version to work in git submodule
@gabeschine
Collaborator Author

🤦

execute_process( COMMAND ${PROJECT_SOURCE_DIR}/scripts/find_version WORKING_DIRECTORY ${PROJECT_SOURCE_DIR} OUTPUT_VARIABLE RTL_AIRBAND_VERSION OUTPUT_STRIP_TRAILING_WHITESPACE ERROR_VARIABLE RTL_AIRBAND_VERSION_ERROR ERROR_STRIP_TRAILING_WHITESPACE )

I didn't need to do this - not sure what problem it's addressing yet. LMK if you have thoughts.

Change the ./scripts/find_version back to -d

I feel silly: -f is for files only. I changed it to -r which will match for files and directories (and technically more, like symlinks), which is permissive enough to handle both a normal git repo checkout and git submodules.
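To make the difference between the three checks concrete, here is a minimal sketch. It assumes, as described above, that .git is a directory in a normal checkout and a plain file in a submodule. The helper name git_marker_checks and the temporary paths are made up for illustration, and the Python calls are only rough analogues of the shell's -d, -f and -r tests, not what scripts/find_version actually runs.

```python
import os
import tempfile


def git_marker_checks(path):
    """Return rough Python analogues of the shell tests -d, -f and -r."""
    return {
        "-d": os.path.isdir(path),       # true for directories only
        "-f": os.path.isfile(path),      # true for regular files only
        "-r": os.access(path, os.R_OK),  # true for anything that exists and is readable
    }


with tempfile.TemporaryDirectory() as root:
    # Normal checkout: .git is a directory.
    checkout = os.path.join(root, "checkout")
    os.makedirs(os.path.join(checkout, ".git"))

    # Submodule: .git is a small text file pointing at the real git dir.
    # (The gitdir path written here is illustrative only.)
    submodule = os.path.join(root, "submodule")
    os.makedirs(submodule)
    with open(os.path.join(submodule, ".git"), "w") as fh:
        fh.write("gitdir: ../.git/modules/submodule\n")

    checkout_result = git_marker_checks(os.path.join(checkout, ".git"))
    submodule_result = git_marker_checks(os.path.join(submodule, ".git"))

print("checkout :", checkout_result)
print("submodule:", submodule_result)
```

Only the readable check is true in both layouts, which is why it is the one permissive enough for both cases.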

@SullivanChrisJ

It's going to be a while before I can test. It builds all right on a Pi 4, but the Pi 4 I use for production developed a fault some time ago which is causing make install to fail as gcc is corrupted. For other packages (e.g. Direwolf) I've been able to build on a different pi and install on the production one, but there's something going on underneath that must be accessing a corrupted image.

(TL;DR: in December the Pi's microSD card went read-only. The system build is quite detailed, and file protections are very tricky because other airband systems upload to it, so I cloned the microSD onto an M.2 NVMe drive and ran the system from USB SSD, with recordings going to USB HDD. All seemed fine until I tried to compile some code and gcc segfaulted. By replacing gcc I found that gcc itself is not corrupt, but one of its dependencies must have been damaged along the way, so I need to rebuild from scratch, which is a considerable effort.) It's a good idea in any case.

@SullivanChrisJ

I ran the executable directly and got this error:

Configuration error: devices.[0] channels.[0]: unknown modulation

Configuration file is below (renamed to .txt from .conf in order to upload it).

rtl_airband.txt

@gabeschine
Collaborator Author

I ran the executable directly and got this error:

Configuration error: devices.[0] channels.[0]: unknown modulation

Configuration file is below (renamed to .txt from .conf in order to upload it).

rtl_airband.txt

What's your cmake command? You're only referencing "nfm", which makes me wonder if support for it was accidentally omitted in the compile command.

@SullivanChrisJ

Yes of course. Insufficient coffee. Thanks. Running now.

@SullivanChrisJ

SullivanChrisJ commented Mar 4, 2025

It ran all day yesterday and collected 1216 mp3 files. All processed successfully except 1, which faulted with the error:

[Errno 1094995529] Invalid data found when processing input: '/media/ve3nrt/2025/03/VE3GSR_TX_20250303_210601.mp3'

after the call to av.open.

    options = {
        'probesize': '5000000',        # Probing size in bytes (this is already FFmpeg's default)
        'analyzeduration': '10000000', # Increase analysis duration (in microseconds)
    }

    # Open the MP3 file using PyAV
    try:
        container = av.open(filename, options=options)
    except av.InvalidDataError as e:
        raise ValueError(e)

Previously, most of the errors came from calls to demux, and those are gone. I also tried PySoundFile and it found no errors. The file will also open and play with Audacity. I'll probably just move the decoding back to PySoundFile and move on, but if you'd like a copy of the file, which is only a couple of seconds long, I can send it to you. I can't attach it here because mp3 isn't supported. Thank you for this update. It helps a lot.

Chris

@rough316

rough316 commented Mar 4, 2025

I am running this on my feed now, and so far it's been great, no crashes. Making progress taking the mp3 stream and converting to text with whisper.cpp. The MP3 files being correct really helps things.

@gabeschine gabeschine marked this pull request as ready for review March 6, 2025 13:21
@gabeschine gabeschine merged commit 4845a52 into rtl-airband:main Mar 9, 2025
@SullivanChrisJ

Today I noticed a problem when trying to read some recordings with pysoundfile that doesn't appear with PyAV, so I think there is still something amiss. While PyAV had the one glitch that soundfile didn't, soundfile silently discards the tail end of a significant number of recordings. The file was recorded on a Pi 4B with 64-bit Raspberry Pi OS bookworm lite and RTL-SDR V3, while the following test was done on an Intel x86-64 machine with Manjaro Linux.

If you know of a good mp3 analysis tool I'd be happy to try it. The ones I've tried just look at the id3 headers.

Python 3.13.2 (main, Feb  5 2025, 08:05:21) [GCC 14.2.1 20250128] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> mp3 = '/media/ve3nrt/2025/03/VE3YRA_TX_20250308_143456.mp3'
>>> import soundfile as sf
>>> audio, sample_rate = sf.read(mp3)
>>> audio.size
174412
>>> audio.size/8000
21.8015
# See definition below if needed.
>>> from dragnet.audio_io import read_mono_mp3_to_float32_array
>>> a, sr =  read_mono_mp3_to_float32_array(mp3)
>>> a.size/8000
50.76

50.76 is the number of seconds I expected in the file. At this point, I will probably use pyav and, if it raises the error I saw earlier, punt to libsoundfile.

If needed, here is the definition of the pyav function.

import av
import numpy as np

def read_mono_mp3_to_float32_array(filename):
    """
    Reads a mono MP3 file and returns the audio data as a NumPy float32 array.
    Raises an error if the input file is not mono.

    Parameters:
        filename (str): Path to the input MP3 file.

    Returns:
        audio_array (np.ndarray): Mono audio data array of shape (samples,).
        sample_rate (int): Sample rate of the audio.

    Raises:
        ValueError: If the input MP3 file is not mono.
    """
    options = {
        'probesize': '5000000',        # Probing size in bytes (this is already FFmpeg's default)
        'analyzeduration': '10000000', # Increase analysis duration (in microseconds)
    }

    # Open the MP3 file using PyAV
    try:
        container = av.open(filename, options=options)
    except av.InvalidDataError as e:
        raise ValueError(e)

    stream = container.streams.audio[0]

    # Check if the audio is mono
    channels = stream.codec_context.channels
    if channels != 1:
        raise ValueError(f"Input file '{filename}' is not mono. Number of channels: {channels}")

    # Select the first audio stream in the container
    for stream in container.streams:
        if stream.type == 'audio':
            audio_stream = stream
            break

    # Retrieve audio properties
    sample_rate = audio_stream.codec_context.sample_rate

    # Initialize an empty array to hold the audio data
    audio_array = np.empty(0, dtype=np.float32)

    # Iterate over packets in the container
    for i, packet in enumerate(container.demux(stream)):
        try:
            for j, frame in enumerate(packet.decode()):

                # Convert the frame to a NumPy array
                try:
                    frame_data = frame.to_ndarray()[0, :]
                except av.InvalidDataError as e:
                    print(f"Frame decode error: {e} - skipping frame {j}")
                    continue

                if 's' in frame.format.name:  # Signed integer
                    frame_data = (frame_data/32767).astype(np.float32)
                elif 'f' in frame.format.name:  # Floating point
                    frame_data = (frame_data).astype(np.float32)
                else:
                    raise ValueError(f"Unsupported sample format: {frame.format.name}")

                # Append this frame's mono samples
                audio_array = np.concatenate((audio_array, frame_data))

        except av.InvalidDataError as e:
            samples = int(packet.duration * packet.time_base * sample_rate)
            position = audio_array.size / sample_rate
            print(f"Packet {i} decode error: {e} forcing {samples} zero samples at position {position} file {filename}")
            # Substitute silence for the undecodable packet so the timing is preserved
            audio_array = np.concatenate((audio_array, np.zeros(samples, dtype=np.float32)))

    # All decoded frames have been concatenated along the time axis
    return audio_array, sample_rate

@gabeschine
Collaborator Author

I'm not familiar with the analysis you performed. That said, VBR is turned on for mp3 outputs (see lame_set_VBR(lame, vbr_mtrh);), so I'm not sure if the math you did above still holds. Just making sure you're aware of that.

That said, it might simplify things - or avoid issues - to turn VBR off. Something to consider.

@SullivanChrisJ
Copy link
Copy Markdown

SullivanChrisJ commented Mar 10, 2025

Both examples (soundfile and pyav) are used to decode the mp3. The recording has a sample rate of 8000. As it is decoded from mp3 to audio, the VBR shouldn't matter, and since the output sample rate is fixed at 8000, both would normally (and usually do) produce an output (variables 'audio' and 'a' respectively) with the same number of samples. In the above I divided by 8000 to get the time in seconds, and the two results are vastly different.
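Spelled out as arithmetic, the comparison looks roughly like this. It is only a sketch using the two figures from the session above. The PyAV sample count is reconstructed from the reported 50.76 seconds (a.size itself was not printed), and the helper names are made up for illustration.

```python
SAMPLE_RATE = 8000  # Hz, the fixed output rate of these recordings


def duration_seconds(sample_count, sample_rate=SAMPLE_RATE):
    """Duration of decoded PCM audio; independent of the MP3 bitrate mode."""
    return sample_count / sample_rate


def missing_fraction(short_count, full_count):
    """Fraction of the longer decode that is absent from the shorter one."""
    return 1.0 - short_count / full_count


soundfile_samples = 174412                 # audio.size from the session above
pyav_seconds = 50.76                       # a.size / 8000 from the session above
pyav_samples = round(pyav_seconds * SAMPLE_RATE)   # reconstructed, not measured

soundfile_seconds = duration_seconds(soundfile_samples)
lost = missing_fraction(soundfile_samples, pyav_samples)

print(f"soundfile: {soundfile_seconds:.4f} s, pyav: {pyav_seconds:.2f} s")
print(f"soundfile decode is missing about {lost:.0%} of the samples")
```

Because both numbers are counts of decoded samples at a fixed rate, whether the file was encoded with VBR or CBR does not enter into the calculation.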

My application is to take mp3 recordings of a repeater output and cf32 recordings of anything I pick up on the repeater input, demodulate the latter, and put them into a single wave file for a day's recordings, slewing when there are long periods of silence and creating markers so I can see the time of day and other information about the recording. There are other receivers which also provide cf32 files if they happen to pick up the signal.

Yesterday there were 504 cf32 and 734 mp3 files recorded. I noticed that a significant number of the mp3 recordings were truncated, and while investigating I noticed the discrepancy in the output of the 2 libraries, and that libsoundfile doesn't seem to be reading the file correctly 100% of the time.

I don't know what is to blame, although both pyav and libsoundfile are widely used. All I know is it is a discrepancy, and something isn't right somewhere. If I find out I'll create another issue, I guess. At this point I have a workaround but I'll do more checking if I run into any more problems.

Cheers,
Chris
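For anyone wanting to look below the ID3 level, a rough frame walker is one way to check for bad headers and to estimate duration without relying on either decoder. This is a minimal sketch and not a validated analyzer. It handles MPEG Layer III only, ignores ID3 tags and CRC, and treats a Xing/LAME info frame as an ordinary frame. The function names are made up for illustration, and a byte pattern inside damaged data can still look like a valid header by chance.

```python
import struct

# Layer III bitrates in kbit/s, indexed by the 4-bit bitrate field.
BITRATES_V1_L3 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
BITRATES_V2_L3 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160]
# Sample rates in Hz, keyed by the 2-bit version field (value 1 is reserved).
SAMPLE_RATES = {
    3: [44100, 48000, 32000],  # MPEG-1
    2: [22050, 24000, 16000],  # MPEG-2
    0: [11025, 12000, 8000],   # MPEG-2.5
}


def parse_header(data, offset):
    """Return (frame_length, sample_rate, samples_per_frame), or None if the
    four bytes at offset are not a usable Layer III frame header."""
    if offset + 4 > len(data):
        return None
    (word,) = struct.unpack_from(">I", data, offset)
    if (word >> 21) & 0x7FF != 0x7FF:  # 11-bit frame sync
        return None
    version = (word >> 19) & 0x3
    layer = (word >> 17) & 0x3
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 0x3
    padding = (word >> 9) & 0x1
    # Reject reserved values, non-Layer-III frames and free-format bitrates.
    if version == 1 or layer != 1 or bitrate_index in (0, 15) or rate_index == 3:
        return None
    mpeg1 = version == 3
    bitrate = (BITRATES_V1_L3 if mpeg1 else BITRATES_V2_L3)[bitrate_index] * 1000
    sample_rate = SAMPLE_RATES[version][rate_index]
    samples = 1152 if mpeg1 else 576
    length = (samples // 8) * bitrate // sample_rate + padding
    return length, sample_rate, samples


def scan_frames(data):
    """Walk frames back to back; return (frame_count, seconds, bad_offsets)."""
    offset, frames, seconds, bad = 0, 0, 0.0, []
    while offset + 4 <= len(data):
        header = parse_header(data, offset)
        if header is None:
            bad.append(offset)  # lost sync here; search forward byte by byte
            offset += 1
            while offset + 4 <= len(data) and parse_header(data, offset) is None:
                offset += 1
            continue
        length, sample_rate, samples = header
        frames += 1
        seconds += samples / sample_rate
        offset += length
    return frames, seconds, bad


# Synthetic check: MPEG-2.5 Layer III frames, 16 kbit/s, 8000 Hz, mono,
# each 144 bytes long (4-byte header plus 140 bytes of dummy payload).
frame = bytes([0xFF, 0xE3, 0x28, 0xC0]) + bytes(140)
clean = frame * 3
damaged = frame + b"\x00" * 10 + frame

clean_result = scan_frames(clean)
damaged_result = scan_frames(damaged)
print(clean_result)
print(damaged_result)
```

On a real recording one would call scan_frames(open(path, "rb").read()) and compare the estimated seconds with what each decoder returns. A non-empty list of bad offsets points at places where the frame chain is broken.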

@gabeschine gabeschine deleted the gs/fix-lame-usage branch March 10, 2025 01:12
@gabeschine
Collaborator Author

Of course, you're looking at sample rate against the # of samples. My brain substituted file size.

Please do continue investigating. If there's another issue here, I'd like to know about it. Thanks for your testing and your time.

@SullivanChrisJ

No problem. Thank you so much for fixing the lame issue. It has vastly improved the integrity of my project. My kluge of using pyav first, then libsoundfile, has worked around the truncation problem with libsoundfile and the container problem with pyav. Great that your work has been merged into the charlie-foxtrot repo. The results are already paying dividends for locating interference on our repeater system.
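The try-one-then-the-other arrangement can be sketched as below. This is not the actual project code. The decoders are passed in as plain callables so the example runs without PyAV or soundfile installed, and the two fake readers are hypothetical stand-ins for the real ones.

```python
def decode_with_fallback(path, decoders):
    """Try each (name, decode) pair in order and return (name, samples, rate)
    from the first decoder that succeeds."""
    errors = []
    for name, decode in decoders:
        try:
            samples, rate = decode(path)
            return name, samples, rate
        except Exception as exc:  # broad on purpose: each library raises its own types
            errors.append(f"{name}: {exc}")
    raise ValueError(f"no decoder could read {path!r}: " + "; ".join(errors))


# Hypothetical stand-ins for the real readers.
def fake_pyav(path):
    raise ValueError("Invalid data found when processing input")


def fake_soundfile(path):
    return [0.0] * 16000, 8000  # two seconds of silence at 8000 Hz


used, samples, rate = decode_with_fallback(
    "example.mp3", [("pyav", fake_pyav), ("soundfile", fake_soundfile)]
)
print(used, len(samples) / rate)
```

With the real libraries, the pairs would presumably be read_mono_mp3_to_float32_array from earlier in this thread and soundfile.read, which also returns the data followed by the sample rate. Putting pyav first means the soundfile truncation only matters for the few files pyav cannot open.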

@rough316

lame_set_VBR(lame, vbr_mtrh);

I commented this out to get to CBR, and it seems like VLC and other players start faster, and generally work better.

Two other things I have done to help with the sound quality of voice:

  1. lame_set_quality(lame, 2);
  2. lame_set_exp_nspsytune(lame, 1);

lame_set_quality(lame, 2) gives better clarity, and nspsytune enables speech optimizations; the default is tuned for music.

LAME uses two psychoacoustic models: GPSYCHO (default) and NSPSY (noise-shaping, activated by --nspsytune). These models analyze the input audio to decide which parts can be discarded or compressed more aggressively without noticeable loss in perceived quality.

syehorov added a commit to syehorov/RTLSDR-Airband that referenced this pull request Mar 13, 2025
commit 21a8caf
Author: charlie-foxtrot <13514783+charlie-foxtrot@users.noreply.github.com>
Date:   Sun Mar 9 23:24:57 2025 -0700

    URL and License updates (rtl-airband#503)

    * Pull in GPLv2 from https://github.com/microtony/RTLSDR-Airband/
    * Update repo URLs
    * make this a #minor release

commit 4845a52
Merge: e7cd0ec 2adbdca
Author: Gabe Schine <gabeschine@users.noreply.github.com>
Date:   Sun Mar 9 13:40:48 2025 -0400

    Merge pull request rtl-airband#499 from gabeschine/gs/fix-lame-usage

    Fix MP3 output corruption

commit 2adbdca
Author: gabeschine <gabe@schine.net>
Date:   Sun Mar 9 05:02:17 2025 -0700

    try PLATFORM=native

commit 902e2c3
Author: gabeschine <gabe@schine.net>
Date:   Sun Mar 9 04:50:26 2025 -0700

    attempt compile flags for rpiv2 on ubuntu-arm

commit 9ddf1c0
Author: gabeschine <gabe@schine.net>
Date:   Sat Mar 8 17:24:24 2025 -0800

    fix arm ubuntu version 20.04 -> 22.04

commit 8fae41e
Author: gabeschine <gabe@schine.net>
Date:   Sat Mar 8 12:32:59 2025 -0800

    add macos-14 (arm/m1) os to ci_build.yml

commit 2557c6d
Author: gabeschine <gabe@schine.net>
Date:   Sat Mar 8 12:30:59 2025 -0800

    run platform_build.yml on ubuntu-20.04-arm instead of rpi3b (not available)

commit 3df33db
Author: gabeschine <gabe@schine.net>
Date:   Sat Mar 8 12:30:41 2025 -0800

    also run ci_build.yml on ubuntu-arm64

commit d40f357
Author: gabeschine <gabe@schine.net>
Date:   Sat Mar 8 12:21:37 2025 -0800

    fix ci: remove unavailable macos-12 runner from matrix

commit 8917d0e
Merge: bd22b29 ed8b648
Author: gabeschine <gabe@schine.net>
Date:   Sun Mar 2 06:23:30 2025 -0800

    Merge branch 'gs/fix-lame-usage' of github.com:gabeschine/RTLSDR-Airband into gs/fix-lame-usage

    * 'gs/fix-lame-usage' of github.com:gabeschine/RTLSDR-Airband:
      teach scripts/find_version to work in git submodule

commit bd22b29
Author: gabeschine <gabe@schine.net>
Date:   Sun Mar 2 06:22:42 2025 -0800

    scripts/find_version: use readable check instead of file-specific check

commit ed8b648
Author: gabeschine <gabe@schine.net>
Date:   Thu Feb 27 15:48:16 2025 -0500

    teach scripts/find_version to work in git submodule

commit 4306311
Author: gabeschine <gabe@schine.net>
Date:   Wed Feb 26 07:44:01 2025 -0800

    associate lame_t with outputs, not channels

commit 309b7be
Author: gabeschine <gabe@schine.net>
Date:   Tue Feb 25 12:49:09 2025 -0800

    write lametag to MP3 files

commit ef10d9d
Author: gabeschine <gabe@schine.net>
Date:   Tue Feb 25 10:32:28 2025 -0800

    fix .devcontainer: install pre-commit using apt