@paskozdilar
Last active August 21, 2024 14:14
Detect true mp3 bitrate
#!/usr/bin/env bash
set -euo pipefail

function main() {
    # Check for argument
    if [ $# -ne 1 ]
    then
        echo "usage: $0 INFILE"
        exit 1
    fi

    # Define bitrates to check
    local INFILE="$1"
    local BITRATES="320 256 224 192 160 128 112 96 80 64 56 48 40 32"

    # Check if file exists
    if ! [ -f "$INFILE" ]
    then
        echo "file not found: $INFILE"
        exit 1
    fi

    # Remove temporary files on exit
    trap 'rm -f .tmp*.wav .tmp*.mp3' EXIT

    # Check if lame and sox commands exist
    for cmd in lame sox
    do
        if ! which "$cmd" >/dev/null
        then
            echo "command not found: $cmd"
            exit 1
        fi
    done

    # Decode original file to wav and invert amplitude
    decode "$INFILE" .tmp.src.wav -1

    # Re-encode the file at each bitrate, decode it and compare differences
    for BITRATE in $BITRATES
    do
        compress "$INFILE" .tmp.mp3 "$BITRATE"
        decode .tmp.mp3 .tmp.wav
        printf "%3s: %s\n" "$BITRATE" \
            "$(compare .tmp.src.wav .tmp.wav \
                2>&1 \
                | grep 'RMS.*amplitude' \
                | awk '{print $3}')"
    done
}

# Compress mp3 file with given constant bit rate
function compress() {
    local INFILE="$1"
    local OUTFILE="$2"
    local BITRATE="$3"
    lame \
        --quiet \
        -q 0 \
        "$INFILE" \
        -b "$BITRATE" \
        "$OUTFILE"
}

# Decode mp3 file into wav
function decode() {
    local INFILE="$1"
    local OUTFILE="$2"
    local VOLUME="${3-1}" # set to -1 to invert signal
    lame \
        --quiet \
        -q 0 \
        "$INFILE" \
        --decode \
        .tmp.decode.wav
    # resample to avoid compare issues
    sox \
        --volume "${VOLUME}" \
        .tmp.decode.wav \
        --rate 44100 \
        "$OUTFILE" >/dev/null 2>&1
}

# Compare two wav files, assume one is inverted
function compare() {
    local FILE1="$1"
    local FILE2="$2"
    sox \
        --combine mix \
        "$FILE1" "$FILE2" \
        --null \
        stat
}

main "$@"
@ggeorgovassilis

That's a clever approach! But I don't think it works reliably. I recorded myself :-) with Audacity on the viewsonic UA-2x2 in stereo at 96 kHz and 24 bits, and exported MP3s at 320k, 128k and 56k. Then I re-encoded the 128k and 56k exports at 320k and ran the detector on all of the files. Question: when looking for a jump in the error, should I look for an absolute or a relative increase?

Findings:
(note: I removed some bit rates from the script for extra speed)

  • there are large jumps even in the original export at 320k; the largest relative jump is actually between 64k and 48k, at roughly 12x.
  • in all cases there seems to be a large jump between 64k and 48k
  • the 128k upsample shows the largest jump between 320k and 256k

original export 320k:
320: 0.000687
256: 0.001389
192: 0.004593
160: 0.009964
128: 0.011276
96: 0.012421
64: 0.013214
48: 0.159500
32: 0.212189

original export 128k:
320: 0.000458
256: 0.001389
192: 0.004608
160: 0.007614
128: 0.010010
96: 0.013367
64: 0.011200
48: 0.152161
32: 0.195770

original export 56k:
320: 0.000122
256: 0.000198
192: 0.003357
160: 0.005798
128: 0.006119
96: 0.006058
64: 0.008987
48: 0.148300
32: 0.197037

upsampled from 128k:
320: 0.000580
256: 0.000977
192: 0.004700
160: 0.007645
128: 0.009598
96: 0.013794
64: 0.013245
48: 0.152756
32: 0.201569

upsampled from 56k:
320: 0.000214
256: 0.000290
192: 0.003311
160: 0.005997
128: 0.006073
96: 0.006439
64: 0.012100
48: 0.149002
32: 0.197800
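
To make the "absolute or relative" question easier to eyeball, the ratio between each row and the one before it can be printed next to the raw values. The following is only a minimal sketch: `relative_jumps` is a hypothetical helper name, not part of the gist, and it assumes input in the `BITRATE: VALUE` layout the script prints, highest bitrate first. The sample input is the "upsampled from 56k" table above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): for each row, print how many
# times larger the value is than the value on the previous row.
# Input on stdin: "BITRATE: VALUE" lines, highest bitrate first.
relative_jumps() {
    awk '
        # Strip the trailing colon from the bitrate column.
        { rate = $1; sub(/:$/, "", rate) }
        # From the second row on, print the ratio to the previous value.
        NR > 1 && prev > 0 { printf "%s -> %s: %.2fx\n", prev_rate, rate, $2 / prev }
        # Remember this row for the next one.
        { prev_rate = rate; prev = $2 }
    '
}

relative_jumps <<'EOF'
320: 0.000214
256: 0.000290
192: 0.003311
160: 0.005997
128: 0.006073
 96: 0.006439
 64: 0.012100
 48: 0.149002
 32: 0.197800
EOF
```

The detector's output can be piped straight into the function. Whether the largest ratio is a useful signal is exactly the open question in this thread, so treat the output as something to inspect rather than as a verdict.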

@paskozdilar
Author

I have had similar issues.

I think the issue might be the metric I'm using: the maximum delta. Even when the two signals are more similar overall, a few peaks in the difference can still push that metric higher.

I wonder if changing grep "Maximum delta" to grep "RMS.*amplitude" would make the output better. I'll try it when I come home - feel free to try it too. I've modified the gist.
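
To compare the two metrics side by side without editing the `grep` pattern each time, the value can be pulled out of the `stat` text by its label. This is a sketch only: `extract_stat` is a hypothetical helper, and the sample text below is made-up numbers laid out the way `sox ... stat` prints its report, not a real measurement.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): print the value of one
# statistic from `sox ... stat` output read on stdin.
# $1 is an awk regular expression for the label, e.g. 'RMS +amplitude'.
extract_stat() {
    awk -v label="$1" '$0 ~ "^" label ":" { print $NF }'
}

# Made-up sample in the layout of the stat report (illustrative values only).
sample_stat() {
    printf '%s\n' \
        'RMS     amplitude:     0.010010' \
        'Maximum delta:         0.152161' \
        'RMS     delta:         0.002000'
}

sample_stat | extract_stat 'RMS +amplitude'
sample_stat | extract_stat 'Maximum +delta'
```

In the script, the same helper could take the place of the `grep | awk` pair, for example `compare .tmp.src.wav .tmp.wav 2>&1 | extract_stat 'Maximum +delta'`.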

@ggeorgovassilis

Thanks. Ran it, doesn't seem to change much - there's still a noticeable increase at 56k.

@paskozdilar
Author

I suppose that this isn't a good metric then.

There will always be some loss, and a roughly linear increase is expected when the file's bitrate is genuine.
For upscaled files, the loss should stay lower until the file's true bitrate is reached, and only then jump.

The question is: lower than what?


We would probably need to gather some data on what a typical loss curve looks like, so that a file's curve can be compared against it.
I guess that's too much for Bash. I might try doing that in Python some time later.

I'll keep this gist updated :)
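
The comparison step itself is small. The hard part is collecting a trustworthy reference curve. As a rough sketch, assuming a reference curve already exists as a file of `BITRATE: VALUE` lines, each measured value can be divided by the reference value for the same bitrate. `curve_ratio` and the file names are hypothetical. The sample numbers are three rows each from the "original export 320k" and "upsampled from 56k" results earlier in the thread, and a single recording is of course not a typical curve.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): divide each measured value by
# the reference value for the same bitrate.
# $1: reference curve file, $2: measured curve file ("BITRATE: VALUE" lines).
curve_ratio() {
    awk '
        # Strip the trailing colon from the bitrate column.
        { rate = $1; sub(/:$/, "", rate) }
        # First file: remember the reference value for each bitrate.
        NR == FNR { ref[rate] = $2; next }
        # Second file: print measured / reference where a reference exists.
        (rate in ref) && ref[rate] > 0 { printf "%3s: %.2f\n", rate, $2 / ref[rate] }
    ' "$1" "$2"
}

tmpdir="$(mktemp -d)"
printf '320: 0.000687\n128: 0.011276\n64: 0.013214\n' > "$tmpdir/reference.txt"
printf '320: 0.000214\n128: 0.006073\n64: 0.012100\n' > "$tmpdir/measured.txt"
curve_ratio "$tmpdir/reference.txt" "$tmpdir/measured.txt"
rm -rf "$tmpdir"
```

A ratio well below 1 means the file lost less than the reference did at that bitrate. Where the ratio climbs back towards 1 might hint at the true bitrate, but that is an untested assumption here.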

@ggeorgovassilis

ggeorgovassilis commented Aug 21, 2024

I plotted the spectrograms of the 320k file and the "320k -> downsampled to 56k -> upsampled to 320k" file, took screenshots and subtracted the images from each other. I'm looking for discernible patterns. The "blocks" are misleading, that's just me croaking scales. The faint pink belt in the middle is low-frequency noise (<100 Hz) on the left channel, which seems to be more present in the 320k file than in the upsampled one.

[Image: 320-56-difference]
