
Encoding and re-encoding to mp3 – does the quality degrade?

Veri

Master Contributor
Joined
Feb 6, 2018
Messages
9,597
Likes
12,039
Mp3 is lossy. This by definition means it "loses" information each encode.

Encoding with high enough bitrate can mitigate/lessen the impact, but you're still going lossy on lossy. So the answer to whether it degraded is "yes".
 

Soniclife

Major Contributor
Forum Donor
Joined
Apr 13, 2017
Messages
4,508
Likes
5,436
Location
UK
If for example, I take an mp3 file, encode it to wav, and then re-encode it to mp3. Will the resulting file be different/degraded compared to the original mp3?
Yes, it will degrade.
There might be edge cases where it won't, but not in reality.
 
OP
Fluffy

Addicted to Fun and Learning
Joined
Sep 14, 2019
Messages
856
Likes
1,425
Mp3 is lossy. This by definition means it "loses" information each encode.

Encoding with high enough bitrate can mitigate/lessen the impact, but you're still going lossy on lossy. So the answer to whether it degraded is "yes".
That logic appears sound on the surface, but I can dig down further and ask why would it be, actually? MP3 applies some psychoacoustic-based algorithm to determine the best way to encode the sound file at a specified bitrate without losing apparent fidelity. If the source file was already encoded as mp3, then the encoding step from the converted wav (reminder – this is a wav of a lossy file) doesn't need to make any changes in order to compress the sound to fit the bitrate (given that it's encoding to the same bitrate).

The real question is, if the algorithm is blindly applying its compression on whatever soundwave it receives, or does it first check to see if the acoustic model has already been applied – meaning it doesn't need to re-apply it.

I need to leave the house now, but later I'll try to do an experiment comparing the original mp3 with the re-encoded one using Deltawave. If you are correct, then it should give some level of mismatching from the original.
 

Vincent Kars

Addicted to Fun and Learning
Technical Expert
Joined
Mar 1, 2016
Messages
790
Likes
1,583
The real question is, if the algorithm is blindly applying its compression on whatever soundwave it receives, or does it first check to see if the acoustic model has already been applied – meaning it doesn't need to re-apply it.
How could it know? It just sees a file and compresses it.
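A toy sketch of why a second pass is not automatically a no-op. This is not MP3: the "codec" here is a hypothetical block-averaging step (`toy_lossy`), chosen only because it is easy to reason about. When the block grid of the second pass lines up with the first, nothing more is lost; when the grid is shifted, the already-processed signal gets changed again. Whether and how much a real encoder's framing or psychoacoustic decisions differ between passes is an assumption this sketch does not test.

```python
import math

def toy_lossy(samples, block=4, offset=0):
    """Toy lossy step (hypothetical, not MP3): replace each block with its mean.

    `offset` shifts where the block boundaries fall.
    """
    n = len(samples)
    edges = sorted({0, n, *range(offset % block, n, block)})
    out = []
    for start, end in zip(edges, edges[1:]):
        mean = sum(samples[start:end]) / (end - start)
        out.extend([mean] * (end - start))
    return out

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Made-up test signal: two sine waves.
signal = [math.sin(2 * math.pi * i / 37) + 0.5 * math.sin(2 * math.pi * i / 11)
          for i in range(256)]

first = toy_lossy(signal)                  # first "encode"
same_grid = toy_lossy(first)               # re-encode, same block alignment
shifted_grid = toy_lossy(first, offset=2)  # re-encode, shifted block alignment

print(rms_error(first, same_grid))     # essentially zero: nothing left to remove
print(rms_error(first, shifted_grid))  # clearly non-zero: the signal changed again
```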
 

scott wurcer

Major Contributor
Audio Luminary
Technical Expert
Joined
Apr 24, 2019
Messages
1,501
Likes
2,822
One question might be does it reach a practical limit or is it like Alvin Lucier's "I am Sitting in a Room"?

 

somebodyelse

Major Contributor
Joined
Dec 5, 2018
Messages
3,745
Likes
3,032
@Fluffy while you're doing the comparison you could try multiple compression/decompression cycles to see whether it gets progressively worse, or levels out. Extra points for mixing different codecs (mp3->aac->opus etc.)
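One way to script such repeated cycles, as a minimal sketch: it assumes the `lame` command-line encoder is on the PATH and uses its `-V2` and `--decode` options. The file names, the `gen` working directory, and the helper names are made up for illustration.

```python
from pathlib import Path
import subprocess

def generation_commands(source_wav, generations, quality="-V2", workdir="gen"):
    """Build the lame commands for repeated encode/decode cycles.

    Each generation encodes the previous WAV to MP3, then decodes that MP3
    back to WAV so the next generation has something to encode.
    """
    commands = []
    current = Path(source_wav)
    for g in range(1, generations + 1):
        mp3 = Path(workdir) / f"gen{g:02d}.mp3"
        wav = Path(workdir) / f"gen{g:02d}.wav"
        commands.append(["lame", quality, str(current), str(mp3)])  # encode
        commands.append(["lame", "--decode", str(mp3), str(wav)])   # decode
        current = wav
    return commands

def run_generations(source_wav, generations, quality="-V2", workdir="gen"):
    """Run the cycles; requires lame to be installed."""
    Path(workdir).mkdir(exist_ok=True)
    for cmd in generation_commands(source_wav, generations, quality, workdir):
        subprocess.run(cmd, check=True)

# Example (not executed here): run_generations("original.wav", 20)
```

The resulting `gen01.wav`, `gen02.wav`, ... files can then be loaded into DeltaWave against the original to see whether the difference keeps growing or levels out.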
 

bravomail

Addicted to Fun and Learning
Joined
Oct 19, 2018
Messages
817
Likes
461
That logic appears sound on the surface, but I can dig down further and ask why would it be, actually? MP3 applies some psychoacoustic-based algorithm to determine the best way to encode the sound file at a specified bitrate without losing apparent fidelity. If the source file was already encoded as mp3, then the encoding step from the converted wav (reminder – this is a wav of a lossy file) doesn't need to make any changes in order to compress the sound to fit the bitrate (given that it's encoding to the same bitrate).

The real question is, if the algorithm is blindly applying its compression on whatever soundwave it receives, or does it first check to see if the acoustic model has already been applied – meaning it doesn't need to re-apply it.

I need to leave the house now, but later I'll try to do an experiment comparing the original mp3 with the re-encoded one using Deltawave. If you are correct, then it should give some level of mismatching from the original.

WAV original ("mymusic") -> MP3 v1, with some losses ("mymusik")
MP3 v1 -> WAV v2, no further losses ("mymusik")
WAV v2 -> MP3 v2, some more losses (changes) compared to the original WAV ("mimusik")
etc.
 

zermak

Senior Member
Joined
Jun 2, 2019
Messages
373
Likes
251
Location
Italy
I have done it for you using the 2L original CD version of Chromatic Fantasia and Fugue in D minor, BWV 903: Fantasia by Christian Grøvlen.
I've used LAME MP3 encoder version 3.100 with the V2 compression option.
Here is the DeltaWave report comparing the original CD version with the first audio file encoded in mp3:
Code:
DeltaWave v1.0.46, 2019-11-06T18:38:10.2776392+01:00
Reference:  2L-139_01_stereo_01.cd.flac[L] 18380292 samples 44100Hz 16bits, stereo, MD5=00
Comparison: 2L-139_01_stereo_01.cd_mp3_V2_1st_enc.mp3[L] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Settings:
    Gain:True, Remove DC:True
    Non-linear Gain EQ:False    Non-linear Phase EQ: False
    EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -160dB
    Correct Drift:True, Precision:30
    Non-Linear drift Correction:False
    Upsample:False, Window:Hann
    Spectrum Window:Hann, Spectrum Size:131072
    Spectrogram Window:Hann, Spectrogram Size:16384, Spectrogram Steps:2048
    Dither:False
    Trim Silence:False
    Enable Simple Waveform Measurement: False

Discarding Reference:  Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s

Initial peak values Reference: -0,27dB   Comparison: -0,205dB
Initial RMS values Reference: -21,065dB   Comparison: -21,066dB

Null Depth=9,466dB
X-Correlation offset: -1105 samples
Drift computation quality, #1: Excellent (0,2μs)


Trimmed 0 samples ( 0,00ms) front, 0 samples ( 0,00ms end)


Final peak values Reference: -0,27dB   Comparison: -0,205dB
Final RMS values Reference: -21,065dB   Comparison: -21,065dB

Gain= 12,0412dB (4x) DC=0 Phase offset=-25,056378ms (-1104,986 samples)
Difference (rms) = -62,46dB [-63,31dBA]
Correlated Null Depth=64,46dB [59,59dBA]
Clock drift: 0 ppm


Files are NOT a bit-perfect match (match=2,49%) at 16 bits
Files match @ 49,9718% when reduced to 10,53 bits


---- Phase difference (full bandwidth): 58,0297418504023°
    0-10kHz: 14,34°
    0-20kHz: 52,15°
    0-24kHz: 58,03°
Timing error (rms jitter): 3,6μs

RMS of the difference of spectra: -122,290087627673dB
gn=0,249999907854441, dc=0, dr=-1,85E-09, of=-1104,9862595922

DONE!

Signature: ca94b3a98c2e5ce4f78417879be6ed8c

Here is the original versus the second encoding of the lossy file.
Code:
DeltaWave v1.0.46, 2019-11-06T18:51:08.4092936+01:00
Reference:  2L-139_01_stereo_01.cd.flac[L+R] 18380292 samples 44100Hz 16bits, stereo, MD5=00
Comparison: 2L-139_01_stereo_01.cd_mp3_V2_2nd_enc.mp3[L+R] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Settings:
    Gain:True, Remove DC:True
    Non-linear Gain EQ:False    Non-linear Phase EQ: False
    EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -160dB
    Correct Drift:True, Precision:30
    Non-Linear drift Correction:False
    Upsample:False, Window:Hann
    Spectrum Window:Hann, Spectrum Size:131072
    Spectrogram Window:Hann, Spectrogram Size:16384, Spectrogram Steps:2048
    Dither:False
    Trim Silence:False
    Enable Simple Waveform Measurement: False

Discarding Reference:  Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s

Initial peak values Reference: -1,647dB   Comparison: -1,636dB
Initial RMS values Reference: -22,678dB   Comparison: -22,679dB

Null Depth=227,177dB
X-Correlation offset: -1105 samples
Drift computation quality, #1: Excellent (0,19μs)


Trimmed 0 samples ( 0,00ms) front, 0 samples ( 0,00ms end)


Final peak values Reference: -1,647dB   Comparison: -1,635dB
Final RMS values Reference: -22,678dB   Comparison: -22,678dB

Gain= 12,0412dB (4x) DC=0 Phase offset=-25,055219ms (-1104,935 samples)
Difference (rms) = -62,37dB [-63,18dBA]
Correlated Null Depth=61,77dB [59,02dBA]
Clock drift: 0 ppm


Files are NOT a bit-perfect match (match=1,95%) at 16 bits
Files match @ 50,0007% when reduced to 10,64 bits


---- Phase difference (full bandwidth): 58,3831349907483°
    0-10kHz: 17,31°
    0-20kHz: 52,49°
    0-24kHz: 58,38°
Timing error (rms jitter): 8,8μs

RMS of the difference of spectra: -123,584671109206dB
gn=0,249998683347809, dc=0, dr=-3,745E-09, of=-1104,9351705568

DONE!

Signature: 8a7aecbb29fcc3afef38885bc4ba005b

And lastly, the comparison between the first encoding of the original and the second encoding of the lossy file.
Code:
DeltaWave v1.0.46, 2019-11-06T18:57:44.2966421+01:00
Reference:  2L-139_01_stereo_01.cd_mp3_V2_1st_enc.mp3[L+R] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Comparison: 2L-139_01_stereo_01.cd_mp3_V2_2nd_enc.mp3[L+R] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Settings:
    Gain:True, Remove DC:True
    Non-linear Gain EQ:False    Non-linear Phase EQ: False
    EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -160dB
    Correct Drift:True, Precision:30
    Non-Linear drift Correction:False
    Upsample:False, Window:Hann
    Spectrum Window:Hann, Spectrum Size:131072
    Spectrogram Window:Hann, Spectrogram Size:16384, Spectrogram Steps:2048
    Dither:False
    Trim Silence:False
    Enable Simple Waveform Measurement: False

Discarding Reference:  Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s

Initial peak values Reference: -1,629dB   Comparison: -1,636dB
Initial RMS values Reference: -22,679dB   Comparison: -22,679dB

Null Depth=274,305dB
X-Correlation offset: 0 samples
Drift computation quality, #1: Excellent (0,09μs)


Trimmed 0 samples ( 0,00ms) front, 0 samples ( 0,00ms end)


Final peak values Reference: -1,629dB   Comparison: -1,636dB
Final RMS values Reference: -22,679dB   Comparison: -22,679dB

Gain= 12,0412dB (4x) DC=0 Phase offset=0,000221ms (0,01 samples)
Difference (rms) = -66,42dB [-67,06dBA]
Correlated Null Depth=77,1dB [69,47dBA]
Clock drift: 0 ppm


Files are NOT a bit-perfect match (match=8,58%) at 16 bits
Files match @ 49,9992% when reduced to 12,07 bits


---- Phase difference (full bandwidth): 42,9494859546678°
    0-10kHz: 10,58°
    0-20kHz: 33,47°
    0-24kHz: 42,95°
Timing error (rms jitter): 1,9μs

RMS of the difference of spectra: -134,670908172942dB
gn=0,249999969427652, dc=-4,398296917148E-14, dr=-5,38E-10, of=0,0097649679

DONE!

Signature: a0e6fb66b7befdab9e030ee527ab46c4
At every step you can hear what's missing in the encoded file (DeltaWave lets you play the difference); it is probably difficult to hear in normal listening.
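For readers wondering what a figure like "Difference (rms)" in dB roughly means, here is a minimal sketch. It assumes two already aligned, gain-matched sample lists scaled to a full scale of 1.0, and it is not DeltaWave's actual algorithm, which also handles alignment, gain and drift correction. The helper names are made up.

```python
import math

def rms(values):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def difference_db(reference, comparison):
    """RMS level of (comparison - reference) in dB relative to full scale 1.0.

    Assumes both inputs are time-aligned and gain-matched beforehand.
    """
    if len(reference) != len(comparison):
        raise ValueError("inputs must have the same number of samples")
    level = rms([c - r for r, c in zip(reference, comparison)])
    return float("-inf") if level == 0 else 20 * math.log10(level)

# Made-up example: a constant error of 0.001 of full scale is about -60 dB.
silence = [0.0] * 100
slightly_off = [0.001] * 100
print(difference_db(silence, slightly_off))
```

A more negative number means the two files are closer together; identical inputs give negative infinity in this sketch.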

About file sizes:
Code:
Original WAV            73.522.314 bytes
MP3 V2 first 'pass'      8.216.350 bytes
MP3 V2 second 'pass'     8.195.026 bytes
MP3 V2 third 'pass'      8.162.241 bytes
You lose a little information each time you re-encode the previously encoded file.
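The sizes above can be turned into percentages with a few lines of arithmetic. This is only a sketch: with a VBR setting such as V2, a smaller file is at best a rough proxy for lost information, and the function name is made up.

```python
# File sizes in bytes, copied from the listing above.
wav_size = 73_522_314
mp3_sizes = [8_216_350, 8_195_026, 8_162_241]  # first, second, third pass

def shrink_percent(previous, current):
    """Percentage by which `current` is smaller than `previous`."""
    return (previous - current) / previous * 100

# Compression ratio of the first pass relative to the WAV.
print(f"WAV / first pass: {wav_size / mp3_sizes[0]:.2f}x")

# Size change from each pass to the next.
for n, (prev, cur) in enumerate(zip(mp3_sizes, mp3_sizes[1:]), start=2):
    print(f"pass {n} is {shrink_percent(prev, cur):.2f}% smaller than pass {n - 1}")
```

Each re-encode here shrinks the file by well under one percent.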

I'll upload a few spectrograms of the deltas if you care. Maybe the 3rd pass compared to the 2nd pass.
 

Veri

Master Contributor
Joined
Feb 6, 2018
Messages
9,597
Likes
12,039
That logic appears sound on the surface, but I can dig down further and ask why would it be, actually? MP3 applies some psychoacoustic-based algorithm to determine the best way to encode the sound file at a specified bitrate without losing apparent fidelity. If the source file was already encoded as mp3, then the encoding step from the converted wav (reminder – this is a wav of a lossy file) doesn't need to make any changes in order to compress the sound to fit the bitrate (given that it's encoding to the same bitrate).

The real question is, if the algorithm is blindly applying its compression on whatever soundwave it receives, or does it first check to see if the acoustic model has already been applied – meaning it doesn't need to re-apply it.

It doesn't work that way. Psychoacoustic models will still be applied to already compressed material. If it's a really good algorithm, like a modern AAC codec, or Opus, it will not significantly further degrade the samples because it's smart that way. Depending on the bit rate of course. MP3 is less good in this regard since it's an older codec.

In any case, if you keep compressing something like mp3 into mp3 into mp3 ... you really will end up with a progressively more degraded file.
 

BillG

Major Contributor
Joined
Sep 12, 2018
Messages
1,699
Likes
2,268
Location
Auckland, New Zealand
If it's a really good algorithm, like a modern AAC codec, or Opus, it will not significantly further degrade the samples because it's smart that way.

My understanding, from some reading I did a while ago, is that it takes numerous (more than 10, but I forget the exact number) re-encodes of AAC before the degradation is apparent to the average listener.
 
OP
Fluffy

Addicted to Fun and Learning
Joined
Sep 14, 2019
Messages
856
Likes
1,425
I'll upload a few spectrograms of the deltas if you care. Maybe the 3rd pass compared to the 2nd pass.
Thanks! Very informative. I would like to see these spectrograms please.

I wonder if this degradation is audible. I can't really tell the difference between WAV and mp3 at 320kbps, so I wonder how much worse it could get if you encode a second time.


@scott wurcer I thought about that too! Definitely would be an interesting experiment to make. There is actually an example of a very similar concept done with youtube streaming:
 
OP
Fluffy

Addicted to Fun and Learning
Joined
Sep 14, 2019
Messages
856
Likes
1,425
Ran my own experiment with DeltaWave, and indeed there is generational degradation. I tried re-encoding a file about 20 times, and by the end it sounded noticeably worse (at 320kbps at all stages).

I also compared the delta between an AAC-encoded file and the lossless original with the delta between an MP3 and the lossless original. Both are at about the same level, but they have different characteristics. The delta from mp3 sounded like it contains more transients and peaks, while the delta from AAC sounds closer to smooth noise. Maybe it's due to what amirm said in a different post:

Shorter windows (frames of audio) are used for transients and longer ones for more steady-state signals. Shorter windows are less efficient but better preserve transient response (i.e. pre-echo).
 