If, for example, I take an MP3 file, convert it to WAV, and then re-encode it to MP3, will the resulting file be different/degraded compared to the original MP3?
Yes, it will degrade.
That logic appears sound on the surface, but I can dig down further and ask: why would it be, actually? MP3 applies a psychoacoustic algorithm to determine the best way to encode the sound file at a specified bitrate without losing apparent fidelity. If the source file was already encoded as MP3, then the encoding step from the converted WAV (reminder: this is a WAV of a lossy file) doesn't need to make any changes in order to compress the sound to fit the bitrate (given that it's encoding to the same bitrate).
MP3 is lossy. By definition, that means it loses information on every encode. Encoding at a high enough bitrate can lessen the impact, but you're still going lossy on lossy. So the answer to whether it degrades is "yes".
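To make the "lossy on lossy" point concrete, here is a toy sketch in Python. It is not MP3 or anything close to it: `toy_codec` is a made-up block quantizer, and its `offset` parameter is my stand-in for the way a real encoder's delay can shift where frame boundaries fall. Under those assumptions, re-encoding with the same block alignment changes almost nothing, while re-encoding with shifted blocks adds new error.

```python
import math

def toy_codec(samples, block=8, levels=7, offset=0):
    """Toy lossy 'codec': quantize each block relative to its own peak.

    `offset` moves the block boundaries (hypothetical stand-in for encoder delay).
    """
    edges = sorted(set([0] + list(range(offset, len(samples), block)) + [len(samples)]))
    out = []
    for lo, hi in zip(edges, edges[1:]):
        chunk = samples[lo:hi]
        peak = max(abs(s) for s in chunk)
        if peak == 0:
            out.extend(chunk)  # silent block: nothing to quantize
            continue
        step = peak / levels   # coarser steps for louder blocks
        out.extend(round(s / step) * step for s in chunk)
    return out

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sample lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Test signal: a sine whose amplitude grows over time, so block peaks differ.
N = 256
signal = [(0.2 + 0.8 * n / N) * math.sin(2 * math.pi * n / 37) for n in range(N)]

gen1 = toy_codec(signal)                  # first generation
gen2_same = toy_codec(gen1)               # re-encode, same block alignment
gen2_shifted = toy_codec(gen1, offset=3)  # re-encode, blocks shifted by 3 samples

print("gen1 vs source:        ", rms_error(signal, gen1))
print("gen2 (aligned) vs gen1:", rms_error(gen1, gen2_same))
print("gen2 (shifted) vs gen1:", rms_error(gen1, gen2_shifted))
```

The only thing this is meant to show is that an encoder cannot simply "recognize" already-quantized input once the block grid no longer lines up with the previous pass.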
The real question is whether the algorithm blindly applies its compression to whatever waveform it receives, or whether it first checks if the acoustic model has already been applied, meaning it doesn't need to re-apply it.
How could it know? It just sees a file and compresses it.
That logic appears sound on the surface, but I can dig down further and ask: why would it be, actually? MP3 applies a psychoacoustic algorithm to determine the best way to encode the sound file at a specified bitrate without losing apparent fidelity. If the source file was already encoded as MP3, then the encoding step from the converted WAV (reminder: this is a WAV of a lossy file) doesn't need to make any changes in order to compress the sound to fit the bitrate (given that it's encoding to the same bitrate).
The real question is whether the algorithm blindly applies its compression to whatever waveform it receives, or whether it first checks if the acoustic model has already been applied, meaning it doesn't need to re-apply it.
I need to leave the house now, but later I'll try an experiment comparing the original MP3 with the re-encoded one using DeltaWave. If you are correct, it should show some level of mismatch with the original.
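For anyone who wants to see what such a comparison boils down to, here is a minimal null-test sketch in Python. This is not how DeltaWave works internally (it also corrects gain, drift, and sub-sample offsets); `rms_db` and `difference_db` are hypothetical helper names of mine, and the sketch assumes both files are already decoded to floating-point samples and that the comparison is delayed by a known, non-negative whole number of samples.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0); -inf for pure silence."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_sq) if mean_sq > 0 else float("-inf")

def difference_db(reference, comparison, offset=0):
    """Subtract the reference from the (delayed) comparison and return the residual's RMS in dB.

    `offset` is how many samples the comparison lags the reference (assumed >= 0).
    """
    n = min(len(reference), len(comparison) - offset)
    residual = [comparison[i + offset] - reference[i] for i in range(n)]
    return rms_db(residual)

# Synthetic stand-ins for two decoded files.
reference = [0.5 * math.sin(2 * math.pi * n / 50) for n in range(2000)]
# Comparison: same signal with a small +/-0.001 error, delayed by 5 samples.
perturbed = [s + (0.001 if i % 2 == 0 else -0.001) for i, s in enumerate(reference)]
comparison = [0.0] * 5 + perturbed

identical = difference_db(reference, reference)
degraded = difference_db(reference, comparison, offset=5)
print("identical files:", identical)
print("degraded copy:  ", round(degraded, 2), "dB")
```

A perfect null gives minus infinity; any finite figure means the two decoded signals differ by that much.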
DeltaWave v1.0.46, 2019-11-06T18:38:10.2776392+01:00
Reference: 2L-139_01_stereo_01.cd.flac[L] 18380292 samples 44100Hz 16bits, stereo, MD5=00
Comparison: 2L-139_01_stereo_01.cd_mp3_V2_1st_enc.mp3[L] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -160dB
Correct Drift:True, Precision:30
Non-Linear drift Correction:False
Upsample:False, Window:Hann
Spectrum Window:Hann, Spectrum Size:131072
Spectrogram Window:Hann, Spectrogram Size:16384, Spectrogram Steps:2048
Dither:False
Trim Silence:False
Enable Simple Waveform Measurement: False
Discarding Reference: Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s
Initial peak values Reference: -0,27dB Comparison: -0,205dB
Initial RMS values Reference: -21,065dB Comparison: -21,066dB
Null Depth=9,466dB
X-Correlation offset: -1105 samples
Drift computation quality, #1: Excellent (0,2μs)
Trimmed 0 samples ( 0,00ms) front, 0 samples ( 0,00ms end)
Final peak values Reference: -0,27dB Comparison: -0,205dB
Final RMS values Reference: -21,065dB Comparison: -21,065dB
Gain= 12,0412dB (4x) DC=0 Phase offset=-25,056378ms (-1104,986 samples)
Difference (rms) = -62,46dB [-63,31dBA]
Correlated Null Depth=64,46dB [59,59dBA]
Clock drift: 0 ppm
Files are NOT a bit-perfect match (match=2,49%) at 16 bits
Files match @ 49,9718% when reduced to 10,53 bits
---- Phase difference (full bandwidth): 58,0297418504023°
0-10kHz: 14,34°
0-20kHz: 52,15°
0-24kHz: 58,03°
Timing error (rms jitter): 3,6μs
RMS of the difference of spectra: -122,290087627673dB
gn=0,249999907854441, dc=0, dr=-1,85E-09, of=-1104,9862595922
DONE!
Signature: ca94b3a98c2e5ce4f78417879be6ed8c
DeltaWave v1.0.46, 2019-11-06T18:51:08.4092936+01:00
Reference: 2L-139_01_stereo_01.cd.flac[L+R] 18380292 samples 44100Hz 16bits, stereo, MD5=00
Comparison: 2L-139_01_stereo_01.cd_mp3_V2_2nd_enc.mp3[L+R] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -160dB
Correct Drift:True, Precision:30
Non-Linear drift Correction:False
Upsample:False, Window:Hann
Spectrum Window:Hann, Spectrum Size:131072
Spectrogram Window:Hann, Spectrogram Size:16384, Spectrogram Steps:2048
Dither:False
Trim Silence:False
Enable Simple Waveform Measurement: False
Discarding Reference: Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s
Initial peak values Reference: -1,647dB Comparison: -1,636dB
Initial RMS values Reference: -22,678dB Comparison: -22,679dB
Null Depth=227,177dB
X-Correlation offset: -1105 samples
Drift computation quality, #1: Excellent (0,19μs)
Trimmed 0 samples ( 0,00ms) front, 0 samples ( 0,00ms end)
Final peak values Reference: -1,647dB Comparison: -1,635dB
Final RMS values Reference: -22,678dB Comparison: -22,678dB
Gain= 12,0412dB (4x) DC=0 Phase offset=-25,055219ms (-1104,935 samples)
Difference (rms) = -62,37dB [-63,18dBA]
Correlated Null Depth=61,77dB [59,02dBA]
Clock drift: 0 ppm
Files are NOT a bit-perfect match (match=1,95%) at 16 bits
Files match @ 50,0007% when reduced to 10,64 bits
---- Phase difference (full bandwidth): 58,3831349907483°
0-10kHz: 17,31°
0-20kHz: 52,49°
0-24kHz: 58,38°
Timing error (rms jitter): 8,8μs
RMS of the difference of spectra: -123,584671109206dB
gn=0,249998683347809, dc=0, dr=-3,745E-09, of=-1104,9351705568
DONE!
Signature: 8a7aecbb29fcc3afef38885bc4ba005b
DeltaWave v1.0.46, 2019-11-06T18:57:44.2966421+01:00
Reference: 2L-139_01_stereo_01.cd_mp3_V2_1st_enc.mp3[L+R] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Comparison: 2L-139_01_stereo_01.cd_mp3_V2_2nd_enc.mp3[L+R] 18382464 samples 44100Hz 16bits, stereo, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -160dB
Correct Drift:True, Precision:30
Non-Linear drift Correction:False
Upsample:False, Window:Hann
Spectrum Window:Hann, Spectrum Size:131072
Spectrogram Window:Hann, Spectrogram Size:16384, Spectrogram Steps:2048
Dither:False
Trim Silence:False
Enable Simple Waveform Measurement: False
Discarding Reference: Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s
Initial peak values Reference: -1,629dB Comparison: -1,636dB
Initial RMS values Reference: -22,679dB Comparison: -22,679dB
Null Depth=274,305dB
X-Correlation offset: 0 samples
Drift computation quality, #1: Excellent (0,09μs)
Trimmed 0 samples ( 0,00ms) front, 0 samples ( 0,00ms end)
Final peak values Reference: -1,629dB Comparison: -1,636dB
Final RMS values Reference: -22,679dB Comparison: -22,679dB
Gain= 12,0412dB (4x) DC=0 Phase offset=0,000221ms (0,01 samples)
Difference (rms) = -66,42dB [-67,06dBA]
Correlated Null Depth=77,1dB [69,47dBA]
Clock drift: 0 ppm
Files are NOT a bit-perfect match (match=8,58%) at 16 bits
Files match @ 49,9992% when reduced to 12,07 bits
---- Phase difference (full bandwidth): 42,9494859546678°
0-10kHz: 10,58°
0-20kHz: 33,47°
0-24kHz: 42,95°
Timing error (rms jitter): 1,9μs
RMS of the difference of spectra: -134,670908172942dB
gn=0,249999969427652, dc=-4,398296917148E-14, dr=-5,38E-10, of=0,0097649679
DONE!
Signature: a0e6fb66b7befdab9e030ee527ab46c4
Original WAV 73.522.314 bytes
MP3 V2 first 'pass' 8.216.350 bytes
MP3 V2 second 'pass' 8.195.026 bytes
MP3 V2 third 'pass' 8.162.241 bytes
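Just to put numbers on those sizes, here is a quick bit of arithmetic in Python. Nothing in it is measured beyond the four byte counts posted above; the variable names are mine.

```python
# File sizes in bytes, exactly as posted above.
sizes = [
    ("Original WAV", 73_522_314),
    ("MP3 V2 first pass", 8_216_350),
    ("MP3 V2 second pass", 8_195_026),
    ("MP3 V2 third pass", 8_162_241),
]

wav_size = sizes[0][1]

# Compression ratio of each MP3 relative to the WAV.
ratios = [wav_size / size for _, size in sizes[1:]]

# Bytes lost from one MP3 pass to the next.
shrink = [prev - cur for (_, prev), (_, cur) in zip(sizes[1:], sizes[2:])]

for (name, size), ratio in zip(sizes[1:], ratios):
    print(f"{name}: {size} bytes, {ratio:.2f}:1 vs WAV")
for (name, _), delta in zip(sizes[2:], shrink):
    print(f"{name}: {delta} bytes smaller than the previous pass")
```

So each pass comes out slightly smaller than the one before, which at least shows the encoder is not producing the same bitstream again.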
That logic appears sound on the surface, but I can dig down further and ask: why would it be, actually? MP3 applies a psychoacoustic algorithm to determine the best way to encode the sound file at a specified bitrate without losing apparent fidelity. If the source file was already encoded as MP3, then the encoding step from the converted WAV (reminder: this is a WAV of a lossy file) doesn't need to make any changes in order to compress the sound to fit the bitrate (given that it's encoding to the same bitrate).
The real question is whether the algorithm blindly applies its compression to whatever waveform it receives, or whether it first checks if the acoustic model has already been applied, meaning it doesn't need to re-apply it.
If it's a really good algorithm, like a modern AAC codec, or Opus, it will not significantly further degrade the samples because it's smart that way.
I'll upload a few spectrograms of the deltas if you care. Maybe the 3rd pass compared to the 2nd pass.
Thanks! Very informative. I would like to see these spectrograms, please.
Shorter windows (frames of audio) are used for transients and longer ones for more steady-state signals. Shorter windows are less efficient but better preserve transient response (i.e. they reduce pre-echo).
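As a rough illustration of that decision, here is a toy window-switching rule in Python. Real encoders use more elaborate criteria, so treat everything here as an assumption: `choose_window`, the `jump` threshold, and the 1024-sample frame are made up for the sketch and are not taken from any particular codec.

```python
import math

def block_energies(frame, parts=8):
    """Split a frame into equal sub-blocks and return the energy of each."""
    size = len(frame) // parts
    return [sum(s * s for s in frame[i * size:(i + 1) * size]) for i in range(parts)]

def choose_window(frame, parts=8, jump=10.0, floor=1e-9):
    """Pick 'short' if energy rises sharply between neighbouring sub-blocks, else 'long'.

    `jump` is a hypothetical threshold: how many times louder a sub-block must be
    than the one before it to count as a transient.
    """
    energies = block_energies(frame, parts)
    for prev, cur in zip(energies, energies[1:]):
        if cur > jump * max(prev, floor):
            return "short"
    return "long"

FRAME = 1024  # illustrative frame length

# Steady tone: energy is spread evenly, so a long window is fine.
steady = [math.sin(2 * math.pi * n / 32) for n in range(FRAME)]

# Silence followed by a sudden tone: a transient in the middle of the frame.
attack = [0.0] * (FRAME // 2) + [math.sin(2 * math.pi * n / 32) for n in range(FRAME // 2)]

print("steady tone ->", choose_window(steady))
print("sudden onset ->", choose_window(attack))
```

With a long window, the quantization noise for the sudden-onset frame would be spread over the whole frame, including the silent half before the attack, which is the pre-echo that short windows are meant to limit.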