Is their normalisation just a plain and simple level adjustment or does it have compensatory psychoacoustic/other treatment?
Just a curious question, nothing more.
I thought I had already answered this question, but OK, let's try again.
First, I assume that we don't count the limiter engaging at -1 dBFS as "compensatory psychoacoustic/other treatment".
With that, as far as I can tell, no, they don't do any compensatory psychoacoustic/other treatment. This is my conclusion from the following process applied to a few tracks:
- Capture the digital output of the Spotify client when playing a track with normalization disabled. Let's call the resulting track "normOff".
- Capture the digital output of the Spotify client when playing the same track with normalization enabled. Let's call the resulting track "normOn".
- Align those tracks.
- Find a suitable sample in those tracks to calculate the level difference between them. Suitable means:
  - not too quiet, in order to have good precision,
  - and not too loud, in order to avoid samples that could be affected by clipping in "normOff" or by limiting in "normOn".
- Calculate the level difference and simply reduce the volume of the louder track by that amount using SoX and its "vol" effect, where 0.xxx is the linear amplitude ratio between the two tracks:
Code:
sox input.wav output.wav vol 0.xxx
- Generate a null of the tracks using SoX:
Code:
sox -m -v 1 input1.wav -v -1 input2.wav null.wav
- Check the level of the null and generate its spectrogram, again using SoX:
Code:
sox null.wav -n stats
sox null.wav -n spectrogram -X 4 -y 513 -o null.png
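The arithmetic behind the last three steps can be sketched in Python. This is only an illustration under my own assumptions, not what SoX does internally: the sample values are made up, they are scaled to the range -1.0..1.0, and the function names (gain_factor, to_dbfs, null_stats) are hypothetical.

```python
import math

def gain_factor(louder, quieter):
    """Linear factor that scales `louder` down to the level of `quieter`.
    Both arguments are amplitudes of the same aligned sample."""
    return abs(quieter) / abs(louder)

def to_dbfs(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")

def null_stats(a, b):
    """Peak and RMS level (dBFS) of the sample-wise difference a - b."""
    diff = [x - y for x, y in zip(a, b)]
    peak = max(abs(d) for d in diff)
    rms = math.sqrt(sum(d * d for d in diff) / len(diff))
    return to_dbfs(peak), to_dbfs(rms)

# Hypothetical aligned excerpts (assumption: not real captures)
norm_off = [0.50, -0.40, 0.30, -0.20]
norm_on = [0.25, -0.20, 0.15, -0.10]

# Use one suitable sample to get the value that goes into "vol 0.xxx"
factor = gain_factor(norm_off[0], norm_on[0])
scaled = [s * factor for s in norm_off]

# Subtract the level-matched tracks and measure what is left
peak_db, rms_db = null_stats(scaled, norm_on)
print(f"vol factor: {factor:.3f} ({to_dbfs(factor):.2f} dB)")
print(f"null peak: {peak_db} dBFS, null RMS: {rms_db} dBFS")
```

If normalization were a plain gain change, the residual would be nothing but dither-level noise; any processing beyond that would show up as a much louder null.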
From that process, if "normOff" was louder than "normOn", the null is mostly perfect*, sometimes with a more or less occasional bleep indicating the clipping in "normOff" discussed earlier. Btw, Spotify actually recommends mastering tracks below -1 dBTP (True Peak) to avoid those bleeps.
On the other hand, if "normOn" was louder than "normOff", the null is sometimes perfect and sometimes not. When it is not, the difference is only in the parts where the track is loud, indicating the use of the limiter in "normOn". Changing the normalization level to "Quiet" and repeating the process usually results in a perfect null.
*) A perfect null here means RMS below -90 dBFS and peak below -80 dBFS. The output of the client is 16-bit, so those values indicate only a difference in the dither that was applied.
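To put those thresholds in perspective, here is a rough sketch of how they map to 16-bit quantization steps. It assumes full scale is ±1.0, so one LSB is 1/32768; the function names are hypothetical.

```python
import math

def lsb_dbfs(bits=16):
    """Level of one least-significant bit relative to full scale, in dBFS."""
    return 20 * math.log10(1 / 2 ** (bits - 1))

def dbfs_to_lsb(level_db, bits=16):
    """Express a dBFS level as a multiple of one LSB at the given bit depth."""
    return 10 ** (level_db / 20) * 2 ** (bits - 1)

print(f"1 LSB at 16 bit: {lsb_dbfs():.2f} dBFS")
print(f"-90 dBFS RMS  = {dbfs_to_lsb(-90):.2f} LSB")
print(f"-80 dBFS peak = {dbfs_to_lsb(-80):.2f} LSB")
```

In other words, an RMS of about one LSB and a peak of a few LSB, which is the order of magnitude one would expect from two differently dithered copies of the same signal.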
I would like to know more about the audio processing that Streamers employ. The commercial temptation is to one-up the competition.
In my view, if they want to catch up to the competition, they should use album normalization everywhere and not use a limiter, just play the album that much quieter so it does not clip.
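A minimal sketch of what I mean, not a description of how any streaming service actually works. The function name, the -14 LUFS target, the -1 dBTP ceiling and the album figures are all illustrative assumptions.

```python
def album_gain_db(album_lufs, album_peak_dbtp, target_lufs=-14.0, ceiling_dbtp=-1.0):
    """One gain value (dB) for a whole album, capped so no limiter is needed."""
    # Gain that would bring the album to the target loudness
    wanted = target_lufs - album_lufs
    # Largest gain that still keeps the album's true peak under the ceiling
    headroom = ceiling_dbtp - album_peak_dbtp
    # If the wanted gain would clip, play the album that much quieter instead
    return min(wanted, headroom)

# Hypothetical quiet, dynamic album: -20 LUFS with a true peak of -3 dBTP
quiet_album = album_gain_db(-20.0, -3.0)
# Hypothetical loud album: -8 LUFS with a true peak of -0.2 dBTP
loud_album = album_gain_db(-8.0, -0.2)
print(quiet_album, loud_album)
```

The quiet album ends up below the target because its peaks limit the gain, and every track on an album gets the same gain, so the relative levels between tracks are preserved.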