
What exactly happens to a master after it's been sent to Spotify?

tccalvin

I was reading ASR’s thread on inter-sample overs, and one of the things that caught my attention was that peaks over 0dBFS are not just caused by oversampling but also by lossy encoding, among other things.
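
For anyone curious, here's a minimal sketch of how an inter-sample (true) peak can be estimated by oversampling, which is loosely the idea behind ITU-R BS.1770 true-peak meters. It assumes Python with numpy, scipy, and soundfile installed, and "master.wav" is just a placeholder path:

```python
# Rough true-peak estimate by oversampling, loosely following the idea behind
# ITU-R BS.1770 true-peak metering (a simplified sketch, not the full spec).
# Assumes scipy and soundfile are installed; "master.wav" is a placeholder path.
import numpy as np
import soundfile as sf
from scipy.signal import resample_poly

data, rate = sf.read("master.wav", always_2d=True)        # floats in [-1.0, 1.0]

sample_peak = np.max(np.abs(data))                         # highest stored sample
oversampled = resample_poly(data, up=4, down=1, axis=0)    # 4x oversampling
true_peak = np.max(np.abs(oversampled))                    # approximates inter-sample peaks

db = lambda x: 20 * np.log10(x)
print(f"sample peak:       {db(sample_peak):+.2f} dBFS")
print(f"approx. true peak: {db(true_peak):+.2f} dBTP")
```

The true-peak figure will typically come out a bit higher than the sample peak, which is exactly the headroom issue the thread linked above is about.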

I use Spotify to listen to most of my music, and I’ve always kept audio normalization set to “quiet,” but I never really understood why until now. (I just read somewhere long ago that it’s good practice and didn’t question it.) So all’s well that ends well, right? Well…

I had a thought: we can prevent clipping at the decoding stage, but what about what happens before that? On this page, Spotify recommends submitting masters with peaks at -1dBTP to -2dBTP to avoid distortion. But what happens if you don’t do that? What if you submit a master that has peaks at, say, -0.1dBFS?

I don’t really know much about lossy encoding processes, so I’m asking out of ignorance. Will the encoded version of the above-mentioned master clip all over and be forever ruined with no way to fix it? Or does Spotify have some safeguards in place to lower the volume before encoding (assuming that’s what’s required to prevent clipping)? Has this ever been tested?

I apologize if this question has been asked before or if it doesn’t make sense due to my lack of knowledge about the topic.

On the other hand, if this is a valid concern, I happen to have some… less-than-ideal masters that were submitted to Spotify (totally not made by me 5 years ago…) that could be used for testing. Although I’d need some help to get reasonably accurate results.

Thanks for your time and patience. :)
 
Hey all, forgive me for artificially boosting the thread.

I'm thinking of removing my old masters from Spotify next year. I wanted to give this thread another chance while the songs are still up—just in case anyone is interested in the topic but didn’t stumble upon this thread at the right time (assuming the topic hasn’t been discussed before).

I did some testing with a 4-track EP I published on Spotify in 2020. I don’t remember the exact peak value at which I bounced the final master (it was close to 0dBFS), but a quick look in Audacity confirmed that 100% of the samples were below 0dBFS. At the time, I most likely didn’t use True Peak metering while limiting because I didn’t even know what it meant. :)

Since I couldn’t find a way to download the compressed files from Spotify (even though I own the songs), the next best thing I could think of was to record Spotify’s digital output with a virtual audio device. I used a virtual I/O driver for macOS called BlackHole, set to a 44.1kHz sample rate; the bit depth was fixed at 32-bit, with no option to change it. I recorded the output in Audacity at 44.1kHz, 32-bit float. Meanwhile, Spotify was set to output a fixed 320kbps stream (dynamic quality was set to "off"), with volume normalization disabled.
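
As a side note, a rough way to scan such a capture for samples at or above digital full scale, instead of eyeballing it in Audacity, might look like the sketch below. It assumes the recording was exported as a float WAV, "capture.wav" is a placeholder path, and 0 dBFS is taken as |sample| >= 1.0:

```python
# Count samples that sit at or above digital full scale in a 32-bit float capture.
# Assumes the Audacity recording was exported as a float WAV; "capture.wav" is a
# placeholder path, and 0 dBFS is taken as |sample| >= 1.0.
import numpy as np
import soundfile as sf

data, rate = sf.read("capture.wav", dtype="float32", always_2d=True)

over = np.abs(data) >= 1.0
print(f"samples at/above 0 dBFS: {over.sum()} of {data.size}")
print(f"max sample: {20 * np.log10(np.max(np.abs(data))):+.2f} dBFS")
```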

---

Here are the results:

[Attached screenshot: Schermata 2024-11-20 alle 13.05.40.png]


At first glance, there do appear to be some boosted peaks, resulting in clipped samples. It’s not terrible, but I’ve seen masters that are hotter than these with an even narrower dynamic range, so those might fare worse.

Unfortunately, I’m not confident that this testing procedure produces entirely accurate results. First, since I don’t know how Spotify’s encoding works, I have no idea if these clipped samples could be recovered by enabling volume normalization during decoding. Second, I’m unsure whether recording at 32 bits might alter the results compared to 16 bits, which (as far as I know) is Spotify’s output bit depth. Finally, I’m concerned that either BlackHole or Audacity might be “reclocking” the signal, potentially causing phase shifts relative to the samples in the original encoded/decoded files. There may be additional issues I haven’t thought of, but these are the main ones on my radar.
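
On the third point, one way to at least rule out a simple timing offset might be to cross-correlate the capture against a reference and only then look at the difference. The sketch below assumes both files share the same sample rate and channel count; the file names are placeholders, and it only handles a constant offset, not clock drift or resampling:

```python
# Time-align a capture against a reference with cross-correlation, then look at
# the residual. Sketch only: assumes both files share the same sample rate and
# channel count; "reference.wav" and "capture.wav" are placeholder paths.
import numpy as np
import soundfile as sf
from scipy.signal import correlate, correlation_lags

ref, rate = sf.read("reference.wav", always_2d=True)
cap, rate2 = sf.read("capture.wav", always_2d=True)
assert rate == rate2, "resample one file first if the rates differ"

# Estimate the constant delay from the mono sums (FFT-based correlation keeps this fast).
lags = correlation_lags(len(cap), len(ref))
delay = lags[np.argmax(correlate(cap.mean(axis=1), ref.mean(axis=1), method="fft"))]

# Trim both signals to their overlapping, aligned region and subtract.
start_cap = max(delay, 0)
start_ref = max(-delay, 0)
n = min(len(cap) - start_cap, len(ref) - start_ref)
residual = cap[start_cap:start_cap + n] - ref[start_ref:start_ref + n]

print(f"estimated offset: {delay} samples")
print(f"peak residual after alignment: {20 * np.log10(np.max(np.abs(residual)) + 1e-12):+.2f} dBFS")
```

This says nothing about level matching or drift, which is part of what a tool like DeltaWave is built to handle.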

---

I’m hoping someone more knowledgeable can correct me, point me to a properly conducted test, or help shed some light on this issue.

As always, thank you for your time and patience!
 
On this page, Spotify recommends submitting masters with peaks at -1dBTP to -2dBTP to avoid distortion. But what happens if you don’t do that? What if you submit a master that has peaks at, say, -0.1dBFS?
Then it will get distorted when normalization is disabled or not available (e.g., in the web player, I believe).

I don’t really know much about lossy encoding processes, so I’m asking out of ignorance. Will the encoded version of the above-mentioned master clip all over and be forever ruined with no way to fix it? Or does Spotify have some safeguards in place to lower the volume before encoding (assuming that’s what’s required to prevent clipping)? Has this ever been tested?
Encoders work in float and the encoded form doesn't even store samples, so there is nothing to clip. You should be able to test it yourself in Audacity:
  • generate a tone,
  • amplify it to something > 0 dBFS,
  • export it as Ogg Vorbis,
  • import the file back (in the file dialog you have to select "FFmpeg-compatible files" instead of "All files"),
  • zoom out the y-axis (if that's still possible in current versions).
I get this (top is the generated signal, bottom is the imported file):

[Attached image: vorbis.clip.test.png]
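
For anyone who prefers a script to the Audacity steps, roughly the same round trip can be done with python-soundfile, assuming the local libsndfile was built with Ogg Vorbis support. The sketch below only prints the peaks before and after, so it doesn't presume a particular outcome; the tone level and file name are arbitrary:

```python
# Round-trip a deliberately "hot" tone through Ogg Vorbis and compare peaks,
# mirroring the Audacity steps above. Sketch only: assumes python-soundfile with
# a libsndfile build that includes Ogg Vorbis support.
import numpy as np
import soundfile as sf

rate = 44100
t = np.arange(rate) / rate                     # one second
tone = 1.4 * np.sin(2 * np.pi * 1000 * t)      # ~+2.9 dBFS peak, i.e. above full scale

sf.write("tone.ogg", tone, rate, format="OGG", subtype="VORBIS")
decoded, _ = sf.read("tone.ogg")

db = lambda x: 20 * np.log10(np.max(np.abs(x)))
print(f"generated peak: {db(tone):+.2f} dBFS")
print(f"decoded peak:   {db(decoded):+.2f} dBFS")
```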


(assuming the topic hasn’t been discussed before)
Possibly here:

I have no idea if these clipped samples could be recovered by enabling volume normalization during decoding.
You could capture the output with normalization enabled and compare, probably with DeltaWave.
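
A very crude stand-in for part of what DeltaWave automates (level matching plus a null comparison) could be sketched like this, assuming the two captures are already time-aligned and equal in length; the file names are placeholders:

```python
# Crude stand-in for part of what DeltaWave automates: match the level of two
# captures by RMS, then measure the residual. Sketch only: assumes the two files
# are already time-aligned; paths are placeholders.
import numpy as np
import soundfile as sf

a, rate = sf.read("norm_off.wav", always_2d=True)
b, _ = sf.read("norm_on.wav", always_2d=True)
n = min(len(a), len(b))
a, b = a[:n], b[:n]

rms = lambda x: np.sqrt(np.mean(x ** 2))
b_matched = b * (rms(a) / rms(b))              # volume-match the second capture

residual = a - b_matched
print(f"residual RMS: {20 * np.log10(rms(residual) / rms(a)):.1f} dB rel. to the first capture")
```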

Quite some time ago I did something like that with some Dua Lipa track and got this (top is normalization disabled, bottom is enabled; the files were volume-matched, obviously):

[Attached image: norm_on_vs_off.png]
 
The lossy encoding can push the peaks over 0dB without clipping.

If you allow them to loudness-normalize it, I assume they check the encoded file with the new, higher peaks, and it should be turned down to 0dB peaks, or lower if necessary to hit their LUFS target. Most tracks get turned down anyway.
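
For what it's worth, the turn-down amount can be estimated from the track's integrated loudness. A sketch, assuming the commonly cited -14 LUFS default target for Spotify's "normal" setting, the pyloudnorm package for BS.1770 measurement, and a placeholder file path:

```python
# Estimate how much a track would be turned down by loudness normalization.
# Sketch only: uses pyloudnorm for BS.1770 integrated loudness and assumes a
# -14 LUFS target (commonly cited Spotify default); "master.wav" is a placeholder.
import soundfile as sf
import pyloudnorm as pyln

TARGET_LUFS = -14.0

data, rate = sf.read("master.wav")
loudness = pyln.Meter(rate).integrated_loudness(data)   # integrated LUFS
gain_db = TARGET_LUFS - loudness                         # negative = turned down

print(f"integrated loudness: {loudness:.1f} LUFS")
print(f"normalization gain:  {gain_db:+.1f} dB")
```

For most modern masters the gain comes out negative, which matches the "most tracks get turned down anyway" point.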

If you bypass loudness normalization and play it at full digital volume, you can clip your DAC.

I've never seen a listening test showing that the slight clipping caused by lossy compression and then decompression back to PCM is audible. If you hear lossy compression artifacts, you're probably hearing something else.

...And once you hand off your music to a streaming service, you're no longer in control. And as listeners, we can't control which mix/master/production they feed us.
 
Many thanks for your feedback, @danadam and @DVDdoug . You guys are helping me learn.

Quite some time ago I did something like that with some Dua Lipa track and got this:

[Quoted attachment: norm_on_vs_off.png]

That looks pretty conclusive to me. :)

I could try the same with my masters, but since I only get single clipped samples, I doubt the waveforms would look much different.

I've never seen a listening test showing that the slight clipping caused by lossy compression and then decompression back to PCM is audible. If you hear lossy compression artifacts, you're probably hearing something else.

Oh, no worries about that. I can’t even tell the difference between a FLAC and a 160kbps compressed file. My trust in my ears is at an all-time low. That said, it’s still nice to learn the ins and outs of streaming services in order to make the best use of them. And to get rid of those red lines, if nothing else. :)
 