That -30dB noise floor is measured over the entire spectrum; if most of the noise sits in the lower part of the spectrum, it isn't the audible noise floor for the mids/upper mids.
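To put rough numbers on that, here is a small Python sketch. The band names and levels are made up purely for illustration (they are not measurements of anything); it only shows how per-band noise powers add up to a wideband figure.

```python
import math

def total_level_db(band_levels_db):
    """Sum per-band noise levels (given in dB) on a power basis; return the total in dB."""
    total_power = sum(10 ** (level / 10) for level in band_levels_db)
    return 10 * math.log10(total_power)

# Hypothetical per-band noise levels in dB, for illustration only
bands = {"lows": -30.5, "mids": -45.0, "upper_mids": -50.0, "highs": -45.0}

wideband = total_level_db(bands.values())
print(f"wideband noise: {wideband:.1f} dB")
print(f"mids alone:     {bands['mids']:.1f} dB")
```

With these assumed values the wideband figure lands close to -30dB because the lows dominate, while the noise in the mids alone sits roughly 15dB lower.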
Consider that the dynamic range human hearing can handle at any one time is about 70-80dB.
This is not the same as the span between the softest sound one can hear (about -5dB SPL at 3kHz) and 130dB (pain limit).
Play peaks at 100dB SPL and anything peaking below about 30dB is heard as silence.
Play peaks at 120dB and about 40dB is the softest sound one can hear... that threshold will drop again once some time has passed without such loud peaks.
But yes, 16 bits, dithered, is more than enough and can reach over 96dB dynamic range. Think of dither as DSD in the smallest bit.
For recordings it pays to have 24 bits (144dB, theoretically, as the practical noise floor gets in the way): one gets a LOT of headroom for recording loud sounds, without having to keep watching that the peak meters stay below 0dB and having to redo the recording after apologizing to the artists.
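The 96dB and 144dB figures come from the usual rule of thumb of about 6dB per bit. A minimal sketch of that arithmetic (the function name is mine, and it ignores dither, noise shaping and the analog noise floor, so treat it as the theoretical number only):

```python
import math

def dynamic_range_db(bits):
    """Theoretical ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")
```

That gives roughly 96dB for 16 bits and roughly 144dB for 24 bits, so the extra 8 bits buy about 48dB of recording headroom on paper.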