
How much dynamic range is too much?

If you want to mess around with a practical real-world example I created one here.
That's a nice little article you put together there, Mike.

IMO you are establishing that this branch of the thread topic is largely academic. BTW I am much more likely to hear microphone hiss than digital format noise floor.
 
Yes, that was well understood; I'm familiar with stochastic processes: when the signal amplitude approaches the quantization floor (around -6 dB times the number of bits), the probability of the LSB being a 1 is biased by roughly half the amplitude of the wave, so the repetition of samples produces a repeating pattern that models the wave.

That implies, if I'm correct, that high frequencies are better represented by the dithering, as peaks occur more frequently. Am I right?
Intuitively, the opposite seems right to me. Lower frequencies have more sample points along each crest, offering more chances (sample points) to bias the dither (more chances for a randomly picked 0 to become a 1). Also less quantization error because more dice rolls give a smoother approximation and higher confidence estimate of the relative amplitude.

Put differently, if only a single sample point LSB is a 1, it cannot be differentiated from the random process that created the dither. But if, say, 7 of 8 consecutive samples all have an LSB of 1, it's looking less random and more likely you have encoded a signal whose amplitude is some fraction of the LSB. Not necessarily 7/8 of the LSB, of course.
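As a quick check of this "bias the LSB" intuition, here's a minimal NumPy sketch (a toy model I put together for this post, not any real codec): it quantizes a sine whose amplitude is only a quarter of an LSB using TPDF dither, then recovers its amplitude by correlating the quantized stream against the known tone.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 48000              # sample rate in Hz (arbitrary for this demo)
n = 1 << 16             # number of samples
t = np.arange(n) / fs
freq = 1000.0           # test tone frequency in Hz
amp = 0.25              # tone amplitude in LSB units: well below one LSB

ref = np.sin(2 * np.pi * freq * t)
signal = amp * ref      # sub-LSB sine, expressed in LSBs

# TPDF dither: the sum of two independent uniform sources, +/- 0.5 LSB each
dither = rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)

quantized = np.round(signal + dither)   # quantize to integer LSB steps

# Lock-in style estimate: correlate the quantized stream with the known tone.
# The recovered amplitude should land close to the true 0.25 LSB.
recovered = 2 * np.mean(quantized * ref)
print(f"true: {amp} LSB, recovered: {recovered:.3f} LSB")
```

Even though no individual sample can encode less than one LSB, the correlation pulls the quarter-LSB tone back out of the dither noise, which is exactly the statistical biasing described above.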

But this is really a question for @j_j , or likely, one that he's already answered somewhere here at ASR.
 
First, you are extremely unlikely to encounter any recorded music with as much as 70 dB of dynamic range from the softest to the loudest notes, at least with acoustic instruments.

Second, many, like myself, who are quite bothered by noise do not think LP's sound better than digital formats.
In fact this is precisely what one can do with 96 dB of dynamic range and compressing to 70 dB: have softer peaks.

I suppose that from a digital recording one can emulate vinyl in nearly all of its characteristics...
 
Intuitively, the opposite seems right to me. Lower frequencies have more sample points along each crest, offering more chances (sample points) to bias the dither (more chances for a randomly picked 0 to become a 1). Also less quantization error because more dice rolls give a smoother approximation and higher confidence estimate of the relative amplitude.

Put differently, if only a single sample point LSB is a 1, it cannot be differentiated from the random process that created the dither. But if, say, 7 of 8 consecutive samples all have an LSB of 1, it's looking less random and more likely you have encoded a signal whose amplitude is some fraction of the LSB. Not necessarily 7/8 of the LSB, of course.

But this is really a question for @j_j , or likely, one that he's already answered somewhere here at ASR.
Yes, I understand your argument; I was thinking in terms of peaks rather than shapes.

Groups with more 1s than 0s can be differentiated from groups of the opposite composition when each group contains enough bits to weight the probability.

This requires large groups, hence longer wavelengths.
 
Yes, I understand your argument; I was thinking in terms of peaks rather than shapes.

Groups with more 1s than 0s can be differentiated from groups of the opposite composition when each group contains enough bits to weight the probability.
Exactly. And DSD or 1-bit audio is also based on this same concept. Sample a boolean value bazillions of times per second, then take weighted averages over multiple samples to determine the amplitude of a much lower frequency encoded wave over that period.
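A toy illustration of that averaging idea (deliberately the crude probabilistic version under discussion here, not a real delta-sigma modulator like DSD actually uses): each 1-bit sample is 1 with probability equal to the wave's instantaneous value, and a moving average over many samples recovers the slow wave.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
t = np.arange(n)
wave = 0.5 + 0.4 * np.sin(2 * np.pi * t / 20_000)  # slow sine in [0.1, 0.9]

# 1-bit "sampling": each bit is 1 with probability equal to the wave's value
bits = (rng.uniform(0.0, 1.0, n) < wave).astype(float)

# Decode with a moving average much shorter than the wave's period
window = 501
decoded = np.convolve(bits, np.ones(window) / window, mode="same")

# Compare away from the edges, where the window sits fully inside the stream
sl = slice(window, n - window)
err = np.max(np.abs(decoded[sl] - wave[sl]))
print(f"max reconstruction error: {err:.3f}")
```

The reconstruction error shrinks as the averaging window grows, at the cost of bandwidth: the weighted average over many boolean samples determines the amplitude of the much lower frequency encoded wave.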
 
That implies, if I'm correct, that high frequencies are better represented by the dithering, as peaks occur more frequently. Am I right?
If I'm not missing something myself, the frequency doesn't matter if you're using flat dither (i.e. TPDF white noise). The dither noise has equal energy at all frequencies, so all frequencies have the same dynamic range.
 
If I'm not missing something myself, the frequency doesn't matter if you're using flat dither (i.e. TPDF white noise). The dither noise has equal energy at all frequencies, so all frequencies have the same dynamic range.
If you take the simple case of randomizing the LSB with equal probability 0 or 1, and consider how it can capture signals smaller than the LSB, it seems that lower frequency signals can be captured with greater accuracy because each sample point gives you a probabilistic representation or indication of the signal, and lower frequencies have more sample points per wavelength.

However, I am not an expert in this... just applying my intuition based on my math knowledge.
 
bazillions of times per second
So fun. I thought "bazillion" was a typo in your post, but after googling it, it's like our Spanish term "tropecientos" for arbitrarily large numbers. It sounds hilarious.

Apparently both arguments (high frequencies weighted by their rate of appearance, and low frequencies weighted by the length of their groups) balance out; as another member noted below, it seems to be frequency independent.

Post edited: imagine a frequency right at the Nyquist limit; it would be represented by 0s and 1s in an alternating pattern. With a fully random choice there is no pattern, but when the randomness is biased by the probability from the sine wave, the sequence shows a semi-regular appearance of 1010 and 10101010 groups at that frequency.
 
lower frequencies have more sample points per wavelength.
This is true, but if you allow the higher frequencies the same number of sample points there is no difference. Why should only a single cycle be considered?
 
This is true, but if you allow the higher frequencies the same number of sample points there is no difference. Why should only a single cycle be considered?
Since we're talking about signals smaller than the LSB, and each LSB is randomized, each set of sample points gives only a probabilistic indication of any lower level signal. Thus the more sample points, the greater the confidence whether any low level signal exists, and if so the accuracy of its encoding.

I think I see your point. Consider any window in time of N samples. Within this window, low frequency signals have more points in each cycle (higher confidence), but fewer cycles. High frequency signals have fewer points in each cycle (lower confidence), but more cycles. Either way, we have a total of N samples and overall confidence could be the same.

To draw an analogy to A/B testing, where each test is the LSB of a "sample" that might be random. A low frequency signal is like a test subject who does 2 sessions each consisting of 10 tests. A high frequency signal is like a test subject who does 10 sessions each consisting of 2 tests. Either way you have the results of 20 tests (sample points), same confidence either way.

I don't know whether that's how it actually works with the bandwidth of tiny signals that can be encoded through dither, but I see the point you're making, seems intuitive.
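That windowed-confidence argument can be poked at numerically. Here's a rough NumPy experiment (my own toy setup; the parameters are arbitrary): estimate a quarter-LSB tone under TPDF dither at a low and a high frequency using the same number of samples, and compare the spread of the estimates across trials.

```python
import numpy as np

rng = np.random.default_rng(2)
fs, n = 44100, 1 << 15   # sample rate and window length (arbitrary choices)

def estimate_spread(freq, trials=200):
    """Std. dev. across trials of a lock-in amplitude estimate of a
    0.25 LSB tone quantized with flat TPDF dither."""
    t = np.arange(n) / fs
    ref = np.sin(2 * np.pi * freq * t)
    sig = 0.25 * ref                     # sub-LSB tone, in LSB units
    ests = []
    for _ in range(trials):
        dither = rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)
        q = np.round(sig + dither)       # dithered quantization
        ests.append(2 * np.mean(q * ref))
    return np.std(ests)

low, high = estimate_spread(100.0), estimate_spread(10_000.0)
print(f"estimator spread at 100 Hz: {low:.4f} LSB, at 10 kHz: {high:.4f} LSB")
```

With equal total sample counts the two spreads come out essentially the same, consistent with the "same total N, same confidence" conclusion.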
 
I have this Beethoven version, both 24-bit and 16-bit, bought from https://www.eclassical.com/labels/bis/beethoven-the-nine-symphonies-2.html

According to the short-term LUFS the quietest parts are in the 2nd movement at around 9:30 and 14:30. Here are the graphs for the 9th and 14th minutes:
loudness.png


I took 10-second snippets starting from 9:39 and 14:39 from both 24-bit and 16-bit (in attachment). You can see that the difference is only in the shaped dither:
fft.lin.png


If someone wants something that gets even quieter, there's Okko Kamu, Lahti S.O. / Sibelius - Symphony No. 1 and 2:
sibelius.kamu-lahti.1.1.snip.png

sibelius.kamu-lahti.2.2.snip.png
 


Either way, we have a total of N samples and overall confidence could be the same.
Here's a brief experimental demonstration using 16-bit flat TPDF dither:
100Hz at -120dBFS:
TPDF_16bit_100Hz.png
10kHz at -120dBFS:
TPDF_16bit_10kHz.png
Parameters: sample rate = 44.1kHz; DFT length = 256k samples; window = flat top. Averaged over four minutes.
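For anyone who wants to try this at home, here's a rough NumPy version of the same kind of measurement for the flat-dither case (my parameters differ: 32 averaged periodograms, a 64k-point rectangular-window DFT, and the tone placed exactly on a bin to avoid leakage):

```python
import numpy as np

rng = np.random.default_rng(3)
fs, n, avg = 44100, 1 << 16, 32
bin_idx = 4459                        # put the tone exactly on a DFT bin
freq = bin_idx * fs / n               # ~3 kHz
amp_lsb = 32768 * 10 ** (-120 / 20)   # -120 dBFS in 16-bit LSB units

t = np.arange(n) / fs
tone = amp_lsb * np.sin(2 * np.pi * freq * t)

# Average periodograms of the TPDF-dithered, quantized tone
psd = np.zeros(n // 2 + 1)
for _ in range(avg):
    dither = rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)
    psd += np.abs(np.fft.rfft(np.round(tone + dither))) ** 2
psd /= avg

ratio_db = 10 * np.log10(psd[bin_idx] / np.median(psd))
print(f"tone bin sits {ratio_db:.1f} dB above the median noise bin")
```

The -120 dBFS tone, some 22 dB below the 16-bit noise floor, still pokes clearly out of the per-bin noise once the periodograms are averaged.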

Just for fun, here's a 3.5kHz sine at -150dBFS with noise shaped dither (still 16-bit!):
shaped_wan9_16bit_3.5kHz.png
 
doesn’t the softest 26dB simply disappear into the noise.
I recorded the noise in my room using UMIK1. That's what REW shows for this file, in dBFS:
wav-spectrum-dbfs.png


The levels are:
  • -75.8 dBFS (RMS), -81.7 dBFS (C), -88.6 dBFS (A)
(This is without using the calibration file.)

To check the audibility of signals at/below noise floor (at least for single tones) I generated a file with some beeps at different frequencies:
beeps.png


It starts with 2 seconds of silence, then 125 Hz, 250 Hz, 500 Hz, etc.

I mixed that with the room noise at -90 dBFS (peak) and -100 dBFS (peak), so in the first case the beeps should be at (about) the dBFS (A) level of the noise, and in the second case 10 dB below it. Then I amplified the results (noise + beeps) by 30 dB to bring them somewhat into the audible range. The files are in the attachment; you can listen to them yourself.

For completeness, here's what REW shows when directly using UMIK1, this time with the calibration file, in dBFS:
umik1-with-cal-spectrum-dbfs.png


and in dB SPL:
umik1-with-cal-spectrum.png


and the logger:
umik1-with-cal.png


I don't know how trustworthy the absolute numbers are, but that's as quiet as it gets here.
 


I mixed that with the room noise at -90 dBFS (peak) and -100 dBFS (peak), so in the first case the beeps should be at (about) the dBFS (A) level of the noise, and in the second case 10 dB below it.
FWIW, I can hear a 2kHz-5kHz sine chirp down to approx. -4dBSPL (RMS, or -1dBSPL peak) in my listening room (at night with the HVAC off—it doesn't take much to mask it).
 
Since we're talking about signals smaller than the LSB, and each LSB is randomized, each set of sample points gives only a probabilistic indication of any lower level signal. Thus the more sample points, the greater the confidence whether any low level signal exists, and if so the accuracy of its encoding.

I think I see your point. Consider any window in time of N samples. Within this window, low frequency signals have more points in each cycle (higher confidence), but fewer cycles. High frequency signals have fewer points in each cycle (lower confidence), but more cycles. Either way, we have a total of N samples and overall confidence could be the same.

To draw an analogy to A/B testing, where each test is the LSB of a "sample" that might be random. A low frequency signal is like a test subject who does 2 sessions each consisting of 10 tests. A high frequency signal is like a test subject who does 10 sessions each consisting of 2 tests. Either way you have the results of 20 tests (sample points), same confidence either way.

I don't know whether that's how it actually works with the bandwidth of tiny signals that can be encoded through dither, but I see the point you're making, seems intuitive.
That's more or less what I answered in my last post, but better expressed. In the end one can statistically draw any function (below the Nyquist limit), because intuitively you're shaping both frequency and phase in Fourier space.

Vectors with identical norm have a more accurate representation in phase space when the wavelength is low, and a poorer one in frequency space, and vice versa.

1010101010101010101010101010101010101.... can perfectly shape a sine wave


101010101000101010101010111010101000101010

The same wave, in which I randomly flipped some 0s to 1s and some 1s to 0s

Your idea of a low frequency, hence a long wavelength:


11111111000000001111111100000000111111110000000011111111000000....

The same bits with some flipped at random:

11101111000100001110111110000000111110110000000011011111000001
 
Here's a brief experimental demonstration using 16-bit flat TPDF dither:
100Hz at -120dBFS:
TPDF_16bit_100Hz.png
10kHz at -120dBFS:
TPDF_16bit_10kHz.png
Parameters: sample rate = 44.1kHz; DFT length = 256k samples; window = flat top. Averaged over four minutes.

Just for fun, here's a 3.5kHz sine at -150dBFS with noise shaped dither (still 16-bit!):
shaped_wan9_16bit_3.5kHz.png
Got it (unless that answer wasn't meant for me); I'm learning a lot in this thread.

I'm always surprised by how seriously questions are taken at ASR, even though I've been here for more than a year.
 
I have this Beethoven version, both 24-bit and 16-bit, bought from https://www.eclassical.com/labels/bis/beethoven-the-nine-symphonies-2.html

According to the short-term LUFS the quietest parts are in the 2nd movement at around 9:30 and 14:30. Here are the graphs for the 9th and 14th minutes:
loudness.png

I took 10-second snippets starting from 9:39 and 14:39 from both 24-bit and 16-bit (in attachment). You can see that the difference is only in the shaped dither:
fft.lin.png

If someone wants something that gets even quieter, there's Okko Kamu, Lahti S.O. / Sibelius - Symphony No. 1 and 2:
sibelius.kamu-lahti.1.1.snip.png
A wonderful representation; I never thought dither could be shown as a time function. This thread is enormously instructive.
 
A wonderful representation; I never thought dither could be shown as a time function. This thread is enormously instructive.
I'm not sure what you mean by "in a time function". That graph shows FFT magnitude, so it's a function of frequency.

Is shaped dither adapted to frequency? Does it depend on the content, or is it always shaped in the same form?
Maybe there are some advanced algorithms which change the noise shape dynamically, depending on content, but usually the shape is constant. Here are options available in SoX:

The goal is to move the noise out of the band where the ear is most sensitive, up into higher frequencies.
 
Is shaped dither adapted to frequency? Does it depend on the content, or is it always shaped in the same form?
Generally constant. For audio, noise shaping filters are usually roughly based on the absolute threshold of hearing.

The particular filter in my example is the 9-tap filter from this paper: R. A. Wannamaker, "Psychoacoustically Optimal Noise Shaping," J. AES, vol. 40, no. 7/8, July 1992.
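For anyone curious what the mechanics look like, here's a minimal first-order error-feedback shaper in NumPy (far simpler than the 9-tap Wannamaker filter above; a textbook illustration, not production code). It quantizes digital silence with TPDF dither while feeding back the previous quantization error, which pushes the noise power toward high frequencies:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1 << 16
x = np.zeros(n)      # quantize digital silence, so the output is pure noise

out = np.empty(n)
e_prev = 0.0
for i in range(n):
    d = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)  # TPDF dither
    w = x[i] - e_prev          # subtract the previous quantization error
    out[i] = np.round(w + d)   # dithered quantization to integer LSBs
    e_prev = out[i] - w        # error to be shaped away on the next sample

# Compare noise power in the bottom and top quarters of the spectrum
spec = np.abs(np.fft.rfft(out)) ** 2
half = len(spec)
low_band = spec[1 : half // 4].mean()
high_band = spec[3 * half // 4 :].mean()
ratio_db = 10 * np.log10(high_band / low_band)
print(f"top-quarter band carries {ratio_db:.1f} dB more noise than the bottom")
```

The error feedback gives the noise a (1 - z^-1) high-pass shape; practical audio shapers replace that single feedback tap with a longer filter matched to the ear's threshold curve, as in the Wannamaker paper.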
 