
Why is 16-bit undithered a "rough" wave, while 24-bit is smooth?

I like to think that I understand how digital audio works pretty well, but please help me understand this:

I was reading these measurements from Stereophile:


Figure 6 shows an undithered 16-bit tone, whereas figure 7 shows the same with 24-bit data:
"[T]he M51's reproduction of an undithered 16-bit tone at exactly –90.31dBFS was essentially perfect (fig.6), with a symmetrical waveform and the Gibbs Phenomenon "ringing" on the waveform tops well defined. With 24-bit data, the M51 produced a superbly defined sinewave (fig.7). "


Figure 6: [image attachment 340354]

Figure 7: [image attachment 340355]

Why are the two waves so different between 16-bit and 24-bit?

They look different because of resolution. That said, this is not a big deal in practice; the difference is mostly exploited for marketing.
 
The graphs are somewhat misleading because at the last bit of 24-bit, it should look similar to the last bit of 16-bits. Below that is quantization noise. But it looks like the reviewer is testing something specific about sending an un-dithered signal through the device that I don't quite understand.

edit: Oy I forgot that there are many more steps in 24-bit than 16-bit.

In practice, the difference I've found between 16-bit and 24-bit is that with 24-bit you can record audio at line level (around -18 dBFS RMS) and not worry about quantization distortion creeping into the mix as multiple recorded elements are layered, compressed, and pushed louder during mixing, then loudness-maximized during mastering. At 24-bit, preamp noise is going to be the bigger concern :)
 
Why are the two waves so different between 16-bit and 24-bit?

At such a low level, there's almost no resolution left at 16-bit. The measurement shows that the DAC is working well.

This is what an undithered 16-bit 1 kHz sinewave looks like in Adobe Audition, at -90.31 dB:

[image: 1 khz 16-bit audition.png]


1. Generate a 0.006s 1 kHz sinewave, volume 0dB (max), at 32-bit, 44.1 kHz.

2. Use effects, amplitude, amplify/fade, constant amplification and -90.31 dB.

3. Edit, Convert Sample Type, 16-bit, dither disabled.

4. Convert sample type back to 32-bit.

5. Amplify it by 85 dB, and cut it to match the Stereophile measurement.

6. Use Effects, Filters, FFT filter and apply a brickwall filter at 20500 Hz (to match the M51 measurements).
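The same test signal can be generated outside Audition. A minimal numpy sketch of steps 1-3 (my own illustration, not from the post): -90.31 dBFS is exactly one 16-bit LSB, so rounding without dither leaves only three codes.

```python
import numpy as np

fs = 44100                            # sample rate, Hz
t = np.arange(int(0.006 * fs)) / fs   # 0.006 s, as in the steps above

# -90.31 dBFS is one 16-bit LSB: 20*log10(1/32768) ≈ -90.31 dB
amp = 10 ** (-90.31 / 20)             # ≈ 3.05e-5 of full scale
sine = amp * np.sin(2 * np.pi * 1000 * t)

# Quantize to 16-bit with dither disabled (plain rounding)
q = np.round(sine * 32768).astype(np.int16)

print(sorted(set(q.tolist())))        # only three levels survive: [-1, 0, 1]
```

From here, amplifying and brickwall-filtering (steps 4-6) reproduces the stepped waveform with the filter ringing seen in the screenshot.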
 
Why are the two waves so different between 16-bit and 24-bit?
A belated response: I have been performing this test for a long time, because with the 16-bit data it reveals DAC linearity problems.

In the twos-complement encoding used by 16-bit digital audio, –1 least significant bit (LSB) is represented by 1111 1111 1111 1111, digital zero by 0000 0000 0000 0000, and +1 LSB by 0000 0000 0000 0001. If the waveform at exactly -90.31dBFS is symmetrical, this indicates that changing all 16 bits in the digital word gives exactly the same change in the analog output level as changing just the LSB.
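This major-carry transition is easy to illustrate. The sketch below prints the three code words, then models a hypothetical DAC whose sign-bit weight is slightly off (the `msb_error` parameter is my own invention, not a real part's spec) to show why a weight error makes the waveform asymmetric:

```python
def ideal_dac(code):
    """Perfectly weighted DAC: output equals the code value in LSBs."""
    return float(code)

def misweighted_dac(code, msb_error=0.3):
    """Same DAC, but the MSB (sign) weight is off by `msb_error` LSB."""
    msb = (code >> 15) & 1                  # two's-complement sign bit
    return float(code) + (msb_error if msb else 0.0)

for v in (-1, 0, +1):
    print(f"{v:+d} LSB = {v & 0xFFFF:016b} -> "
          f"ideal {ideal_dac(v):+.1f}, misweighted {misweighted_dac(v):+.1f}")
# The -1 level lands near -0.7 instead of -1.0, so the reconstructed
# three-level waveform turns asymmetric: exactly what this test exposes.
```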

With 24-bit data, the test reveals if the DAC's analog noisefloor is low enough for the sinewave not to be obscured by noise.

John Atkinson
Technical Editor, Stereophile
 
So sorry in advance for this stupidly basic question, John:
if every DAC uses the captured digital samples as input to a reconstruction filter (sinc(x) or similar), then according to Shannon the reconstructed analog wave should almost perfectly resemble the original sine wave, as long as the sampling rate is 2x the max frequency captured (which is the case in the graph), no matter the bit depth, only slightly disturbed by the noise floor implied by that bit depth. Then why is this not seen in the 16-bit graph and only in the 24-bit one? Just because of the short distance between the signal at -90 dB and the noise floor at -98 dB? If so, isn't that noise random, thus leading to a random jagged reconstruction instead of the symmetrical Gibbs ringing shown in these graphs?
 
If so, isn't that noise random, thus leading to a random jagged reconstruction instead of the symmetrical Gibbs ringing shown in these graphs?
Generally speaking, quantization noise is not at all random! It can be highly correlated to the signal.

This is why low-level signals generally tend to require dither to help them make it over this nonlinearity, similar to what high-frequency AC bias does for tape. This is what then turns quantization noise into uncorrelated noise. If the signal is left undithered, this does not happen.
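The effect is easy to see numerically. In this sketch (my own illustration, assuming numpy), a sine of about one LSB peak is quantized with and without TPDF dither; undithered, the error energy piles up in discrete odd harmonics of the tone, while dither spreads the same energy into a flat, uncorrelated noise floor:

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n = 44100, 4410                          # 0.1 s, exactly 100 cycles of 1 kHz
t = np.arange(n) / fs
x = 1.0003 * np.sin(2 * np.pi * 1000 * t)    # ≈ -90.31 dBFS, in LSB units

plain = np.round(x)                          # undithered: locks to -1/0/+1
tpdf = rng.uniform(-.5, .5, n) + rng.uniform(-.5, .5, n)
dithered = np.round(x + tpdf)                # TPDF dither, ±1 LSB peak

def harmonic(sig, k):
    """Magnitude (in LSBs) at k kHz; with n = 4410, that is DFT bin 100*k."""
    return abs(np.fft.rfft(sig)[100 * k]) / n

# The undithered wave has a strong discrete 5 kHz component;
# the dithered one shows only the (much lower) noise floor there.
print(harmonic(plain, 5), harmonic(dithered, 5))
```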

BTW, you can generate some high-level test signals that have the neat property of being "self-dithering", i.e. if you remove the signal you will actually get a residual resembling white noise (which will be lower in amplitude than when adding dither). For example, in a sine wave, the frequency needs to be chosen such that each cycle is being sampled at slightly different points for as long as possible before the pattern starts to repeat.
1 kHz straight makes a very bad choice because the pattern repeats every 441 samples at 44.1 kHz, or even every 48 samples at 48 kHz. (And periodicity in the time domain equates to sampling in the frequency domain, so quantization noise ends up forming a bunch of spikes every 44100/441 = 100 Hz.)
With prime multiples of a fraction of 1 Hz in the 1 kHz vicinity we have so far routinely made it to 10-20 second pattern lengths at 16 bit, even into the low hundreds of seconds at times. For example, 999.91 Hz or 1000.03 Hz fit the bill, but some prime fractions of 1 kHz have also shown good results. The finite resolution ultimately limits pattern lengths to below what you would theoretically expect (which would potentially be into the millions of years) as slightly different values end up being rounded to be the same. Finding the actual maximum would make an interesting mathematical problem.
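For an ideal (unquantized) sine, the repeat length is straightforward to compute: write the frequency as a fraction p/q Hz, and the pattern repeats after fs·q / gcd(p, fs·q) samples. A sketch (my own, not from the post):

```python
from fractions import Fraction
from math import gcd

def pattern_length(freq_hz, fs=44100):
    """Samples until an ideal sine at freq_hz repeats exactly at rate fs.
    (Real 16-bit quantization makes the observed pattern shorter.)"""
    f = Fraction(freq_hz).limit_denominator(10**6)
    # Need N * f / fs to be an integer: N = fs*q / gcd(p, fs*q) for f = p/q
    return fs * f.denominator // gcd(f.numerator, fs * f.denominator)

print(pattern_length(1000))              # 441 samples -> spikes every 100 Hz
print(pattern_length(1000, 48000))       # 48 samples
print(pattern_length("999.91") / 44100)  # 100.0 s before the ideal pattern repeats
```

999.91 Hz works out so well because 99991 is prime, so nothing cancels against 44.1 kHz.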
 
With undithered 16-bit data representing a signal at exactly -90.31dBFS, there are only 3 DC voltage levels in the reconstructed analog signal: +1, 0, and -1 LSB. By definition, these cannot represent a sinewave. However, with the NAD M66's very low noisefloor you can see the leading edges of the transitions between these levels overlaid with the minimum-phase impulse response of the reconstruction filter. See figs.12 & 15 at https://www.stereophile.com/content/nad-m66-streaming-preamplifier-measurements-page-2.

John Atkinson
Technical Editor, Stereophile
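That reconstruction can be sketched numerically (my own illustration, assuming numpy; an ideal brickwall interpolator via FFT zero-padding stands in for the M66's minimum-phase filter): quantizing the -90.31 dBFS sine leaves only the levels -1, 0, +1, and interpolating those samples produces the overshoot ("ringing") visible on the waveform tops.

```python
import numpy as np

fs = 44100
n = 441                               # one full repeat of the 1 kHz pattern
t = np.arange(n) / fs

# Undithered 16-bit sine at -90.31 dBFS: only the levels -1, 0, +1 remain
levels = np.round(1.0003 * np.sin(2 * np.pi * 1000 * t))

# Ideal (brickwall) reconstruction: zero-pad the spectrum to upsample 8x
spec = np.fft.rfft(levels)
up = np.fft.irfft(spec, n=8 * n) * 8  # *8 restores amplitude after padding

print(up.max())   # > 1: Gibbs overshoot on the waveform tops
```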
 
So the steps are likely much smaller...
The "steps" are the same size whether 16-bit or 24-bit sample size; 24-bit just has more of them. A -90 dB sine wave, while at the limit of 16-bit accuracy, would have many more bits to work with if captured in 24-bit. If you measured a -140 dB sine wave in 24-bit, you'd see a similar distortion as in the original post.
 
The "steps" are the same size whether 16-bit or 24-bit sample size; 24-bit just has more of them. A -90 dB sine wave, while at the limit of 16-bit accuracy, would have many more bits to work with if captured in 24-bit. If you measured a -140 dB sine wave in 24-bit, you'd see a similar distortion as in the original post.
That depends upon the output level (actually the LSB level, since we're being pedantic). If the DAC's maximum output is 4 V for 16 or 24 bits, assuming the same buffer gain, then the 24-bit DAC will have much smaller steps. In practice, at least in my decades of design, that is most often the case, as creating a DAC with 256 times the output voltage is usually impractical. The analogy falls apart somewhat with delta-sigma designs, but no matter the architecture the effective LSB size is usually much smaller for a 24-bit DAC vs. a 16-bit DAC. At least in my experience.
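In numbers, assuming the same 4 V full scale for both converters (the figure used above):

```python
full_scale = 4.0             # volts, example full-scale output from the post

lsb16 = full_scale / 2**16   # ≈ 61 microvolts per step
lsb24 = full_scale / 2**24   # ≈ 0.24 microvolts per step

print(lsb16 / lsb24)         # 256.0: the 24-bit step is 256x smaller
```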
 
If dithering the 16-bit signal has decreased the SNR to 93 dB, what should a reasonable SNR for the DAC be, given that independent random noises add as the square root of the sum of their squares?
Is there any justification for using a 20-bit, 120 dB SNR DAC to decode 16-bit data?
 
If dithering the 16-bit signal has decreased the SNR to 93 dB, what should a reasonable SNR for the DAC be, given that independent random noises add as the square root of the sum of their squares?
Is there any justification for using a 20-bit, 120 dB SNR DAC to decode 16-bit data?

Increasing precision reduces errors in intermediate computations.
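A sketch of the quadrature-addition arithmetic behind the question: uncorrelated noise powers add, so SNRs combine as below. With a 93 dB source, even a 120 dB DAC leaves the total essentially unchanged, while a 96 dB DAC already costs almost 2 dB:

```python
import math

def combined_snr_db(*snrs_db):
    """Total SNR when independent noise sources add in power (quadrature)."""
    noise_power = sum(10 ** (-s / 10) for s in snrs_db)
    return -10 * math.log10(noise_power)

# Dithered 16-bit source (~93 dB) through DACs of various SNRs:
for dac_snr in (96, 100, 110, 120):
    print(dac_snr, round(combined_snr_db(93, dac_snr), 2))
```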
 
If a dithered 16-bit signal is put through computation yielding 24 bits, are the extra 8 bits polluters or goodies?
If only the upper 16 bits of the 24 yielded are used, is it still dithered?
 
If a dithered 16-bit signal is put through computation yielding 24 bits, are the extra 8 bits polluters or goodies?
They start out as zeros, whether the 16-bit audio is dithered or not. As soon as you do any processing (even a small volume change), they are likely to be filled with "data", but it's usually just rounding errors.

If you dither the 24-bit audio, the "new" least significant bits will contain noise, and of course the original 16-bit dither noise will still exist too.

If only the upper 16 bits of the 24 yielded are used, is it still dithered?
The dither noise remains the same in the 24-bit file.

With most processing, I'm not convinced that up-sampling helps with quality. But there are exceptions. For example, if you are doing something like a fade-out (or another level reduction) and you keep it at 24 bits, the fade can continue down to around -144 dB, whereas 16-bit audio goes dead silent at -96 dB. (But you're not likely to hear anything at -90 dB anyway.) Also, mixing is done by addition, so it can effectively increase bit depth; but converting to 24 bits before summing doesn't help by itself, because the summed result has to be stored at a higher bit depth.

Most processing is done in 32-bit (or sometimes 64-bit) floating point. Since floating point has virtually no headroom limit, intermediate calculations (such as summation or multiplication) can go over 0 dB without clipping.
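A quick check of the fade-out point (my own illustration, assuming numpy and undithered requantization): a tone faded down to -100 dBFS, below the 16-bit floor, rounds to digital silence at 16 bits but survives at 24 bits.

```python
import numpy as np

# A tone at -100 dBFS, below the ~-96 dB floor of 16-bit
x = 10 ** (-100 / 20) * np.sin(2 * np.pi * np.arange(480) / 48)

q16 = np.round(x * 2**15) / 2**15   # undithered 16-bit quantization
q24 = np.round(x * 2**23) / 2**23   # same signal at 24 bits

# 16-bit: every sample rounds to zero (dead silence); 24-bit: still nonzero
print(np.any(q16), np.any(q24))
```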
 
If I understood correctly, the extra rounding bits should be neglected: 16-bit input should remain 16-bit output. A YouTube video by a serious DAC manufacturer says DACs with more than 16 bits don't sound good; I suppose he includes the extra bits.
My ears are too old to distinguish subtle differences in high frequencies, so I rely on what others hear.
I will use the SRC4190 to convert 44.1 kHz to 96 kHz and use 16-, 18-, and 20-bit DACs from those that are not sought after, running only 16 bits. The SRC4190 adds dither, so I will output 24 bits and load only the first 16 bits into the DACs. If dither is needed, it will be switched to 16-bit output.
 
Maybe neither here nor there, but dither is often (usually?) added to 16-bit recordings at the mastering stage, so the choice of DAC is sort of moot in those cases.
 
I simulated a ±32-step integer sine wave in LTspice.
If the DAC is 16-bit perfect, then the minimum distortion is 1.4% at 44.1 kHz, 1.1% at 96 kHz, and 1.07% at 192 kHz.
That is: at 0 dB, the minimum is -96 dB; at -60 dB, minimum -36 dB.
This means that 16-bit DACs with lower distortion have less than 16-bit monotonicity. For example, the AD1851 has 0.9% THD+N (0.7% THD) with 14-bit monotonicity.
This is why higher-bit-depth DACs whose 16 major bits are very linear don't sound as good.
The TDA1541A with 3x crown can go down to -47 dB with THD+N less than 0.4% THD, because the selected ones happen to be nonlinear; this is what I conclude.
 
Distortion can come from many things other than nonmonotonicity, like a bow in the INL/DNL curves, or distortion in the output buffer, regardless of how perfect the actual DAC's steps are.
 
We are dealing with the lowest 6 bits plus the sign bit to describe ±32 levels.
Why does a DAC with 4 good bits and the 2 lowest bits corrupt reproduce lower distortion than the perfect 6-bit ones?
At this level, 0.5%-2%, the I/V distortion is negligible.
INL/DNL are useful when all bits are monotonic, I suppose.
 