
Dither questions

q3cpma

Major Contributor
Joined
May 22, 2019
Messages
3,060
Likes
4,418
Location
France
Hello,

as I make and use my own music player, I have a few questions about dither. Currently, I only output 16-bit PCM and use float32 for ReplayGain scaling:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
* I plan on making 24- and 32-bit output possible, for further processing; is dither even necessary at that point? The quantization error should be quite small.
* No need for float64, right?
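For reference, the clip-on-overflow approach described in the first question can be sketched as below. This is only an illustration, not the player's actual code: `uniform01`, `tpdf`, `round_nearest` and `dither_clamp16` are hypothetical names, the xorshift generator is just a stand-in for any uniform RNG, and the sample is assumed to be a float already scaled to int16 units.

```c
#include <stdint.h>

/* Hypothetical helper: xorshift32 PRNG mapped to [0, 1). Any uniform
 * generator would do; this one keeps the sketch self-contained. */
static uint32_t rng_state = 0x12345678u;

static float uniform01(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	return (float)(rng_state >> 8) * (1.0f / 16777216.0f);
}

/* TPDF dither in (-1, +1) LSB: difference of two uniform values. */
static float tpdf(void)
{
	return uniform01() - uniform01();
}

/* Round to nearest, ties away from zero (lrintf from <math.h> is the
 * usual alternative). */
static long round_nearest(float y)
{
	return (long)(y >= 0.0f ? y + 0.5f : y - 0.5f);
}

/* Add dither, then clamp to the int16 range instead of wrapping. */
static int16_t dither_clamp16(float x)
{
	float y = x + tpdf();
	if (y > 32767.0f)
		y = 32767.0f;
	if (y < -32768.0f)
		y = -32768.0f;
	return (int16_t)round_nearest(y);
}
```

Near full scale the clamp cuts off one side of the triangular distribution, which is the concern raised in the replies.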
 

boXem

Major Contributor
Audio Company
Joined
Jun 19, 2019
Messages
2,018
Likes
4,901
Location
Europe
Hello,

as I make and use my own music player, I have a few questions about dither. Currently, I only output 16-bit PCM and use float32 for ReplayGain scaling:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
* I plan on making 24- and 32-bit output possible, for further processing; is dither even necessary at that point? The quantization error should be quite small.
* No need for float64, right?
Hello,
* my understanding is that you should avoid clipping the dithered signal, otherwise the dither is no longer TPDF, so you need to leave the necessary headroom for it.
* not sure about the theory here, but I seem to remember that the effects of quantization errors are not directly proportional to the errors themselves. So I would dither even at 24 and 32 bits. Once the code is written for 16 bits, 24 and 32 bits should be quite straightforward, so why spend time removing something that works?
* wouldn't using float64 during the conversion to integer allow better accuracy? Something like int16_value = (int16)((float64)float32_value * 65534 + dither). I am an embedded specialist; we avoid floats like hell since they eat a lot of resources, so I am frankly rusty on the subject.

Did you play with FFTs of the output of your player?
 
OP
q3cpma

q3cpma

Major Contributor
Joined
May 22, 2019
Messages
3,060
Likes
4,418
Location
France
Hello,
* my understanding is that you should avoid clipping the dithered signal, otherwise the dither is no longer TPDF, so you need to leave the necessary headroom for it.
But it means that I must keep my input between 1 and MAX - 1, which doesn't seem perfect. On the other hand, 0 and MAX samples seem rare, which should make it OK.

* not sure about the theory here, but I seem to remember that the effects of quantization errors are not directly proportional to the errors themselves. So I would dither even at 24 and 32 bits. Once the code is written for 16 bits, 24 and 32 bits should be quite straightforward, so why spend time removing something that works?
As you said, I'd like to avoid doing a float32 or float64 multiplication for every sample, as it can be heavy on uarchs without SIMD.
I'm interested in that bit about the "effects" of quantization, as it runs contrary to logic (at least for me).

* wouldn't using float64 during the conversion to integer allow better accuracy? Something like int16_value = (int16)((float64)float32_value * 65534 + dither). I am an embedded specialist; we avoid floats like hell since they eat a lot of resources, so I am frankly rusty on the subject.
Well, that's the goal, but it would be a lot slower, especially on 32-bit uarchs like ARMv7. As binary32 can represent every integer exactly only up to 2^24 (about 7 decimal digits) and binary64 up to 2^53 (about 15), I'll probably use binary64 for higher bit depths.
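The integer range of each float format can be checked with a round trip. A minimal sketch, with hypothetical helper names: binary32 has a 24-bit significand, so every integer up to 2^24 = 16777216 survives, and binary64 has 53 bits, so every integer up to 2^53 does.

```c
#include <stdint.h>

/* Returns 1 if the integer n survives a round trip through float32.
 * volatile forces the value to be stored at float precision. */
static int exact_in_float32(int32_t n)
{
	volatile float f = (float)n;
	return (int32_t)f == n;
}

/* Returns 1 if the integer n survives a round trip through float64. */
static int exact_in_float64(int64_t n)
{
	volatile double d = (double)n;
	return (int64_t)d == n;
}
```

A 24-bit sample on its own fits in binary32, but gain scaling and dither add fractional bits below the LSB, which is where the extra precision of binary64 helps.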

Did you play with FFTs of the output of your player?
No, just did some average of absolute error to convince myself that dither is useful.
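A related check that is easy to run (not the exact measurement mentioned above, just an assumed variant of it) is to quantise a constant sub-LSB value many times and look at the mean of the output. Without dither the value 0.3 always rounds to 0; with TPDF dither the mean should come out close to 0.3. The names `uniform01`, `round_nearest` and `mean_quantised` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: xorshift32 PRNG mapped to [0, 1). */
static uint32_t rng_state = 0x2545F491u;

static float uniform01(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	return (float)(rng_state >> 8) * (1.0f / 16777216.0f);
}

/* Round to nearest, ties away from zero. */
static long round_nearest(float y)
{
	return (long)(y >= 0.0f ? y + 0.5f : y - 0.5f);
}

/* Mean of n quantised outputs for a constant input x (in LSB units),
 * with or without TPDF dither in (-1, +1) LSB. */
static double mean_quantised(float x, int n, int use_dither)
{
	double sum = 0.0;
	for (int i = 0; i < n; ++i) {
		float d = use_dither ? uniform01() - uniform01() : 0.0f;
		sum += (double)round_nearest(x + d);
	}
	return sum / n;
}
```

Calling `mean_quantised(0.3f, 100000, 0)` and `mean_quantised(0.3f, 100000, 1)` shows the difference: the undithered quantiser loses the signal entirely, while the dithered one preserves it on average at the cost of added noise.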
 

boXem

Major Contributor
Audio Company
Joined
Jun 19, 2019
Messages
2,018
Likes
4,901
Location
Europe
But it means that I must keep my input between 1 and MAX - 1, which doesn't seem perfect. On the other hand, 0 and MAX samples seem rare, which should make it OK.


As you said, I'd like to avoid doing a float32 or float64 multiplication for every sample, as it can be heavy on uarchs without SIMD.
I'm interested in that bit about the "effects" of quantization, as it runs contrary to logic (at least for me).


Well, that's the goal, but it would be a lot slower, especially on 32-bit uarchs like ARMv7.


No, just did some average of absolute error to convince myself that dither is useful.
Sorry for the stupid question, but what's the point of float32 instead of int32 for volume control?

I played a bit with octave doing FFTs to compare dithered and undithered signals and the results were quite spectacular.
 
OP
q3cpma

q3cpma

Major Contributor
Joined
May 22, 2019
Messages
3,060
Likes
4,418
Location
France
Sorry for the stupid question, but what's the point of float32 instead of int32 for volume control?

I played a bit with octave doing FFTs to compare dithered and undithered signals and the results were quite spectacular.
Well, there's no volume control in my program, it's just the result of ReplayGain. I could take the resulting gain, left shift it, store it in a uint32, then multiply the sample and right shift the resulting int64, but that does seem to get complicated.
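The integer-only route described here could look roughly like this. It is a sketch under assumptions: the Q16.16 format, the macro `GAIN_FRAC_BITS` and both function names are made up for illustration, and right-shifting a negative `int64_t` is assumed to be an arithmetic shift (true on mainstream compilers, but implementation-defined in C).

```c
#include <stdint.h>

#define GAIN_FRAC_BITS 16 /* assumed Q16.16 fixed-point format */

/* Convert a linear gain to fixed point. Done once per track, so this
 * float multiply is not in the per-sample path. */
static uint32_t gain_to_fixed(float gain)
{
	return (uint32_t)(gain * (float)(1u << GAIN_FRAC_BITS) + 0.5f);
}

/* Scale one sample using only integer operations. */
static int32_t apply_gain_fixed(int16_t sample, uint32_t gain_q16)
{
	int64_t prod = (int64_t)sample * (int64_t)gain_q16;
	/* Add half an LSB before shifting so the result rounds to nearest. */
	prod += (int64_t)1 << (GAIN_FRAC_BITS - 1);
	return (int32_t)(prod >> GAIN_FRAC_BITS);
}
```

In this layout, dither would be added to `prod` (scaled to the same 16 fractional bits) in place of the fixed half-LSB offset, before the shift discards the fraction.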
 
OP
q3cpma

q3cpma

Major Contributor
Joined
May 22, 2019
Messages
3,060
Likes
4,418
Location
France
Right. But other than scaling to provide enough headroom for dither to overflow the normal range, I don't see anything else you can do here.
Yeah. This isn't the most important thing, honestly, but doing "sample * 0.999... + dither" for every sample could save me a branch each time. Will probably do it like this.
 

pozz

Слава Україні
Forum Donor
Editor
Joined
May 21, 2019
Messages
4,036
Likes
6,827

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,705
Location
Hampshire
If the input is well-behaved, it is extremely unlikely for the dither to clip on several consecutive samples. I wouldn't worry about it.
 

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,705
Location
Hampshire
Sorry for the stupid question, but what's the point of float32 instead of int32 for volume control?
Float lets you reduce the amplitude without losing precision. Of course, you still have to convert to integer for output to the DAC, so it's a bit pointless. As intermediate format between processing stages, it does make sense, though.
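A small sketch of that point, with hypothetical function names: attenuate a sample and then restore it, once in integer and once in float arithmetic. A power-of-two factor is used so that the float result is exact; with an arbitrary gain the float result is still rounded, but to the 24-bit significand rather than to whole LSBs.

```c
#include <stdint.h>

/* Attenuate by 2^shift and restore, in integer arithmetic. */
static int32_t int_round_trip(int32_t sample, int shift)
{
	int32_t reduced = sample / (1 << shift); /* low bits are discarded */
	return reduced * (1 << shift);
}

/* Same operation in float: the exponent changes, the significand is kept. */
static float float_round_trip(float sample, int shift)
{
	float scale = (float)(1 << shift);
	volatile float reduced = sample / scale;
	return reduced * scale;
}
```

For example, 12345 attenuated by 2^8 and restored comes back as 12288 in integer arithmetic and as 12345 in float.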
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,772
Likes
6,203
Location
Berlin, Germany
Hello,

as I make and use my own music player, I have a few questions about dither. Currently, I only output 16-bit PCM and use float32 for ReplayGain scaling:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
* I plan on making 24- and 32-bit output possible, for further processing; is dither even necessary at that point? The quantization error should be quite small.
* No need for float64, right?
IMHO, the "right way" would be like this:
- convert input to float for gain change.
- apply gain (with saturation one LSB below +FS and -FS at the equivalent bit depth), all in the float domain. Use a symmetrical span (-FS and +FS have the same absolute value, 32767 in the case of 16 bits).
- add one LSB of dither (float).
- quantise to integer output.
There is always a gain applied, and that gain must always be < 1.0 for a full-scale input, the exact limit being 32766/32767 for 16 bits; this makes sure you have one LSB of headroom left for the dithering.
You can of course optimize for the special case of gain = 1 and then skip directly over the whole floating point block, simply making output=input.
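The steps above can be sketched as follows for 16-bit output. This is one possible reading of the recipe, not a reference implementation: `gain_dither16` and `round_nearest` are hypothetical names, and the dither value (assumed to lie in (-1, +1) LSB) is passed in by the caller so that the function stays deterministic.

```c
#include <stdint.h>

/* Round to nearest, ties away from zero. */
static long round_nearest(float y)
{
	return (long)(y >= 0.0f ? y + 0.5f : y - 0.5f);
}

/* Gain, saturation, dither and quantisation for one 16-bit sample.
 * dither is assumed to be a TPDF value in (-1, +1) LSB. */
static int16_t gain_dither16(int16_t in, float gain, float dither)
{
	float x = (float)in * gain;   /* convert to float and apply gain */
	if (x > 32766.0f)             /* saturate one LSB below +FS */
		x = 32766.0f;
	if (x < -32766.0f)            /* symmetric span: one LSB above -32767 */
		x = -32766.0f;
	x += dither;                  /* add the dither in the float domain */
	return (int16_t)round_nearest(x); /* quantise to integer output */
}
```

Because the saturation leaves one LSB on each side, the dithered value always stays within ±32767 and the quantiser never clips.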

Acoustically, I wouldn't bother though. When you have full-scale samples in the stream, something is wrong anyway (outright clipping, or at least a risk of intersample overs) and the imperfect dither is irrelevant.
 
OP
q3cpma

q3cpma

Major Contributor
Joined
May 22, 2019
Messages
3,060
Likes
4,418
Location
France
Thanks, I'm convinced that assuming the input (audio data and ReplayGain peak/gain values) is well formed is the sane way of doing it.

That's what I'm doing in the end; the headroom multiplier was chosen to be as large as possible while still reducing the extreme values by one after rounding:
Code:
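#include <math.h>   /* lrintf */
#include <stddef.h> /* size_t */
#include <stdint.h> /* int16_t */
#include <stdlib.h> /* drand48 (POSIX); assumed to be what _drand48 wrapped */

/* Largest multiplier that still leaves one LSB of headroom after rounding. */
#define INT16_HEADROOM_MULT (32767.497f / 32768.f)

/* Quantise x (in int16 units) with TPDF dither: the difference of two
 * consecutive uniform values lies in (-1, +1) LSB. */
static inline int16_t triangle_dither(float x)
{
	static float prev_rand = 0.f;
	const float r = (float)drand48();
	const float tmp = x * INT16_HEADROOM_MULT + r - prev_rand;
	prev_rand = r;
	return (int16_t)lrintf(tmp);
}

/* Apply a linear gain in place; assumes the gain keeps samples in range. */
void apply_gain(int16_t *restrict buf, size_t nbsample, float gain)
{
	if (gain == 1.f)
		return; /* unity gain: leave the samples bit-exact */

	for (size_t i = 0; i < nbsample; ++i)
		buf[i] = triangle_dither(buf[i] * gain);
}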

Though I'm still not at all certain that dithering is needed for 24- or 32-bit output.
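For a sense of scale, the standard textbook figure for an N-bit quantiser driven by a full-scale sine, assuming the quantization error is uniform and uncorrelated with the signal, is given below. TPDF dither of ±1 LSB adds noise with twice the power of the quantization error itself, so the total noise power triples and the ratio drops by about 4.77 dB. This only gives the noise floor; it does not settle whether undithered truncation at these depths is audible.

```latex
\[
\mathrm{SQNR}_{\text{undithered}} \approx 6.02\,N + 1.76\ \mathrm{dB},
\qquad
\mathrm{SNR}_{\text{TPDF}} \approx 6.02\,N + 1.76 - 4.77\ \mathrm{dB}
\]
\[
\begin{array}{c|c|c}
N & \text{undithered (dB)} & \text{with TPDF dither (dB)} \\ \hline
16 & 98.1 & 93.3 \\
24 & 146.2 & 141.5 \\
32 & 194.4 & 189.6
\end{array}
\]
```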
 

boXem

Major Contributor
Audio Company
Joined
Jun 19, 2019
Messages
2,018
Likes
4,901
Location
Europe
Thanks, I'm convinced that assuming the input (audio data and ReplayGain peak/gain values) is well formed is the sane way of doing it.

That's what I'm doing in the end; the headroom multiplier was chosen to be as large as possible while still reducing the extreme values by one after rounding:
Code:
#define INT16_HEADROOM_MULT (32767.497f / 32768.f)
static inline int16_t triangle_dither(float x)
{
    static float prev_rand = 0.f;
    const float rand = _drand48();
    const float tmp = x * INT16_HEADROOM_MULT + rand - prev_rand;
    prev_rand = rand;
    return lrintf(tmp);
}

void apply_gain(int16_t *restrict buf, size_t nbsample, float gain)
{
    if (gain == 1.f)
        return;

    for (size_t i = 0; i < nbsample; ++i)
        buf[i] = triangle_dither(buf[i] * gain);
}

Though I'm still not at all certain that dithering is needed for 24- or 32-bit output.
I fear this is still clipping. I would do
Code:
#define INT16_HEADROOM_MULT (32766.f/32767.f)
 
OP
q3cpma

q3cpma

Major Contributor
Joined
May 22, 2019
Messages
3,060
Likes
4,418
Location
France
I fear this is still clipping. I would do
Code:
#define INT16_HEADROOM_MULT (32766.f/32767.f)
I experimented: it works because of the lrintf rounding (or trunc if you're dirty). I actually started with the same value as yours, then tried .5, which didn't work, then tried .499, which did but produced 32767.50000f for 32768.f, which made me worry I'd be relying on the default rounding direction too much.
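This kind of experiment can be turned into a small worst-case check. The sketch below is an assumption about how one might test a candidate multiplier, not the code that was actually run: `headroom_ok` and `round_nearest` are hypothetical names, the dither is bounded conservatively by ±1 LSB, and ties are rounded away from zero instead of using `lrintf`, so results right at a .5 boundary can differ from the original test.

```c
#include <stdint.h>

/* Round to nearest, ties away from zero. */
static long round_nearest(float y)
{
	return (long)(y >= 0.0f ? y + 0.5f : y - 0.5f);
}

/* Returns 1 if full-scale inputs scaled by mult still fit in int16 after
 * the worst-case dither of +/-1 LSB and rounding, 0 otherwise. */
static int headroom_ok(float mult)
{
	volatile float hi = 32767.0f * mult + 1.0f;
	volatile float lo = -32768.0f * mult - 1.0f;
	return round_nearest(hi) <= INT16_MAX && round_nearest(lo) >= INT16_MIN;
}
```

Both multipliers from the thread pass this check, while a multiplier of 1.0 fails because 32767 + 1 lands on 32768.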
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
I've seen some dither plugins keep digital silence in the original data unchanged (which can actually make a file converter creating smaller flac files due to the absence of additional noise). I've also seen dither plugins skip dithering at full scale to avoid clipping, so I don't think there is anything wrong here.
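One way such a plugin might behave is sketched below. This is a guess at the logic, not taken from any particular product: `dither_selective16` and `round_nearest` are hypothetical names, the input is assumed to be a float in int16 units within the int16 range, and the dither value is supplied by the caller.

```c
#include <stdint.h>

/* Round to nearest, ties away from zero. */
static long round_nearest(float y)
{
	return (long)(y >= 0.0f ? y + 0.5f : y - 0.5f);
}

/* Quantise with dither, except where dither would disturb digital
 * silence or push the sample past full scale. */
static int16_t dither_selective16(float x, float dither)
{
	if (x == 0.0f)
		return 0; /* keep digital silence untouched */

	long q = round_nearest(x + dither);
	if (q > INT16_MAX || q < INT16_MIN)
		q = round_nearest(x); /* skip dither rather than clip */

	/* Final guard in case x itself was out of range. */
	if (q > INT16_MAX)
		q = INT16_MAX;
	if (q < INT16_MIN)
		q = INT16_MIN;
	return (int16_t)q;
}
```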
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
I tried to reduce the amplitude of the resulting waveform without skipping dither or changing the desired gain. When the input samples are negative, bias the dither towards positive, and vice versa.
C#:
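    double Dither(double sam)
    {
        // Uniform value in (-1, 3]; 'ran' is assumed to be a System.Random declared elsewhere.
        var pdf = ran.NextDouble() * -4 + 3;
        // Fold everything above 2/3 back down in steps of 1, ending in (-1, 2/3].
        while (pdf > 2d / 3)
            pdf--;
        // Mirror the dither for non-positive samples, then round to the nearest integer.
        return Math.Round(sam + (sam > 0 ? pdf : -pdf));
    }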

The white waveform includes 0.48 Hz and 480 Hz tones. The blue ones are reduced by 60 dB, with dither. The left one is my version and the right one is unshaped TPDF dither from Adobe Audition. Audio files are in the attachment.
dw.PNG


Differences are easily visible in the time domain; in the frequency domain they are more or less the same, and my algorithm is not audibly quieter. I tried it with some real music and it doesn't seem to introduce obvious artifacts the way RPDF does. Hopefully RNGs in different languages won't affect the results. It's C#; both the RNG and the rounding return double, so it doesn't make much sense to use float or int in the middle. Integer conversion is done outside of the method to reduce unnecessary casting.
C#:
                            while (sampleCnt-- > 0)
                            {
                                double sam = br.ReadInt16();
                                sam *= .001;
                                bw.Write((short)Dither(sam));
                            }

P.S. The TPDF dither in Audacity and some other software is +/-1 LSB but slightly shaped (blue); I am comparing my dither with the non-shaped one in Audition (white).
audacity.png
 

Attachments

  • myDither.zip
    677.6 KB · Views: 81

earlevel

Addicted to Fun and Learning
Joined
Nov 18, 2020
Messages
551
Likes
779
Maybe someone running into this thread will appreciate this:

 