Dither questions

q3cpma · Aug 12, 2020

Hello,

As I make and use my own music player, I have a few questions about dither. Currently, I only output 16-bit PCM and use float32 for ReplayGain scaling:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
* I plan on making 24 and 32 bit output possible, for further processing; is dither even necessary at that point? The quantization error should be quite small.
* No need for float64, right?
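
A minimal sketch of the dither-then-clip approach described in the first bullet, for reference. It assumes samples are already in 16-bit units (after gain) and that the TPDF dither is the difference of two uniform values, which is one common way to get a triangular PDF spanning ±1 LSB. The names `uniform01`, `tpdf_dither` and `quantize16_clamped` are hypothetical, the xorshift32 generator is only a stand-in for whatever RNG the player really uses, and rounding is done with a cast instead of `lrintf` to keep the sketch free of libm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): xorshift32, uniform in [0, 1). */
static uint32_t rng_state = 0x12345678u;

static float uniform01(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	/* Keep the top 24 bits so the result is exact in binary32 and < 1. */
	return (float)(rng_state >> 8) * (1.0f / 16777216.0f);
}

/* Triangular-PDF dither in (-1, +1) LSB: difference of two uniform values. */
static float tpdf_dither(void)
{
	const float a = uniform01();
	const float b = uniform01();
	return a - b;
}

/* Add dither, round to nearest, then clamp to the int16 range.
 * x is in 16-bit sample units, e.g. sample * gain. */
static int16_t quantize16_clamped(float x)
{
	assert(x > -1e6f && x < 1e6f);	/* keeps the cast to long well defined */
	const double y = (double)x + tpdf_dither();
	long q = (long)(y < 0 ? y - 0.5 : y + 0.5);
	if (q > INT16_MAX)
		q = INT16_MAX;
	if (q < INT16_MIN)
		q = INT16_MIN;
	return (int16_t)q;
}
```

With this shape, clipping only changes the result when the input is already within one LSB of full scale; everywhere else the dither keeps its full triangular distribution.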

boXem · Aug 12, 2020

q3cpma said:
Hello,

As I make and use my own music player, I have a few questions about dither. Currently, I only output 16-bit PCM and use float32 for ReplayGain scaling:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
* I plan on making 24 and 32 bit output possible, for further processing; is dither even necessary at that point? The quantization error should be quite small.
* No need for float64, right?

Hello,
* My understanding is that you should avoid clipping the dithered signal, otherwise the dither is no longer TPDF, so you need to leave the necessary headroom for it.
* I'm not sure about the theory here, but I seem to remember that the effects of quantization errors are not directly proportional to the errors themselves, so I would dither even at 24 and 32 bits. Once the code is written for 16 bits, 24 and 32 bits should be quite straightforward, so why spend time removing something that works?
* Wouldn't using float64 during the conversion to integer give better accuracy? Something like int16_value = (int16)((float64)float32_value * 65534 + dither). I am an embedded specialist and we avoid float like hell since it eats a lot of resources, so I am frankly rusty on the float subject.

Did you play with FFTs of the output of your player?

q3cpma · Aug 12, 2020

Fred Jacquot said:
Hello,
* my understanding is that you should avoid clipping the dithered signal, otherwise the dithering is no more TPDF, so you need to have the headroom necessary for it.

But it means that I must keep my input between 1 and MAX - 1, which doesn't seem perfect. On the other hand, 0 and MAX samples seem rare, which should make it OK.

* I'm not sure about the theory here, but I seem to remember that the effects of quantization errors are not directly proportional to the errors themselves, so I would dither even at 24 and 32 bits. Once the code is written for 16 bits, 24 and 32 bits should be quite straightforward, so why spend time removing something that works?

As you said, I'd like to avoid doing float32 or float64 multiplication for every sample as it can be heavy on uarchs without SIMD.
I'm interested in that bit about the "effects" of quantization, as it runs contrary to logic (at least for me).

* Wouldn't using float64 during the conversion to integer give better accuracy? Something like int16_value = (int16)((float64)float32_value * 65534 + dither). I am an embedded specialist and we avoid float like hell since it eats a lot of resources, so I am frankly rusty on the float subject.

Well, that's the goal, but it would be a lot slower, especially on 32-bit uarchs like ARMv7. As binary32 can exactly represent every integer only up to 2^24 (about 7 decimal digits) and binary64 up to 2^53 (about 15 digits), I'll probably use binary64 for higher bit depths.
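
The integer limits of the two float formats can be checked directly. The sketch below is only an illustration, with hypothetical helper names `exact_in_float` and `exact_in_double`. It round-trips an integer through each format and reports whether it survived. Since a 24-bit sample already uses the whole binary32 significand, any gain multiplication or added dither at that depth would have to be rounded, which is consistent with moving to binary64 for 24 and 32 bit output.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the integer n survives a round trip through binary32. */
static int exact_in_float(int32_t n)
{
	assert(n >= -(1 << 30) && n <= (1 << 30));	/* cast back stays defined */
	volatile float f = (float)n;	/* volatile: force rounding to binary32 */
	return (int32_t)f == n;
}

/* Returns 1 if the integer n survives a round trip through binary64. */
static int exact_in_double(int64_t n)
{
	assert(n >= -(1LL << 62) && n <= (1LL << 62));
	volatile double d = (double)n;
	return (int64_t)d == n;
}
```

For example, `exact_in_float(16777216)` holds (2^24), while `exact_in_float(16777217)` does not, because 2^24 + 1 needs 25 significant bits.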

Did you play with FFTs of the output of your player?

No, just did some average of absolute error to convince myself that dither is useful.
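
On the "effects" of quantization questioned above, one way to see them without an FFT is to look at the average output for a constant input that sits between two integer levels. The sketch below is an illustration under stated assumptions: TPDF dither built from two uniform values, a hypothetical xorshift32 generator `uniform01`, and cast-based rounding in place of `lrintf`. Without dither, an input of 0.3 LSB always rounds to 0, so the error depends entirely on the signal. With TPDF dither, the average output converges towards the input, which means the error behaves like signal-independent noise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): xorshift32, uniform in [0, 1). */
static uint32_t rng_state = 0x12345678u;

static float uniform01(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	return (float)(rng_state >> 8) * (1.0f / 16777216.0f);
}

/* Triangular-PDF dither in (-1, +1) LSB. */
static float tpdf_dither(void)
{
	const float a = uniform01();
	const float b = uniform01();
	return a - b;
}

/* Round to nearest, halves away from zero (stand-in for lrint). */
static long round_to_long(double y)
{
	return (long)(y < 0 ? y - 0.5 : y + 0.5);
}

/* Average of n quantized outputs for a constant input x, in LSB units. */
static double mean_quantized(float x, int n, int use_dither)
{
	assert(n > 0);
	double sum = 0.0;
	for (int i = 0; i < n; ++i) {
		const double d = use_dither ? tpdf_dither() : 0.0;
		sum += (double)round_to_long((double)x + d);
	}
	return sum / n;
}
```

Calling `mean_quantized(0.3f, 200000, 0)` gives exactly 0, while the dithered variant should land close to 0.3. A mean absolute error measurement hides this difference, since dither adds noise while removing the dependence of the error on the signal.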

boXem · Aug 12, 2020

q3cpma said:
But it means that I must keep my input between 1 and MAX - 1, which doesn't seem perfect. On the other hand, 0 and MAX samples seem rare, which should make it OK.

As you said, I'd like to avoid doing float32 or float64 multiplication for every sample as it can be heavy on uarchs without SIMD.
I'm interested in that bit about the "effects" of quantization, as it runs contrary to logic (at least for me).

Well, that's the goal, but it would be a lot slower, especially on 32 bit uarchs like ARMv7.

No, just did some average of absolute error to convince myself that dither is useful.

Sorry for the stupid question, but what's the point of float32 instead of int32 for volume control?

I played a bit with Octave, doing FFTs to compare dithered and undithered signals, and the results were quite spectacular.

pkane · Aug 12, 2020

q3cpma said:
But it means that I must keep my input between 1 and MAX - 1, which doesn't seem perfect. On the other hand, 0 and MAX samples seem rare, which should make it OK.

Maybe just skip applying dither to samples that would exceed -1..1 range?

q3cpma · Aug 12, 2020

pkane said:
Maybe just skip applying dither to samples that would exceed -1..1 range?

That's basically what clipping does, while keeping a bit of the noise when it doesn't clip.

pkane · Aug 12, 2020

q3cpma said:
That's basically what clipping does, while keeping a bit of the noise when it doesn't clip.

Right. But other than scaling to provide enough headroom for dither to overflow the normal range, I don't see anything else you can do here.

q3cpma · Aug 12, 2020

Fred Jacquot said:
Sorry for the stupid question, but what's the point of float32 instead of int32 for volume control?

I played a bit with Octave, doing FFTs to compare dithered and undithered signals, and the results were quite spectacular.

Well, there's no volume control in my program, it's just the result of ReplayGain. I could take the resulting gain, left shift it, store it in a uint32, then multiply the sample and right shift the resulting int64, but that does seem to get complicated.
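
The integer route described above is shorter than it sounds. Here is a rough sketch, assuming a Q16.16 gain (16 fractional bits is an arbitrary choice for illustration) and hypothetical names `gain_to_fixed` and `apply_gain_fixed`. It also assumes that right-shifting a negative 64-bit value is an arithmetic shift, which is implementation-defined in C but is what mainstream compilers do.

```c
#include <assert.h>
#include <stdint.h>

#define GAIN_FRAC_BITS 16	/* hypothetical choice: Q16.16 gain */

/* Convert a linear gain (e.g. derived from ReplayGain) to Q16.16. */
static uint32_t gain_to_fixed(double gain)
{
	assert(gain >= 0.0 && gain < 32768.0);	/* result fits in 32 bits */
	return (uint32_t)(gain * (1 << GAIN_FRAC_BITS) + 0.5);
}

/* Multiply a sample by a Q16.16 gain through a 64-bit intermediate.
 * Adding half an LSB before the shift rounds to nearest instead of
 * flooring. Assumes >> on negative values is an arithmetic shift. */
static int32_t apply_gain_fixed(int16_t sample, uint32_t gain_q16)
{
	const int64_t prod = (int64_t)sample * (int64_t)gain_q16;
	const int64_t half = (int64_t)1 << (GAIN_FRAC_BITS - 1);
	return (int32_t)((prod + half) >> GAIN_FRAC_BITS);
}
```

The result here is rounded without dither. For dithered 16-bit output, the dither would have to be added to the 64-bit product before the shift, scaled by the same number of fractional bits.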

q3cpma · Aug 12, 2020

pkane said:
Right. But other than scaling to provide enough headroom for dither to overflow the normal range, I don't see anything else you can do here.

Yeah. This isn't the most important thing, honestly, but doing "sample * 0.999... + dither" for every sample could save me a branch per sample. I'll probably do it like this.

pozz · Aug 12, 2020

@bennetng Any advice here?

mansr · Aug 12, 2020

If the input is well-behaved, it is extremely unlikely for the dither to clip on several consecutive samples. I wouldn't worry about it.

mansr · Aug 12, 2020

Fred Jacquot said:
Sorry for the stupid question, but what's the point of float32 instead of int32 for volume control?

Float lets you reduce the amplitude without losing precision. Of course, you still have to convert to integer for output to the DAC, so it's a bit pointless. As intermediate format between processing stages, it does make sense, though.
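
A small illustration of that point, using hypothetical helpers `int_round_trip` and `float_round_trip`: attenuating by a power of two and restoring the level loses the low bits in integer arithmetic, while in float only the exponent changes. Powers of two are used so that the float result is exact; for arbitrary gains float still rounds, but with a relative error of about 2^-24 instead of an absolute one.

```c
#include <assert.h>
#include <stdint.h>

/* Attenuate by 2^-shift and restore the level, in integer arithmetic. */
static int16_t int_round_trip(int16_t s, int shift)
{
	assert(shift >= 0 && shift < 15);
	return (int16_t)((s / (1 << shift)) * (1 << shift));
}

/* Same round trip in float: the significand is kept, only the exponent moves. */
static int16_t float_round_trip(int16_t s, int shift)
{
	assert(shift >= 0 && shift < 15);
	volatile float f = (float)s / (float)(1 << shift);
	return (int16_t)(f * (float)(1 << shift));
}
```

For instance, `int_round_trip(12345, 8)` returns 12288, whereas `float_round_trip(12345, 8)` returns 12345.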

KSTR · Aug 12, 2020

q3cpma said:
Hello,

As I make and use my own music player, I have a few questions about dither. Currently, I only output 16-bit PCM and use float32 for ReplayGain scaling:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?
* I plan on making 24 and 32 bit output possible, for further processing; is dither even necessary at that point? The quantization error should be quite small.
* No need for float64, right?

IMHO, the "right way" would be like this:
- convert input to float for gain change.
- apply gain (with saturation one bit below +FS and -FS at the equivalent bit depth), all in the float domain. Use a symmetrical span (-FS and +FS have the same absolute value, 32767 in the case of 16 bits).
- add one bit of dither (float).
- quantise to integer output.
A gain is always applied, and it must always be < 1.0 for a full-scale input, the exact limit being 32766/32767 for 16 bits. This makes sure you have one bit of headroom left for the dithering.
You can of course optimize for the special case of gain = 1 and then skip directly over the whole floating point block, simply making output=input.
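
The steps listed above could be sketched roughly as follows. This is one reading of them, not a definitive implementation: saturation is applied to the scaled sample at ±32766, "one bit of dither" is taken to mean TPDF dither within ±1 LSB, and the names `uniform01`, `tpdf_dither` and `process_sample` are hypothetical. The xorshift32 generator is a stand-in RNG and rounding uses a cast instead of `lrintf`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): xorshift32, uniform in [0, 1). */
static uint32_t rng_state = 0x12345678u;

static float uniform01(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	return (float)(rng_state >> 8) * (1.0f / 16777216.0f);
}

/* Triangular-PDF dither in (-1, +1) LSB. */
static float tpdf_dither(void)
{
	const float a = uniform01();
	const float b = uniform01();
	return a - b;
}

static int16_t process_sample(int16_t in, float gain)
{
	assert(gain >= 0.f);
	if (gain == 1.f)
		return in;		/* special case: output = input */

	float x = (float)in * gain;	/* convert to float and apply gain */
	if (x > 32766.f)		/* saturate one LSB below +FS */
		x = 32766.f;
	if (x < -32766.f)		/* symmetric span: -FS is -32767 */
		x = -32766.f;

	const double y = (double)x + tpdf_dither();	/* add dither */
	return (int16_t)(long)(y < 0 ? y - 0.5 : y + 0.5);	/* quantise */
}
```

Because the saturated value never exceeds 32766 in magnitude and the dither stays strictly inside ±1 LSB, the rounded result always fits in the symmetric range -32767..32767 and no clamp is needed after quantising.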

Acoustically, I wouldn't bother though. When you have full-scale samples in the stream, something is wrong anyway (outright clipping, or at least a risk of intersample overs) and the imperfect dither is irrelevant.

q3cpma · Aug 12, 2020

Thanks, I'm convinced that assuming the input (audio data and ReplayGain peak/gain values) is well formed is the sane way of doing it.

That's what I'm doing in the end; the headroom multiplier was chosen to be as large as possible while still reducing the extreme values by one after rounding:

Code:

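#include <math.h>	/* lrintf */
#include <stddef.h>	/* size_t */
#include <stdint.h>	/* int16_t */

/* Largest multiplier that still pulls the extreme values in by one LSB
 * after rounding, leaving headroom for the dither. */
#define INT16_HEADROOM_MULT (32767.497f / 32768.f)

/* _drand48() is a helper not shown in this post; it is assumed to return
 * a uniform value in [0, 1). */
static inline int16_t triangle_dither(float x)
{
	static float prev_rand = 0.f;
	const float rand = _drand48();
	/* Difference of consecutive uniform values: triangular PDF in (-1, +1). */
	const float tmp = x * INT16_HEADROOM_MULT + rand - prev_rand;
	prev_rand = rand;
	return (int16_t)lrintf(tmp);
}

void apply_gain(int16_t *restrict buf, size_t nbsample, float gain)
{
	if (gain == 1.f)
		return;

	/* Scale, dither and requantize every sample in place. */
	for (size_t i = 0; i < nbsample; ++i)
		buf[i] = triangle_dither(buf[i] * gain);
}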

Though I'm still not certain at all that dithering is needed for 24 or 32 bits output.

boXem · Aug 12, 2020

q3cpma said:
Thanks, I'm convinced that assuming the input (audio data and ReplayGain peak/gain values) is well formed is the sane way of doing it.

That's what I'm doing in the end; the headroom multiplier was chosen to be as large as possible while still reducing the extreme values by one after rounding:

Code:

#define INT16_HEADROOM_MULT (32767.497f / 32768.f)
static inline int16_t triangle_dither(float x)
{
	static float prev_rand = 0.f;
	const float rand = _drand48();
	const float tmp = x * INT16_HEADROOM_MULT + rand - prev_rand;
	prev_rand = rand;
	return lrintf(tmp);
}

void apply_gain(int16_t *restrict buf, size_t nbsample, float gain)
{
	if (gain == 1.f)
		return;

	for (size_t i = 0; i < nbsample; ++i)
		buf[i] = triangle_dither(buf[i] * gain);
}

Though I'm still not certain at all that dithering is needed for 24 or 32 bits output.

I fear this is still clipping. I would do

Code:

#define INT16_HEADROOM_MULT (32766.f/32767.f)

q3cpma · Aug 12, 2020

Fred Jacquot said:
I fear this is still clipping. I would do

Code:

#define INT16_HEADROOM_MULT (32766.f/32767.f)

I experimented, and it works because of the lrint (or trunc if you're dirty). I actually started with the same value as yours, then tried .5, which didn't work, then tried .499, which did but produced 32767.50000f for 32768.f, which made me worry I'd be relying on the default rounding direction too much.
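
The experiment described above can be reproduced deterministically by feeding the worst-case dither value instead of a random one. The sketch below is an illustration: `scale_dither_round` is a hypothetical helper, the dither value `d` is passed in explicitly, and rounding uses a cast (halves away from zero) in place of `lrintf`, which does not change the outcome for the cases shown.

```c
#include <assert.h>
#include <stdint.h>

#define MULT_THREAD (32767.497f / 32768.f)	/* value used in the thread */
#define MULT_HALF   (32767.5f / 32768.f)	/* the .5 variant that failed */

/* Scale x by mult, add a given dither value d (in LSB, strictly inside
 * -1..+1) in binary32, then round to nearest. */
static long scale_dither_round(float x, float mult, float d)
{
	assert(d > -1.f && d < 1.f);
	volatile float tmp = x * mult + d;	/* volatile: round to binary32 */
	const double t = tmp;
	return (long)(t < 0 ? t - 0.5 : t + 0.5);
}
```

With the largest dither value binary32 can hold below 1 (0.99999994f), the thread's multiplier keeps 32767 and -32768 inside the int16 range, while the .5 variant lets 32767 round up to 32768.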

bennetng · Aug 12, 2020

q3cpma said:
* Since I use TPDF dither in the range [-1, +1], I currently clip when the dither over/underflows the sample, but is that the "right way" of doing it?

I've seen some dither plugins keep digital silence in the original data unchanged (which can actually let a file converter create smaller FLAC files thanks to the absence of additional noise). I've also seen dither plugins skip dithering at full scale to avoid clipping, so I don't think there is anything wrong here.

bennetng · Aug 24, 2020

I tried to reduce the amplitude of the resulting waveform without skipping dither or changing the desired gain: when the input samples are negative, bias the dither towards positive, and vice versa.

C#:

    double Dither(double sam)
    {
        var pdf = ran.NextDouble() * -4 + 3;
        while (pdf > 2d / 3)
            pdf--;
        return Math.Round(sam + (sam > 0 ? pdf : -pdf));
    }

The white waveform includes 0.48Hz and 480Hz tones. The blue ones are 60dB reduced with dither. The left one is my version and the right one is unshaped TPDF dither from Adobe Audition. Audio files in attachment.

Differences are easily visible in the time domain; in the frequency domain they are more or less the same, and my algorithm is not audibly quieter. I tried it with some real music and it doesn't seem to introduce obvious artifacts like RPDF does, either. Hopefully RNGs in different languages won't affect the results. It's C#: both the RNG and the rounding return double, so it doesn't make much sense to use float or int in the middle. Integer conversion is done outside of the method to reduce unnecessary casting.
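
For anyone wanting to try the same idea in a C player, here is a rough port of the `Dither` method above. It is a sketch with hypothetical names (`biased_offset`, `dither_biased`), and the uniform value `u` in [0, 1) is passed in by the caller so that any RNG can be plugged in. One known difference: C#'s `Math.Round` rounds exact halves to even by default, while the cast used here rounds them away from zero.

```c
#include <assert.h>

/* Map a uniform value u in [0, 1) to the biased dither offset of the C#
 * version: start in (-1, 3], then step down by 1 until <= 2/3. */
static double biased_offset(double u)
{
	assert(u >= 0.0 && u < 1.0);
	double pdf = u * -4.0 + 3.0;
	while (pdf > 2.0 / 3.0)
		pdf -= 1.0;
	return pdf;
}

/* Dither and round one sample (in LSB units). Positive samples get the
 * offset as is, all others get it negated, as in the original. */
static long dither_biased(double sam, double u)
{
	const double pdf = biased_offset(u);
	const double y = sam + (sam > 0.0 ? pdf : -pdf);
	return (long)(y < 0.0 ? y - 0.5 : y + 0.5);
}
```

The offset always ends up in the range (-1, 2/3], so positive samples are pulled towards zero on average and negative ones pushed up, which is what keeps the peak amplitude of the dithered waveform down.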

C#:

                            while (sampleCnt-- > 0)
                            {
                                double sam = br.ReadInt16();
                                sam *= .001;
                                bw.Write((short)Dither(sam));
                            }

P.S. The TPDF dither in Audacity and some other software is +/-1 LSB but slightly shaped (blue); I am comparing my dither with the non-shaped one in Audition (white).

q3cpma · Aug 24, 2020

Not a bad idea, though the number of branches is increased.

earlevel · Dec 12, 2020

Maybe someone running into this thread will appreciate this:
