Alternative method for measuring distortion

solderdude · Feb 17, 2020

signal --> adding 1LSB dither (for 16 bit) --> encoding in 16 bit.
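
As an illustration of the chain above, here is a minimal Python sketch. It assumes TPDF dither spanning +/-1 LSB (the post only says "1LSB dither", so the distribution is an assumption), and the names `tpdf_dither` and `to_int16` are invented for this example.

```python
import math
import random

FULL_SCALE = 32767  # largest positive 16-bit sample value


def tpdf_dither(rng):
    """TPDF dither spanning +/-1 LSB: difference of two uniform [0, 1) values."""
    return rng.random() - rng.random()


def to_int16(samples, rng=None):
    """Quantize floats in [-1.0, 1.0] to 16-bit integers.

    If rng is given, TPDF dither (in LSB units) is added before rounding.
    """
    out = []
    for x in samples:
        v = x * FULL_SCALE
        if rng is not None:
            v += tpdf_dither(rng)
        q = math.floor(v + 0.5)                 # round to the nearest step
        out.append(max(-32768, min(32767, q)))  # clip to the 16-bit range
    return out


# A 1 kHz tone at 48 kHz whose peak is only 0.4 LSB (hypothetical test signal)
tone = [0.4 / FULL_SCALE * math.sin(2 * math.pi * 1000 * n / 48000)
        for n in range(4800)]

plain = to_int16(tone)                       # signal --> encoding in 16 bit
dithered = to_int16(tone, random.Random(1))  # signal --> dither --> 16 bit
```

Under these assumptions the undithered output is all zeros, while the dithered output toggles between -1, 0 and +1.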

Serge Smirnoff · Feb 17, 2020

solderdude said:
signal --> adding 1LSB dither (for 16 bit) --> encoding in 16 bit.

Yes, that is it in more detail.

xr100 · Feb 17, 2020

Serge Smirnoff said:
But I'm talking here about different levels of information - engineering and semantic (you pointed to the appropriate article by Warren Weaver above). In order to understand better the difference between those levels I suggested the following example above: if you have a series of 32bit values of unknown origin and you need to convert them to 16bit values, will you apply noise before rounding? What is recommended by math in this case?

APPLIED mathematics? Without knowing more, there isn't an answer. In the case of LPCM audio applications, dither. The "noisy channel" is 16-bit LPCM, the ultimate receiver is the ear/brain system.

The "semantic" question is one of whether "white noise" constitutes "music" or not.

In this context, other "layers" aren't relevant... yes, one could move down the "stack" and consider a collection of 0's and 1's that are losslessly coded in FLAC, in which case the original LPCM is recovered in the FLAC decoder. Or even CD-DA...

Or the multiple layers between a software audio player (e.g. running on a PC) and a hard drive, and so on. The hard drive's own controller doesn't "care" whether the data is LPCM audio or MALWARE, its only job is to return the requested data without error, and in turn the user (i.e. the OS) doesn't "care" what the hard drive does, e.g. error correction, whether it has "quietly" reallocated sectors, etc... up to the point that it cannot return error-free data... and so on, back to the software player, which "talks" to the OS at the file system level, and then decides what to do with the incoming data based on, say, the file's header, etc., and ultimately back via the OS, with further layers to the DAC per se, which "knows" nothing of the rest, its job is to convert the incoming LPCM audio signal (e.g. in I2S format) to analogue; e.g. whether it got there via optical S/PDIF or USB is immaterial.

Talking of hard drives, a quick search yields this:
Patent: Hard disk drive head-disk interface dithering. (Assignee: Western Digital.)

Anyway, the point is that all of these "layers," which could go on ad infinitum, are simply abstracted away.

Serge Smirnoff · Feb 21, 2020

xr100 said:
APPLIED mathematics? Without knowing more, there isn't an answer. In the case of LPCM audio applications, dither. The "noisy channel" is 16-bit LPCM, the ultimate receiver is the ear/brain system.

I would mostly agree here. Applied math cannot be used in this case, because we don't know the application area of the signal. And the best strategy for quantizing an unknown signal is rounding, as it provides the best Signal-to-Quantization-Noise Ratio. There are several other quantizing methods [wiki], which can be used depending on the nature of the signal and its properties (probability distribution function). And this is where my reasoning about the psychoacoustic nature of dithering in audio originates.

The choice of the particular quantizer for audio signals (linearized by means of dither) is determined by the application area of the signal - perception by ear/brain. Our perception of sound has very special properties studied by psychoacoustics:

(1) We are sensitive to signal quantization errors in quiet music passages. This is the primary reason for linearizing the quantizer below the LSB. “Preserving the below-LSB signal” would not be required if such artifacts were unnoticeable or unimportant for hearing/brain, just as the absence of some sound components due to psychoacoustic encoding is unnoticeable (both quantizing and psychoacoustic encoding are lossy data reduction algorithms). Quantization errors of low-level signals are very noticeable, and dithering improves the performance of the quantizer in this area. So, the linearization is required by psychoacoustics, not by math.

There is another point here. Audio signals that have a high noise floor (or are just naturally dithered) can be safely quantized without dithering. If all audio were noisy, dithering would not be required at all. This means that for some signals (clean) dithering is required and for others (noisy) is not, which is an indirect indication that this operation is not universal/mathematical. One can object that in this case the dithering happened naturally and exists in the signal anyway. The problem with this objection is that math cannot recognize whether the signal is noisy or clean. This can be recognized only by Humans. From the math point of view these signals are equal.

(2) In the case of above-LSB signals we are sensitive to new spectral components in the quantized signal and less sensitive to broadband noise in it (especially if it is shaped according to hearing properties). Dithering the quantizer transforms standard/annoying errors into ones less objectionable for hearing, at the expense of worsening the SQNR of the quantized signal. This is another indication that dithering is a psychoacoustic operation. From the math point of view it degrades the SQNR of the quantized signal.

According to the arguments above I define dithering of the quantizer in audio as a psychoacoustic operation, not a mathematical one. Linearization of the quantizer in audio is required by the psychoacoustic properties of hearing. In this sense dithering can be considered a simple psychoacoustic trick, which helps to reduce the annoyance of quantization errors for hearing at the expense of a slight increase in these errors. Psychoacoustic encoding does the same but to a greater extent, using more knowledge of perception and degrading the signal even more on the math level. These cases are similar. In many other areas of signal processing dithering of quantizers is not required (but in some it is, depending on the particular application).
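
For reference, the size of the noise penalty under discussion can be measured directly. The sketch below is only an illustration under stated assumptions: the input is a hypothetical broadband signal (uniform random values in LSB units), the dither is TPDF spanning +/-1 LSB, and `error_power` is a name invented here. It measures total error power only and says nothing about how that error correlates with the signal.

```python
import math
import random


def error_power(values, dither_rng=None):
    """Mean squared error (in LSB^2) of rounding `values` to integers."""
    total = 0.0
    for v in values:
        # TPDF dither when an RNG is supplied, otherwise plain rounding
        d = dither_rng.random() - dither_rng.random() if dither_rng else 0.0
        q = math.floor(v + d + 0.5)
        total += (q - v) ** 2
    return total / len(values)


rng = random.Random(0)
signal = [rng.uniform(-1000.0, 1000.0) for _ in range(200_000)]

p_plain = error_power(signal)                   # close to 1/12 LSB^2
p_dith = error_power(signal, random.Random(1))  # close to 1/4 LSB^2
penalty_db = 10 * math.log10(p_dith / p_plain)  # close to 4.8 dB

print(f"plain {p_plain:.4f}  dithered {p_dith:.4f}  penalty {penalty_db:.2f} dB")
```

Under these assumptions the measured values land near the textbook figures of 1/12 and 1/4 LSB^2, a difference of about 4.8 dB.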

xr100 said:
The "semantic" question is one of whether "white noise" constitutes "music" or not.

Agreed, with one note. For Humans the meaning of sound closely relates to the psychoacoustic properties of hearing. For example, the sensation of dissonance and consonance is determined by the critical bands of hearing, which in turn are determined by the mechanical properties of the basilar membrane in the inner ear. So, the semantics of sound also includes psychoacoustics (ear/brain). As a result we have at least two levels of audio information - engineering (without meaning) and semantic (with psychoacoustics).

xr100 said:
In this context, other "layers" aren't relevant... yes, one could move down the "stack" and consider a collection of 0's and 1's that are losslessly coded in FLAC, in which case the original LPCM is recovered in the FLAC decoder. Or even CD-DA...

Or the multiple layers between a software audio player (e.g. running on a PC) and a hard drive, and so on. The hard drive's own controller doesn't "care" whether the data is LPCM audio or MALWARE, its only job is to return the requested data without error, and in turn the user (i.e. the OS) doesn't "care" what the hard drive does, e.g. error correction, whether it has "quietly" reallocated sectors, etc... up to the point that it cannot return error-free data... and so on, back to the software player, which "talks" to the OS at the file system level, and then decides what to do with the incoming data based on, say, the file's header, etc., and ultimately back via the OS, with further layers to the DAC per se, which "knows" nothing of the rest, its job is to convert the incoming LPCM audio signal (e.g. in I2S format) to analogue; e.g. whether it got there via optical S/PDIF or USB is immaterial.
.................................
Anyway, the point is that all of these "layers," which could go on ad infinitum, are simply abstracted away.

Applying your example to audio signals we could similarly say the following. Operating on the semantic level of an audio signal (mixing, applying effects, ...) we can abstract away the engineering level only if the corresponding operations on this level do not degrade the signal on the semantic level (i.e. they are properly coded in 32-bit arithmetic). In other words, the engineering level of audio information can be abstracted away only if it is fully transparent to the semantic level.

xr100 said:
Talking of hard drives, a quick search yields this:
Patent: Hard disk drive head-disk interface dithering. (Assignee: Western Digital.)

Yes, another case of using dithering alone, without relation to quantizing.

j_j · Feb 21, 2020

Serge Smirnoff said:
I would mostly agree here. Applied math cannot be used in this case, because we don't know the application area of the signal.

Why not? Preserving the maximum amount of information remains a thing, despite your failure to admit "information" is a measurable quantity.

And the best strategy for quantizing an unknown signal is rounding, as it provides the best Signal-to-Quantization-Noise Ratio.

Except that it's a nonlinear process, and you CAN NOT APPLY SNR TO NONLINEAR PROCESSES in the same way you can to a linear process.

You are not making a meaningful measurement UNTIL YOU LINEARIZE THE QUANTIZER and that is DITHER.

This is simple, 1950's mathematics.

If you round, and don't dither, you are doing it wrong. You are throwing away information. You are throwing away information that can be measured, and that can also be heard.

At this point, you're very close to trolling. You haven't learned a thing, now, have you? Why do you think that even printer drivers use dither and diffusion? Tell me that now.

You are simply rejecting the basic mathematical facts, and teaching people the wrong thing as a result.

And, of course, let us remember that "distortion" implies a nonlinearity, and "noise" does not necessarily do so. A linear process like equalization does not involve nonlinearities, but may affect SNR if you aren't smart about how you measure.
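
The claim that plain rounding throws away measurable information can be checked numerically. The following is a sketch under stated assumptions, not anyone's actual test: a hypothetical 1 kHz tone peaking at 0.25 LSB, TPDF dither spanning +/-1 LSB, and a simple correlation against the known sine as the "measurement". The function names are invented for the example.

```python
import math
import random

N = 48_000
FREQ = 1000.0
RATE = 48_000.0
AMPLITUDE = 0.25  # peak level in LSB, well below one quantizer step


def quantize(v, rng=None):
    """Round to the nearest integer, with TPDF dither if rng is given."""
    d = rng.random() - rng.random() if rng else 0.0
    return math.floor(v + d + 0.5)


def recovered_amplitude(samples):
    """Estimate the tone amplitude by correlating with the known sine."""
    acc = sum(s * math.sin(2 * math.pi * FREQ * n / RATE)
              for n, s in enumerate(samples))
    return 2.0 * acc / len(samples)


tone = [AMPLITUDE * math.sin(2 * math.pi * FREQ * n / RATE) for n in range(N)]
rng = random.Random(42)

amp_plain = recovered_amplitude([quantize(v) for v in tone])
amp_dith = recovered_amplitude([quantize(v, rng) for v in tone])

print(f"undithered: {amp_plain:.4f} LSB, dithered: {amp_dith:.4f} LSB")
```

With plain rounding every sample becomes zero, so nothing of the tone can be recovered. With dither the estimate comes out close to the original 0.25 LSB.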

Serge Smirnoff · Feb 21, 2020

j_j said:
You are not making a meaningful measurement UNTIL YOU LINEARIZE THE QUANTIZER and that is DITHER.

Linearization of the quantizer by dithering results in an increase of quantization error (a lower SQNR). We need some reason for such an increase. The reason is in psychoacoustics - increased error is less audible and more pleasant for hearing.

SIY · Feb 21, 2020

Serge Smirnoff said:
Linearization of the quantizer by dithering results in an increase of quantization error (a lower SQNR). We need some reason for such an increase. The reason is in psychoacoustics - increased error is less audible and more pleasant for hearing.

No to all of these, since your premise is incorrect.

I think you need to actually listen to what JJ and others have been telling you- you do NOT understand dither, and until you do, you're going to continue to flail and come up with nonsensical conclusions. I think the problem is that not only don't you understand dither, you don't understand that you don't understand.

solderdude · Feb 21, 2020

Serge Smirnoff said:
The reason is in psychoacoustics

The reason is to prevent loss of information.
The shaping of the dither can fall under psychoacoustics.

Serge Smirnoff said:
increased error is less audible and more pleasant for hearing

You obviously haven't heard 8 bit audio with and without dither.
Without a shadow of a doubt dithering is beneficial.
Yes, at the cost of noise which to most people is far less objectionable.

thewas · Feb 21, 2020

Even the simple examples of dithering in image quantization show that it prevents loss of information:
https://en.wikipedia.org/wiki/Dither

SIY · Feb 21, 2020

When I spent years getting data on molecular vibrations from interferometer signals well below the quantization limit, I had no idea I was using psychoacoustics. I'm glad someone set me straight.

xr100 · Feb 21, 2020

Serge Smirnoff said:
I would mostly agree here. Applied math cannot be used in this case, because we don't know the application area of the signal.

Why would this situation exist, of not knowing the application, and even if it did, then of what relevance would it be to audio?

Serge Smirnoff said:
Applying your example to audio signals we could similarly say the following. Operating on the semantic level of an audio signal (mixing, applying effects, ...) we can abstract away the engineering level only if the corresponding operations on this level do not degrade the signal on the semantic level (i.e. they are properly coded in 32-bit arithmetic). In other words, the engineering level of audio information can be abstracted away only if it is fully transparent to the semantic level.

Considering mixing/effects as part of the "channel," they are not "abstracted away."

32-bit float is not strictly sufficient for mixing given 24-bit fixed point output, albeit whether or not it matters is another question. It is definitely not sufficient for IIR filters. As for "effects," taking reverb as an example, the "algorithm" and "settings" are most important; intelligibility can be highly compromised. Ditto, of course, concert hall acoustics etc. in the purely physical world.
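
A small arithmetic check of the float precision point, offered as a sketch only. It assumes three hypothetical channels that all sit at positive 24-bit full scale, and uses `struct` to emulate storing the sum in a 32-bit float (which carries a 24-bit significand). `to_float32` is a helper invented for this example.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


peak = 8_388_607                 # 2**23 - 1, largest positive 24-bit sample

exact_sum = peak + peak + peak   # 25_165_821, needs 25 bits
float32_sum = to_float32(float(exact_sum))

print(exact_sum, float32_sum, float32_sum - exact_sum)
```

A single 24-bit sample survives the round trip through 32-bit float unchanged, but the three-channel sum is off by one unit. Whether an error of that size matters after scaling back to the output level is, as the post says, another question.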

xr100 · Feb 21, 2020

solderdude said:
You obviously haven't heard 8 bit audio with and without dither.

I previously uploaded some 4-bit truncated files which Serge downloaded...

j_j · Feb 21, 2020

Serge Smirnoff said:
Linearization of the quantizer by dithering results in an increase of quantization error (a lower SQNR). We need some reason for such an increase. The reason is in psychoacoustics - increased error is less audible and more pleasant for hearing.

Please stop with this false claim. The reason is PRESERVATION OF INFORMATION. This shows, precisely, why linearization is necessary, because you PRESERVE INFORMATION by adding noise. According to information theory that can happen when you have a nonlinearity.

It's time for you to drop your false, incorrect, wrong, and otherwise misteaching about the FACTS surrounding sampling and dither. Just stop. You don't understand quantization, you don't understand the meaning of a linear system, and you reject information theory outright. You're simply spreading bad information.

STOP.

xr100 · Feb 21, 2020

Serge Smirnoff said:
So, the linearization is required by psychoacoustics, not by math.

Maths "proves" that dither linearises the quantiser.

Serge Smirnoff said:
This means that for some signals (clean) dithering is required and for others (noisy) is not, which is an indirect indication that this operation is not universal/mathematical. One can object that in this case the dithering happened naturally and exists in the signal anyway.

It doesn't matter where dither "comes from," as long as it has the required characteristics. How is "self-dither" not an indication that dither is universally needed to linearise the quantiser?

Serge Smirnoff said:
Linearization of the quantizer in audio is required by the psychoacoustic properties of hearing.

It is required in many other applications.

It may not be for a wall-mounted consumer digital thermometer of dubious accuracy and limited precision, and where the reading applies to the location of the sensor--but the device might be satisfactory to give the user an indication of the room temperature. If said device is a simple "on/off" thermostat, then it should have some hysteresis, too (i.e. avoiding turning on/off constantly.)

(This can be extrapolated to a much larger scale in terms of weather/climate, where the name of the game (presumably) is combining data from multiple sources, smoothing and correcting inc. "fudge factors." In the case of "weather," the information required by the vast majority of the populace is an accurate prediction of whether there is a high probability of rain tomorrow, albeit said predictions are now readily available to the public as a frequently updated hourly map showing the predicted rainfall locations, but either way, NOT a reading of the temperature at location X to a gazillion significant figures.)

Serge Smirnoff said:
In this sense dithering can be considered a simple psychoacoustic trick

It is not a "simple trick," it linearises the quantiser. Yes, generating TPDF dither is straightforward enough in DSP, and on the face of it would appear to be an exceptionally expedient way of achieving such a profound result. But, it is actually exactly what is required, and formally, not so straightforward.
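
What "linearises the quantiser" means in terms of error statistics can be sketched as follows. This is an illustration under assumptions, not a proof: it holds the input constant at a few hypothetical fractional offsets and estimates the mean and variance of the total error for no dither, RPDF dither (1 LSB peak to peak) and TPDF dither (2 LSB peak to peak). All function names are invented here.

```python
import math
import random


def error_stats(offset, dither, trials=50_000, seed=0):
    """Mean and variance of the total error (in LSB) for a constant input."""
    rng = random.Random(seed)
    errs = []
    for _ in range(trials):
        q = math.floor(offset + dither(rng) + 0.5)
        errs.append(q - offset)
    mean = sum(errs) / trials
    var = sum((e - mean) ** 2 for e in errs) / trials
    return mean, var


def no_dither(rng):
    return 0.0


def rpdf(rng):
    return rng.random() - 0.5           # uniform, 1 LSB peak to peak


def tpdf(rng):
    return rng.random() - rng.random()  # triangular, 2 LSB peak to peak


results = {}
for name, dither in (("none", no_dither), ("RPDF", rpdf), ("TPDF", tpdf)):
    for offset in (0.0, 0.25, 0.5):
        results[(name, offset)] = error_stats(offset, dither)
        mean, var = results[(name, offset)]
        print(f"{name:5s} offset={offset:4.2f} mean={mean:+.3f} var={var:.3f}")
```

Without dither the mean error depends on the input. With RPDF the mean error goes to zero but the error variance still changes with the input. With TPDF both the mean and the variance stay put (near 0 and 0.25 LSB^2) whatever the offset.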

BTW, I had a look around your website, and downloaded a couple of test "WAVE" files. I only briefly considered them; but I suspect they were, in fact, not dithered?

j_j · Feb 21, 2020

SIY said:
When I spent years getting data on molecular vibrations from interferometer signals well below the quantization limit, I had no idea I was using psychoacoustics. I'm glad someone set me straight.

Ditto when pulling data that's way, way below the noise out of an accelerometer that is self-dithered by its own self-noise, by integrating over 85 minutes (yes, really: a 1 M FFT of 200 Hz sampled data, downsampled from 96000). (The sound of a CPU going whirrrr whirrr in the background should be inferred.)

In that case the dither is the device self-noise, and yes, Virginia, you can tell line frequency from the vibrations of all the motors, most at 3600/N minus a bit (N = 1 to 8, integer only). You can even see it vary slightly from cooking time in the evening to the dead of night.

All of this for data well below the ADC (which is 20 bits) and the device self-noise (18 bits). Given the self-noise is not white, in fact some dither was included from an (intentionally large) resistor in the buffer amp.

Linearization is mathematical, not 'psychoacoustic', and repeating otherwise remains wrong.
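
The kind of measurement described above can be mimicked on a small scale. The sketch below is not the actual accelerometer processing: the tone frequency, tone level, noise level and record length are all hypothetical, Gaussian noise stands in for the device self-noise, and a single-bin DFT stands in for the long FFT. `line_amplitude` is a name invented here.

```python
import math
import random

RATE = 200.0     # Hz, the sample rate mentioned in the post
N = 100_000      # 500 s of data, far shorter than the 85 minutes described
TONE_HZ = 29.5   # hypothetical vibration line
TONE_LSB = 0.05  # hypothetical peak amplitude, 1/20 of an ADC step
NOISE_LSB = 0.5  # hypothetical RMS self-noise, acting as the dither

rng = random.Random(3)
samples = [
    math.floor(TONE_LSB * math.sin(2 * math.pi * TONE_HZ * n / RATE)
               + rng.gauss(0.0, NOISE_LSB) + 0.5)  # noise, then rounding
    for n in range(N)
]


def line_amplitude(data, freq):
    """Amplitude of one spectral line (a single-bin DFT), in LSB."""
    re = im = 0.0
    for n, s in enumerate(data):
        phase = 2 * math.pi * freq * n / RATE
        re += s * math.cos(phase)
        im += s * math.sin(phase)
    return 2.0 * math.hypot(re, im) / len(data)


at_tone = line_amplitude(samples, TONE_HZ)  # close to TONE_LSB
off_tone = line_amplitude(samples, 31.0)    # only the noise floor

print(f"at tone: {at_tone:.4f} LSB, off tone: {off_tone:.4f} LSB")
```

Even though the tone is a twentieth of a quantizer step and ten times smaller than the RMS noise, it stands well clear of the noise floor after integration. A longer record lowers the floor further.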

xr100 · Feb 21, 2020

j_j said:
Ditto when pulling data from an accelerometer that is self-dithered by self-noise.

What was this accelerometer used to measure?

j_j · Feb 21, 2020

xr100 said:
What was this accelerometer used to measure?

The ground.

xr100 · Feb 21, 2020

j_j said:
The ground.

Crikey.

exaudio · Feb 22, 2020

Serge Smirnoff said:
Applied math cannot be used in this case...

...the linearization is required by psychoacoustics, not by math...

...math cannot recognize whether the signal is noisy or clean. This can be recognized only by Humans. From the math point of view these signals are equal.

Math does allow us to quantify the information content of a signal. Claude Shannon showed us how by giving us a formula for entropy.

I thought I would try using Shannon's formula to calculate the entropy of a dithered file and compare it to a non-dithered version of the same signal. I made two wave files with 10 seconds of a 1 kHz sine wave sampled at 48 kHz with 8-bit sample depth--one with dither and one without. Both wave files are exactly the same size, 468.8 KiB, and both wave files have the same number of samples: 48,000 samples per sec x 10 sec = 480,000 samples.

Despite the two files containing the same number of samples and being exactly the same size, by calculating their information entropy we can see from a purely mathematical perspective that the dithered file contains more information. Using Mathematica to calculate the information entropy, the dithered file had a higher entropy of 3.83704 vs the non-dithered file's 3.12737. The dithered file's higher entropy tells us it contains more information. (Mathematica uses log base e when calculating entropy.)
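
A comparable calculation can be sketched in Python. This is not a reconstruction of the attached files: the post does not state the sine amplitude or the dither type, so the amplitude below is hypothetical and TPDF dither spanning +/-1 LSB is assumed. The entropy is taken over the histogram of sample values, using natural log to match the convention mentioned above. `entropy_nats` and `sine_8bit` are names invented here.

```python
import math
import random
from collections import Counter


def entropy_nats(samples):
    """Shannon entropy of the sample-value histogram, using natural log."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in counts.values())


def sine_8bit(amplitude, rng=None, rate=48_000, freq=1000, seconds=1):
    """Quantize a sine to integer steps; amplitude is in 8-bit LSB units."""
    out = []
    for i in range(rate * seconds):
        v = amplitude * math.sin(2 * math.pi * freq * i / rate)
        d = rng.random() - rng.random() if rng else 0.0  # optional TPDF dither
        out.append(math.floor(v + d + 0.5))
    return out


AMPLITUDE = 20.0  # hypothetical level; the post does not state the one used
h_plain = entropy_nats(sine_8bit(AMPLITUDE))
h_dith = entropy_nats(sine_8bit(AMPLITUDE, random.Random(7)))

print(f"undithered: {h_plain:.5f} nats, dithered: {h_dith:.5f} nats")
```

Because 1 kHz at 48 kHz repeats every 48 samples, an undithered sine started at zero phase (an assumption) can take at most 25 distinct values, which caps its histogram entropy at ln 25, about 3.22 nats. The dithered version spreads over more values and so scores higher.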

Another way we can compare the information content between the two files is to compress them using a lossless compression like zip, or since they're audio files, flac. Taking the same two wave files from above and converting them to flac (level 8) results in the dithered file being larger (171.0 KiB) than the non-dithered file which compressed to 146.3 KiB. That the dithered flac file is larger than the non-dithered flac file is another way that math tells us the dithered file contains more information. No psycho-acoustic interpretation required--just math.
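
The compression comparison can be sketched the same way. Note the swap: this uses zlib (DEFLATE) from the Python standard library rather than FLAC, and the sine amplitude and TPDF dither are assumptions, so the sizes will not match the figures quoted above. `sine_bytes` is a name invented for the example.

```python
import math
import random
import zlib


def sine_bytes(amplitude, rng=None, rate=48_000, freq=1000, seconds=1):
    """Return unsigned 8-bit PCM bytes for a sine, optionally TPDF-dithered."""
    out = bytearray()
    for i in range(rate * seconds):
        v = amplitude * math.sin(2 * math.pi * freq * i / rate)
        d = rng.random() - rng.random() if rng else 0.0
        q = math.floor(v + d + 0.5)
        out.append(max(0, min(255, q + 128)))  # 8-bit WAV data is offset binary
    return bytes(out)


AMPLITUDE = 20.0  # hypothetical level, not taken from the post
plain = sine_bytes(AMPLITUDE)
dithered = sine_bytes(AMPLITUDE, random.Random(7))

size_plain = len(zlib.compress(plain, 9))
size_dith = len(zlib.compress(dithered, 9))

print(f"raw {len(plain)} bytes, undithered {size_plain}, dithered {size_dith}")
```

Both inputs are the same length, yet the undithered one compresses to a small fraction of the dithered one, since it is a 48-sample pattern repeated over and over.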

I'll attach the wave files I used. Note the size difference of the zipped files. The non-dithered file is much smaller because it contains less information. The files will unzip/inflate to the same size.

xr100 · Feb 22, 2020

exaudio said:
That the dithered flac file is larger than the non-dithered flac file is another way that math tells us the dithered file contains more information. No psycho-acoustic interpretation required--just math.

I'll attach the wave files I used. Note the size difference of the zipped files. The non-dithered file is much smaller because it contains less information. The files will unzip/inflate to the same size.

Cool stuff, but the dither itself is added randomness ("information") for the entropy/lossless coding?
