High Resolution Audio: Does It Matter?

j_j · Jun 29, 2022

sax512 said:
If you put an impulse (which is not band limited by its nature) through an anti-aliasing filter you remove the inaudible frequencies (and only those), like I was saying.
Is there an audible difference?
As you say.. ***that*** is the question.

You can suit yourself, but removing the inaudible parts via the cochlear filter, which has very different cutoff frequencies and shape, as opposed to via the antialiasing filter, which certainly has a very different shape and characteristics, changes the waveshape OF THE IN BAND PART.

And therein, changing the wave shape (although the frequency content is ALMOST the same) can have different phase as a result of the hair-cell reaction, and can trigger "first firing" differently.

Nonlinearities are a fun.

j_j · Jun 29, 2022

DonH56 said:
A question on the band-limited impulse response: does Gibbs apply, and would that lead to audible artifacts? I ran into Gibbs again in my day job (at about 30 GHz, not audio) and got to wondering if it really mattered for this (audio) case...

Yes and no. Ultimately it devolves to actual wave shape after the cochlear filter, which may well differ, certainly does at extreme cases. It's that 'first detection' that matters, and the "leakage" in the filters (small for the FIR, not so much for the most HF cochlear filters) that also can matter.

voodooless · Jun 29, 2022

Bathrone said:
* A 16-bit depth is not perfectly suitable because human hearing has around 120 dB of dynamic range

Yeah, you’ll be able to test this only a few times though. 120 dB SPL is right at the pain threshold and will damage your ears permanently. So practically, the real threshold is much lower.

Bathrone said:
I would tend to reject noise-floor arguments that suggest, well, add the noise floor in and it's in the ballpark too, because arguing that something like 40 dB is lost to the listening location doesn't account for a quiet listening room with closed-back headphones.

Even those are not 0 dB silent. There will always be significant noise left. At best 20 dB of isolation roughly? Still leaves us at 100dB of dynamic range if we go by your 120dB number. That is awfully close to the 96dB 16 bits give you. And even then you really need a very quiet environment to begin with and play really, really loud.
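The arithmetic behind the 96 dB and 100 dB figures can be sketched as follows. The function names are hypothetical, and the 120 dB and 20 dB inputs are simply the numbers used in the posts above, not measurements.

```python
import math


def pcm_dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB: 20*log10(2**bits)."""
    return 20.0 * math.log10(2 ** bits)


def usable_range_db(loudest_db: float, residual_noise_db: float) -> float:
    """Span between the loudest level considered and the noise left at the ear."""
    return loudest_db - residual_noise_db


sixteen_bit = pcm_dynamic_range_db(16)          # a little over 96 dB
headphone_case = usable_range_db(120.0, 20.0)   # the 100 dB quoted in the post

print(f"16-bit PCM: {sixteen_bit:.1f} dB, headphone case: {headphone_case:.0f} dB")
```

This ignores dither and noise shaping, which change the perceived figure.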

Dynamic range is never really an issue. Dynamic range of music is only about 10 to 20 dB realistically.

And then we haven’t even covered masking. Loud sounds mask quieter ones. With a 100 dB sine source, you won’t hear another at, say, 40 dB anymore. The field of psychoacoustics that you Jon Snowed about has known this for decades and has used these principles to create lossy audio compression codecs with a lot of success.

Frank Dernie · Jun 29, 2022

Bathrone said:
This is a subject close to my heart. I have significant research and time into this field.

When you write significant research and time I suspect you are not referring to decades of full time work after a good scientific foundation in school and university.

Your narrative looks more like the sort of confusion one may suffer following lots of hours on Google with maybe not much of a scientific education as a foundation??

There is no doubt 16-bit is not enough to cover all audible sounds, but in a practical world we have level controls on recorders and volume controls on our reproducing equipment, so we can adjust the level of the sound we want to capture when we record it, depending what it is.
No sane recording company would release a recording which could not be reproduced on any domestic or most professional replay equipment - nobody would be able to play it. Whilst good hifi can be capable of 16-bit linearity, most of the kit that 90% of people listen to music on (car stereos and Alexa- and Siri-driven speakers, for example) cannot - and they, not hifi enthusiasts, are the market for the recordings. 16-bit is too much for them.

Even most classical music has less than 16 bits of dynamic range, and for years music has been compressed to a greater or lesser extent before being inscribed on the recording medium, both to take account of the medium's limitations (as with LPs) and to meet the "musical" requirements of the artists.

So whilst 16-bit isn't theoretically enough unless you want to record both a clap of thunder and birdsong without adjusting level, and have reproduction equipment which could reproduce it if you did - which is probably just the biggest pro monitors made - the issue is moot.

The old chestnut about timing keeps being raised by the scientifically hard of understanding on the internet time and time again despite it never having been technically valid.

I am disappointed that the internet has resulted in the spread of more disinformation than fact. It is irritating when it concerns a subject one knows a bit about and very confusing when trying to find genuine info on subjects one doesn't know much about.

It was better when we just had books and libraries.

Frank Dernie · Jun 29, 2022

voodooless said:
Dynamic range is never really an issue. Dynamic range of music is only about 10 to 20 dB realistically.

Depends on the music, and classical music is not popular in terms of recordings sold, but a 50 dB dynamic range between the quiet bits and loud bits is fairly common IME, even with unamplified vocal music, never mind a Bruckner symphony.

voodooless · Jun 29, 2022

Frank Dernie said:
Depends on the music, and classical music is not popular in terms of recordings sold, but a 50 dB dynamic range between the quiet bits and loud bits is fairly common IME, even with unamplified vocal music, never mind a Bruckner symphony.

Sure, there are always exceptions.

The 96 dB is still enough though.

Let's not forget that one can always add noise-shaped dither to increase the apparent dynamic range to about 115 to 120 dB where the ear is most sensitive.
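A minimal sketch of the idea, under loud assumptions: this is a plain first-order error-feedback shaper with TPDF dither, not the psychoacoustically weighted shaping used on real releases, and it will not reproduce the 115 to 120 dB figure. It only shows that feeding the quantization error back moves noise out of the low part of the band. All names and parameters are hypothetical.

```python
import cmath
import math
import random

N = 1024
rng = random.Random(0)  # fixed seed so the run is repeatable


def tpdf():
    """Triangular dither spanning +/-1 LSB."""
    return rng.random() - rng.random()


# Test signal in LSB units: a sine smaller than one quantization step.
signal = [0.4 * math.sin(2.0 * math.pi * 5 * n / N) for n in range(N)]


def quantize_plain(x):
    """Dither, then round to the nearest integer step."""
    return [math.floor(v + tpdf() + 0.5) for v in x]


def quantize_shaped(x):
    """Same quantizer, but subtract the previous sample's error first."""
    out, err = [], 0.0
    for v in x:
        target = v - err
        q = math.floor(target + tpdf() + 0.5)
        err = q - target          # total error is err[n] - err[n-1]
        out.append(q)
    return out


def low_band_error_power(x, q, top_bin):
    """Error energy in DFT bins 1..top_bin (the lowest part of the band)."""
    e = [b - a for a, b in zip(x, q)]
    power = 0.0
    for k in range(1, top_bin + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / N) for n, v in enumerate(e))
        power += abs(acc) ** 2
    return power / (N * N)


plain_low = low_band_error_power(signal, quantize_plain(signal), N // 16)
shaped_low = low_band_error_power(signal, quantize_shaped(signal), N // 16)
print(f"low-band error power: plain {plain_low:.3e}, shaped {shaped_low:.3e}")
```

The total error power goes up with shaping; it is only redistributed toward high frequencies, which is the trade the post is describing.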

Frank Dernie · Jun 29, 2022

voodooless said:
Sure, there are always exceptions. The 96 dB is still enough though.

Let's not forget that one can always add noise-shaped dither to increase the apparent dynamic range to about 115 to 120 dB where the ear is most sensitive.

Absolutely.
96dB is plenty and more than most, if not all, domestic stereos could reproduce anyway.
In a quiet listening room of 30dB the loud bits would be 126dB and not many systems would be able to get anywhere near that, even if the owner wanted to.

j_j · Jun 29, 2022

Here's some study material. https://www.aes-media.org/sections/pnw/ppt/jj/adc.ppt

And in regard to your 120dB number: https://www.aes-media.org/sections/pnw/ppt/other/limitsofhearing.ppt

Given that the average extremely quiet place in the world these days runs above 20dB SPL, well, we can all do the math. Also, try doing some math looking at the efficiency and power handling of even very good loudspeakers, and look at the actual peak levels that can be achieved by "hi fidelity" speakers.

j_j · Jun 29, 2022

Frank Dernie said:
Absolutely.
96dB is plenty and more than most, if not all, domestic stereos could reproduce anyway.
In a quiet listening room of 30dB the loud bits would be 126dB and not many systems would be able to get anywhere near that, even if the owner wanted to.

Indeed. With pretty much any driver that's not intended for PA use, hitting 120dB SPL is chancy. I know a few that can do it. So that's a maximum of 120-8dB (two ears, not one, for noise level, and possibly 10 instead of 8 but I'm not going to measure your eardrum for you), that means 110 to 112 dB SPL from noise floor to the best you can do, at close range, with things turned way too far up.

Now, let's talk about room noise. If your room noise (properly measured, also considering a variety of things like the masking level of the noise as a function of frequency) is under 25dB you live in a really quiet place. I've been in a room that was NC8, wideband, used to work in it. It takes a lot of money and work to achieve that, even in a quiet building.

So, keep your average listening level under 80dB and your peak under 105, and save your hair cells for tomorrow, not that Frank needed to know that, but some folks obviously have, given some of the settings I've seen these days. The hearing aid industry is going to be really big in another 10 years.

Deleted member 16543 · Jun 29, 2022

j_j said:
You can suit yourself, but removing the inaudible parts via the cochlear filter, which has very different cutoff frequencies and shape, as opposed to via the antialiasing filter, which certainly has a very different shape and characteristics, changes the waveshape OF THE IN BAND PART.

And therein, changing the wave shape (although the frequency content is ALMOST the same) can have different phase as a result of the hair-cell reaction, and can trigger "first firing" differently.

Nonlinearities are a fun.

They sure are. I still maintain that the cochlear filter will most likely have the exact output (neuron firing sequence) under the stimuli of a non-band limited sound vs. its band limited representation, if the band limiting doesn't affect amplitude and phase below 20 kHz (which we can surely do, but it's worth repeating not everybody does correctly in their converters, so.. buyers beware).

The interesting thing to me is that people will stick to their wrong reasons why we *need* more than 44.1 sampling rate, when the only possibly true reason why we *might* need it (although it's a remote chance we actually do, in my opinion) can be argued by only a handful of people.

So it seems we'll have to suffer incursions in threads like this one by people like our new friend for a very looong time.
Maybe our great-grandchildren will get to live in a world without this problem.. Maybe.

j_j · Jun 29, 2022

sax512 said:
They sure are. I still maintain that the cochlear filter will most likely have the exact output (neuron firing sequence) under the stimuli of a non-band limited sound vs. its band limited representation, if the band limiting doesn't affect amplitude and phase below 20 kHz (which we can surely do, but it's worth repeating not everybody does correctly in their converters, so.. buyers beware).

You can maintain all you want, but two different filters, one minimum phase, one constant delay, each one of which has different bandwidth and shape, do not provide the same wave shape as the single minimum phase filter.

On top of that, you need to look at the shape of the high frequency cochlear filters. I know I already said that.

So, you can maintain, but you're wrong.

On top of that, now analyze those two signals (one both filters, one only the cochlear filter) and see when you hit a threshold. You're still oversimplifying, and again, you're still ignoring the nonlinearity. Convinced or not, you should look at the stacked filters vs. the single filter. Any convolution (other than an impulse) makes the signal longer. Math. Simple.
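The "any convolution makes the signal longer" point can be checked with a few lines. This is only a sketch: the two tap lists are made-up stand-ins (a short decaying response and a short symmetric FIR), not models of the cochlea or of any real converter.

```python
def convolve(a, b):
    """Direct-form linear convolution; output length is len(a) + len(b) - 1."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


single = [1.0, 0.5, 0.25, 0.125]           # hypothetical decaying response
antialias = [0.1, 0.2, 0.4, 0.2, 0.1]      # hypothetical symmetric (constant delay) FIR

stacked = convolve(antialias, single)      # both filters in series
impulse_only = convolve([1.0], single)     # a unit impulse changes nothing

print(len(single), len(stacked))
```

Stacking the 5-tap filter in front of the 4-tap one gives an 8-sample response, so the series combination cannot have the same wave shape as the single filter alone.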

Mnyb · Jun 29, 2022

96 dB is not exactly true; it's better where it matters, due to the widespread use of shaped dither over the last 25+ years.

Deleted member 16543 · Jun 29, 2022

j_j said:
You can maintain all you want, but two different filters, one minimum phase, one constant delay, each one of which has different bandwidth and shape, do not provide the same wave shape as the single minimum phase filter.

On top of that, you need to look at the shape of the high frequency cochlear filters. I know I already said that.

So, you can maintain, but you're wrong.

On top of that, now analyze those two signals (one both filters, one only the cochlear filter) and see when you hit a threshold. You're still oversimplifying, and again, you're still ignoring the nonlinearity. Convinced or not, you should look at the stacked filters vs. the single filter. Any convolution (other than an impulse) makes the signal longer. Math. Simple.

Math does say that by putting a minimum phase filter after a wide band signal, vs. adding a linear phase brickwall filter before the minimum phase one, you do get differences in the output.
My point is that the differences have all their content above the cut-off frequency of the brickwall filter. If the cut-off is 20 kHz, the difference has a spectrum that's entirely confined to the ultrasonic region.
This may not make the outputs exactly equivalent, mathematically speaking. But it does make them sonically equivalent.
That's all I'm saying.

And if there is a compression factor (as in the cochlea), the difference in the harmonics generated is, again, all above the brickwall cut-off frequency.
Unless one can prove that, say, a two-tone 20 + 22 kHz signal can generate an actually perceived 2 kHz signal due to non-linearities of the ear.
To this day I'm not aware of anybody that was able to prove that (but it is a possibility, remote as it may be in my opinion).
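For reference, this is the mechanism in question, sketched with a toy memoryless nonlinearity. The 0.1 second-order coefficient and the 192 kHz rate are arbitrary assumptions, and nothing here models the ear or says anything about perception; it only shows that a squared term turns 20 kHz + 22 kHz into a 2 kHz difference product.

```python
import cmath
import math

FS = 192_000   # assumed sampling rate, high enough to hold every product
N = 1_920      # 10 ms of samples, so DFT bins sit on exact multiples of 100 Hz


def tone_amplitude(x, freq_hz):
    """Amplitude of the component at freq_hz, from a single DFT bin."""
    k = round(freq_hz * len(x) / FS)
    acc = sum(v * cmath.exp(-2j * math.pi * k * n / len(x)) for n, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)


two_tone = [
    math.cos(2.0 * math.pi * 20_000 * n / FS) + math.cos(2.0 * math.pi * 22_000 * n / FS)
    for n in range(N)
]

# Hypothetical weak second-order distortion: y = x + 0.1 * x**2
bent = [v + 0.1 * v * v for v in two_tone]

amp_linear = tone_amplitude(two_tone, 2_000)   # nothing at 2 kHz in the input
amp_bent = tone_amplitude(bent, 2_000)         # difference tone from the x**2 term

print(f"2 kHz amplitude: linear {amp_linear:.2e}, with x^2 term {amp_bent:.3f}")
```

Whether any such product is generated at a perceptible level in a real ear is exactly the open question raised in the post.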

krabapple · Jun 29, 2022

Bathrone said:
This is a subject close to my heart. I have significant research and time into this field. High Res is necessary because

Hi res is useful during recording/production, and for audio processing at home. It is unnecessary for a delivery format.

Bathrone said:
Yes, of course, that's been done. I realise the first one (was it Maye or Mayor or something? I'll look it up later) showed the engineering society that it was statistically good. Then later, a whole series of papers were released to the engineering society showing the contrary, where serious and fair criticisms were made of the original paper. I believe, from memory, there was also a meta-analysis of all the papers.

It was Meyer & Moran (2007). As for the rest, you'd best stop relying on memory, mate. Yours is obviously sketchy.

Yes, there was a meta-analysis (Reiss, 2016) and its results, too, are rather controversial and their significance has been dissected at length. One hint: meta analysis results are highly dependent on which studies and data are included or excluded.

You might want to also lookup this j-j fellow you're debating with. Then apologize and resign with all humility.

Bathrone said:
I support your rationalism, and though I agree on your intent and method, the data does show CD is inadequate. I'll see about references.

Yeah, we can hardly wait. No one's ever tilted at this windmill before. Have at it, Quixote.

CapMan · Jun 29, 2022

I was enjoying Audioscience Review until I read this thread

I think I’ll just go away and enjoy listening to some music.

Sal1950 · Jun 29, 2022

CapMan said:
I was enjoying Audioscience Review until I read this thread

I think I’ll just go away and enjoy listening to some music.

OK, Bye

j_j · Jun 29, 2022

sax512 said:
This may not make the outputs exactly equivalent, mathematically speaking. But it does make them sonically equivalent.

This is where you err. Since the OUTPUT of either the single or joint filter is what drives the inner hair cell, THAT is what ***DETERMINES*** the sonic effect at the detector. There is no more filter after that, so anything that leaked through can have an effect.

This is what you need to understand. While you may never hear a 21kHz tone, it is indeed possible that a bit of 21 kHz content leaking past the cochlear filter COULD (not "does") affect the detection. After all, it's that movement that triggers the detector. There is nothing to gain at this point by making a small incremental change that will absolutely be clear.

Nonlinearities suck.

Consider how FIR's work (constant delay FIR's, not the general case). In a very real sense your filter has energy outside the cutoff (talking about an LPF) that is REMOVED by being antiphase in the second, symmetric half of the filter. If your "detector" detects something BEFORE the second half comes along, whoops, the nonlinearity bit you.

Try this. Make yourself a really sharp antialias filter, with say 1 dB ripple and 90 dB rejection, in Matlab. Make it at 96 kHz.

NOW analyze that impulse response sample by sample, using a 64 tap Hann windowed FFT. Move that sample by sample along the impulse response. Look what you get. Maybe THIS will show you what I'm talking about. The ear does not have to, and does NOT, consider the entirety of a filter on an impulse; it detects on the part of it that corresponds to the CURRENT TIME with its impulse response width, NOT the whole filter length.
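A rough Python version of that experiment, for anyone without Matlab. Assumptions, stated loudly: the filter is a 255-tap Blackman-windowed sinc with a 21 kHz cutoff (a stand-in with roughly 74 dB stopband, not the 1 dB / 90 dB design described), "out of band" means 24 kHz and above, and a plain DFT replaces the FFT. It says nothing about audibility; it only compares the out-of-band share of the whole filter with that of short windowed slices.

```python
import cmath
import math

FS = 96_000.0      # sampling rate from the post
CUTOFF = 21_000.0  # assumed cutoff
STOP = 24_000.0    # assumed edge above which energy counts as out of band
TAPS = 255         # assumed filter length
WIN = 64           # analysis window length from the post


def lowpass_taps(n_taps, cutoff, fs):
    """Blackman-windowed sinc low-pass, symmetric (constant delay)."""
    centre = (n_taps - 1) / 2.0
    fc = cutoff / fs
    taps = []
    for n in range(n_taps):
        m = n - centre
        ideal = 2.0 * fc if m == 0 else math.sin(2.0 * math.pi * fc * m) / (math.pi * m)
        phase = 2.0 * math.pi * n / (n_taps - 1)
        taps.append(ideal * (0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2.0 * phase)))
    return taps


def out_of_band_fraction(x, nfft, fs, f_stop):
    """Share of spectral energy (bins 0..nfft/2) at or above f_stop."""
    total = high = 0.0
    for k in range(nfft // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / nfft) for n, v in enumerate(x))
        energy = abs(acc) ** 2
        total += energy
        if k * fs / nfft >= f_stop:
            high += energy
    return high / total


h = lowpass_taps(TAPS, CUTOFF, FS)
hann = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (WIN - 1)) for n in range(WIN)]

whole = out_of_band_fraction(h, 1024, FS, STOP)   # the complete impulse response
sliding = [
    out_of_band_fraction([h[s + n] * hann[n] for n in range(WIN)], WIN, FS, STOP)
    for s in range(TAPS - WIN + 1)                 # one 64-sample slice per start index
]
worst = max(sliding)

print(f"whole filter: {whole:.2e}   worst 64-sample slice: {worst:.2e}")
```

Taken as a whole, the filter has almost nothing above 24 kHz, but individual 64-sample slices carry a far larger share, because the cancellation only happens once the rest of the response has arrived.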

Hence my willingness to move to a 64 kHz sampling rate, where any presently conceivable mechanism can be ignored. Going to 96, or 128, or 192 kHz, etc., simply makes storage a lot harder. Remember, when you double the sampling rate, you now require filters that are twice as long, at twice the rate, for 4x the calculations in an FIR. Furthermore, many IIR filters that work just fine in single precision require double precision when you raise the sampling rate.
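The 4x figure follows from two factors of two, which a back-of-envelope sketch makes explicit. The rule of thumb that tap count scales as sampling rate over transition width is standard, but the constant `k` and the 2 kHz transition width below are arbitrary assumptions.

```python
def fir_macs_per_second(fs_hz, transition_hz, k=4.0):
    """Multiply-accumulates per second for a direct-form FIR.

    Taps scale roughly as k * fs / transition width (k depends on the design),
    and every output sample, produced fs times per second, needs one MAC per tap.
    """
    taps = k * fs_hz / transition_hz
    return taps * fs_hz


base = fir_macs_per_second(48_000, 2_000)
doubled = fir_macs_per_second(96_000, 2_000)   # same transition width in Hz
ratio = doubled / base

print(f"cost ratio after doubling the rate: {ratio:.1f}x")
```

The ratio does not depend on `k`, only on keeping the same transition width in Hz.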

Somewhere I have a nice photo of sliding an analysis window that is a reasonable fit to the HF cochlear filter length along a very tight antialias filter. It's been a while, and nobody's debated this for quite a while, but it shows quite graphically how this can go wrong.

j_j · Jun 29, 2022

krabapple said:
Hi res is useful during recording/production, and for audio processing at home. It is unnecessary for a delivery format.

This is absolutely true, both parts.

CapMan · Jun 29, 2022

Sal1950 said:
OK, Bye

Oh bless - what is the purpose of our audio systems? I assume we are all music lovers who want to enjoy and appreciate the talents of our preferred musicians.

Or maybe it really is just about the science. Pretty sure the performers on the recordings didn’t have that in mind.

voodooless · Jun 29, 2022

CapMan said:
Or maybe it really is just about the science.

It’s science that made it possible to enjoy all that fabulous music.
