
MQA creator Bob Stuart answers questions.

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,595
Likes
25,494
Location
Alfred, NY
Regarding the first. Amir is of that opinion as contained in his following post: https://www.audiosciencereview.com/forum/index.php?threads/high-resolution-audio-does-it-matter.11/

Also, Sergei linked to the following metastudy:
http://www.aes.org/e-lib/browse.cfm?elib=18296

Does that not look reputable to you?

Regarding pre and post ringing, perhaps my wording wasn't as technically accurate as would be ideal. I am learning as I go here. I'm not saying that analog filters show precausal characteristics in the analog domain; clearly we are talking about digital signal processing here. A better way to put it: linear-phase low-pass filters, in band-limiting the signal, remove upper harmonics, which can result in measurable pre- and post-ringing on sharp transients when played back. That is known as the Gibbs phenomenon. I may not have that technically 100% correct, and I'm not asserting that it would be audible, or that the ringing would occur on normal transients in music (as opposed to square waves), or anything else. Just that it is something that happens with band-limited signal processing. At least that much can be said. Well, that's what our old friend Monty Montgomery says: https://wiki.xiph.org/Videos/Digital_Show_and_Tell#Bandlimitation_and_timing .
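The ringing described above can be demonstrated in a few lines (a sketch, assuming numpy and scipy are available; the filter length and cutoff are arbitrary illustrative choices): band-limit a step with a linear-phase FIR and the symmetric pre/post ripple appears.

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 44100
step = np.concatenate([np.zeros(256), np.ones(256)])  # a sharp transient

# Linear-phase low-pass FIR near 20 kHz; a rectangular window is used
# deliberately to expose the Gibbs ripple of the truncated sinc
numtaps = 255
taps = firwin(numtaps, 20000, fs=fs, window="boxcar")
out = lfilter(taps, 1.0, step)

edge = 256 + (numtaps - 1) // 2        # input edge plus the filter's group delay

print(out.max())                       # overshoot above 1 after the edge
print(out[edge - 40:edge - 2].min())   # ripple below 0 before the edge: pre-ringing
```

Whether any of this is audible on real music is, of course, exactly the question the thread is debating; the sketch only shows the measurable effect on a synthetic transient.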

Again, none of the above is nearly sufficient for the types of assertions that Sergei has been making, as far as I can tell.

Last first: no, I do not consider meta-analysis to be generally reliable. If the data from experiments (rather than data torturing) are not clear, unambiguous, and replicable, IMO things are trending toward Langmuir's pathological science definitions. I fully understand that not everyone agrees with this POV.

I've read Amir's post and this discussion of the Stuart paper (which in retrospect seems to anticipate his commercial endeavor). I remain unconvinced that with correct dithering and more reasonable transition bands, the audibility would persist. It would be nice if Stuart extended this research to setups which aren't deliberately oriented toward causing a difference.

Well, @nscrivener we agree on the bottom line, so here's a toast in your honor!
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,927
Likes
38,002
Last first: no, I do not consider meta-analysis to be generally reliable. If the data from experiments (rather than data torturing) are not clear, unambiguous, and replicable, IMO things are trending toward Langmuir's pathological science definitions. I fully understand that not everyone agrees with this POV.

I've read Amir's post and this discussion of the Stuart paper (which in retrospect seems to anticipate his commercial endeavor). I remain unconvinced that with correct dithering and more reasonable transition bands, the audibility would persist. It would be nice if Stuart extended this research to setups which aren't deliberately oriented toward causing a difference.

Well, @nscrivener we agree on the bottom line, so here's a toast in your honor!
+1

I think one should always be very wary of meta-studies. A meta-study isn't anywhere close to running a blind test with 30 participants and then adding sessions until you have 3,000 participants doing the same exact test. Yet that is what meta-studies attempt to approximate, or claim to: that a bunch of studies with limited clarity in their conclusions can be put together as if one big test had been done, with more in-depth conclusions possible. They aren't even the same test, just tests in the same general direction.
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
Last first: no, I do not consider meta-analysis to be generally reliable. If the data from experiments (rather than data torturing) are not clear, unambiguous, and replicable, IMO things are trending toward Langmuir's pathological science definitions. I fully understand that not everyone agrees with this POV.

I've read Amir's post and this discussion of the Stuart paper (which in retrospect seems to anticipate his commercial endeavor). I remain unconvinced that with correct dithering and more reasonable transition bands, the audibility would persist. It would be nice if Stuart extended this research to setups which aren't deliberately oriented toward causing a difference.

Well, @nscrivener we agree on the bottom line, so here's a toast in your honor!

Thanks for the toast!

Fair enough for you to take a cautionary approach. I am aware of the issues. Social science in particular is dogged by flawed statistical analysis and problems with reproducibility. The scientific community needs to address this and introduce measures like a broad requirement for pre-registration of studies (where the authors wish to seek publication).

Having said that, the author of the meta-analysis looks to have taken a fairly thorough, critical appraisal. It seems reasonable to accept, at least in a cautionary way, that there may be some differences that are audible with higher resolution formats, particularly where the listeners are trained.

Regarding the comment about transition bands: the general commentary seems to be that steeper/higher-order filters cause more ripple in the audible band. At a 44.1 kHz sample rate, if we want to maintain a maximum reproducible frequency of 20 kHz, we need a fairly narrow transition band, requiring a steep filter. Otherwise we trade off with a shallower filter that lets more aliasing artifacts through. So it seems to me that a more reasonable transition band likely goes hand in hand with a higher sample rate.
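The tradeoff above can be sketched numerically with scipy's Kaiser-window filter-length estimate (a hedged example; the 90 dB stopband figure is an arbitrary choice, not from the thread):

```python
from scipy.signal import kaiserord

def taps_needed(fs, f_pass=20000.0, atten_db=90.0):
    """Kaiser estimate of FIR length for a low-pass transitioning
    from f_pass up to the Nyquist frequency fs/2."""
    width = (fs / 2 - f_pass) / (fs / 2)   # transition width, fraction of Nyquist
    numtaps, _beta = kaiserord(atten_db, width)
    return numtaps

print(taps_needed(44100))   # ~2 kHz transition: a long, steep filter
print(taps_needed(96000))   # ~28 kHz transition: far shorter and gentler
```

The filter length shrinks by roughly the ratio of the transition widths, which is the sense in which a higher sample rate buys a "more reasonable" filter.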

[EDIT] - I now realise you were referring to the criticisms of the Stuart paper: that they used non-optimal parameters for their 16-bit/44.1 kHz audio, in that the dither was non-optimal and they chose an unusually high cut-off frequency for the filter. If that's so, it seems fair to demand optimal CD-quality parameters in any comparison with higher-definition formats. Having said that, the meta-study considered 18 different studies that met the criteria for inclusion, so the claim of audibility rests on more than just the Stuart paper.
 
Last edited:

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,595
Likes
25,494
Location
Alfred, NY
So it seems to me that a more reasonable transition band likely goes hand in hand with a higher sample rate.

On the recording end, yes, just as 24 bits is actually an advantage there, as opposed to playback. But recording at a higher sample rate and then downsampling when mastering for playback is pretty much standard, or was while CD was king.
 

Guermantes

Senior Member
Joined
Feb 19, 2018
Messages
486
Likes
562
Location
Brisbane, Australia
In light of the discussion here, it will be interesting to see what responses to Stuart and Craven's paper ensue in the AES journal.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,927
Likes
38,002
Thanks for the toast!

Fair enough for you to take a cautionary approach. I am aware of the issues. Social science in particular is dogged by flawed statistical analysis and problems with reproducibility. The scientific community needs to address this and introduce measures like a broad requirement for pre-registration of studies (where the authors wish to seek publication).

Having said that, the author of the meta-analysis looks to have taken a fairly thorough, critical appraisal. It seems reasonable to accept, at least in a cautionary way, that there may be some differences that are audible with higher resolution formats, particularly where the listeners are trained.

Regarding the comment about transition bands: the general commentary seems to be that steeper/higher-order filters cause more ripple in the audible band. At a 44.1 kHz sample rate, if we want to maintain a maximum reproducible frequency of 20 kHz, we need a fairly narrow transition band, requiring a steep filter. Otherwise we trade off with a shallower filter that lets more aliasing artifacts through. So it seems to me that a more reasonable transition band likely goes hand in hand with a higher sample rate.

[EDIT] - I now realise you were referring to the criticisms of the Stuart paper: that they used non-optimal parameters for their 16-bit/44.1 kHz audio, in that the dither was non-optimal and they chose an unusually high cut-off frequency for the filter. If that's so, it seems fair to demand optimal CD-quality parameters in any comparison with higher-definition formats. Having said that, the meta-study considered 18 different studies that met the criteria for inclusion, so the claim of audibility rests on more than just the Stuart paper.

You know, this is what is conventionally pointed to with higher sampling rates: the wider transition band. It is wider; 48 kHz has a 4 kHz wide band, and 96 kHz will usually have an 8 kHz wide transition band. But in terms of per-octave steepness, those filters are the same. It doesn't have to be that way, but almost everyone does it this way. There are a few exceptions, the best known being Lavry pro converters: at higher sample rates he maintains about 30 kHz of bandwidth before rolling off the response, so at 96 kHz he does have a wider transition band and gentler filtering. Even MQA, IIRC, rolls off at about 35 kHz even when unfolding to higher rates.
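The per-octave point can be checked with quick arithmetic (a sketch; the 20 kHz passband edge at 48 kHz, scaling to 40 kHz at 96 kHz, is assumed from the usual normalized-frequency filter design the post describes):

```python
import math

def octaves(f_lo, f_hi):
    """Width of the band [f_lo, f_hi] in octaves."""
    return math.log2(f_hi / f_lo)

# 48 kHz: 20 kHz passband edge to 24 kHz Nyquist (a 4 kHz transition band)
# 96 kHz: 40 kHz to 48 kHz Nyquist (an 8 kHz transition band)
print(octaves(20e3, 24e3))   # ~0.263 octaves
print(octaves(40e3, 48e3))   # ~0.263 octaves: same per-octave steepness
```

Twice the absolute width at twice the frequency is the same width in octaves, which is why the conventional designs have identical per-octave slopes.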
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
It's fair to say that in carefully controlled conditions, some trained listeners can discriminate high resolution audio from standard resolution audio, on some material.

It's also fair to say that anti-aliasing filters can exhibit measurable characteristics of pre and post ringing when fed test signals that include very sharp transients (i.e. square waves).

Where I have a problem is the extent to which you draw conclusions from the above.

Even if I were to grant your premise (essentially the arguments MQA has put forward about the perceptual effect of "smearing" in the time domain), which I don't, because I don't think the reasons for the differences noted above have been established with any certainty, there is absolutely no doubt that other factors influence the quality of digital audio playback that are orders of magnitude more important than the incremental improvements that might come from increased sample rates, bit depths, and different low-pass filtering schemes. Take a very well recorded and mastered track encoded in 16-bit/44.1 kHz PCM and compare it to an average track in any high-resolution format: the former will sound better, possibly much better. I have many albums in my collection that demonstrate that. Factors like speaker selection and room acoustics also have huge effects on playback quality, again orders of magnitude more significant. I'd almost go so far as to say that you could pick just about anything else in the playback chain to focus on and have a greater chance of getting audible improvements. (OK, maybe that's pushing the point a little too far.)

I like your carefully formulated and fairly weighted response. I agree with everything you said, with the exception of the "extent" statement. I care about all the links in the delivery chain. It just so happens that we've been discussing digital formats and anti-aliasing filters a lot lately. That is not my focus overall, only the focus of this conversation.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
+1

I think one should always be very wary of meta-studies. A meta-study isn't anywhere close to running a blind test with 30 participants and then adding sessions until you have 3,000 participants doing the same exact test. Yet that is what meta-studies attempt to approximate, or claim to: that a bunch of studies with limited clarity in their conclusions can be put together as if one big test had been done, with more in-depth conclusions possible. They aren't even the same test, just tests in the same general direction.

I trust the 2016 meta-study more than I would an enlarged 2014 study. Why? Because of the central limit theorem: https://en.wikipedia.org/wiki/Central_limit_theorem.

Different groups used varied test setups, equipment, audio fragments, participant selection criteria, etc., each of which could be confounding in a specific experiment.

The variety in the values of these potentially confounding variables weakens their influence on the overall result, so the measured variable more closely follows the expected statistical distribution.

Just one study with the same total number of participants and trials would be more subject to systematic errors, and to countless counter-arguments, akin to the allegedly "wrong" choice of dithering method in the 2014 study.
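The argument above can be illustrated with a toy simulation (entirely hypothetical numbers, not a model of any actual study): each small study carries its own setup bias, and averaging across many studies shrinks the systematic component, while one big study keeps its single bias in full.

```python
import random
import statistics

random.seed(1)
TRUE_EFFECT = 0.1                  # hypothetical effect size
N_PER_STUDY, N_STUDIES = 30, 18

def study_mean(bias, n):
    """One study's mean: true effect + that study's setup bias + trial noise."""
    return statistics.fmean(TRUE_EFFECT + bias + random.gauss(0, 1)
                            for _ in range(n))

# 18 small studies, each with its own random setup bias, then pooled
pooled = statistics.fmean(study_mean(random.gauss(0, 0.3), N_PER_STUDY)
                          for _ in range(N_STUDIES))

# One big study with the same total trials, stuck with a single bias draw
big = study_mean(random.gauss(0, 0.3), N_PER_STUDY * N_STUDIES)

print(pooled, big)   # pooled: biases tend to average out; big: one bias in full
```

This only illustrates the statistical mechanism Sergei invokes; whether real hi-res listening studies have biases that behave this independently is the contested question.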
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
It's fair to say that in carefully controlled conditions, some trained listeners can discriminate high resolution audio from standard resolution audio, on some material.
Technically, to be 'science' there has to be a hypothesis about why that should be otherwise it can't be taken any further.

If people really were able to tell the difference, what would be the mechanism by which they were doing it?

If it was the ripple of the filters, then that wouldn't really be a property of "high resolution audio" as such..? If it was the effect of bandwidth limiting, then it would be more useful to say that listeners could hear the effects of bandwidth limiting and then that could be investigated further. Or if listeners were hearing the dither or quantisation noise then that could be investigated further etc.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
I believe it does. Why 192? What's so special about that number? Do you have another theory explaining it?



I wasn't satisfied with these kinds of partial explanation. Wanted to find something that explains the maximum number of strange and controversial phenomena of the audiophile world.



I'm thinking about how to demonstrate to the members of this forum, in a simple and reproducible way, that it is in fact related - in the hearing system, that is. I agree that in the idealized DSP domain it isn't related much.

Not easily reproducible at home. But simple and scientifically reproducible: https://www.nature.com/articles/srep31754.

Executive summary: a poor lab rat is administered an audio signal that lasts only about half a microsecond, with a peak of roughly 250 dB SPL; the rat's outer hair cells corresponding to high frequencies go kaput.

[Figure 1 from srep31754]


Not its stated goal, yet the experiment proves that a very short transient affects the hearing system. In this case, in a dramatic way, with the gruesome result getting "photographed" by the rat's cochlea.

If we scale down the signal's peak, let's say to 120 dB SPL, the mechanism transmitting the energy of the wave to hair cells will still work: the mechanics of this is well-understood.

So, the transient could be heard in reality, yet precision of its timing on a digital record would depend on the sampling rate.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,257
Likes
17,249
Location
Riverview FL
So, the transient could be heard in reality, yet precision of its timing on a digital record would depend on the sampling rate.

I can't hear digits.

What kind of tweeter do we need?
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
Not easily reproducible at home. But simple and scientifically reproducible: https://www.nature.com/articles/srep31754.

Executive summary: a poor lab rat is administered with an audio signal that only lasts about half a microsecond, with peak at about ~250 dB SPL; rat's outer hair cells corresponding to high frequencies go kaput.



Not its stated goal, yet the experiment proves that a very short transient affects the hearing system. In this case, in a dramatic way, with the gruesome result getting "photographed" by the rat's cochlea.

If we scale down the signal's peak, let's say to 120 dB SPL, the mechanism transmitting the energy of the wave to hair cells will still work: the mechanics of this is well-understood.

So, the transient could be heard in reality, yet precision of its timing on a digital record would depend on the sampling rate.
How and why did you associate this experiment with high resolution audio? If presented with this I wouldn't have made the connection!

You're saying that if a very short pulse of energy is enough to destroy some mechanical sensor hardware then that proves that the sensor and the processor it is attached to would have registered the pulse at lower levels?
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
Re. the destructive pulse experiment: it seems like a fair point, sure enough. But do we know that the 'processor' isn't responding to multiple sensor outputs simultaneously? i.e. individual sensors may be stimulated (in the 'skirts' of their resonances - if that's how it works).

Is the brain guaranteed to register this as a short impulse at all, and not something different? Just as a narrow spectral spike of yellow is seen by the eye as indistinguishable from simultaneous spikes at red and green, is a short high amplitude impulse 'seen' by the ear as something else comprised of lower frequencies?

Maybe we should create a new sampling 'paradigm' that doesn't attempt to reproduce these high amplitude ultrasonic signals faithfully, but instead substitutes what the ear would need in order to reproduce their effects (just as a video recording doesn't reproduce a narrow spectral spike at yellow, but substitutes red and green). Or am I straying into the 'voodoo' area of audio?

High-amplitude X-rays would affect vision at some point, I imagine. To simulate it, we could add the blue haze, red spots, and eventual fade to black to the video recording :).
 
Last edited:

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,153
Likes
36,886
Location
The Neitherlands
So, the transient could be heard in reality, yet precision of its timing on a digital record would depend on the sampling rate

I see a risetime of about 50 ns. That corresponds to a frequency of about 5 MHz... ?
I can see how that could be heard, transmitted through air, recorded faithfully, and reproduced faithfully at an SPL of 120 dB, and how it is relevant to audio reproduction.
Even if the sharp rise were smoothed so that one sinewave half encompassed 0.5 µs, that would still mean a frequency of 1 MHz (that is, 1000 kHz).

See no reason to torture any animals for this kind of research... it probably has a reason/function.
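The arithmetic in the post above can be reproduced directly; reading the risetime as a quarter of a sine period is an assumption on my part, but it matches the 5 MHz figure quoted:

```python
def sine_freq_from_rise(t_rise_s):
    """Read the rise as a quarter of a sine period: f = 1 / (4 * t_rise)."""
    return 1.0 / (4.0 * t_rise_s)

def sine_freq_from_half_period(t_half_s):
    """One half-cycle spanning t_half seconds: f = 1 / (2 * t_half)."""
    return 1.0 / (2.0 * t_half_s)

print(sine_freq_from_rise(50e-9) / 1e6)          # 50 ns rise -> 5 MHz
print(sine_freq_from_half_period(0.5e-6) / 1e6)  # 0.5 us half-cycle -> 1 MHz
```

Either way the implied frequency is far above anything any audio chain records, transmits, or reproduces, which is the post's point.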
 
Last edited:

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
Not easily reproducible at home. But simple and scientifically reproducible: https://www.nature.com/articles/srep31754.

Executive summary: a poor lab rat is administered with an audio signal that only lasts about half a microsecond, with peak at about ~250 dB SPL; rat's outer hair cells corresponding to high frequencies go kaput.



Not its stated goal, yet the experiment proves that a very short transient affects the hearing system. In this case, in a dramatic way, with the gruesome result getting "photographed" by the rat's cochlea.

If we scale down the signal's peak, let's say to 120 dB SPL, the mechanism transmitting the energy of the wave to hair cells will still work: the mechanics of this is well-understood.

So, the transient could be heard in reality, yet precision of its timing on a digital record would depend on the sampling rate.

The woo is strong with this one
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,595
Likes
25,494
Location
Alfred, NY
The woo is strong with this one

And it still keeps trying to somehow relate sampling rate with temporal precision, showing that it is incapable of reducing its erroneous thinking. That's a poorly programmed bit of AI. The continual use of Wikipedia is suggestive.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,741
Likes
10,484
Location
North-East
Not easily reproducible at home. But simple and scientifically reproducible: https://www.nature.com/articles/srep31754.

Executive summary: a poor lab rat is administered with an audio signal that only lasts about half a microsecond, with peak at about ~250 dB SPL; rat's outer hair cells corresponding to high frequencies go kaput.



Not its stated goal, yet the experiment proves that a very short transient affects the hearing system. In this case, in a dramatic way, with the gruesome result getting "photographed" by the rat's cochlea.

If we scale down the signal's peak, let's say to 120 dB SPL, the mechanism transmitting the energy of the wave to hair cells will still work: the mechanics of this is well-understood.

So, the transient could be heard in reality, yet precision of its timing on a digital record would depend on the sampling rate.

Do you realize that 240 dB SPL translates into about 200 atmospheres of pressure at the ear? Do you really think the delicate ear structures will not be destroyed by that pressure, regardless of how short or long the impulse might be? I'd be surprised if the rat didn't suffer a concussion. At 120 dB, the pressure is only about 0.0002 of an atmosphere, and we know that level of sound can damage human hearing even with short exposure.
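The pressure figures above are easy to verify (a sketch; the standard dB SPL reference of 20 µPa is assumed):

```python
P_REF = 20e-6     # dB SPL reference pressure, pascals
ATM = 101325.0    # one standard atmosphere, pascals

def spl_to_pascals(db_spl):
    """RMS pressure for a given dB SPL: p = p_ref * 10**(dB/20)."""
    return P_REF * 10 ** (db_spl / 20)

print(spl_to_pascals(240) / ATM)   # ~197 atmospheres
print(spl_to_pascals(120) / ATM)   # ~0.0002 atmospheres (20 Pa)
```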
 

pjug

Major Contributor
Forum Donor
Joined
Feb 2, 2019
Messages
1,779
Likes
1,563
And it still keeps trying to somehow relate sampling rate with temporal precision, showing that it is incapable of reducing its erroneous thinking.

@Sergei is caught in a naive trap. Bob Stuart himself says so: https://www.stereophile.com/content/mqa-questions-and-answers-tutorial-temporal-errors-audio :

Let's just cover what we don't mean when we are talking about blurring in the time domain. Too often people immediately assume we have fallen into the naïve trap that imagines the time-base of a single digital channel to be quantised at the sample rate. To quote Stanley Lipshitz (from [3]):

"One often misunderstood aspect of sampled-data systems is the question of their time resolution—can they resolve details that occur between samples, such as a time impulse or step? ... time resolution is in fact infinitely fine for signals band-limited in conformity with the sampling theorem, and is completely independent of precisely where the samples happen to fall with respect to the time waveform . . ."

Stanley suggests the time-base resolution is 'infinite.' Of course that is only true if we are actually asking the question: 'What is the limit of resolution of relative phase of a continuous sinewave below Fs/2 in a uniformly-sampled channel employing TPDF dither?' Even then our ability to prove the precision, like all measurements, depends on signal/noise ratio.

If there is no dither involved and the samples are quantised, there is an approximate estimate of time-base resolution which is possibly relevant to brief impulsive signals (which by definition do not benefit from dither).

Limit = 1 / (Fs × π × 2^(n−1)), where n is the number of bits.

For 44.1 kHz 16-bit data this resolves to ~220 ps, not to 22.7 µs (= 1/Fs).
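Both halves of the quote can be checked numerically: Stuart's small-signal limit evaluates to about 220 ps, and the timing of a band-limited tone can be recovered from its samples to far better than one sample period (a sketch assuming numpy; the phase-based delay estimator is my illustration, not Stuart's or Lipshitz's method):

```python
import numpy as np

fs, n_bits, N = 44100, 16, 4096

# Stuart's small-signal estimate: 1 / (Fs * pi * 2**(n-1))
limit_s = 1 / (fs * np.pi * 2 ** (n_bits - 1))
print(limit_s * 1e12)              # ~220 picoseconds

# Lipshitz's point: timing of a band-limited tone is not quantized to 1/Fs.
f = 93 * fs / N                    # an exact FFT bin, roughly 1 kHz
tau = 0.1 / fs                     # true delay: one tenth of a sample period
t = np.arange(N) / fs
x = np.sin(2 * np.pi * f * (t - tau))

# Recover the delay from the phase of the f bin of the sampled data
S = np.sum(np.exp(-2j * np.pi * f * t) * x)
tau_est = (-np.angle(S) - np.pi / 2) / (2 * np.pi * f)
print(tau * fs, tau_est * fs)      # both ~0.1 samples
```

With quantization and noise the achievable precision degrades toward the formula's limit, but nowhere near the naïve 1/Fs figure.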
 

pozz

Слава Україні
Forum Donor
Editor
Joined
May 21, 2019
Messages
4,036
Likes
6,829
I'm coming in late to this thread, but there doesn't seem to be a good argument for MQA. If hardware and other updates are required it's simpler to push for a 96/24 PCM standard. This is realistic to some degree even for smaller studios and home producers. Otherwise the pool of music becomes severely limited.

Calling the audibility question moot and focusing on accuracy alone doesn't show MQA sampling/reproduction techniques to be a major, epoch-making improvement. At best it's an innovation that could help retheorize this area in the academy (the tacit point being that if timing accuracy is the goal, an even more precise and more practical format is within reach). The effort should be focused there, not industry change.

Kind of reminds me of DAT tapes, to be honest.
 