You are in the process of making Kunchur's mistake. JND is the difference in loudness you can perceive by ear: play two tones close together in time and pick the louder one, or say they sound the same. Depending upon the frequency, you'll get something between about 0.7 and 1.2 dB for the answer. It's been a while since I read the paper, but I think he was assuming 1.2 dB was the JND.
Yet in blind testing, 0.5 dB loudness differences will result in almost universal recognition of a difference. A 0.25 dB difference will be picked up significantly more often than chance, if not 100% of the time. Only around 0.1 dB will you get random results. If you look at the filters for the harmonics of 7 kHz, they had an effect on the level of the 7 kHz tone itself. Kunchur incorrectly believed keeping differences below 1 dB wouldn't affect results.
So when you look at the results, they are almost exactly what you'd expect if you had simply played 7 kHz tones with a slight difference in level. His test is invalidated right there. I don't have the files anymore, but I once made up some pure 7 kHz tones and sent them to a few people. The level differences matched the effect of his filters, but mine were not filtered. The results were similar to Kunchur's.
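A test like that is easy to reproduce. Here's a minimal sketch (NumPy assumed; the `tone` helper and the specific levels are mine, just for illustration) of generating two 7 kHz tones differing by 0.25 dB:

```python
import numpy as np

def tone(freq_hz, level_db, duration_s=1.0, rate_hz=48000):
    """Generate a sine at freq_hz, scaled by level_db relative to full scale."""
    t = np.arange(int(duration_s * rate_hz)) / rate_hz
    amplitude = 10 ** (level_db / 20)  # dB to linear amplitude
    return amplitude * np.sin(2 * np.pi * freq_hz * t)

# A reference tone and one 0.25 dB quieter -- per the discussion above,
# enough for many listeners to report "a difference" in a blind A/B test.
a = tone(7000, -6.0)
b = tone(7000, -6.25)

# RMS level difference in dB between the two tones
diff_db = 20 * np.log10(np.sqrt(np.mean(b**2)) / np.sqrt(np.mean(a**2)))
print(round(diff_db, 2))  # -0.25
```

Writing such files out and presenting them blind is all it takes to separate "sounds different" from "sounds louder."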
All of this is why level matching by ear for comparison listening does only one thing: it confuses you. You likely won't get levels matched closer than 0.5 to 1 dB, and at those offsets you'll always hear a difference, even though the two sound equally loud to you.
From Kunchur's paper:
As per the earlier discussion in subsection 2.2, only the 7 kHz component exceeds its threshold of audibility. The sound-level changes in all components (individually or collectively) fall below their JNDs. For the shortest discriminable displacement of d = 2.3 mm, we have ΔLp ≈ −0.2 dB (a 5% intensity decrease) for both rms and 7 kHz fundamental levels (Tables I and II, and Fig. 5). The JND (for f ≥ 7 kHz and Lp = 69 dB) is known from Jesteadt et al. [40] to be 0.7 dB (a 15% decrease in intensity). Even the 3 standard-error lower limit of this JND is 0.5 dB (an 11% decrease in intensity). Thus the level changes in the experiment (< 0.2 dB) appear to be subliminal and the discrimination might involve more than just spectral amplitude cues.
In the last portion, with a level difference right around 0.2 dB, one listener was 10 for 10 and the others 4 for 10. At the point where the level differed by 0.5 dB, results were around 8 or 9 of 10. Above that it was 100%. His test was fatally flawed for exactly this reason: mistaking the JND for the level differences that alter blind testing. It is a shame he expended so much effort. I was rather appalled that no one reading his write-ups caught this obvious goof before publication. Equally appalling is the outsized attention his work gets and apparently will get in perpetuity. This very basic flaw kills its relevance right there. Strong is the will of the audiophile wanting extra bandwidth to mean extra sound. More is better.
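The dB-to-intensity conversions quoted from the paper are easy to verify; a quick sketch (the helper name is mine):

```python
# A sound-level change in dB maps to a relative intensity change via
# intensity ratio = 10 ** (delta_db / 10).
def intensity_change_pct(delta_db):
    """Percent intensity decrease for a (negative) dB level change."""
    return (1 - 10 ** (delta_db / 10)) * 100

print(round(intensity_change_pct(-0.2), 1))  # 4.5 -- the paper rounds to "5%"
print(round(intensity_change_pct(-0.7), 1))  # 14.9 -- the "15% decrease" JND
print(round(intensity_change_pct(-0.5), 1))  # 10.9 -- the "11% decrease" lower limit
```

The paper's arithmetic is fine; the flaw is treating the published JND as the threshold at which level differences stop influencing a blind test.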
The other obvious tip off was his misunderstanding the timing resolution of digital audio to be the time between samples taken. Sound familiar?
Thank you for taking the time to explain the JND vs. blind-test deltas so eloquently. Much appreciated. We both believe that Kunchur's experiments were flawed because he took from a book what he believed was an accurate and relevant differentiation threshold for a pure tone, instead of measuring it in the context of his specific experiments, using exactly the same equipment, participants, and blind testing protocol he used for the square waves.
However, I haven't found any evidence pointing to Kunchur's misunderstanding of the Sampling Theorem, or that Bob Stuart and Peter Craven misunderstand it. Let me relay my view on this Theorem, if you will. I will use the definition and proof from
https://ccrma.stanford.edu/~jos/mdft/Sampling_Theorem.html.
What functions are subject to the Sampling Theorem? For an audio signal, let's assume the Theorem's first two technical requirements, that the function be continuous and have a continuous Fourier transform, are satisfied. The third technical requirement is tougher to satisfy though: the Fourier transform must vanish for all angular frequencies |ω| ≥ π/T, where T is the sampling interval in seconds.
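A quick way to see why that band-limit requirement matters: a sine above half the sampling rate produces exactly the same samples as one below it, i.e. it aliases. A minimal NumPy sketch (the 48 kHz rate and the 30/18 kHz pair are just illustrative):

```python
import numpy as np

rate = 48000          # sampling rate, so the limit is rate/2 = 24 kHz
t = np.arange(0, 0.01, 1 / rate)

# A 30 kHz sine violates the band limit; once sampled, it is
# indistinguishable from an 18 kHz sine (30 kHz folds down to 48 - 30).
above_limit = np.sin(2 * np.pi * 30000 * t)
alias = np.sin(2 * np.pi * 18000 * t)

print(np.allclose(above_limit, -alias))  # True: sample-for-sample identical
```

This is exactly the failure the third requirement rules out: two different continuous functions producing identical samples.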
Obviously, audio-practical sines with constrained frequencies and amplitudes, and their linear combinations with a finite number of terms, satisfy the third technical requirement of the Sampling Theorem. By audio-practical I mean a sine that could be gradually attenuated toward zero amplitude at its beginning and end without perceptually changing the resulting sound.
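Such an audio-practical sine is easy to construct. A minimal sketch (NumPy assumed; the `faded_sine` helper and the fade length are mine) applying raised-cosine fades at both ends:

```python
import numpy as np

def faded_sine(freq_hz, duration_s, fade_s, rate_hz=48000):
    """A sine gradually attenuated to zero at both ends with raised-cosine
    (Hann-shaped) ramps -- an "audio-practical" sine in the sense above."""
    n = int(duration_s * rate_hz)
    x = np.sin(2 * np.pi * freq_hz * np.arange(n) / rate_hz)
    nf = int(fade_s * rate_hz)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(nf) / nf))  # rises 0 -> 1
    x[:nf] *= ramp          # fade in
    x[-nf:] *= ramp[::-1]   # fade out
    return x

x = faded_sine(7000, 1.0, 0.01)  # 7 kHz tone with 10 ms fades
print(x[0] == 0.0 and x[-1] == 0.0)  # True: starts and ends at silence
```

A 10 ms fade is far too slow to be heard as a click, which is the perceptual point being made.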
Yet plenty of practically relevant functions don't satisfy the third requirement. For instance, look at
https://web.calpoly.edu/~fowen/me318/FourierSeriesTable.pdf or
https://www.utdallas.edu/~raja1/EE 3302 Fall 16/GaTech/fseriesdemo/help/theory.html.
Correspondingly, the Sampling Theorem simply doesn't apply to such practical functions. If we want to perceptually accurately sample audio signals containing such functions as components (e.g., some practically observed transients), we first have to approximate those "inconvenient" functions with other functions that are perceptually equivalent yet satisfy the third technical requirement of the Sampling Theorem.
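As an illustration of that approximation step: a square wave's Fourier series (from the tables linked above) has infinitely many harmonics, but truncating it to the harmonics below half the sampling rate yields a band-limited function that the theorem does cover. A sketch (NumPy assumed; the helper name is mine):

```python
import numpy as np

def bandlimited_square(freq_hz, t, max_freq_hz):
    """Square-wave Fourier series, (4/pi) * sum of sin(2*pi*k*f*t)/k over
    odd k, truncated to harmonics below max_freq_hz so it is band-limited."""
    out = np.zeros_like(t, dtype=float)
    k = 1
    while k * freq_hz < max_freq_hz:
        out += (4 / np.pi) * np.sin(2 * np.pi * k * freq_hz * t) / k
        k += 2  # the series contains only odd harmonics
    return out

rate = 48000
t = np.arange(0, 0.001, 1 / rate)
# A 7 kHz square wave band-limited to rate/2 = 24 kHz keeps only the
# 7 kHz fundamental and the 21 kHz third harmonic -- essentially the
# situation in Kunchur's square-wave stimuli.
sq = bandlimited_square(7000, t, rate / 2)
```

Whether that truncation is *perceptually* equivalent to the original is exactly the empirical question the reply is raising.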
The perceptually accurate approximations of transients may require a lower, equal, or higher sampling rate and bit depth, compared to the sampling rate and bit depth required for the representation of the audio-practical sines.
If the perceptually accurate "transients-friendly" sampling rate and bit depth, for a specific audio signal, a specific delivery system, and a specific individual, happen to be lower than or equal to the perceptually accurate "sines-friendly" sampling rate and bit depth, we can just use the "sines-friendly" sampling rate and bit depth throughout, and the transients will be taken care of as well. Please note that such a sampling rate doesn't necessarily have to be 44.1 kHz or more: some phone systems still happily use a sampling rate of 8 kHz.
If, however, the combination of a specific audio signal, a specific delivery system, and a specific individual is such that the "transients-friendly" perceptually accurate sampling rate and bit depth turn out to be higher than the "sines-friendly" sampling rate and bit depth, we either have to use the "transients-friendly" sampling rate and bit depth throughout, or devise a scheme for encoding the sines and transients via different channels, each using the corresponding "friendly" sampling rate and bit depth.
It took some time for humankind to figure out the practical universal "sines-friendly" sampling rate and bit depth for music. Most people believed those to be 44.1 kHz and 16 bits back in the 20th century. Nowadays, they are believed to be 48 kHz (to accommodate a wider transition region for a perceptually friendlier antialias filter) and 24 bits (a perceptually friendly 20 bits rounded up to the nearest multiple of 8 bits).
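For reference, those bit depths map to dynamic range via the standard ~6.02 dB-per-bit rule of thumb for linear PCM (the helper name is mine; the 1.76 dB term is the usual full-scale-sine-vs-quantization-noise offset):

```python
# Theoretical SNR of an N-bit linear PCM channel with a full-scale sine:
# roughly 6.02 dB per bit plus 1.76 dB.
def dynamic_range_db(bits):
    return 6.02 * bits + 1.76

print(round(dynamic_range_db(16), 1))  # 98.1 -- the 20th-century 16-bit format
print(round(dynamic_range_db(20), 1))  # 122.2 -- the "perceptually friendly" 20 bits
print(round(dynamic_range_db(24), 1))  # 146.2 -- 20 bits rounded up to 24
```

The jump from 20 to 24 bits buys headroom in the format rather than anything demonstrably audible, which matches the "rounded up to the nearest multiple of 8 bits" framing above.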
As to the practical universal "transients-friendly" sampling rate, the debate keeps raging. Judging by the highest sampling rate that audio professionals and buyers of high-resolution audio and video recordings have voted for with their wallets, 192 kHz appears to be sufficient. There are hotly disputed indications that certain rare genres of music benefit from 384 kHz.
The practical universal "transients-friendly" bit depth is commonly believed to be either equal to, or lower than, the "sines-friendly" bit depth. While I haven't seen this assumption scientifically proven in an experiment directly aimed to do so, this does appear to be a reasonable approximation for now. Still, I'd like to see this validated one day.
What MQA ostensibly achieves is a compression ratio that allows storing a perceptually accurate representation of practically encountered music pieces, sampled at the highest currently believed "sines-friendly" and "transients-friendly" sampling rates and bit depths, in the space required for the uncompressed 20th-century 44/16 representation.
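For a rough sense of the compression ratio being claimed, comparing raw stereo PCM data rates (a sketch; assuming the nominal 44.1 kHz for the "44/16" baseline and 192/24 for the high-resolution case):

```python
# Raw (uncompressed) PCM data rate in kilobits per second.
def pcm_rate_kbps(sample_rate_hz, bits, channels=2):
    return sample_rate_hz * bits * channels / 1000

cd = pcm_rate_kbps(44100, 16)      # the 20th-century 44.1/16 baseline
hires = pcm_rate_kbps(192000, 24)  # a "transients-friendly" 192/24 stream

print(cd)                    # 1411.2
print(hires)                 # 9216.0
print(round(hires / cd, 1))  # 6.5 -- roughly the factor MQA must absorb
```

So the claim amounts to fitting about 6.5x the raw data rate into the baseline's space, via a mix of lossy and lossless techniques.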