
What would it mean to use Bayesian statistics in listening tests?

OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,767
Likes
37,627
Hi
6 sigma, in that case, is a QA management method, not a statistical spread width.
I knew what he meant. Six Sigma QA is marketed as six sigma, but with the 1.5 sigma long-term shift built into the method it actually targets about 3.4 defects per million, which is roughly 4.5 sigma performance. Like most things that are marketed, they spruced it up a bit instead of being fully honest.
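A minimal sketch of that arithmetic (assuming the usual Six Sigma convention of a one-sided defect tail and a 1.5 sigma long-term process shift):

```python
from scipy.stats import norm

def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million opportunities for a process at a nominal
    sigma level, after subtracting the conventional long-term shift."""
    return norm.sf(sigma_level - shift) * 1_000_000

for s in (4.5, 5.0, 6.0):
    print(f"{s} sigma -> {dpmo(s):,.1f} DPMO")
# 6.0 sigma -> 3.4 DPMO: the marketed "six sigma" is really the tail
# probability of a 4.5 sigma event. An unshifted 6 sigma tail would be
# norm.sf(6) * 1e6, about 0.001 DPMO.
```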
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,469
Likes
2,466
Location
Sweden
Well, Harman did test that. The same speakers were chosen in the same relative rankings in stereo as in mono. I don't know if they ran tests altering the response for stereo to see if that changed matters, but stereo tests vs. mono tests effectively gave the same results. So continuing with mono tests makes plenty of sense. They did find the rating differences between speakers were smaller in stereo.

I would have expected that a priori. I've found it more difficult to discern fine differences listening to complex orchestral music than to simple music with a voice and one or a few instruments. I think our limited processing bandwidth in hearing gets overloaded by complex sounds. This also fits with what is known about how our brain does pattern matching on our senses. So when trying to pick out differences, I'd expect our hearing to do a better job with one speaker as the sound source than with two. Results of such tests agree. This is just the opposite of what audiophiles believe, of course.

To give a specific example: you pick the best, most preferred speaker found in a mono/single-speaker test. Place a pair of those in the stereo configuration and play a mono source (like pink noise). They will sound different compared to the single speaker in mono, due to comb-filtering effects giving timbral differences. So the preferred speaker in stereo cannot be the same speaker that is preferred in the mono/single configuration.
 
OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,767
Likes
37,627
To give a specific example: you pick the best, most preferred speaker found in a mono/single-speaker test. Place a pair of those in the stereo configuration and play a mono source (like pink noise). They will sound different compared to the single speaker in mono, due to comb-filtering effects giving timbral differences. So the preferred speaker in stereo cannot be the same speaker that is preferred in the mono/single configuration.
Yet they were the same. The mistake is to assume some other pair of speakers would be preferred; they were not. Would one speaker be preferred vs. two? Maybe, but that is beside the point.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,469
Likes
2,466
Location
Sweden
Yet they were the same. The mistake is to assume some other pair of speakers would be preferred; they were not. Would one speaker be preferred vs. two? Maybe, but that is beside the point.

That the same speaker was preferred must be because the other speakers had other errors. The point is that you cannot have two different timbral responses both be the most preferred; it must be one of them. Toole's answer to this is that stereo is flawed, which it is. If a perfectly linear response is preferred in mono use, you must change the frequency curve somewhat in stereo use to mimic the same response as in mono, at least for the majority of the sound, which is created in the phantom images between the speakers.
 
OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,767
Likes
37,627
That the same speaker was preferred must be because the other speakers had other errors. The point is that you cannot have two different timbral responses both be the most preferred; it must be one of them. Toole's answer to this is that stereo is flawed, which it is. If a perfectly linear response is preferred in mono use, you must change the frequency curve somewhat in stereo use to mimic the same response as in mono, at least for the majority of the sound, which is created in the phantom images between the speakers.

So you'd need a speaker very similar to the one testing best in mono, with just the right frequency-response difference to make it work better in stereo than the best mono speaker. And that response only helps with phantom mono images. I don't know that such a specific comparison was done. It also assumes the other aspects, beyond the phantom mono image, don't have effects which might still favor the speaker that was best in mono overall. The testing that was done showed the same speakers scoring best in stereo. My guess would be that an even off-axis response is important in both cases, maybe even more so in stereo. Changing the response to fix the phantom image inherently must worsen the off-axis response, so that is a negative effect, perhaps a bigger negative than the fixed phantom image is a positive. All just guessing on my part, though.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,469
Likes
2,466
Location
Sweden
So you'd need a speaker very similar to the one testing best in mono, with just the right frequency-response difference to make it work better in stereo than the best mono speaker. And that response only helps with phantom mono images. I don't know that such a specific comparison was done. It also assumes the other aspects, beyond the phantom mono image, don't have effects which might still favor the speaker that was best in mono overall. The testing that was done showed the same speakers scoring best in stereo. My guess would be that an even off-axis response is important in both cases, maybe even more so in stereo. Changing the response to fix the phantom image inherently must worsen the off-axis response, so that is a negative effect, perhaps a bigger negative than the fixed phantom image is a positive. All just guessing on my part, though.

What I know is that the effect is real, and it is summarised in Toole's book. It is simple physics. I don't think the off-axis response will worsen, since it will just follow the same pattern as the on-axis. Now, this is a special effect of stereo and will be negligible in multichannel with a center speaker playing. That is also one reason Toole favours multichannel: it fixes some flaws that are there in stereo.

The experiment is simple: set up three identical speakers (center and stereo positions). Play pink noise through the center mono speaker and switch to the stereo pair in A-B fashion, playing mono pink noise. You will hear a timbral change. EQ the stereo speakers to mimic the mono speaker. When you have a similar timbral response, you have the ideal frequency response of the stereo pair for a central phantom image. Then average this adjusted curve with the original and use that as the target response. Compared to the perfectly linear mono speaker, you will end up with a curve within +/- 1 dB or so in the 1-8 kHz range for the stereo pair.* This compromise will make sound panned left and right sound reasonably correct, and such a speaker will also work well in multichannel.
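Before setting up three speakers, one can see where the effect comes from with a minimal numpy sketch of the idealized crosstalk sum at one ear. The 0.25 ms delay between the near and far speaker's arrivals is an assumed round number for a +/-30 degree setup, and head shadowing and the room are ignored, so this approximates the anechoic (worst) case mentioned in the footnote:

```python
import numpy as np

tau = 0.25e-3                        # assumed extra path delay from the far speaker, s
f = np.logspace(np.log10(200), np.log10(20_000), 500)   # 200 Hz to 20 kHz

# At one ear: the near speaker's signal plus the same signal from the
# far speaker delayed by tau. Normalized so the fully coherent sum
# (low frequencies) reads 0 dB.
response_db = 20 * np.log10(np.abs(1 + np.exp(-2j * np.pi * f * tau)) / 2)

print(f"first cancellation at {1 / (2 * tau):.0f} Hz")      # ~2000 Hz
i = np.argmin(response_db)
print(f"deepest computed dip: {response_db[i]:.1f} dB near {f[i]:.0f} Hz")
```

The first notch landing near 2 kHz in this idealization is at least consistent with the 1-2 kHz vs. 2-5 kHz regions discussed below.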

What does that mean in practice? Looking at a range of very good speakers, they may vary within +/- 1 dB. However, some speakers will vary in the opposite direction from the ideal stereo target curve, and then you deviate from the target by +/- 2 dB. So a design goal for a stereo speaker would be to have just a bit more energy in the 1-2 kHz region compared to the 2-5 kHz region, or at the very least to avoid peaking in the 2-5 kHz region relative to the 1-2 kHz region.

*Just to add that this would be the case in a normal listening room; in anechoic rooms the deviations would be larger.
 

Tks

Major Contributor
Joined
Apr 1, 2019
Messages
3,221
Likes
5,497
Most of us already engage in Bayesian logic automatically and subconsciously, and do it with greater effect as we grow more experienced. It's essentially a critical-thinking algorithm for testing how likely any hypothesis is to be objectively true.

As an example, we take those folks who claim there is "something up top for us to hear" in 192 kHz and beyond music as preposterous.

The reason this occurs so instantly is because Bayes' theorem, in its odds form, goes something like this:

The posterior odds that a claim is true equal the prior odds that it is true, multiplied by the likelihood ratio: how much more probable the observed evidence is if the claim is true than if it is false.
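As a minimal sketch of that update rule (the numbers below are invented purely for illustration):

```python
# Odds-form Bayes update: posterior odds = prior odds * likelihood ratio.
def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """likelihood_ratio = P(evidence | claim true) / P(evidence | claim false)"""
    return prior_odds * likelihood_ratio

def odds_to_prob(odds: float) -> float:
    return odds / (1 + odds)

# A claim we consider very improbable a priori (1:10,000 against),
# updated by weak evidence that is 3x more likely if the claim is true:
posterior = update_odds(1 / 10_000, 3.0)
print(odds_to_prob(posterior))       # ~0.0003: still overwhelmingly improbable

# A likelihood ratio of 1 (evidence equally likely either way) changes nothing:
print(update_odds(1 / 10_000, 1.0))  # prior odds unchanged
```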

In the example I just mentioned, this means: because we've never seen anyone demonstrate hearing even to the ~22 kHz ceiling of 44.1 kHz sampling, let alone the ultrasonic content a 192 kHz rate can carry, and because there is no supporting evidence on top of that, we can safely conclude from our observations that hearing anything in that frequency range is simply an unsubstantiated claim with essentially zero probability of being true.

The nice thing about Bayesian reasoning is that it leaves the door open: either part of the formula (the prior probability, or the likelihood of the evidence given the claim) can always be revised if new evidence comes to light that supports the claim. That keeps the person employing the theorem receptive to new data if it ever comes, rather than convinced that nothing could ever change. This matters because of the annoying ontological arguments that question reality's existence outside the mind (an annoying thing to deal with when you come across such nonsense). So, because we've never seen a rock actually turn into a human, we can be about as sure of that as we are of a rock transforming into a leaf: basically zero evidence, and near-zero prior odds, since neither has ever been observed.

Now you might say, "this is simple math in that case, why even call it a theorem?" Well, if a piece of evidence is equally likely whether the claim is true or not (a likelihood ratio of 1, the 50/50 case), it is neither evidence for nor against the claim, so it can effectively be disregarded as statistically irrelevant to the overall probability you're working toward.
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,485
Likes
4,111
Location
Pacific Northwest
I read that book a year or two ago and really enjoyed it.

I work with Bayesian stuff daily; it's common when analyzing internet-related data. Intuitively, the Bayesian concept is that you start with a phenomenon that has some aspect of randomness in its behavior. If you learn something more about it, the probabilities change; how can you quantify that? The Monty Hall problem is the classic example. You have a 1/3 chance of picking the right door. But suppose after you pick, someone else points to another door that is guaranteed to be empty. Now you have more information. How can you use this information to improve your odds, and quantify the improvement? This can be counter-intuitive. A simpler Bayesian example: I have 2 coins, one fair, the other heads on both sides. If I pick one at random, flip it, and it comes up heads, what's the probability that it's the fair coin? (Obviously, if it comes up tails, it must be the fair coin.)
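For the coin question, a minimal sketch of the arithmetic plus a Monte Carlo check (the simulation is just my own illustration):

```python
import random

# Analytic: P(fair | heads) = P(heads | fair) P(fair) / P(heads)
p_fair_given_heads = (0.5 * 0.5) / (0.5 * 0.5 + 1.0 * 0.5)
print(p_fair_given_heads)                    # 1/3

# Monte Carlo check: pick a coin, flip it, condition on seeing heads.
fair_and_heads = heads = 0
for _ in range(1_000_000):
    fair = random.random() < 0.5             # choose fair or two-headed coin
    if fair:
        if random.random() < 0.5:            # fair coin lands heads half the time
            fair_and_heads += 1
            heads += 1
    else:                                    # two-headed coin always lands heads
        heads += 1
print(fair_and_heads / heads)                # ~0.333
```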

The most common problem I see with ABX/DBT is not Bayesian, but even simpler: it's about precision vs. recall. If the goal is to demonstrate that a difference is audible beyond any reasonable doubt, you want high precision, which necessarily means low recall. But if the goal is to probe the threshold of audibility, you want the opposite: high recall, which necessarily means low precision. When people do ABX tests, they often don't think ahead of time about what their goal is, and fail to set the number of trials and the confidence level accordingly.
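To make that tradeoff concrete, a minimal sketch (the 16-trial count and the 0.7 hit rate for a listener who genuinely hears the difference are assumptions chosen only for illustration):

```python
from scipy.stats import binom

n = 16                                    # number of ABX trials
p_guess, p_real = 0.5, 0.7                # pure guesser vs. genuine 70% discriminator

print(" pass criterion   false-positive rate   detection rate")
for k in range(10, 15):
    fp = binom.sf(k - 1, n, p_guess)      # P(score >= k) for a guesser
    det = binom.sf(k - 1, n, p_real)      # P(score >= k) for the real listener
    print(f"   {k:2d}/16              {fp:.3f}                {det:.3f}")

# A strict criterion (e.g. 14/16) almost never passes a guesser (high
# precision) but also misses most listeners who really hear a difference
# (low recall); a lax criterion does the reverse.
```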
 