
Study on blind testing: Is ABX worse than other protocols?

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,939
Location
Oslo, Norway
Howdy folks, thought I should share an interesting study I came across, which was published just now:

Listening tests in room acoustics: Comparison of overall difference protocols regarding operational power


It's behind an academic paywall, but if anybody wants a pdf copy do send me a message and I'll send it to you (academic publishing is nothing but evil capitalism!). Or search for it using sci cough hub cough.

There have been some discussions on blind testing here before, including a mammoth thread some years ago. Does blind testing make us "blind" to real differences which are there, and can be perceived in normal relaxed listening, but which are difficult to perceive under blind testing protocols? That's the subjectivist claim, but there is little systematic evidence to back it up. This study, though, is probably the most thorough attempt I've seen at comparing different protocols for blind testing, examining their limitations, and asking whether some protocols perform better than others. I'll try to sum it up as best I can.

They used 134 test subjects, and compared various ways of doing it:

- The common ABX method: the listener is asked to say whether X is identical to A or B, with X usually played last (or sometimes in the middle)
- Same/different approach: the listener simply says whether A and B are identical or not
- CR-DTF: a complicated name, but essentially quite similar to ABX, except that "X" is always played first and the listener then decides whether A or B matches it - a kind of "XAB" test
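For concreteness, here's a minimal Python sketch (my own illustration, not from the paper) of how a single trial is structured under each of the three protocols. The function names and the dict fields are just mine; note that chance performance is 1/2 in all three cases:

```python
import random

def abx_trial():
    """ABX: A and B are known; X is a hidden copy of one of them,
    typically played last. The listener says whether X = A or X = B."""
    x = random.choice(["A", "B"])
    return {"order": ["A", "B", "X"], "hidden_x": x, "chance": 0.5}

def same_different_trial():
    """Same/different: two presentations; the listener says whether
    they were identical."""
    same = random.choice([True, False])
    pair = ["A", "A"] if same else ["A", "B"]
    return {"order": pair, "same": same, "chance": 0.5}

def xab_trial():
    """CR-DTF ('XAB'): the reference X is played first, then A and B;
    the listener picks which of A and B matched X."""
    x = random.choice(["A", "B"])
    return {"order": ["X", "A", "B"], "hidden_x": x, "chance": 0.5}
```

The interesting part is that even though all three have the same guessing rate, the memory load differs: in XAB the reference is heard first and each candidate is compared against a fresh memory trace, which may be why it scores better.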

It's a bit complicated, but here are the protocols they tried out:
[Attachment: blind protocols.PNG - table of the protocols tested]


The test compared recordings of room acoustics over headphones (Sennheiser HD 650): could the listeners perceive the acoustic conditions as different?

So... drumroll... what did they find?
The ABX protocol, which is a common way of doing blind listening in audio, was actually the worst protocol for discerning differences. Same/different testing did better. But the best results were with the CR-DTF test, i.e. the "XAB" test.

I'm posting the table with the results as well (higher score is better):

[Attachment: blind protocols2.PNG - results table, scores per protocol]


I found this very interesting, as ABX has long been the most common method of blind testing in audio. I was surprised that the same/different test did not score highest - that's what I would have assumed initially. The highest-scoring method was the CR-DTF, or XAB, test!

I'm not sure whether it's possible today to do blind testing in foobar2000 using the CR-DTF or XAB format?

Anyways, maybe this can be of interest to some of you. If any of you smart guys on the forum who actually know something about blind testing procedures (unlike me) read the whole paper, I would be interested in hearing what you think of the study.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,499
Likes
25,314
Location
Alfred, NY
I’ll try to sci hub it. In organoleptic testing that I ran, triangle was the method of choice. With most ABX testing, the user can control test order, repetition, and length. Ditto triangle.
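For anyone unfamiliar with the triangle test mentioned above: three samples are presented, two identical and one odd, and the subject picks the odd one out, so chance performance is 1/3 rather than the 1/2 of ABX or same/different. A toy sketch (my own, purely illustrative):

```python
import random

def triangle_trial():
    """Triangle test: two identical samples plus one odd sample are
    presented in random order; the subject must pick the odd one.
    Chance performance is 1/3 (vs. 1/2 for ABX or same/different).
    Returns the presentation order and the identity of the odd sample."""
    odd = random.choice(["A", "B"])
    common = "B" if odd == "A" else "A"
    order = [common, common, odd]
    random.shuffle(order)
    return order, odd
```

The lower guessing rate is one reason the triangle test is popular in sensory (food/drink) work: fewer trials are needed to reach significance.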
 
OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,939
Location
Oslo, Norway
I’ll try to sci hub it. In organoleptic testing that I ran, triangle was the method of choice. With most ABX testing, the user can control test order, repetition, and length. Ditto triangle.

Cool, looking forward to hearing your thoughts, if you get the time to read it.
 

ahofer

Master Contributor
Forum Donor
Joined
Jun 3, 2019
Messages
5,037
Likes
9,108
Location
New York City
Interesting paper. But in this hobby, *any* controlled testing is often considered too large an ask. We mustn’t let the perfect blind test be the enemy of all controlled testing.
 

Soniclife

Major Contributor
Forum Donor
Joined
Apr 13, 2017
Messages
4,510
Likes
5,437
Location
UK
With most ABX testing, the user can control test order, repetition, and length. Ditto triangle.
I find this critical for getting good results; a regimented presentation is not good. If they excluded this from their testing, you have to question why they did.
 

DVDdoug

Major Contributor
Joined
May 27, 2021
Messages
3,023
Likes
3,977
Does blind testing make us "blind" to real differences which are there, and can be perceived with normal relaxed listening, but which are difficult to perceive with blind testing protocols?

This is a common "audiophile excuse" for "failing" the test. IMO, making a listening test blind NEVER makes it worse!
The ABX protocol, which is a common way of doing blind listening in audio, was actually the worst protocol for discerning differences.

That's hard to believe. But an ABX test is simply to determine IF you can reliably hear a difference. It doesn't tell you which is better or what the differences are.

I didn't read the paper but it says "room acoustics" so I'd assume there IS an audible difference so ABX may not be appropriate.

Same/different approach: Just to say whether A and B are identical or not
An A/B test is pretty useless when A & B are actually identical, unless you are trying to fool the listener. Usually we are comparing two different devices or two different file formats, and we want to know if there is an audible difference. The "X" helps to remove any bias or placebo effect, etc., to get a statistically useful result.
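To put a number on "reliably hear a difference": an ABX session is usually scored with a one-sided binomial test against guessing (p = 0.5). A quick sketch (my own, not from the paper):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value for an ABX session: the probability of
    getting at least `correct` answers right out of `trials` by pure
    guessing (p = 0.5). A small p-value suggests the listener genuinely
    hears a difference rather than guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 correct out of 16 trials:
print(round(abx_p_value(12, 16), 4))  # -> 0.0384
```

So 12/16 just clears the conventional 0.05 threshold; the paper's "operational power" question is essentially how many trials each protocol needs to reach that kind of significance for a real difference.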
 
OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,939
Location
Oslo, Norway
Interesting paper. But in this hobby, *any* controlled testing is often considered too large an ask. We mustn’t let the perfect blind test be the enemy of all controlled testing.

Very much agreed
 
OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,939
Location
Oslo, Norway
I find this critical to get good results, a regimented presentation is not good. If they have excluded this from their testing you have to question why they did.

I think their main "context" is academic blind tests in scientific studies of audio and acoustics - they tested the protocols they think are most common in such studies. So not so much how we audio guys on forums do it... :) They have a fairly thorough discussion of the literature, both on audio and on sensory testing in other disciplines (food, for example), so I don't think they've left anything obvious out on purpose. But I may be wrong, I'm not an expert on blind testing at all.
 

Soniclife

Major Contributor
Forum Donor
Joined
Apr 13, 2017
Messages
4,510
Likes
5,437
Location
UK
I think their main "context" is academic blind tests in scientific studies of audio and acoustics - they tested the protocols they think are most common in such studies. So not so much how we audio guys on forums do it... :) They have a fairly thorough discussion of the literature, both on audio and on sensory testing in other disciplines (food, for example), so I don't think they've left anything obvious out on purpose. But I may be wrong, I'm not an expert on blind testing at all.
If their focus is to find the best quick test, then it makes sense; if it's to find the most sensitive test, I don't see how it's good.
 