
Four-Speaker Blind Listening Test Results (KEF, JBL, Revel, OSD)

sarumbear

Master Contributor
Forum Donor
Joined
Aug 15, 2020
Messages
7,604
Likes
7,324
Location
UK
Is this test done because blind people on average have better hearing? I cannot find anything about that in the description.
You must be joking :facepalm:
 

JohnBooty

Addicted to Fun and Learning
Forum Donor
Joined
Jul 24, 2018
Messages
637
Likes
1,595
Location
Philadelphia area
Is this test done because blind people on average have better hearing? I cannot find anything about that in the description.
It's called that because "information which may influence the participants of the experiment is withheld until after the experiment is complete."

https://en.wikipedia.org/wiki/Blinded_experiment

Sort of like how medical treatments are tested with a placebo/control group. If the participants and researchers knew which group was which, that knowledge could influence the outcomes.
 

JohnBooty

Addicted to Fun and Learning
Forum Donor
Joined
Jul 24, 2018
Messages
637
Likes
1,595
Location
Philadelphia area
You must be joking :facepalm:
This is only the second time I have had to deploy this emoji: :facepalm:
Guys. The member who asked this question is located in the Netherlands. Maybe it is a troll job, but it is much more likely that English is not their primary language. Be kind!
 

CtheArgie

Addicted to Fun and Learning
Forum Donor
Joined
Jan 11, 2020
Messages
512
Likes
778
Location
Agoura Hills, CA.
That is not true. The ratings from this panel were significantly different for OSD vs KEF, OSD vs Revel and JBL vs Revel.
I did not see your analysis when I posted my comment. My apologies.

Based on what YOU posted, it seems that KEF and Revel are indistinguishable for this audience.

I still think it is rather dangerous to post conclusions based on a sample of 10. I would prefer to replicate the findings with a larger sample. Also, I would make sure you have representative numbers of "trained" versus "untrained" listeners.

To my cautious eyes, this is just an interesting observation and worth validating.

Your "overlap" comment is what I looked at in the original graph. If I look at the overlap graph alone as posted #80, the KEF and JBL are "trending" to be different, but I am not sure you can say they will sound different to other groups. There seems to be of little doubt that the OSD appears less preferred. If this is validated, then do the comparison with only the KEF, JBL and Revel to make sure.
 

PeteL

Major Contributor
Joined
Jun 1, 2020
Messages
3,303
Likes
3,846
I did not see your analysis when I posted my comment. My apologies.

Based on what YOU posted, it seems that KEF and Revel are indistinguishable for this audience.

I still think it is rather dangerous to post conclusions based on a sample of 10. I would prefer to replicate the findings with a larger sample. Also, I would make sure you have representative numbers of "trained" versus "untrained" listeners.

To my cautious eyes, this is just an interesting observation and worth validating.

Your "overlap" comment is what I looked at in the original graph. If I look at the overlap graph alone as posted #80, the KEF and JBL are "trending" to be different, but I am not sure you can say they will sound different to other groups. There seems to be of little doubt that the OSD appears less preferred. If this is validated, then do the comparison with only the KEF, JBL and Revel to make sure.
Being ranked the same doesn't mean they are indistinguishable
 

Kachda

Addicted to Fun and Learning
Forum Donor
Joined
May 31, 2020
Messages
909
Likes
1,616
Location
NY
Based on responses thus far, I am 95% confident that 15% of ASR members understand basic statistics and 90% do not.
82% of statistics are made up on the spot anyway
 

GaryH

Major Contributor
Joined
May 12, 2021
Messages
1,356
Likes
1,873
2 of the 12 listeners did. The two of us who organized it knew the speakers going in--but we were careful to randomize things so that neither of us knew which was which. I connected them all up, did the virtual routing, and then covered them behind the blind. My partner then used a random number generator to randomize everything. Inside the software they were just labeled A, B, C, D--but he didn't know which was which. "Speaker 1" for Fast Car and "Speaker 1" for Just a Little Lovin were randomized independently; they may have been the same and they may have been different.

If you'd heard any of the speakers before, then knowing they were among the lineup could have allowed you to identify them (even if subconsciously) and so bias your judgements. What do the results look like if you remove these two (not fully blinded) listeners?
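
A rough sketch of what that re-analysis could look like, assuming the ratings live in a long-format table; the file name, column names, and listener IDs below are placeholders, not the actual dataset:

import pandas as pd
from scipy import stats

df = pd.read_csv("ratings.csv")  # assumed columns: listener, speaker, rating

# Drop the two organizers, who knew which speakers were in the lineup.
blinded = df[~df["listener"].isin(["organizer_1", "organizer_2"])]

means = blinded.groupby(["listener", "speaker"])["rating"].mean().unstack()
for a, b in [("KEF", "Revel"), ("OSD", "KEF")]:
    stat, p = stats.wilcoxon(means[a], means[b])
    print(f"{a} vs {b} (fully blinded listeners only): p={p:.3f}")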
 

LightninBoy

Addicted to Fun and Learning
Joined
Jan 9, 2019
Messages
722
Likes
1,472
Location
St. Paul, MN
Car washing (detailing?) is such an apt metaphor because many find it incredibly therapeutic as they pull out their car washing kit with clay bar, spritzer, sponges and exotic waxes, etc.

That's way deeper than I was thinking when I made that. Guess the meme works on multiple levels!

(And I made it with love, btw)
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,560
Likes
1,705
Location
California
Guys. The member who asked this question is located in the Netherlands. Maybe it is a troll job, but it is much more likely that English is not their primary language. Be kind!

True. But this is the equivalent of someone raising their hand to ask "the heart is that organ that pumps blood, right?" in the middle of an American College of Cardiology conference. So yes, by all means, pause the conference and start drawing pictures of the heart for this person because you are a kind person. :facepalm: <--- and that's the 3rd.
 

ROOSKIE

Major Contributor
Joined
Feb 27, 2020
Messages
1,936
Likes
3,525
Location
Minneapolis
Being ranked the same doesn't mean they are indistinguishable
I would be REALLY interested in the OP taking my earlier ABX suggestion and finding out whether listeners can reliably distinguish between the Revel and the KEF in the set-up/configuration the OP used.
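
For reference, scoring an ABX run comes down to a binomial test against the 50% guessing rate. A quick sketch, with the trial count and the outcome invented purely for illustration:

from scipy import stats

n_trials = 20   # hypothetical number of ABX trials
n_correct = 16  # hypothetical outcome

# One-sided test: how likely is a score this high if the listener
# cannot actually tell the Revel and the KEF apart?
result = stats.binomtest(n_correct, n_trials, p=0.5, alternative="greater")
print(f"{n_correct}/{n_trials} correct, p = {result.pvalue:.4f}")

With 20 trials a listener needs 15 or more correct to land under p = 0.05, which is why short ABX runs are so unforgiving.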
 

PeteL

Major Contributor
Joined
Jun 1, 2020
Messages
3,303
Likes
3,846
I would be REALLY interested in the OP taking my earlier ABX suggestion and finding out whether listeners can reliably distinguish between the Revel and the KEF in the set-up/configuration the OP used.
I still don't get how you would achieve that. One is an in-wall speaker; the sound is not coming from the same place, so of course they will distinguish the two!
 

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
3,533
Likes
4,372
The scores are not significantly different. Thus, I don't think that the panel could distinguish them. Alternatively, they were not different for them.

Logic error. Like @PeteL said. From the OP, "Participants were asked to rate each track and speaker combination on a scale from 0-10 where 10 represented the highest audio fidelity." Two speakers can score 8 out of 10 with two different "2 out of 10 shortfalls" -- it is not logical to say they are indistinguishable. If 2 cars score 8 out of 10 for good looks, does that make them indistinguishable?
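
A toy numeric version of that point, with invented scores, makes it concrete:

import statistics

# Ten listeners score two speakers; every single listener rates them
# differently, yet the averages come out identical.
kef   = [9, 7, 9, 7, 9, 7, 9, 7, 9, 7]
revel = [7, 9, 7, 9, 7, 9, 7, 9, 7, 9]

print(statistics.mean(kef), statistics.mean(revel))  # 8.0 and 8.0
# Equal means, zero ties at the individual level: "scored the same on
# average" does not imply "indistinguishable".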
 

doug2761

Active Member
Forum Donor
Joined
Mar 23, 2020
Messages
155
Likes
271
Thanks for publishing your experiment. The correlation of speaker preference with spin data is interesting to me. I wonder whether performing room EQ to normalize each speaker's frequency response to a preferred curve would reduce the distinction between speakers, and by how much.

I've been playing around with a few headphones lately, applying Oratory's Harman curve PEQ settings to each of them as a way to normalize frequency response. According to Captain Obvious, normalizing should make them sound rather similar to each other, but I was surprised by the extent of the similarity.

I'm curious how anomalous a loudspeaker would need to be before basic room correction is unable to redeem its sound quality. Spin data looks like a good filtering mechanism for tossing out bad performers, but between similar-scoring speakers, digital EQ may make the differences audibly very small.
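
As a sketch of the kind of PEQ normalization being described, here is a single peaking filter from the standard Audio EQ Cookbook applied to a test signal; the center frequency, gain, and Q are placeholder values, not a real correction for any of these speakers:

import numpy as np
from scipy.signal import lfilter

def peaking_biquad(f0, gain_db, q, fs):
    # Peaking-EQ biquad coefficients per the Audio EQ Cookbook (RBJ).
    a = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a, -2 * np.cos(w0), 1 - alpha * a])
    den = np.array([1 + alpha / a, -2 * np.cos(w0), 1 - alpha / a])
    return b / den[0], den / den[0]

fs = 48000
x = np.random.randn(fs)  # one second of noise as a stand-in signal
b, den = peaking_biquad(f0=120, gain_db=-4.0, q=1.4, fs=fs)  # e.g. tame a room mode
y = lfilter(b, den, x)   # the "corrected" signal

A full correction would cascade several of these filters, one per deviation from the target curve.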
 

PeteL

Major Contributor
Joined
Jun 1, 2020
Messages
3,303
Likes
3,846
Logic error. Like @PeteL said. From the OP, "Participants were asked to rate each track and speaker combination on a scale from 0-10 where 10 represented the highest audio fidelity." Two speakers can score 8 out of 10 with two different "2 out of 10 shortfalls" -- it is not logical to say they are indistinguishable. If 2 cars score 8 out of 10 for good looks, does that make them indistinguishable?
Plus, no one scored both equally. It's a preference: the averages are close, but some preferred the Revel and some preferred the KEF; there is just not a trend to say conclusively that one is preferred by most. I feel everything gets mixed up when the words "blind test" come up; they get used to push all kinds of theories that have nothing to do with the test. Let's take the results for what they are; we're on a science-based forum.
 

Gatordaddy

Active Member
Forum Donor
Joined
Apr 1, 2020
Messages
119
Likes
201
@MatthewS congratulations on the well designed and executed experiment

@Semla thanks for running the statistical analysis on it

I'm really impressed that such a modest experiment produced statistically significant results. And that the results support the preference predictions. And that this was still achieved with an ordinary room and untrained listeners. It's very promising and shows the community can get meaningful results with blinded experiments.

It's kind of disheartening to see how many people have piled on who either didn't understand the purpose of the experiment or think the results aren't conclusive enough. Lol, I remember that in all the "labs" I did in college, my error bounds were the same order of magnitude as the thing being measured.
 

Vince2

Active Member
Joined
Dec 7, 2019
Messages
109
Likes
82
Location
Kentucky
I still think it is rather dangerous to post conclusions based on a sample of 10. I would prefer to replicate the findings with a larger sample. Also, I would make sure you have representative numbers of "trained" versus "untrained" listeners.
Actually, it is brave to publish results from a sample this small. It is hard to get significant results with small samples, so the fact that there are two significant effects means those differences are very noticeable. To find subtle significant differences, you want a large sample.
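
A quick power calculation backs this up; the effect sizes below are assumed values, chosen only to show how the required panel size scales:

from statsmodels.stats.power import TTestPower

analysis = TTestPower()  # power for a paired/one-sample t-test
for d in (0.5, 1.0, 1.5):  # assumed standardized effect sizes
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"effect size {d}: ~{n:.0f} listeners for 80% power")

An effect around d = 1.0 needs roughly ten listeners, while a subtle d = 0.5 difference would call for more than three times that.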
 