
Four Speaker Blind Listening Test Results (Kef, JBL, Revel, OSD)

OP

MatthewS

Member
Forum Donor
Joined
Jul 31, 2020
Messages
95
Likes
862
Location
Greater Seattle
Are you able to post the raw scores? I'd like to do a bit of statistical analysis on them.

I've attached the Excel file with the raw results.

Please note, Listener 2 and Listener 5 did not rate on a 10-point scale; they ranked the speakers in order of preference, 1-4. I excluded their data from my analysis and graphs. There were 12 participants in total, but only 10 participants' data were used.
 

Attachments

  • BlindSpeakerTestResults.xlsx.zip
    15.9 KB · Views: 133
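For anyone who wants to run their own numbers, here is a minimal sketch of loading the attachment and applying the same exclusion; the column names (Listener, Speaker, Rating) are assumptions about the sheet layout, not confirmed from the file:

```python
# Minimal sketch: load the raw results and drop the two listeners who
# ranked 1-4 instead of rating on the 10-point scale. Column names
# (Listener, Speaker, Rating) are assumptions about the sheet layout.
import pandas as pd

# Unzip BlindSpeakerTestResults.xlsx.zip first.
df = pd.read_excel("BlindSpeakerTestResults.xlsx")

df = df[~df["Listener"].isin([2, 5])]  # exclude Listeners 2 and 5

print(df.groupby("Speaker")["Rating"].mean().round(1))
```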

FeddyLost

Addicted to Fun and Learning
Joined
May 24, 2020
Messages
752
Likes
543
To EQ the speakers as well as possible (FR, not just level) and re-test, to see how much audible difference remains.

I think that's a good idea for a manufacturer's research center (e.g. when choosing a DSP preset), but not for evaluating commercially available speakers with a mono downmix at a reasonable level.
The differences would be too small to give good results.
 
OP

MatthewS

Member
Forum Donor
Joined
Jul 31, 2020
Messages
95
Likes
862
Location
Greater Seattle
Did you measure the speakers in the room? Just curious as to what that might look like. Very enjoyable to read about. Thanks for sharing.

I didn't--I briefly considered it, but ran out of time. If I set the test up again for other folks, I'll make sure I do.

I do have this graph of the Revels playing with my 2 subwoofers engaged. I used MSO to optimize their response; there isn't any EQ applied above 180 Hz. The room is only about 11x11, so it suffers from some pretty miserable room modes, but the EQ on the multiple subwoofers really works some magic. I've graphed the predicted in-room response on top of the measured response. At some point I'll write up another big post detailing the build and measurements.

Spinpredictvsmeasured.jpg
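A quick way to redraw that kind of comparison from exported data, assuming both curves are saved as two-column frequency/SPL text files (REW-style exports, with header lines prefixed by "*"; file names below are placeholders):

```python
# Sketch: overlay a predicted in-room response on a measured one.
# File names and the two-column freq/SPL format are assumptions; REW's
# text export looks like this, with header lines prefixed by "*".
import numpy as np
import matplotlib.pyplot as plt

pred = np.loadtxt("predicted_response.txt", comments="*")
meas = np.loadtxt("measured_response.txt", comments="*")

plt.semilogx(pred[:, 0], pred[:, 1], label="Predicted in-room")
plt.semilogx(meas[:, 0], meas[:, 1], label="Measured at listening position")
plt.xlabel("Frequency (Hz)")
plt.ylabel("SPL (dB)")
plt.legend()
plt.show()
```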
 

respice finem

Major Contributor
Joined
Feb 1, 2021
Messages
1,867
Likes
3,777
I think that's a good idea for a manufacturer's research center (e.g. when choosing a DSP preset), but not for evaluating commercially available speakers with a mono downmix at a reasonable level.
The differences would be too small to give good results.
I mean EQing for FR in the given room, and then finding out how much difference remains. This is something manufacturers could do in their facilities, comparing with competitors' products, and maybe they even do, but probably not many would want to publish the results (just speculation).
I think this would be interesting because a) most of us will be listening in "normal", untreated or partly treated rooms, b) to music and not test signals, and c) I guess many of us are using room EQ/DSP, or at least have the possibility to do it.
 
OP

MatthewS

Member
Forum Donor
Joined
Jul 31, 2020
Messages
95
Likes
862
Location
Greater Seattle
I mean EQing for FR in the given room, and then finding out how much difference remains.

Above the transition frequency, we wouldn't want to apply EQ based upon room measurements. We can EQ off the anechoic data, though, which we have for all of these speakers. It's on my to-do list to build an above-the-transition-frequency EQ from the spin data for the Revels and do a blind in-room comparison.

I tried applying EQ to the OSD speakers for giggles and it was a disaster. If you try to fix some of the issues you end up with some scary sounds coming out--it might be distortion, but it sounds more like a dying animal.

I've already mostly EQed the room modes, as best I can with only 2 subwoofers. You can see the results above.
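A sketch of what "EQ off the anechoic data" could look like; the file name, CSV layout, and 200 Hz transition point are assumptions, and only the idea (flat target above transition, boosts capped so you don't chase deep dips) comes from the thread:

```python
# Sketch: derive a correction curve from anechoic on-axis data above
# the transition frequency, rather than from room measurements.
# The file name, CSV layout, and 200 Hz transition are assumptions.
import numpy as np

freq, spl = np.loadtxt("revel_anechoic_on_axis.csv",
                       delimiter=",", unpack=True)

mask = freq >= 200.0                 # only correct above the transition
target = spl[mask].mean()            # flat target at the mean level
correction = target - spl[mask]      # positive = boost, negative = cut

# Cap boosts hard: trying to fill deep dips is what produces the
# "dying animal" sounds on a speaker that can't take the drive.
correction = np.clip(correction, -6.0, 3.0)
```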
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,703
Location
California
Pretty awesome experiment.
Unfortunately, it demonstrates that untrained/inexperienced listeners were unable to reliably differentiate their loudspeaker preferences. In other words, the predicted preference scores didn't really predict their blinded preferences; they were only predictive for a subset of your listener sample.
My takeaway was that predicted preference scores may not be terribly helpful for predicting the preferences of the average consumer. Ouch.
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,703
Location
California
Average rating across all songs and participants:
Revel W553L: 6.6
KEF Q100: 6.2
JBL Control X: 5.4
OSD 650: 5.2


Plotted:
View attachment 147692

You can see that the Kef and Revel were preferred and that the JBL and OSD scored worse.

No, I don't see that. The medians are so close and there's so much overlap in your box-and-whisker plot that my initial interpretation was that your listeners were unable to reliably differentiate between the 4 speakers under blind conditions. You'd have to show statistics to say that any speaker was preferred.
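Since every listener rated all four speakers, a repeated-measures test is a natural first check; here is a hedged sketch using a Friedman test, with column names that are assumptions about the spreadsheet layout:

```python
# Sketch: Friedman test across the four speakers. Each listener rated
# every speaker, so treat listeners as repeated measures. Column names
# (Listener, Speaker, Rating) are assumptions about the sheet layout.
import pandas as pd
from scipy.stats import friedmanchisquare

df = pd.read_excel("BlindSpeakerTestResults.xlsx")

# One mean rating per listener per speaker, averaging over songs.
pivot = df.pivot_table(index="Listener", columns="Speaker", values="Rating")

stat, p = friedmanchisquare(*[pivot[c] for c in pivot.columns])
print(f"Friedman chi-square = {stat:.2f}, p = {p:.3f}")
```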
 
OP

MatthewS

Member
Forum Donor
Joined
Jul 31, 2020
Messages
95
Likes
862
Location
Greater Seattle
No, I don't see that. The medians are so close and there's so much overlap in your box-and-whisker plot that my initial interpretation was that your listeners were unable to reliably differentiate between the 4 speakers under blind conditions. You'd have to show statistics to say that any speaker was preferred.

I posted the raw data in post 22.

I think @amirm is going to run some additional analysis. Please weigh in with your own analysis as well.
 

sprellemannen

Active Member
Forum Donor
Joined
Jul 21, 2018
Messages
259
Likes
554
The "results" of the blind test, that is, the bar graphs, are of very little value: there is no statistical analysis, and the results may be due to random variation. I do not think there is a statistical justification for the test's "quasi-statistical" setup.
As a statistician, I advise Amir to remove this "test" from ASR's list of reviews.
 

MCH

Major Contributor
Joined
Apr 10, 2021
Messages
2,642
Likes
2,252
Very interesting indeed, but in principle I agree with preload and sprellemannen: if one wants to draw conclusions, better to carry out a statistical analysis. Otherwise, at first sight, that graph could mean there is no difference.
But in any case, very interesting, and surely a good way to start a debate about speakers from which many of us (at least I) will learn a thing or two! :D
 

DuncanTodd

Active Member
Joined
Nov 2, 2020
Messages
226
Likes
145
Well done!
I'm curious why Hunter was added. Is it a track that appears in other listening tests, or was it chosen for certain characteristics?
It's my own go-to track that I picked fairly at random years ago, as I do zero listening to typical audiophile tracks. For me, the LF and string parts are what make it an interesting test track.
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,873
Likes
16,833
Here is a photo of what the setup looked like after we unblinded and presented results back to the participants:
Nice work and effort, so first of all, sincerely thank you for that. On the other hand, what I have to criticise is the placement of the loudspeakers on the inner part of the table: please place them at least at its front edge next time (even better would be on separate stands), as the large, close, reflective horizontal surface will muddle up their FR and imaging.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,595
Likes
239,612
Location
Seattle Area
I think @amirm is going to run some additional analysis. Please weigh in with your own analysis as well.
I ran a quick F-test between a couple of the samples. First was the JBL Control X against the KEF Q100: that difference is statistically significant, with a p-value of 0.001.

The same test for the KEF against the Revel gives a p-value of 0.1, so it fails the typical p = 0.05 (95% confidence) threshold.
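One way to reproduce a comparison of this kind (not necessarily the exact F test run above) is a one-way ANOVA on the pooled per-song ratings; the speaker labels and column names below are assumptions:

```python
# Sketch: pairwise comparison of two speakers' pooled ratings with a
# one-way ANOVA, which reports an F statistic and p-value. This may
# not be the exact test used above; names are assumptions.
import pandas as pd
from scipy.stats import f_oneway

df = pd.read_excel("BlindSpeakerTestResults.xlsx")

jbl = df.loc[df["Speaker"] == "JBL Control X", "Rating"]
kef = df.loc[df["Speaker"] == "KEF Q100", "Rating"]

f_stat, p = f_oneway(jbl, kef)
print(f"F = {f_stat:.2f}, p = {p:.4f}")
```

Note that pooling songs treats each rating as independent; a repeated-measures approach, as sketched earlier in the thread, is stricter.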
 

FeddyLost

Addicted to Fun and Learning
Joined
May 24, 2020
Messages
752
Likes
543
I mean EQing for FR in the given room, and then finding out how much difference remains.
Then we would have to decide exactly how and what gets EQ'd, and that just adds a lot of variables. Automatic DRC will measure and apply different filters with unpredictable results.
For example, it can boost the LF of sealed speakers; at low SPL this makes them more preferable than they really are, but at some higher SPL they will bottom out.
Manual EQ requires a good understanding of what is going on in the room, and IMO you need to have the speakers at the same point in the room.
So, briefly: I think this would take a lot of effort and move us far away from the speakers as the main subject of investigation.

I guess many of us are using room EQ/DSP, or at least have the possibility to do it.
Not many, really, as far as I can see.
Auto DRC in an AVR, most probably. The conscientious customer is a rare species now.

Unfortunately, it demonstrates that untrained/inexperienced listeners were unable to reliably differentiate their loudspeaker preferences
For sure. They have no hard reference (live sound), no soft reference (good studio sound), and no notion of what "good mastering" means. Speakers are artistic tools for them.
I think a good idea is to use recordings of familiar voices; otherwise newbies will be unable to judge "neutrality".
 

maty

Major Contributor
Joined
Dec 12, 2017
Messages
4,596
Likes
3,167
Location
Tarragona (Spain)
The KEF Q100 sounds better with the bass-reflex port closed. Some mods help too, but the first is easy, as the foam plugs come with the loudspeakers.

If you do not have a subwoofer, you can compensate for the loss of bass with a little equalization.

IMG_2589.png
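For illustration, a minimal sketch of that "little equalization" as a low-shelf boost using the RBJ audio-EQ-cookbook biquad; the corner frequency, gain, and slope are illustrative guesses, not tuned values:

```python
# Sketch: low-shelf boost to offset bass lost when the port is plugged.
# Uses the RBJ audio-EQ-cookbook low-shelf biquad. The 80 Hz corner
# and +4 dB gain are illustrative guesses, not tuned values.
import numpy as np
from scipy.signal import lfilter

def low_shelf(fs, f0, gain_db, S=1.0):
    """Return (b, a) biquad coefficients for an RBJ low shelf."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / 2 * np.sqrt((A + 1 / A) * (1 / S - 1) + 2)
    cosw, sqA = np.cos(w0), np.sqrt(A)
    b = np.array([A * ((A + 1) - (A - 1) * cosw + 2 * sqA * alpha),
                  2 * A * ((A - 1) - (A + 1) * cosw),
                  A * ((A + 1) - (A - 1) * cosw - 2 * sqA * alpha)])
    a = np.array([(A + 1) + (A - 1) * cosw + 2 * sqA * alpha,
                  -2 * ((A - 1) + (A + 1) * cosw),
                  (A + 1) + (A - 1) * cosw - 2 * sqA * alpha])
    return b / a[0], a / a[0]

b, a = low_shelf(fs=48000, f0=80.0, gain_db=4.0)
# y = lfilter(b, a, x) applies the shelf to an audio signal x.
```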
 

PeteL

Major Contributor
Joined
Jun 1, 2020
Messages
3,303
Likes
3,846
I'm curious as to whether A weighting wouldn't be a better choice. After all, if Speaker A had a bit more bass than Speaker B, using C weighting means the ear's sensitive 1-5 kHz range will probably be quieter for Speaker A during the listening tests. Wouldn't that (in general, if not every single time) lead to a preference for Speaker B, by dint of being set to play louder in the ear's sensitive band?

Applying my theory to the in-room responses you show, I would predict a preference order of KEF first, then Revel (close), then JBL, then OSD (with its suppressed 1-3 kHz).

That holds pretty close to your listening test result. Which means the preference order might have been due to the use of C weighting instead of A weighting for the level matching.

Interesting?
In a discussion that I can't find anymore, Amir quoted some excerpts from an Olive study recommending C weighting for level matching, but I don't know the theory behind it.
Edit: sorry, B weighting was mentioned. It's a bit more unusual; not many meters have it.
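For reference, the standard IEC 61672 A- and C-weighting curves can be evaluated directly; this sketch just shows how differently the two count bass toward a level match:

```python
# Sketch: IEC 61672 A- and C-weighting in dB. C weighting keeps bass
# energy in the level match that A weighting largely discards, which
# is the crux of the objection above.
import numpy as np

def a_weight_db(f):
    f = np.asarray(f, dtype=float)
    ra = (12194.0**2 * f**4) / ((f**2 + 20.6**2)
         * np.sqrt((f**2 + 107.7**2) * (f**2 + 737.9**2))
         * (f**2 + 12194.0**2))
    return 20 * np.log10(ra) + 2.00

def c_weight_db(f):
    f = np.asarray(f, dtype=float)
    rc = (12194.0**2 * f**2) / ((f**2 + 20.6**2) * (f**2 + 12194.0**2))
    return 20 * np.log10(rc) + 0.06

for f in (50, 100, 1000, 3000):
    print(f"{f} Hz: A = {a_weight_db(f):+5.1f} dB, C = {c_weight_db(f):+5.1f} dB")
```

At 50 Hz, A weighting attenuates by roughly 30 dB while C weighting attenuates by about 1 dB, so a bass-heavy speaker measures "louder" on a C-weighted meter and ends up turned down relative to a leaner one.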
 
Last edited:

Ellebob

Senior Member
Forum Donor
Joined
Nov 21, 2020
Messages
368
Likes
573
I would be curious to see rankings with a sub to reduce the difference of bass between the speakers.
 