
Binaural blind comparison test of 4 loudspeakers

Which loudspeaker sound do you personally prefer?

  • Loudspeaker A: 7 votes (13.5%)
  • Loudspeaker B: 42 votes (80.8%)
  • Loudspeaker C: 0 votes (0.0%)
  • Loudspeaker D: 7 votes (13.5%)
  • Total voters: 52
  • Poll closed.
Loudspeaker C (0 votes, 0.0%): Klipsch Klipschorn KH 60th Anniversary (as a low anchor test)

And a lot of people bought these and "love" them.

It's no wonder people feel the need to spend $$$$ on voodoo, hoping to make turds like this sound "good".
 
So, here are the results at last, but let me first write a few words about my motivation and choices. :)

Many years ago I was sent a large set of binaural recordings of loudspeakers that had been made in the mid '00s.
Discussions about loudspeaker preference in the famous Harman tests often pop up in this forum: for example, whether the results would be significantly different if the tests were performed in stereo (they are usually done in mono), whether the BBC dip is preferred by many, whether large dipoles such as magnetostats or electrostats are preferred, etc.

Most of the recordings I had were of rather low-priced loudspeakers and, unfortunately, came from different sessions using different music, but I found 4 that were from the same session and reminded me of this famous Harman test:

View attachment 156338
Source

R and I were Revel and Infinity models from the Harman group, B was a B&W (802N, I think) as a representative of the BBC dip school of design, and M was a Martin Logan electrostatic as a representative of the large dipoles.

So I took the 3 loudspeakers I had that were most similar to those, plus one as a low anchor, since it had significant and audible colorations:

Loudspeaker A (7 votes, 13.5%): B&W 802D (as a BBC dip representative, similar to B above)
Loudspeaker B (42 votes, 80.8%): Revel F52 (as a Harman group representative, similar to R and I above)
Loudspeaker C (0 votes, 0.0%): Klipsch Klipschorn KH 60th Anniversary (as a low anchor test)
Loudspeaker D (7 votes, 13.5%): Quad ESL 2805 (as a large dipole representative, similar to M above)

Here are also some measurements of each:

View attachment 156343

Despite the limitations of the recordings and the method, some members posted quite good/close guesses in the last few hours, and the ASR members' voting preference doesn't seem too different from the Harman test above.

I hope that this first, rather limited and amateurish attempt, done mostly for fun, might lead to more and better tests of this kind in the future (existing research shows they can work quite well), and that you had some fun.

I'm left wondering whether it was the BBC dip, 200-500Hz dip, or both that ruined the vocal track for the B&W.
 
I recorded track 1 today from Spotify. Some averaging between channels should probably be applied here, but this is just a plain spectrum analysis of the first 29 s, exported to REW and plotted against the corresponding spectra for the speakers, which I also just made. Very approximate of course, trying to align around 300-400 Hz.

Speaker A:
Speaker A.png

Speaker B:
Speaker B.png

Speaker C:
Speaker C.png

Speaker D:
Speaker D.png

My speaker:
My DIY speaker.png


Some general findings: the spectral content below about 200 Hz deviates in all measurements. Speaker C clearly stands out with its 600 Hz peak. Otherwise there is some variation in the 1-4 kHz range, but no obvious cues. My speakers drop off around 6-7 kHz, which may be due to the odd microphone orientation (pointing upwards). The microphones are omnis, but they still show some decrease at higher frequencies.
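For anyone who wants to try this kind of comparison outside REW, here is a rough sketch of the two steps described above: a channel-averaged spectrum and a level alignment in the 300-400 Hz band. It assumes NumPy and sample arrays that are already loaded. The function names, the FFT size and the plain power averaging are my own illustrative choices, not what REW does internally.

```python
import numpy as np

def averaged_spectrum_db(samples, fs, nfft=8192):
    """Average power spectrum in dB (Hann window, 50% overlapping segments).

    `samples` is shaped (n,) for mono or (n, channels); channels are
    power-averaged, which is the "averaging between channels" step.
    """
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    window = np.hanning(nfft)
    hop = nfft // 2
    power = np.zeros(nfft // 2 + 1)
    count = 0
    for start in range(0, x.shape[0] - nfft + 1, hop):
        seg = x[start:start + nfft] * window[:, None]
        # Power per bin, averaged over the channels of this segment
        power += np.mean(np.abs(np.fft.rfft(seg, axis=0)) ** 2, axis=1)
        count += 1
    if count == 0:
        raise ValueError("recording is shorter than one FFT segment")
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, 10.0 * np.log10(power / count + 1e-20)

def align_in_band(freqs, ref_db, test_db, lo=300.0, hi=400.0):
    """Shift test_db so its mean level matches ref_db between lo and hi Hz."""
    band = (freqs >= lo) & (freqs <= hi)
    offset = np.mean(ref_db[band]) - np.mean(test_db[band])
    return test_db + offset, offset
```

After alignment, `aligned - ref_db` gives the deviation of the recording from the original track at each frequency, which is the quantity being eyeballed in the plots.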
 
I made an EQ adjustment to my raw file, a shelf rising over 4-6 kHz with approximately 6 dB of gain from 6 kHz upwards, and listened again. It sounds much clearer. I think the drop may be caused by the limitation of the omni microphones pointing at the ceiling.
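A shelf like this can be sketched as a single high-shelf biquad using the well-known RBJ Audio EQ Cookbook formulas. The 5 kHz midpoint frequency and the shelf slope of 1 are my assumptions to approximate "rising over 4-6 kHz". The exact filter used for the file above is not known, and all function names are hypothetical.

```python
import cmath
import math

def high_shelf_coeffs(fs, f0, gain_db, slope=1.0):
    """RBJ cookbook high-shelf biquad, normalized so that a[0] == 1."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt(
        (a_lin + 1.0 / a_lin) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(a_lin) * alpha
    b0 = a_lin * ((a_lin + 1) + (a_lin - 1) * cos_w0 + k)
    b1 = -2.0 * a_lin * ((a_lin - 1) + (a_lin + 1) * cos_w0)
    b2 = a_lin * ((a_lin + 1) + (a_lin - 1) * cos_w0 - k)
    a0 = (a_lin + 1) - (a_lin - 1) * cos_w0 + k
    a1 = 2.0 * ((a_lin - 1) - (a_lin + 1) * cos_w0)
    a2 = (a_lin + 1) - (a_lin - 1) * cos_w0 - k
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]

def gain_db_at(b, a, freq, fs):
    """Magnitude response of the biquad in dB at one frequency."""
    z1 = cmath.exp(-2j * math.pi * freq / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))

def biquad_filter(b, a, samples):
    """Apply the biquad to a sequence of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

# Assumed settings: +6 dB shelf with its midpoint at 5 kHz, 44.1 kHz audio
b, a = high_shelf_coeffs(44100, 5000.0, 6.0)
```

With these settings the gain is half the shelf height at the midpoint frequency, close to 0 dB in the bass and close to the full 6 dB near the top of the audio band.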



My DIY speaker 6 kHz boost.png
 
Just as we adjust to rooms, it seems we adjust to situations like this as well, and we still are able to pick out the least flawed DUT despite using a variety of listening devices (headphones, in-ears, ..). Very interesting test for sure. Thanks for doing this.

This is what really surprises me. I had pretty much assumed that these online comparison videos/recordings were useless for anything other than teasing out relative differences. The thinking was that a brighter speaker would sound more neutral if the recording itself were darker. But, it seems there actually may be some ability for us to hear absolute differences as well. It seems that not only can we hear through our own rooms, but we can also hear through the recording room/equipment. Really interesting.
 
It would be interesting to see if these spectral differences in the binaural recordings correlate well with measured in-room frequency responses of the speakers.

The problem with those binaural recordings is that the sound they capture depends on the direction the sound arrives from. If the sound field has little directionality, as with a wide-dispersion speaker in a reflective room, the captured frequency response will be very different from that of a directive speaker in a well-damped room.

Differences in tonality across the midrange are perceived as much larger than expected from looking at on-axis frequency response measurements of the speakers. This can be caused by the recording method using binaural technique.
 
I'll report back when I get them. I actually have no idea what the result will be. I was really surprised either by the low quality of the recordings used in this test or by the speakers themselves (if the recording is OK). There are people on youtube who appear to get very decent results. I am quite sure I won't be able to achieve anything close.

https://www.youtube.com/user/gettingrobbed/videos comes to mind, with the caveat that I heard him a long time ago and can't listen to his videos right now as I am in the office with Logitech Z-10 speakers :)

The goal is a comparison anyway, not accuracy. As we have just seen, even a poor recording of so-so tracks seems to give consistent results.

Ok thanks. I was thinking of getting the Soundman OKM II Classic/Studio Solo model to play around with.
 
My primary concern is how to compare against the original file. I'm not sure how the spectral content should be visualised. I was a bit surprised that there was no hint of treble drop-off in the binaural recordings compared to the original file, either in the samples or in my own recording (except for the >6 kHz drop-off, which could be due to microphone positioning). Perhaps I have to eat my hat.
 
Some general findings are that spectral content below about 200 Hz is deviating in all measurements.

The Roland binaural microphones arrived yesterday and I had some time to play with them today. To answer your previous question: no calibration files, these feel like "toy" microphones. The spec sheet is very limited. Since I don't have much experience with quality microphones, I can't really rate their absolute performance. They can certainly be used to record fun sound scenes.

I can't comment yet as far as how adequate they are as relative speaker comparison tools. I've recorded several pieces on my two main systems (in the same room) and, at this point, I am a bit lost in the circle of confusion. They certainly create a plausible binaural recording, but I wouldn't say it correlates that well with what I actually hear from my speakers. That could be caused by my lack of experience recording stuff, sub-optimal choices I made or just me having bad ears.

Anyway, here are the aligned spectra (in @pkane's deltawave) of a piece I recorded on both systems (after I normalized the remaining level difference in Audacity - so there is again a potential source of difference). In the high frequency range, it seems the microphones picked up the different signal characteristics and filtering in the 2 chains (I'd love to say this was an intentional test, but it is just something I remembered to check afterwards :) I can't hear much above 14kHz anyway)

But what prompted my answer is what also seems to happen below 200Hz (actually 100Hz here) which is similar to what you describe. That's clearly where most of the difference lies (both speakers are rated very near full range). I would think this is the result of room interaction (different woofer topology in those speakers).

And there is even more confusion in my case because I have also measured both speakers with REW and a calibrated umik, at the sweet spot and the REW results don't correlate well with what I see here given that in those the "blue" speaker has a (room induced?) peak at 35Hz that is significantly above the "white" speaker.

It definitely seems that measurements with the umik at the sweet spot don't correlate too well with what I record binaurally at the same sweet spot. I didn't expect absolute accuracy, but I am really surprised that the umik sees the "blue" speaker significantly higher than the "white" speaker in the 25 to 45 Hz range, which is the exact opposite of what the binaural microphones see.

o_Oo_Oo_O

1633194640780.png



Edit:
- speaker position adjusted
- filter on "white" speaker chain adjusted from steep to min phase

firm conclusion: roland binaurals are good enough to accurately catch filter changes.

very uncertain conclusions:

- on one hand, not surprised by the similarity, these are the two speakers I have posted pictures previously and about which I reported failing blind tests that I was 100% sure I would pass sighted.
- on the other hand, looks a bit too good to be true (what is audacity's normalize function exactly doing...)

still puzzled:

- REW still sees blue speaker with the 25 to 45 Hz bump
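On the question above of what a normalize step is doing: as far as I know, Audacity's Normalize effect is peak-based (it sets the peak level to a target and can optionally remove DC offset), so two recordings normalized that way can still differ in average level. The sketch below contrasts peak normalization with a simple RMS match. It assumes NumPy and mono sample arrays. The function names are hypothetical, and this is not Audacity's actual code.

```python
import numpy as np

def peak_normalize(x, target_dbfs=-1.0):
    """Remove DC offset, then scale so the largest sample hits target_dbfs."""
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    peak = np.max(np.abs(x))
    if peak == 0.0:
        return x
    return x * (10.0 ** (target_dbfs / 20.0) / peak)

def rms_db(x):
    """RMS level in dB relative to full scale (1.0)."""
    return 20.0 * np.log10(np.sqrt(np.mean(np.square(x))) + 1e-20)

def match_rms(x, reference):
    """Scale x so that its RMS level equals that of reference."""
    gain_db = rms_db(reference) - rms_db(x)
    return np.asarray(x, dtype=float) * 10.0 ** (gain_db / 20.0), gain_db
```

A single loud transient in one recording is enough to make peak normalization leave it several dB quieter on average than the other, which is one way a "remaining level difference" can creep into a spectral comparison.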


1633198293682.png
 
You might try measuring your speakers with REW using the moving microphone method. You feed it pink noise, use the RTA in REW and move the microphone slowly in an oval or round shape that is at least the size of your two ears or somewhat larger. Or you could measure at the spot that would be about where each ear is and average the result.
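If you go the second route (one measurement per ear position, then average), note that averaging the dB values directly is not the same as averaging the energy. A minimal sketch of a power average, assuming NumPy and magnitude curves already exported in dB on a common frequency grid; the function name is made up for illustration:

```python
import numpy as np

def power_average_db(curves_db):
    """Energy-average several magnitude curves given in dB.

    `curves_db` is shaped (n_measurements, n_frequencies); all curves
    must share the same frequency points.
    """
    curves = np.asarray(curves_db, dtype=float)
    return 10.0 * np.log10(np.mean(10.0 ** (curves / 10.0), axis=0))

# Two hypothetical ear-position measurements at two frequencies
left_db = [80.0, 70.0]
right_db = [80.0, 80.0]
average_db = power_average_db([left_db, right_db])
```

Where the two positions agree the average is unchanged. Where they differ, the louder position dominates, so the result sits above the plain mean of the two dB values.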
 
I will do some pink noise measurements using

Nice. Is it possible to plot the pink noise for the left and right ears, with the original pink noise file as a reference? 1/12 octave smoothing.
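For reference, fractional-octave smoothing like the 1/12 octave requested here can be approximated as below. This is a simple rectangular power average over a window 1/12 octave wide around each frequency bin. It assumes NumPy and an ascending frequency axis. REW's own smoothing may be implemented differently, so treat this only as an illustration.

```python
import numpy as np

def fractional_octave_smooth(freqs, mag_db, fraction=12):
    """Power-average each bin over a 1/fraction-octave window centred on it."""
    freqs = np.asarray(freqs, dtype=float)
    power = 10.0 ** (np.asarray(mag_db, dtype=float) / 10.0)
    half = 2.0 ** (1.0 / (2.0 * fraction))  # half-window as a frequency ratio
    out = np.empty_like(power)
    for i, fc in enumerate(freqs):
        if fc <= 0.0:
            out[i] = power[i]  # leave the DC bin untouched
            continue
        lo = np.searchsorted(freqs, fc / half, side="left")
        hi = np.searchsorted(freqs, fc * half, side="right")
        out[i] = power[lo:hi].mean()  # window always contains bin i
    return 10.0 * np.log10(out)
```

Applying the same smoothing to the left-ear, right-ear and original pink noise spectra keeps the three curves directly comparable.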
 
I did some tests to see if it's possible to fix speaker B with a simple tweak, and I think it can.
Obviously these are very limited examples, but as I showed earlier there's a clear discrepancy between A and B in the 2-4kHz band. So I applied a minimum-phase Q2 cut at 3kHz. With this set to 4dB there's a reasonably good match between the two:
Track 1 (blue=speaker B original; red: speaker A; green: speaker B with EQ):
Track1.PNG

Track 2:
Track2.PNG


As I said earlier, I thought speaker A was a little too restrained, so tried a variety of gains, finally settling on a 3dB cut as the one that retained the liveliness while taming B's native shouty stridency. I've uploaded all the files here if anyone wants to take a listen (these have all been ITU loudness-matched so aren't identical to the ones originally uploaded).
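For anyone who wants to try a similar tweak, a cut like the one described (3 kHz, Q of 2, 3 to 4 dB) can be sketched as a peaking biquad from the RBJ Audio EQ Cookbook. A biquad of this form is minimum phase. The 44.1 kHz sample rate and the function names are my assumptions, and this is not necessarily the filter used for the uploaded files.

```python
import cmath
import math

def peaking_coeffs(fs, f0, gain_db, q):
    """RBJ cookbook peaking EQ biquad, normalized so that a[0] == 1."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0 = 1.0 + alpha * a_lin
    b1 = -2.0 * math.cos(w0)
    b2 = 1.0 - alpha * a_lin
    a0 = 1.0 + alpha / a_lin
    a1 = -2.0 * math.cos(w0)
    a2 = 1.0 - alpha / a_lin
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]

def response_db(b, a, freq, fs):
    """Magnitude response of the biquad in dB at one frequency."""
    z1 = cmath.exp(-2j * math.pi * freq / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))

# The two settings discussed above: 4 dB cut (spectral match), 3 dB cut (by ear)
b4, a4 = peaking_coeffs(44100, 3000.0, -4.0, 2.0)
b3, a3 = peaking_coeffs(44100, 3000.0, -3.0, 2.0)
```

Evaluating `response_db` over a range of frequencies shows how narrow a Q of 2 is: the full cut applies at 3 kHz and the response is back near 0 dB well before the bass or top treble.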

A little bit of judicious EQ can work wonders, but I didn't even try to fix speaker C, which is just too broken to bother with.
 
Found and added some 30°, 60° and 90° directivity measurements to the result post, which show the very different radiation patterns of the 4 loudspeakers. One thing I would also like to add: these are anechoic measurements, and because these very different operating principles interact differently with the room, the perceived and measured bass at the listening position is quite different. For example, the Quad gives rather less bass than in the anechoic measurements due to its dipole radiation pattern, while the Klipschorn gives more due to the interaction of its bass horn with the room walls.
 
I suppose the Quad's cliff-like drop-off beyond 30 degrees is why I made note of the B&W's spatial qualities by comparison.
 

It's curious that while the B&W has better directivity (off axis matches on axis better even way off axis), the Revel is much flatter/smoother on axis. To my ears the Revel sounded better, even though I thought the B&W was second.

And I'm still surprised that the Klipschorn sounded that bad to me, and apparently to everybody else as well. I would really like to hear such blind binaural recordings of other non-traditional speakers, like large horns that have been properly equalized with DSP, omnis, dynamic dipoles, other electrostats, etc. I think that could be educational.
 
The Revel also has better controlled, that is smoother, directivity than the B&W; it doesn't have those discontinuities between 2-10 kHz:

index.php
 
I see your point. What I meant was that the B&W retains more dispersion that is somewhat spectrally correct at 90 degrees than the Revel, but there's no doubt that the Revel overall is both flatter and has a smoother directivity pattern! It's probably not correct to say that the B&W has "better directivity", so I retract that statement.
 
When you show the graphs together on one page the smoothness on and off axis of the F52 just jumps off the page so to speak versus the other speakers. Yet even when I've repeated the listening I prefer the Quad. Even when listening over some Revel F12 speakers which presumably have some similarity in how they work to the F52s. So the F12s apparently are accurate enough I can hear the Quads which I owned for 12 years.
 
Given my own impressions, the Quad measurements are the most interesting. They're actually fairly neutral outside of that big bass boost. Hard to place it relative to the B&W. The B&W maybe sounded better on average, but the Quad was closest to being preferred over A on one of the tracks. In fact, I actually did prefer it for the vocal track on the first few listens. I kinda thought I was hearing a v-curve speaker (which is why I guessed B&W/Paradigm), and while the bass is certainly boosted like that, the treble is actually really flat. I wonder what it is that made the vocals sound good?

Looking at the measurements side by side, the Revel has the best on axis response, as well as the best directivity. Cool to see that the science "held" and most people picked it. Also cool to see that no one picked the one with the worst measurements.
 
I recorded three short sequences today and applied an approximate EQ inverse of my room response, just to experiment further. Again using two OM1 microphones taped to my ears. Should be listened through headphones with a target curve similar to a room (e.g. Harman target).

 