Because I plan to do a few more blind listening comparisons, I wanted to ask here for advice on methodology to maximize the generalizability of such results. You can read the results of my first blind test (where I used the methodology described below) here, where I recently compared the KEF R3 vs. the Ascend Sierra 2EX.
However, I would greatly appreciate as much feedback and as many suggestions as possible for improving the methodology! I want to make sure as much as possible is covered. And, if this thread goes well enough, maybe we can take the combined ideas here and turn them into a sort of loose, informal 'standard' for in-home blind speaker comparisons.
Below I have listed in detail the methodology I used for that test.
Setup:
- Place both speakers as close to each other as possible, spaced equally from the wall behind them.
- Level-match both speakers precisely, to prevent one speaker from "winning" simply because it plays louder than the other.
- Integrate a subwoofer crossed over at 100 Hz to factor out differences in bass extension, since Dr. Toole's research indicates that roughly 30% of speaker preference is determined by bass extension capability alone. It would be interesting to get other thoughts here, though.
- Compare one song at a time, moving on to the next song only after the listener's conclusions for the current song have been recorded.
- For each new song, the first speaker played is called "Speaker A" and the second "Speaker B" from the listener's perspective, but the actual assignment of the two speakers to those labels is randomized for each song. This prevents a cumulative bias from developing in the listener that would void the statistical independence of each song test.
- When comparing a single song, a sub-interval (usually ~30 seconds) of the song is played on "Speaker A" and then on "Speaker B". In my past test I allowed the listener to request a replay of the same segment if they wished. I then proceed to the next ~30 seconds or so, again played on each speaker. The test ends when the listener indicates they are done (either because they have formed a confident preference, or because they have decided they have none).
- When switching speakers, I wait at least ~10 seconds before resuming playback. This is meant to prevent the switch from producing an obvious, audible shift in the apparent source location, since the two speakers sit in slightly different positions (side by side).
- Once the listener has completed the evaluation of both speakers, the following questions are asked:
- Which speaker (if any) did you prefer, for this song?
- Can you explain the differences you heard between Speaker A and Speaker B?
- [When clarification is necessary due to use of ambiguous terms:] Can you explain what you mean by [word/phrase]?
- The last of these three questions is never asked more than three times.
- This is meant to reduce any possibility of expectation bias introduced by the test administrator through varying the number of questions asked (e.g., asking more questions to subconsciously try to invert the results). Beyond this single degree of freedom (whether to ask the clarification question 0, 1, 2, or 3 times), I cannot think of any other administrator influence that would be removed by making the test double-blind (double-blind would be nice, but is often impractical with limited resources).
- This 'interview' portion is transcribed or otherwise recorded, to be compiled into the test results.
- Of course, the more participants in any such test, the better. Though Dr. Toole's research shows that most humans appear to share the same speaker preferences, there is always a chance that with only N=1 (where N is the number of listeners) you'll get unlucky and have a listener whose preferences deviate significantly from the mean, even if the probability of that is low. Please correct me if I'm wrong, though.
- Room choice? I'm not sure, but it seems an "average" or "good" room should be preferred: not necessarily one with extensive, expensive treatments, and not one with severe echo/reflection problems. Other than that, I don't think tests need to be replicated in more than one room for the results to be valid, per Dr. Toole's research. Please correct me if I'm wrong, though.
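To make the level-matching step concrete, here's a minimal Python sketch of the underlying gain math. The function name and the example SPL readings are my own for illustration; the only fact used is that a difference of X dB corresponds to an amplitude (voltage) gain of 10^(X/20):

```python
def gain_for_level_match(spl_a_db: float, spl_b_db: float) -> float:
    """Return the linear gain to apply to speaker B so it matches speaker A.

    A level difference of X dB corresponds to an amplitude gain of 10**(X/20).
    """
    diff_db = spl_a_db - spl_b_db
    return 10 ** (diff_db / 20)

# Example: speaker B measures 2 dB hotter than speaker A at the listening
# position, so B needs to be attenuated by a factor of ~0.794.
print(round(gain_for_level_match(83.0, 85.0), 3))
```

In practice you'd take the SPL readings with pink noise and an SPL meter at the listening position, then dial the resulting gain into the amp or DSP.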
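The per-song randomization of the A/B labels can be sketched as follows. This is just one possible implementation; the function and parameter names are mine, and the speaker names are the two models from my first test:

```python
import random

def assign_labels(songs, speakers=("KEF R3", "Ascend Sierra 2EX"), seed=None):
    """For each song, randomly decide which physical speaker is 'A' and which is 'B'.

    Randomizing the assignment per song keeps each trial statistically
    independent: the listener cannot carry a learned A/B identity from
    one song over to the next.
    """
    rng = random.Random(seed)
    assignments = {}
    for song in songs:
        first, second = rng.sample(speakers, 2)
        assignments[song] = {"A": first, "B": second}
    return assignments

schedule = assign_labels(["Song 1", "Song 2", "Song 3"], seed=42)
for song, mapping in schedule.items():
    print(song, mapping)
```

Passing a seed makes the schedule reproducible, which is handy if you want to document the assignments in the write-up after the fact.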
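The per-song playback protocol (~30 s segment on A, a >=10 s pause to mask the positional shift, the same segment on B, then the listener's choice to replay, advance, or finish) can be sketched like this. The code only builds the playback schedule; actual playback and the listener prompt are outside its scope, and all names are mine:

```python
def run_song_trial(decisions, segment_length=30, switch_gap=10):
    """Build the playback schedule for one song's A/B comparison.

    Each round plays ~segment_length seconds on Speaker A, pauses
    switch_gap seconds (to mask the positional shift between the two
    side-by-side speakers), then plays the same span on Speaker B.
    `decisions` holds the listener's response after each A/B pair:
    'r' = replay the same segment, 'n' = next segment, 'd' = done.
    """
    schedule = []
    start = 0
    for choice in decisions:
        schedule.append(f"A: {start}-{start + segment_length}s")
        schedule.append(f"pause {switch_gap}s")
        schedule.append(f"B: {start}-{start + segment_length}s")
        if choice == "d":
            break
        if choice == "n":
            start += segment_length

    return schedule

for step in run_song_trial(["n", "r", "d"]):
    print(step)
```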
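On the statistics side, even with one listener you can ask whether a preference across multiple songs is distinguishable from coin-flipping. A simple way to do that is an exact two-sided sign (binomial) test; this is my suggestion for analyzing the results, not something from the original protocol:

```python
from math import comb

def sign_test_p_value(wins: int, trials: int) -> float:
    """Exact two-sided sign test for a speaker preference.

    Under the null hypothesis of no preference, each song is a fair coin
    flip. Returns the probability of seeing a split at least this extreme.
    """
    k = max(wins, trials - wins)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)

# Example: one speaker preferred on 8 of 10 songs -> p ~ 0.109,
# i.e. not yet strong evidence of a real preference.
print(round(sign_test_p_value(8, 10), 3))
```

This treats each song as an independent trial, which is exactly what the per-song label randomization above is meant to justify.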