Evidence-based Speaker Designs

MSNWatch · Mar 5, 2019

MattHooper said:
Yes, I'm aware of that in audio.

But it seems rather strange to equate that to food and cars as certainly tastes widely diverge in cars, and especially food. Some people get off on everything from a Volkswagen golf, to a Hummer, to a 50's sports car, and there are clearly massive variations in taste when it comes to food.

So it struck me as an inapt analogy and strange to claim people tend to buy the same cars had like the same food. That's just manifestly false.

But, no need for a false analogy to muck up a perfectly good point, which is that, yes, the work of Floyd Toole et al show that under controlled test conditions people will prefer a certain trend in speaker design.

Just that in the US, a majority will prefer a pick-up or SUV vs a car and burger and fries over a quinoa salad.

noobie1 · Mar 5, 2019

SIY said:
1. Because even if 75% of people tested preferred the sound of speaker A, that leaves 25% who might prefer the sound of something else.
2. DBTs test for preference in sound. That's not the only (or often the main) reason behind purchase decisions.

2) Harman DBT tests single speakers in mono mode correct? To me, it’s a leap of faith to believe that those findings necessarily translate to stereo listening. Clearly, pristine measurement is less important in stereo listening (which is why they conduct the test in mono). It’s entirely plausible to me that other sonic attributes become more important in stereo setups. If Harman has conducted similar DBT for stereo setups, I would love to see the results.

MattHooper · Mar 5, 2019

Ilkless said:
And as I said in an reply in another thread, these were under sighted conditions, where non-acoustic bias can simply distort judgments of the sound:

Our judgments under sighted conditions are unavoidably a composite of acoustic and non-acoustic factors, even if these judgments are framed/articulated as if they were purely evaluations of sound quality.

Also, it is reductive to just take Harman's statistical results at face value as expressions of preference. Instead, there are even more fundamental empirical principles like the precedence effect that underpin this preference.

Yes, I know.

But I don't buy speakers, or listen to speakers, under double blind test conditions. So if in the conditions in which I'd actually listen to speakers this induces the experience of "hearing more enjoyable sound" from A over B, then it's rational to choose A over B.

This isn't strictly like testing between high end AC cables, because in that case we haven't seen any plausible technical story for why they would sound different, and have nothing like the correlations between measurements and perception like we do confirming the audible effects of different speaker design choices. Speakers DO sound different. So, while sighted and other forms of influence may certainly contaminate a consumer's perception, it's also not implausible that the actual, audible differences in sound are part of that mix. It's not implausible (even if less likely in blind testing) that I enjoyed the actual sound of the Devore speaker not *just* how they look and their tech story.

The reply can certainly be "but you don't KNOW - it may be the sound but it remains an unreliable inference UNLESS you have strictly controlled the variables double-blind." Sure. But I'm talking of the process of purchasing speakers for the consumer like myself. I don't have those test facilities. It certainly may be the case that if I could go to the HK lab and double blind test between the Devore and the Revel, that I'd say the Revel sounds better. But statistically that is not absolutely certain and I can't know it's certain before-hand, and it doesn't take in to account whatever other factors may influence my experience of the sound in my normal sighted conditions, which result in my enjoying the Devore speakers more. As I don't have access to the HK lab to shoot out the great many speakers I'm interested in, I'm not going to limit my choices to only HK products, even though from their research they always come out on top in blind tests.

Again, that's my own approach. I understand other people who take a stricter more measurement-based approach in guiding their speaker purchase, where they may even be comfortable just buying a speaker because it measures the way Toole's research suggests it should.
(Though, I would *prefer* a speaker I buy to both measure well and sound great. That's more intellectually satisfying. But ultimately I will go with the experience I have when listening to a speaker as the arbiter).

MattHooper · Mar 5, 2019

MSNWatch said:
Just that in the US, a majority will prefer a pick-up or SUV vs a car and burger and fries over a quinoa salad.

Now that was a stretch ;-)

MattHooper · Mar 5, 2019

noobie1 said:
2) Harman DBT tests single speakers in mono mode correct? To me, it’s a leap of faith to believe that those findings necessarily translate to stereo listening. Clearly, pristine measurement is less important in stereo listening (which is why they conduct the test in mono). It’s entirely plausible to me that other sonic attributes become more important in stereo setups. If Harman has conducted similar DBT for stereo setups, I would love to see the results.

Toole and HK doesn't use anything like a 'leap of faith' in choosing to use single speakers for evaluation. It's well justified choice by their research.
For instance, they found that the rated performance of a single speaker correlates very well to the rated performance of the speaker in pairs, and Toole, if you read him, gives other justifications for why one speaker at a time makes more sense (for instance, as I remember, deviations and coloration, like resonances, can be easier to detect listening to a single speaker).

SIY · Mar 5, 2019

noobie1 said:
2) Harman DBT tests single speakers in mono mode correct?

From my understanding, they have done both mono and stereo. @Floyd Toole might be able to be more specific.

edit: Matt types faster than I do.

noobie1 · Mar 5, 2019

MattHooper said:
Toole and HK doesn't use anything like a 'leap of faith' in choosing to use single speakers for evaluation. It's well justified choice by their research.
For instance, they found that the rated performance of a single speaker correlates very well to the rated performance of the speaker in pairs, and Toole, if you read him, gives other justifications for why one speaker at a time makes more sense (for instance, as I remember, deviations and coloration, like resonances, can be easier to detect listening to a single speaker).

I did a quick search in my copy of Floyd’s book. He writes that the speakers that win mono DBT win stereo DBT but not as convincingly. In stereo mode, the technical abilities of a speaker are harder to distinguish. I think this was the point I was trying to get at. Pristine measurements make less of a difference in stereo settings and other sonic attributes could become more important.

Juhazi · Mar 5, 2019

Sean Olive has a blog of what is done at Harman reseach. Please read through, start from 2008!
These links apply to listener preference and DBT testing

http://seanolive.blogspot.com/2008/12/are-consumer-reports-loudspeaker.html
http://seanolive.blogspot.com/2010/12/how-to-listen-course-on-how-to.html
http://seanolive.blogspot.com/2009/05/harman-international-reference.html

This thread is a about Harman target curve (room response) http://www.audioheritage.org/vbulletin/showthread.php?39134-Harman-Target-Curve
So, yes, ordinary people prefer a different curve than "trained" listeners! But I must ask, which group is "right?" And how about you?

noobie1 · Mar 5, 2019

If there are 10 listeners testing 10 different speakers and one with the best measurements gets 2 votes and no other speaker gets more than single vote then there is clearly a most preferred speaker as it gets 2x many votes as the nearest competitor. However, based on these results we can’t say there is an objective standard for how all speakers should be when a majority of listeners prefer other speakers. The actual numerical results of stereo DBT are very important.

Frank Dernie · Mar 5, 2019

b1daly said:
This is an interesting thread, but IMO the mark has been badly missed. The issue of what makes a “good” speaker is vexingly difficult subject, as there are many different axes that this could be measured on, and perhaps axes that exist but simply cannot be represented as data.

Going back to the earlier part of the thread where someone suggested that using the popularity of a speaker was an example of the argumentum ad populum fallacy: this is an incorrect application of the this idea. That fallacy applies when there is an objective fact about the world that is contentious, and an argument is made that the most popular belief is true. It requires a subject that can fit into the categories of true/false, accurate/innacurate etc...

If a person was trying to ascertain which speaker has the most accurate frequency response, trying to answer this based on popularity would be an example of the argumentum ad populum fallacy.

But I will venture a guess that that this thesis has never been proposed in seriousness.

When it comes to personal preferences or subjective experiences it is not possible to validate some and invalidate others. A total idiot who knows nothing about audio can be just as happy with their crap speakers as the genius with their hyper designed monstrosity.

Resarch about personal preferences, which are subjective, does count as evidence. I think Bose has based their designs on such evidence, and succeeded with this approach. (Personally I think they hit some sweet spots along the way, but once they got into the “spacialized dsp” home theater stuff, and then the portable speaker market, I start to hate the sound. But apparently I and most audio enthusiasts are in the minority here.)

When it comes to subjective preferences, there can be study of the objective elements of the speaker that are the cause of the subjective preference. But if the goal was to “evidence based design,” it seems like evidence from psychoacoustic research would be the proper grounding, with the physics in service to that.

The concept of “scientific” investigation of audio as Amir is doing is interesting, but for practical reasons very limited. The core criteria of this approach, as applied to DAC, amps, streamers is looking most narrowly whether a device is at a minimum copying and transmitting data correctly, and then looking how closely an electrical signal is to its original when passed through gear, after scaling for gain.

In this sense, one can say the gear is “accurate.” But this is not only a limited perspective on the question of whether a speaker is accurate, but reflects that the notion of accuracy, of “hi fi” itself is virtually meaningless outside of a narrow set of parameters, important only to the ever dwindling audience for acoustically generates music.

This standard would be, does a chain of transducer-electricity-a whole bunch of other complicated stuff- electricity- transducer give a listener a close approximation of the original sound.

There are so many reasons why this view of audio is outdated at best, but I will try to point out some glaring examples:

- even if we are considered simply mic’d acoustic music played back on a high quality system, the system could only hope to come close to replicating the sound, of reprocducing it with high fidelity, if the actual sound level was in the ballpark of the original acoustic source. This will almost never be the case, and it is only under the control of the listener. My guess is that the vast majority of real world listening is done at a much lower level than the original event. (Maybe an acoustic guitar or other solo instruments would come the closest.). Most people would find the experience of sitting 10 feet in front of a drum set with a drummer playing hard to be on the upper level of comfort.

Even people who love to blast rock music will do so at much lower relative levels than the actual instruments.

Because of this, people who make recordings of music (which I do) are not concerned with “accuracy” but more with communicating the “expression” of the music. So if you are trying to represent the expression of let’s say a big band orchestra, techniques are required to represent this sound energy without literally recreating the sound energy. This is basically the illusion of “verisimilitude,”

(This lowely “loudness control” of the old days was an attempt to mitigate this.)

Beyond this, as we move from the dawn of the recording industry to now, speaker systems are more properly considered sound “producers” than “reproducers.” Many sound sources are not even electric, never mind acoustic, but actually originate as data and become sound for the first time at the playback speaker

This leads to strange feedback loops, but one of the biggest that had played a role since the beginning of the music business is that people who make recordings have to target the expected listening environment of the audience. This means the technologies and trends of speakers designs themselves become active considerations for record producing choices.

Simply put, if the one is trying to make a hit pop record these days, you have to take into account that primary listening environments will be cars, Bluetooth speakers, various earbuds and headphones, computer speaker systems, home theater systems. The way these systems are voiced, which is all over the damn place, cannot be left out of the production process.

What this means is that for the vast majority of records, the only reference point that accuracy has any meaning would be how close the playback represents the experience of the music makers in the production environment. This is largely dominated by nearfield monitors, with an increasing reliance on subwoofers.

This makes for a moving target. A commercial speaket designer has not only to figure out these vexingly issues of voicing and accuracy, but also countless other aspects that affect the marketability of the product. This is a freakishly tall order, and I agree with the sentiment that some amazing engineering is happening among the larger commercial speaker companies to address it.

FWIW, I purchased a pair of the Neumann KH120s to use as studio montitors, and found them to be a profoundly horrible sounding speaker. Like many modern studio monitors, they presented a sound that at best, at best, I could say was accurate enough that I could use them. Even in that realm, they seemed to fall short in that they tended to, paradoxically, impart a characteristic sound that was superficially flattering to a mix (not what I look for.) They seemed voiced way too bright, and had an “ artificial” quality to them, whatever that vague subjective description might mean. They also have a steady hiss, which was annoying when sitting close.

I sold them relatively quickly as I don’t like to work on speakers that “sound bad.”

I can’t pretend to expertise here, but while I certainly think measuremt and science should be used in speaket design, “accuracy” considered too narrowly can lead speaker designers badly astray.

Interesting observations.
It is certainly the case that the recording, and how it is mixed, is probably the most important aspect of the satisfaction of listening at home, and that most people are now listening to music in a car, or on headphones with a portable in the gym/bus/train.
The sort of recording that will sound good in these conditions will disappoint in a quiet home environment on wide band speakers IME.
I used to record, mainly classical, music using a Decca tree type microphone layout or a Sennheiser dummy head. These recordings sound extremely realistic to me at home and hopeless in the car, where the quiet parts are completely drowned out.
I balanced the sound by moving the microphones during rehearsal.
I believe all we can do is try to put together the least coloured home system since we are at the mercy of the quality of the recordings anyway and altering the system to suit one recording will probably mean is being sub-optimal on others.
I have always thought listening to a favourite recording to evaluate Hi-Fi just means always risking having a system which compensates for any shortcoming in that recording which may not suit many others.
In the end we are royally screwed nowadays since most modern releases are indeed balanced for car/earbud not Hi-Fi.

Frank Dernie · Mar 5, 2019

MattHooper said:
Yes, I know.

But I don't buy speakers, or listen to speakers, under double blind test conditions. So if in the conditions in which I'd actually listen to speakers this induces the experience of "hearing more enjoyable sound" from A over B, then it's rational to choose A over B.

This isn't strictly like testing between high end AC cables, because in that case we haven't seen any plausible technical story for why they would sound different, and have nothing like the correlations between measurements and perception like we do confirming the audible effects of different speaker design choices. Speakers DO sound different. So, while sighted and other forms of influence may certainly contaminate a consumer's perception, it's also not implausible that the actual, audible differences in sound are part of that mix. It's not implausible (even if less likely in blind testing) that I enjoyed the actual sound of the Devore speaker not *just* how they look and their tech story.

The reply can certainly be "but you don't KNOW - it may be the sound but it remains an unreliable inference UNLESS you have strictly controlled the variables double-blind." Sure. But I'm talking of the process of purchasing speakers for the consumer like myself. I don't have those test facilities. It certainly may be the case that if I could go to the HK lab and double blind test between the Devore and the Revel, that I'd say the Revel sounds better. But statistically that is not absolutely certain and I can't know it's certain before-hand, and it doesn't take in to account whatever other factors may influence my experience of the sound in my normal sighted conditions, which result in my enjoying the Devore speakers more. As I don't have access to the HK lab to shoot out the great many speakers I'm interested in, I'm not going to limit my choices to only HK products, even though from their research they always come out on top in blind tests.

Again, that's my own approach. I understand other people who take a stricter more measurement-based approach in guiding their speaker purchase, where they may even be comfortable just buying a speaker because it measures the way Toole's research suggests it should.
(Though, I would *prefer* a speaker I buy to both measure well and sound great. That's more intellectually satisfying. But ultimately I will go with the experience I have when listening to a speaker as the arbiter).

Speakers are all over the place (a bit like record players) in terms of uneven frequency response and audible levels of distortion plus cabinet resonance and cone breakup.
It is obviously impossible to do a DBT between 2 speakers with considerably different frequency responses.
Plenty of people like record players despite (or probably because of) the high levels of distortion and usually uneven frequency response.
I do look at speaker test results to check frequency response, distortion and the waterfall plot. Pretty obviously a speaker with a singing cabinet, uneven frequency response, high levels of distortion and resonances prolonging every note after it has stopped in the recording are not “good” but may well sound “nice” like a record player compared to CD.
I have some horn speakers which I like and the only waterfall plot I have seen for them shows lots of, presumably acoustic, resonances over the whole frequency band so I guess they are “nice” rather than “good”.
I do tend to listen to a more accurate pair more often though.

March Audio · Mar 6, 2019

noobie1 said:
If there are certain speaker attributes that are favored in DBT, why isn’t the market flooding to these speakers? They are readily available. I myself have listened to Kii Threes in a dedicated listening environment. I really wanted to love the speakers because it had these attributes but couldn’t. I also had various studio monitors in my home and returned all of them. And I started off with the bias that these speakers are better than the hifi offerings.

Perhaps the DBT is somewhat flawed? From what I understand, Harman only tests single speakers at a time. Whereas most people listen in stereo mode. That seems silly to me.

I would suggest because many manufacturers still don't know what they are doing. Many hifi manufacturers don't have or use anechoic research facilities.

Is it of any surprise that a speaker with a flat, smooth anechoic response and smooth off axis response is preferred by people? BTW that response when measured in room typically has a downward slope to higher frequencies.

My suggestion is that you read the Floyd's book. The tests aren't flawed. People are simply much more sensitive to, and critical of, flaws in speakers when listened to singularly.

I can't really comment on your individual findings. It doesn't follow that studio monitors are necessarily good or follow the design rules. In fact it's a big problem, and part of the circle of confusion that music is mixed, equalised and generally messed with on non standardised studio monitors. The results are going to be all over the place. In contrast video/film production very much uses screens that are calibrated and conform to international standards.

March Audio · Mar 6, 2019

MattHooper said:
And yet...

I personally found a wide range of music that I played through the "badly designed" Devore speakers more compelling than on, for instance, Magico, Paradigm or Revel speakers I auditioned.

This is one reason why, though I certainly acknowledge the soundness of the statistical results of Toole's research, I personally am not ready to rely solely on that research to guide my own speaker purchases i.e. "Harman Kardon's tests show I would likely prefer their speakers...."

If I did, rolling the dice on getting something like a Revel speaker would have made the most sense, but in practice, though they sounded well designed and competent, they just didn't do much for me.

That's fine, not everyone is looking for neutral sound. To me it's the essence of hifi - High Fidelity - faithful to the original. However if you want tone controls, have them.

The thing that I have found FwIW is that the more neutral speaker I have had the less I am concerned about "sound quality", the less I am bothered by recording to recording variations and the more I just listen to the music.

The research indicates that a speaker with a flat smooth anechoic response is preferred by most people. This isn't anything contentious, in fact it's a pretty intuitively obvious result.

March Audio · Mar 6, 2019

MattHooper said:
But, no need for a false analogy to muck up a perfectly good point, which is that, yes, the work of Floyd Toole et al show that under controlled test conditions people will prefer a certain trend in speaker design.

I'm not sure I would call it a trend though. Those recommended design characteristics seem technically obvious and correct, to me at least.

Would you buy a CD player/dac with a wonky frequency response?

March Audio · Mar 6, 2019

noobie1 said:
2) Harman DBT tests single speakers in mono mode correct? To me, it’s a leap of faith to believe that those findings necessarily translate to stereo listening. Clearly, pristine measurement is less important in stereo listening (which is why they conduct the test in mono). It’s entirely plausible to me that other sonic attributes become more important in stereo setups. If Harman has conducted similar DBT for stereo setups, I would love to see the results.

It's not a leap of faith, it was the result of research and testing.

This is specifically commented upon in the lecture video I linked above. Without exception speakers preferred in mono were preferred when used stereo. People are just more sensitive to speaker flaws when used in mono.

Take an hour and watch the video and then read the book for greater detail.

March Audio · Mar 6, 2019

Juhazi said:
Sean Olive has a blog of what is done at Harman reseach. Please read through, start from 2008!
These links apply to listener preference and DBT testing

http://seanolive.blogspot.com/2008/12/are-consumer-reports-loudspeaker.html
http://seanolive.blogspot.com/2010/12/how-to-listen-course-on-how-to.html
http://seanolive.blogspot.com/2009/05/harman-international-reference.html

This thread is a about Harman target curve (room response) http://www.audioheritage.org/vbulletin/showthread.php?39134-Harman-Target-Curve
So, yes, ordinary people prefer a different curve than "trained" listeners! But I must ask, which group is "right?" And how about you?

Trained.

Trained means they are better and more reliable and consistent at hearing/noticing aspects of sound. Trained does not mean they have been "programmed" into liking a certain sound, it just means they are more critical listeners.

I think the Harman training is available on line. I will try and find a link.

Edit:

http://harmanhowtolisten.blogspot.com/2011/01/welcome-to-how-to-listen.html

Sancus · Mar 6, 2019

The simple answer to 'Why do bad speakers sell just fine?' is probably that the placebo effect of sighted listening+marketing is stronger than a typical buyer's ability to distinguish the effect of the actual frequency response differences.

Honestly, this is the case in video/displays to some extent too, it's just controlled by professional industry standards for accuracy and calibration. Everyone knows, for example, that in-store TVs are often set oversaturated and overly bright in stores because the average buyer just looks at which TV has the brightest screen under intense, poor CRI lighting and buys that one.

But a professional calibrator will still calibrate your TV properly, there is an extensive set of industry standards on how to calibrate those displays, and mastering/grading monitors are very carefully set to achieve certain levels of accuracy. Many buyers don't care about or appreciate these standards, but they're what allow display manufacturers and content producers to deliver a consistent product that will look the way they intended on a wide variety of displays. The alternative is the circle of confusion in audio. And buyer preference is indeed insufficient to prevent the circle of confusion, it requires deliberate industry action.

noobie1 · Mar 6, 2019

Sancus said:
The simple answer to 'Why do bad speakers sell just fine?' is probably that the placebo effect of sighted listening+marketing is stronger than a typical buyer's ability to distinguish the effect of the actual frequency response differences.

Honestly, this is the case in video/displays to some extent too, it's just controlled by professional industry standards for accuracy and calibration. Everyone knows, for example, that in-store TVs are often set oversaturated and overly bright in stores because the average buyer just looks at which TV has the brightest screen under intense, poor CRI lighting and buys that one.

But a professional calibrator will still calibrate your TV properly, there is an extensive set of industry standards on how to calibrate those displays, and mastering/grading monitors are very carefully set to achieve certain levels of accuracy. Many buyers don't care about or appreciate these standards, but they're what allow display manufacturers and content producers to deliver a consistent product that will look the way they intended on a wide variety of displays. The alternative is the circle of confusion in audio. And buyer preference is indeed insufficient to prevent the circle of confusion, it requires deliberate industry action.

March Audio said:
It's not a leap of faith, it was the result of research and testing.

This is specifically commented upon in the lecture video I linked above. Without exception speakers preferred in mono were preferred when used stereo. People are just more sensitive to speaker flaws when used in mono.

Take an hour and watch the video and then read the book for greater detail.

I have Floyd’s book. I don’t disagree with the general principles. But if the test produced only consistent results without exception, that test is either poorly designed or didn’t test enough samples. Without other independent tests, Harman’s work is one theory. Not the end all be all.

March Audio · Mar 6, 2019

noobie1 said:
I have Floyd’s book. I don’t disagree with the general principles. But if the test produced only consistent results without exception, that test is either poorly designed or didn’t test enough samples. Without other independent tests, Harman’s work is one theory. Not the end all be all.

No. The work is not theory, far from it. It's scientifically tested. The correlation between anechoic prediction and listener preference is IIRC 90%. Not being 100% is no indication at all of a poorly designed test or incorrect conclusions.

The work is peer reviewed and presented in places such as the AES. Do you have any credible contrary testing/information/theories?

Would a contrary view be that people prefer a non flat anechoic response with wonky off axis and resonances? That's really the only place to go away from the Harman research. Seem unlikely to you? It does to me.

Ilkless · Mar 6, 2019

noobie1 said:
If

March Audio said:
No. The work is not theory. The correlation between anechoic prediction and listener preference is IIRC over 90%.

You may have read the book but I'm not sure you took in what was said.

The work is peer reviewed and presented in places such as the AES. Do you have any credible contrary testing/information?

0.86 actually, but still excellent.

Evidence-based Speaker Designs

Active Member

Active Member

Master Contributor

Master Contributor

Master Contributor

Grand Contributor

Active Member

Major Contributor

Active Member

Master Contributor

Master Contributor

Master Contributor

Master Contributor

Master Contributor

Master Contributor

Master Contributor

Major Contributor

Active Member

Master Contributor

Major Contributor

Similar threads