
I cannot trust the Harman speaker preference score

Do you value the Harman quality score?

  • 100% yes

  • It is a good metric that helps, but that's all

  • No, I don't

  • I'm undecided


Results are only viewable after voting.

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
1,858
Likes
2,145
…It just seems to me that with the tools we have today, a consumer who really cares about sound quality would probably get a subwoofer and be willing to EQ so part of me wonders why I should focus on regular preference score. …

I think that argument falls apart when we consider beyond 2-channel. And the consumers you refer to as "really caring about sound quality" are also crippling themselves if they don't go multi-channel, so let's take multi-channel seriously. And when we go multi-channel, there is no readily available way to get accurate EQ on all channels. So I think the unequalised scores are very useful to us.

(Even with 2 channels, "being willing to EQ" is not the same as being savvy enough to measure your own speakers competently, which would be far preferable to relying on a published measurement of the same model of speaker and trusting that your sample measures exactly the same… just ask a photographer if every sample of a lens has the same measurements, and ask a loudspeaker driver manufacturer if every sample of a driver has the same measurements, or every sample of a crossover, most of which use +/-10 or 20% tolerance capacitors… or save some time and believe me when I say you need to test your own lenses, and speakers.)
 

ROOSKIE

Major Contributor
Joined
Feb 27, 2020
Messages
1,402
Likes
2,438
Location
Minneapolis
So instead of dealing with all those (perhaps insurmountable) issues you chose to avoid them and set up a test that pretends they don't exist in real life.
Dude, be realistic.
All of this is a compromise.
Even rocket science and neurology are full of examples of making the best compromise you can given your goals and reasonable responsibility to your budgets of time, cash, and energy.
That reality is really a 101-level understanding.
I do think it is cool to talk about how maybe less compromise could be made, and I do appreciate that you are digging into this; I actually think you are being genuine, if a bit obsessed with your viewpoint. Still, it sure seems like you want it all, without being patient or willing to accept that nobody will ever know which came first, the chicken or the egg.
 

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
1,858
Likes
2,145
With this excellent post from the creator himself (I put the first sentence in bold, although the rest is also rich in important information: maximally compressed, yet still easy to understand even for hobby beginners), this and several similar discussions could be closed.
If we closed every discussion that was based on a misinterpretation or misapplication of research, we could almost shut down all of ASR!

Maybe it should be a requirement that thread-starters should ask the researchers directly before starting a thread that disses their work’s usefulness? Hah, that’ll be the day!

cheers
 

MattHooper

Major Contributor
Joined
Jan 27, 2019
Messages
3,691
Likes
5,894
Except they are. The hierarchy of preference is the same in both conditions: the Quads, being dipolar, are an interesting case where the preference score typically falls short, but they remain least preferred in both monophonic and stereophonic listening.

I'm not arguing against mono-listening tests, but that reply does seem a little too dismissive given the chart posted by Tuga.

First, in the "sound quality" columns the Quads made a significant leap in their rating: well below the others in mono, up to nipping at the heels of the Rega and Kef speakers in the stereo rating.

Then we have the "spatial quality" ratings. Spatial quality is another aspect of listener preferences (we tend to like spacious sound).
The Kef was rated lower than the Rega in mono listening, but higher in stereo listening.

And the Quads leapt from a very low score way up to a positive score, almost even with the Rega.

That does seem to point to a certain liability in relying on mono listening for all speakers.
 

Sean Olive

Active Member
Audio Luminary
Technical Expert
Joined
Jul 31, 2019
Messages
287
Likes
2,630
75.8% of the (naïve) voters say that they value the score. I'll remember that next time someone calls subjectivists gullible or biased...
I don't think 75.8% of the people are naive or stupid. Quite the opposite.

I think they probably understand how it works, what it was intended for, and what its limitations are. I think this graph probably explains it better. Often the loudest, most insistent, and contrarian people in a thread appear to be the most confident, but the least informed.
 

Attachments

  • Dunning-Kruger.jpg
OP
sarumbear

Major Contributor
Forum Donor
Joined
Aug 15, 2020
Messages
4,671
Likes
4,210
Location
Southampton, England
The statistical confidence of the predictions does not allow comparisons beyond 0.5 to 1 point, so it is moot to go down to one decimal place.
There again, the man who created the scoring system confirmed what I had been saying was at least the minimum ASR has to do: remove the decimal point.
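If the model's confidence interval is on the order of 0.5 to 1 point, one way to present scores at an honest resolution is simply to quantize them to that resolution. A minimal sketch of the idea (the function name and the 1-point resolution are illustrative assumptions, not anything published by the researchers):

```python
def display_score(score: float, resolution: float = 1.0) -> float:
    """Quantize a predicted preference score to an assumed
    statistical resolution (here ~1 point), so differences
    smaller than the model can resolve disappear."""
    return round(score / resolution) * resolution

# Two speakers whose decimal scores suggest a winner...
a, b = 7.3, 7.1
# ...become a tie once shown at the model's resolution.
print(display_score(a), display_score(b))  # 7.0 7.0
```

Shown this way, a 7.3 and a 7.1 read as the statistical tie they are.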
 

Mad_Economist

Senior Member
Technical Expert
Joined
Nov 29, 2017
Messages
439
Likes
1,284
I'm not arguing against mono-listening tests, but that reply does seem a little too dismissive given the chart posted by Tuga.

First, in the "sound quality" columns the Quads made a significant leap in their rating: well below the others in mono, up to nipping at the heels of the Rega and Kef speakers in the stereo rating.

Then we have the "spatial quality" ratings. Spatial quality is another aspect of listener preferences (we tend to like spacious sound).
And, again, the Kef was rated lower than the Rega in mono listening, but higher in stereo listening. And the Quads leapt from a very low score way up to a positive score, almost even with the Rega.

That does seem to point to a certain liability in relying on mono listening for all speakers.
I mean, it depends on how you contextualize the criticism I suppose - if the claim is "monophonic listening tests cannot predict perceived spatial characteristics, and may result in less indicative results for loudspeakers with atypical directivities", then yes, I'd strongly agree. It's a limited testing paradigm, and I personally would suggest that both monophonic and stereophonic listening be used in loudspeaker testing.

In general, I feel that the predicted preference rating is poorly suited for atypical directivity paradigms specifically because they made up a small portion of @Sean Olive's dataset (and the market), and seem to frequently be outliers as a result.

Conversely, if the claim is "because monophonic listening tests do not have a 1:1 correlation with stereophonic listening tests, we should ignore their superior consistency of differentiation of frequency response and directivity errors", then I think that merits a very dismissive response.
 

Mad_Economist

Senior Member
Technical Expert
Joined
Nov 29, 2017
Messages
439
Likes
1,284
There again, the man who created the scoring system confirmed what I had been saying was at least the minimum ASR has to do: remove the decimal point.
Might I suggest a compromise point of removing the ability to sort reviews by metric? I've complained about similar things regarding even-less-meaningfully-pertinent-to-subjective-impression SINAD before...
 

Sean Olive

Active Member
Audio Luminary
Technical Expert
Joined
Jul 31, 2019
Messages
287
Likes
2,630
Might I suggest a compromise point of removing the ability to sort reviews by metric? I've complained about similar things regarding even-less-meaningfully-pertinent-to-subjective-impression SINAD before...
The same issue used to occur in Consumer Reports Accuracy Ratings: they would list speakers according to their predicted accuracy ratings.
Two problems:
1. Scores that fell within an 8-point range on their 100-point scale were statistically tied (not many people knew this).
2. The scores were shown to be negatively correlated with listener preference ratings.


Together this led to misinformed consumers and poor purchase decisions if they acted upon the recommendations.
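That 8-point statistical tie can be made concrete: given ratings sorted best-first, anything within the tie window of the current tier's leader counts as statistically equal. A minimal sketch under that simplifying assumption (names and numbers are hypothetical; a rigorous treatment would use the actual confidence intervals rather than a fixed window):

```python
def tie_tiers(scores, window=8.0):
    """Group (name, score) pairs into tiers, best-first.
    Anything within `window` points of the tier leader is
    treated as a statistical tie with it."""
    ranked = sorted(scores, key=lambda p: p[1], reverse=True)
    tiers, current = [], []
    for name, score in ranked:
        # Start a new tier once we drop below the leader's window.
        if current and current[0][1] - score > window:
            tiers.append(current)
            current = []
        current.append((name, score))
    if current:
        tiers.append(current)
    return tiers

# Hypothetical 100-point ratings: 90 and 84 are a statistical tie,
# while 75 falls into a genuinely lower tier.
print(tie_tiers([("A", 90), ("B", 84), ("C", 75)]))
```

A ranked list presented as tiers like this would have avoided implying that a 90 beats an 84.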

ConsumerReportsListeningResults.png


 

Sancus

Major Contributor
Forum Donor
Joined
Nov 30, 2018
Messages
2,429
Likes
6,120
Location
Canada
I'm aware of those things. It just seems to me that with the tools we have today, a consumer who really cares about sound quality would probably get a subwoofer and be willing to EQ so part of me wonders why I should focus on regular preference score. For score with EQ, I dunno how exactly they figure out how well the FR can be EQed given the spinorama measurements and was wondering if you are familiar with/have any opinions on those scores.

FWIW, I think one should take the EQ and sub scores with a pinch of salt. In some cases, the score improvements are achieved by correcting many tiny <1 dB faults which would definitely fall inside unit-to-unit variation. EQ can't fully correct resonances (and there are some which are very audible even when not necessarily obvious in measurements). It also eats into headroom for larger corrections, especially when made in low/midrange frequencies played by a small woofer (e.g. 5-6").

The "with sub" score also assumes you have a subwoofer that is flat to 20 Hz (-6 dB at 15 Hz) and perfectly integrated. Single 10-12" sealed subs typically aren't capable of that, and perfect sub integration is a non-trivial process, especially if you don't have the funds for excellent auto-integration systems like Genelec's GLM.
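That "ideal sub" assumption is easy to sanity-check against your own measurements. A minimal sketch, where the function name, the ±3 dB definition of "flat", and the sample response values are all my own illustrative assumptions:

```python
def meets_sub_assumption(response, ref_hz=100.0):
    """Check the 'ideal sub' assumption behind the with-sub score:
    flat (here: within +/-3 dB) down to 20 Hz and no worse than
    -6 dB at 15 Hz, relative to the level at `ref_hz`.
    `response` maps frequency in Hz -> measured level in dB."""
    ref = response[ref_hz]
    flat_to_20 = abs(response[20.0] - ref) <= 3.0
    ok_at_15 = response[15.0] - ref >= -6.0
    return flat_to_20 and ok_at_15

# Hypothetical in-room response of a single small sealed sub:
sealed_10 = {100.0: 0.0, 20.0: -5.0, 15.0: -10.0}
print(meets_sub_assumption(sealed_10))  # False: rolls off too early
```

If your sub fails a check like this, the "with sub" score is predicting a system you don't actually own.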

If you are aware of all those caveats and assumptions, then I think those scores are definitely useful! If a speaker has a good score that becomes great with a sub, chances are it's a good candidate for using with a sub. If a speaker has a big boost from EQ, then chances are it's got great directivity but just needs help with its on-axis performance.

Use them as general trends, not "oh this speaker has 7.3 with EQ therefore it's definitely better than this active speaker that has 7.1 without EQ". That's where they'd fall apart.
 

MattHooper

Major Contributor
Joined
Jan 27, 2019
Messages
3,691
Likes
5,894
As to the argument for blind testing speakers in mono....

Obviously the argument against testing speakers in mono is "but most of us listen to speakers in stereo, so how do the results tell me what I'm likely to prefer when listening as I'll actually be using the speakers?"


I've seen roughly two arguments over the years.

1. Careful studies have shown that blind listening tests using single speakers tend to predict the ratings in stereo.

To the extent that is true, that seems an unassailable argument.

But there's another type of argument I've seen raised for testing in mono as well that goes along the lines:

2. Mono tests are better at revealing problems in speakers (e.g. frequency deviations, resonances, etc.). The problem with testing in stereo is that stereo listening can mask, to some degree, the problems that are evident in mono.

Well, that certainly is an argument for using mono testing if you want to identify exactly what is going on with a speaker (in terms of identifying the audibility of various problems). But if we are talking about identifying what people tend to prefer, then it seems to have a strange circularity.

An analogy to cooking:

A chef says: Here are two cuts of meat, one high quality and one lower quality. You can easily tell the higher quality one when you eat them a la carte.

Response: But...I'm going to be using them in a stew.

Chef: But that's a problem! Tasting them in a stew will mask the difference in quality!

Response: Yes...but if I'm only ever going to use them in a stew...why do I care what they taste like individually? I need to know how they will taste in a stew. And what you've just said suggests the more expensive cut of meat won't matter so much in a stew.

So if speakers are going to be used in stereo, and stereo masks to a significant degree some flaws that are easily determined in mono...in terms of the end subjective perception, why care that much about the difference in mono? We need to know how things shake out in stereo, not mono.

So for instance if you have Quads that would "show their deficiencies" in mono but would "be perceived as sounding very good in stereo," that's significant. Also, if stereo masks certain deficiencies found in mono, then maybe you don't have to spend as much money sometimes on a pair of speakers that are "better engineered" due to how well they measure in mono, but which won't sound subjectively that much better when listened in stereo.

As I've said, the first argument seems solid to me. The second, about how stereo may mask some sonic problems (if in fact I'm remembering that claim properly), is one I'd need some clarification on.
 

Sean Olive

Active Member
Audio Luminary
Technical Expert
Joined
Jul 31, 2019
Messages
287
Likes
2,630
I mean, it depends on how you contextualize the criticism I suppose - if the claim is "monophonic listening tests cannot predict perceived spatial characteristics, and may result in less indicative results for loudspeakers with atypical directivities", then yes, I'd strongly agree. It's a limited testing paradigm, and I personally would suggest that both monophonic and stereophonic listening be used in loudspeaker testing.

In general, I feel that the predicted preference rating is poorly suited for atypical directivity paradigms specifically because they made up a small portion of @Sean Olive's dataset (and the market), and seem to frequently be outliers as a result.

Conversely, if the claim is "because monophonic listening tests do not have a 1:1 correlation with stereophonic listening tests, we should ignore their superior consistency of differentiation of frequency response and directivity errors", then I think that merits a very dismissive response.
I agree there are instances where both mono and stereo tests should be done.

A speaker with unusual directivity, or one about which claims are made regarding its spatial imaging, needs to be tested and verified. The Lexicon SL1, with its variable directivity, is one example. The spatial results are what you might expect: they depend heavily on the room acoustics and the recording.
 

witwald

Senior Member
Forum Donor
Joined
Jul 23, 2019
Messages
356
Likes
321
Another approach to achieving spaciousness while maintaining accuracy is using a high level of lateral, later-arriving diffuse energy. Something Harman never included in their research to my knowledge, but others have, and it's often used in the studio world.
If you will, can you please provide some examples? Maybe a photo or two, if at all possible?
 

tuga

Major Contributor
Forum Donor
Joined
Feb 5, 2020
Messages
3,343
Likes
3,421
Location
Oxford, England
Your sarcasm does little to dispel my impressions here.

What level of predictive capability would you require to verify that a model is fit for purpose? Or do you work backwards from the number of variables included to determine validity, regardless of the actual correlation of the model's outputs with results?

I would require that the test be performed with adequate methodology. You can't eliminate or disregard variables just because they're inconvenient.
 

ROOSKIE

Major Contributor
Joined
Feb 27, 2020
Messages
1,402
Likes
2,438
Location
Minneapolis
The "with sub" score also assumes you have a subwoofer that is flat to 20hz(-6dB at 15hz), and it is perfectly integrated. Single 10-12" sealed subs typically aren't capable of that. Perfect sub integration is a non-trivial process especially if you don't have the funds for excellent auto-integration systems like Genelec's GLM.
Because I use subwoofers, I use the with-subwoofer score whenever I use the score at all.
Though what I really want to say is that, if you are looking at the scores, especially for monitors/bookshelves, I really like seeing the prediction for the small speaker with a "neutral" bass source (an ideal subwoofer, even if mine is not ideal; in fact I'd wager that nobody has ideal bass in a typical home system), rather than a score affected by, say, one speaker using an 8" driver and another a 6", when in reality the majority of the bass can be handled by three powered subwoofers.
 

Mad_Economist

Senior Member
Technical Expert
Joined
Nov 29, 2017
Messages
439
Likes
1,284
I would require that the test be performed with adequate methodology.
Again, how do we define "adequacy" here? Is a model with predictive capability acceptable, and, if so, what threshold is "adequate"? If not, what is?
You can't eliminate or disregard variables just because they're inconvenient.
Do you want to know how I know you don't work in the sciences?
 

Sancus

Major Contributor
Forum Donor
Joined
Nov 30, 2018
Messages
2,429
Likes
6,120
Location
Canada
I think they probably understand how it works, what it was intended for, and what its limitations are. I think this graph probably explains it better. Often the loudest, most insistent people in a thread are the most confident, but the least informed.

The best thing about this graph is that it shows the people keeping quiet are the true gurus. Can't argue with that. Thanks for your responses, they're greatly appreciated. It's honestly a privilege to see you participate so much in this thread.
 