I cannot trust the Harman speaker preference score

Newman · Mar 13, 2022

thewas said:
With this excellent post from the creator himself (I put the first sentence in bold although the rest is also so rich in important information and compressed to the maximum but still so easy to understand also for hobby beginners), this and several similar discussions could be closed.

If we closed every discussion that was based on a misinterpretation or misapplication of research, we could almost shut down all of ASR!

Maybe it should be a requirement that thread-starters should ask the researchers directly before starting a thread that disses their work’s usefulness? Hah, that’ll be the day!

cheers

MattHooper · Mar 13, 2022

Mad_Economist said:
Except they are. The hierarchy of preference is the same in both conditions - the quad speakers, being dipolar, are an interesting case where the preference score typically falls short, but they remain least preferred in monophonic and stereophonic listening.

I'm not arguing against mono-listening tests, but that reply does seem a little too dismissive given the chart posted by Tuga.

First, in the "sound quality" columns the Quads made a significant leap in it's rating. Well below in mono, up to nipping at the heels of the Rega and Kef speaker in stereo rating.

Then we have the "spatial quality" ratings. Spatial quality is another aspect of listener preferences (we tend to like spacious sound).
The Kef was rated lower than the Rega in mono listening, but higher in stereo listening.

And the Quads leapt from a very low score way up to a positive score, almost even with the Rega.

That does seem to speak to a certain amount of liability in relying on mono listening for all speakers.

Sean Olive · Mar 13, 2022

tuga said:
75.8% of the (naïve) voters say that they value the score. I'll remember that next time someone calls subjectivists gullible or biased...

I don't think 75.8% of the people are naive or stupid. Quite the opposite.

I think they probably understand how works, what it was intended for. and what its limitations are. I think this graph probably explains it better. Often the loudest, most insistent and contrarian people in a thread appear to be the most confident, but the least informed.

Newman · Mar 13, 2022

tuga said:
But if the testing is not done in stereo then it's best not to do it at all.

Ridiculous comment of the century. Correction, of the post-1950s.

sarumbear · Mar 13, 2022

Sean Olive said:
The statistical confidence of the predictions does not allow comparisons beyond 0.5- 1 point so it is moot to get down to 1 decimal place.

There again, the man who created the scoring systems confirmed what I had been saying as at least the minimum ASR has to do: Remove the decimal point.

Mad_Economist · Mar 13, 2022

MattHooper said:
I'm not arguing against mono-listening tests, but that reply does seem a little too dismissive given the chart posted by Tuga.

First, in the "sound quality" columns the Quads made a significant leap in it's rating. Well below in mono, up to nipping at the heels of the Rega and Kef speaker in stereo rating.

Then we have the "spatial quality" ratings. Spatial quality is another aspect of listener preferences (we tend to like spacious sound).
And, again, the Kef was rated lower than the Rega in mono listening, but higher in stereo listening. And the Quads leapt from a very low score way up to a positive score, almost even with the Rega.

That does seem to speak to a certain amount of liability in relying on mono listening for all speakers.

I mean, it depends on how you contextualize the criticism I suppose - if the claim is "monophonic listening tests cannot predict perceived spatial characteristics, and may result in less indicative results for loudspeakers with atypical directivities", then yes, I'd strongly agree. It's a limited testing paradigm, and I personally would suggest that both monophonic and stereophonic listening be used in loudspeaker testing.

In general, I feel that the predicted preference rating is poorly suited for atypical directivity paradigms specifically because they made up a small portion of @Sean Olive's dataset (and the market), and seem to frequently be outliers as a result.

Conversely, if the claim is "because monophonic listening tests do not have a 1:1 correlation with stereophonic listening tests, we should ignore their superior consistency of differentiation of frequency response and directivity errors", then I think that merits a very dismissive response.

Mad_Economist · Mar 13, 2022

sarumbear said:
There again, the man who created the scoring systems confirmed what I had been saying as at least the minimum ASR has to do. Remove the decimal point..

Might I suggest a compromise point of removing the ability to sort reviews by metric? I've complained about similar things regarding even-less-meaningfully-pertinent-to-subjective-impression SINAD before...

abdo123 · Mar 13, 2022

sarumbear said:
There again, the man who created the scoring systems confirmed what I had been saying as at least the minimum ASR has to do: Remove the decimal point.

So a 5.4 speaker and a 5.5 speaker would be one whole point apart?

I don’t see the usefulness in that.

Sean Olive · Mar 13, 2022

Mad_Economist said:
Might I suggest a compromise point of removing the ability to sort reviews by metric? I've complained about similar things regarding even-less-meaningfully-pertinent-to-subjective-impression SINAD before...

The same issue used to occur in Consumer Reports Accuracy Ratings: they would list speakers according to their predicted accuracy ratings.
Two problems:
1. Scores that fell within an 8 point range on their 100-point were statistically tied (not many people knew this).
2. The scores were shown to be negatively correlated with listener preference ratings.

Together this led to misinformed consumers and poor purchase decisions if they acted upon the recommendations.

Audio Musings by Sean Olive

A blog about the science of sound recording and reproduction

seanolive.blogspot.com

Sancus · Mar 13, 2022

BoredErica said:
I'm aware of those things. It just seems to me that with the tools we have today, a consumer who really cares about sound quality would probably get a subwoofer and be willing to EQ so part of me wonders why I should focus on regular preference score. For score with EQ, I dunno how exactly they figure out how well the FR can be EQed given the spinorama measurements and was wondering if you are familiar with/have any opinions on those scores.

FWIW, I think one should take the EQ and sub scores with a pinch of salt. In some cases, the score improvements are achieved by correcting many tiny <1dB faults which would definitely fall inside unit to unit variation. EQ can't fully correct resonances(and there are some which present very audibly even when not necessarily obvious in measurements). It also eats into headroom for larger corrections, especially when made in low/midrange frequencies played by a small woofer(eg 5-6").

The "with sub" score also assumes you have a subwoofer that is flat to 20hz(-6dB at 15hz), and it is perfectly integrated. Single 10-12" sealed subs typically aren't capable of that. Perfect sub integration is a non-trivial process especially if you don't have the funds for excellent auto-integration systems like Genelec's GLM.

If you are aware of all those caveats and assumptions, then I think those scores are definitely useful! If a speaker has a good score that becomes great with a sub, chances are it's a good candidate for using with a sub. If a speaker has a big boost from EQ, then chances are it's got great directivity but just needs help with its on-axis performance.

Use them as general trends, not "oh this speaker has 7.3 with EQ therefore it's definitely better than this active speaker that has 7.1 without EQ". That's where they'd fall apart.

MattHooper · Mar 13, 2022

As to the argument for blind testing speakers in mono....

Obviously the argument against testing speakers in mono is "but most of us listen to speakers in stereo, so how do the results tell me what I'm likely to prefer when listening as I'll actually be using the speakers?"

I've seen roughly two arguments over the years.

1. Careful studies have shown that blind listening tests using single speakers tend to predict the ratings in stereo.

To the extent that is true, that seems an unassailable argument.

But there's another type of argument I've seen raised for testing in mono as well that goes along the lines:

2. Mono tests are better at revealing problems in speakers (e.g. frequency deviations, resonances etc). The problem with testing in stereo is that stereo listening can ameliorate or tend to mask to some degree the problems that are evident in mono.

Well, that certainly is an argument for using mono testing if you want to identify exactly what is going on with a speaker (in terms of identifying the audibility of various problems). But if we are talking about identifying what people tend to prefer, then it seems to have a strange circularity.

An analogy to cooking:

A Chef says: Here are two cuts of meat. One is a high quality cut vs a lower quality cut of meat. You can easily tell the higher quality when you eat them a la carte.

Response: But...I'm going to be using them in a stew.

Chef: But that's a problem! Tasting them in a stew will mask the difference in quality!

Response: Yes...but if I'm only ever going to use them in a stew...why do I care what they taste like individually? I need to know how they will taste in a stew. And what you've just said suggests the more expensive cut of meat won't matter so much in a stew.

So if speakers are going to be used in stereo, and stereo masks to a significant degree some flaws that are easily determined in mono...in terms of the end subjective perception, why care that much about the difference in mono? We need to know how things shake out in stereo, not mono.

So for instance if you have Quads that would "show their deficiencies" in mono but would "be perceived as sounding very good in stereo," that's significant. Also, if stereo masks certain deficiencies found in mono, then maybe you don't have to spend as much money sometimes on a pair of speakers that are "better engineered" due to how well they measure in mono, but which won't sound subjectively that much better when listened in stereo.

As I've said, the first argument seems solid to me. The second, about how stereo may mask some sonic problems, if in fact I'm remembering that claim properly, seems like I need some clarification on that one.

Sean Olive · Mar 13, 2022

Mad_Economist said:
I mean, it depends on how you contextualize the criticism I suppose - if the claim is "monophonic listening tests cannot predict perceived spatial characteristics, and may result in less indicative results for loudspeakers with atypical directivities", then yes, I'd strongly agree. It's a limited testing paradigm, and I personally would suggest that both monophonic and stereophonic listening be used in loudspeaker testing.

In general, I feel that the predicted preference rating is poorly suited for atypical directivity paradigms specifically because they made up a small portion of @Sean Olive's dataset (and the market), and seem to frequently be outliers as a result.

Conversely, if the claim is "because monophonic listening tests do not have a 1:1 correlation with stereophonic listening tests, we should ignore their superior consistency of differentiation of frequency response and directivity errors", then I think that merits a very dismissive response.

I agree they are instances where both mono and stereo tests should be done.

A speaker with an unusual directivity, or in cases claims are made about its spatial imaging that need to be tested and verified. The Lexicon SL1 with variable directivity is one example. The spatial results are what you might expect and highly depend on the room acoustics and the recording.

witwald · Mar 13, 2022

Bjorn said:
Another approach to achieve spaciousness without this and maintain accuracy is using a high level of lateral later arriving diffuse energy. Something Harman never included in ther researchers to my knowledge but others have and it's often used in the studio world.

If you will, can you please provide some examples? Maybe a photo or two, if at all possible?

tuga · Mar 13, 2022

Mad_Economist said:
Your sarcasm does little to dispel my impressions here.

What level of predictive capability would you require to verify that a model is fit for purpose? Or do you work backwards from the number of variables included to determine validity, regardless of the actual correlation of the model's outputs with results?

I would require that the test be performed with adequate methodology. You can't eliminate or disregard variables just because they're inconvenient.

ROOSKIE · Mar 13, 2022

Sancus said:
The "with sub" score also assumes you have a subwoofer that is flat to 20hz(-6dB at 15hz), and it is perfectly integrated. Single 10-12" sealed subs typically aren't capable of that. Perfect sub integration is a non-trivial process especially if you don't have the funds for excellent auto-integration systems like Genelec's GLM.

Because I use subwoofers I "use" the with subwoofer score when I use the score.
Though what I really want to say is that if you are looking at the scores, especially for monitors/bookshelves I really like seeing what the prediction is for the small speaker with a "neutral" bass source (ideal subwoofer, even if mine is not ideal. In fact I'd wager that nobody has ideal bass in a typical home system) rather than a score affected by say one using an 8" driver and one a 6" and yet reality is the majority of the bass can be handled by 3 powered subwoofers.

Mad_Economist · Mar 13, 2022

tuga said:
I would require that the test be performed with adequate methodology.

Again, how do we define "adequacy" here? Is a model with predictive capability acceptable, and, if so, what threshold is "adequate"? If not, what is?

tuga said:
You can't eliminate or disregard variables just because they're inconvenient.

Do you want to know how I know you don't work in the sciences?

Sancus · Mar 13, 2022

Sean Olive said:
I think they probably understand how works, what it was intended for. and what its limitations are. I think this graph probably explains is better. Often the loudest, most insistent people in a thread are the most confident, but the least informed.

The best thing about this graph is that it shows the people keeping quiet are the true gurus. Can't argue with that. Thanks for your responses, they're greatly appreciated. It's honestly a privilege to see you participate so much in this thread.

witwald · Mar 13, 2022

tuga said:
I can see several other problems with listening to a single speaker located left or right.

Can you please provide a list? That way we might see them too.

Sean Olive · Mar 13, 2022

Bjorn said:
With a mono test with the speaker placed in the middle, the side wall specular reflections are at a lower level. So this takes the room a bit more of out of the play compared to having speakers closer to side walls with no side wall treatment.

Optimal position for each speaker can makes sense. As a speaker designer you have to tune the low frequency with a certain type of room gain and how a designer does this can vary. Some may tune it with a corner gain of 9 dB in, some with 6 dB, etc. And with a cardioid or a dipole, the best position generally are different compared to a monopole.

Multichannel is a step away from accuracy due to much more comb filtering and lobing. While one cannot hear discrete reflections in the same matter with many channels, everything is sort of a mess. Another approach to achieve spaciousness without this and maintain accuracy is using a high level of lateral later arriving diffuse energy. Something Harman never included in ther researchers to my knowledge but others have and it's often used in the studio world.

Personally I don't fancy multichannel for music and much prefer a late arriving diffuse tail. However, as a speaker designer it would financially be much better to sell more speakers to people.

<<<nother approach to achieve spaciousness without this and maintain accuracy is using a high level of lateral later arriving diffuse energy. Something Harman never included in ther researchers to my knowledge but others have and it's often used in the studio world.>>

That is what this does: The DI is adjustable from cardioid to almost omni - with something in between.

SoundSteer Technology from Lexicon

As Stereophile's minister without portfolio, my goal was to find something interesting that didn't quite fit into traditional categories. The prize was an introduction, at a Harman demo room in the Hard Rock Casino/Hotel, to Lexicon's SL-1 loudspeaker prototype (price TBD) and the SoundSteer...

www.stereophile.com

Most speaker designers design for flat anechoic frequency response on-axis. Designing a loudspeaker for an optimal position makes absolutely no sense unless it's intended to be used in/on a wall.

Otherwise, you cannot predict where people will put it or what the size and acoustics of the room will be. It has to be designed to sound good in all types of rooms, and hopefully have controls or the ability to adapt to the room acoustics via calibration or auto-correction.

Sean Olive · Mar 13, 2022

Sancus said:
The best thing about this graph is that it shows the people keeping quiet are the true gurus. Can't argue with that. Thanks for your responses, they're greatly appreciated. It's honestly a privilege to see you participate so much in this thread.

Which means I should shut up and leave

I cannot trust the Harman speaker preference score

Do you value the Harman quality score?

100% yes

It is a good metric that helps, but that's all

No, I don't

I don't have a decision

Major Contributor

Grand Contributor

Senior Member

Attachments

Major Contributor

Master Contributor

Addicted to Fun and Learning

Addicted to Fun and Learning

Master Contributor

Senior Member

Major Contributor

Grand Contributor

Senior Member

Addicted to Fun and Learning

Major Contributor

Major Contributor

Addicted to Fun and Learning

Major Contributor

Addicted to Fun and Learning

Senior Member

Senior Member