
I cannot trust the Harman speaker preference score

Do you value the Harman quality score?

  • 100% yes

  • It is a good metric that helps, but that's all

  • No, I don't

  • I'm undecided


Results are only viewable after voting.

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
3,454
Likes
4,217
...So why not standardise the targeted speaker response to be exactly that (flat direct sound).
Because you can make speakers that do exactly that, but have terrible off-axis behaviour that will always make them sound less preferred in a typical home listening environment.
 

dominikz

Addicted to Fun and Learning
Forum Donor
Joined
Oct 10, 2020
Messages
803
Likes
2,626
So, summarising... Perhaps we already have a suitable target frequency response for speaker designs (flat direct sound). Such speakers produce (on average) an in-room response close to the Harman target. Such speakers should also achieve high preference scores.

If the above is correct then of course the preference score should be a good metric that helps; although the actual frequency response and spin data can give much more information.
That is in principle correct; just please note that a flat on-axis / LW (listening window) response is not enough in itself: you also need controlled off-axis behaviour (good directivity).
This is why we can't just EQ all loudspeakers to a flat LW to make them all sound good; that would only work well on loudspeakers that already have good directivity.

Note that I haven't specified what is 'good directivity' yet. We know that off-axis should be smooth, continuous, free of resonances and in general similar in shape to on-axis. This can be implemented to result in wide or narrow dispersion, and there don't seem to be conclusive data on what is 'better'. So there are various flavours in practice, and consequently room for individual preference :)
 

-Matt-

Addicted to Fun and Learning
Joined
Nov 21, 2021
Messages
675
Likes
551
Yes, noted. I understand that there are other factors beyond flat, on-axis, direct sound. I was initially just concerning myself only with the first, perhaps most fundamental, part of the problem.

On directivity: I think I have read here that it may be application- and preference-dependent (and therefore very difficult or impossible to specify).

E.g. wide directivity could be preferable for stereo listening (especially when considering spatial effects). The off-axis frequency response still needs to be as flat/smooth as possible, with the rate of attenuation with angle determining the relative importance of the room reflections.

For multichannel reproduction, and maybe for near-field stereo monitoring, it may be preferable to minimise reflections and rely on direct sound (have the room disappear). This would require narrow directivity, with spatial effects being achieved by having discrete channels positioned around you. I guess it is still a compromise, as very narrow directivity in a multichannel system might produce a very small sweet spot for the listener to sit in.

Is there any hope that we could eventually define a standardised ideal target for off-axis speaker response? (Edit: Perhaps only if we also define a standardised listening room).

Back on the topic of the preference score... I don't think it prescribes what speaker directivity should be like, other than that the off-axis frequency response should be smooth?

Anyway, I now feel that I understand the topic sufficiently to qualify to cast a vote! Thanks to those who helped me get there.
 
Last edited:

fineMen

Major Contributor
Joined
Oct 31, 2021
Messages
1,504
Likes
679
... As frequency goes down, ...
This has a huge impact on the sound. Tonality ... timbre ... colored ... presence of rendered instruments ... physical presence ... clarity ... bass instruments now get a rendering more similar to higher frequencies ... defined and precise.

I still do not see a connection to "the score", which deliberately neglects dispersion in this respect. It allows for wide dispersion as well as for narrow dispersion, and dispersion may change gradually from one frequency range to another. Only "the tilt" has to be maintained, namely a wider dispersion from the bass to the lower midrange, narrowing towards the treble, but not too much, and certainly not abruptly.

Your motivation for doubting "the score" is an anecdotal report of improvements, as is expected from any individual experimenter in this field. Fair enough. But, and not just for the sake of argument, I'm tempted to present a caveat from a scientific, truth-seeking point of view.

Let us assume the reference against which the improvements are experienced is a speaker which pretty much complies with Harman's recommendations, i.e. one with a good score. A good score would call for that tilt in the power spectrum.

The other, improved speaker has narrower dispersion in the lower midrange; it doesn't get wider than the dispersion in the upper mids and treble. Hence it radiates comparatively less 'energy' in that range, and its power spectrum is less tilted.

So the comparison is between two speakers with differing power spectra, one being 'lighter', more focused on the mids, than the other.

"The score" says such differences are a matter of taste, as people easily adapt to them.

You say nay: it is quality, due to reflections and the like.

How do You differentiate another, namely 'lighter', more 'energetic' tonal rendition from the influence of, let me summarise, 'reflections'?

Anecdotally, I personally would say that all Your descriptive terms quoted above come to mind once I mute the bass / lower mids a bit using a plain ol' equaliser, and then listen just a bit louder.

ps: in short, avoiding reflections in the lower mids just changes the tonal balance; it has nothing to do with phase/time/resonance and other obscure 'influences'

pps: even Your argument of a smoothed-out frequency response cannot be trusted; experimental evidence is missing to begin with. Think of the reasoning behind multiple subs: the more reflections, the better they tend to equalise out single 'resonant' peaks and dips ...
 
Last edited:

-Matt-

Addicted to Fun and Learning
Joined
Nov 21, 2021
Messages
675
Likes
551
Given the importance of level matching in blind listening tests, and the discussion of directivity, I find myself pondering...

When levels are adjusted for A/B tests do you make sure that direct sound is at the same power or is it the total of all sound including reflections that is made equal?

If only direct sound level is made equal then it wouldn't be a fair test as a wide directivity speaker would put more total energy into the room.

If the level adjustment includes reflections, then the directivity instead determines the proportion of energy that is direct vs reflected; but perhaps it still isn't a fair test because the direct sound will have a lower level for a wide directivity speaker.

I guess the latter case is the most practical one?
 
Last edited:

dominikz

Addicted to Fun and Learning
Forum Donor
Joined
Oct 10, 2020
Messages
803
Likes
2,626
Back on the topic of the preference score... I don't think it prescribes what speaker directivity should be like, other than that the off-axis frequency response should be smooth?
To my understanding, the commonly calculated preference score is based on Equation (9) from this patent, which combines the following metrics:
  • NBD_ON - average narrow-band deviation (dB) in each ½-octave band from 100 Hz to 12 kHz for the on-axis measurement
  • NBD_PIR - average narrow-band deviation (dB) in each ½-octave band from 100 Hz to 12 kHz for the predicted in-room response
  • LFX - low-frequency extension (Hz), based on the −6 dB frequency point, transformed to log10
  • SM_PIR - smoothness (r²) of the amplitude response, based on a linear regression line through 100 Hz to 16 kHz for the predicted in-room response
So from my understanding, directivity is only considered implicitly through the PIR, and loudspeakers with a smooth PIR, without many resonances, will score better. I don't believe a specific slope is included in this equation, but perhaps @MZKM or @pierre would be better placed to answer this.
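
For concreteness, here is a minimal sketch of that calculation in Python. The coefficients are Equation (9) as it is commonly quoted; the input values in the example are invented purely for illustration.

```python
import math

def olive_preference_score(nbd_on: float, nbd_pir: float,
                           lfx_hz: float, sm_pir: float) -> float:
    """Preference rating per Equation (9), as commonly quoted.

    nbd_on  -- narrow-band deviation of the on-axis curve (dB)
    nbd_pir -- narrow-band deviation of the predicted in-room response (dB)
    lfx_hz  -- the -6 dB low-frequency extension point (Hz);
               the model uses its log10
    sm_pir  -- smoothness (r^2) of a linear regression through
               the PIR, 100 Hz to 16 kHz
    """
    lfx = math.log10(lfx_hz)
    return 12.69 - 2.49 * nbd_on - 2.99 * nbd_pir - 4.31 * lfx + 2.32 * sm_pir

# Hypothetical example values, for illustration only:
print(round(olive_preference_score(nbd_on=0.35, nbd_pir=0.30,
                                   lfx_hz=45.0, sm_pir=0.90), 2))  # -> 5.88
```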

The patent does provide some interesting additional information, e.g.:
The target values are based on the mean slope values of speakers that fall into the top 90 percentile based on subjective preference ratings. Target slopes are defined for each of the seven frequency curves. The ideal target slope for the on-axis and listening window curves should be flat, while the off-axis curves should tilt gently downwards. The degree of tilt varies depending upon the type of loudspeakers being tested. For example, 3-way and 4-way loudspeaker designs tend to have wider dispersion (hence smaller negative target slopes) at mid and high frequencies than 2-way loudspeakers. This suggests that the ideal target slope may depend on the loudspeaker's directivity.

And the table:
[attached image: table of target slope values for the seven frequency curves]


Another IMO very interesting segment:
 

Holmz

Major Contributor
Joined
Oct 3, 2021
Messages
2,018
Likes
1,241
Location
Australia
Given the importance of level matching in blind listening tests, and the discussion of directivity, I find myself pondering...

When levels are adjusted for A/B tests do you make sure that direct sound is at the same power or is it the total of all sound including reflections that is made equal?

If only direct sound level is made equal then it wouldn't be a fair test as a wide directivity speaker would put more total energy into the room.

If the level adjustment includes reflections, then the directivity instead determines the proportion of energy that is direct vs reflected; but perhaps it still isn't a fair test because the direct sound will have a lower level for a wide directivity speaker.

I guess the latter case is the most practical one?

Do we point the sound meter at the speaker to get the direct sound? ;)
 

-Matt-

Addicted to Fun and Learning
Joined
Nov 21, 2021
Messages
675
Likes
551
Yes, but you would probably need to do time gating also! ;)
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
I don't believe a specific slope is included in this equation

SM, i.e. r², will increase with a higher slope. Yes, that means calling it "smoothness" is misleading. This is probably not what @Sean Olive intended when he defined this variable (otherwise he wouldn't have called it "smoothness"), but that's what we're stuck with. Note that NBD_PIR will counteract this "slope bias" effect to some extent, making the actual target difficult to define and reason about.
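
A quick numerical illustration of that slope bias (hypothetical data, not from any real measurement): the same ripple superimposed on a steeper tilt yields a higher r², i.e. a higher SM.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)            # stand-in for the log-frequency axis
ripple = rng.normal(scale=0.5, size=200)  # identical "response ripple" each time

for slope in (0.0, -1.0, -2.0):           # flat vs. increasingly tilted PIR
    response = slope * x + ripple
    # r^2 of a straight-line fit equals the squared correlation with the axis
    r2 = np.corrcoef(x, response)[0, 1] ** 2
    print(f"slope {slope:+.1f}: r^2 = {r2:.2f}")  # r^2 grows with the tilt
```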

These subtleties have been discussed to death many, many times already. See for example this, this, this, this and especially this, this and this.
 

Ro808

Member
Joined
Jan 28, 2017
Messages
83
Likes
82
Average Joe will be guided by all kinds of personal preferences when buying new speakers, in accordance with Floyd Toole's (main) argument.

However, the chance that Harman's preference score appears on his wish list is nil.
 
Last edited:

Shadrach

Addicted to Fun and Learning
Joined
Feb 24, 2019
Messages
662
Likes
947

Do you value the Harman quality score?

I've ticked yes.
The first thing I would mention is that the poll title is wrong. The model in question doesn't attempt to ascribe "quality"; it predicts a preference under a set of conditions. Nothing more.
I've read most of the thread, and the impression I've come away with is that some of the contributors are expecting perfection from what is a limited data model.
No modeling of this type is going to be perfect. This doesn't make the model useless; one just has to take its limitations into account.
I've bought two pairs of loudspeakers recently based on their preference score and Amir's measurements. I didn't listen to either pair before purchasing them.
The combination of the preference score, the measurements and a bit of common sense has provided me with a very acceptable listening experience. Perhaps the common sense part is missing in some of the criticisms of the model.
 

amper42

Major Contributor
Joined
Dec 21, 2020
Messages
1,583
Likes
2,286
The problem with the score is that it can easily lead the reader to think it will directly relate to their preference for one speaker over another. That's a false premise. I have bought several pairs of speakers directly based on ASR reviews and Harman scores, without personal listening tests. I found the Harman score offered a false sense of security, and my preference was much different. In fact, the @amirm listening test portion of ASR reviews is quite a bit more helpful, and I give it considerably more weight than the Harman score.

I'm not saying the score is totally worthless, but it can just as easily lead the reader into a false sense of comfort with a speaker that they may actually prefer quite a bit less than another, lower-rated speaker. At least that is what I found. After buying several speakers and comparing their Harman scores, I have verified that the Harman number doesn't reflect my preference in most cases. That's not what people expect when they use it to help with decision making. That's why I cannot trust it as a good tool for deciding what to buy. Maybe if you use the Harman score to look at speakers within a 2-point score range, and realize the score can't actually pick a speaker but just give you a selection to look at, that might be its best use case. However, I don't see that disclaimer on any of the sites displaying the score.

Here's just one example:
Revel M105 - Score 5.6
BMR Monitor - Score 5.1
Based on this score you might expect the two speakers to be close, with the M105 having an advantage. In my listening tests, the BMR Monitor is in a totally higher class than the M105. It's so much more engaging and dynamic, with significantly more bass and clarity, than the M105. Both are bookshelves; the BMR is a three-way, the M105 a two-way. Bass extension on the BMR is lower. Yet the Harman score rates the M105 a full half point higher.

Maybe if I was only comparing speakers with a rating over 6.0, and I was giving bass extension even more weight than the Harman score, I might find it partially useful. But I find the composition of the driver can make a big difference in sound. My Revel F328Be's Deep Ceramic Composite aluminum cones sound much more dynamic than a paper woofer. That's not reflected in the score. And I find the Revel F328Be's 91 dB sensitivity helps the speaker punch as well, versus a lower sensitivity like my BMR Towers, which are a full 6 dB less sensitive.

Each reader will decide the value of the Harman score, but it gets close to zero weight in my speaker purchases. Its only use for me is to learn more about speakers within the higher ranges, while understanding that the score itself will be of zero value in the actual speaker purchase decision.

 
Last edited:

BenB

Active Member
Joined
Apr 18, 2020
Messages
284
Likes
446
Location
Virginia
I haven't read all 54 pages, so perhaps this has been mentioned already, but my quick search of the thread didn't reveal anything.

It would be very interesting to apply Elo ratings to speakers that have been compared head-to-head by various testers/reviewers, and see how the result relates to the Harman score. I'm doubtful we actually have enough head-to-head comparisons to make it work, but if we did, I'd be curious how often the rankings agree and disagree.
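
For anyone curious what that would involve, here is a minimal sketch of the standard Elo update rule (the speaker names, K-factor and starting ratings are hypothetical choices, not from any published speaker study):

```python
def elo_update(rating_a: float, rating_b: float,
               score_a: float, k: float = 32.0) -> tuple[float, float]:
    """One Elo update; score_a is 1.0 if speaker A won the
    head-to-head comparison, 0.0 if it lost, 0.5 for a tie."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Hypothetical: every speaker starts at 1000; feed in one comparison result.
ratings = {"Speaker A": 1000.0, "Speaker B": 1000.0}
ratings["Speaker A"], ratings["Speaker B"] = elo_update(
    ratings["Speaker A"], ratings["Speaker B"], score_a=1.0)
print(ratings)  # {'Speaker A': 1016.0, 'Speaker B': 984.0}
```

With enough comparisons, the resulting ranking could then be checked against the ordering the preference score predicts.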
 

stevenswall

Major Contributor
Forum Donor
Joined
Dec 10, 2019
Messages
1,366
Likes
1,075
Location
Orem, UT
My Revel F328Be's Deep Ceramic Composite aluminum cones sound much more dynamic than a paper woofer. That's not reflected in the score. And I find the Revel F328Be's 91 dB sensitivity helps the speaker punch as well, versus a lower sensitivity like my BMR Towers, which are a full 6 dB less sensitive.

Each reader will decide the value of the Harman score, but it gets close to zero weight in my speaker purchases. Its only use for me is to learn more about speakers within the higher ranges, while understanding that the score itself will be of zero value in the actual speaker purchase decision.


What do you mean by punch? If you mean dynamics, then I'd argue sensitivity doesn't matter: Speakers operating well within their parameters don't apply dynamic range compression. If you mean transients, perhaps the impulse response is better, but I'm not sure sensitivity directly relates to that either.

As far as the value of the Harman score goes, other things I've read say you shouldn't put much weight on differences of less than a point, or maybe a point and a half.
 

Steve Dallas

Major Contributor
Joined
May 28, 2020
Messages
1,201
Likes
2,784
Location
A Whole Other Country
[snip]

Here's just one example:
Revel M105 - Score 5.6
BMR Monitor - Score 5.1
Based on this score you might expect the two speakers to be close, with the M105 having an advantage. In my listening tests, the BMR Monitor is in a totally higher class than the M105. It's so much more engaging and dynamic, with significantly more bass and clarity, than the M105. Both are bookshelves; the BMR is a three-way, the M105 a two-way. Bass extension on the BMR is lower. Yet the Harman score rates the M105 a full half point higher.



Adding to the difficulty of deciding the real-world value of the Olive score is the fact that we do not have consistency across all measurements. In your example, the M105 was measured on a Klippel NFS, and the BMR was measured outdoors and gated, then spliced to a groundplane measurement. (I am not asserting one method is faulty here.) The BMR could actually score higher than the M105 when measured on an NFS. I wonder how much that matters? If both methods are close enough to place the speakers within 1 point of each other, is that good enough?

The counter-argument is that they are both near each other in score and therefore should be in the same class, yet there are aspects which are significantly different. About 1.5 years ago, I owned the M106, R3, and BMR at the same time. I was surprised at how different they each sound, and those differences cannot be described by a single score. I agree that the M106 was outclassed by both (but not the F206). Between the BMR and R3, I initially chose the BMR and kept it on the stands for 14 months. After putting the R3 on the stands again, I decided it worked better with the room. As it turns out, I ended up preferring the R3, and the R3 has a higher preference score. Coincidence? Or science? Maybe it is time to send the BMR to Amir or Erin for measurement on an NFS?
 

amper42

Major Contributor
Joined
Dec 21, 2020
Messages
1,583
Likes
2,286
What do you mean by punch? If you mean dynamics, then I'd argue sensitivity doesn't matter: Speakers operating well within their parameters don't apply dynamic range compression. If you mean transients, perhaps the impulse response is better, but I'm not sure sensitivity directly relates to that either.

As far as the value of the Harman score goes, other things I've read say you shouldn't put much weight on differences of less than a point, or maybe a point and a half.

If the Harman score by definition can't tell you much about comparisons between speakers within 1 or 1.5 points of each other, then it's pretty useless for selecting a specific speaker, especially when 30+ speakers can fall in that range and many are not even measured. In addition, when speakers with lower scores sound substantially better, it makes you scratch your head. These realities are reasons why most people should not trust the Harman score for purchasing a specific speaker.

Comparison listening with your own music is the best way to find your favorite speaker. However, that can be expensive and time consuming. Most listeners buy on a recommendation, a rating, or from an audio dealer's sound room. With fewer retail audio stores offering listening rooms and large selections, many users are looking for another way to identify the best speaker. Online reviews and the Harman score may fall into that category. And yet, the Harman score is not designed to do a good job at direct speaker comparisons.

If I were trying to create a formula to rate speakers for my own preference, it might look something like this:
Harman score 50% + displacement/room size 15% + bass extension 15% + preferred driver composition 10% + sensitivity 10% = my speaker preference score.

This is just an outline, but with real-world testing of the exact percentages/factors it could provide me a much more trusted tool for selecting a speaker. While the formula recognizes that the Harman score has value, the secondary calculations would significantly adjust the final score into something more trustworthy. At least for me. :D
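
As a thought experiment, that weighting is easy to express in code (the weights are the ones proposed above; the sub-scores, each normalised to a 0-10 scale, are invented placeholders, not a validated method):

```python
# Hypothetical weights from the formula above; sub-scores are assumed
# to be pre-normalised to the same 0-10 scale as the Harman score.
WEIGHTS = {
    "harman_score": 0.50,
    "displacement_vs_room_size": 0.15,
    "bass_extension": 0.15,
    "driver_composition": 0.10,
    "sensitivity": 0.10,
}

def personal_score(subscores: dict[str, float]) -> float:
    """Weighted sum of 0-10 sub-scores using the weights above."""
    return sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS)

print(personal_score({
    "harman_score": 5.6,               # published preference score
    "displacement_vs_room_size": 7.0,  # the rest are invented judgment calls
    "bass_extension": 6.0,
    "driver_composition": 8.0,
    "sensitivity": 7.0,
}))  # -> 6.25
```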
 

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
3,454
Likes
4,217
You didn’t listen under controlled blind conditions, so your ‘evidence’ falls over at the first hurdle…and so does your argument.

Olive’s rating method says nothing about preferences under sighted listening conditions.

Next.
 

Shadrach

Addicted to Fun and Learning
Joined
Feb 24, 2019
Messages
662
Likes
947
If I were trying to create a formula to rate speakers for my own preference, it might look something like this:
Harman score 50% + displacement/room size 15% + bass extension 15% + preferred driver composition 10% + sensitivity 10% = my speaker preference score.
I could have this wrong...
Amir measures a product and then presents his findings on the forum. It's from these measurements that the preference score is calculated.
There is a lot of information in Amir's measurements that covers the parameters presented above. Amir even adds notes sometimes, suggesting that a unit might be more suitable for such-and-such conditions.

One should have measured the listening space and have some idea of its acoustic properties before looking at models or loudspeakers.

From the graphs you can, for example, tell how much bass or treble one is likely to get at the measuring distance, and one of the later graphs shows a far-field estimate. A quick look at the basic specs of the model here, if they are given, or on the manufacturer's website will give you some idea of the power of the amplifier you will need to get the best performance.
Driver composition can also be found on the product website or via a phone call. I'm not quite sure why one would want to know this; whatever the driver is made of, how it performs is shown in the measurements.

What I do think is that many people do not know how to interpret the data, or how to be confident in the data and their subsequent choice.
There isn't much that can be done about this. There are, I'm sure, posts in threads that cover most of the questions people might have. Writing a guide would be a major undertaking if it had to cover the range of knowledge found on an internet forum.
 

Sancus

Major Contributor
Forum Donor
Joined
Nov 30, 2018
Messages
2,923
Likes
7,616
Location
Canada
I haven't read all 54 pages, so perhaps this has been mentioned already, but my quick search of the thread didn't reveal anything.
This thread ran its course and there's not much left to be said, IMO, but all the good discussion was happening around where Dr. Olive posted. For people who haven't seen the thread before I'd recommend reading through that part and then moving on with your lives :)
 