
I cannot trust the Harman speaker preference score

Do you value the Harman quality score?

  • 100% yes

  • It is a good metric that helps, but that's all

  • No, I don't

  • I'm undecided


Sure. That's simply one way to record instruments.

Another would be to, say, record a small orchestra or ensemble using close spot mics for maximum instrumental vividness and texture, which can then be played with in the mix to produce the desired effects, like you find in many Bernard Herrmann scores and recordings. That is literally what he, as a musical artist, was going for.

If we would ONLY accept recordings made with a single goal in mind... what a boring world.
Not only that, but documentary-style recordings just straight up don't work for a lot of genres of music. The studio itself becomes a creative tool.
 
It should be possible to measure something that can show ... far off the subject for this thread.
Sure, and I think our conversation is affected by the language barrier, as neither of us is a native speaker.

So, you still think that directivity should be weighted more heavily than the score does? Or that one needs room treatment, right?
Methinks such a caveat should be scrutinized for plausibility.

Anyway, best regards
 
So, you still think that directivity should be weighted more heavily than the score does? Or that one needs room treatment, right?
It is not that simple. A one-dimensional scale is not sufficient to describe the sound quality of a speaker. Perhaps the score can actually be useful if seen as a measure of frequency response linearity only.

For evaluation purposes, yes, the room needs to be fixed. To enjoy music in your home for pleasure, it is not necessary; it is a choice. For those with a dedicated room, it is a real option. For multi-purpose living rooms, it rarely is, though sometimes it is possible to implement some treatment that improves the situation to a level where it actually works quite well, with the right speakers.

In a "good" room, speakers with very different characteristics can work well, and they will sound different, with a different presentation. In a compromised but still reasonable room, the properties of the speakers can make the difference between a presentation where the recording seems to be in your room and just flat sound coming from speakers. For a customer, a measure that tells which speaker will work in that room has value.
 
It is not that simple. A one-dimensional scale is not sufficient to describe the sound quality of a speaker. Perhaps the score can actually be useful if seen as a measure of frequency response linearity only.

For evaluation purposes, yes, the room needs to be fixed. To enjoy music in your home for pleasure, it is not necessary; it is a choice. For those with a dedicated room, it is a real option. For multi-purpose living rooms, it rarely is, though sometimes it is possible to implement some treatment that improves the situation to a level where it actually works quite well, with the right speakers.

In a "good" room, speakers with very different characteristics can work well, and they will sound different, with a different presentation. In a compromised but still reasonable room, the properties of the speakers can make the difference between a presentation where the recording seems to be in your room and just flat sound coming from speakers. For a customer, a measure that tells which speaker will work in that room has value.
Of course directivity is a factor; that is why it is measured in the spinorama and displayed in two versions: sound power DI and early reflections DI. As I explained in some detail in Post # 905, the single-number rating combines all factors. I do my "manual" analysis by inspecting the curves, and then it is possible to incorporate your own biases, interests, and curiosities. To say that room treatment is an alternative to speaker directivity is to not appreciate how frequency-dependent the off-axis radiation is, or how frequency-dependent the angle-specific absorption of panels is. It can be done, but rarely is.
 
My 2 cents.

As I see it, if we impose on ourselves the necessity to classify a speaker as good or bad, with some (finite or infinite) number of degrees in between, we have already restricted our scale to a single dimension. Because we need to compare two units and be able to tell whether one is better than the other, we require an ordered set of elements to describe how good something is. With two dimensions or more, the sets are no longer totally ordered, and then we can't compare. Is (2,4) better or worse than (3,3)? It's a matter of what we want to achieve. A score is a tool. Does the score of a test describe exactly how proficient a student is in the subject? Of course not, but the better the test is designed, the more reliable the score will be. However, if all students take the same test, their proficiencies become instantly comparable, which is in the end the goal of the test.

Is it better with three numbers rather than one? Well, then one can describe three dimensions, which gives more information, but it doesn't allow us to compare between them unless one defines some distance between them, some score. We need three numbers to specify a point's position, but only one to say which point is closer to us. What do we want to know? Where it is? Or how close it is?

I have the impression that the problem of describing all the parameters that matter in a speaker has already been solved. The task of finding a single number, defining some distance/metric, has at least one candidate.
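
To make the ordering argument concrete, here is a minimal sketch (not from this thread; the weights are invented purely for illustration): tuples such as (2, 4) and (3, 3) cannot be ranked against each other directly, but any chosen scalar scoring function, e.g. a weighted sum, makes them comparable at once.

```python
# Minimal sketch: multi-dimensional ratings are not totally ordered, but any
# scalar scoring function makes them so. The weights are made up for
# illustration only; they are not the Harman weights.

def score(ratings, weights=(0.3, 0.7)):
    """Collapse per-dimension ratings into one comparable number."""
    return sum(r * w for r, w in zip(ratings, weights))

a, b = (2, 4), (3, 3)

# Neither tuple dominates the other in every dimension:
print(all(x >= y for x, y in zip(a, b)), all(y >= x for x, y in zip(a, b)))  # False False

# Once a metric is chosen, the comparison is trivial (and depends on the weights):
print(score(a), score(b), score(a) > score(b))  # 3.4 3.0 True
```

Whether (2, 4) beats (3, 3) is entirely a property of the chosen weights, which is exactly the point about needing a defined metric before comparing.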
 
Of course directivity is a factor; that is why it is measured in the spinorama and displayed in two versions: sound power DI and early reflections DI. As I explained in some detail in Post # 905, the single-number rating combines all factors. I do my "manual" analysis by inspecting the curves, and then it is possible to incorporate your own biases, interests, and curiosities. To say that room treatment is an alternative to speaker directivity is to not appreciate how frequency-dependent the off-axis radiation is, or how frequency-dependent the angle-specific absorption of panels is. It can be done, but rarely is.
Figure 7.6 in the 3rd edition shows the frequency-dependent reflectivity of 2" (50 mm) fiberglass panels as used for early-reflection absorption, with and without fabric covering. They are not "neutral" absorbers of sound. Figure 7.7 shows an engineered diffusing surface with and without fabric. I did not show bad examples from less reputable sources. These absorbing and diffusing surfaces modify the timbre of sounds re-radiated from them. The problem, as I see it, is that right now consumers of these products do not always know their real-world performance. Fortunately the direct sound is paramount, but its influence can be diluted by acoustical treatments that partially scatter and partially absorb sound. Equally fortunate is the fact that humans can adapt to some amount of this abuse - quite a lot, in fact.
 
The bottom line is: one cannot trust the Harman speaker preference score without knowing many other factors, and even then it's down to personal preferences and maybe even some unknown influences (music genre).
It's an oversimplification.
 
What we normally see in the actual in-room response of many speakers that have a high Harman preference score is too little level in the 150-300 Hz area and too much level above it.

Below is an example of this, though in this case the level in the 150-200 Hz range is good, which is often not the case. The red graph is a very typical in-room response of many high-score speakers, while the green one has addressed several shortcomings. The graph is shown with 1/3-octave smoothing and was taken at the listening position. It only shows down to 100 Hz because the two were tuned differently below that when the measurements were taken.
[Attached graph: V1 red V2 green position 5 to 100 Hz.jpg]


If we try both speakers at 5 different positions in the room and average the results, we see the same tendency:
[Attached graph: V1 red V2 green 5 various positions to 100 Hz.jpg]


The green response is clearly better and more even than the red one, and the difference is very audible. Despite that, a spinorama-based score wouldn't show this difference between them, and the estimated in-room response would be the same.

Neither of these is a CBT speaker or a horn, by the way. They use identical drivers, and no EQ at the listening position was applied.
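
For anyone who wants to reproduce this kind of comparison from their own sweeps, below is a minimal sketch of the two operations described above, 1/3-octave smoothing and a power average over several positions, assuming each measurement is already available as frequency/SPL arrays on a common grid (the function names are mine, not from any particular measurement package):

```python
import numpy as np

def third_octave_smooth(freqs, spl_db):
    """Smooth an SPL curve by power-averaging within a 1/3-octave window around each point."""
    freqs = np.asarray(freqs, dtype=float)
    spl_db = np.asarray(spl_db, dtype=float)
    smoothed = np.empty_like(spl_db)
    for i, f in enumerate(freqs):
        lo, hi = f / 2 ** (1 / 6), f * 2 ** (1 / 6)   # 1/3-octave band edges around f
        band = (freqs >= lo) & (freqs <= hi)
        smoothed[i] = 10 * np.log10(np.mean(10 ** (spl_db[band] / 10)))
    return smoothed

def spatial_average(responses_db):
    """Power-average in-room responses taken at different positions (same frequency grid)."""
    spl = np.asarray(responses_db, dtype=float)
    return 10 * np.log10(np.mean(10 ** (spl / 10), axis=0))
```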
 
What we normally see in the actual in-room response of many speakers that have a high Harman preference score is too little level in the 150-300 Hz area and too much level above it.
I can’t see where your generalisation is coming from. “Actual in-room response” in what actual room? Yours? What about mine? The 150-300 Hz region is so strongly subject to room specifics that I can’t see your basis for saying “normally”.
 
The bottom line is: one cannot trust the Harman speaker preference score without knowing many other factors, and even then it's down to personal preferences and maybe even some unknown influences (music genre).
It's an oversimplification.
Given the high confidence in the score from Dr Olive’s research, you should be just as confident as the p-factor suggests, for most typical speakers of typical comparability for a given purpose.

The real thing you cannot trust is that database referenced in the OP, which draws ”Harman scores” from numerous sources. Don’t trust the database as a comparison tool. I just don’t understand why people are choosing to trust that database as a tool for judging whether the scoring methodology can be trusted. Back-to-front logic. I bet if JBL used their own spinorama data on the M2 to derive its “Harman score”, it sure wouldn’t be the number in that database. But hey, trust the database, because….well actually there is no reason.

Your conclusion is the real oversimplification here.
 
I can’t see where your generalisation is coming from. “Actual in-room response” in what actual room? Yours? What about mine? The 150-300 Hz region is so strongly subject to room specifics that I can’t see your basis for saying “normally”.
It comes from an understanding of how a speaker operates with regard to boundaries. With these designs and this directivity, it's unavoidable. I have touched on this earlier in the thread, and on why the Harman score is quite misleading.

There are of course some cases where the dips will be smaller because of peaks in the same area, or the opposite, but generally you'll see this trait. As mentioned previously, google the in-room response of many of the high-score speakers. There are a few examples below from Erin's reviews, but I encourage you to look for a lot more examples to get the general picture.

[Attached graphs: in-room vs PIR.png, PIR vs MIR (7).png, PIR vs MIR (6).png, PIR vs MIR (5).png]


While the estimated in-room response isn't totally off, we can see that it deviates quite a bit. It's in the nature of how these speaker designs work.
 
It comes from an understanding of how a speaker operates with regard to boundaries. With these designs and this directivity, it's unavoidable. I have touched on this earlier in the thread, and on why the Harman score is quite misleading.

There are of course some cases where the dips will be smaller because of peaks in the same area, or the opposite, but generally you'll see this trait. As mentioned previously, google the in-room response of many of the high-score speakers. There are a few examples below from Erin's reviews, but I encourage you to look for a lot more examples to get the general picture.

[Attachments 193885-193888 quoted above]

While the estimated in-room response isn't totally off, we can see that it deviates quite a bit. It's in the nature of how these speaker designs work.
Are those graphs all from the same room? I’m just wanting to make sure I understand the data.
 
Given the high confidence in the score from Dr Olive’s research, you should be just as confident as the p-factor suggests, for most typical speakers of typical comparability for a given purpose.

The real thing you cannot trust is that database referenced in the OP, which draws ”Harman scores” from numerous sources. Don’t trust the database as a comparison tool. I just don’t understand why people are choosing to trust that database as a tool for judging whether the scoring methodology can be trusted. Back-to-front logic. I bet if JBL used their own spinorama data on the M2 to derive its “Harman score”, it sure wouldn’t be the number in that database. But hey, trust the database, because….well actually there is no reason.

Your conclusion is the real oversimplification here.
Even Harman's own score review gave only a 0.86 correlation coefficient. One can only guess what the result would be in some other room, with other loudspeakers and other music... It's worthless. Not the research - the score.
 
Even Harman's own score review gave only a 0.86 correlation coefficient. One can only guess what the result would be in some other room, with other loudspeakers and other music... It's worthless. Not the research - the score.
The score is derived from a model built on the measurements that make up the spinorama, with a statistical correlation to preference. By definition it's not worthless. I encourage you to read the papers. If you don't want to use the preference score, nobody is forcing you to. That doesn't make it "worthless" to everyone else.
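
For context, the published model (Olive, AES 2004) collapses four metrics derived from the spinorama into a single number. Below is a minimal sketch of that final combination only, using the coefficients as published; computing the four input metrics (narrow-band deviations, low-frequency extension, smoothness) from the raw curves is defined in the papers and not reproduced here:

```python
def olive_preference_rating(nbd_on, nbd_pir, lfx, sm_pir):
    """Predicted preference rating per the published Olive (2004) model.

    nbd_on  -- narrow-band deviation of the on-axis response
    nbd_pir -- narrow-band deviation of the predicted in-room response
    lfx     -- low-frequency extension term (log10-based, per the paper)
    sm_pir  -- smoothness of the predicted in-room response
    """
    return 12.69 - 2.49 * nbd_on - 2.99 * nbd_pir - 4.31 * lfx + 2.32 * sm_pir
```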
 
It comes from an understanding of how a speaker operates with regard to boundaries. With these designs and this directivity, it's unavoidable. I have touched on this earlier in the thread, and on why the Harman score is quite misleading.

There are of course some cases where the dips will be smaller because of peaks in the same area, or the opposite, but generally you'll see this trait. As mentioned previously, google the in-room response of many of the high-score speakers. There are a few examples below from Erin's reviews, but I encourage you to look for a lot more examples to get the general picture.

[Attachments 193885-193888 quoted above]

While the estimated in-room response isn't totally off, we can see that it deviates quite a bit. It's in the nature of how these speaker designs work.
The JBL 305 was measured from a relatively short distance (1.5 m), hence the somewhat better correlation with the PIR.
I guess Erin's listening room is mostly drywalled - bass absorption is very high.
 
Even Harman's own score review gave only a 0.86 correlation coefficient. One can only guess what the result would be in some other room, with other loudspeakers and other music... It's worthless. Not the research - the score.
Interesting that commentators continue to omit the dataset issues in their rush to conclude the score is useless.

0.86 is more than adequate to be useful for practical purposes. I expect the reliability of the score, as a predictor of preference between speakers with a score difference of at least (say) 1.0, would be extremely good, useful, and worthwhile. If the data were reliable.

Remember that when they controlled the testing methodology, they got a correlation coefficient of 0.995. That was with a limited set of speakers, but it is the only dataset where the testing methodology was controlled. It's a pity the speaker set was so limited, but there is no other dataset with a controlled methodology, and that includes the larger set the 0.86 comes from.

Dr Olive has lamented the lack of further research into the ranking tool with a wider range of speakers, but there you go. The issue remains consistency of testing methodology (i.e. quality of data), and that is the primary issue with the database in the OP.

IIRC, Dr Olive has also said that, of the hundreds of people he has taken through his test regime since the research into the score was published, no one has preferred speakers with a significantly lower score. And that includes floorstanders. That comment was years ago and may or may not still be current.

The issue is not the research, nor the scoring tool - it is the data quality and how it is used. If the dataset of scores had been sourced from a single independent professional-grade laboratory in the business of conducting such tests and publishing its results, then I expect that, between speakers with a score difference of at least (say) 1.0, it would be an extremely good predictor.
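
For reference, the 0.86 and 0.995 figures under discussion are plain correlation coefficients between model-predicted ratings and listener ratings. A minimal sketch of how such a number is computed for any set of (predicted, observed) pairs, with no Harman data included here:

```python
import numpy as np

def pearson_r(predicted, observed):
    """Pearson correlation between model-predicted scores and listener preference ratings."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    p = p - p.mean()
    o = o - o.mean()
    return float(np.sum(p * o) / np.sqrt(np.sum(p ** 2) * np.sum(o ** 2)))
```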
 
Nice graphs from @Bjorn. My conclusion from them would be that the in-room response is so room-dominated that the Harman score is probably only relevant for purchasing decisions if you're going to apply room correction.
 