Thank you for posting this, GaryH.
If you are using the size of the scores as your gauge for a listener's ability to discriminate, then based on the above graph it appears that the trained listeners performed slightly better on that than the untrained listeners (and also listeners as a whole).
I believe this graph is from one of Harman's early studies though, which was based on a relatively small sampling of listeners that included some Harman employees (including possibly Dr. Olive?). So I'd discourage people from trying to draw too many broad-based conclusions from the results in this test about headphone users and listeners as a whole. If you use the degree of spread or difference between the highest and lowest ranked headphones in the study as your gauge for which group is the most discriminating though, then the untrained listeners performed about the same as the trained listeners, because there is about a 50 point spread between the highest and lowest rated headphones in both groups.
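To make the "spread" gauge concrete, here is a minimal sketch of the calculation I have in mind. The ratings below are invented purely for illustration (they are not taken from the Harman graph), and the headphone labels and the `rating_spread` helper are just names I made up.

```python
# Hypothetical mean preference ratings (0-100 scale) for four headphones.
# These numbers are invented for illustration; they are NOT from the Harman study.
trained = {"HP A": 75.0, "HP B": 55.0, "HP C": 40.0, "HP D": 25.0}
untrained = {"HP A": 80.0, "HP B": 62.0, "HP C": 58.0, "HP D": 30.0}


def rating_spread(means):
    """Return the difference between the highest and lowest mean rating."""
    values = list(means.values())
    return max(values) - min(values)


trained_spread = rating_spread(trained)
untrained_spread = rating_spread(untrained)
print(trained_spread, untrained_spread)
```

In this made-up example both groups end up with the same top-to-bottom spread, even though the untrained group's middle headphones sit closer together, which is the kind of pattern I'm describing.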
The (apparent) ability of the trained listeners to better separate some of the headphones ranked in the middle of the preference range, compared to the untrained listeners, is interesting. But I'm not sure that necessarily translates to a better ability at correctly identifying "good sound quality" on their part. What it might possibly indicate is some increased ability to parse, separate, or form opinions on headphones with different types of mediocre sound quality... Which could be a very useful skill for some headphone reviewers.
-----------------------------------
The generally smaller error bars on the group of trained listeners are also not that surprising. If you take a group of people and train them to listen for certain qualities or characteristics in a transducer (for example), then you'd expect their responses to be more in sync or consistent with one another than those of a group of individuals who have not been trained to look for those specific qualities or traits. So that just makes sense.
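Here is a rough sketch of why more agreement between listeners means smaller error bars. I'm assuming the error bars are something like 95% confidence intervals around the mean rating (I haven't confirmed that for this particular graph), and the ratings and the `ci_half_width` helper are invented for illustration only.

```python
import math
import statistics


def ci_half_width(ratings, z=1.96):
    """Approximate 95% confidence interval half-width for the mean rating.

    Uses the normal approximation z * s / sqrt(n); this is an assumption
    about how the error bars were computed, not something from the study.
    """
    sd = statistics.stdev(ratings)  # sample standard deviation
    return z * sd / math.sqrt(len(ratings))


# Hypothetical ratings of ONE headphone by six listeners in each group.
# Both groups have the same mean (50), but different agreement.
consistent_group = [48, 50, 52, 49, 51, 50]  # listeners largely agree
scattered_group = [30, 70, 45, 60, 35, 60]   # listeners disagree more

print(statistics.mean(consistent_group), ci_half_width(consistent_group))
print(statistics.mean(scattered_group), ci_half_width(scattered_group))
```

Both groups give the same average score here, but the group whose members agree with one another gets a much narrower interval. So smaller error bars tell you about consistency within the group, not about whether the average itself is more "correct".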
Notice also that the error bars are about the same size in the two groups on the two most highly rated headphones in the test (which are Harman's "Target" and "Target 2"). The differences in the size of the error bars between the two groups only start to become more apparent or significant as you go down the scale towards the headphones with lower preference ratings than the top two.
Imo, that indicates a similar ability in the two groups to discriminate the sound quality of the "better sounding" headphones (which is perhaps a more useful skill), and a somewhat reduced ability in the untrained listeners to form those same kinds of opinions on the "poorer sounding" headphones. To put this another way, the trained listeners were better at forming the same or similar opinions to one another when listening to what they perceived as a headphone with poorer or less-pleasing sound quality. (Which is also not surprising, because that's undoubtedly what the Harman training taught them to do!)
It takes an additional step (or maybe even two, or possibly three) of logic or imagination though, imho, to reach the conclusion that this would somehow also make the trained listeners better at correctly identifying (and discriminating) "good sound quality" in a more general sense, as solderdude seemed to imply in his previous comment.
-----------------------------------
As far as objective gauges or metrics for a headphone's sound quality are concerned, I think the only conclusion that the Harman research really reached was that people generally seemed to prefer the sound of a headphone that more closely resembled the frequency response of a pair of well-extended, neutral, anechoically flat loudspeakers in a room than a headphone which did not... Which is basically the same conclusion they also reached in their loudspeaker tests. I think the only other sound quality characteristic or issue they looked at a little bit on headphones (and made public) was nonlinear distortion. And I believe in the one or two tests they did on that, they found a relatively low correlation between that characteristic and a headphone listener's preferences. Some more research is probably needed on this though.
Going back to the FR testing... If I remember right, I think the first Harman Target on the above graph, which was ranked the highest by the untrained listeners, was the headphone that probably came the closest to the measured in-ear response of a good loudspeaker in a room. And Harman's "Target 2" was based on a similar response, but with both the bass and treble reduced a bit. So if you were using the (unaltered) response of a neutral loudspeaker in a room as your metric for "good sound quality", then it would seem from the above graph that the untrained listeners were actually better at identifying that correctly than the trained listeners, because they gave the first Harman Target a higher score than the modified Harman Target 2.
I'm going mostly by memory on a lot of this though. So I'm not entirely sure this is all correct.