
Broader Considerations and Limits of Harman Headphone/IEM Curves

Ilkless

Major Contributor
Forum Donor
Joined
Jan 26, 2019
Messages
1,771
Likes
3,502
Location
Singapore
I was asked to write a three-part article series on the Harman headphone/IEM curves by an up-and-coming headphone blog. Being acutely aware that I could never measure up to existing articles about the curves and their experimental derivation from luminaries such as Sean Olive and Tyll Hertsens, I decided to focus less on the curve itself. Why come up with a pale imitation of the existing coverage?

I chose another angle that seemed apparent based on my understanding of (psycho)acoustics, but that, to my knowledge, has not been explicitly articulated in any head-fi community or on any headphone website. I situate the curves within two broader phenomena. The first is a peculiar contradiction at the heart of improving spatial sound reproduction: cancelling or introducing crosstalk. Both are touted to improve spatial reproduction, and supporters of both have empirical evidence on their side. My role was simply to compare the evidence and explain the optimal applications of each solution, resolving the contradiction. The second is the multitude of headphone target curves that are derived from different experimental techniques, with different premises from the Harman curves. I summarise how some major headphone target curves are derived differently from the Harman curves, and compare the premises of these derivations. I must stress that I am not a professional psychoacoustician or audio engineer, but an interested reader of the research. Moreover, due to work and other pursuits, I have not kept as close an eye on psychoacoustic research in the last 2 years or so. AFAIK there has not been any foundational change in the concepts and evidence I use in my analysis, but please correct me if I have erred. Cheers.
 

pozz

Слава Україні
Forum Donor
Editor
Joined
May 21, 2019
Messages
4,036
Likes
6,827
Really nice article: informative and linked to a bunch of good stuff.
 
OP
Ilkless

Very grateful that Sean Olive took the time to give his opinions on the article on his Twitter account. See here. He adds that he arrived at the Harman curve (meant for broadly pleasurable, non-crossfeed playback of stereo recordings over headphones) using David Griesinger's binaural compensation technique for head-externalised sound. Interesting that he arrived at the same curve using a vastly different technique for an entirely different purpose. Of course, the Griesinger loudness comparison method is quick and dirty (in a good sense), so there could be all manner of reasons why.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,759
Likes
37,612
You have written an excellent article here. I am one who apparently has very non-standard HRTFs. Binaural never works for me and often sounds really bad. I hate that Chesky abandoned two-mike recording and went with binaural. I loved their old recordings, and though they process the new ones to work both binaurally and over speakers, they are at best mediocre when I listen to them.

I use Sony MDR 7510 phones. They have very large drivers that sit further from your ears and are angled. They still don't make binaural work for me, but they make it less bad. They also sound better for normal stereo. Over the years I've owned Stax Lambdas, the Stax Nova, the old Koss ESP9, Beyer DT880, Grado and some of the old little Nakamichi SP7 phones, which sat on the ear at an angle (and were long-time favorites). I still have some Beyer DT880s, for which I think I am going to try to make custom ear cushions that push them away from, and angle them toward, my ears.

I think Harman was largely successful in breaking the "circle of confusion" for loudspeakers. Initially they implied similar benefits for headphones. It was quickly apparent they have not broken through on that yet. If you offer three possible groups so divergent as those you list in the article then you don't have much. Maybe more than nothing, but not much. Mainly you are showing the need for individual adjustments.

Turning on crosstalk on phones has never been satisfactory for me. I do some recording, and you can use an ORTF or NOS pair which will work pretty well over phones. When I've done up close multi-miking and just panned tracks, you can't pan anything 100% right or left for phones. It will work for speakers, but sounds wrong on phones. If you make it an 85%/15% pan it is okay. So for such creations you need some kind of crosstalk.
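As a rough illustration of the 85%/15% pan limit described above, here is a minimal Python sketch. It assumes the percentages are linear amplitude shares between the two channels, which the post does not specify; `pan_gains`, `pan_mono` and `max_share` are hypothetical names used only for illustration.

```python
def pan_gains(pan, max_share=0.85):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    pan = -1.0 is the leftmost position, 0.0 is centre, +1.0 is rightmost.
    max_share (an assumed parameter) caps the dominant channel's share, so
    the opposite ear always receives at least 1 - max_share of the signal.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    # Linear amplitude share: 0.5 at centre, max_share at either extreme.
    right = 0.5 + pan * (max_share - 0.5)
    left = 1.0 - right
    return left, right


def pan_mono(samples, pan, max_share=0.85):
    """Spread a mono track (list of floats) into (left, right) sample pairs."""
    left_gain, right_gain = pan_gains(pan, max_share)
    return [(s * left_gain, s * right_gain) for s in samples]


if __name__ == "__main__":
    # A track panned fully right still leaks 15% into the left channel.
    print(pan_gains(1.0))
```

With `max_share=1.0` the same function gives a conventional hard pan, which is the case the post says works on speakers but sounds wrong on headphones.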

I would say the Griesinger method worked better than the others I've tried, with Sonarworks EQ being next best. Mainly, I don't like listening on phones. If they could be made to sound like speakers, it would be a big step forward. I've also heard a few demos of crosstalk cancellation over speakers which, when it works, is really something too.

Thanks for your article.
 
OP
Ilkless

Blumlein 88 said:
You have written an excellent article here. I am one who apparently has very non-standard HRTFs. Binaural never works for me and often sounds really bad. I hate that Chesky abandoned two-mike recording and went with binaural. I loved their old recordings, and though they process the new ones to work both binaurally and over speakers, they are at best mediocre when I listen to them.

Yes, I would rather have an otherwise-realistic image in miniature, inside my head, than the disconcerting, blurry halo clinging to my scalp that Chesky binaural gives me.
 

JohnYang1997

Master Contributor
Technical Expert
Audio Company
Joined
Dec 28, 2018
Messages
7,175
Likes
18,300
Location
China
The issue lies in the difference between measurement equipment and the human ear's acoustic transfer function (not just one impedance curve), and in the difference between preference and accurate reproduction, as well as in non-ideal methodology at various stages of the research.
 
OP
Ilkless

JohnYang1997 said:
The issue lies in the difference between measurement equipment and the human ear's acoustic transfer function (not just one impedance curve), and in the difference between preference and accurate reproduction, as well as in non-ideal methodology at various stages of the research.

Also, any correction demands more precision than can practically be realised, compared to speakers at a significant distance from the head, ears and torso. The short distance from ear to headphone/IEM, as well as the sensitivity to variations over time and between listening sessions, makes it almost impossible, unless the AKG N90Q-style approach is vastly refined with some sort of feedback mechanism.

I think the Harman preference approach is simply untenable given how much variation there is in the circle of confusion, in listening material, and from session to session in the transfer function (except in the obvious, coarse, universal features like the 3 kHz peak) - better to work with demonstrable accuracy and then give users autonomy to tweak to taste. Sean did concede that individualisation is needed for both better spatiality and tonality.
 

JohnYang1997

Ilkless said:
Also, any correction demands more precision than can practically be realised, compared to speakers at a significant distance from the head, ears and torso. The short distance from ear to headphone/IEM, as well as the sensitivity to variations over time and between listening sessions, makes it almost impossible, unless the AKG N90Q-style approach is vastly refined with some sort of feedback mechanism.

I think the Harman preference approach is simply untenable given how much variation there is in the circle of confusion, in listening material, and from session to session in the transfer function (except in the obvious, coarse, universal features like the 3 kHz peak) - better to work with demonstrable accuracy and then give users autonomy to tweak to taste. Sean did concede that individualisation is needed for both better spatiality and tonality.

It's possible when the listener is at least somewhat trained (not necessarily a "trained listener") and can tell a 0.5 dB difference at each frequency point from 500 Hz to 10 kHz, and a 1 dB difference from 100 Hz to 12 kHz. And everyone goes through the process of matching the in-ear response (perceived response) to speakers. That will disprove a lot of people who say their HRTFs are very different. Instead, they are simply used to certain sounds.
It's difficult because the majority of people don't like learning to like something.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
Ilkless said:
Also, any correction demands more precision than can practically be realised, compared to speakers at a significant distance from the head, ears and torso. The short distance from ear to headphone/IEM, as well as the sensitivity to variations over time and between listening sessions, makes it almost impossible, unless the AKG N90Q-style approach is vastly refined with some sort of feedback mechanism.

It's not only the distance between ear and transducer, but also (at least arguably) the acoustic impedance of the volume of air between the two.

This is the one reservation I've had about headphone target curves based solely on the measured response in the ear canal: surely the correct target is partially dependent on the distance and acoustic impedance of the air between transducer and eardrum?

To illustrate this with a thought experiment, imagine a listener who listens to two speakers at a distance of 2+ metres in a typical room, but with a divider ensuring there is no crosstalk from L speaker to R ear and vice versa. In other words, a situation in which speakers are listened to with acoustic (not electronic) crosstalk cancellation. Or in yet other words, a pair of headphones with transducers 2 m from the ears and cups half the volume of the room.

In such a scenario, should the loudspeakers' frequency response be flat, or should it be whatever response results in a measured response at the eardrum that matches a desired target response?

I can't help thinking that the former is the better answer. Perhaps I'm wrong... Anyway, if I'm right, does this not mean that our target response must be partially dependent on the distance from transducer to ear and acoustic impedance of the volume of air between them?

(I don't know the answer to this btw, just hoping someone else has an interesting answer.)
 
OP
Ilkless

JohnYang1997 said:
It's possible when the listener is at least somewhat trained (not necessarily a "trained listener") and can tell a 0.5 dB difference at each frequency point from 500 Hz to 10 kHz, and a 1 dB difference from 100 Hz to 12 kHz. And everyone goes through the process of matching the in-ear response (perceived response) to speakers. That will disprove a lot of people who say their HRTFs are very different. Instead, they are simply used to certain sounds.
It's difficult because the majority of people don't like learning to like something.

That is the loudness-comparison approach that David Griesinger (1/3-octave) and Linkwitz (swept sine) used. HRTFs are different; the question is whether we can help to moderate the perceived difference through any combination of training with the non-individualised HRTF, dynamic cues such as head-tracking, and visual cues.

https://www.frontiersin.org/articles/10.3389/fnins.2018.00021/full

Good open-access paper, for all, on this.
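The loudness-comparison idea mentioned above can be sketched in a few lines of Python. This is only a toy illustration, not Griesinger's or Linkwitz's actual procedure: it assumes the listener has already adjusted each band by ear until it sounds as loud as a reference band, once over a speaker and once over the headphone, and the function name, the dictionary layout and the 500 Hz reference band are all assumptions made for the example.

```python
def matching_eq(speaker_trim_db, headphone_trim_db, reference_hz=500):
    """Derive a per-band headphone EQ (in dB) from two loudness-matching runs.

    Each argument maps a band centre frequency (Hz) to the level trim the
    listener dialled in so that the band sounded as loud as the reference
    band. Both runs must cover the same bands, including reference_hz.
    """
    if speaker_trim_db.keys() != headphone_trim_db.keys():
        raise ValueError("both runs must cover the same bands")
    eq = {}
    for hz in sorted(speaker_trim_db):
        # Normalise each run to its own reference band.
        spk = speaker_trim_db[hz] - speaker_trim_db[reference_hz]
        hp = headphone_trim_db[hz] - headphone_trim_db[reference_hz]
        # Extra trim needed on the headphone, relative to the speaker,
        # is the EQ that makes the headphone match the speaker by ear.
        eq[hz] = hp - spk
    return eq


if __name__ == "__main__":
    # Made-up trims purely to show the arithmetic, not measured data.
    speaker = {500: 0.0, 1000: -1.0, 3000: -4.0}
    headphone = {500: 0.5, 1000: 0.5, 3000: -1.5}
    print(matching_eq(speaker, headphone))
```

Because both runs go through the same ears, whatever the listener's own HRTF does to the speaker signal is baked into the result, which is the appeal of the approach.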
 

JohnYang1997

andreasmaaan said:
It's not only the distance between ear and transducer, but also (at least arguably) the acoustic impedance of the volume of air between the two.

This is the one reservation I've had about headphone target curves based solely on the measured response in the ear canal: surely the correct target is partially dependent on the acoustic impedance of the air between transducer and eardrum?

To illustrate this with a thought experiment, imagine a listener who listens to two speakers at a distance of 2+ metres in a typical room, but with a divider ensuring there is no crosstalk from L speaker to R ear and vice versa. In other words, a situation in which speakers are listened to with acoustic (not electronic) crosstalk cancellation. Or in yet other words, a pair of headphones with transducers 2 m from the ears and cups half the volume of the room.

In such a scenario, should the loudspeakers' frequency response be flat, or should it be whatever response results in a measured response at the eardrum that matches a desired target response?

I can't help thinking that the former is the better answer. Perhaps I'm wrong... Anyway, if I'm right, does this not mean that our target response must be partially dependent on the distance from transducer to ear and acoustic impedance of the volume of air between them?

(I don't know the answer to this btw, just hoping someone else has an interesting answer.)

Ideally the two scenarios should match. When they don't, it becomes the question of whether to EQ your speakers flat in an anechoic chamber or to an in-room house curve.
 

andreasmaaan

JohnYang1997 said:
Ideally the two scenarios should match. When they don't, it becomes the question of whether to EQ your speakers flat in an anechoic chamber or to an in-room house curve.

Yes, and IME and in my understanding of most of the literature, the ideal (to slightly oversimplify) would be a flat direct sound in the mids and highs, with correction (if necessary) in the bass to compensate for the room's influence.

But this would certainly not always result in a response at the eardrum that looked like, e.g., the Harman target curve (or whatever target curve you view as desirable).
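The "correct the bass only, leave the direct sound in the mids and highs alone" idea above can be sketched roughly as follows. This is a simplified Python illustration, not anyone's actual room-correction algorithm: the 300 Hz transition frequency and the 6 dB boost limit are assumed values chosen for the example, and `bass_only_correction_db` is a hypothetical name.

```python
def bass_only_correction_db(in_room_deviation_db, transition_hz=300.0,
                            max_boost_db=6.0):
    """Return a per-frequency EQ (dB) that only corrects below transition_hz.

    in_room_deviation_db maps frequency (Hz) to how far the measured in-room
    response deviates from the desired level at that frequency (+ = too loud).
    Above the (assumed) transition frequency no correction is applied, on the
    premise that the speaker's direct sound there is already flat.
    """
    correction = {}
    for hz, deviation in in_room_deviation_db.items():
        if hz < transition_hz:
            # Invert the room's effect, but cap boosts so that deep nulls
            # are not filled with large amounts of gain.
            correction[hz] = min(-deviation, max_boost_db)
        else:
            correction[hz] = 0.0
    return correction


if __name__ == "__main__":
    # Made-up deviations: a room mode at 50 Hz, a null at 100 Hz,
    # and a 3 dB rise at 1 kHz that is deliberately left untouched.
    print(bass_only_correction_db({50: 8.0, 100: -10.0, 1000: 3.0}))
```

The eardrum response that results from such a scheme depends on the room and the listener, which is why it need not resemble any fixed headphone target.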
 

JohnYang1997

andreasmaaan said:
Yes, and IME and in my understanding of most of the literature, the ideal (to slightly oversimplify) would be a flat direct sound in the mids and highs, with correction (if necessary) in the bass to compensate for the room's influence.

But this would certainly not always result in a response at the eardrum that looked like, e.g., the Harman target curve.

The Harman targets, especially the in-ear ones, are just wrong IMO. The original 2013 version was actually pretty good. The issue again lies in the measurement equipment: it simply behaves differently from human ears with shallow insertion. Current equipment is only accurate with deep-insertion IEMs like the ER4. Open-canal responses are accurate only over a limited frequency range. The new hi-res head is not the solution, because fundamentally it's a difficult problem: the acoustic impedance would have to be exactly the same as that of a typical human ear canal, for any arbitrary air volume in the ear canal.
The current solution is just to use human ears to perform a sine sweep and FR matching.
 
OP
Ilkless

JohnYang1997 said:
The Harman targets, especially the in-ear ones, are just wrong IMO. The original 2013 version was actually pretty good. The issue again lies in the measurement equipment: it simply behaves differently from human ears with shallow insertion. Current equipment is only accurate with deep-insertion IEMs like the ER4. Open-canal responses are accurate only over a limited frequency range. The new hi-res head is not the solution, because fundamentally it's a difficult problem: the acoustic impedance would have to be exactly the same as that of a typical human ear canal, for any arbitrary air volume in the ear canal.
The current solution is just to use human ears to perform a sine sweep and FR matching.

Olive's argument is that the free-field and diffuse-field curves do not account for the realistic amounts of reflection in the domestic listening rooms that we are used to.

I'm encouraged by the gradual emergence of integrated, adaptive approaches like the N90Q and the AirPods Pro. Someone needs to make a high-performance version of such methods, with constant hi-res measurement and adaptation for placement variation, at least.
 

JohnYang1997

Ilkless said:
Olive's argument is that the free-field and diffuse-field curves do not account for the realistic amounts of reflection in the domestic listening rooms that we are used to.

I'm encouraged by the gradual emergence of integrated, adaptive approaches like the N90Q and the AirPods Pro. Someone needs to make a high-performance version of such methods, with constant hi-res measurement and adaptation for placement variation, at least.

The issue is that no one at Harman talks about the in-ear response. Rig measurements can differ by more than 10 dB from measurements in the ear at certain frequencies. That's not at all the same issue as using a speaker-in-room-based target (the Harman target isn't one, though it is kind of derived from one and heavily modified) versus DF/FF. The high-frequency peaks are a nightmare for high-performance reproduction; they are unbearable. Almost no shallow-insertion IEMs meet my criteria. Only the deep-insertion ER2SE and ER4 series, and earbud types like the Liebesleid and Chaconne, do.

PS: He's using the same argument as in 2013, by the way. That argument on its own isn't good enough. If anything, the result is worse than DF + house-curve-based targets.
 
OP
Ilkless

JohnYang1997 said:
Rig measurements can differ by more than 10 dB from measurements in the ear at certain frequencies. That's not at all the same issue as using a speaker-in-room-based target (the Harman target isn't one, though it is kind of derived from one and heavily modified) versus DF/FF.

I'm not sure what you are saying: that the manikin differs because of different acoustic impedance, or because of differences from the individual HRTF, or both? Or are you saying they don't measure at the drum? Because they definitely do, but of course the drum reference point of the manikin isn't going to be very individual.

Definitely, everyone naturally imposes their own HRTF upon the in-room speaker target, since it's a free-field source at a large distance relative to torso, head and ear dimensions, with no acoustic impedance differences. While your concern with the resulting change in ear canal resonances etc. is valid, my bigger issue is with the straightforward matter of even implementing the correction (especially high-Q ones) amid the placement/insertion variations. This needs real-time, in-situ measurements.
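To make the point about high-Q corrections and placement variation concrete, here is a toy Python sketch. It models a resonance peak and its correcting cut as identical Gaussian bells on a log-frequency axis, which is a deliberate simplification rather than a real parametric EQ filter; the function names, the 2-16 kHz grid and the example widths are all assumptions for illustration.

```python
import math


def bell_db(f_hz, centre_hz, gain_db, width_oct):
    """Toy bell shape: gain_db at centre_hz, falling off over width_oct octaves."""
    x = math.log2(f_hz / centre_hz) / width_oct
    return gain_db * math.exp(-0.5 * x * x)


def worst_residual_db(tuned_hz, actual_hz, gain_db, width_oct):
    """Largest error left when a cut tuned for a peak at tuned_hz meets a
    peak that has moved to actual_hz (e.g. after reseating the headphone)."""
    # Evaluate on a 1/48-octave grid from 2 kHz to 16 kHz.
    freqs = [2000.0 * 2.0 ** (i / 48.0) for i in range(48 * 3 + 1)]
    return max(
        abs(bell_db(f, actual_hz, gain_db, width_oct)
            - bell_db(f, tuned_hz, gain_db, width_oct))
        for f in freqs
    )


if __name__ == "__main__":
    # Same 8 kHz -> 9 kHz shift, same 10 dB peak, two different widths.
    narrow = worst_residual_db(8000.0, 9000.0, 10.0, 0.1)
    broad = worst_residual_db(8000.0, 9000.0, 10.0, 0.5)
    print(f"narrow: {narrow:.1f} dB residual, broad: {broad:.1f} dB residual")
```

In this toy model the narrow correction leaves a much larger residual than the broad one for the same shift, which is the reason a fixed high-Q correction is fragile without in-situ measurement.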
 

JohnYang1997

Ilkless said:
I'm not sure what you are saying: that the manikin differs because of different acoustic impedance, or because of differences from the individual HRTF, or both? Or are you saying they don't measure at the drum? Because they definitely do, but of course the drum reference point of the manikin isn't going to be very individual.

Definitely, everyone naturally imposes their own HRTF upon the in-room speaker target, since it's a free-field source at a large distance relative to torso, head and ear dimensions, with no acoustic impedance differences. While your concern with the resulting change in ear canal resonances etc. is valid, my bigger issue is with the straightforward matter of even implementing the correction (especially high-Q ones) amid the placement/insertion variations. This needs real-time, in-situ measurements.

1. Dummy heads differ from each other, yes. That wasn't the point of my last post, but it's true. The most different is the hi-res head, which is currently bad.
2. My last point was about the difference between the human ear and the HATS. There are two ways of measuring it: directly measure the response using a probe mic at the eardrum, or use the listener's perception to judge the response directly.
3. I have been looking into, and helping to design, earphones for some time. The current best offerings from Moondrop were heavily influenced by me. The issue is that the frequency response can look amazing in measurements but have a huge peak at 6-7 kHz in the ear. Or there can be a measured peak at 8 kHz that is actually a dip in the ear. No current measurement equipment tells you that, but a sine sweep in the ear does.
 