
The Nature of Headphone Measurements

A secondary point is that measurement accuracy in the high end, to the extent that we would like it to be "completely accurate", is first of all a problem for scientific discovery. Hopefully the methods will get better and more accurate than they are now.

People wanting to purchase high end headphones need to be aware of inaccuracies in the high end of headphone measurements as well.

My opinion is that I generally don't want to drop big money on headphones, because the people who understand measurements are saying they can't be totally confident in the measurements above 8 kHz ... This is less of an issue with budget-friendly gear simply because the financial stakes are lower. It doesn't mean don't buy expensive headphones ... It's just a point of contention for me with spending big money on headphones.
 
There is a notable absence of robust data validating upper treble measurements. Anecdotally, I have used IEMs that sounded piercing or muted even when the graph appeared acceptable to good. Conversely, other IEMs displayed pronounced peaking features on the graph but sounded entirely acceptable in the treble. Now, there are reasons supported by data for why everyone should doubt the value of upper treble IEM measurements. My experience is therefore entirely expected given the uncertainty factors that exist:

1. Insertion depth
2. HRTF
3. Target curve smoothing
4. Acoustic impedance

Each affects the FR substantially and makes you question IEM measurements as a metric if you're concerned with personal listening quality.

[Attached image: 1736699058551.png]

Variation in insertion depth alone causes peaks and dips in the FR, but most enthusiasts are not aware of these factors, or of how they could negate FR complaints that are really highly conditional.
 
Let's recap: headphone measurements above 8 kHz aren't as precise as those below 8 kHz (there was some talk about IEM measurements being more precise than over-ears). From a purely scientific standpoint, they are still not completely "reliable" for telling the true, exact frequency response of the headphones above 8 kHz.
Yes, and so far I don't think anyone has answered my question: is this an actual problem? I honestly don't know.

IEM response varies per person. Also (not mentioned yet, I think, but it seems rather crucial if I understand it correctly), psychoacoustics shows that the hearing system's frequency resolution is frequency dependent, and above 1 kHz the bandwidth can be approximated as 0.2f. That's already 1.6 kHz wide at 8 kHz.
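To put numbers on the 0.2f rule of thumb quoted above, here is a minimal sketch. The function name is made up for illustration, and centring the band on the frequency is my assumption, not something the rule itself states.

```python
def approx_bandwidth_hz(f_hz: float, ratio: float = 0.2) -> float:
    """Rule of thumb from the post: bandwidth is roughly 0.2 * f above about 1 kHz."""
    return ratio * f_hz


for f in (2_000, 8_000, 10_000, 12_000):
    bw = approx_bandwidth_hz(f)
    # Assumption: the band is centred on f, which is only one way to draw it.
    print(f"{f:>6} Hz -> band about {bw:.0f} Hz wide, "
          f"roughly {f - bw / 2:.0f} to {f + bw / 2:.0f} Hz")
```

Under that assumption a feature measured at 10 kHz and one at 10.1 kHz fall well inside the same band, which is the point of the question that follows.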

So, with these 2 combined, to me that sounds like: measurements are inaccurate but response and hearing at those frequencies are also 'inaccurate' and person-dependent, so what do more accurate measurements bring to the table?

Like if a measurement shows ringing at 10kHz or so, ok then as you say that is a real effect and I'm not debating that nor whether that is useful, but my point is: you put this IEM in your ear and for you personally the ringing is actually at 11kHz. Does that change anything at all? Would you have made another decision when the measurement showed ringing at 10.1kHz? Or is that not what you mean with better measurements?
 
I'm late, which is ironic since I registered only to clarify this topic. Then I saw people arguing about those dips and peaks, reaching for on-paper perfection. At first it looked like audio-phoolery to me.

First of all, the Harman target gets rid of peaks and dips by measuring different individuals and then averaging their in-ear signals when they are exposed to some diffuse field, right? Each individual shows „resonances“, but at a somewhat different frequency; so different that in the average everything is summed up and we are presented with a smooth curve. A peak in one person compensates for a dip in another, and so forth. Right?

Let us assume the measurement rig represents an average person. We still expect it to show peaks and dips, as every individual does. In order to represent the average pattern of peaks and dips we must not sum and divide the individuals' responses. Instead, the median of the measured parameters needs to be evaluated: the frequency and width of dips and peaks, possibly even their number.
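A toy illustration of the argument above, using invented numbers rather than measured data: seven hypothetical listeners each get one 10 dB resonance peak at a different frequency. Averaging the curves flattens the peaks, while the median of the peak frequencies still describes a "typical" peak. The Gaussian peak shape and all values are assumptions made only for this sketch.

```python
import math
import statistics


def peak_db(f_hz, centre_hz, height_db=10.0, width_oct=0.1):
    """One Gaussian-shaped peak on a log-frequency axis (illustrative shape only)."""
    octaves = math.log2(f_hz / centre_hz)
    return height_db * math.exp(-0.5 * (octaves / width_oct) ** 2)


# Hypothetical peak frequencies for seven listeners (invented, not measured).
centres_hz = [6500, 7200, 8000, 8800, 9600, 10500, 11500]
# Frequency grid from 5 kHz to about 14 kHz in 1/24-octave steps.
grid_hz = [5000 * 2 ** (i / 24) for i in range(37)]

individual = [[peak_db(f, c) for f in grid_hz] for c in centres_hz]
averaged = [statistics.mean(column) for column in zip(*individual)]

tallest_individual_db = max(max(curve) for curve in individual)
tallest_averaged_db = max(averaged)
median_centre_hz = statistics.median(centres_hz)

print(f"tallest peak in any individual curve: {tallest_individual_db:.1f} dB")
print(f"tallest feature in the averaged curve: {tallest_averaged_db:.1f} dB")
print(f"median peak frequency across listeners: {median_centre_hz} Hz")
```

The averaged curve comes out far smoother than any single listener's curve, even though every listener has a full-height peak somewhere.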

Side note: how can someone not understand this and nevertheless feel empowered to shell out 5000 on a headphone amp?

In summary, peaks and dips are expected to show in frequency response graphs of headphones made on an ear simulator. If they don't, the device under test is broken by design. With IEMs especially, the case is easy. Peaks and dips depend on the internals of the device, but also on the fit. The question is how closely these resemble the peaks and dips that occur with a particular individual's open, not to say naked, ear. In the same vein, the match to the individual's pinna amplification, which depends on inclination, is to be scrutinized. Individually, not against some median/average.

I measured an IEM of no further interest with a blocked nozzle, that is, directly at its output canal. The results need some further investigation.

At the same time I tried to identify the problems with peaks and dips. An equalizer was shifted over the treble band. But I'm too old to generalize the outcome.

This all said and done, somebody else may be willing to smoothly conclude on the merits of the „target curves“.
 
So, with these 2 combined, to me that sounds like: measurements are inaccurate but response and hearing at those frequencies are also 'inaccurate' and person-dependent, so what do more accurate measurements bring to the table?

Like if a measurement shows ringing at 10kHz or so, ok then as you say that is a real effect and I'm not debating that nor whether that is useful, but my point is: you put this IEM in your ear and for you personally the ringing is actually at 11kHz. Does that change anything at all? Would you have made another decision when the measurement showed ringing at 10.1kHz? Or is that not what you mean with better measurements?

More or less yes. Everything up to "why bother with accurate measurements". My conclusion is different.

Accurate measurements would mean that a measured ring at 11 kHz would be the definitive answer that would allow us to adjust and tune accordingly. Accurate measurements are the goal. As it stands, we have an approximation of what is happening above 8 kHz, and it's helpful ... though we shouldn't stop pursuing more accurate ways of measuring.
 
That's a non sequitur. We need to acknowledge that limitations exist. In no way does that mean lacking a desire for accurate and meaningful measurements.
 
Hold on ... you said the same thing as me using different words and called it a non-sequitur (which means an illogical conclusion ...)
 
I think you are missing an important point when you object and spin others' input as pessimistic. You can't reach accuracy without first being brutally honest about limitations; being an optimist won't make the problem go away.
 
Though we shouldn't stop pursuing more accurate ways of measuring.
If we were scientists, we had better accept the measurement as, well, accurate. Tip-top accurate, to 24 bits and such. It is only the interpretation that leaves a lot to be desired. Is it in order to refer back to a „positivist“ approach?

Declare the „falsifiable hypothesis“.

As mentioned already as a stray note here and there, the peaks and dips are understood as just amplitude errors. These can be mitigated, or even removed, by using an equalizer. If there is a problem with peaks and dips, it has to show up as a personal preference rating. Hence an individual adjustment driven by subjective preference, reliable and reproducible, is possible.
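As a sketch of what "removing a peak with an equalizer" can look like, here is a single peaking-EQ biquad using the widely published RBJ Audio EQ Cookbook formulas. The 6 dB peak at 10 kHz, the Q of 4 and the 48 kHz sample rate are hypothetical values chosen for illustration; this is not a claim about any particular IEM or about how anyone in the thread does their EQ.

```python
import cmath
import math


def peaking_biquad(f0_hz, gain_db, q, fs_hz=48_000):
    """Peaking-EQ biquad coefficients (RBJ Audio EQ Cookbook formulas)."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin)
    a = (1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin)
    return b, a


def gain_db_at(f_hz, b, a, fs_hz=48_000):
    """Magnitude response of the biquad in dB at a single frequency."""
    z1 = cmath.exp(-1j * 2 * math.pi * f_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


# Hypothetical case: a 6 dB peak heard around 10 kHz, countered by a -6 dB cut.
b, a = peaking_biquad(f0_hz=10_000, gain_db=-6.0, q=4.0)
for f in (1_000, 8_000, 10_000, 12_000):
    print(f"{f:>6} Hz: {gain_db_at(f, b, a):+.2f} dB")
```

Sweeping `f0_hz` by ear while listening is essentially the "equalizer shifted over the treble band" experiment; the filter only helps if its centre frequency lands on the peak as it occurs in your own ear.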

Good enough hypothesis?

It seems that nobody ever tried to EQ the peaks and dips (while people relentlessly ask the IEM makers to remove them; how, and, see my post above, for what reason?!).

We need to start with the equalizing first and foremost. Anything else, in my book as an educated scientist, is void. LOL

(Look, before we address a problem with new „measurement“, we should be halfway clear that it exists.)
 
Accurate measurements would mean that a measured ring at 11 kHz would be the definitive answer that would allow us to adjust and tune accordingly.
OK, but we agree that because of personalized response there cannot possibly be one single definitive answer, right? Just because the manikin's 'ear' results in a ring at 11 kHz doesn't mean your ear does that as well.

So unless I'm missing something here, to get one definitive answer for you - such that you can EQ it away or decide the IEM won't ever work for you because you don't like even the slightest ring at some very specific frequency - we'd need a way to take measurement results then transform that to your own ear somehow (and probably separate left and right as well). Which would need at least 2 things: an accurate physical measurement of your ear and measurements taken in a way this transformation can be done (either by modelling or by doing measurements at different insertion depths and/or different manikin ear shapes). That doesn't seem practically feasible. Though again, maybe I'm missing something. But I don't see how else one would improve the current situation. And again: I'm still not convinced this is actually needed.


Though we shouldn't stop pursuing more accurate ways of measuring
Like, infinitely? :)

I don't agree, at all, and I do a ton of measurements for various jobs and a lot of those are used for actual scientific publications. Not every possible situation needs the same amount of accuracy nor resolution. In fact, before a measurement even takes place, the decision is usually made already what accuracy/resolution is sufficient. Simple example to make this clear: if I want to know if my 220V outlet is ok it's good enough if a cheap multimeter tells me 221, 222, or 223V AC.
 
… a cheap multimeter tells me 221, 222, or 223V AC.
People read measurements, but with different backgrounds. I gave, as I see it, a quite comprehensive summary of what's going on with peaks, dips, Harman and all of the above. Low expectations on my side. We're on audio territory. I will come back to this later, maybe.
 
It's a well known fact there is a lot of individual variance on the ear canal, how can you build a one size fits all solution and expect it to exactly match your ears?

Now, whether the "711" coupler emulates the average ear canal accurately is another question. It works properly when you shove in a deep-insertion IEM like an Etymotic, because the individual variation at that depth is much lower.
 
It's a well known fact there is a lot of individual variance on the ear canal, how can you build a one size fits all solution and expect it to exactly match your ears?
A naked ear should emphasize the frequencies at which its canal length equals one quarter of the wavelength, then 3/4, 5/4 and so on (if relevant for bats). That is the resonance of a pipe with one end open (the entry) and the other closed (the eardrum).
Now, with a high-end IEM rammed into the ear, we observe a different scene. Both ends are closed, somehow, but the remaining length of canal is not too different.

What pipe resonance pattern is expected?

If the IEM, high-end of course, were blocking the canal, it would be one half, two halves, three halves of a wavelength. But it is positively not blocking. The nozzle is open, and we are still in troubled water, because sound coming from there brings with it some mechanical impedance, hits another mechanical impedance at the canal entry, and interacts accordingly in complicated ways, only to laugh at you because of phase-altering pathway lengths, in plain terms extra delayed reflections.
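For reference, the two textbook pipe patterns mentioned above can be written out as follows. This is an idealised sketch: a rigid, straight tube with a perfectly open or perfectly closed end, which a real ear canal (and certainly a canal with an IEM nozzle in it) is not. The 25 mm length is an assumed ballpark figure, not a measurement.

```python
SPEED_OF_SOUND_M_S = 343.0  # dry air at roughly 20 degrees C


def open_closed_modes_hz(length_m, count=3):
    """Tube open at one end, closed at the other: odd quarter-wave modes (2n-1)*c/(4L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_M_S / (4 * length_m)
            for n in range(1, count + 1)]


def closed_closed_modes_hz(length_m, count=3):
    """Tube closed at both ends: half-wave modes n*c/(2L)."""
    return [n * SPEED_OF_SOUND_M_S / (2 * length_m)
            for n in range(1, count + 1)]


canal_length_m = 0.025  # assumed 25 mm canal, for illustration only

print("open ear (open/closed):", [round(f) for f in open_closed_modes_hz(canal_length_m)])
print("fully blocked (closed/closed):", [round(f) for f in closed_closed_modes_hz(canal_length_m)])
```

With these assumptions the open ear lands near 3.4, 10.3 and 17.2 kHz and the fully blocked tube near 6.9, 13.7 and 20.6 kHz. The real case with an open nozzle sits somewhere in between and depends on the impedances involved, which is exactly the complication described above.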

What to do?

Now, whether the "711" coupler emulates the average ear canal accurately is another question. It works properly when you shove in a deep-insertion IEM like an Etymotic, because the individual variation at that depth is much lower.
Yes, it does, to a sufficient degree! If so, and yes it is so, confirmed, what does some regular Joe expect for an amplitude frequency response? Something hugging the smoothed-ad-nauseam Harman target? If this is modern philosophy, so be it.
If not, then the next best expectation is regular Joe's naked-ear resonance pattern plus pinna amplification, since with an IEM the pinna is bypassed and needs to be emulated, at least amplitude-wise. To reiterate, we want that effing pattern of resonances as seen with the naked ear, only in this case realized with an IEM plugged into the canal!! Easy peasy!

Now take the very same IEM and plug it into another simulated ear canal with different parameters. Same pattern of argumentation. We expect the other resonance pattern, matching the dimensions of the other simulator, a pattern to occur with an open ear, but again now with that IEM plugged in, and a different pinna amplification in case.

Can an IEM be designed to excel in both tests? Yes. (Mimic, at the nozzle, the impedance of the transition from canal to outer ear. The Truthear Hexa has two strong filters in that position, though.)
Is the test done? No.
Why? … …
 
A naked ear …

What to do?
Why? … …
Disclaimer: I don't know anything about acoustic engineering; I've only been into this topic for around 6 weeks.

Summary on IEMs: design the IEM so that, measured on an arbitrary, middle-of-the-road ear model, it shows the same „resonance“ pattern as a diffuse field would on that same ear model. Then, from an understanding of the ear model's relevant parameters (canal length, width, bend etc.), prove that the IEM would also tightly reproduce, on every other ear model, the resonance pattern that ear model shows when exposed to a diffuse sound field.

The crucial design parameter is the acoustical impedance at the transition from nozzle to ear canal. If the impedance at this transition point were (ideally) identical to that of the transition from ear canal to outer ear (basically terminated tube to funnel), every individual parameterization of the ear canal would be covered and wouldn't need further consideration!

On over ears: the coupling of the transducer in the cup to the ear canal is presumably quite weak. The problem statement regarding ear canal resonances as with IEM is non existent. Best one can hope for is to generate a frequency neutral diffuse and homogeneous sound field around the ear‘s pinna. It would mimic the diffuse field that is, alas for no better reason, assumed to be an optimum reference.

In either case the typical smoothed-to-death target curves irritate customers. Today's customers go even further and irritate themselves, craving on-paper perfection rather than preference. Is there any?

What would help, and what will never be seen is: the actual in-detail head related transfer function with open ear canal for the „typical“ dummy head / measurement rig when exposed to free field direct sound and diffuse field sound. Remember, it will show nasty peaks and dips, because it logically has to! Easy to take, but they won‘t give it to you, only „averages“ of different types. Compare the amplitude response with results for the headphone of interest put on that ear model and measured.
The same for a few non-typical dummy heads to prove the design of the headphone, over-ear or IEM, as reasonably robust to parameter changes.

The latter would be a first approximation. Smoothed target curves are irritation.
 
I have had this debate a number of times already. The answer is always the same. The acoustic 'load impedance' that is applied differs because the 'air inside the ear cup' and the reflective surfaces differ, and those small 'load' differences could be important. I have never seen any evidence that they are with over/on-ears, in the sense that current draw would differ. It does differ (the impedance peak of a driver differs) between measuring impedance in a sealed manner and in an 'open' manner.

Research has been done on combining 'free air' measurements and sealed measurements, but of course this does not work.

With dynamic IEMs I would expect to see similar effects (with seal, as ear canals in reality are never as circular as a tube).

This simply means that the 'errors' that are created (due to the modifications of the emitted signal from pinna and ear canal) have to be corrected (undone) to get a feel of what has been emitted and arrived at the mic.
Unfortunately this differs per situation so there is no 'perfect' correction possible. There is always an error that is introduced as the correction is 'averaged'.

The pleasant part of a standard is that the mechanical construction and tolerances are described in it, so measurements are repeatable across manufacturers as long as they adhere to that standard. This is great for research purposes, as the conditions are the same.

It goes sideways when different standards are used AND because real world 'auditory systems' differ in 'incoming sound modification' from that standard.
This has been thoroughly researched and that research shows real world situation vs standards can (and do) vary substantially (as in well over 20dB here and there).

But ... we DO need standards so we can compare devices.

Then we can add different 'target curves' on top of that, and there are a lot of them, mounting on HATS issues (like seal and positioning) and you get many different squiggles and it becomes a mess.

That is what should have been explained more clearly in the video, rather than just commenting that Harman curve compliance is not a must, showing the difference between 2 fixtures, and creating the illusion that this would be enough to show 'how much a headphone might deviate on a real head'. It isn't; it just shows 2 different squiggles obtained on 2 different fixtures in some specific condition and is proof of nothing really ... other than that a headphone measures differently on 2 different standard fixtures ... duhhhh.

A confounding problem is that when one spends tens of thousands of dollars or euros on very expensive test equipment, one is going to have to use (and trust) it.
One makes a choice (usually after testing different fixtures on loan) and runs with it.

Yep... it is still wild west in headphone measurements but at least the conditions are captured in standards which, again, is very important to science.
Now ... can we expect perfect correlation with real world ears and conditions as well as production spread of the tested articles... NO of course not.

But there is a silver lining and that is despite all the errors that do occur one can measure a response and base EQ on it and that will (sorry.. can) improve tonality in real world situations assuming the delta between real world and 'standard' measurement is not too big.
Deviate it will, how much remains the question.

And then I have not even started to include taste and the tonal balance of recordings (which often has deviations as well) into the mix.

Look for a headphone you like in comfort and sound as it is sold.
If you want to take the scientific route look for 'common' issues of a measured model taken on different fixtures and check for those 'common across fixtures' errors that should be corrected and apply an 'average' correction for those errors.
Then use some tone control to make it fit your ears/brain.
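The "common across fixtures" idea above could be sketched like this. All numbers are invented for illustration, and the 2 dB threshold and the function name are my own assumptions; the point is only the logic: correct an error when every fixture agrees on its direction and it is large enough, and leave the rest to taste.

```python
from statistics import mean

# Hypothetical deviations from a chosen target, in dB, for ONE headphone model
# measured on three different fixtures (invented numbers, not real data).
deviation_db = {
    "fixture_a": {100: 4.0, 1000: 0.5, 3000: -3.5, 6000: 5.0, 10000: -6.0},
    "fixture_b": {100: 3.0, 1000: -0.5, 3000: -2.5, 6000: 1.0, 10000: 4.0},
    "fixture_c": {100: 3.5, 1000: 0.0, 3000: -4.0, 6000: -2.0, 10000: 1.0},
}


def common_corrections(deviations, min_abs_db=2.0):
    """Return {frequency: EQ gain in dB} where all fixtures show the same-sign error."""
    freqs = next(iter(deviations.values())).keys()
    eq = {}
    for f in freqs:
        values = [curve[f] for curve in deviations.values()]
        same_sign = all(v > 0 for v in values) or all(v < 0 for v in values)
        if same_sign and min(abs(v) for v in values) >= min_abs_db:
            eq[f] = -mean(values)  # undo the average error
    return eq


print(common_corrections(deviation_db))
```

In this made-up data set the bass excess and the 3 kHz dip show up on every fixture and get an averaged correction, while the treble points, where the fixtures disagree, are left alone for the tone control.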

Or the less scientific route ... just use a tone control to make it sound the way you like or pick a headphone that sounds nice as it is. Nothing wrong with that.
I believe my argument is for better standards and not a lack of them. Which is the point. Without a repeatable and reliable test, it fails all definitions of anything resembling a standard.
 
Even then compliance will only always be to that standard. Human ears might not comply to that standard.
 
That's the part that is completely irrelevant. Human ears. This is what I have been saying this entire conversation. We need objective metrics and not subjective ones. Let me present this: if measuring through a pinna and ear canal (all of which are variable and not consistent) seems so important, then why is this process not used when measuring loudspeakers? This makes no sense. Please give me an actual reason and not some eye roll, as it is not obvious and there are no publications explaining why this must be the standard. Klippel NFS does not use a HATS to measure loudspeakers. Yet we listen with our ears. That is because an objective evaluation of a loudspeaker's acoustic performance has nothing to do with how humans hear.
 
Let me present this: if measuring through a pinna and ear canal (all of which are variable and not consistent) seems so important, then why is this process not used when measuring loudspeakers? This makes no sense.
The answer is simple... why introduce more measurement errors ?
 
Let me present this: if measuring through a pinna and ear canal (all of which are variable and not consistent) seems so important, then why is this process not used when measuring loudspeakers?
Because it does not matter. The relationship between measuring with a microphone or measuring with a microphone put in a dummy average ear is essentially fixed (in extreme detail there will be changes depending on e.g. directivity etc but those are essentially negligible).

If you measure frequency response X with a microphone and frequency response Y with a manikin, there's a mathematical function which can convert between those measurements, and simply said it depends only on the manikin's ear shape, nothing external. One can be calculated from the other and vice versa. Using a different speaker does not change that function. That is unlike an IEM measurement, for instance, where tip shape and insertion depth immediately produce a different response; those are external to the transfer function, so they need to be measured. Hope this makes things clear :)
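In dB terms, the fixed relationship described above is just an addition or subtraction per frequency. A minimal sketch, with an invented transfer function and invented speaker levels (and ignoring the directivity effects the post calls negligible):

```python
# Hypothetical manikin-ear transfer function in dB at a few frequencies.
# Invented values for illustration; a real one is measured for the manikin.
ear_transfer_db = {1000: 2.0, 3000: 12.0, 8000: 4.0}


def mic_to_manikin(mic_db):
    """Y = X + H: what the manikin would record, given the microphone response."""
    return {f: level + ear_transfer_db[f] for f, level in mic_db.items()}


def manikin_to_mic(manikin_db):
    """X = Y - H: recover the microphone response from the manikin response."""
    return {f: level - ear_transfer_db[f] for f, level in manikin_db.items()}


# Two hypothetical loudspeakers measured with a plain microphone (dB SPL).
speaker_a = {1000: 85.0, 3000: 84.0, 8000: 80.0}
speaker_b = {1000: 82.0, 3000: 86.0, 8000: 83.0}

for name, mic in (("A", speaker_a), ("B", speaker_b)):
    print(name, "manikin:", mic_to_manikin(mic),
          "back to mic:", manikin_to_mic(mic_to_manikin(mic)))
```

The same `ear_transfer_db` serves both speakers, so the manikin adds no new information about the speaker. For an IEM there is no such fixed table, because tip and insertion depth change it from one fit to the next.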
 