
The Nature of Headphone Measurements

LevityProject

This won't be new for the industry peeps here, but I've been thinking about the inaccuracies in headphone measurements that happen above 8 kHz and how these particular measurement inaccuracies don't equal irrelevance.

The Story
After I purchased a pair of 7Hz Salnotes Zero IEMs, and subsequently bought them for friends in the audio industry, I discovered that while the exact frequency each of us heard as really sharp and overbearing differed a bit, we all agreed that somewhere between 11 and 13 kHz there is a noticeable spike.

Here is @crinacle's measurement graph (this is the B&K 5128 target curve)
1000017292.png


For anyone confused by that graph, the review by @amirm shows the measured response plotted against ASR's preferred Harman curve.

1000017295.png

What grabs my attention is that there 100% is a large spike in upper treble. I hear it, my trusted colleagues hear it, and the graphs confirm. We also did not notice any major issues below the spike.

We differ on exactly where the spike is, but that is a function of the problem at hand: measurements being inaccurate (but not irrelevant) at high frequencies.

Conclusions (for now)
Acceptable HF tolerance is what measurement of headphones should be focused on. Not at the expense of scientific discovery, but as an aid to dealing with current measurement limitations. As measurement limitations are removed, tolerance can become an increasing part of the data used to determine what high fidelity means.

The measurement mics and systems being used today are reliably picking up the fact that issues exist. The problem to overcome is being able to identify the exact location of the high frequency issues. Am I wrong?

For anyone new to this discussion, here's an explanation of why headphone measurements are technically inaccurate above 8 kHz (though not useless ... don't confuse the two).

1. Most measurement rigs use couplers and microphones designed to approximate the human ear, but they struggle to replicate the exact ear canal and pinna interactions, especially at high frequencies.

2. Interference Patterns: At higher frequencies, sound waves are more susceptible to interference caused by reflections and diffraction within the measurement setup. These phenomena can produce inconsistent results.

3. Variations in Fit and Seal: Tiny changes in how the headphones sit on the measurement rig (or on a person's head) can significantly alter measurements at high frequencies.

Note: this wild card is why IEMs are great for listening, but not as reliable for sustained critical listening.

4. Ear Canal Resonances: The shape and length of an individual's ear canal can greatly influence the perception of high frequencies, making standardized measurements less representative of individual experiences.

5. Perceptual Differences: Human hearing becomes less sensitive and more variable among individuals at very high frequencies, making subjective impressions differ significantly from measurements.

For these reasons, headphone measurements above 8 kHz are best viewed as approximations rather than absolute indicators of performance.

This is why the (well done) scientific evaluations we see here combine measurements with subjective listening tests.
 
The problem to overcome is being able to identify the exact location of the high frequency issues. Am I wrong?
I'd take a step back and first ask: is this actually a problem? As in: suppose one pair has an issue at 12 kHz and another one has an issue with the same characteristics but at 13 kHz, and for the rest there are no differences. As you say yourself, at such high frequencies a whole lot of personal things come into play. What does knowing the exact location solve, as opposed to knowing only that there is an issue roughly around 12.5 kHz?
 
It is the ear, to finish the sentence. I will tell you the same thing I told Don, even though he ridiculed it: buy cheap "open" earbuds that sit in the outer ear (pinna) and play with them to see how much the highs depend not only on the seal at the back of the ear but also on the part of the ear at the upper front, along with the angle. That is the reason pretty much everyone avoids measuring them. For IEMs, where the cancellation in the highs falls is determined by ear canal shape and width. For an average ear we take it to be at about 10 kHz, but there is really no such thing as average.
There is a lot of work ongoing on the newly mentioned HATS for ear modeling (from Sean), and hopefully with time they will manage to narrow it down to a couple of models with better averaging, but that's about it. So when picking a pair for yourself, determine by listening (hearing training actually helps) whether the treble is right for you. What is popularly characterised as the "Harman bass" preference is nothing more than equal-loudness compensation for SPLs in the upper 70s dB, produced by a low-shelf filter whose knob the test subjects had under their hand, averaged over how they all adjusted it.
That's in general, without digging deep into psychoacoustic and physical impacts, or hearing state or health.

Now to address the elephant in the room. IEMs are bad in that we will have a worldwide, large-scale problem with permanent hearing loss. There has never been anything so affordable, popular, well tuned and low in distortion in the past. Very low distortion and a very good FR make you want to simply crank it up more. The measures to take to even try to prevent that are measuring, implementing equal-loudness compensation, and educating people about it and about our psychoacoustic hearing system, so that they learn how to have a fully pleasurable experience with full dynamics even at moderate SPL: how to start at slightly lower levels than what first seems optimal, increase slowly over longer listening sessions, and when to pause and restart from the beginning level. So: a better experience and health benefits.
 
All ear canals have resonances. IEM test fixtures have 'ear canals' (tubes).
An IEM couples directly into that ear canal.
Insertion depth and shape-size of the ear canal can cause those resonances to shift a little in frequency and amplitude depending on the mentioned factors.

resonance.png


and

resonance2.png
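To put rough numbers on how insertion depth moves such a resonance, here is a minimal Python sketch. It assumes an idealised rigid tube closed at both ends (IEM tip on one side, eardrum or coupler mic on the other), so the first length resonance is a half wave, f = c / (2 L). Real canals and couplers are not rigid uniform tubes, so treat the output as an order-of-magnitude illustration only. The function name and the lengths are made up for the example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def half_wave_resonance_hz(residual_length_mm: float, mode: int = 1) -> float:
    """Resonance of an idealised tube closed at both ends: f = n * c / (2 * L).

    residual_length_mm is the (hypothetical) length of canal left between
    the IEM tip and the eardrum / coupler microphone.
    """
    if residual_length_mm <= 0:
        raise ValueError("length must be positive")
    length_m = residual_length_mm / 1000.0
    return mode * SPEED_OF_SOUND_M_S / (2.0 * length_m)


if __name__ == "__main__":
    # Deeper insertion -> shorter residual canal -> higher resonance frequency.
    for length_mm in (12.0, 14.0, 16.0, 18.0, 20.0):
        freq = half_wave_resonance_hz(length_mm)
        print(f"{length_mm:4.1f} mm -> {freq:7.0f} Hz")
```

In this simplified model a few millimetres of insertion depth shift the peak by a kilohertz or more, which fits the point that the same IEM can show its treble peak at different frequencies in different ears or fixtures.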
 
The problem to overcome is being able to identify the exact location of the high frequency issues. Am I wrong?
For IEMs this is less of a problem since they tend to only:
Insertion depth and shape-size of the ear canal can cause those resonances to shift a little in frequency and amplitude...
My point is for IEMs the high frequency measurements are very useful and hearing a sine sweep is enough to confirm where the resonances are for you.
Over-ear headphone measurements are a lot harder to deal with because even just the positioning can shift the frequency, not to mention how a single peak on the FR might become several on your ears.
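For anyone who wants to try the sine sweep check, a small sketch using only the Python standard library follows. It writes a slow logarithmic sweep to a WAV file; the file name, frequency range, duration and level are all assumptions picked for illustration. The level is deliberately low, so start with the volume down and raise it carefully.

```python
import math
import struct
import wave


def log_sweep_samples(f_start, f_end, duration_s, sample_rate=48000, amplitude=0.1):
    """Return float samples (-1..1) of an exponential (logarithmic) sine sweep."""
    n = int(round(duration_s * sample_rate))
    k = math.log(f_end / f_start)
    samples = []
    for i in range(n):
        t = i / sample_rate
        # Integrated phase for instantaneous frequency f_start * exp(k * t / T).
        phase = 2.0 * math.pi * f_start * duration_s / k * (math.exp(k * t / duration_s) - 1.0)
        samples.append(amplitude * math.sin(phase))
    return samples


def sweep_frequency_at(t, f_start, f_end, duration_s):
    """Instantaneous sweep frequency at time t, to map 'when it got harsh' to Hz."""
    return f_start * (f_end / f_start) ** (t / duration_s)


def write_wav(path, samples, sample_rate=48000):
    """Write mono 16-bit PCM."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
        wav.writeframes(frames)


if __name__ == "__main__":
    # Hypothetical settings: 6 kHz to 16 kHz over 10 seconds.
    write_wav("sweep_6k_16k.wav", log_sweep_samples(6000.0, 16000.0, 10.0))
    print(round(sweep_frequency_at(7.0, 6000.0, 16000.0, 10.0)), "Hz at t = 7 s")
```

Note the time at which the tone jumps out, then use `sweep_frequency_at` to convert it to a frequency. A manually tuned tone generator does the same job; the sweep is just quicker for a first pass.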
 
4. Ear Canal Resonances: The shape and length of an individual's ear canal can greatly influence the perception of high frequencies, making standardized measurements less representative of individual experiences.

Reminds me of what is discussed in this interview with Paul Barton of PSB:

 
Don't you just love it when someone posts a graph and says, "the IEM has a peak at 12kHz". No, the IEM doesn't have a peak. The combination of measurement rig, ear tips and insertion point has a peak at 12kHz, and the peak is probably expected given the physics of shoving a transducer down a tube. On top of that, the FR may differ in many areas when personally fitted.
 
How about confirming the resonance position in the ear via a tone generator or sine sweep?
 
I'd take a step back and first ask: is this actually a problem? As in: suppose one pair has an issue at 12 kHz and another one has an issue with the same characteristics but at 13 kHz, and for the rest there are no differences. As you say yourself, at such high frequencies a whole lot of personal things come into play. What does knowing the exact location solve, as opposed to knowing only that there is an issue roughly around 12.5 kHz?
Knowing the exact location is helpful from a data perspective. But it's only really important if you're getting into highly expensive products. For lower-priced stuff, knowing a close approximation in the HF and having accuracy in the midrange and low end is just fine, I think.
 
I have always found it quite curious that measurement rigs for headphones/earphones often simulate human physiology. Since we would never do this to get a metric from a loudspeaker in a free field, why would we do it for headphones? It confounds me as to why we would think that the physiological acoustic filtering, which is individual to each person, has anything to do with retrieving objective performance metrics on any kind of headphone. The acoustic loading might be the only area that we should be concerned with for achieving an appropriate measurement. Our goal should be to ascertain the performance characteristics of the headphone and not how humans hear. As I have stated earlier, we do not use a HATS system to get objective performance characteristics from a loudspeaker but feel obligated to do so with a headphone. This would seem to be an error in objective science. The pinna and the shape of the ear canal are not part of a headphone's performance but part of the individual human's hearing system. It would appear that we should separate these two things and look to derive a performance metric outside of human physiology, very much the way we do with electronics or loudspeakers. These HRTFs are variables that are a nuisance to understanding what the headphone is doing with the signal. We do not need to know what our human physiology is doing (unless that is the aim of our study); we need to know what the headphone is doing all by itself, outside of anyone's HRTFs. Many of the high frequency anomalies are often related to reflections inside the measurement rig (in some cases). Without delving into the minutiae of why or what a meaningful transfer function might be, I would submit that we re-evaluate how we are conducting these measurements industry wide. IMHO
 
@Jaxx1138 PSY stands for psychoacoustic. Equal-loudness compensation has more than 100 scientific experiments behind it, all with broadly similar results (backing each other up), and together they form the only even remotely relevant statistical sample in acoustic research. So the error is yours only. The research is based on both human hearing and ears, and disregarding any aspect that affects it is the opposite of scientific. I did a lot of social studies in my time.
 
I have had this debate a number of times already. The answer is always the same. The different acoustic 'load impedance' that is applied due to the 'air inside the ear cup' and reflective surfaces differing and those small 'load' differences could be important. I have never seen any evidence it is with over/on ears in the sense that current draw would differ. It does differ (the impedance peak of a driver differs) when measuring impedance in a sealed manner and 'open' manner.

Research has been done on combining 'free air' measurements and sealed measurements, but of course this does not work.

With dynamic IEM's I would expect to see similar effects (with seal as ear canals in reality are never as circular as a tube).

This simply means that the 'errors' that are created (due to the modifications of the emitted signal from pinna and ear canal) have to be corrected (undone) to get a feel of what has been emitted and arrived at the mic.
Unfortunately this differs per situation so there is no 'perfect' correction possible. There is always an error that is introduced as the correction is 'averaged'.

The pleasant part of a standard is that the mechanical construction and tolerances are described in it, so results are repeatable across manufacturers as long as they adhere to that standard. This is great for research purposes as the conditions are the same.

It goes sideways when different standards are used AND because real world 'auditory systems' differ in 'incoming sound modification' from that standard.
This has been thoroughly researched and that research shows real world situation vs standards can (and do) vary substantially (as in well over 20dB here and there).

But ... we DO need standards so we can compare devices.

Then we can add different 'target curves' on top of that, and there are a lot of them, mounting on HATS issues (like seal and positioning) and you get many different squiggles and it becomes a mess.

That is what should have been explained in the video more clearly than just the comment that Harman curve compliance is not a must, showing the difference between 2 fixtures and creating the illusion that this would be enough to show 'how much a headphone might deviate on a real head'. It isn't; it just shows 2 different squiggles obtained in 2 different fixtures in some specific condition and is proof of nothing really ... other than that a headphone measures differently on 2 different standard fixtures ... duhhhh.

A confounding problem is that when one spends tens of thousands of $, € on very expensive test equipment you are going to have to use (and trust) it.
One tests different fixtures (usually by borrowing them), makes a choice and runs with it.

Yep... it is still wild west in headphone measurements but at least the conditions are captured in standards which, again, is very important to science.
Now ... can we expect perfect correlation with real world ears and conditions as well as production spread of the tested articles... NO of course not.

But there is a silver lining and that is despite all the errors that do occur one can measure a response and base EQ on it and that will (sorry.. can) improve tonality in real world situations assuming the delta between real world and 'standard' measurement is not too big.
Deviate it will, how much remains the question.

And then I have not even started to include taste and the tonal balance of recordings (which often has deviations as well) into the mix.

Look for a headphone you like in comfort and sound as it is sold.
If you want to take the scientific route look for 'common' issues of a measured model taken on different fixtures and check for those 'common across fixtures' errors that should be corrected and apply an 'average' correction for those errors.
Then use some tone control to make it fit your ears/brain.

Or the less scientific route ... just use a tone control to make it sound the way you like or pick a headphone that sounds nice as it is. Nothing wrong with that.
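The 'common across fixtures' idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name, the 3 dB threshold, and the deviation-from-target numbers are invented, not taken from any real measurement. The sketch keeps only the bands where every fixture shows a deviation of the same sign and at least the threshold size, then suggests the negated average as a starting correction.

```python
from statistics import mean


def common_correction_db(deviations_by_fixture, threshold_db=3.0):
    """Suggest EQ gains (dB) for deviations that all fixtures agree on.

    deviations_by_fixture maps fixture name -> {frequency_hz: deviation_db},
    where deviation_db is measured response minus target at that frequency.
    """
    fixtures = list(deviations_by_fixture.values())
    if not fixtures:
        return {}
    # Only compare frequencies that every fixture reports.
    shared_freqs = set(fixtures[0]).intersection(*fixtures[1:])
    correction = {}
    for freq in sorted(shared_freqs):
        values = [fixture[freq] for fixture in fixtures]
        same_sign = all(v > 0 for v in values) or all(v < 0 for v in values)
        if same_sign and min(abs(v) for v in values) >= threshold_db:
            correction[freq] = -mean(values)  # undo the average deviation
    return correction


# Invented example data (dB relative to a target), two hypothetical fixtures.
EXAMPLE = {
    "fixture_a": {100: 1.0, 3000: -4.0, 6000: 5.0, 12000: 8.0},
    "fixture_b": {100: -0.5, 3000: -3.5, 6000: 1.0, 12000: 6.0},
}

if __name__ == "__main__":
    print(common_correction_db(EXAMPLE))
```

With the invented data, only 3 kHz and 12 kHz qualify; the 6 kHz bump is large on one fixture but small on the other, so it is left for the tone control and your own ears.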
 
Knowing the exact location is helpful from a data perspective. But it's only really important if you're getting into highly expensive products. For lower-priced stuff, knowing a close approximation in the HF and having accuracy in the midrange and low end is just fine, I think.
That makes little sense: it's either meaningful data or not. Price has nothing to do with it, apart from subjective expectations towards performance perhaps. There's a reason sites like this review products of all price ranges, as they should.
 
I have always found it quit curious that measurement rigs for headphones/earphones often simulate human physiology. Since we would never do this to get a metric from a loudspeaker in a free field, why would we do it for headphones? It confounds me as to why we would think that the physiological acoustic filtering, independent to each individual, has anything to do with retrieving objective performance metrics on any kind of a headphone. The acoustic loading might be the only area that we should be concerned with for achieving an appropriate measurement.
The only clear reason why someone engages in erroneous thinking like that is when you identify as a measurement device. Hehe.
 
@Jaxx1138 PSY stands for psychoacoustic. Equal-loudness compensation has more than 100 scientific experiments behind it, all with broadly similar results (backing each other up), and together they form the only even remotely relevant statistical sample in acoustic research. So the error is yours only. The research is based on both human hearing and ears, and disregarding any aspect that affects it is the opposite of scientific. I did a lot of social studies in my time.
I am not sure you are addressing what I said. Perhaps you misunderstood my comment.
 
The only clear reason why someone engages in erroneous thinking like that is when you identify as a measurement device. Hehe.
Could you elaborate??? Erroneous ?
The implication is that there is a "relative" standard rather than an actual reference. Could you explain why we do not use a head and torso with a pinna and a simulated head to measure loudspeakers? I would submit that starting with the reference currently utilized for loudspeaker measurements could be used to derive a proper acoustic transfer function for headphone responses based on existing physics of acoustics. Rather than complicating it with filters that are not fixed and have a variance or a window or range of functions.
 
There are standards.
The test fixtures are built to the specifications outlined in that standard and have to comply to them within the given tolerances.
What comes out of them is dependent on several conditions as mentioned above.

HATS are used for measuring acoustics for example. For speakers you just need to measure SPL (in certain conditions and axis) and really don't want to modify the sound (HATS) and (try) to undo that later on to get intuitive plots.

Speakers also don't interact with HATS, so there is no reason to use them, whereas with headphones the interaction is clear but is also headphone and positioning dependent.
Also, with measurement microphones you can measure pretty reliably well above 8kHz, which you can't with HATS.

The current target curves are derived from speakers in (standard) rooms by measuring the speakers on a fixture in a certain relative position and then trying to match the response found with that of headphones mounted on the same fixture.
Harman curve is the most popular one but many people are now proponents of tilted DF (diffuse field with a downward slope).
 
I am not sure you are addressing what I said. Perhaps you misunderstood my comment.
No, that part is pretty certain. We can debate academically whether the slightly larger boost in the low to sub bass is an error or a fixture of it, as it is not even affected much by age, and so on.
I believe in resources, and we will see how far Sean manages to go this way. It is possible to quantify the results and build a genus/species classification model (formal deductive logic, popularly called AI today), and more by resolving the "special" cases (whose causes and reasons differ and have to be determined by researchers directly, as this stays on the inductive side). That needs a lot of data and work. So far no one has even expressed serious intentions to do or fund any serious scientific work in audio reproduction, even though the future of hearing health (a pressing issue) depends on it (well, sort of; you can't fully enforce how people use it).
I don't believe it will happen until it is too late, and then it will be enforced by law. There are some such laws already, particularly in the EU, but they are stupid (max output in mV on mobile phones) and don't really resolve anything.
It's a much bigger problem, you see.
 
I worked with Sean Olive. I am an acoustic engineer. I am fairly clear on the matter at hand. I am not certain that many others understand the problem of the obfuscation that is corrupting the politics of headphone measurements.
 