
HBK Headphone Measurement Talks from Head-Fi and Sean Olive

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,897
Likes
16,898
(good chance I have a way oversimplified idea)
If you put a flat measuring (on a microphone) loudspeaker into an anechoic chamber and measure it using a dummy head&torso rig, don't you get its compensation (neutral target) curve just like that?
This way you would get the free field (FF) compensation curve, which was used in the early days but sounds too bright because we don't usually listen to direct sound only. The other extreme is to place the loudspeaker(s) in a highly reflective room (for example a reverberation chamber), and then you get the diffuse field (DF) compensation curve, which is not realistic either, as normal rooms are somewhere in between, and that is where the Harman approach comes in. Additionally, the compensation curve changes with the angle of incidence and with the ears/head/torso used, which usually also differ from those of individual humans...
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,897
Likes
16,898
Stupid Question, how do we know that the B&K is inaccurate while the GRAS is more accurate instead of the other way around?

I mean is there like a 'reference' headphone out there that we're absolutely certain matches the Harman Target that we can use to weed out the bad measurement rigs?
In the end, the more accurate rig will be the one that matches the ear geometry and HRTF of an individual human, and this already shows the limitation: for example, rig A might be closer to human C and rig B to human D. For general use though, like making generic corrections, I would choose the one that is closer to the average of the human population.
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,897
Likes
16,898
Do any of the commercial options out there come with a calibration curve?
A supplied calibration is unfortunately not enough, as in the end the microphone's behaviour will also depend on the ear canal geometry and placement, so it is best to calibrate it in your own ear using a known linear source like a loudspeaker.
 

abdo123

Master Contributor
Forum Donor
Joined
Nov 15, 2020
Messages
7,446
Likes
7,955
Location
Brussels, Belgium
A supplied calibration is unfortunately not enough, as in the end the microphone's behaviour will also depend on the ear canal geometry and placement, so it is best to calibrate it in your own ear using a known linear source like a loudspeaker.

my neighborhood doesn't have an anechoic chamber :p and i don't think it would be much of a calibration without it even with a linear source.
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,897
Likes
16,898
my neighborhood doesn't have an anechoic chamber :p and i don't think it would be much of a calibration without it even with a linear source.
A free field calibration could also be done above the modal region by windowing the measurements, but the question is whether you would really want a free field compensation at all. ;)
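As a rough illustration of the windowing idea (a minimal sketch, not anyone's actual procedure): the measured impulse response is cut off just before the first room reflection arrives, and the gated result is only meaningful above roughly 1 / (gate length). The function names, the 1 ms fade and the 4 ms gate below are assumptions chosen for the example.

```python
import math


def gate_impulse_response(ir, fs, gate_ms, fade_ms=1.0):
    """Zero all samples from gate_ms onwards, with a half-cosine fade-out before the gate.

    ir      : list of impulse response samples (direct sound assumed at index 0)
    fs      : sample rate in Hz
    gate_ms : gate length in milliseconds (assumed shorter than the first reflection delay)
    fade_ms : length of the fade-out that ends at the gate (assumption: 1 ms)
    """
    gate_n = int(round(fs * gate_ms / 1000.0))
    fade_n = min(max(1, int(round(fs * fade_ms / 1000.0))), gate_n)
    fade_start = gate_n - fade_n
    gated = []
    for n, x in enumerate(ir):
        if n >= gate_n:
            gated.append(0.0)  # reflections and room decay are discarded
        elif n >= fade_start:
            k = (n - fade_start) / fade_n  # 0 at fade start, approaching 1 at the gate
            gated.append(x * 0.5 * (1.0 + math.cos(math.pi * k)))
        else:
            gated.append(x)  # direct sound is kept untouched
    return gated


def lowest_valid_frequency(gate_ms):
    """Rule of thumb: a gate of T seconds carries no usable information below about 1/T Hz."""
    return 1000.0 / gate_ms


# Toy example: direct sound at t = 0 and a single reflection 5 ms later (invented data).
fs = 48000
ir = [0.0] * 480
ir[0] = 1.0
ir[240] = 0.5
gated = gate_impulse_response(ir, fs, gate_ms=4.0)
```

With a 4 ms gate the rule of thumb puts the lower limit at about 250 Hz, which is why this only helps above the modal region of a normal room.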
 

MayaTlab

Addicted to Fun and Learning
Joined
Aug 15, 2020
Messages
956
Likes
1,592
Well, how do you measure that? Would be naive to suggest putting microphones in a bunch of human ears?

That's the basis of that study : https://www.aes.org/e-lib/browse.cfm?elib=17699
Since its conclusion is that, on average, the GRAS fixture used + custom pinna best represents the average leakage on humans, I believe this may be the basis for Sean Olive's question at the end of the presentation about whether or not the 5128 overestimates leakage.

Rtings also measures the frequency response at low frequencies on five real human subjects and merges it past a few hundred hertz with their HATS results.
It's the only resource I'm aware of that does this systematically as part of a headphones review, but Resolve also measures the on-head response to check whether or not he gets the same bass response as on his test rig and Oluv extensively uses in-ear mics as well.

In the same way that we strive to create room independent loudspeakers, should we be trying to create cheek independent headphones?

For frequencies lower than 800Hz or so, they're called "good ANC headphones with a robust feedback mechanism and with a good earcup / pad / yoke / headband design" :D.
Their feedback mechanism ensures that they can deliver an exact dB value in that range for most users, regardless of small leaks (glasses), pad compression, pad deterioration (within reason), effective "volume" of the front volume, etc.

In Rtings' evaluation they tend to be the ones with the most consistent response below 800Hz :
https://www.rtings.com/headphones/1-5/graph#565/7914

In regards to the feedback mechanism’s resilience against varying degree of compression, this is an illustration of their behaviour on my head, as measured with blocked ear canal entrance microphones (mic n°2 in the photo below the graph), during the same measurement session (mics not moved between measurements) :

Screenshot 2021-10-01 at 10.27.18.png

Screenshot 2021-07-25 at 19.11.38.png

Important notes :
  • Individual measurements, these aren’t averaged, right ear only. 1/48 smoothing.
  • The measurements were made with sweeps. For the ANC headphones, additional measurements with noise were made to check whether or not the type of signal affected the results (it didn't).
  • The traces were not normalised.
  • The blue trace is the HPs as they naturally sit on my head. The red traces are varying degrees of compression or even pull.
  • The compression / pull was made manually. This is why the repeatability of the exact degree of compression is poor, and some traces are noisy at lower frequencies. But that should be enough to get an idea of how the feedback mechanism works, at least compared to no feedback.
  • The absolute values are incorrect, only look at the relative values between the traces for the same headphones.
  • The relative values between the headphones are most likely incorrect in between around 1.5-4.5kHz (perhaps above), they react differently to the load of the ear canal (remember : these are blocked ear canal measurements), and in different ways, and some are more sensitive to it than others. I repeat : only look at the relative values between the traces for the same HP.
  • The Bose QC45 has a volume dependent EQ. Initial quick tests suggest that it’s a very dumb implementation that only looks at the internal volume setting to apply an equal loudness compensation curve, and doesn’t look at the digital values of the incoming signal. So you’ll get different results if you test them at internal volume 20% vs. 80%, but not if you compare a test signal at -10dB vs. -40dB. 50% (on my Mac or iPhone) corresponds to the volume limit I don’t really like to go above when listening to music with them. They’re new and the pads haven’t broken in yet (may play a role, or not).
  • The pads weren’t warmed up before measuring the headphones. I’m not certain but I believe that it plays a small role for some of them (K371 mostly).
You can see that the feedback mechanism successfully maintained a similar response in the range where it operates (up to around 500-600Hz for the Sony, 700-800Hz for the APM, 1-1.5kHz for the Bose).

So that’s as good as it gets for “cheek independent headphones” :D. At least below 800Hz or so.

Above that, I’m actually starting to think that ANC headphones may show quite a bit of variance across listeners, and / or aren’t superbly well measured ATM in regards to how they behave on an average user’s head. This could be a result both of their feedback mechanism keeping the FR stable below a certain frequency but not above (so it comparatively “tilts” the response more than on most other HPs), and perhaps of some extra sensitivity for acoustical reasons. For example, the QC45’s pads are quite thin (there isn’t that much to compress to start with) and yet even minimal variance in compression resulted in pretty large changes in the ear canal gain region (very audible when pressing on one channel, for example). Pad compression is used here as a proxy for other things that could affect the front volume. Perhaps varying anatomy around the ear changes what happens in the front volume independently of pad compression, for example; I have no idea.

I don’t know of any way to try to make the FR more predictable / controlled / desirable for a large swath of listeners at higher frequencies. I may be wrong but I believe that it’s a complex interplay between individuals’ own HRTF and the variance of HPs’ response on their own head. Some variance may be desirable, if it goes in the direction of their individual HRTFs (what it would have looked like if they themselves had been measured in a "decent room with decent speakers", for stereo recordings at least), some may not if it doesn’t.

The AKG N90Q’s calibration system supposedly worked up to a few kHz but I don’t know anything about it.

I'm tempted to think that above 800Hz open headphones that have been extensively measured, with low spatial averaging variation, fairly good consistency between test rigs, and for which we have some idea of how they behave on real humans, such as the HD6... series, might be our best option for something as "cheek independent" as possible.

Slide 84 of @Sean Olive's presentation mentioned that the headphones were tested with varying degree of compression, intended to "study interaction between damping of mechanical resonances (cup/headband) and leakage effect and frequency response", and that they also measured the "response inside cup with MEMS microphone to be compared with same measurements made on human subjects", I'd love to learn more about that if "Harman Legal" allows it. Seems like "part II" of the 2015 study.
 

someguyontheinternet

Active Member
Joined
Apr 16, 2021
Messages
194
Likes
335
Location
Germany
I am not sure how nobody has noticed that 9 kHz peak so far; it sounds just horrible to me. I also don't know why it doesn't show up in Amir's measurements, but the peak is noticeable to some degree in the other measurement.
To me the headphone sounded just horrible. Not sure why many claim it to be the "world's best closed headphone". A complete letdown soundwise.
If no one else has a 9k peak, the reasonable assumption to me would be that your ears create that peak, since your case is the odd one out.
It would be helpful if you provide links to the "other measurements" you are referring to so we can get a more complete understanding on how you reached the conclusion that the 9k peak may be inherent to the headphone rather than your ears.
 

oluvsgadgets

Member
Reviewer
Joined
Feb 27, 2021
Messages
29
Likes
99
If no one else has a 9k peak, the reasonable assumption to me would be that your ears create that peak, since your case is the odd one out.
It would be helpful if you provide links to the "other measurements" you are referring to so we can get a more complete understanding on how you reached the conclusion that the 9k peak may be inherent to the headphone rather than your ears.
The peak is visible in the other measurement posted by Mr. Olive.
Could be my ear too, who knows, but the HD600, DT770 etc. all sound normal to me and conform well with others' measurements when measured with my ears.
 

someguyontheinternet

Active Member
Joined
Apr 16, 2021
Messages
194
Likes
335
Location
Germany
Stupid Question, how do we know that the B&K is inaccurate while the GRAS is more accurate instead of the other way around?

I mean is there like a 'reference' headphone out there that we're absolutely certain matches the Harman Target that we can use to weed out the bad measurement rigs?
"Inaccurate" would depend on the purpose you are using the system for. Since the Harman research was conducted on specific fixtures, the target curve does not necessarily translate well to a different fixture.
It's not that one is necessarily more "inaccurate", but the fact that frequency response will be measured differently by different systems. How well a specific system matches the average human or your own ears specifically can be a bit tricky to determine.
 

MayaTlab

Addicted to Fun and Learning
Joined
Aug 15, 2020
Messages
956
Likes
1,592
Do any of the commercial options out there come with a calibration curve?

In addition to @thewas's comment, if I may, I think that there is a more salient problem. Above 800Hz or so, the results you'll get will vary in absolute terms depending on where the hypothetical microphone sits in your ear (in your concha? protruding from the ear canal entrance? slightly recessed from the ear canal entrance? near the DRP?), and on whether the ear canal is open or blocked, regardless of the microphone's calibration.

Even worse, the relative difference between headphones might also be inaccurate past a certain value or in certain ranges of frequencies depending on these factors.

As an illustration of that, something that I quite like to do so far with my own in-situ measurements is to take a pair of headphones with known low seating-to-seating variation (HD650 for example), and calculate the relative difference between it and some of my other headphones for each one of the in-ear mics I use.

Here's an example of that using the HD650 as a reference, and plotting the difference between it and the H400SE (fuchsia traces), HD560S (blue traces), Hi-X65 (green traces), for the microphones in the photo in my post above. These measurements were all made the same day with the headphones in the same condition. All four headphones were measured for each mic during the same session (microphone wasn't moved). They're averages of five seatings.

Screenshot 2021-07-25 at 19.00.40.png

Remember, we're only looking at the relative difference between headphones. Ie "how do they differ".
Even with that limited ambition we can see that the microphones start to disagree at around 1.5-2kHz and that it gets worse as the frequency increases.
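The comparison described above boils down to two subtractions, which can be sketched as follows. This is a minimal illustration with invented numbers: the function names and all dB values are placeholders, not the measured data behind the plots.

```python
def relative_difference(dut_db, ref_db):
    """Per-frequency difference (in dB) between a headphone and the reference headphone.

    Both responses must come from the same microphone, left in the same position.
    """
    return [d - r for d, r in zip(dut_db, ref_db)]


def spread_across_mics(differences):
    """Per-frequency disagreement (max minus min, in dB) between the mics' difference curves."""
    return [max(column) - min(column) for column in zip(*differences)]


freqs_hz = [100, 1000, 5000]  # placeholder frequency points

# Invented responses (dB) of the reference and one other headphone, per microphone.
mic_a_ref = [90.0, 92.0, 95.0]
mic_a_dut = [92.0, 92.0, 99.0]
mic_b_ref = [88.0, 91.0, 85.0]
mic_b_dut = [90.0, 91.0, 86.0]

diff_a = relative_difference(mic_a_dut, mic_a_ref)
diff_b = relative_difference(mic_b_dut, mic_b_ref)
disagreement = spread_across_mics([diff_a, diff_b])
```

If the microphone position did not matter, `disagreement` would stay near 0 dB at every frequency. In these invented numbers it only opens up at the highest frequency point, which is the pattern the plots show above 1.5-2kHz.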

The articles that I've seen that attempt to characterise how headphones behave on real humans past 1kHz or so tend to use either ear canal entrance mics or probe tube mics like these : https://www.etymotic.com/product/er-7c/

Probably the sort of research that's quite difficult to do.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,050
Likes
36,420
Location
The Neitherlands
Jude Presentation
I was very surprised and disappointed in most of this presentation. Bulk of time was spent showing and ridiculing the DIY measurement rigs members of his own forum had created. He showed pictures of them and while correctly stating some issues with them, I just could not figure out what he is after.
He probably wanted to make the point that only measurements done with calibrated fixtures adhering to certain standards tell the truth.
Some of the DIY rigs are abominations though.

Hobbyists are creating these measurements because headphone companies are not providing them. If Jude wanted to improve things, he should have complained about the lack of such measurements from that sector, which would be likely to be at this conference, rather than about DIY people.
Yes, great point. There are many folks measuring these days, but mostly the mainstream headphones. The more obscure ones, understandably, not so much.
Those attempting to 'improve' headphones, especially, will at least have some idea of the changes they made.
If manufacturers did provide plots, should they adhere to standards?
I have seen plots from manufacturers that seem to be drawn by hand even and have little relation to reality.
Fortunately the second half of the talk was better in that he showed a bit about how they measure headphones which seems to be following what Tyll did with use of square wave and such. I am personally not a fan of driving headphone or speakers with square wave, especially a low frequency one. This can be hard on the transducer with the long duty cycle essentially being DC. I saw little justification for this method other than a hack to show the frequency response, sort of, using crude FFT.

Yep, have blown up 2 of those super sensitive 20mW rated drivers already. :(

The other thing he mentioned is that they no longer calibrate at one frequency and instead use white noise. I don't understand the merit of this either as matching the measurements to target needs to be done in a way that relates to the research. Credit to him he asked for feedback from audience but none was provided. Target matching is a visual thing for humans anyway so ultimately it doesn't matter per se.

Assuming the goal is to put a number on how loud a headphone is perceived when a certain voltage or power is applied.

I have to say I am kind of a proponent of this, but NOT with white or pink noise. I would think a noise band from 300Hz to 3kHz would do the trick. This requires some calibration steps though. I will elaborate. Some headphones have dips or are elevated at 400Hz (Amir) or 1kHz (some others), so the 400Hz and 1kHz reference points don't always 'line up'.
Bass frequencies that are elevated or lower than the target hardly contribute to how loud headphones are perceived, only to whether they are found bassy or bass shy. Pink and white noise thus should not be used to determine an average.
Frequencies above 4kHz still have an influence on how loud a headphone sounds, but they are more about how 'sharp' or 'bright' a headphone is.
How loud a headphone is perceived is mostly (but not completely) determined by how loud the average level is in the 300Hz to 3kHz range.
Headphones with lots of upper mids or lots of lower mids give a similar-ish impression of loudness.
Dips (or peaks) at a fixed frequency can also skew results. The measurements at 400Hz and 1kHz are in that case only correct for those frequencies and can be several dB 'off' from the average.
For this reason the band limited noise method seems like a quick way of obtaining a relevant number.

I do this by using the known efficiency of a headphone (HD650) and measuring the DUT at the same volume setting.
Then I align the midrange to the average of both headphones by eye, and the difference in level (easy to see in REW) is added to, or subtracted from, the known HD650 voltage sensitivity.
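The same idea can be sketched numerically instead of by eye. This is a hedged sketch under stated assumptions: the 300Hz to 3kHz band comes from the text above, but the function names, the use of a plain arithmetic mean of dB values, the frequency points, the levels and the reference sensitivity figure are all placeholders for illustration, not real HD650 data.

```python
def band_average_db(freqs_hz, levels_db, lo_hz=300.0, hi_hz=3000.0):
    """Arithmetic mean of the dB values whose frequency lies inside [lo_hz, hi_hz].

    Assumption: averaging dB values directly stands in for aligning the midrange by eye.
    """
    in_band = [lvl for f, lvl in zip(freqs_hz, levels_db) if lo_hz <= f <= hi_hz]
    if not in_band:
        raise ValueError("no measurement points inside the band")
    return sum(in_band) / len(in_band)


def estimate_sensitivity(ref_sensitivity_db, ref_band_db, dut_band_db):
    """Shift the reference headphone's known sensitivity by the midrange level difference."""
    return ref_sensitivity_db + (dut_band_db - ref_band_db)


# Invented responses measured at the same volume setting (dB, arbitrary reference).
freqs_hz = [100, 300, 1000, 3000, 10000]
ref_levels = [95.0, 90.0, 90.0, 90.0, 80.0]   # reference headphone (placeholder values)
dut_levels = [100.0, 87.0, 86.0, 88.0, 85.0]  # device under test (placeholder values)

ref_band = band_average_db(freqs_hz, ref_levels)
dut_band = band_average_db(freqs_hz, dut_levels)

# 100.0 is a placeholder sensitivity, not the HD650's actual specification.
dut_sensitivity = estimate_sensitivity(100.0, ref_band, dut_band)
```

Note that the DUT's elevated bass and treble in this toy example do not affect the result, which is the point of limiting the band.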

In QA section he was asked what his favorite headphone was. He said the Sennheiser HE-1 ($45,000). He was asked if he had measured it. Shockingly he said no! Gosh that was awkward when you are in a measurement seminar and you don't believe in this stuff to practice it.
I have to say.. the best headphone I ever heard was HE-1 (caveat: approx. 15 mins for what that's worth with 'demo' music). Also heard a few other ones that could kind of compete on certain aspects and haven't heard some TOTL headphones that I may like even more.
But I haven't measured most of them. Still, what I heard, even though not measured, is what makes them my favorite(s).

Dr. Olive Presentation

They developed a compensation curve for 5128 relative to Harman target but alas, not all headphones showed the same differential. Using this new target, the above headphone showed an error of I think 6 points. But there are others that cannot be fixed this way.

I would assume all of these test rigs comply with the standards when it comes to FF or DF measurements in anechoic conditions.
This shows that measuring headphones is not an exact science. I mean the results are there, raw or compensated, but different headphones will show different deviations on different fixtures. Which one is the most 'correct' is a valid question with headphones. Could be rig A for headphone X and rig B for headphone Y. The delta between headphones and rigs may say something about that.
 

someguyontheinternet

Active Member
Joined
Apr 16, 2021
Messages
194
Likes
335
Location
Germany
The peak is visible in the other measurement posted by Mr. Olive.
Could be my ear too, who knows, but the HD600, DT770 etc. all sound normal to me and conform well with others' measurements when measured with my ears.
After comparing a couple of measurements
Airpod Max ASR - Airpod Max Headfi
HD650 ASR - HD650 Headfi
HD800s ASR - HD800s Headfi
There seems to be a big difference between the BK used by Headfi and the Gras used by Amir especially in the treble area. Both fixtures try to approximate some average human ear.
However looking at the Empyrean on Gras by Oratory vs the Empyrean on BK by Headfi the measurements are much closer.

It's a bit difficult to draw conclusions here. My speculation is that there are some resonances involved that require a certain fit (or earcup size/depth) or chassis material to appear with different ear models (or human ears). But since I have neither the equipment nor the experience necessary to research this further, I will leave that topic for the more qualified to discuss.
 

Scgorg

Active Member
Joined
Jun 20, 2020
Messages
129
Likes
425
Location
Norway
But that is due to your own ears and mic compensation, for example on Oratory both the Elex and Utopia show a dip around 9 kHz


and it's logical that if those measure neutral there for you the Stealth will have a peak there.
Important to note that the GRAS rig is supposed to have a 9-10 kHz dip, which is an effect of the artificial ears used. If something measures without a dip around that frequency on a 43AG, that most likely means it is too bright. If you look at 43AG measurements you'll find that a whole lot (most?) of them show that exact same high-Q dip.
 

nyxnyxnyx

Addicted to Fun and Learning
Joined
May 22, 2019
Messages
506
Likes
475
I think it’s more likely Jude named the HE1 as his favourite headphone as they are not realistically in competition with any of his sponsors. They are a safe bet to keep the sponsors happy.
It could be that too, so I think it's fair to consider both possibilities rather than one-sidedly assuming "he says it cuz $$$$" or "he says it cuz HE1 is obviously the best", because we can't truly know what a person really thinks.
 

Sean Olive

Senior Member
Audio Luminary
Technical Expert
Joined
Jul 31, 2019
Messages
334
Likes
3,065
Well, how do you measure that? Would be naive to suggest putting microphones in a bunch of human ears?

In the same way that we strive to create room independent loudspeakers, should we be trying to create cheek independent headphones?

Perhaps a good consumer test of a headphone should include different cheek shapes. Ergonomics are huge in headphone quality and cannot be separated from audio quality.
That's exactly how I would do it. That's how Todd Welti did it in his previous study. The 2nd picture shows the FR of 10 headphones (rows) on eight subjects (columns), measured with a microphone at the blocked entrance of the ear canal, compared to measurements made on a flat plate (no pinna) and on the GRAS 45CA with the old pinna, prior to the anthropomorphic pinna.

You can see that the leakage varies quite a lot among subjects. The flat plate underestimates leakage on humans and the old pinna overestimates it.

1633103529737.png

1633103583856.png
 

CedarX

Addicted to Fun and Learning
Forum Donor
Joined
Jul 1, 2021
Messages
510
Likes
819
Location
USA
How difficult would it be to develop test procedures and fixtures aimed at calibrating an empirical (mathematical) model of the headphone instead of characterizing its response on a particular test fixture? The model could be used to correlate and predict responses on GRAS, BK, or even an individual HRTF. The 'target FRs' would become 'target models' correlated to represent specific population groups.
Too complex? Dream only?
 

617

Major Contributor
Forum Donor
Joined
Mar 18, 2019
Messages
2,433
Likes
5,383
Location
Somerville, MA
How difficult would it be to develop test procedures and fixtures aimed at calibrating an empirical (mathematical) model of the headphone instead of characterizing its response on a particular test fixture? The model could be used to correlate and predict responses on GRAS, BK, or even an individual HRTF. The 'target FRs' would become 'target models' correlated to represent specific population groups.
Too complex? Dream only?
I think we need to think big if we ever want to have meaningful reviews of headphones.

Maybe it will become a situation where the consumer has to know what 'inner ear shape category' and 'face bumpiness category' they are in. Like when you buy running shoes, you need to know whether you under- or over-pronate, or when you buy clothes you need to know your proportions.

On the other hand, after reviewing the slide posted above with the 8 human measurements, it doesn't look like the human variation is vast. Significant, but not vast.

Also, you know what headphone I would love to see analyzed in this context? The famous AKG K1000. I wonder if the added distance and driver positioning create a more ear-invariant acoustic environment.
 