
AES 2025 Paper: New targets for the B&K 5128 GRAS 45CA-10

For IEMs, hell yes. Paradoxically, even more so for active ones, since you can't even use the Harman IE target for them, even though the load impedance should be of lesser importance for them in the range where their active systems operate.
For over-ears, not a necessity, but not a bad thing either. Having both is even better :D.
I'm not sure it's that cut and dried. All I know is that a lot of enthusiasts on the internet started getting concerned about small deviations above 3kHz, where individual anatomy is known to be the bigger factor in FR variability, often citing BK5128 measurements as having superior accuracy, which sounds suspiciously like a marketing point. The data I've seen shows a vastly different FR every time you reseat an IEM on BK5128, whereas 711 at least has repeatability going for it, which has to be a subset of accuracy.

Acoustic engineers of note refrain from one-sided points about BK5128 superiority; Herbert Zheng said "No more accurate, just different" (paraphrased). Oratory1990 is hardly in a hurry to discontinue his GRAS 43AC/43AG data; in fact, he currently licenses it for commercial use.

Lastly, if B&K hadn't ended up dominating IEC 60318-7 with BK5128, through Headphones.com's and Jules' efforts, maybe GRAS and HEAD would see more organic competition. Competition is the driver of progress in audio tech; why would you want less of that?
 
Acoustic engineers of note refrain from one-sided points about BK5128 superiority; Herbert Zheng said "No more accurate, just different" (paraphrased). Oratory1990 is hardly in a hurry to discontinue his GRAS 43AC/43AG data; in fact, he currently licenses it for commercial use.

Not that the GRAS data is any more accurate or not! The statistics do not support any equivalence between the brands.

GRAS is the standard. To use the B&K would mean throwing out the old GRAS data and starting over again.


Where the B&K 5128 HATS may have an advantage is as an audio crash test measurement tool inside an automobile, where a head transfer function would help.

Harman, Dolby and the rest do a lot of audio measurements inside the cabin of a car. Lots of money in that market.
 
I have said these differences are audible. Question is whether they get us closer to the truth. Take a topic that you are familiar with: channel consistency. Such variations easily occur and then some in the treble region. The research targets are highly smoothed and averaged. We can't take them and use them as gospel to this level of accuracy.
I'd say unit-to-unit variation of headphones can certainly already be in that range of deviation, but for me it's more about getting the average more accurate: if you took a population of units of a headphone and EQ'd them to the target, the accuracy of the target would make a difference to how accurately, on average, each unit was EQ'd.
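
A toy sketch of that point, at a single frequency and with made-up numbers: assume one EQ profile is derived from the nominal unit so that it lands exactly on whichever target is in use, and that each unit differs from nominal by a fixed deviation. `residual_errors`, the deviations and the 1 dB target offset are all hypothetical, chosen only for illustration.

```python
from statistics import mean

def residual_errors(unit_deviations_db, target_bias_db):
    """Error of each EQ'd unit relative to the 'true' target, in dB.

    Assumption: the EQ corrects the nominal unit perfectly to the target
    in use, so every unit keeps its own deviation plus any offset
    (target_bias_db) between the target in use and the true target.
    """
    return [d + target_bias_db for d in unit_deviations_db]

# Hypothetical unit-to-unit deviations from the nominal unit (dB).
deviations = [-2.0, -1.0, 0.0, 1.0, 2.0]

for bias in (0.0, 1.0):
    errs = residual_errors(deviations, bias)
    print(f"target bias {bias:+.1f} dB: "
          f"mean error {mean(errs):+.2f} dB, "
          f"mean |error| {mean(abs(e) for e in errs):.2f} dB")
```

In this simplified model the unit spread is unchanged by the target, but the average error across the population shifts by exactly the target's offset.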
 
I have a question for the experts that is somewhat on topic here.

What prevents us from using established mathematical methods like curve fitting to derive the behavior of a headphone into any acoustic load?
It would seem to me that taking a series of measurements into known loads that vary from high to low could be used to compute the response into any variable load that simulates any ear canal/ear drum.

Let's say we take one coupler that has high impedance across all frequencies, another that has low impedance across all frequencies, another one that slopes low to high and another that slopes high to low, all of them encompassing the acoustic impedance range of all human ear canals.
Could we not mathematically derive the behavior of the headphone when loaded with any arbitrary human ear from those 4 curves?
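
One way to make the question concrete is the textbook Thévenin-style source model, which is an assumption here and not something established in this thread: at each frequency the headphone is treated as a linear source with pressure p_s and source impedance Z_s, so the pressure into a load Z is p = p_s * Z / (Z_s + Z). Under that idealisation, complex measurements into two known loads are enough to solve for p_s and Z_s, and the response into a third load follows. The function names and all numbers below are hypothetical, and the sketch says nothing about leakage, fit, or how one would obtain a real ear's impedance.

```python
def thevenin_from_two_loads(p1, z1, p2, z2):
    """Solve p = p_s * Z / (Z_s + Z) for (p_s, Z_s) at one frequency.

    p1, p2: complex pressures measured into known complex loads z1, z2.
    Assumes a linear source and two sufficiently different loads.
    """
    denom = p1 * z2 - p2 * z1
    if denom == 0:
        raise ValueError("loads/measurements do not determine the source")
    z_s = z1 * z2 * (p2 - p1) / denom
    p_s = p1 * (z_s + z1) / z1
    return p_s, z_s

def predict_pressure(p_s, z_s, z_load):
    """Pressure the modelled source would produce into an arbitrary load."""
    return p_s * z_load / (z_s + z_load)

if __name__ == "__main__":
    # Hypothetical 'true' source and loads (arbitrary units, one frequency).
    true_p_s, true_z_s = 2 + 1j, 3 - 2j
    z_high, z_low, z_ear = 10 + 0j, 1 + 5j, 4 + 4j

    # Simulated measurements into the two known couplers.
    p_high = predict_pressure(true_p_s, true_z_s, z_high)
    p_low = predict_pressure(true_p_s, true_z_s, z_low)

    p_s, z_s = thevenin_from_two_loads(p_high, z_high, p_low, z_low)
    print("recovered source:", p_s, z_s)
    print("predicted into third load:", predict_pressure(p_s, z_s, z_ear))
```

This would have to be repeated per frequency with magnitude and phase, and it only holds as far as the linear one-port model does.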
 
Let's say we take one coupler that has high impedance across all frequencies, another that has low impedance across all frequencies, another one that slopes low to high and another that slopes high to low, all of them encompassing the acoustic impedance range of all human ear canals.
Putting aside the work of measuring a headphone four times, interpolation would not generate any new data unless the correlation between those points is linear, which it is not going to be. But let's say it did do that. How would you know your ear's acoustic impedance?

Also, the deal with acoustic impedance is a hypothesis to be proven. There could very well be other factors that explain some of the variations.
 
For IEMs, hell yes. Paradoxically, even more so for active ones, since you can't even use the Harman IE target for them, even though the load impedance should be of lesser importance for them in the range where their active systems operate.
You have published listening tests to back this? What does "hell yes" mean anyway? What correlation are you getting from measurements to listening preference with 5128 vs GRAS 45CA?
 
I have read this paper a few times and it seems to me that a 'standard' acoustic impedance measurement of headphones should be part of every future headphone review. This seems like a 'missing link' between the frequency response deviations from a target and the deviations on a human head. That information might get headphone manufacturers to move closer to a more consistent experience on more users' heads, in a similar way to how SINAD measurements moved manufacturers closer to a distortion-free device or how off-axis measurements helped speaker measurements explain in-room response deviations.
 
Putting aside the work of measuring a headphone four times, interpolation would not generate any new data unless the correlation between those points is linear, which it is not going to be. But let's say it did do that. How would you know your ear's acoustic impedance?

Also, the deal with acoustic impedance is a hypothesis to be proven. There could very well be other factors that explain some of the variations.
It would save money on expensive couplers and theoretically it could generate a personalized frequency response for prospective buyers. The same way people get custom IEM impressions, maybe they could get their ears measured for acoustic impedance. Then a web tool like squig.link could have the added functionality of generating a personal response from uploaded measured ear canal impedance.
Looking at the significant variations in the same headphone's response depending on the coupler used, having your own acoustic impedance would help with purchasing decisions.
Maybe 4 simple couplers is overkill. Maybe an algorithm could do the math based on two or three?
 
Looking at the significant variations in the same headphone's response depending on the coupler used, having your own acoustic impedance would help with purchasing decisions.
There is no significant difference between what was used in research and GRAS 45CA with KB5000 (what I use). The research fixture generated targets that were preferred by the majority of people. Nothing is going to increase that percentage to 100 or even 90%. There is going to be an unknown factor which I attempt to correct for in developing my EQ. Combine the two and you have the solution you are looking for.

Anyone wanting to use the B&K 5128 has the problem you mention. Folks who have jumped in with both feet before asking questions are the ones who need to perform proper research to develop a target with high correlation with listening tests. It is not my problem to solve as I rejected the 5128 as a suitable fixture for my reviews.
 
I have read this paper a few times and it seems to me that a 'standard' acoustic impedance measurement of headphones should be part of every future headphone review. This seems like a 'missing link' between the frequency response deviations from a target and the deviations on a human head. That information might get headphone manufacturers to move closer to a more consistent experience on more users' heads, in a similar way to how SINAD measurements moved manufacturers closer to a distortion-free device or how off-axis measurements helped speaker measurements explain in-room response deviations.
Again, that is an issue for people using 5128. The GRAS 45CA/KB5000 does not have such sensitivity. It produces far more consistent results.

We have the real deal here: a fixture (GRAS), a target, and listening tests behind development of a target over nearly a decade of work. Folks who want to screw around with this formula need to first prove there is any improvement to be had.

As for SINAD, we are there and then some. People need to produce headphones/IEMs that come close to or match the Harman curve. Customers then need to apply a bit of EQ or pick from the variations on this theme that better fit their preference. Get the industry here and we will be miles ahead of where we are now, where only a small portion of the market does this.

5128 adds unneeded complexity to the above. It will produce a different response for the same headphone, leading to confusion. I have already had this issue with a company wanting to trust their 5128 measurements more. But there was no there, there. Fitment of both IEMs and headphones is also more difficult and hence variable on 5128.
 
I have read this paper a few times and it seems to me that a 'standard' acoustic impedance measurement of headphones should be part of every future headphone review. This seems like a 'missing link' between the frequency response deviations from a target and the deviations on a human head. That information might get headphone manufacturers to move closer to a more consistent experience on more users' heads, in a similar way to how SINAD measurements moved manufacturers closer to a distortion-free device or how off-axis measurements helped speaker measurements explain in-room response deviations.

I think it could be a useful proxy measurement, but like all proxy measurements it may only really say something about the confidence you can have that the measurements you've performed on an ear simulator are more or less likely to be representative of what a majority of individuals will experience.
An issue that could arise is if headphones A have a somewhat low-ish acoustic impedance over most of the range, but an appalling physical design that's prone to leakage or slippery fit on most heads, and headphones B have a higher acoustic impedance but an exceptional physical design that minimises leakage across most heads, for example.
Ideal scenario I think remains to test HPs on several mannequins representative of real humans, or on the latter directly like Rtings.

It would save money on expensive couplers and theoretically it could generate a personalized frequency response for prospective buyers. The same way people get custom IEM impressions, maybe they could get their ears measured for acoustic impedance. Then a web tool like squig.link could have the added functionality of generating a personal response from uploaded measured ear canal impedance.
Looking at the significant variations in the same headphone's response depending on the coupler used, having your own acoustic impedance would help with purchasing decisions.
Maybe 4 simple couplers is overkill. Maybe an algorithm could do the math based on two or three?

B&K went to a lot of trouble to get the acoustic impedance of real human ears, I'm not sure the methodology used is feasible as a routine measurement : https://arxiv.org/abs/1811.03389
Companies like Apple, Bose, perhaps others (possibly Huawei) and hopefully Harman soon, already have IEMs in the marketplace that can evaluate via the inward facing mic the impedance of the ear canal / eardrum / middle ear, and apply a compensation, but the process is not the sort of maths I understand :D. Harman published an article recently on the subject : https://aes2.org/publications/elibrary-page/?id=22943
I'll let more knowledgeable people respond to your question about using known volumes; if you encounter Mad_Economist on HP.com's Discord channel he could probably give you good insights into it.
Also, eardrum / middle ear impedance is not the only factor for inter-individual variation you may need to take into account.
 
Also, the deal with acoustic impedance is a hypothesis to be proven.

Man, if only somebody had already posted in this thread a rather interesting article on the subject published by none other than Harman, one of whose authors is Sean Olive ! Oh wait I did :D. This is the second time in this thread I've suggested you read it, not sure it will help.
TLDR - and not just on the strength of that article - in 2025 it's not a hypothesis.
What I'd like to learn more about however, is whether or not this is the sort of inter-individual variation that's desirable, or not (but for eardrum / middle ear impedance, my sentiment is that it's not, and I'd need to learn more about it or make more experiments of my own).

There could very well be other factors that explain some of the variations.

Indeed there are.

You have published listening tests to back this? What does "hell yes" mean anyway? What correlation are you getting from measurements to listening preference with 5128 vs GRAS 45CA?

I'm not sure you realise yet, despite repeated attempts to enlighten you on the subject, that Harman's body of research says actually rather little about how a given pair of real headphones will be preferred when it's measured on an ear simulator ? Or, to put it more precisely, that with one exception, all notions of inconsistent transfer functions between different headphones and individuals were ignored ? What it does is correlate preferences with different curves, as reproduced by a limited set of specifically chosen headphones or IEMs.

In other words that, in practice, your capacity to predict what this means in terms of preferences :
Screenshot 2025-10-20 at 08.04.58.png

Is made a lot less reliable by this ?
Screenshot 2025-10-17 at 20.58.15.png

Now as far as the 5128 fixture is concerned, it won't help in this regard for the most part, but for IEMs you'll at least start on a better footing given that it's more representative of an average human canal (and if you don't understand why that's important, I'll repeat it again, try to answer the question I asked earlier about the Anker A40 with ANC turned on).

Again, that is an issue for people using 5128. The GRAS 45CA/KB5000 does not have such sensitivity. It produces far more consistent results.

We have the real deal here: a fixture (GRAS), a target, and listening tests behind development of a target over nearly a decade of work. Folks who want to screw around with this formula need to first prove there is any improvement to be had.

As for SINAD, we are there and then some. People need to produce headphones/IEMs that come close to or match the Harman curve. Customers then need to apply a bit of EQ or pick from the variations on this theme that better fit their preference. Get the industry here and we will be miles ahead of where we are now, where only a small portion of the market does this.

5128 adds unneeded complexity to the above. It will produce a different response for the same headphone, leading to confusion. I have already had this issue with a company wanting to trust their 5128 measurements more. But there was no there, there. Fitment of both IEMs and headphones is also more difficult and hence variable on 5128.

We've tried to explain it to you several times to no avail, this won't help, but you have it backwards. The more prevalent issue here is the headphones being poorly designed, not the fixture.
The main conclusion to draw from the article this thread is about is "some of DCA's headphones (and others) aren't engineered well enough", not "5128 bad".
 
We've tried to explain it to you several times to no avail, this won't help, but you have it backwards. The more prevalent issue here is the headphones being poorly designed, not the fixture.
Nonsense. A fixture needs to quantify both good and bad headphones. No way can it be turned around and say it is the fault of the DUT that the measurements are wrong!
 
The main conclusion to draw from the article this thread is about is "some of DCA's headphones (and others) aren't engineered well enough", not "5128 bad".
That's some logic and tap dance. Someone gives you a random headphone to measure. How do you know if it is "well engineered" vs 5128 backfiring generating unpredictable results?
 
In other words that, in practice, your capacity to predict what this means in terms of preferences :
Screenshot 2025-10-20 at 08.04.58.png
I had no trouble verifying the accuracy of those measurements vs listening. From the review:

Dan Clark NOIRE XO Listening Tests
Company asked me to listen before measuring and that is what I did. I very much liked what I heard although the temptation to measure was strong so that is where I went after a few hours. :) Once I saw the measurements, I decided to test their audibility using equalization:



Note that the values are tuned by ear.

Overall, I like the effect of EQ better. The headphone handles bass beautifully so might as well goose it up a bit more. :) The other two filters work with it to generate a bit more exciting experience. With or without the filters, the sound is so enjoyable that I have been listening to the XO all week as my everyday headphone. Bass, as noted, is clean and deep. The rest of the response reminds me of the tonality of excellent studio monitors. Spatial qualities are slightly better with the EQ.


This type of evaluation is the only way to reach higher levels of confidence. You can screw around all you want with fixtures, targets, gray bars, etc. and you are not going to be able to move the needle forward. Heck, right now the needle has moved backward with 5128!

Above is very much consistent with Harman research in the way listening tests were performed with equalization of a surrogate headphone.
 
Man, if only somebody had already posted in this thread a rather interesting article on the subject published by none other than Harman, one of whose authors is Sean Olive ! Oh wait I did :D.
You did post it. But you don't even know what the research is about let alone draw the conclusion you made. The paper is about using the headphone as a sensor to estimate the response of the ear. It has nothing to do with explaining the misfiring of the 5128 being due to impedance mismatching which is what you are claiming. Or else, the paper we are discussing would not exist.
 
Who on this planet has the exact same head and ear shape, skin properties and/or hair and seal as any industry standard fixture ?

Exactly ....

People need to realize that industry standard fixtures are standard so one can get consistent comparative results on the same standard all around the world.
That's why they exist.

So... this means headphones that are measured are only accurate to that particular standard.
There are different fixtures and most can even be configured in many ways, different standards exist, different targets exist (mostly from sounds not a few cm's from the side of the head).

Different headphones measure differently on different fixtures, which have ear canals of different impedance and different pinnae.
That different impedance in ear canals is responsible for different 'peaking' in the response of the fixture.
Then there is a difference in 'leakage' due to the shape of the fixture where the pads meet the fixture and surface area which also alters the 'load impedance' to the headphone due to leakage. This not only affects low and mid-low frequencies but also higher frequencies.

It is wishful dreaming to think they are accurate to how people perceive them, just as there is no single correct 'target' for all people.
There can only be a best fitting median across a very large group of 'heads' and that might just be slightly 'off' from current standards.

I do not understand the wish for a 'best target' and 'most accurate test fixture' as there is none.
Well there is but only to a specific standard (production tolerances).
They are ALL compliant with some industry standard(s), or not even that, and if you are lucky enough to have a similar tonal preference and even a similar seal to the fixture, then count your blessings.
When this doesn't align that could be for a myriad of reasons.

Amir measures to a standard he chose for reasons that were compelling and chose a target for again logical reasons.
Of course there are different standards, different fixtures and more will be created in the future.

All vary (or will vary) a little, leading to somewhat different results. The differences will be MUCH smaller than actual differences between all headphone users.

Perhaps look at the larger picture instead of biatching about this or that fixture or target or numbers.
The differences between them are much smaller than the differences on actual human heads with actual usage.

Sure... maybe this or that fixture and this or that target is closer to a median of some population but that does not mean that a slight difference in found medians will magically lead to better EQ or all headphones sounding very similar.

People want choices in comfort, drive-ability and 'sound'.
Measurebators want accuracy to a standard and want/need to trust (and preferably correlate) the measurements.
Science guys love the science, papers, research and the numbers.
Consumers want reviews that steer them towards something they think they are looking for.
Some folks want numbers/plots others prefer (flowery) descriptions.
All people have different preferences, biases, knowledge levels and budgets.


/rant.
 
This type of evaluation is the only way to reach higher levels of confidence. You can screw around all you want with fixtures, targets, gray bars, etc. and you are not going to be able to move the needle forward. Heck, right now the needle has moved backward with 5128!

Above is very much consistent with Harman research in the way listening tests were performed with equalization of a surrogate headphone.

I think this is an example of how 5128 measurements confuse the situation.

DCA E3 - 5128

1760950401256.png


1760950501286.png


A listener using measurements from the 4128 and 5128 comes to a similar EQ to Oratory's.
1760950235193.png

Deviations seen on the 5128 are ignored up to 2kHz, with the confidence stated as low. If someone tried to auto-EQ using only the 5128 measurements, they'd end up with a completely different and wrong-sounding EQ.


Meanwhile, with GRAS it is clear:
DCA E3 Amir
1760949935066.png


No EQ needed

DCA E3 Oratory
1760949899866.png


No EQ needed, but Oratory provided 4-band EQ.
Preamp: 0.1 dB
Filter 1: ON PK Fc 450 Hz Gain -0.7 dB Q 1.4
Filter 2: ON PK Fc 1200 Hz Gain 1.1 dB Q 1
Filter 3: ON PK Fc 2650 Hz Gain -1.5 dB Q 3.5
Filter 4: ON HS Fc 2500 Hz Gain -1 dB Q 0.71
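
For anyone wanting to see what those four filters add up to, here is a minimal sketch that evaluates their combined magnitude response. It assumes the widely used Audio EQ Cookbook (RBJ) biquad formulas and a 48 kHz sample rate; neither is stated in the post, and a given EQ program may implement PK/HS filters slightly differently. Function names are illustrative only.

```python
import cmath
import math

FS = 48000.0  # assumed sample rate, not given in the post

def biquad(kind, fc, gain_db, q, fs=FS):
    """RBJ-cookbook coefficients (b, a) for a peaking ('PK') or high-shelf ('HS') filter."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * fc / fs
    cw, alpha = math.cos(w0), math.sin(w0) / (2.0 * q)
    if kind == "PK":
        b = (1 + alpha * A, -2 * cw, 1 - alpha * A)
        a = (1 + alpha / A, -2 * cw, 1 - alpha / A)
    elif kind == "HS":
        s = 2.0 * math.sqrt(A) * alpha
        b = (A * ((A + 1) + (A - 1) * cw + s),
             -2 * A * ((A - 1) + (A + 1) * cw),
             A * ((A + 1) + (A - 1) * cw - s))
        a = ((A + 1) - (A - 1) * cw + s,
             2 * ((A - 1) - (A + 1) * cw),
             (A + 1) - (A - 1) * cw - s)
    else:
        raise ValueError(f"unsupported filter type: {kind}")
    return b, a

def gain_db_at(f, b, a, fs=FS):
    """Magnitude response in dB of one biquad at frequency f (Hz)."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))

# Values copied from the EQ listed above.
PREAMP_DB = 0.1
FILTERS = [("PK", 450, -0.7, 1.4), ("PK", 1200, 1.1, 1.0),
           ("PK", 2650, -1.5, 3.5), ("HS", 2500, -1.0, 0.71)]

def total_gain_db(f):
    """Preamp plus the summed dB gain of all filters at frequency f (Hz)."""
    return PREAMP_DB + sum(gain_db_at(f, *biquad(k, fc, g, q))
                           for k, fc, g, q in FILTERS)

if __name__ == "__main__":
    for f in (100, 450, 1200, 2650, 10000):
        print(f"{f:>6} Hz: {total_gain_db(f):+.2f} dB")
```

Because the filters overlap, the combined gain at a given centre frequency will generally differ a little from that single filter's listed gain.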
 

I think this is an example of how 5128 measurements confuse the situation.
Good grief... Those ups and downs are incredible.... Situation is worse than I had imagined!
 
I think this is an example of how 5128 measurements confuse the situation.

DCA E3 - 5128

View attachment 484458

View attachment 484459

A listener using measurements from the 4128 and 5128 comes to a similar EQ to Oratory's.
View attachment 484457



Meanwhile, with GRAS it is clear:
DCA E3 Amir
View attachment 484454

DCA E3 Oratory
View attachment 484453

Ignoring deviations from improper measurements (improper seating on the fixture, etc.), a big measurable difference for the "same" HP on different fixtures would lead to the following conclusions:
-if the exact same headphones were not used in each of these tests, the unit variation between HPs of this model is fairly high *or*
-the HP in question shows a fairly large variation between individual listeners *and*
-the HP was developed on the fixture where the measurements correlate well with the expected target curve

How people can draw the conclusion "the 5128 is an improper measurement fixture that produces unreliable and inaccurate data", while variation between humans is far greater than variation between the two test fixtures discussed here plus B&K's incredible track record of producing high quality measurement equipment for decades is beyond me. Do you think Sean would put so much effort into research on this platform if he came to the same conclusion?
 