ASR Headphone Testing and BK 5128 Hats Measurement System

Soniclife · Aug 6, 2020

JohnYang1997 said:
All new hi-res heads reduce natural ear canal resonances, which leads to false measurement results.

How do you know it's wrong? This implies there is a truthful measuring system out there to compare to.

hege · Aug 6, 2020

Please don't waste your precious time on this; there are enough good sources already.

Kouioui · Aug 6, 2020

How about an in-depth measurement and subjective listening test review for a device such as this using typical audiophile headphones? Use Amir's Salon2s for speaker modeling:

Smyth A16 Realiser 2U - SMYTH Research

Tks · Aug 6, 2020

amirm said:
There are some big negatives however:

1. The 5128 extends the simulation limit of the older "711" standard substantially. But with it, it also makes the measurements non-standard so existing research may be difficult to apply to it.

2. The cost. Man, oh the cost. The full HATS has a retail cost of $41,000! There is a truncated one that is a bit cheaper (just the head and no torso). This is a stunning amount of money to spend to measure headphones.

See, it's this kind of stuff that confirms my suspicions that headphone manufacturers have no idea what they're doing, and the drivers they put into their headphones are just generally "good enough" performing pieces of gear. It's no wonder this industry is plagued with rarely progressing performance with respect to end-point devices (speakers, headphones, IEMs). Every company, and actually the same company at times, puts out new SKUs with ZERO performance relation to the prior SKU. IEMs, for example, will sometimes have something like what Moondrop's Kanas Pro achieved (extremely low distortion), and every single release after that has had worse THD metrics. They're all just shooting in the dark.

Look at Sennheiser's dumb (or pathetic) ass... The fuck is the HD800 and its siblings? You put out a relatively good headphone with the 600 series, and then you release this kilobuck+ 800 series plagued with lower end distortion, ON TOP of being a treble spike ear rape nightmare without EQ. And then they supposedly quelled that treble spike (A BIT, not completely or anywhere close to such) with the HD800S, but made the bass distortion worse.

So if a company of that pedigree and size can suffer this sort of nonsense, and here we have a test rig in the $40,000+ range with compromises.. It only stands to reason 95%+ of players in this field are just shooting from the hip anytime they introduce a new product-line.

I personally would've wished we went the headphone route instead of speakers (those things look doomed as shit from a progress perspective, spending thousands on passive speakers and power amps that have been slapped around for LITERALLY half a decade+ by the Benchmark AHB2). Speakers seem to be progressing more in powered variants than in passives. But anyway, yeah, headphones seem interesting and somewhat manageable. (I still don't understand how speakers with separate drivers, with bass handled by the subwoofer, mid-range handled by the standard speaker woofer, and treble handled by the tweeter, can still perform so awfully, with awful crossover points and just distortion galore without a subwoofer.) Meanwhile headphones with one driver are still doing somewhat okay (yeah yeah, I know, night-and-day difference in power demands, which makes life easier for lower-powered devices).

So even with my favoritism towards headphones..

$41K for this is just crazy. At that price I wouldn't want concessions, I'd want people calling me and paying me to measure prototypes for them for a fee.

solderdude · Aug 6, 2020

Robbo99999 said:
Ah, each measurement rig has a compensation curve then, kind of like a microphone calibration file.

Exactly. For mics this is kind of easy to do. For HATS there are standards for how this should be done.
Unfortunately these methods are not the same as the way headphones are tested. On top of this, not all headphones are the same, so larger drivers will differ from smaller ones even when they would be perfect. Also, when slightly different but standardized pinnae are used there will be differences between them.
Furthermore, the correction that is applied will differ from real-world situations.
One measures what arrives at a certain place (an 'average' ear canal and 'average' pinna) and then applies an 'averaged' single correction curve.
That curve may well fit nicely with certain headphones but won't with all of them. That's what makes a 'standard' a 'standard', yet it may well deviate from the real world. Since there is no 'confidence' number anywhere, how do we know the 'standard' is more exact and matches real life?

Sure... EQ based on this will very likely improve things, which is what is heard. An exact EQ will thus measure really well on the same rig but may be many dB off in reality. That is less likely to happen at higher frequencies without a pinna. Of course a pinna is more accurate between 1kHz and 5kHz than a flat plate. Saying that something is according to a standard only means it measures in conformance with that standard.
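
For readers who want to see what "applying a single correction curve" amounts to numerically, here is a minimal sketch. It assumes the compensation curve is stored as a few (frequency, dB) points and is simply subtracted from the raw rig reading, much like a microphone calibration file. The function names and all numbers are hypothetical, and the interpolation is linear in Hz for brevity (real tools usually interpolate on a log-frequency axis).

```python
from bisect import bisect_left

def interp_db(freqs, values, f):
    """Linearly interpolate a dB curve at frequency f (clamped at both ends)."""
    if f <= freqs[0]:
        return values[0]
    if f >= freqs[-1]:
        return values[-1]
    i = bisect_left(freqs, f)
    f0, f1 = freqs[i - 1], freqs[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (f - f0) / (f1 - f0)

def compensate(raw, comp_freqs, comp_db):
    """Subtract a rig compensation curve (dB) from raw (freq, dB) points."""
    return [(f, db - interp_db(comp_freqs, comp_db, f)) for f, db in raw]

# Hypothetical numbers, for illustration only (not from any real rig).
comp_freqs = [100.0, 1000.0, 3000.0, 10000.0]  # Hz
comp_db = [0.0, 0.0, 12.0, 4.0]                # 'averaged' ear gain in dB
raw = [(100.0, 90.0), (2000.0, 96.0), (3000.0, 101.0)]

corrected = compensate(raw, comp_freqs, comp_db)
```

The point of the sketch is that the corrected trace is only as good as the one curve being subtracted: if a given headphone interacts with the pinna differently from the 'average' the curve was built on, the error goes straight into the result.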

Robbo99999 said:
My prior idea about measuring known headphones on this rig and comparing against Oratory1990's could still be a way to get to a known standard though. How valid that is depends on how good you think Oratory's measurements are, at the moment I'm not convinced by a couple of you people saying they're no good, especially if intimated from Solderdude who is measuring them on a flat plate, he's got his own thing going on in his approach which is fine but totally different, it's not comparable.

That's where calibration comes in. The problem is: calibration to what? Perfect headphones do not exist, so calibration must be done in other ways.

Robbo99999 said:
EDIT: different EQ profiles could be provided from Amir, one set based on "calibration" to a known standard, e.g. by comparing against Oratory to create a "calibration file", and the other set of EQ profiles could be one without calibration and just EQ'd directly to the Harman Curve. Then people could see which they liked better.

Yet more and different correction files on yet another rig?
My opinion is to look for common traits (like the Beyer treble peaks, for instance) on different rigs with their own corrections and make an 'average' EQ based on the obvious deviations. This will lead to more exact EQ than just banking on one specific rig with a specific compensation, regardless of how professional the operator is, how expensive the gear is, or how much experience is involved. Just saying 'this is a (the best) standard and thus it is the most realistic' is incorrect. Of course, an owner of said device must stand by it and believe what comes out is the absolute 'truth'.

In that sense all 'generated' EQ is by definition most likely wrong but will be similar-ish on average.
Then apply the similarities and you have gotten rid of test-rig-specific errors.
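
One possible way to sketch this "common traits across rigs" idea in code is shown below. It is an assumption-laden illustration, not the method actually used by anyone in this thread: each rig is reduced to a per-band deviation from its own target (in dB), bands where the rigs agree are averaged, and bands where they disagree by more than a chosen spread are left alone as likely rig-specific. The function name, the 2 dB threshold, and the data are all hypothetical.

```python
from statistics import mean, pstdev

def consensus_deviation(rig_deviations, max_spread_db=2.0):
    """Average the per-band deviation from target across several rigs.

    Bands where the rigs disagree (population std dev above max_spread_db)
    are set to 0.0, i.e. treated as rig-specific and not corrected.
    """
    bands = rig_deviations[0].keys()
    result = {}
    for band in bands:
        values = [rig[band] for rig in rig_deviations]
        result[band] = mean(values) if pstdev(values) <= max_spread_db else 0.0
    return result

# Hypothetical deviations (dB) of one headphone on three different rigs,
# each already corrected with that rig's own compensation.
rigs = [
    {100: -3.0, 1000: 0.5, 8000: 7.0},
    {100: -4.0, 1000: 0.0, 8000: 6.0},
    {100: -3.5, 1000: -0.5, 8000: 8.0},
]

deviation = consensus_deviation(rigs)
# The EQ gain is simply the inverse of the agreed deviation.
eq_gain = {band: -dev for band, dev in deviation.items()}
```

With these made-up numbers all three rigs agree on a treble peak around 8 kHz, so it gets corrected; had one rig shown a dip there instead, the band would have been skipped.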

That's what I try to do, and I 'validate' by ear and with a flat plate (FP) that is compensated below 1kHz to match HATS performance.
Then apply 'bass correction' based on 'averaged preference' or based on other methods/theories.

So if Amir should test, it will be similar to what's out there already and have its own 'deviations' from reality. I also think one should do more than just measure. That last part is tricky and very time consuming. One must 'reset' one's brain with 'references', but then again, what is a reference to whom, and who declared what is a reference?

restorer-john · Aug 6, 2020

I actually think it's a great idea, @amirm. My personal opinion on loudspeaker testing was to not do it. I'm happy to say now I was totally wrong; it has proved to be a fabulous addition to ASR.

If B&K supports you through this, I cannot see why you couldn't be on board with their HP test rigs as they evolve and become better in time. There are definite benefits to the test gear companies (AP, Klippel and B&K) as the exposure and usefulness of their gear is demonstrated to a much wider buying audience. If you aren't getting a commission on sales of AP and Klippel, you should be.

There's absolutely nothing wrong with promoting or endorsing objective SOTA test gear here on ASR, and that may be a way to perhaps offset some of the pain of purchase.

Vini darko · Aug 6, 2020

How is this better than a block of wood and a fancy microphone? £41000 seems ridiculous to me. Still it'll be fun to play with for a couple of days.

cistercian · Aug 6, 2020

I have wondered for some time how accurate headphone measurements are, particularly over 5kc. This range always seems to be a mess of peaks and valleys. My experience of hearing how good speakers reproduce some tones that sound notched out on some phones has been revealing, to me at least, that much more work needs to be done here. This does not mean I think buying this device makes sense, much as I would like to see testing on the test rig itself!
I seriously wonder how much differences in different people's ear canals, and their volume in particular, can result in a frequency response that varies from person to person with headphones... particularly those of closed design.
I think more work needs to be done in this area to move the SOTA further for phone design.

solderdude · Aug 6, 2020

Vini darko said:
How is this better than a block of wood and a fancy microphone? £41000 seems ridiculous to me. Still it'll be fun to play with for a couple of days.

A fancy mic in a block won't be adhering to specific standards. The expensive HATS will.

Accuracy above 5kHz for all headphones is more questionable on a HATS than on a block of wood.
Between 1kHz and 5kHz the HATS will be more accurate.
Accuracy below 1kHz can be the same when calibrated.

Listening for peaks is easy and can be done by comparing to a reference. Listening for dips is difficult.
Using test tones can result in sharp nulls and peaks.
Use narrow noise bands to check against reality. Sweeping with a sine by ear is possible but can also be incorrect.
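
For anyone wanting to try the narrow-band noise check, here is a rough, standard-library-only sketch. It approximates band-limited noise by summing many sines with random frequencies and phases inside the band, which is a simplification (a swapped-in technique, not filtered white noise). The function names, the 1/3-octave default, and the 0.5 peak level are assumptions made for illustration.

```python
import math
import random
import struct
import wave

def narrow_band_noise(center_hz, bandwidth_oct=1 / 3, seconds=1.0,
                      rate=48000, n_tones=64, seed=0):
    """Approximate narrow-band noise as a sum of random-phase sines.

    Tone frequencies are drawn uniformly between the band edges, which sit
    bandwidth_oct/2 octaves below and above center_hz.
    """
    rng = random.Random(seed)
    lo = center_hz * 2 ** (-bandwidth_oct / 2)
    hi = center_hz * 2 ** (bandwidth_oct / 2)
    tones = [(rng.uniform(lo, hi), rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(n_tones)]
    n = int(round(seconds * rate))
    samples = [sum(math.sin(2.0 * math.pi * f * t / rate + p) for f, p in tones)
               for t in range(n)]
    peak = max(abs(s) for s in samples)
    return [0.5 * s / peak for s in samples]  # normalise to half of full scale

def write_wav(path, samples, rate=48000):
    """Write mono float samples (range -1..1) as 16-bit PCM."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        ints = [int(s * 32767) for s in samples]
        w.writeframes(struct.pack("<%dh" % len(ints), *ints))

# Short example burst centred on 8 kHz (kept brief so it runs quickly).
noise = narrow_band_noise(8000.0, seconds=0.1)
```

Calling `write_wav("band_8k.wav", noise)` would save the burst for listening (the file name is just an example). Stepping the centre frequency through the treble and comparing loudness between bands by ear is less prone to sharp nulls than single test tones.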

Testing test rigs with a bunch of different headphones could be interesting indeed.
Measuring headphones is very different from speakers.

MZKM · Aug 6, 2020

TBH, I would not be too interested in this (I am interested in knowing how good a model is, but knowing that is more than just FR measurements). While ignoring their scoring, RTINGS is still the best in my opinion, they measure so many aspects including weight, clamping force, heat increase, as well as using multiple people to show FR deviation, which as a glasses-wearer is something very important to me.

For those unfamiliar, here is an example of their headphone review:
https://www.rtings.com/headphones/reviews/jbl/quantum-100

It is likely too involved for your liking, but is needed to paint a full picture.

Also, so many don’t know that we have our own personal target curves for in-ears as it bypasses the pinna, so even if adjusted to the Harman target, a person needs to know how they personally differ from the Harman target in order to gauge tonal balance. This causes a good deal of arguments online as one person hears it as boomy and another as accurate.

Thomas savage · Aug 6, 2020

Soniclife said:
He should get the full torso version, something to hug after she leaves.

@amirm , this is your future!

NDC · Aug 6, 2020

amirm said:
I like the flexibility of the artificial ear/pinna and better reliability and repeatability that this brings.

There are some big negatives however:

1. The 5128 extends the simulation limit of the older "711" standard substantially. But with it, it also makes the measurements non-standard so existing research may be difficult to apply to it.

2. The cost. Man, oh the cost. The full HATS has a retail cost of $41,000! There is a truncated one that is a bit cheaper (just the head and no torso). This is a stunning amount of money to spend to measure headphones.

I have asked BK to give me an evaluation unit to test. After all, I still don't know if this is a good solution or not. They have been kind enough to say Yes and the unit will arrive soon. I only have a few short days when I get it to test and then return it. Let me know what you think I should be measuring/doing with it.

Whatever you all do, don't mention a word about this to my wife! I honestly don't know how to go and tell her I want to spend $5,000 on this let alone nearly 10X that!

Anyway, any and all feedback is welcome including whether we should even bother doing this.

I think it'd be a great addition to the site. I've enjoyed your iterative improvements to measurements over the time I've been reading reviews (even before I joined as a member). I'm sure the same will happen with headphones and we'll all learn from it. I hope you get it and run some measurements!

Jimbob54 · Aug 6, 2020

As a non speaker person I would be infinitely more interested in headphone reviews and measurements than speakers.

BUT I think it would be a world of pain for you as others have said. There's a wealth of measurements out there, all of which have a degree of inconsistency. You would get endless grief.

I suspect it might generate more site traffic though. It's pretty clear what the really popular threads are about: DACs and headamps. Seeing some of the real top end cans debunked as not really much better than mid range or lower might be v popular. But I doubt the manufs would send samples.

Also $40k is silly money

crinacle · Aug 6, 2020

If I may throw my hat (heh) into the ring...

I'm also in the process of procuring an industry standard headphones measuring rig for my headphone measurement database, though I've decided on a GRAS 43AG-7 mainly to keep parity with other pre-existing databases for easier comparability. Of course, good ol' Oratory1990 currently has the largest (public) database of the GRAS "hi-res" types of headphone measurements, and Resolve from Headphones.com has also acquired a 43AG-7 and has been updating pretty regularly. I was halfway considering purchasing the HMS II.3 that was used on InnerFidelity (now in the possession of Dekoni) but it seems to deviate a little too far from IEC60318-7 spec and is also pretty unwieldy given my use case.

The 5128 looks to be a pretty drastic change on both ends of the frequency spectrum, having some anti-resonance properties in the higher end (like the GRAS RA040X, though far less pronounced and IMO more accurate) and also being less sensitive in the lower frequencies (i.e. measures with less bass). The changes were based on research performed by B&K in late 2018, and the paper is linked here in case anyone wants to dissect it. However, given that it is so drastically different (in the context of industry-standard measurement equipment) to B&K's previous iterations of HATS and GRAS' current lineup, and comes as pretty much a challenge to the IEC60318-7 standard, anyone who adopts the 5128 would face issues revolving around user readability given that the community hasn't really had the chance to get "used" to 5128-style graphs yet.

But yeah, just some ramblings and my 4,100,000 cents on the matter. The audio community can always use more measurements.

Vini darko · Aug 6, 2020

Welcome crinacle

Jimbob54 · Aug 6, 2020

crinacle said:
If I may throw my hat (heh) into the ring...

I'm also in the process of procuring an industry standard headphones measuring rig for my headphone measurement database, though I've decided on a GRAS 43AG-7 mainly to keep parity with other pre-existing databases for easier comparability. Of course, good ol' Oratory1990 currently has the largest (public) database of the GRAS "hi-res" types of headphone measurements, and Resolve from Headphones.com has also acquired a 43AG-7 and has been updating pretty regularly. I was halfway considering purchasing the HMS II.3 that was used on InnerFidelity (now in the possession of Dekoni) but it seems to deviate a little too far from IEC60318-7 spec and is also pretty unwieldy given my use case.

The 5128 looks to be a pretty drastic change on both ends of the frequency spectrum, having some anti-resonance properties in the higher end (like the GRAS RA040X, though far less pronounced and IMO more accurate) and also being less sensitive in the lower frequencies (i.e. measures with less bass). The changes were based on research performed by B&K in late 2018, and the paper is linked here in case anyone wants to dissect it. However, given that it is so drastically different (in the context of industry-standard measurement equipment) to B&K's previous iterations of HATS and GRAS' current lineup, and comes as pretty much a challenge to the IEC60318-7 standard, anyone who adopts the 5128 would face issues revolving around user readability given that the community hasn't really had the chance to get "used" to 5128-style graphs yet.

But yeah, just some ramblings and my 4,100,000 cents on the matter. The audio community can always use more measurements.

Interesting you mention Dekoni. That's something I meant to put in my post on this thread: the extent to which pad swaps and simple mods impact on questions like these. Let's face it, I suspect a lot of the folks who have an interest in measurements also have an interest in tweaking that performance, but presumably a lot of the impact of those tweaks is then judged by ear. I'd be really interested to see stock measurements alongside well known non-invasive mods and pad swaps. I know you and @solderdude often have alternate pad measures, particularly on Beyers etc where there are a lot of possible combos within the range, but not some of the other hacks and pads.

That's where I see that value to the community can be added. Although software EQ is probably in the ascendancy.

Patrick1958 · Aug 6, 2020

amirm said:
Hello everyone. I am out of my mind to be thinking about getting into headphone testing. But the itch exists and I am reminded of it often in private communication with people asking me why I am not testing headphones.

I had tabled the whole thing for many reasons, chief among them that I am not happy with the state of the measurement systems out there for headphones.

There has been a development which may make the measurement situation better. It is the Brüel & Kjær 5128 HATS (head and torso simulator). Here are a couple of quick promotional videos on it:

I like the flexibility of the artificial ear/pinna and better reliability and repeatability that this brings.

There are some big negatives however:

1. The 5128 extends the simulation limit of the older "711" standard substantially. But with it, it also makes the measurements non-standard so existing research may be difficult to apply to it.

2. The cost. Man, oh the cost. The full HATS has a retail cost of $41,000! There is a truncated one that is a bit cheaper (just the head and no torso). This is a stunning amount of money to spend to measure headphones.

I have asked BK to give me an evaluation unit to test. After all, I still don't know if this is a good solution or not. They have been kind enough to say Yes and the unit will arrive soon. I only have a few short days when I get it to test and then return it. Let me know what you think I should be measuring/doing with it.

Whatever you all do, don't mention a word about this to my wife! I honestly don't know how to go and tell her I want to spend $5,000 on this let alone nearly 10X that!

Anyway, any and all feedback is welcome including whether we should even bother doing this.

Hi,
Being a headphone enthusiast, I would applaud such an endeavor and look forward to such measurements, not only to be able to apply a custom curve according to your measurements, but also to compare with other measurements found on the net. Oratory is one of my favorites; however, not all of his EQ presets work equally well to my ears. But then, an EQ preset can only be a guideline that you work from and adjust to your own preferences. No two ears hear the same. Solderdude’s measurements are equally good because on some occasions they reveal run-to-run manufacturing variations (see below).

A few concerns I’d like to raise.

The headphones measured :

1 : is the headphone provided by manufacturer;
2 : is the headphone provided by members, is the headphone one you purchased yourself or one of your collection;
3 : stock earpads, revised earpads (Sennheiser hd 580, 600, 650 come to mind as used in the latest hd 660s and being different to the older HD 580, 600 and 650 and now only available for the older models), new earpads, medium used earpads, old earpads, custom earpads?

First question :

A headphone provided by manufacturer, wouldn’t that be a golden sample measuring in line with what the manufacturer measured on their equipment, a sample within manufacturing variations? This would also imply the stock earpads are provided. This would not translate to the version I have (see below).

Second question :

To be able to provide an accurate EQ setting one has to rely on the measurements provided by the manufacturer compared to your own measurements, deviations need to be minimal. You need to be sure stock earpads are still present and used for measuring and take into account if they are stock, new, medium used, old or custom designed earpads. Logical conclusion would be new stock earpads, can you confirm this beforehand? That still leaves us with the questions on old or medium used earpads.

Third question :

Are the earpads new, medium used or old? Research and measurements found online show there are measurable differences between old, new and stock earpads, and custom earpads measure completely differently from stock old, medium-used and new earpads. And then I haven’t even mentioned the width of someone’s head, which translates into the pressure a headphone puts on the head and also results in a different measurement compared to your measurement rig. Then there is the position of the headphone on your head and the dimensions of your head.
I understand it all comes down to averages, but when is an average no longer an average (read below)?

All three questions have two common denominators :

What are acceptable variations?
Are run to run variations taken into account, or is that even possible?

To illustrate :

One of my first high end headphones purchased was the Sennheiser HD 650. On first impression ….. NOT IMPRESSED. Lack of lower extension, muddy lows and a severe lack of high extension. Never impressed me. Not one of the EQ settings I found online, not even Oratory’s, could bring a satisfying listening experience, and neither could my own experiments. Until a member on this forum noticed his HD 650 had a dented driver. Just out of curiosity I checked mine, and behold, the right driver of my headphone also had an inverted dent. After I contacted Solderdude, he agreed to measure my HD 650 with and without the dent and, without guarantees, to try to rectify the dented driver. Solderdude was able to rectify the fabrication fault, and in the following measurements it came to light that I had a version that measured quite differently from the other/standard models he had measured. Lows started to drop off earlier, and highs had about 3/4 dB less than the standard models he measured in the past. Again, thanks Solderdude. With his findings I could apply some custom PEQ settings on top of Oratory’s settings, which in turn made the Senn HD 650 a very enjoyable headphone to listen to. Hence my earlier statement somewhere in a thread on this forum that I liked the Drop HD 58X more than the HD 650. His findings and measurements, scroll to bottom.
The Philips X2 HR. I liked the sound signature, however something wasn’t sounding completely right. Again, Solderdude to the rescue. Turns out my unit had severe channel imbalance. His measurements were the key to properly EQ them on top of Oratory’s measurements and PEQ settings.
The Hifiman HE560, apparently there is the original version (first version), but also 2 silent revisions. How to know what version you have? What measurement to rely on? How to apply PEQ? (what sounds best could be an argument, on the other hand I like to EQ according to an EQ conforming to the model measured with version statement and then start to adjust to my liking)
The Audeze EL8 (open or closed) also two silent revisions, same questions.

These four examples (and I’m sure I can come up with more) illustrate the run-to-run variations, which brings me back to questions 1 and 2.
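
As a rough illustration of how channel imbalance like the one in the X2 HR example could be quantified, here is a minimal sketch. It assumes each channel has been measured as sound level in dB at a few frequency bands; the function names and the numbers are hypothetical and do not come from any actual measurement of these headphones.

```python
def channel_imbalance_db(left, right):
    """Per-band level difference (left minus right) in dB."""
    return {band: left[band] - right[band] for band in left if band in right}

def worst_imbalance(left, right):
    """Return the (band, difference) pair with the largest absolute imbalance."""
    diffs = channel_imbalance_db(left, right)
    band = max(diffs, key=lambda b: abs(diffs[b]))
    return band, diffs[band]

# Hypothetical per-band levels in dB SPL for the two drivers of one unit.
left = {100: 92.0, 1000: 90.0, 8000: 88.0}
right = {100: 89.5, 1000: 90.5, 8000: 87.0}

band, diff = worst_imbalance(left, right)
```

The same subtraction works for comparing one unit against another, which is one way to put a number on "acceptable variation" between a review sample and the pair someone actually owns.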

Have you pondered on those?

In the event you will go forward, buy the equipment and start measuring could you please give some insight as to how you will implement and go forward?

Again, I’m all for your endeavor; still, some questions about practical application and usefulness remain open. Meaning, a measurement without PEQ is useless (for me), even taking into account that you measured a sample within manufacturing variations. A 10-band graphic EQ is acceptable. A convolution sample would be ideal if mirrored to the Harman curve (the one I like the most, also for the graphic and PEQ settings). The Harman curve sounds the most natural to MY EARS.

I must confess that I have only briefly read the speaker measurements and don’t follow any discussion or comments, mostly because what you have measured so far is hardly available in Europe… And for the most part, I seldom listen to speakers, living next to a student dormitory and trying to respect their time to study. Any speaker listening is only on weekends (and I take into account whether students are present during weekends to study prior to or during their exams) and during vacation time.

Don’t consider these comments to be critical of what your intent is or of the rig you want to buy; they are just some questions that come to mind. I’m 100% sure the rig you have in mind will measure accurately; the biggest question is how it will compare/translate to the listening experience of the average listener.

Patrick.

kaka89 · Aug 6, 2020

I would like to see more speaker measurement first.

q3cpma · Aug 6, 2020

Not that I'm telling you what to do, but aren't you already very (too) busy with the speakers and electronics?

maverickronin · Aug 6, 2020

q3cpma said:
Not that I'm telling you what to do, but aren't you already very (too) busy with the speakers and electronics?

Yeah. Amir is going to burn out at this rate.
