
ASR Headphone Testing and BK 5128 Hats Measurement System

OP
amirm


Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,720
Likes
241,538
Location
Seattle Area
What ear did you measure (or is this the average?)? Is this a single measurement or an average?
Thanks a bunch! That was great work you did there.

This was a quick test. I did measure both ears and response is generally the same. Repeatability was also very good in a bit of playing I did. The pinna is incredibly lifelike and supple.

I will capture more measurements to quantify this today.

Question remains on what is the truth here? How do we determine that? There are much cheaper rigs if we can't build confidence in higher frequencies.
 
OP
amirm


Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,720
Likes
241,538
Location
Seattle Area
Sure, but I think it's still a good overview of the headphone measurements topic and issues.
I watched it a while ago. It was useful but not for someone who already knows the topic. :)
 

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
7,007
Likes
6,874
Location
UK
At long last, the unit arrived! Spent the last couple of hours trying to reformat the calibration files to something AP software can accept.

Here is the raw response of the HD-650:

View attachment 77842

It seems to generally agree with other measurements out there until about 5 kHz or so. How trustworthy the rest is, is the question.

I took the free-field correction from 5128 (0 degree elevation and azimuth) which looks like this:

View attachment 77844

I inverted it, used it as EQ and remeasured to get the green curve:

View attachment 77843

That peak around 9 kHz is not like anything shown with other gear. There may be pilot error but this is what I have in the first 2 hours of messing with it. :)

Comments welcome.
Awesome that it's up & running! No time for me today to contribute or do any analysis, but tomorrow I'll read others' posts on it & contribute any thoughts I deem worthy when I have time. Good to see it up & running!
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,068
Likes
36,481
Location
The Neitherlands
Question remains on what is the truth here? How do we determine that?

I assume your speakers are pretty good sounding and have the proper tonal balance. I assume 'flat response' as in 'including room compensation' so max realism in the sound. I am pretty confident you know realism in sound.
May I suggest playing music (or pink noise) and trying to equalize (by hand) the HD650 to sound near identical to your speaker(s).
Then run a measurement (or a bunch with slightly different positions) and look at the raw plot.
Then see if you can create a compensation curve to get a 'horizontal' line as a result (or close to it) and use that as a compensation curve.
The next challenge is to apply that curve to other headphones and verify with EQing these acc. to the plots and hear if these too sound close to the speakers.
This is going to consume a lot of time though. I also did this to calibrate my rig but only with HD650. The T50RP I often used back then has a lot more pinna activation which I lack but the HATS does not. I know my rig can be 'off' between 1kHz and 6kHz or so. I usually check with Rtings for pinna effects or look at Oratory to see how far I am 'off' in that area and adjust EQ if needed in that band but not outside.
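A rough sketch of the bookkeeping behind this procedure, assuming all curves are in dB on a shared frequency grid. The function names and all numbers are made up for illustration; they are not measurements from any rig.

```python
from statistics import mean

def average_curves(curves):
    """Average several dB curves (e.g. slightly different seatings) point by point."""
    return [mean(points) for points in zip(*curves)]

def apply_compensation(raw_db, compensation_db):
    """Subtract the compensation curve from a raw measurement, in dB."""
    return [r - c for r, c in zip(raw_db, compensation_db)]

# Hypothetical raw plots of the HD650 after it was equalized by ear to match the speakers.
freqs_hz = [100, 1000, 3000, 6000, 10000]
seatings = [
    [0.0, 0.5, 9.0, 4.0, -2.0],
    [0.2, 0.3, 10.0, 6.0, -4.0],
    [-0.2, 0.7, 11.0, 5.0, -3.0],
]

# The averaged raw plot is what "sounds like the speakers" looks like on this rig,
# so it is used as the compensation curve.
compensation_db = average_curves(seatings)

# A single seating of the reference headphone should now plot close to a horizontal line.
flattened = apply_compensation(seatings[0], compensation_db)
for f, level in zip(freqs_hz, flattened):
    print(f"{f:>6} Hz  {level:+.1f} dB")
```

The same `compensation_db` would then be subtracted from raw plots of other headphones, which is the step that still has to be verified by ear.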

I suspect/believe that a combination of both methods is likely to produce more accurate 'exact' EQ than trusting one method only.
 

Mad_Economist

Addicted to Fun and Learning
Audio Company
Joined
Nov 29, 2017
Messages
555
Likes
1,621
At long last, the unit arrived! Spent the last couple of hours trying to reformat the calibration files to something AP software can accept.

Here is the raw response of the HD-650:

View attachment 77842

It seems to generally agree with other measurements out there until about 5 kHz or so. How trustworthy the rest is, is the question.

I took the free-field correction from 5128 (0 degree elevation and azimuth) which looks like this:

View attachment 77844

I inverted it, used it as EQ and remeasured to get the green curve:

View attachment 77843

That peak around 9 kHz is not like anything shown with other gear. There may be pilot error but this is what I have in the first 2 hours of messing with it. :)

Comments welcome.
The fairly high Q feature in the treble is definitely a compensation artifact here - that high Q notch in the 0/0 FFHRTF is one of the reasons free field does not sound right to people, although it's dodgy at a number of levels.
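To illustrate how a narrow notch in the compensation curve becomes a narrow peak in the compensated plot, here is a toy sketch. The curve shapes and numbers are invented, not 5128 data. The only point is that compensated = raw - compensation in dB, so a notch that the headphone's raw response does not share shows up inverted.

```python
import math

def notch_db(freq_hz, center_hz, depth_db, width_octaves):
    """Gaussian-shaped dip (in dB) on a log-frequency axis; purely illustrative."""
    distance = math.log2(freq_hz / center_hz)
    return -depth_db * math.exp(-0.5 * (distance / width_octaves) ** 2)

freqs_hz = [1000 * 2 ** (i / 6) for i in range(25)]  # 1 kHz to 16 kHz, 1/6-octave steps

# Hypothetical free-field compensation with a narrow 12 dB notch at 9 kHz.
compensation_db = [notch_db(f, 9000, 12.0, 0.1) for f in freqs_hz]

# Hypothetical headphone whose raw response is smooth (flat here) through that region.
raw_db = [0.0 for _ in freqs_hz]

compensated_db = [r - c for r, c in zip(raw_db, compensation_db)]

peak_db = max(compensated_db)
peak_hz = freqs_hz[compensated_db.index(peak_db)]
print(f"largest compensated feature: {peak_db:+.1f} dB near {peak_hz:.0f} Hz")
```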


Of course my cheap, pinna- and ear-canal-less, totally incorrect and not trustworthy measurements of what comes from the driver tell a different story.
I'm not here to re-litigate our past dialogues regarding the omission of pinnae and canal impedance, but I must point out that this isn't right. If you want to measure "what comes from the driver", you'd need to measure it in free field. Any coupled measurement is a measurement of the system formed by the driver and the front chamber of the headphone (and its contents) - leaving the ear out doesn't change that, it just makes the environment dissimilar to how it will behave with a human listener.

Is the new HATS Amir is testing what they were talking about ?
I believe that will be discussing GRAS' high-res ear simulators and anthropomorphic pinnae, which are based on the IEC711/60318-4 standard, unlike the new ear sims in the 5128.


Question remains on what is the truth here? How do we determine that? There are much cheaper rigs if we can't build confidence in higher frequencies.
I would generally assume that "truth" in this case would take the form either of "closing the circle of confusion" type approaches of "measurement approximates subjective perception of frequency response", or the "consumer utility" approach of "measurement shows deviation from most likely to be preferred response"; notably, Olive's work generally indicates that there's a substantial overlap between those two...

What are you looking for here? Your data looks within the bounds of what I'd expect thus far, but I'm not sure what makes this a worthwhile endeavor from your standpoint...
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,068
Likes
36,481
Location
The Neitherlands
I'm not here to re-litigate our past dialogues regarding the omission of pinnae and canal impedance, but I must point out that this isn't right. If you want to measure "what comes from the driver", you'd need to measure it in free field.

I actually did that as well. Aside from the massive roll-off in the lows the treble part was remarkably similar but obviously not the same. In any case it did not give the massive (up to 20dB) differences we see in HATS.
I even had a lengthy discussion with someone who wanted to combine free field for the treble and closed for the lower frequencies. He couldn't make that work and combine them properly.
I know it does not conform to any standards and the pinna is essential (it is up to 6kHz anyway) but sweeping and white noise tests compared to speakers showed me the peaks that some HATS show are not there in reality. At least I do not hear it and high frequencies correlate better to my (also corrected) treble measurements. So... I stick to that.

I hope Amir also tries to correlate what he hears with what he measures and compares this to standards. I think you will have to agree that the vast spread between HATS above 6kHz is caused by the HATS, the headphone itself and the compensation used.
How can this be 'trusted' ?
 

Mad_Economist

Addicted to Fun and Learning
Audio Company
Joined
Nov 29, 2017
Messages
555
Likes
1,621
I actually did that as well. Aside from the massive roll-off the treble part was remarkably similar but not the same. In any case it did not give the massive (up to 20dB) differences we see in HATS.
I even had a lengthy discussion with someone who wanted to combine free field for the treble and closed for the lower frequencies. He couldn't make that work and combine them properly.
I know it does not conform to any standards and the pinna is essential (it is up to 6kHz anyway) but sweeping and white noise tests compared to speakers showed me the peaks that some HATS show are not there in reality. At least I do not hear it and high frequencies correlate better to my (also corrected) treble measurements. So... I stick to that.

I hope Amir also tries to correlate what he hears with what he measures and compares this to standards. I think you will have to agree that the vast spread between HATS above 6kHz is caused by the HATS, the headphone itself and the compensation used.
How can this be 'trusted' ?

I have measured drivers in free field as well - it's an interesting opportunity to look at their behavior "ceteris paribus", and IMO quite worthwhile in the process of driver selection and design. The congruence of free field measurements and a coupled measurement omitting a pinna, however, will depend in part on the construction of the front volume of the headphone (inclusive of the earpads and any space between the driver's front output and the pads) - I would imagine that if you measured in free field with the earpads, you would get something closer to what an earless flat plate shows, but this of course would have the same issues.

Variation in response above 1khz or so has a few main components:
* positional/placement variation with the headphone itself, which generally rises as a function of frequency from the midrange, but is more extreme for some models than others
* when different units of a given headphone are used, the variation between the units in question - this isn't just a question of manufacturing tolerances, even small differences in pad wear can be quite substantial in frequency response impacts
* variation in the compensation applied (ex. all HRTFs will differ in the >3khz band, so if one plot is DF compensated and another is free field 0-0, you will see very large differences from this alone), including the way that the compensation data was derived (a 1/3oct smoothed population average free field 0-0 HRTF isn't going to spit out the same output as an individualized 1/12oct smoothed one)
* variation in the HRTF of the measurement systems, although in premise barring "individual hrtf-headphone interactions" this should be cancelled by the use of a standardized compensation that is specific to the measurement system in question.
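As a sketch of the smoothing point in the third bullet, here is a simple fractional-octave smoother, assuming dB values on a log-spaced frequency grid. It averages the dB values inside a 1/N-octave window, which is a simplification (other implementations average power instead), and the input curve is synthetic.

```python
import math

def smooth_fractional_octave(freqs_hz, levels_db, fraction):
    """Average levels within +/- half of a 1/fraction-octave window around each point."""
    half_width = 0.5 / fraction + 1e-9  # octaves; small tolerance for grid-edge rounding
    smoothed = []
    for center in freqs_hz:
        window = [
            level
            for f, level in zip(freqs_hz, levels_db)
            if abs(math.log2(f / center)) <= half_width
        ]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Synthetic curve: flat with a narrow 10 dB dip at 8 kHz, on a 1/24-octave grid.
freqs_hz = [1000 * 2 ** (i / 24) for i in range(97)]  # 1 kHz to 16 kHz
levels_db = [-10.0 if i == 72 else 0.0 for i in range(97)]

third = smooth_fractional_octave(freqs_hz, levels_db, 3)
twelfth = smooth_fractional_octave(freqs_hz, levels_db, 12)

print(f"dip after 1/3 oct smoothing:  {min(third):.2f} dB")
print(f"dip after 1/12 oct smoothing: {min(twelfth):.2f} dB")
```

The wider window spreads the same narrow feature over more points, so two compensation curves derived from identical data but smoothed differently will not cancel a high-Q feature by the same amount.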

Human HRTFs have a very broad spread in the 4+khz band, so looking at raw responses, it is unsurprising when variation occurs in this area; indeed, the mannequins we use have lower variation due to their intention as population averages. Consider the spread observable in 40 human DF-HRTFs from Hammershøi & Møller's Design Criteria for Headphones (1995):
H&M 1995 HRTF.png

Human head/ear anatomy is simply rather variable, yielding differing HRTFs for different people in the same circumstances... including, as their other neat paper from 1995 shows, headphones:
H&M 1995 HpTF.png


Somewhere up-thread I'd written about my complaints regarding subjective matching between different apparent acoustic sources, so I'll leave that aside for the moment, but I have to ask, what are we to compare to with "what we hear" even at the best of times? Ignore the SLD effect and all that, my speakers - or Amir's speakers, or what have you - are the definitive reference for headphones? But what's the reference for my speakers? I'm rehashing Olive, as always, here, but it has a lot of pertinence when you're thinking about what we hear, what we measure, and how they "should" relate.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,068
Likes
36,481
Location
The Neitherlands
I know there is tons of research done on this subject.
From the plots and reasons you mentioned (which I looked into as well), from a simple engineer's pragmatic viewpoint and not from a scientist's viewpoint, it is very clear to me (and to you) that both methods have their drawbacks and inaccuracies.
Yet, when looking at measurements made on a HATS (which should encounter the same hurdles as a human head) I find the smoothness of even unsmoothed plots above 5kHz to be more in line with what I perceive than the plots seen with some HATS.
Yes, I am aware of all the research and not claiming it is nonsense. In my pragmatic approach I simply found more correlation.

I am curious to see what's coming out of the HATS Amir is reviewing but expect this to be more of the same.

Of course you are correct about references. Speakers in rooms are no real reference of course. If the goal is to 'mimic' reference speakers in a room then I would say a headphone should sound like a good pair of speakers in a room.
My method is a pragmatic one that achieves this but relies on the listener to know what noise should sound like, experience as it will.

So as Amir is an experienced listener, has a good set of speakers and access to measurement equipment, but no access to the Harman standard room nor Olive's correction file (which I suspect he has), you have to do 'something', and this is what 'worked' for me using the HD650, which Amir happens to own and is familiar with. So I proposed this, and if Amir sees this mere proposal as having some merit he can try it to find 'a' reference tied to what Amir already has as a reference. At least I expect Amir has his speaker setup set up for enjoyment of music, not as a lab.
Amir wondered about a reference and, while it is not scientifically sound, I gave him my opinion.

I have no papers, research or scientific formulas to share. I looked at the evidence around us (which is obvious and documented), listened to music and noise, compared them and drew my own conclusion.
I have been told many times what I do is wrong. I don't mind. It has a high correlation to me and is what started this for me, so I can create EQ that works for me.
 

JohnYang1997

Master Contributor
Technical Expert
Audio Company
Joined
Dec 28, 2018
Messages
7,175
Likes
18,300
Location
China
The 650 response looks acceptable. It needs a little bit of adjustment in the eyes, that's it. But what about in-ears? That's where the issue lies.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,068
Likes
36,481
Location
The Neitherlands
Fortunately for me I hate in-ears so never looked into that can of worms.
Insertion depth, seal, ear canal shapes.

Personally I don't believe measurement systems will ever be created that work within 1dB of reality with all headphones, no matter how much research is done. Solution... audition headphones. Listen for what suits you and feels O.K. and enjoy. Adjust to your taste.
Doable, practical, can be done by everyone and leads to satisfactory results.
Measurements of all sorts can give some 'guidance' but no measurement should ever be regarded as 'objectively correct' just because it is measured to 'a standard'. As said, measurements help as a guide but above 5kHz should be taken with a shitload of salt.
 
OP
amirm


Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,720
Likes
241,538
Location
Seattle Area
FYI we have been working behind the scenes non-stop trying to develop a target curve and get to some basic level of measurements that make sense. As soon as we converge to that, I will start to post results.
 

vkvedam

Addicted to Fun and Learning
Forum Donor
Joined
Apr 12, 2019
Messages
583
Likes
807
Location
Coventry, UK
As an alternative I recommend just getting a Headacoustics head. It will be consistent with Tyll's and Rtings' measurements.
Hehe, this we use for our tests and assessments for the cabin noise :), I was going to suggest that :D
 

Mad_Economist

Addicted to Fun and Learning
Audio Company
Joined
Nov 29, 2017
Messages
555
Likes
1,621

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
7,007
Likes
6,874
Location
UK
I have measured drivers in free field as well - it's an interesting opportunity to look at their behavior "ceteris paribus", and IMO quite worthwhile in the process of driver selection and design. The congruence of free field measurements and a coupled measurement omitting a pinna, however, will depend in part on the construction of the front volume of the headphone (inclusive of the earpads and any space between the driver's front output and the pads) - I would imagine that if you measured in free field with the earpads, you would get something closer to what an earless flat plate shows, but this of course would have the same issues.

Variation in response above 1khz or so has a few main components:
* positional/placement variation with the headphone itself, which generally rises as a function of frequency from the midrange, but is more extreme for some models than others
* when different units of a given headphone are used, the variation between the units in question - this isn't just a question of manufacturing tolerances, even small differences in pad wear can be quite substantial in frequency response impacts
* variation in the compensation applied (ex. all HRTFs will differ in the >3khz band, so if one plot is DF compensated and another is free field 0-0, you will see very large differences from this alone), including the way that the compensation data was derived (a 1/3oct smoothed population average free field 0-0 HRTF isn't going to spit out the same output as an individualized 1/12oct smoothed one)
* variation in the HRTF of the measurement systems, although in premise barring "individual hrtf-headphone interactions" this should be cancelled by the use of a standardized compensation that is specific to the measurement system in question.

Human HRTFs have a very broad spread in the 4+khz band, so looking at raw responses, it is unsurprising when variation occurs in this area; indeed, the mannequins we use have lower variation due to their intention as population averages. Consider the spread observable in 40 human DF-HRTFs from Hammershøi & Møller's Design Criteria for Headphones (1995):
View attachment 77964
Human head/ear anatomy is simply rather variable, yielding differing HRTFs for different people in the same circumstances... including, as their other neat paper from 1995 shows, headphones:
View attachment 77968

Somewhere up-thread I'd written about my complaints regarding subjective matching between different apparent acoustic sources, so I'll leave that aside for the moment, but I have to ask, what are we to compare to with "what we hear" even at the best of times? Ignore the SLD effect and all that, my speakers - or Amir's speakers, or what have you - are the definitive reference for headphones? But what's the reference for my speakers? I'm rehashing Olive, as always, here, but it has a lot of pertinence when you're thinking about what we hear, what we measure, and how they "should" relate.
That's some interesting data re the measured variance of human subject HRTF (Fig 5). I find the standard deviation graph to be one of the most useful there, as it's showing the bracket within which about 68% of the population would fall (assuming a roughly normal distribution). Up to 3kHz it's showing that 68% of humans would fall within only +/- 1dB of the Target Response (mean). So for sure you can say HATS in terms of HRTF are valid up to 3kHz. From 4-7kHz you've got about +/-2.5dB variation, and above 7kHz you've got about +/-5dB variation. So taking that all into consideration, I would think measurements are meaningless above 7kHz unless individuals' responses above 7kHz are a predictable/consistent "Shelf dB" above or below the mean, in which case a person could experiment with Shelf EQs above & below the calculated EQ (e.g. Oratory1990)??

My previous paragraph just addresses HRTF calculated from "speakers in a room", but then you've got the HpTF that you mentioned on top of that as an additional variable and source of variance within a population. This is what you were showing in Fig 6, I think. Regarding variance of HpTF as seen in those graphs, it seems to be about the same level of variance as there is with HRTF, following the same pattern of greater variance at the higher frequencies.

So given we have two sources of variance, both HRTF [(calculated from "speakers in a room") which is used to create any Target Frequency Response we use for headphone EQ] and HpTF, which shows the variance of a given headphone's Frequency Response for any individual, where does that leave us in terms of how accurate EQs based on measurements on Dummy Heads can be? I suppose combining the two variances of HRTF and HpTF further magnifies the overall error. I suppose you could work out mathematically how much that variance increases when you combine them, given that the variance of each variable seems about equal - would it double the overall variance or is it something like a 1.5 factor? EDIT: although in a previous post you said you've found HpTF and HRTF are often linked, so that would indicate that when combining the variance of HpTF & HRTF the overall variance is not as large as one would initially think. What kind of increased variance factor would it be, do you think?
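On the arithmetic question: if the two deviations are expressed in dB and simply add, the standard rule for the sum of two random variables applies. A short sketch, where the 2.5 dB figure is just the rough 4-7 kHz spread read off the plot and the correlation values are illustrative assumptions:

```python
import math

def combined_sd(sd_a, sd_b, correlation=0.0):
    """Standard deviation of A + B: var = sd_a^2 + sd_b^2 + 2*rho*sd_a*sd_b."""
    variance = sd_a ** 2 + sd_b ** 2 + 2.0 * correlation * sd_a * sd_b
    return math.sqrt(max(variance, 0.0))

sd_hrtf = 2.5  # dB, rough figure for 4-7 kHz (assumption taken from the plot reading)
sd_hptf = 2.5  # dB, assumed equal to the HRTF spread

for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    total = combined_sd(sd_hrtf, sd_hptf, rho)
    print(f"correlation {rho:+.1f}: combined SD {total:.2f} dB ({total / sd_hrtf:.2f}x)")
```

With independent, equal spreads the variance doubles but the standard deviation (the +/- dB bracket) grows by sqrt(2), about 1.41x. If the two are linked, the factor can land anywhere between 0x and 2x depending on the sign and strength of the correlation, which this sketch leaves as an input.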

Going back to an observation I made in the first paragraph that most variance is above 7kHz, can we use Shelf EQs above 7kHz to manipulate EQs that have been created on dummy heads to experiment ourselves as to what sounds most accurate, or does an individual's ear frequency response above 7kHz not follow the general trends at all, so that the "Shelf EQ technique" above 7kHz holds no water?
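For anyone wanting to try this, a high-shelf filter is easy to sketch. The code below uses the widely circulated RBJ Audio EQ Cookbook high-shelf formulas and evaluates the magnitude response. The 7 kHz corner, the +/-3 dB gains and the 48 kHz sample rate are arbitrary choices for illustration, not recommendations.

```python
import cmath
import math

def high_shelf_coeffs(corner_hz, gain_db, sample_rate_hz, slope=1.0):
    """Biquad high-shelf coefficients (RBJ cookbook form), normalized so a0 = 1."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * corner_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2 * math.sqrt((amp + 1 / amp) * (1 / slope - 1) + 2)
    k = 2 * math.sqrt(amp) * alpha
    b0 = amp * ((amp + 1) + (amp - 1) * cos_w0 + k)
    b1 = -2 * amp * ((amp - 1) + (amp + 1) * cos_w0)
    b2 = amp * ((amp + 1) + (amp - 1) * cos_w0 - k)
    a0 = (amp + 1) - (amp - 1) * cos_w0 + k
    a1 = 2 * ((amp - 1) - (amp + 1) * cos_w0)
    a2 = (amp + 1) - (amp - 1) * cos_w0 - k
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]

def gain_db_at(freq_hz, b, a, sample_rate_hz):
    """Magnitude of the biquad at one frequency, in dB."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20 * math.log10(abs(numerator / denominator))

SAMPLE_RATE_HZ = 48000
for shelf_gain in (-3.0, 3.0):
    b, a = high_shelf_coeffs(7000, shelf_gain, SAMPLE_RATE_HZ)
    row = ", ".join(
        f"{f} Hz: {gain_db_at(f, b, a, SAMPLE_RATE_HZ):+.2f} dB"
        for f in (1000, 7000, 16000)
    )
    print(f"shelf {shelf_gain:+.0f} dB -> {row}")
```

A shelf only moves the whole upper band up or down, so it can test the "consistent offset" idea but cannot follow narrow individual peaks and dips.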

Again, a lot of the validity of headphone measurements along with the validity of Target Responses will come down to how far an individual's physical anatomy varies from the dummy standard as a whole, although it seems pretty darn reliable & indisputable up to 2kHz.

As for the validity of the B&K 5128 and the validity of the ASR headphone project... I think we have to accept that there are all these variances within the population that we've talked about here, and if we're gonna do it then it would have to be something above & beyond what is offered on other sites. If the B&K 5128 itself is not inherently more "accurate" than the equipment being used by other sites then that's not a differentiation point either... cursory evaluation seems to suggest the B&K 5128 is offering very similar results to other sites for the HD650 (I've not looked at this in detail, just gone off other members' good posting in this thread). So any differentiation from other sites will come down to what we do with the measurements in terms of interpretations related to comparative quality between different headphones (that might come under things like distortion & other measured variables incl frequency response), and perhaps of course we can also offer a headphone EQ service in terms of offering filters for people... I think we gotta do something different if the B&K 5128 itself isn't proving to be "next gen" or "anything special".
 

vkvedam

Addicted to Fun and Learning
Forum Donor
Joined
Apr 12, 2019
Messages
583
Likes
807
Location
Coventry, UK

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
7,007
Likes
6,874
Location
UK
I have them but I am not sure if they are supposed to be confidential or not. Will ask.
If they're providing the data in the form of a graph, then I would imagine the raw data is not confidential because the graph is showing the exact same information....just the raw data is more convenient & accurate for usage in calculations. Hopefully they'll say it's ok for us to use the raw data so that we can have a go at calculating a "Paul Struck Harman Frequency Response Target", which Mad_Economist has enabled for us via his spreadsheet.
 

JohnYang1997

Master Contributor
Technical Expert
Audio Company
Joined
Dec 28, 2018
Messages
7,175
Likes
18,300
Location
China
Thanks. I found a few of those. It is interesting that there is no standardization on scales or aspect ratios. This makes them hard to compare.

I will try the diffused field next. The one I looked at is much more gentle and doesn't have the big troughs.

On the raw measurements, should we not get agreement to a few Kilohertz with other HATS???
There are standards, but few follow them.

In terms of targets, diffuse field + house curve is the way to go. You can just apply the diffuse field target then add an overlay of the house curve target to compare, just like a speaker's in-room estimate graph.
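A minimal sketch of that overlay, assuming both curves are in dB on the same frequency grid. The diffuse-field, house-curve and raw values below are placeholders, not data for any particular HATS or any published target.

```python
def add_curves(curve_a_db, curve_b_db):
    """Combine two dB curves defined on the same frequency grid."""
    return [x + y for x, y in zip(curve_a_db, curve_b_db)]

def deviation_from_target(raw_db, target_db):
    """How far a raw measurement sits above (+) or below (-) the target, in dB."""
    return [r - t for r, t in zip(raw_db, target_db)]

freqs_hz = [50, 200, 1000, 3000, 8000]
diffuse_field_db = [0.0, 0.0, 1.0, 12.0, 6.0]  # placeholder DF response of the rig
house_curve_db = [4.0, 1.5, 0.0, -1.0, -3.0]   # placeholder bass lift and treble tilt
raw_db = [2.0, 1.0, 1.5, 9.0, 5.0]             # placeholder raw headphone measurement

target_db = add_curves(diffuse_field_db, house_curve_db)
error_db = deviation_from_target(raw_db, target_db)

for f, t, e in zip(freqs_hz, target_db, error_db):
    print(f"{f:>5} Hz  target {t:+.1f} dB  deviation {e:+.1f} dB")
```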

As for the HATS' response: we can only be certain to have agreement up to 500 Hz. After 500 Hz the coupler's response starts to kick in and after 1 kHz the pinna starts to show. It's really about interpretation and experience. Description and demonstration on top of the bare response are very important: which peak, dip or bump is representative of what you hear, and which is not. And when we get used to reading the graph on certain equipment everything becomes much clearer.
 