Resolve's B&K 5128 Headphone Target - you can try the EQ's.....

Resolve · Mar 26, 2023

usern said:
Instead of isolating, you could show the variance by measuring multiple reseats and presenting mean and +-stdev.

Well yes, you generally show all of them. It's just for this particular project, with the HD 650 I wanted one for each common result people report hearing. I will say... REW kind of sucks for showing things well, so we're working on a better way to represent this stuff too.

Resolve · Mar 26, 2023

isostasy said:
I admit I'm still not entirely satisfied with some of the explanations given by resolve, particularly: 'What folks need to realize with headphone measurements is that it’s really not worth caring about the fine-grained minutia, since it’s unlikely to be a predictive element'. But 'fine-grained minutia' is the only thing being addressed by the EQ profile in the treble response of the HD600 and HD650.

It's a fair point. I did go into the fine-grained stuff for this project because I wanted to go as close as possible, even if I don't think folks need to do that for their own personal EQs unless something is way out of whack. With the HD 600, I think bass changes were more recognizable, but the treble changes didn't seem to be that meaningful to people. It's also still unclear how much using an unsmoothed target matters vs say, Harman's 1/2 smoothing. But I will also concede that it's nice to explore the more fine-grained adjustments on the rig and see how that translates to how I hear it on my head. I will say, sometimes it works great, but not all the time, even though things match on the rig - and to me this means we need to have a clearer picture of HpTFs. So I'll be looking into that more.

isostasy · Mar 26, 2023

Resolve said:
It's a fair point. I did go into the fine-grained stuff for this project because I wanted to go as close as possible, even if I don't think folks need to do that for their own personal EQs unless something is way out of whack. With the HD 600, I think bass changes were more recognizable, but the treble changes didn't seem to be that meaningful to people. It's also still unclear how much using an unsmoothed target matters vs say, Harman's 1/2 smoothing. But I will also concede that it's nice to explore the more fine-grained adjustments on the rig and see how that translates to how I hear it on my head. I will say, sometimes it works great, but not all the time, even though things match on the rig - and to me this means we need to have a clearer picture of HpTFs. So I'll be looking into that more.

Thanks for taking the time to reply. I agree the bass changes are definitely the most recognizable, but I understand your intention to get as close as possible a bit more now. I suppose as much as I may not see the point in having such fine-grained adjustment, it makes as little sense not to, given you have the equipment. Thanks for your ongoing dedication to audio as an active hobby rather than just consumerism disguised as a hobby, which is what it often looks like to me.

Robbo99999 · Mar 26, 2023

Resolve said:
It's a fair point. I did go into the fine-grained stuff for this project because I wanted to go as close as possible, even if I don't think folks need to do that for their own personal EQs unless something is way out of whack. With the HD 600, I think bass changes were more recognizable, but the treble changes didn't seem to be that meaningful to people. It's also still unclear how much using an unsmoothed target matters vs say, Harman's 1/2 smoothing. But I will also concede that it's nice to explore the more fine-grained adjustments on the rig and see how that translates to how I hear it on my head. I will say, sometimes it works great, but not all the time, even though things match on the rig - and to me this means we need to have a clearer picture of HpTFs. So I'll be looking into that more.

I think it's worth trying to explore the fine grained aspects first. There could be some good common detail in that between people. There's no point in rounding up or down if you're not sure that you have to! (We know that unit to unit variation, etc., can cloud that to various degrees depending on headphone, but I think it's worth starting out with the fine grained approach.)

EDIT: or you could offer the coarse grained approach at the same time: so Option A the fine grained approach & Option B the coarse grained approach. Might help reveal some stuff if you capture enough data from people. But I don't think you should ignore the fine grained approach at least.

usern · Mar 27, 2023

Nice example from @oratory1990 - multiple reseats measurements with 90% confidence corridor and avg. This is what I'd like to see for headphones sensitive to positional changes, or any headphones really:

https://twitter.com/i/web/status/1516350384650002433

Would be nice to also provide number of measurement samples.
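
For anyone wanting to try this kind of plot on their own reseat data, here is a minimal Python sketch. It is an assumption about how such a corridor could be built, not a description of how oratory1990 computes his: it takes the central 90% of the reseat values at each frequency (empirical percentiles) and reports the sample count alongside. The function names and the example data are hypothetical.

```python
import statistics

def _quantile(ordered, q):
    """Linear interpolation between the closest ranks of a sorted list."""
    pos = q * (len(ordered) - 1)
    i = int(pos)
    if i + 1 >= len(ordered):
        return ordered[-1]
    return ordered[i] + (pos - i) * (ordered[i + 1] - ordered[i])

def reseat_corridor(reseats_db, coverage=0.90):
    """Per-frequency mean and central `coverage` band across reseats.

    reseats_db: list of measurements, each a list of dB values sampled
    on the same frequency grid (an assumption of this sketch).
    """
    lower_q = (1.0 - coverage) / 2.0
    upper_q = 1.0 - lower_q
    mean, low, high = [], [], []
    for values in zip(*reseats_db):  # one tuple per frequency point
        ordered = sorted(values)
        mean.append(statistics.fmean(ordered))
        low.append(_quantile(ordered, lower_q))
        high.append(_quantile(ordered, upper_q))
    # Report n as well, since the corridor means little without it.
    return {"n": len(reseats_db), "mean": mean, "low": low, "high": high}

# Hypothetical data: three reseats, two frequency points.
result = reseat_corridor([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
print(result["n"], result["mean"], result["low"], result["high"])
```

With only a handful of reseats the percentile band is little more than the min/max envelope, which is one more reason to state the number of measurements.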

oratory1990 · Mar 27, 2023

usern said:
And the same question to @oratory1990 - do you present just one measurement in the SPL Frequency Response graphs or is it an average over multiple measurements? What about plotting standard deviation in the same graph?

For the EQ graphs I always (!) make many measurements and calculate the average of them (a spliced average of well-sealing measurements, to get a best-case-scenario but still realistic value for sealing at low frequencies). In between individual measurements I take the headphone off and place it back on the measurement fixture, in a slightly different position. I always take multiple measurements at a central position, plus measurements with the headphone shifted up/down/back/forward.

Yes, plotting the resulting difference from the placement variation is useful, as some headphones show more variation in SPL than others. Some of it is expected (on-ear headphones vary a lot more than over-ear headphones, closed-back headphones vary more than open-back headphones).

I am already planning to incorporate this in a future project that is to be announced this year (hopefully).
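
One possible reading of the "spliced average" described above, as a rough Python sketch: below a splice frequency only the well-sealing reseats are averaged, above it all reseats are. Everything specific here is an assumption for illustration and not oratory1990's actual procedure: the seal criterion (bass level within a tolerance of the best-sealing reseat), the 20 to 60 Hz seal band, the 200 Hz splice point, and the function name.

```python
import statistics

def spliced_average(freqs_hz, reseats_db, splice_hz=200.0,
                    seal_band_hz=(20.0, 60.0), seal_tolerance_db=1.0):
    """Average reseats, using only well-sealing ones below `splice_hz`.

    freqs_hz:   frequency grid shared by all measurements; it must
                contain at least one point inside `seal_band_hz`.
    reseats_db: list of measurements, each a list of dB values.
    """
    band = [i for i, f in enumerate(freqs_hz)
            if seal_band_hz[0] <= f <= seal_band_hz[1]]
    # Bass level of each reseat: mean dB inside the seal band.
    bass = [statistics.fmean([m[i] for i in band]) for m in reseats_db]
    best = max(bass)
    # Hypothetical seal criterion: close to the best-sealing reseat.
    sealed = [m for m, b in zip(reseats_db, bass)
              if best - b <= seal_tolerance_db]
    averaged = []
    for i, f in enumerate(freqs_hz):
        pool = sealed if f < splice_hz else reseats_db
        averaged.append(statistics.fmean([m[i] for m in pool]))
    return averaged

# Hypothetical data: the third reseat leaks and loses bass.
freqs = [30.0, 50.0, 1000.0, 5000.0]
reseats = [
    [10.0, 10.0, 0.0, 2.0],
    [10.5, 10.5, 0.0, 4.0],
    [4.0, 4.0, 0.0, 0.0],
]
print(spliced_average(freqs, reseats))
```

A hard switch at the splice frequency can leave a small step in the curve; a real implementation would probably crossfade between the two averages over a transition band.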

IAtaman · Mar 27, 2023

oratory1990 said:
For the EQ graphs I always (!) make many measurements and calculate the average of them (a spliced average of well-sealing measurements, to get a best-case-scenario but still realistic value for sealing at low frequencies). In between individual measurements I take the headphone off and place it back on the measurement fixture, in a slightly different position. I always take multiple measurements at a central position, plus measurements with the headphone shifted up/down/back/forward.

Is there a well defined protocol for this? If not, do you think it might be possible/there might be value in creating one?

isostasy · Mar 27, 2023

oratory1990 said:
For the EQ graphs I always (!) make many measurements and calculate the average of them (a spliced average of well-sealing measurements, to get a best-case-scenario but still realistic value for sealing at low frequencies). In between individual measurements I take the headphone off and place it back on the measurement fixture, in a slightly different position. I always take multiple measurements at a central position, plus measurements with the headphone shifted up/down/back/forward.

Yes, plotting the resulting difference from the placement variation is useful, as some headphones show more variation in SPL than others. Some of it is expected (on-ear headphones vary a lot more than over-ear headphones, closed-back headphones vary more than open-back headphones).

I am already planning to incorporate this in a future project that is to be announced this year (hopefully).

Did you do this with the HD650 units measured for this graph:

https://twitter.com/i/web/status/1564622387177574403

or was that just an average of single measurements per unit?

I'm wondering this because I'm interested in whether the reality is really as bad as it looks on that graph. If most of the measurements broadly follow that black line, but each shows different small variations all in different places, such that when you plot the confidence interval it looks quite broad, I guess that isn't so bad and it will at least be obvious they're the same headphone model. But if you have one unit which follows the black line, and another that follows the bottom of the confidence interval, another that follows the top of the confidence interval, another still which is near the top of the confidence interval in some places and the bottom in others, those are all going to sound like completely different headphones. This would be quite pertinent to the discussion we were having with resolve earlier I think.

n.b. I'm assuming they're all silver screened HD650/HD6XX and no older versions with black paper or silk screen?

oratory1990 · Mar 28, 2023

IAtaman said:
Is there a well defined protocol for this? If not, do you think it might be possible/there might be value in creating one?

protocol for what? "move the headphone a few mm up"?

oratory1990 · Mar 28, 2023

isostasy said:
Did you do this with the HD650 units measured for this graph:
https://twitter.com/i/web/status/1564622387177574403
or was that just an average of single measurements per unit?

I'm wondering this because I'm interested in whether the reality is really as bad as it looks on that graph. If most of the measurements broadly follow that black line, but each shows different small variations all in different places, such that when you plot the confidence interval it looks quite broad, I guess that isn't so bad and it will at least be obvious they're the same headphone model. But if you have one unit which follows the black line, and another that follows the bottom of the confidence interval, another that follows the top of the confidence interval, another still which is near the top of the confidence interval in some places and the bottom in others, those are all going to sound like completely different headphones. This would be quite pertinent to the discussion we were having with resolve earlier I think.

n.b. I'm assuming they're all silver screened HD650/HD6XX and no older versions with black paper or silk screen?

Taking multiple measurements of one unit ("spatial averaging") is how I obtain the result of one unit.

The graph you linked to shows the results of multiple units averaged (the result of each individual unit obtained in the above way)
That's 21 units of different ages, some of them very old, some of them from 2022.
No observable trend.

those are all going to sound like completely different headphones.

Not completely different but yes, the difference would be audible in ABX tests.

IAtaman · Mar 28, 2023

oratory1990 said:
protocol for what? "move the headphone a few mm up"?

Is that all there is to it, just move it around a few mm and as long as it seals well it is a good measurement? Interesting. I thought there would be a more elaborate, repeatable process to follow, not unlike speaker measurements done with Klippel.

amirm · Mar 28, 2023

IAtaman said:
Amir says in his headphone review preface for example, that he does not do averaging. I am not sure what is his logic.

I explained at length when I first started to measure headphones. Averaging is a type of low pass filter. It is used to gain insight into data that to humans seems random, or hard to quantify. It is also highly sensitive to extreme values (geometric mean is better in this regard).
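
To put rough numbers on the sensitivity to extreme values, here is a small Python sketch with made-up levels (the data and function names are hypothetical). It compares the arithmetic mean of linear magnitudes, their geometric mean, and the median. One point worth noting: averaging the dB values directly is mathematically the same as taking the geometric mean of the linear magnitudes, so a plain average of dB traces is already the geometric mean in that sense.

```python
import math
import statistics

def db_to_linear(level_db):
    """Convert a level in dB to a linear magnitude ratio."""
    return 10.0 ** (level_db / 20.0)

def linear_to_db(ratio):
    """Convert a linear magnitude ratio back to dB."""
    return 20.0 * math.log10(ratio)

def compare_averages(levels_db):
    """Return several 'averages' of the same set of dB readings."""
    linear = [db_to_linear(level) for level in levels_db]
    return {
        "arithmetic_of_linear_db": linear_to_db(statistics.fmean(linear)),
        "geometric_of_linear_db": linear_to_db(statistics.geometric_mean(linear)),
        "mean_of_db": statistics.fmean(levels_db),
        "median_db": statistics.median(levels_db),
    }

# Hypothetical readings at one frequency; the last one is an outlier.
readings_db = [0.0, 0.5, -0.5, 12.0]
for name, value in compare_averages(readings_db).items():
    print(f"{name}: {value:.2f} dB")
```

In this example the arithmetic mean of the linear values is pulled furthest toward the outlier, the geometric mean less so, and the median least of all.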

In the case of headphone measurements, the graph is not hard to understand at all. I give you two instances in stereo measurements. Your brain can easily eyeball what the average of those two is, and you are welcome to average them if you like.

In my view, it is fool's gold to try to get to high accuracy in headphone measurements. Nothing about it is precise. Targets are averaged. Fixtures comply with some average. Position variations, part variations, etc. all work to make the actual resolution of the data far lower than 100%.

The measurements give us a guide to follow and confirm. This is what I do with EQ testing and listening tests in tandem. I deviate from measurements as needed to get pleasant sound.

A key goal of the target seems to be lost in all of this: in some ways, it doesn't matter what the target is. We just need one. Not five, but one. If every headphone complied with it, both in production of music and consumption, then we as consumers can EQ to taste and be done. With multiple people chasing some target with different fixtures, we lose this. For this reason, I am disappointed to see a couple of reviewers jumping on 5128 bandwagon. Why on earth would you do this? Is it some kind of race to keep up with head-fi? Why on earth would you adopt a fixture that research shows needs a well researched target to produce correct target? Makes no sense at all.

IAtaman · Mar 28, 2023

amirm said:
In the case of headphone measurements, the graph is not hard to understand at all. I give you two instances in stereo measurements. Your brain can easily eyeball what the average of those two is, and you are welcome to average them if you like.

You are right, averaging on its own is not unlike putting a low pass filter on measurement results, however the point of multiple measurements is not to get to an average, it is to calculate a std dev and upper & lower control limits so one can have a better understanding of the "sensitivity of the headphone's FR to the placement and manufacturing tolerances" in my opinion, which I understand is better in some headphones than in others, and is an important quality of the product. If I am not at least somewhat confident that I am getting a product close to what you measured then those measurements are not gonna do me a lot of good. Am I missing a point?
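
A minimal Python sketch of the std dev and control-limit idea above, assuming each measurement is a list of dB values on a shared frequency grid. The limits are taken as mean ± k·σ per frequency; the choice of k, the function name, and the example data are assumptions for illustration.

```python
import statistics

def control_limits(measurements_db, k=3.0):
    """Per-frequency mean, sample std dev, and mean +/- k*sigma limits.

    measurements_db: list of measurements (reseats and/or units), each a
    list of dB values on the same frequency grid. Needs at least two.
    """
    rows = []
    for values in zip(*measurements_db):  # one tuple per frequency point
        mean = statistics.fmean(values)
        sigma = statistics.stdev(values)  # sample standard deviation
        rows.append({
            "mean": mean,
            "stdev": sigma,
            "lower": mean - k * sigma,
            "upper": mean + k * sigma,
        })
    return rows

# Hypothetical data: three measurements, two frequency points.
for row in control_limits([[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]], k=2.0):
    print(row)
```

The width of the band at each frequency is then a direct read-out of how sensitive that region is to placement or unit variation, separate from whatever the mean curve looks like.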

solderdude · Mar 28, 2023

I believe that's what some now try to achieve (make a target for 5128).
That target may not have the same 'Harman type bass shelf' as that was based on a filter that was used during testing of bass level preference.
The 5128 does have a 'closer to human ear simulation' so kind of does make sense.

I agree on the headphone measurements being merely indicative and not absolute. That makes 5128 measurements just slightly different 'indicative' and not necessarily more accurate at being indicative. Also on the EQ generation based on a single plot (regardless of how it was obtained).
People seem to believe it is the only proper way.
I like that Oratory also uses some listening and notes where one should adjust to taste and am convinced he knows about how 'accurate' headphone measurements are in reality.

The leakage plots, which I have been doing for a long time, seem a valuable addition, and so does the new reporting (with the confidence band).
Have been thinking about making a 'standard' kind of thing for leakage but in reality there are also differences in head shape and skin thickness/compliance and also pad wear (or softening at least)

It is a lot of money for a fixture, with associated gear connected to it, to get a slightly more accurate (at least that does seem to be the promise) acoustical load and perhaps better indicative plots?

IAtaman · Mar 28, 2023

solderdude said:
The leakage plots I have been doing for a long time and seems a valuable addition

I find them to be very useful since I have stopped pretending I can see without glasses and started to wear them more often.

markanini · Mar 28, 2023

amirm said:
Why on earth would you do this? Is it some kind of race to keep up with head-fi? Why on earth would you adopt a fixture that research shows needs a well researched target to produce correct target? Makes no sense at all.

To me it makes total sense, it's about clouding objective stats for new product launches by making it hard to compare to previous measurements. I'd go as far as speculating the reviewers got help with the funding of the new measurement rigs for such reasons. A few reviewers like Crinacle are upfront about being associated with brands and retailers, but most aren't, so when they keep you in the dark it's perplexing.

MayaTlab · Mar 28, 2023

amirm said:
For this reason, I am disappointed to see a couple of reviewers jumping on 5128 bandwagon. Why on earth would you do this? Is it some kind of race to keep up with head-fi? Why on earth would you adopt a fixture that research shows needs a well researched target to produce correct target? Makes no sense at all.

While I am not expecting major new insights using the 5128 for over-ears, for IEMs this is a different story. It can quickly become a bit of a moot point whether Harman's IEM target is well researched or not if 711 couplers introduce inaccuracies when, for example, comparing active IEMs with a feedback mechanism and passive ones.

In both cases however we'll still be limited to testing headphones on a singular fixture and not on a system that happens to reliably and repeatedly reproduce the sort of variation we can expect on a cohort of real humans (and then leaves us to ponder whether or not these variations are desirable, at least at higher frequencies), which in my opinion is by and large the main issue (with over-ears at the least), alongside sample variation / wear.

usern · Mar 28, 2023

amirm said:
I explained at length when I first started to measure headphones.

I would suggest making this information more visible. The latest headphone reviews do not reference the methodology, nor is it pinned here, and I could not find it in the site headers.

Robbo99999 · Mar 28, 2023

amirm said:
I explained at length when I first started to measure headphones. Averaging is a type of low pass filter. It is used to gain insight into data that to humans seems random, or hard to quantify. It is also highly sensitive to extreme values (geometric mean is better in this regard).

In the case of headphone measurements, the graph is not hard to understand at all. I give you two instances in stereo measurements. Your brain can easily eyeball what the average of those two is, and you are welcome to average them if you like.

In my view, it is fool's gold to try to get to high accuracy in headphone measurements. Nothing about it is precise. Targets are averaged. Fixtures comply with some average. Position variations, part variations, etc. all work to make the actual resolution of the data far lower than 100%.

The measurements give us a guide to follow and confirm. This is what I do with EQ testing and listening tests in tandem. I deviate from measurements as needed to get pleasant sound.

A key goal of the target seems to be lost in all of this: in some ways, it doesn't matter what the target is. We just need one. Not five, but one. If every headphone complied with it, both in production of music and consumption, then we as consumers can EQ to taste and be done. With multiple people chasing some target with different fixtures, we lose this. For this reason, I am disappointed to see a couple of reviewers jumping on 5128 bandwagon. Why on earth would you do this? Is it some kind of race to keep up with head-fi? Why on earth would you adopt a fixture that research shows needs a well researched target to produce correct target? Makes no sense at all.

I think there's potentially an argument with regards to "neutrality" re the B&K 5128. I think it's possible that if the B&K is a closer match in anatomy than GRAS then it's possible to create a more neutral target than on GRAS. Yes, it's hard to get down into the weeds when talking about "neutrality" in headphones, and the ultimate in that quest might be something like the Smyth Realizer calibrated to some really great reference speakers in a well setup room or studio, but I think there's something to be gained. I don't have any issues with people like Resolve trying to use the B&K and to come up with a good target for it. I look at it from my own point of view - do the resultant EQ's ultimately sound better than the GRAS Harman ones. I know you can tweak GRAS Harman EQ's and it's expected that you tweak things like bass and perhaps other areas in a broad way to meet your own best sound, but I can imagine that there could be parts of the frequency response that may not be optimised as a starting point if it's based on anatomy that doesn't closely mimic an average human. Additionally I think there could be some value in retaining the resolution of the target curve (not too heavily smoothed), which I understand Resolve has done by using the diffuse field measurement of the B&K and then applying a Harman style room slope to it.

In my experience I've enjoyed Harman EQ's based on GRAS, and I've also enjoyed the Resolve B&K Target - they are a bit different, but if any headphone manufacturer was looking to follow either one of those curves then they would still be pretty good sounding headphones, certainly a lot more than the conventional "wild west" of headphone frequency response that is often talked about. I personally welcome the spirit of endeavour that Resolve is showing in creating & testing that B&K Target.

Robbo99999 · Mar 28, 2023

Resolve added one more headphone to the B&K EQ list today:

Meze Audio Elite with leather pads

(first post of this thread updated)
