
Mad_Economist Canjam Presentation - These are the Dark Ages

Please see Sean's multiple-regression model for the impact of direct sound; I wasn't pulling that number out of a hat.
You have misunderstood the nature of that modelling. The variables in the preference rating have cross-dependency. Predicted in-room response for example includes on-axis performance.

The modelling was also a brief exercise on one set of speakers. It is simply not possible to grind down a bunch of frequency response measurements into a single number. It is for this reason that I don't use the preference metric in any of my reviews (speakers or headphones). And Harman never used it for the development of its own speakers either.

Instead of going by formulas, it is best to understand the psychoacoustics, which is what I stated and which you can read from Dr. Toole, for example in this post of his:

Separating tasks. It turns out that the direct sound does more than define direction. It becomes the initiation of the precedence effect, in which later arrivals are audible but do not substantially modify the localization cue. It turns out also that the direct sound is a major factor in determining sound quality (see part 3 of the Sean Olive experiments described in Section 5.7.3, p. 140 in the 3rd edition). It seems to be a reference against which later arrivals of the same sound are compared.

From the earliest experiments I did, the principal identifier of a "neutral" loudspeaker was its on-axis - direct sound - performance. If the off-axis performance was also smoothly non-resonant the ratings were even higher.

As far as I am concerned, headphone measurements on the GRAS-45CA using the Harman target give us the equivalent of the "anechoic on-axis" response of speakers. Neither is everything, but both go quite far in determining good sound.
 
"Two different listeners may hear a given headphone meaningfully differently"

Yeah, that's a bit like saying that two people may see different colors on the same perfectly calibrated computer monitor (and they might). In the science of colorimetry they solved most of these problems with solutions such as the "standard observer", which is nothing but a *reasonable* model of what an actual observer might see. This "standard observer" would tell you the "truth", which is not an absolute or "universal" truth, but rather an objective one. Which is what science actually cares about. Many people are confused by this.

And just like that, you don't care anymore if someone thinks the white color doesn't look white. Because you got objective standards that eliminate many variables and subjectivities that are not really very meaningful nor useful. That's how, I think, we got out of many dark ages.
This is in response to the bit on HRTF, I assume? I'd generally agree, that's why it was the last and shortest (memory serving) section of the presentation.
 
I'm particularly interested in your views regarding the emergent interaction of headphones and ears, and the visualization of positional variation in headphone response - I'd be interested in them.
You answered your own question in your presentation. As I have repeatedly said, unless proper listening tests are performed, these ideas thrown at the marketplace simply serve to confuse enthusiasts and give manufacturers a reason to do whatever random thing they want. We must go after a playback standard. It doesn't need to be perfect. Users will have to EQ to make it as perfect as they can. But it can't be the wild west.

In the last few years, reviewers/influencers seem to be in a race to differentiate from each other rather than have a proper view of where the industry needs to go. Folks buy expensive fixtures first, then think of what they are going to do with them.
 
You have misunderstood the nature of that modelling. The variables in the preference rating have cross-dependency. Predicted in-room response for example includes on-axis performance.
Well, yes, tautologically, in-room response depends on axial response, along with every other axis. To some degree, this gets us into pedantic territory - my point is that the principal components of preference can be reduced to axial response, directivity, and low-frequency extension, in roughly that order of impact on preference. However, I do not believe that we have data to suggest that any of the three is a majority (>50%) of the total.

I think you really undercredit the value of models in specifically testing assumptions specifically like that one, though - being able to quantify the effect of the constituent parts beyond vague generalization is very useful, and it's good to be able to catch ourselves when we are making unsupported assumptions.
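For concreteness, the speaker preference model under discussion is usually cited in roughly this form (Olive, 2004); below is a minimal Python sketch, assuming the four FR-derived metrics have already been computed elsewhere:

```python
# Sketch of the published Olive (2004) loudspeaker preference model as it
# is commonly cited; the input metrics (NBD, LFX, SM) are assumed to be
# precomputed from the Spinorama curves elsewhere.

def olive_preference_rating(nbd_on: float, nbd_pir: float,
                            lfx: float, sm_pir: float) -> float:
    """Predicted preference rating from four frequency-response metrics.

    nbd_on  -- narrow-band deviation of the on-axis response
    nbd_pir -- narrow-band deviation of the predicted in-room response
    lfx     -- low-frequency extension (log10 of the -6 dB point)
    sm_pir  -- smoothness (r^2) of the predicted in-room response
    """
    return 12.69 - 2.49 * nbd_on - 2.99 * nbd_pir - 4.31 * lfx + 2.32 * sm_pir
```

Note that the cross-dependency is visible right in the terms: NBD_PIR and SM_PIR are computed from the predicted in-room response, which itself incorporates the on-axis curve, so the four contributions are not independent.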

I naturally agree with Dr. Toole's statements here, but I'm not sure I see how they're contradictory to my own commentary - I'm not arguing that axial flatness is smaller than DI (although I'd earnestly love to see a really well controlled test of exactly which matters more and under what circumstances), but rather that a model based on it is incomplete.

As far as I am concerned, headphone measurements on the GRAS-45CA using the Harman target give us the equivalent of the "anechoic on-axis" response of speakers. Neither is everything, but both go quite far in determining good sound.
Interestingly, I think we agree almost completely on the first sentence here. I say "almost" because Olive 2018 shows that multiple very non-Harman headphones were comparably scored to Harman, which would not cohere with preference for speaker axial response as I understand it. I'm a bit confused by the follow-up, though - you wouldn't be happy with a reviewer measuring solely free-field on-axis speaker response, right?

As I have repeatedly said, unless proper listening tests are performed, these ideas thrown at the marketplace simply serve to confuse enthusiasts and give manufacturers a reason to do whatever random thing they want.
This strikes me as hard to reconcile with what I'm talking about in the acoustic Z and positional variation sections - we don't actually need a listening test to verify those effects, because...they directly change FR. Which we know, by the Olive research, to be the dominant factor in listener preference by far. Like, if anything, these would impose strictures on manufacturers' habit of producing headphones that can't couple on most people's heads, which is a legitimately growing issue in the headphone space.
 
Can't believe how adamant amirm is in this specific discussion.
Can't believe how easy to understand Mad_Economist is.
This feels like a rather emotional discussion.
How come?
 
The short version of the presentation (IMO) is:
HP measurements and perceived sound don't always go hand in hand because of several factors:
  • HP measurements on different fixtures give different squiggles for several reasons
  • Human heads differ from those of the few 'standard' fixtures that are available
  • Some fixtures are better suited than others for OE/IE measurements
  • Seal and positioning matters both on fixtures as well as actual heads
  • There is no single correct squiggle, so any EQ based on a single squiggle is not going to be 'perfect' either, but it can bring the headphone closer to some chosen target as measured on a specific fixture in a specific circumstance (the one that created the squiggle for that headphone). At least it can provide a starting point to work from.
Displaying tolerance bands is a good idea, but for positional variance a band is only relevant when it is shown for that specific headphone (i.e. built from squiggles measured in various positions).
Displaying average tolerance bands is also a good idea, but only relevant to the particular fixture they were measured on, and depending on personal circumstances the real variation could be worse than that.
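As a concrete illustration of a per-headphone tolerance band, one could compute it from re-seat measurements like this (a minimal numpy sketch; the array contents below are made up for illustration):

```python
import numpy as np

# Sketch: a per-frequency tolerance band from repeated measurements of the
# SAME headphone in several positions on one fixture. `measurements_db` is
# a hypothetical (n_positions, n_freqs) array of responses in dB.

def tolerance_band(measurements_db: np.ndarray):
    """Return (lower, upper) dB bounds across positional re-seats."""
    lower = measurements_db.min(axis=0)
    upper = measurements_db.max(axis=0)
    return lower, upper

# Example: three re-seats of one headphone sampled at four frequency points.
seats = np.array([[0.0, -1.0, 2.0, 0.5],
                  [0.5, -2.0, 1.0, 0.0],
                  [-0.5, -1.5, 3.0, 0.2]])
lo, hi = tolerance_band(seats)
```

Plotting `lo` and `hi` as a shaded region around the published squiggle would communicate positional sensitivity for that one headphone on that one fixture, which is exactly the scope in which such a band is valid.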

Doing a review based on how a headphone 'sounds' to a certain individual is equally as suspect as measurements are, but at least measurements add some measured data, even if those measurements don't coincide with other measurements.

Headphones are just like people. They all differ and interact differently.
There still is no real standard, nor will there ever be one, for the above reasons. ('There can be only one' does not apply.)
Science is fun to certain people, it might not be to others.
Science progresses. It is good to see progress in headphone (measurement) science, but it will not lead to consensus.
People need education but have to be 'open' for it.
When people have formed an opinion it might be hard to convince them to look at things in a different way.

I agree this does not invalidate the Harman research; it shows what was already shown in that research but is not highlighted enough.
The Harman target merely suits the majority of people, not all people.

Then there is the other issue that can also change the sound: the interaction between the electrical output impedance of the source and the electrical impedance of the headphone.
Then there is product variance (not all drivers are created equal).
 
I'm happy to field questions/comments/angry mobs here if folks have thoughts on this!
Thanks for the seminar. I'm by far not an audio (technical) specialist :facepalm: but I had some questions.

I listen to the Sony WH-1000XM2. I use oratory1990's measured FR. I enter this measurement into Wavelet because I can scale the corrected FR in Wavelet, which it seems to do linearly (see link). In most cases 100% is just a bit too much correction; subjectively (to me) it becomes a bit too shrill. At 63%, however, it starts to sound better to me. I often change the % when I listen to different music/recordings so it will match what I like. Overall, a more or less Harman curve comes close to what I like.

For practical reasons, isn't this a simple, elegant way (the Wavelet functionality) to individually correct for too much or too little FR compensation, or are there other methods/apps that do a similar job? A custom measurement seems very expensive to me, and over time measurements can change due to wear, age, and other variables.
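For what it's worth, the "percentage" idea can be sketched numerically. Wavelet's exact interpolation isn't documented here, so this assumes a simple linear blend in dB:

```python
import numpy as np

# Sketch of the "EQ strength" idea: apply only a fraction of the full
# correction toward a target. This assumes a linear blend in dB, which is
# an assumption about how such a slider works, not Wavelet's documented math.

def partial_correction_db(measured_db, target_db, strength):
    """Gain (dB) to apply at each frequency, for a strength in [0, 1]."""
    measured_db = np.asarray(measured_db, dtype=float)
    target_db = np.asarray(target_db, dtype=float)
    return strength * (target_db - measured_db)

# At 63% strength, a 10 dB excess is only cut by 6.3 dB.
gain = partial_correction_db([10.0], [0.0], 0.63)
```

Under this model, dialing the strength back simply shrinks every correction proportionally, which matches the subjective experience of "less shrill at 63% than at 100%" when the full correction overshoots for an individual.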
 
In the last few years, reviewers/influencers seem to be in a race to differentiate from each other rather than have a proper view of where the industry needs to go
This is extremely astute and may represent a fundamental flaw with the social media age. To get likes we have to be different. Consensus does not lead to popularity.
 
Harman should be the end goal for manufacturers, not for consumers. I'm guessing that a lot of confusion around the Harman target and the spawning of more and more target curves is a result of not understanding the difference between manufacturer goals and consumer goals.

They are not exactly the same (even though a large overlap can be expected) and cannot be expected to match exactly unless we advance enough in science and technology to create automatic, accurate adjustments during use. In the meantime, Harman is still the most reasonable end goal for manufacturers, and consumers need to accept that they have to adjust for their individual parameters if they want the best experience possible.
 
Headphones are just like people. They all differ and interact differently.
There still is no real standard nor will there ever be one for the above reasons. (There can only be one does not apply)

The Harman target merely suits the majority of people, not all people.

It is not the job of science to validate all the subjective perceptions and circumstance of all people. But it seems that a lot of people in this hobby want just that.

I mean, you could go on and on. Your perception of how a headphone sounds might significantly depend on the previous headphone you have been listening to for a week prior. That's something that will not show up in the frequency response. The psychoacoustic effects of loudness, the design, branding and marketing of the headphone. You broke up with your girlfriend, now the iem she gave you sounds terrible. Whatever.

But that's precisely why you have to have objective standards, to get rid of all that noise. Why are people afraid to do this? I just don't get it. Because the alternative "Trust me bro" is far worse than whatever flaws we could find with actual standards.
 
It is not the job of science to validate all the subjective perceptions and circumstance of all people. But it seems that a lot of people in this hobby want just that.

I mean, you could go on and on. Your perception of how a headphone sounds might significantly depend on the previous headphone you have been listening to for a week prior. That's something that will not show up in the frequency response. The psychoacoustic effects of loudness, the design, branding and marketing of the headphone. You broke up with your girlfriend, now the iem she gave you sounds terrible. Whatever.

But that's precisely why you have to have objective standards, to get rid of all that noise. Why are people afraid to do this? I just don't get it. Because the alternative "Trust me bro" is far worse than whatever flaws we could find with actual standards.
There's a mix-up between subjective factors and actual anatomical differences that makes it hard for measurement gear to accurately predict individual frequency responses. Personally, I think understanding real variability factors is way more useful than creating strawmen and turning the discussion into a team sport.

Honestly, just reminding ourselves of scientific standards should be enough—they don't change much. Besides, why get so worked up about it? You'll just end up looking more irrational. After all, cool heads always prevail.
 
It is not the job of science to validate all the subjective perceptions and circumstance of all people. But it seems that a lot of people in this hobby want just that.
Of course not. Nor is that the goal of the measurements or the talk. Science can show and quantify that, though.
Your perception of how a headphone sounds might significantly depend on the previous headphone you have been listening to for a week prior.
Yes, having a good reference is important when evaluating a headphone by ear.

That's something that will not show up in the frequency response.
Indeed, but while this should be obvious to reviewers, it may not be to the average headphone user trying different models. It is not related to measurements, which are what the talk was about.

The psychoacoustic effects of loudness, the design, branding and marketing of the headphone. You broke up with your girlfriend, now the iem she gave you sounds terrible. Whatever.
Yep, but this is not related to measurements, just perception (the brain part). It cannot possibly be 'captured', though one can of course research aspects of it.
Loudness, for one thing, is mapped pretty well.

But that's precisely why you have to have objective standards, to get rid of all that noise.
You need standards to ensure measurements are done according to a certain method using test equipment compliant with that standard, so that measurements are comparable.

The problem with standards is that there are multiple standards and on top of that multiple targets and on top of that multiple confounding factors (seal, seating).

Why are people afraid to do this? I just don't get it.
Well... one could adopt one standard and conform to it. Amir, for instance, uses a 'standard' fixture with a certain coupler and pinna and uses the Harman target (which was not built for exactly that fixture, but close enough). That should result in measurements that correlate with the majority of users, but also means not everyone.
By choosing a single standard and target, you at least get compliance with that standard and target, and your measurements are comparable to others' measurements done on the same standard and target.
The issue is... not everyone agrees that this specific standard and target is the one with the best 'fit'.

With the arrival of newer and potentially more 'accurate' (to an actual ear canal) fixtures such as the 5128, the 'standard' would have to be figured out again.


Because the alternative "Trust me bro" is far worse than whatever flaws we could find with actual standards.
Yep, agreed; when someone says 'trust me' one should put on the skeptic hat immediately.
Fortunately that isn't the case here, as @Mad_Economist does not say 'trust me' anywhere in the video.
 
Lots of interesting info in this presentation. Thanks
And it is thanks to these kinds of discussions that we get to enjoy significantly better-tuned IEMs and headphones nowadays, at much lower prices. A far better landscape than a few years ago. The legacy of the research and products by the likes of Harman and Etymotic is awesome.
 
@Mad_Economist Thanks for the interesting presentation. It's a shame people in the audience felt the need to interrupt you so often.

I wonder how big an issue positional variation really is. If you put HP2 on your head and heard no bass you would adjust the positioning until you did. I assume people who are unable to do this, because of glasses for example, would simply decide early on HP2 will not be right for them. Surely that is a caveat to the main judgement, which should still be how well it broadly follows the Harman OE 2018 target?

Similarly, if a headphone sounds different on different heads, surely the main judgement is still how well it broadly follows the Harman OE 2018 target? In other words, is a headphone that has low rHpTF variation but poor compliance with the Harman OE 2018 target still not likely to be rated worse than a headphone with higher rHpTF variation but good compliance with the Harman OE 2018 target?

I can imagine an example where this isn't the case, which would be if a headphone measures exceptionally well (good compliance with target) but for some reason interacts with real people's heads so differently that it sounds nothing like expected on anyone's head. I don't know of any real examples of this.

Are these not essentially ergonomic issues we already know about and should hold in mind when designing/reviewing/choosing headphones?

On the other hand, I think I totally agree with your observations of IEMs in Section 6 but have no input besides guiltily preferring my Etymotic ER2XR EQed to Sennheiser IE200, thinking it sounds more like my HD6XX and speakers, but not understanding why.
 
@Mad_Economist Thanks for the interesting presentation. It's a shame people in the audience felt the need to interrupt you so often.
Honestly, @oratory1990 and I regularly do presentations together, and I appreciated his input. The main issue is that, for the recording, he was not picked up well. I'll get a crowd mic next time if possible!

I wonder how big an issue positional variation really is. If you put HP2 on your head and heard no bass you would adjust the positioning until you did. I assume people who are unable to do this, because of glasses for example, would simply decide early on HP2 will not be right for them.
This one is kind of a known unknown - we don't have a super robust body of data for how variable in situ FR is based on positioning on the same listener's head as a function of time. We can quantify the range of variation on a dummy head, or more ideally we can measure the in situ behavior on real human heads (MIRE), but we're getting an approximation.

I will say, I think you slightly overestimate how responsive people are to both of the types of variation that we see in situ (broadband bass level variation, and narrow band treble peaks and dips) - certainly, these are audible, but particularly if it's only one channel that has a "big" variation, you'd be surprised by what people will put up with and not even consciously identify.

Surely that is a caveat to the main judgement, which should still be how well it broadly follows the Harman OE 2018 target?
There's a separate presentation to be done on this topic, but if we look at the actual subjective preference ratings from Olive, Welti, & Khonsaripour 2018, while tracking with Harman OE2018 is unambiguously good, several headphones which very much did not track with OE2018 were scored comparably well.

Similarly, if a headphone sounds different on different heads, surely the main judgement is still how well it broadly follows the Harman OE 2018 target? In other words, is a headphone that has low rHpTF variation but poor compliance with the Harman OE 2018 target still not likely to be rated worse than a headphone with higher rHpTF variation but good compliance with the Harman OE 2018 target?
In a world where EQ didn't exist, sure? In the world of EQ, however, we can quite easily make broadband adjustments to correct failings which are visible on generic fixtures (excessive treble response, insufficient bass, etc), whereas individual-specific variations would require either in situ measurements on the individual (realistically, not happening for 99.99% of users) or correcting by ear (pretty bad at fixing narrow-band issues). Like, I kinda think that the proof is in the pudding of the headphones @Sean Olive chose for the experiments here: They are designs less likely to vary in situ on users for various reasons.

I can imagine an example where this isn't the case, which would be if a headphone measures exceptionally well (good compliance with target) but for some reason interacts with real people's heads so differently that it sounds nothing like expected on anyone's head. I don't know of any real examples of this.
By dint of anecdote, a number of high-Z closed designs have reports like this. This includes both cheap designs like the K371 and expensive ones like the Stealth. This is something I'd like to thoroughly document in a test which monitors real time in situ FR on human heads using dual channel FFTs, but I don't expect there to be any big surprises: the likely result is "people who dislike the Stealth/Expanse/371/K550/etc are getting different in-situ response than the mannequin response".
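The dual channel FFT approach mentioned here can be sketched as a standard H1 transfer-function estimate between the reference signal and an in-situ microphone; the signal names and block size below are illustrative, not from any particular rig:

```python
import numpy as np

# Sketch of a dual-channel FFT measurement: estimate the transfer function
# H(f) between a reference signal x (what the headphone is fed) and an
# in-situ microphone signal y, via the averaged-cross-spectrum ("H1")
# estimator H1(f) = Sxy(f) / Sxx(f). Block size is an illustrative choice.

def h1_estimate(x, y, nfft=256):
    """Average cross- and auto-spectra over non-overlapping blocks."""
    n_blocks = len(x) // nfft
    sxx = np.zeros(nfft // 2 + 1)
    sxy = np.zeros(nfft // 2 + 1, dtype=complex)
    for i in range(n_blocks):
        seg = slice(i * nfft, (i + 1) * nfft)
        X = np.fft.rfft(x[seg])
        Y = np.fft.rfft(y[seg])
        sxx += (X.conj() * X).real
        sxy += X.conj() * Y
    return sxy / sxx  # complex H(f); np.abs(H) is the magnitude response

# Toy check: y is x delayed one sample and halved, so |H| should be ~0.5.
rng = np.random.default_rng(0)
x = rng.standard_normal(8192)
y = 0.5 * np.roll(x, 1)
H = h1_estimate(x, y)
```

The appeal for this use case is that music itself can serve as the excitation `x`, so in-situ response drift (re-seating, seal changes) could be tracked in real time without interrupting the listener.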

Are these not essentially ergonomic issues we already know about and should hold in mind when designing/reviewing/choosing headphones?
I mean, I guess you can frame them as ergonomic, but they're measurable acoustic effects which are, at present, not widely measured. Like, even among the real grognards, I know a lot of people who don't know that open headphones have more consistent bass response than closed designs, and while it's something we sometimes make reference to in reviews, we seldom actually, you know, quantify it. Which sticks in my craw, because it is quantifiable, and it's a quantifiable effect which can impact human preference for headphones, and those should be quantified.
 
Honestly, @oratory1990 and I regularly do presentations together, and I appreciated his input. The main issue is that, for the recording, he was not picked up well. I'll get a crowd mic next time if possible!


This one is kind of a known unknown - we don't have a super robust body of data for how variable in situ FR is based on positioning on the same listener's head as a function of time. We can quantify the range of variation on a dummy head, or more ideally we can measure the in situ behavior on real human heads (MIRE), but we're getting an approximation.

I will say, I think you slightly overestimate how responsive people are to both of the types of variation that we see in situ (broadband bass level variation, and narrow band treble peaks and dips) - certainly, these are audible, but particularly if it's only one channel that has a "big" variation, you'd be surprised by what people will put up with and not even consciously identify.
At the risk of sounding facetious, if the changes are usually not noticeable, isn't this a sign they're not significant enough to matter?

There's a separate presentation to be done on this topic, but if we look at the actual subjective preference ratings from Olive, Welti, & Khonsaripour 2018, while tracking with Harman OE2018 is unambiguously good, several headphones which very much did not track with OE2018 were scored comparably well.
Why is this significant or even unexpected, though? If we want the sound of headphones to resemble the sound of speakers in a room, we want our definition of good headphone sound to somewhat resemble Harman.

In a world where EQ didn't exist, sure? In the world of EQ, however, we can quite easily make broadband adjustments to correct failings which are visible on generic fixtures (excessive treble response, insufficient bass, etc), whereas individual-specific variations would require either in situ measurements on the individual (realistically, not happening for 99.99% of users) or correcting by ear (pretty bad at fixing narrow-band issues). Like, I kinda think that the proof is in the pudding of the headphones @Sean Olive chose for the experiments here: They are designs less likely to vary in situ on users for various reasons.
Agreed, but what target are you using as a basis for EQ? ;)

By dint of anecdote, a number of high-Z closed designs have reports like this. This includes both cheap designs like the K371 and expensive ones like the Stealth. This is something I'd like to thoroughly document in a test which monitors real time in situ FR on human heads using dual channel FFTs, but I don't expect there to be any big surprises: the likely result is "people who dislike the Stealth/Expanse/371/K550/etc are getting different in-situ response than the mannequin response".
This sounds worthwhile to me.

I mean, I guess you can frame them as ergonomic, but they're measurable acoustic effects which are, at present, not widely measured. Like, even among the real grognards, I know a lot of people who don't know that open headphones have more consistent bass response than closed designs, and while it's something we sometimes make reference to in reviews, we seldom actually, you know, quantify it. Which sticks in my craw, because it is quantifiable, and it's a quantifiable effect which can impact human preference for headphones, and those should be quantified.
Sure, but isn't Harman still the starting point? All these potential negatives, e.g. seal issues and in-situ variation, are significant to the extent that they cause the measured frequency response to diverge from a target... which I see no good reason to be anything other than Harman OE 2018.
 
At the risk of sounding facetious, If the changes are usually not noticeable isn't this a sign they're not significant enough to matter?
There are many things which impact subjective preference but which people are really bad at subjectively describing. Level matching is one of them; for untrained listeners, FR variations are another.

Why is this significant or even unexpected, though? If we want the sound of headphones to resemble the sound of speakers in a room, we want our definition of good headphone sound to somewhat resemble Harman.
If responses other than the Harman target are equally preferred, why do we want our headphones to sound like "speakers in a room" instead of...anything else that's equally preferred? I mean, for some listeners, there were significant preference advantages for other responses over Harman.

Agreed, but what target are you using as a basis for EQ? ;)
I mean, ideally you'd start with a "well-scoring" target set and do some blind testing yourself - something analogous to Sean's IEM experiment where the Soundguys target and his new Harman IEM target were about tied, even though their FRs were the least similar in the test set.

Pragmatically, Harman is a fine starting point. Equally, if you feel inclined to play with filters, the body of the work implies you could start at an HRTF (DF, simulated in-room, whatever) and play with adjusting the bass and treble to taste.
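That "shelves to taste" recipe can be sketched as follows; the shelf shape is a generic first-order blend, and the corner frequencies and default gains are illustrative assumptions, not any published target's exact values:

```python
import numpy as np

# Sketch of "start from a DF-style curve and adjust bass/treble shelves to
# taste". The shelf shape is a generic smooth first-order blend in dB; the
# corner frequencies and gains below are illustrative assumptions only.

def shelf_db(freqs_hz, corner_hz, gain_db, kind="low"):
    """Smooth shelf in dB: approaches full gain well past the corner."""
    f = np.asarray(freqs_hz, dtype=float)
    x = (corner_hz / f) ** 2 if kind == "low" else (f / corner_hz) ** 2
    return gain_db * x / (1.0 + x)

def personal_target(freqs_hz, df_curve_db, bass_gain_db=6.0,
                    treble_gain_db=-2.0):
    """A DF-style curve plus adjustable bass and treble shelves (in dB)."""
    return (np.asarray(df_curve_db, dtype=float)
            + shelf_db(freqs_hz, 105.0, bass_gain_db, "low")
            + shelf_db(freqs_hz, 2500.0, treble_gain_db, "high"))

# Example on a flat (zero-dB) reference at three spot frequencies:
# deep bass is lifted, midrange barely moves, treble is eased down.
freqs = [20.0, 1000.0, 20000.0]
t = personal_target(freqs, [0.0, 0.0, 0.0])
```

The point of parameterizing it this way is that the listener only turns two intuitive knobs (bass and treble shelf gain) on top of a fixed anchor curve, rather than free-handing a full parametric EQ.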

Sure, but isn't Harman still the starting point? All these potential negatives, e.g. seal issues, in-situ variation, are significant in the extent to which they cause the measured frequency response to diverge from a target ...which I don't see good reason to be anything other than Harman OE 2018.
I think that there are arguments against Harman as the singular baseline which mostly rest on the fact that some quite disparate responses can achieve similar preference ratings, meaning that we may need a better model than "doesn't match this line" to predict sound quality. However, you're correct: none of this presentation is in any respect an attempt to suggest that the Harman target, when it is actually achieved on the listener's head, is subjectively non-preferable. Like, if anything, my default recommendation to a person who bought headphones and has the patience to do some experimental EQ would be to start with Harman, then play with the shelf filters to get the general sound they want, then maybe get weird with peak filters if they think there's something specifically bugging them.
 