Master Complaint Thread About Headphone Measurements

Robbo99999 · May 22, 2023

RafaelV said:
I discussed with Jaakko already about this. And had a look at the source code. For example the Chu has a huge treble dip around 10-12kHz, and large sub-bass deviation, but the regression they used only uses the region from 40Hz to 10kHz, so the result has very limited validity.

Yep, don't obsess on the Preference Score numbers, it's far more elucidating looking at the raw frequency response graph vs the target. The Preference Score number had to be created in order to provide somekind of statistical reference for their research, but it's not more elucidating than actually just looking at the raw frequency response itself & comparing it to the target, that's a lot more useful.

Robbo99999 · May 22, 2023

RafaelV said:
Poll: Which one does the regression score point model (https://github.com/jaakkopasanen/AutoEq/blob/master/results/RANKING.md#in-ear-headphones) sort out as more Harman compliant? Tripowin Olina or Moondrop Variations? Please try to give arguments for your choice.
View attachment 287397

I mean I think the Harman boys themselves said that not to compare anything within 7 Preference Points of each other, and those are closer than that, but even they don't think you should pay too much attention to the Preference Score alone. And both those frequency responses are close if you wanted to just get an impression by looking at the raw measurements. I think considering they're so close and have balanced imperfections in each then I'd have to listen to both to decide which I liked best, I couldn't decide just on the basis of that graph, although I suspect both would sound better with EQ.

Robbo99999 · May 22, 2023

RafaelV said:
Well, considering the lower bass below 100 Hz, the "mud" region between 100 and 400 Hz and the wild swings between 8 and 15 kHz of the Tripowin Olina, I think is is easy to make a strong point which one will sound better and even closer to Harman.

Not necessarily, I like the way the Olina fills in the 4-6kHz better, which might offset some of it's other negatives, and the 8-15kHz you reference is a less reliable area to judge, and regardless 8-15kHz looks ok to me in Olina. But you see, as we said the raw frequency response graph is so much more useful than just the Preference Score.

markanini · May 22, 2023

Robbo99999 said:
I mean I think the Harman boys themselves said that not to compare anything within 7 Preference Points of each other

It's also said that everything above 90 is basically equivalent. That leaves 83 as the cutoff point, which Variation falls below.

Robbo99999 · May 22, 2023

markanini said:
It's also said that everything above 90 is basically equivalent. That leaves 83 as the cutoff point, which Variation falls below.

Pfft, yeah, Preference Score, it's not needed to assess a headphone, it was needed during the research to slap a number on it to show a correlation, but it's far more valuable just to see the raw frequency response vs the target curve.....which a lot of us realise. Yep, we don't need to get hung up on the Preference Score.

markanini · May 23, 2023

Robbo99999 said:
Pfft, yeah, Preference Score, it's not needed to assess a headphone, it was needed during the research to slap a number on it to show a correlation, but it's far more valuable just to see the raw frequency response vs the target curve.....which a lot of us realise. Yep, we don't need to get hung up on the Preference Score.

Likewise, the interpretation of graphs alone isn't enough. That, including subjective impressions, are each tools to help evaluate a product. And, it's always good to know the limitations of each tool.

DualTriode · May 23, 2023

Robbo99999 said:
Pfft, yeah, Preference Score, it's not needed to assess a headphone, it was needed during the research to slap a number on it to show a correlation, but it's far more valuable just to see the raw frequency response vs the target curve.....which a lot of us realise. Yep, we don't need to get hung up on the Preference Score.

Hello @Robbo99999 and All,

My thought process came up with different words.

If you add up a large sample size of headphone users, say 1000, and average their personal headphone curves, 680 of those headphone users would say that the Harman curve is good enough. That is the research that Harman did. The sample size was statistically significant however smaller than n := 1000.

That is the Harman curve.

A Single headphone user may not know or care about the preference score. We all as individual headphone users, some of us more than others hold the firmly held belief that our preference is special and has little or no relationship to the Harman Curve. We are each a sample size of N := 1.

A marketing concept; A pair of Stealth headphones with a personal custom equalization curve for $10,000 a pair.

Thanks DT

Robbo99999 · May 24, 2023

DualTriode said:
Hello @Robbo99999 and All,

My thought process came up with different words.

If you add up a large sample size of headphone users, say 1000, and average their personal headphone curves, 680 of those headphone users would say that the Harman curve is good enough. That is the research that Harman did. The sample size was statistically significant however smaller than n := 1000.

That is the Harman curve.

A Single headphone user may not know or care about the preference score. We all as individual headphone users, some of us more than others hold the firmly held belief that our preference is special and has little or no relationship to the Harman Curve. We are each a sample size of N := 1.

A marketing concept; A pair of Stealth headphones with a personal custom equalization curve for $10,000 a pair.

Thanks DT

No one's saying the research is not valid, it is & I'm an advocate for it, but as a means of assessing the frequency response of a headphone in a headphone review then the single number that the Preference Score generates is not particularly useful, instead it's far more useful just to visually see the raw frequency response vs the Harman Target. Perhaps not so much if you're a newbie to measurements, but if you're used to seeing headphone measurements then the Preference Score is not particularly useful.

markanini · May 24, 2023

Robbo99999 said:
No one's saying the research is not valid, it is & I'm an advocate for it, but as a means of assessing the frequency response of a headphone in a headphone review then the single number that the Preference Score generates is not particularly useful, instead it's far more useful just to visually see the raw frequency response vs the Harman Target. Perhaps not so much if you're a newbie to measurements, but if you're used to seeing headphone measurements then the Preference Score is not particularly useful.

It's difficult as a human, to detect very broad tilts, to get a picture of the overall spectral balance. Especially when simultaneously accounting for the deviations over narrower ranges that jump out at you, something which most sets have. You would have to be a savant, or a naive optimist to think you can always do that better by eye. That's what makes Harman preference score one of the useful tools for someone like me that, probably spent hundreds of hours looking and measurements.

Robbo99999 · May 24, 2023

markanini said:
It's difficult as a human, to detect very broad tilts, to get a picture of the overall spectral balance. Especially when simultaneously accounting for the deviations over narrower ranges that jump out at you, something which most sets have. You would have to be a savant, or a naive optimist to think you can always do that better by eye. That's what makes Harman preference score one of the useful tools for someone like me that, probably spent hundreds of hours looking and measurements.

Yes, I'm pretty sure I can do better by eye than paying attention to the Preference Score number, I'd far rather see the frequency response, the Preference Score doesn't interest me when it comes to headphone reviews. (There's been numerous posts both probably in this thread & elsewhere here on ASR where it is highlighted the limitations of the Preference Score when it comes to headphone reviews, it can't show the full picture.)

markanini · May 24, 2023

Robbo99999 said:
Yes, I'm pretty sure I can do better by eye than paying attention to the Preference Score number, I'd far rather see the frequency response, the Preference Score doesn't interest me when it comes to headphone reviews. (There's been numerous posts both probably in this thread & elsewhere here on ASR where it is highlighted the limitations of the Preference Score when it comes to headphone reviews, it can't show the full picture.)

Why not simply use all tools that are helpful, instead of fixating on one? This black and white view doesn't shine a positive light on what you are are trying to express as your opinion. If anything it looks like a textbook example of the Dunning–Kruger effect https://en.wikipedia.org/wiki/Dunning–Kruger_effect

Loomynarty · May 24, 2023

markanini said:
It's difficult as a human, to detect very broad tilts, to get a picture of the overall spectral balance. Especially when simultaneously accounting for the deviations over narrower ranges that jump out at you, something which most sets have. You would have to be a savant, or a naive optimist to think you can always do that better by eye. That's what makes Harman preference score one of the useful tools for someone like me that, probably spent hundreds of hours looking and measurements.

In what sense do you mean it's difficult to detect very broad tilts? According to Floyd Toole and Sean Olive, wideband frequency response tilts are by far the easiest things to hear in blind testing. It's actually the exact opposite of what you say. Narrow Q changes are very difficult to detect, not broad Q changes. So if you have the means to compare two headphone graphs to the Harman target and one has a broadband tilt of say -2 dB from 20hz-20khz, it will most certainly sound warmer or darker than the one that doesn't have a tilt.

CedarX · May 24, 2023

Loomynarty said:
In what sense do you mean it's difficult to detect very broad tilts? According to Floyd Toole and Sean Olive, wideband frequency response tilts are by far the easiest things to hear in blind testing. It's actually the exact opposite of what you say. Narrow Q changes are very difficult to detect, not broad Q changes. So if you have the means to compare two headphone graphs to the Harman target and one has a broadband tilt of say -2 dB from 20hz-20khz, it will most certainly sound warmer or darker than the one that doesn't have a tilt.

Is it a case of “blind testing” vs. “in isolation”? In my modest experience, broad tilts are easy to spot when comparing two HPs (rapidly swapping them, not an actual blind test), but difficult to spot on any particular HP I listen to for, say an hour: I get used to it, and my ears accommodates to whatever broad response tilt it may have.

xnor · May 24, 2023

All I can say is that after spending 30 mins just using my ears, my test signals and a parametric EQ the result will sound significantly better to my ears than stock, regardless of how good the measurements are looking. In fact, I am quicker when I don't look at detailed measurements beforehand*.
I'm using sweeps for detecting peaks and dips and imbalances between channels; stationary signals for larger peaks and dips all the way up to overall balance. As for the type of waveforms I use mostly bandlimited noise, sinusoidal waves, but even something like a triangle wave can come in handy e.g. for quickly checking absolute polarity if you have no mic handy.

One could also exclusively use music for this, but I found that using simple, stable signals tailored for each band of interest to me more effective.

*) Virtually nobody has a head and ears that match the models based on averages, both on the measurement and evaluation ("target curves") side. Plus there's tolerances, errors. So when you look at generic measurements and you see a peak at X kHz and then you cut at that frequency, you'll most likely do the wrong thing.
And even if you don't do that, you will be biased by thoughts like "but the measurement showed too much energy in this area, so I should have to cut here".

A more objective way to do this is by using in-ear mics, but these come with their own set of problems and may not be applicable at all for some types of headphones.

Robbo99999 · May 24, 2023

markanini said:
Why not simply use all tools that are helpful, instead of fixating on one? This black and white view doesn't shine a positive light on what you are are trying to express as your opinion. If anything it looks like a textbook example of the Dunning–Kruger effect https://en.wikipedia.org/wiki/Dunning–Kruger_effect

Lol, no need to be so dramatic, Preference Score isn't any use to me, I can see near instantly from a headphone frequency response how easily or sucessfully it can be EQ'd to Harman, (when also combined with distortion measurements), so I don't need to consider anything else - it's not complicated for me - I have no need/use/insight to gain from the Preference Score if it were to be included in a headphone review. It most certainly doesn't have anything to do with Dunning Kruger. For newbies to headphone measurements that want to use their headphone at stock then there is some use in the Preference Score (albeit it's not fool proof), but beyond that not really when it comes to headphone reviews.

(I've done a lot of manual parametric EQ'ing of various headphones & target curves using REW & mostly not using the auto option both for myself & other people, so that's why I can tell almost instantly from a headphone frequency response if it would be possible to EQ it up to the Target Curve without leaving any dips or peaks (whilst using "good EQ practice" re sharpness/Q of filters) - so that's my general use case, Preference Score doesn't factor into that.).

EDIT: I wouldn't mind if a review site posts up Preference Scores of reviewed headphones, but it helps if people understand the limitations of it, and you wouldn't want the reviewer placing undue emphasis on it (ie potentially misleading readers as to it's importance or reliability when viewed as a standalone figure).

markanini · May 24, 2023

xnor said:
All I can say is that after spending 30 mins just using my ears, my test signals and a parametric EQ the result will sound significantly better to my ears than stock, regardless of how good the measurements are looking. In fact, I am quicker when I don't look at detailed measurements beforehand*.
I'm using sweeps for detecting peaks and dips and imbalances between channels; stationary signals for larger peaks and dips all the way up to overall balance. As for the type of waveforms I use mostly bandlimited noise, sinusoidal waves, but even something like a triangle wave can come in handy e.g. for quickly checking absolute polarity if you have no mic handy.

One could also exclusively use music for this, but I found that using simple, stable signals tailored for each band of interest to me more effective.

*) Virtually nobody has a head and ears that match the models based on averages, both on the measurement and evaluation ("target curves") side. Plus there's tolerances, errors. So when you look at generic measurements and you see a peak at X kHz and then you cut at that frequency, you'll most likely do the wrong thing.
And even if you don't do that, you will be biased by thoughts like "but the measurement showed too much energy in this area, so I should have to cut here".

A more objective way to do this is by using in-ear mics, but these come with their own set of problems and may not be applicable at all for some types of headphones.

This is basically what I've ended up doing. After selecting a high Harman scoring set I apply my subjective EQ and it has been the best listening experience for me. I only took care to switch often between multiple songs, to not make my EQ song-specific, and without referencing any graph.

xnor · May 24, 2023

markanini said:
This is basically what I've ended up doing. After selecting a high Harman scoring set I apply my subjective EQ and it has been the best listening experience for me.

Yes.
But let me make the point that it's not even necessarily about subjective preference. There are real objective physiological differences between people's heads, ears, hearing that a personally calibrated EQ can, well... equalize. You can test that by comparing in-ear measurements with dummy head/coupler measurements.

Preferences are added on top of that, though it is hard to separate the objective from the subjective without experience or a neutral reference.
And this is probably where people new to equalization go wrong.

Let me draw an analogy: EQ is like the thermal compound between the CPU and the cooler. Both parts are designed to have smooth, flat surfaces (which cannot be said for most headphones and humans

) but the paste still has to even out all the surface irregularities including those that are completely unique to those parts. Even if the layer is very thin, without it the CPU temperature will be high and performance will suffer.

markanini · May 24, 2023

xnor said:
Yes.
But let me make the point that it's not even necessarily about subjective preference. There are real objective physiological differences between people's heads, ears, hearing that a personally calibrated EQ can, well... equalize. You can test that by comparing in-ear measurements with dummy head/coupler measurements.

Preferences are added on top of that, though it is hard to separate the objective from the subjective without experience or a neutral reference.
And this is probably where people new to equalization go wrong.

Let me draw an analogy: EQ is like the thermal compound between the CPU and the cooler. Both parts are designed to have smooth, flat surfaces (which cannot be said for most headphones and humans ) but the paste still has to even out all the surface irregularities including those that are completely unique to those parts. Even if the layer is very thin, without it the CPU temperature will be high and performance will suffer.

Start the clock before someone confidently says "that's not objective", oblivious to the factors of unit variance and measurement accuracy.

Tks · May 25, 2023

PeteL said:
OK, what is "our" relaxed stance on high high treble content? Is it that measurements are unreliable up there, which I agree with, or is it that any issue in the treble, either measured or heard, is non existent and nothing can be done about it with EQ?

The former mostly.

SimpleTheater · Jun 8, 2023

The big fight between Amir and Shaur - oh, and a review of the Crinacle Reds.

Master Complaint Thread About Headphone Measurements

Master Contributor

Master Contributor

Master Contributor

Major Contributor

Master Contributor

Major Contributor

Addicted to Fun and Learning

Master Contributor

Major Contributor

Master Contributor

Major Contributor

Member

Addicted to Fun and Learning

Active Member

Master Contributor

Major Contributor

Active Member

Major Contributor

Major Contributor

Addicted to Fun and Learning

Similar threads