
Amir recommendation criticism

Status
Not open for further replies.

KaiserSoze

Senior Member
Joined
Jun 8, 2020
Messages
303
Likes
171
#61
I've only been looking at this site for a short while, and regardless of whether I agree with Amir's recommendations, I very much appreciate the testing he does. It is by leaps and bounds the best testing I've found, and it covers many more speakers than, for example, Stereophile.

That said, I have found myself perplexed by some of his decisions with respect to the recommendations. When I am not able to discern a clear objective basis for how a decision is made, I have no choice but to infer that it is mostly subjective, and this bothers me because to me it undermines the rationale for doing objective measurements in the first place.

One thing in particular that puzzles me is how he decides where to draw that red line. The slope of the line is drawn differently for different speakers, and his assessment of whether the treble will be too strong or too weak seems to be strongly influenced by where he chooses to draw it. At least, this is what it looks like to me. It seems to me that the line should always be flat, positioned so that the bounded areas above and below the line are equal. His philosophy is apparently that the line should be sloped, which also confuses me.

Is the chart where he draws the line the on-axis response, or is it supposed to be an averaged response or "room response"? As I sit here I realize that I'm not at all sure. I thought it was the on-axis response, but now I think not, because if it were, there wouldn't be any reason for the line to slope downward with increasing frequency. In fact, if it were the on-axis response, then the only way a sloped line would seem to make sense would be if it sloped in the other direction, since in that case the on-axis high-frequency elevation would be accepted as compensation for the weaker high-frequency response off-axis. No, I'm not suggesting that the line should slope up toward high frequency, only that this would seem to make more sense than for it to slope down. In any case, this is obviously something that I don't understand.
I suppose I should read the writeup about the Klippel measuring system, and maybe this will be explained there; but even if it explains why the measurements of speakers generally slope downward toward high frequency, I will be surprised if there is any explanation of how to determine the ideal slope for each individual speaker. It just does not make sense to me that this slope should differ between speakers. And if there is no purely objective way to decide what the slope should be for each individual speaker, then this is an inherent flaw in the measurement methodology. The only way I can think of for this to be objective is if the line had the very same slope for all speakers, dictated by some theoretical ideal. Oh well.
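For what it's worth, the "flat line with equal bounded area above and below" idea proposed above is easy to make concrete: over log-spaced frequency points, that line is simply the mean level of the curve in dB. A minimal sketch, using entirely synthetic data (no real measurement):

```python
# Sketch of a flat reference line with equal "area" above and below the curve.
# The response here is made up for illustration, not from any actual speaker.
import numpy as np

freqs = np.logspace(np.log10(200), np.log10(20000), 200)  # 200 Hz .. 20 kHz, log-spaced
# hypothetical on-axis response in dB: gentle downward tilt plus a ripple
response_db = -1.5 * np.log10(freqs / 200) + 0.5 * np.sin(freqs / 900)

flat_line = float(response_db.mean())  # equal area above and below on this log-f grid
deviation = response_db - flat_line    # what the curve would be judged against
```

By construction, the deviations above and below the line cancel out exactly; any other placement of a flat line would leave unequal areas.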
 

krabapple

Senior Member
Joined
Apr 15, 2016
Messages
352
Likes
290
#62
Do you actually know of any such experiments trying to measure fatigue?

You'd think DBTs were just invented recently, have only ever been used to test audition, and no one had ever considered fatigue as a factor in designing them.

(I wouldn't think that.)

(And no I'm not going to post links to the research. There are whole textbooks on sensory testing methods and design.)
 

echopraxia

Senior Member
Forum Donor
Joined
Oct 25, 2019
Messages
483
Likes
711
Location
Seattle Area
#63
Obviously, the pure objective data provided by Amir's speaker measurements are incredibly valuable -- an unprecedented level of quality objective data on a wide range of speakers. I think many objectivists would also claim that subjective impressions from a single trained listener offer less useful information to shoppers than the objective data. Perhaps the subjective impressions are useful primarily if the reader is fully aware of (and aligned with) Amir's personal speaker preferences (e.g. bass-boosted speakers capable of reaching extremely high SPL in a large room).

To be fair, I don't think Amir misrepresents the meaning of the "Recommended" status in the reviews themselves. The reviews I've seen all honestly disclose the subjective nature of such conclusions. But that doesn't mean that confusion and unintentionally misleading impressions don't result from it.

Unfortunately, I do believe that the phrasing does end up misleading readers (unintentionally, I am sure):

Specifically: I think it's quite fair to expect that a "Recommended" status from a site called "Audio Science Review" will be interpreted as something reflective of objective measurements (or at least something resembling a scientific method). In that case, assigning the conclusion "Recommended" or "Not Recommended" to a speaker entirely based on the subjective review portion could be tragically misleading (even if unintentionally so), since it will inevitably lead to some shoppers missing out on speakers that may have been better for them than those on the "Recommended" list.

In contrast, a more accurate status descriptor (like "Amir's Subjective Score", "Amir's Preference", or "Subjective Recommendation") would completely solve this problem.

This misleading effect is unfortunately made worse by otherwise very helpful compilations like this: https://www.audiosciencereview.com/forum/index.php?pages/SpeakerTestData/. When I go to results compilations like the above, especially on a site focused on audio science and objective measurements, the first thing I want to do is sort via some kind of objective ranking! You can use the bars to filter by minimum and maximum preference score, but the even more prominent filter here is the unqualified "Recommendation" status, which invites the user to filter to just the "Yes" entries.

Anyone I know trying to narrow down a selection of good speakers would first filter to the "Recommended = Yes" speakers, perhaps not knowing that this has absolutely nothing to do with the objective measurements or preference scores.

IMO the status "Recommended" without qualification in this list is perhaps more dangerously misleading than anything else on this site, but it's not really the "fault" of the compilation: compilations will always exist. This is why I want to emphasize how misleading the unqualified descriptor "Recommended" is, at least out of the context of the review writeup itself.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#65
But according to existing measurement data, the R3 has some of the best mid-to-high-frequency distortion measurements.
I was not calibrating output levels at that time, so you can't go by that. Newer reviews are done at 86 or 96 dB SPL, enabling proper comparisons. Prior reviews kept the input voltage constant, which works for speakers with identical sensitivity but not otherwise.
 

Jon AA

Senior Member
Forum Donor
Joined
Feb 5, 2020
Messages
313
Likes
629
Location
Seattle Area
#66
This misleading effect is unfortunately made worse by otherwise very helpful compilations like this: https://www.audiosciencereview.com/forum/index.php?pages/SpeakerTestData/. When I go to results compilations like the above, especially on a site focused on audio science and objective measurements, the first thing I want to do is sort via some kind of objective ranking!
Am I missing something or does that compilation disprove what you're saying along with the notion put forth by so many that his recommendations are wildly inconsistent and often disagree with the measured performance?

If you sort those speakers by preference score (calculated from objective measurements), the bottom 20 speakers get zero recommendations. Of the top 20 speakers, 15 get recommendations. That seems to be a decent correlation to me.
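The correlation being claimed here can be made concrete with a point-biserial correlation (Pearson r between a numeric score and a 0/1 label). The scores and labels below are hypothetical, made up purely to illustrate the calculation, not actual review data:

```python
# Point-biserial correlation between a numeric preference score and a
# yes/no recommendation. All numbers below are invented for illustration.
scores =      [2.1, 3.0, 3.4, 4.2, 4.8, 5.1, 5.9, 6.3, 6.8, 7.2]
recommended = [0,   0,   0,   0,   1,   0,   1,   1,   1,   1]

n = len(scores)
mean_s = sum(scores) / n
mean_r = sum(recommended) / n

# Pearson r computed with a 0/1 variable is the point-biserial correlation
cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(scores, recommended)) / n
var_s = sum((s - mean_s) ** 2 for s in scores) / n
var_r = sum((r - mean_r) ** 2 for r in recommended) / n
r_pb = cov / (var_s * var_r) ** 0.5

print(round(r_pb, 2))  # ≈ 0.82: strong but imperfect agreement
```

A value near 1 would mean the recommendation is fully determined by the score; a value like this leaves room for the subjective exceptions being discussed in the thread.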
 

KaiserSoze

Senior Member
Joined
Jun 8, 2020
Messages
303
Likes
171
#67
Main post of what?
Not sure about that, but I just stole this chance to respond to one of your posts so that I could ask you if you would please explain how you decide where to draw the red line, and what it represents. Is it part of the Klippel measurement per se, or something that you add? And what is the rationale by which it isn't simply drawn horizontal, with equal bounded area above and below the line? I'm just trying to understand what this line represents, theoretically. For that matter, if you could shed some light on why the response curves of almost all speakers slope downward toward high frequency, i.e., what it is about the Klippel methodology that leads generally to this characteristic, this would be very helpful to me and I would appreciate it. Thanks for all the work you do. If I still had my old pair of Advent 5012 I would want to ship one of them to you just to see how it measures!
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#68
Folks, this is simple. I have some 100 reference clips I have collected over the years of listening to my system and at audio shows. These tracks either sound great to me or they don't. If they don't, I can't bring myself to recommend a speaker. What am I supposed to say when someone asks, "so when you listened to this speaker, did it sound good?" That it didn't, but I recommended it anyway? It would be obvious that I personally would not be in the market for one, so how could I still recommend it?

Now, it is a quandary that the objective data is at times in conflict with my observations. There are some known reasons for this (e.g. room modes, which I have mitigated since the early days) but also less understood ones, like distortion and dispersion.

There are also known factors, like distortion and how loud a speaker can play, that are not well characterized in the measurements, if at all. I set a high standard here, both from my training in hearing distortion and from playing in a large room with a ton of amplification.

Finally, my judgement could be in error. I don't have the time, resources, or motivation for rigorous listening tests. I play them for an hour or two and call it done.

Ultimately, the world of speaker buying has been chaotic. We bring order to it with objective measurements and some reflections from me. Sorry it is not perfect. I don't know how to make it more perfect. I do know we will get better at it with time, as has already happened in less than six months: the measurement protocol has improved, as has my subjective testing. It is where we are now.
 

richard12511

Addicted to Fun and Learning
Forum Donor
Joined
Jan 23, 2020
Messages
809
Likes
850
#69
Am I missing something or does that compilation disprove what you're saying along with the notion put forth by so many that his recommendations are wildly inconsistent and often disagree with the measured performance?

If you sort those speakers by preference score (calculated from objective measurements), the bottom 20 speakers get zero recommendations. Of the top 20 speakers, 15 get recommendations. That seems to be a decent correlation to me.
You also need to take price into account. We need a way to sort by price/score.

*Edit: but yeah I'd still say there is a pretty good correlation there, which is good.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#70
Not sure about that, but I just stole this chance to respond to one of your posts so that I could ask you if would please explain how you decided to draw the red line, and what it represents. Is it part of the Klippel measurement per se, or something that you add? And what is the rationale by which it isn't simply drawn horizontal with equal bounded area above and below the line?
Research shows that what we think of as tonality is anchored by low frequencies. So I usually anchor the line around 200 Hz and go horizontally to the right. This more or less matches my subjective impression of tonality. The preference score that others are posting uses the more rigorous method you are asking about and is reflected in the radar charts.
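The "anchor at ~200 Hz, then horizontal" procedure described above can be sketched in a few lines. The response curve here is made up for illustration; only the anchoring logic is the point:

```python
# Sketch of anchoring a horizontal reference line at the response level near
# 200 Hz, then judging the rest of the curve against it. Data is synthetic.
import numpy as np

freqs = np.logspace(np.log10(20), np.log10(20000), 300)
# hypothetical response in dB: gentle downward tilt plus some ripple
response_db = -1.2 * np.log10(freqs / 200) + 0.4 * np.sin(6 * np.log10(freqs))

anchor_idx = int(np.argmin(np.abs(freqs - 200)))
reference = response_db[anchor_idx]  # horizontal line anchored near 200 Hz
deviation = response_db - reference  # above the line reads bright, below reads dull
```

With this method the low frequencies set the baseline, so a speaker with a tilted-down treble shows up as increasingly negative deviation toward 20 kHz.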
 

echopraxia

Senior Member
Forum Donor
Joined
Oct 25, 2019
Messages
483
Likes
711
Location
Seattle Area
#71
Am I missing something or does that compilation disprove what you're saying along with the notion put forth by so many that his recommendations are wildly inconsistent and often disagree with the measured performance?

If you sort those speakers by preference score (calculated from objective measurements), the bottom 20 speakers get zero recommendations. Of the top 20 speakers, 15 get recommendations. That seems to be a decent correlation to me.
Of course it's correlated! The whole point of objective measurements is that they correlate with subjective impressions. But if a single person's subjective listening results were enough to make reliable purchasing decisions for everyone, they wouldn't be subjective, would they? In that case, we would have no need for, or value to gain from, the more objective measurements that this site provides.

The truth is that the subjective impressions and objective scores do differ. That's fine! But we should just empower readers to make the most informed choices with maximal clarity of wording. My sole point here is that calling the subjective portion of a review "Recommended" vs "Not Recommended" is misleading on a site called "Audio Science Review".
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#72
For that matter, if you could shed some light on why the response curves of almost all speakers slope downward toward high frequency, i.e., what it is about the Klippel methodology that leads generally to this characteristic, this would be very helpful to me and I would appreciate it. Thanks for all the work you do. If I still had my old pair of Advent 5012 I would want to ship one of them to you just to see how it measures!
Which response curve? On-axis? If so, it doesn't always point down. The predicted-in-room curve does, because sound becomes directional at higher frequencies, so when you take reflections into account for that measure, the high-frequency response tilts down.
 
OP
S
Joined
Jun 5, 2020
Messages
19
Likes
9
Thread Starter #74
Am I missing something or does that compilation disprove what you're saying along with the notion put forth by so many that his recommendations are wildly inconsistent and often disagree with the measured performance?

If you sort those speakers by preference score (calculated from objective measurements), the bottom 20 speakers get zero recommendations. Of the top 20 speakers, 15 get recommendations. That seems to be a decent correlation to me.
The list is not up to date. The Q350 is not there; it scored pretty well, but Amir did not like it.
Maybe other speakers from last month also scored well. I was just looking at the KEF reviews because I heard good things about them.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#75
My sole point here is that calling the subjective portion of a review "Recommended" vs "Not Recommended" is misleading on a site called "Audio Science Review".
I apply as much science as I can to the subjective part of the review. I have training in that regard. I play a single speaker, always in the same location, and use tracks that are revealing of tonality differences, etc. So these are not the wet-thumb-in-the-air impressions you read from other people. They are as scientific as I can make them.

To take it to the next level would require double-blind testing of multiple speakers against each other with a larger number of listeners, which simply is not in the cards. Sometimes you have to settle for a telescope looking at a star rather than sending someone there to observe.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#76
Maybe other speakers from last month also scored well I was just looking at Kef reviews because I heard good things about them.
How is "heard good things about them" ever enter this argument? We have that about everything in audio including stuff that does nothing. If you want to rely on such anecdotal information, then your bar is so low as to make my testing hugely valid!
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#77
I suppose I should read the writeup about the Klippel measuring system, and maybe this will be explained there; but even if it explains why the measurements of speakers generally slope downward toward high frequency, I will be surprised if there is any explanation of how to determine the ideal slope for each individual speaker.
This is the last speaker I tested:

[attached: measurement chart, with directivity shown as the red line]
The high frequencies are actually tilting up, not down. So I don't know what you are saying there.

As I explained, if you mean the in-room prediction, then physics mandates it, due to the directivity increase as frequency goes up (see the red line above). The smaller the wavelength of sound relative to the size of the driver producing it, the more directional it becomes.
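The downward tilt of the in-room prediction can be illustrated numerically. The sketch below uses the ~12% listening window / 44% early reflections / 44% sound power weighting commonly cited from CTA-2034 for the estimated in-room response, summed in the power domain; the three curves themselves are made up, and the actual standard should be consulted for the real procedure:

```python
# Sketch: even with a perfectly flat listening window, a weighted energy sum
# with off-axis curves that roll off up high produces a down-tilted in-room
# prediction. All curves are synthetic; weights are the commonly cited
# CTA-2034-style 0.12 / 0.44 / 0.44 split (an assumption here, not a quote).
import numpy as np

freqs = np.logspace(np.log10(100), np.log10(20000), 100)

listening_window = np.zeros_like(freqs)           # idealized flat near-axis response, dB
early_reflections = -2.0 * np.log10(freqs / 100)  # off-axis rolls off at high frequency
sound_power = -4.0 * np.log10(freqs / 100)        # total radiated power rolls off faster

def db_to_power(db):
    return 10.0 ** (db / 10.0)

pir = 10.0 * np.log10(0.12 * db_to_power(listening_window)
                      + 0.44 * db_to_power(early_reflections)
                      + 0.44 * db_to_power(sound_power))
# pir starts at 0 dB at 100 Hz and tilts downward toward 20 kHz
```

This is exactly the effect described in the post: increasing directivity means the reflected field loses treble, so the in-room prediction tilts down even when the on-axis curve does not.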
 

echopraxia

Senior Member
Forum Donor
Joined
Oct 25, 2019
Messages
483
Likes
711
Location
Seattle Area
#78
I apply as much science as I can to the subjective part of the review. I have training in that regard. I play a single speaker, always in the same location, and use tracks that are revealing of tonality differences, etc. So these are not the wet-thumb-in-the-air impressions you read from other people. They are as scientific as I can make them.

To take it to the next level would require double-blind testing of multiple speakers against each other with a larger number of listeners, which simply is not in the cards. Sometimes you have to settle for a telescope looking at a star rather than sending someone there to observe.
That's fair enough. But as you are aware, this is of course going to invite an endless stream of questions at best (and criticism and accusations of bias at worst, as we've seen) on any review where a speaker that measures quite well is not recommended, or a speaker that measures relatively poorly is recommended.

Then again, so will objective measurements (accusations of improper testing, etc.), so I guess there's no winning in that sense.

Personally, it doesn't bother me as long as nobody is being misled as to the true nature of each data point. I find immense value in these reviews (mostly the objective part, but also some of the subjective part), so I'm quite thrilled that they exist at all :)
 
OP
S
Joined
Jun 5, 2020
Messages
19
Likes
9
Thread Starter #79
How is "heard good things about them" ever enter this argument? We have that about everything in audio including stuff that does nothing. If you want to rely on such anecdotal information, then your bar is so low as to make my testing hugely valid!
Other people who listened to them. I was just interested in how they measured.
I have no problem with how you recommend; I was just confused.
Maybe if you called it something like "subjective recommendation" it would be clearer.
I can give a small example of how taste varies:
my father likes to turn the treble all the way up, and I like it all the way down.
It is very subjective.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
27,620
Likes
68,013
Location
Seattle Area
#80
Maybe if you called it something like "subjective recommendation" it would be clearer.
It isn't that. It is the totality of what I see in the measurements, my listening tests, build quality, ability to get loud enough, etc.
 
