Below is my reply to the Soundstage Network video that was the impetus for the discussion over on the other site...
-----------------
I'll give my two cents, as a fellow reviewer who relies heavily on objective data to provide information.
For the record, I agree that listening should be done before measurements. We already know that, ideally, every step would be infinitely quadruple-blind. But that just isn't feasible; at least not for regular folk like me. So I try to avoid the biases I can, to the highest degree I can.
When I get a speaker, my review process is:
1) Before I get the speaker, I (try to) avoid as much information about it as I can. I know this seems counterintuitive to what a reviewer should do. You'd think you want a reviewer to be well versed in what they're about to review. But for me, that's just one more area of potential influence: if I study the design and try to discover its purpose, what others thought of it, etc., then I go into it with those preconceived notions.
2) I listen. I take a notepad, my iPhone, or a laptop, and I take notes. I've been DIY'ing, tuning, and "judging" systems for over a decade and have steadily worked to align the things I hear with frequencies. I don't claim to have a golden ear... I don't have perfect pitch and I don't have a musician's background. What I do is simply focus on the things I am familiar with. With that and a bit of logic, when I hear something in a song that stands out - for one reason or another - I can get a pretty reasonable estimate of the frequency where it occurs. Maybe within an octave or so. Sometimes better, sometimes worse. Sometimes it is not the fundamental that is the issue but a harmonic. (I use a website called SoundGym for ear training. It's worth checking out if anyone out there wants to work on their listening skills as a reviewer.) I write down the frequency range I think is the problem. For example, if I hear a lower male vocal resonance then I might note "resonance ~150Hz"; just saying "midrange" doesn't help me when I'm trying to fine-tune. (There's a quick sketch of that octave arithmetic after this list.) In a way, I suppose it is very similar to the method you mentioned where listeners were instructed to draw the response they thought the speakers had as they listened. Doing this also helps me separate the speaker from the room, because if the anechoic response doesn't indicate an issue then I can check the measured in-room/in-situ response to see if that might be the cause.
3) I measure the anechoic response and do my other measurements. The whole dang ordeal.
4) I look at my notes. I compare them with the measurements. I try to find things that correlate.
5) I re-listen with the measurements in mind. I sometimes use EQ to adjust the things I noted, based on the measurements, to see if that resolves any issues I heard. (A minimal EQ example follows this list.)
6) Do some research on forums and other reviews to see what others thought of the speaker and what questions folks have about the speaker that I can try to address.
*) Somewhere in there (after step 2) I measure the in-room response with a spatial average (see the averaging sketch after this list). I used to share this data as well, but lately I don't, simply because I don't want to eat up the bandwidth and I no longer see the benefit.
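For anyone who wants the "within an octave" note-taking from step 2 in concrete numbers, here's a quick Python sketch (the function name is just for illustration, not part of any measurement package):

```python
def octave_band(f_center_hz, width_octaves=1.0):
    """Edges of a band centered (geometrically) on f_center_hz.

    A 1-octave band spans f/sqrt(2) to f*sqrt(2); wider or narrower
    ear-estimate uncertainty just changes the exponent.
    """
    half = 2 ** (width_octaves / 2)
    return f_center_hz / half, f_center_hz * half

# If I note "resonance ~150Hz" and trust my ear to within an octave,
# the real culprit is somewhere around:
low, high = octave_band(150.0)
print(f"{low:.0f} Hz to {high:.0f} Hz")  # ~106 Hz to ~212 Hz
```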
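And for step 5, this is the kind of thing I mean by using EQ to test a suspected problem area. A minimal sketch using the standard RBJ-cookbook peaking biquad (my actual DSP chain is different; this is just to show the idea, and the signal name is hypothetical):

```python
import numpy as np
from scipy.signal import lfilter

def peaking_eq(fs, f0, gain_db, q):
    """RBJ-cookbook peaking biquad; returns normalized (b, a) coefficients."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

# Cut the suspected ~150 Hz resonance by 3 dB, roughly 2/3 octave wide (Q ~ 2):
fs = 48000
b, a = peaking_eq(fs, f0=150.0, gain_db=-3.0, q=2.0)
# filtered = lfilter(b, a, mono_audio)  # apply to the track, then re-listen
```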
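As for the spatial average in the last item: the usual approach is to average the measurement positions in the power domain rather than in raw dB, so one deep null at a single mic position doesn't drag the whole curve down. A sketch under that assumption (array names are hypothetical):

```python
import numpy as np

def spatial_average(traces_db):
    """Power-average several SPL traces (rows = positions, cols = freq bins).

    Averaging in the power domain keeps one deep local null from
    dominating the result the way a plain dB average would.
    """
    power = 10 ** (np.asarray(traces_db) / 10)   # dB -> power
    return 10 * np.log10(power.mean(axis=0))     # average, back to dB

# e.g. six mic positions around the listening area, same frequency axis:
# avg_db = spatial_average([pos1_db, pos2_db, pos3_db, pos4_db, pos5_db, pos6_db])
```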
I take all the above and... I tell people: "hey, here is what I did, what I heard, what I see in the measurements, and what I think all this means".
I know me. I know myself well enough - from years of experience building, measuring, and manually tuning systems (usually via DSP) - to know that I will be influenced by what I see in the measurements. I found that measuring first and then listening was counterproductive when trying to improve a system. What would typically happen is I would start trying to "fix" the things the measurements showed to be a "problem", and while that sometimes worked, it more often resulted in me over-correcting the response, or spending too much time worrying about things that were less problematic and missing other attributes that should have gotten more attention. How does "tuning" a system relate to "evaluating" a system? They're not the same, for sure. But the end goal is the same: discover the things you like or don't like, then work to improve them. In a DIY design that's necessary. In a completed speaker, there really is nothing you can do. But the process still yields the same useful information.
That's my personal process. It's what seems logical to me, and it has worked well for me, my friends, and (hopefully) those who choose to read/watch my reviews as well.
I am sure some will disagree with your video, and therefore with my methods as well. I can’t change how others review products, and frankly, I don’t care enough to belabor the point. I’ll worry about myself and present my reviews in the way I feel is most helpful to my audience; if there is pushback, then I’ll reconsider my methodology. Having said that, I think it says a lot that when the studies on listener preference were done, the listeners were not exposed to the measurements first but were instead instructed to draw the response curve, and those curves were then compared to the data to find out what people generally liked/disliked. Per Dr. Toole: "When listening “blind,” without the biasing influences of price, size, brand and appearance, people turn out to be remarkably similar in [terms of] what they like and dislike." But, as I said above, without research funds to pay for the means necessary to do that kind of review, we do the best we can. It follows that by adding another variable known in advance (in this case, measurements) to the mix, you move even further from the goal of removing biases. If there is a reviewer out there with their own Harman-esque speaker comparator for double-blind listening evaluation, they aren't sharing that info.
My long-winded two cents is now over.
- Erin