
The frailty of Sighted Listening Tests

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,246
Likes
17,159
Location
Riverview FL
Maybe the next step should be to train dogs to pick the best sounding speaker out of a line-up.

"There are reports of dogs that had definite tastes in music and some sense of what constitutes good music. A bulldog named Dan was owned by George Robinson Sinclair, the organist at Hereford Cathedral in London. He was a friend of Sir Edward William Elgar, best known for writing Pomp and Circumstance and Land of Hope and Glory. Elgar developed a fondness for Dan because he felt that the dog had a good sense of musical quality. Dan would frequently attend choir practices with his master, and would growl at choristers who sang out of tune, which greatly endeared him to the composer."

https://www.psychologytoday.com/us/blog/canine-corner/201204/do-dogs-have-musical-sense
 

beefkabob

Major Contributor
Forum Donor
Joined
Apr 18, 2019
Messages
1,652
Likes
2,093
Maybe the next step should be to train dogs to pick the best sounding speaker out of a line-up.

"There are reports of dogs that had definite tastes in music and some sense of what constitutes good music. A bulldog named Dan was owned by George Robinson Sinclair, the organist at Hereford Cathedral in London. He was a friend of Sir Edward William Elgar, best known for writing Pomp and Circumstance and Land of Hope and Glory. Elgar developed a fondness for Dan because he felt that the dog had a good sense of musical quality. Dan would frequently attend choir practices with his master, and would growl at choristers who sang out of tune, which greatly endeared him to the composer."

https://www.psychologytoday.com/us/blog/canine-corner/201204/do-dogs-have-musical-sense
The handler's bias will affect results.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,246
Likes
17,159
Location
Riverview FL
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
You are very confused but you make the same mistake of overgeneralizing (something people with no knowledge of the topic do).

Take speaker cables. If someone said they sound different, we would all stand on one side of the fence and demand a double blind test. Why? Because measurements would show them to be identical in audible band. So the notion of subjective test resulting in a wildly different outcome requires a controlled test that is blind since the sighted one seems quite invalid.

Now take the identical situation and substitute a speaker for the cable. Now if someone says the two speakers sound different, do you demand a double blind test? Of course not. We know speakers can and do sound different. This is so obvious that it would be the height of silliness to demand a double blind test.

Take in-room measurement. A single microphone measurement can never show you what your two ears and a brain perceive. It simply can't. Whereas electronic measurements do not have that problem whatsoever. We can measure two channels with ease and there is no impact of room or listener location to worry about.

Bottom line, context is everything. When it comes to sound reproduction, we start with imperfection. Measurements and the right ones help a lot but they are nowhere near as conclusive as they are in electronics where we have ruler flat response and vanishingly low distortion amounts.

For many people, it’s “obvious” that DACs and amps that measure similarly can and do sound different, but you insist that’s not the case. With the SVS speaker versus Revel speaker and others, we have an even broader suite of measurements that show several speakers are incredibly similar. Yet you say some sound great, and some sound awful. Your defense of that is that the measurements don’t capture all aspects of sound reproduction. Yet that’s exactly what people who say that DACs and amps sound different argue. To them, you say it’s nonsense and demand a blind test. Now, you dismiss the need for blind tests. Do you not see that you can’t have it both ways?
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
I said I know more than the OP because he is quoting research which does not help his cause at all. Seeing how he is just going by the blog post, it is clear he has not read the paper. Have you read the paper?


There is no evidence because what you say about me is not true. I am not unbiased. I have corrected people on that multiple times including this thread. It seems that you all know that you have no case unless you exaggerate my position to the silly extreme of "no bias."

What I have said is that I am a trained listener and I have managed trained listeners. We use them in the industry routinely in sighted tests. Sean Olive as a trained listener also uses sighted evaluations as I noted earlier:

View attachment 76696

I am fully aware of the limitations of what I do and volunteer them all the time.

The position taken is that my subjective tests should not even be reported because by definition they are wrong. This is the absurd position we are arguing. Not that my subjective results are guaranteed to be right. If I thought this, I would not measure!

So take a seat and listen and learn the topic before jumping in with anger and emotion with zero contribution to the topic. Heaven knows we already have too many like you spoiling this dish.

I have read the paper and numerous others. As posted earlier, it makes clear that even experienced listeners are prone to bias in sighted listening. Once again, what is your proof that your training makes you less prone to bias, as you have asserted earlier?

Keep in mind that not only is Harman’s training now available online, but there are also countless ear training apps and courses, such as SoundGym. Moreover, a large portion of academic study to become an audio engineer involves such training. So, if your assertion is correct, you’ll be far from the only person who can claim to be able to make fine distinctions in sighted listening.
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
Sparing me? What planet do you live on? Does it look like I am enjoying any protection from the likes of you or the OP?

Go after Thomas one more time with such nonsense and I will ban you permanently. His job is difficult enough without dealing with idiotic conspiracy theories like this.

That was neither a personal attack nor a “conspiracy theory.” It was an opinion-based observation.

It took this site weeks to ban Verum for spewing racist remarks, and you’re threatening to ban me for an anodyne observation like the above? Give me a break.
 

Archsam

Senior Member
Joined
Apr 8, 2020
Messages
326
Likes
516
Location
London, UK

ex audiophile

Addicted to Fun and Learning
Forum Donor
Joined
Jan 28, 2017
Messages
635
Likes
806
He knows what is faulty. He lists them in his papers! It is folks here who need correcting because a) they don't read the papers and b) run with talking points of very detailed research. Don't do that. Listen to someone who has read the papers, and can both defend and critique them.

And please don't try to be clever with debating tactics within your comment above. "You go talk to the researcher if you think you know more." I know more than you. That is the key point. Make yourself more knowledgeable so that you can defend yourself instead of using such tactics.

This is an opportunity to learn this topic from someone who practices. Don't blow it with comments like that.
I would not waste any more of your valuable time on this cretin. Relax, take a day off and come back knowing how much you are appreciated.
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
I noticed the thread veered off in the direction of questioning Amir's motives/test methods again.
So many questions keep popping up when thinking about the goal of this thread (while leaving Dr. Olive out of this, as he does not seem to be debated/questioned here).



Whose opinion are you looking for?
What is the goal of 'collecting' various opinions?
Speaker reviews done here on ASR (Amir, basically) or reviews done elsewhere?
What controversies are you speaking about, and were they settled or did they remain open with diverging opinions?
You seem to be looking for other people's thoughts on 'the reviewer's' experience.
Do you mean Amir's experience in particular or other reviewers in general?
If you mean other reviewers, which ones in particular? There are so many.
Have you based your opinion on the reviewer(s) in question on info found on the web, suspicions about him, personal biases, or have you been present during a review by the reviewer(s) in question and come to a different conclusion?
I mean, speakers, room, listening position, music choice, measurements, knowing limitations of measurements, experience in 'linking' measurements with (sighted or blind) observations.
Are the reviewers' opinions gospel? Are other people's opinions/papers/publications gospel? Can people make mistakes (be they honest or not)? Are there financial gains in reviews (free product or otherwise)? Is the integrity/authority of the reviewer(s) in question being questioned?

Solder, the issue at hand is that the thrust of this site, including from Amir, has been that listening (particularly sighted listening) is unreliable and invalid. I don’t need to tell you the number of links to subjective reviews that have been posted at ASR and the harsh comments that have greeted them.

Yet, in defending his critical subjective evaluation of the SVS speakers, Amir has said: “Don't get confused. We are not inventing science here. We are just evaluating an audio product. It is testing. Both objective and subjective.” Since when has that been the case on this site? He also said in the SVS thread that it was a mistake to reduce equipment to one score. Yet isn’t there a giant DAC graph that does exactly that?

He also asserted: “When comparing speakers, the above does not remotely hold. We can guarantee that if someone says two speakers sound different sighted, they would say the same in blind testing. So generalizations about bias absolutely do NOT hold with respect to speakers.” Yet the research shows that even experienced listeners have preferences that vary significantly blind versus sighted.

Now in this thread, he pulls the argument from authority card, calling others “Joe Blow” and claiming that his training and the sheer number of speakers he’s heard means he’s less affected by sighted bias. Once again, where’s the proof for that? There are undoubtedly others who have also been trained as listeners and who have heard more gear than Amir. Do we weight their subjective opinion higher than Amir’s? Does this become simply a CV comparison? After all of the discussion on this site of short audio memories, does anyone think Amir remembers what the Revels he heard weeks ago sound like compared to the SVS enough to say one sounds great, the other bad, despite incredibly similar measurements?

Personally, I have no problem with Amir including subjective evaluations. But he can’t claim some unique ability to provide them, insulting others as “Joe Blow” in the process. Nor can we avoid the fact that many of his new arguments about the fallibility of measurements apply beyond this one case.

This is an issue of basic consistency. It can’t be “bias for thee, but not for me.”
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,590
Likes
239,535
Location
Seattle Area
That was neither a personal attack nor a “conspiracy theory.” It was an opinion-based observation.

It took this site weeks to ban Verum for spewing racist remarks, and you’re threatening to ban me for an anodyne observation like the above? Give me a break.
No breaks are given when members go after hard-working moderators with pet theories because they don't like how the thread is going. You accused him of protecting me. That is a personal attack and a completely baseless challenge of his ethics, without merit. As I said, try it one more time and you will be gone for good.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,590
Likes
239,535
Location
Seattle Area
I have read the paper and numerous others. As posted earlier, it makes clear that even experienced listeners are prone to bias in sighted listening.
You either haven't read it and are only claiming to have done so, or read it but didn't understand a thing about it.

No one here is talking about "experienced listeners." We are talking about professionally trained listeners with critical hearing ability. That was not used in the study as I already showed but you did not bother to read:

[attached image]


Who in here is telling you to listen to sales and marketing people? Or engineers commenting on their own designs? No one. Therefore the study does not read on our discussion.

Second and most important is:

[attached image]


[attached image]


Three out of four speakers were Harman brands and all the testers were Harman employees.

In effect then, the study is about how employees working for a company can be biased about their own products and to some extent, how they looked (the cheaper one).

The bulk of that doesn't apply to me. I don't work for companies whose products I am testing. I did not design them either. And I have shown repeatedly how I don't care about the cost, brand or look of products when I judge them.

Now, how did you read the study and miss all of this?
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,590
Likes
239,535
Location
Seattle Area
Once again, what is your proof that your training makes you less prone to bias, as you have asserted earlier?
Where is your proof that it has? FUD is not evidence.

Trained, critical listeners have proven their worth in their much lower threshold of detection for aberrations. From the peer-reviewed Journal of the Audio Engineering Society paper "Selection and Training of Subjects for Listening Tests on Sound-Reproducing Equipment" by Soren Bech, AES Member, The Acoustics Laboratory, Technical University of Denmark, DK-2800 Lyngby, Denmark:

1596648101182.png


Soren showed that with just 4 trials in the same set of tests, the sensitivity and consistency of listeners improved considerably. Trained, critical listeners go through months of training and achieve a very high level of reliability and consistency. This is why we use them in industry in sighted tests.

Critical listeners are exceptionally good at going past superficial things like the beauty of the music to focus on and find impairments. Let's agree that the music itself is one of the strongest biases. This is why people like just about every speaker: they play pretty music on them and they can't listen past that. Critical listeners have that ability, as I have shown in repeated double blind tests that folks like you can't even dream of passing, let alone actually pass.

Remember again: trained listeners are paid to be right far more times than average listeners. If they are easily biased, then they would be wrong most of the time and hence would not have such jobs.

To be sure, there is no absolute protection against bias. I have given examples of me failing catastrophically in the past despite my training. But those occasions were few and far between, leaving me with the ability to give quick answers in sighted listening.
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
You either haven't read it and are only claiming to have done so, or read it but didn't understand a thing about it.

No one here is talking about "experienced listeners." We are talking about professionally trained listeners with critical hearing ability. That was not used in the study as I already showed but you did not bother to read:

[attached image]


Who in here is telling you to listen to sales and marketing people? Or engineers commenting on their own designs? No one. Therefore the study does not read on our discussion.

Second and most important is:

[attached image]


[attached image]


Three out of four speakers were Harman brands and all the testers were Harman employees.

In effect then, the study is about how employees working for a company can be biased about their own products and to some extent, how they looked (the cheaper one).

The bulk of that doesn't apply to me. I don't work for companies whose products I am testing. I did not design them either. And I have shown repeatedly how I don't care about the cost, brand or look of products when I judge them.

Now, how did you read the study and miss all of this?

You are selectively quoting, limiting the conclusions of the study in a way the author did not intend. It's akin to saying that a pharmaceutical study was designed to test a drug's effects only on the experimental group, not on the public at large. Olive was clearly using his company's employees as a convenience sample, with the conclusions still presumed to apply beyond the company.

Here is Olive's conclusion in the blog post you're highlighting:

Screen Shot 2020-08-05 at 2.06.42 PM.png


Here is his conclusion in the full study:

Screen Shot 2020-08-05 at 2.04.27 PM.png


It was not a study simply about how employees of his company might prefer their own products. The study's conclusion is "if you want to know how a loudspeaker truly sounds, you would be well advised to do the listening tests blind," a conclusion that "applies about equally to experienced and inexperienced listeners." It's presumed to apply to all people, not simply Harman employees.

Once again, where is your evidence that your specific training allows you to listen to two speakers, weeks or more apart, sighted and make meaningful subjective differentiations?

To be clear, I'm not saying that it's impossible for that to be true! I'm merely saying that the Olive study doesn't support that claim. Nor does the 1992 Denmark study, which merely shows training can improve reliability/consistency.

However, I am saying that, if it is true, then you are not the only one who will be able to claim that they can conduct valid sighted listening. Other people are highly trained, some even more than you. (Heck, a great deal of training is available online today.) This conclusion would challenge several common assertions made on this site about the necessity of blind listening and the (lack of) value of subjective reviews.
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,703
Location
California
For many people, it’s “obvious” that DACs and amps that measure similarly can and do sound different, but you insist that’s not the case.

To be more precise, I think for many people, what is "obvious" about DACs and amps (tubes excluded), is that if they do sound different, the magnitude of that difference is usually very small. Speakers, on the other hand, have very large and immediately audible differences.

This implies that for SOTA devices, like DACs/amps, the potential effect of a listener's bias on perception is likely a lot larger than the actual differences of interest. Thus, there is a great need to control the listener's bias in order to know that the experimental results reflect actual differences, hence the need for blinding (to reduce bias). As an aside, since with SOTA devices we know that FR/THD/SNR measurements are extremely predictive of perceived sound quality, we can use these measurements as a surrogate for large scale blind listening tests.

In contrast, the large magnitude of audible differences across loudspeakers is likely a lot greater than the effect of a listener's bias on perception. (And the magnitude of sighted bias on speaker preferences has been quantified in Olive's work.) So, in the case of loudspeakers, it is not absolutely necessary to control listener bias to achieve usable results, but of course the experimental results would be more reliable if you did.

For these reasons, blinded listening observations are usually necessary for DACs/Amps, but only "highly desirable" for loudspeakers when trying to demonstrate differences.
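
The reasoning above can be pictured with a toy simulation. This is only a sketch under invented assumptions, not anything taken from Olive's data: each sighted rating gap is modeled as the real gap plus a normally distributed bias term, and the numbers for `true_gap` and `bias_sd` are hypothetical, chosen to show the shape of the argument.

```python
import random

def sighted_ranking_accuracy(true_gap, bias_sd, trials=10_000, seed=0):
    """Fraction of simulated sighted comparisons that rank the truly better device first.

    true_gap: hypothetical real preference gap (arbitrary rating points).
    bias_sd:  hypothetical spread of the sighted bias added to each comparison.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Reported gap = real gap + a random bias term (toy model, an assumption).
        reported = true_gap + rng.gauss(0.0, bias_sd)
        if reported > 0:
            correct += 1
    return correct / trials

# Illustrative numbers only: identical bias, small vs. large real difference.
small = sighted_ranking_accuracy(true_gap=0.05, bias_sd=1.0)  # DAC/amp-like case
large = sighted_ranking_accuracy(true_gap=3.0, bias_sd=1.0)   # loudspeaker-like case
print(f"small real gap: {small:.2f}, large real gap: {large:.2f}")
```

With these made-up values the small-gap case lands close to a coin flip while the large-gap case is almost always ranked correctly. Whether real loudspeaker comparisons sit in the second regime, especially among closely matched speakers, is exactly what is being debated.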

With the SVS speaker versus Revel speaker and others, we have an even broader suite of measurements that show several speakers are incredibly similar. Yet you say some sound great, and some sound awful. Your defense of that is that the measurements don’t capture all aspects of sound reproduction. Yet that’s exactly what people who say that DACs and amps sound different argue. To them, you say it’s nonsense and demand a blind test. Now, you dismiss the need for blind tests. Do you not see that you can’t have it both ways?

I can reconcile this. If two DACs/Amps have nearly perfectly flat FR curves and THD < 0.1%, we can be confident that they sound remarkably similar. However, to date, I have not seen two loudspeaker measurements that are anywhere near that level of similarity. Even when the smoothed FR curves across two loudspeakers look "pretty close," in reality, the measurements obtained are still different by several orders of magnitude compared to typical differences seen in SOTA devices.

If you're looking at spins from two different speakers that you think predict that they will sound exactly the same, post them here, and let the spin doctors here explain to you why and how they are actually different.
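
To put the THD figure mentioned above on the same footing as a frequency response plot, here is a small unit-conversion sketch. The conversions themselves are standard; the 1 dB response deviation used for comparison is a hypothetical value picked for illustration, not a measurement of any speaker discussed here.

```python
import math

def thd_percent_to_db(thd_percent):
    """Convert a THD figure in percent to dB relative to the fundamental."""
    return 20.0 * math.log10(thd_percent / 100.0)

def db_to_percent_change(db):
    """Amplitude change, in percent, that corresponds to a level difference in dB."""
    return (10.0 ** (db / 20.0) - 1.0) * 100.0

# 0.1% THD means the distortion products sit roughly 60 dB below the signal.
print(round(thd_percent_to_db(0.1), 1))

# A hypothetical 1 dB response deviation is an amplitude change of about 12%.
print(round(db_to_percent_change(1.0), 1))
```

The two quantities are not the same kind of error, so this is only a rough sense of scale, not a statement about audibility.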
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,703
Location
California
@patate91, you seem to be trying to present yourself as an expert in psychoacoustics and scientific research. Yet something isn't adding up, because others and I are having to explain basic things to you, and your replies so far have been quite atypical for anyone who has spent any time in the sciences.

My guess is that you are likely somewhere in your secondary school or university education and have taken a course or two that includes a laboratory section. Am I correct? And if not, please correct me and share what background/education you have in this field or in the sciences?

I won't answer that; it is unnecessary.

You actually just did. And I think I'm done here. It's been a pleasure talking to you.
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,193
Likes
3,754
To be more precise, I think for many people, what is "obvious" about DACs and amps (tubes excluded), is that if they do sound different, the magnitude of that difference is usually very small. Speakers, on the other hand, have very large and immediately audible differences.


By 'many people' do you include audio journalists and online audio pundits? Because if you think they routinely report DAC and amp differences as 'small' you have not been reading them for decades like some of us have.
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
To be more precise, I think for many people, what is "obvious" about DACs and amps (tubes excluded), is that if they do sound different, the magnitude of that difference is usually very small. Speakers, on the other hand, have very large and immediately audible differences.

This implies that for SOTA devices, like DACs/amps, the potential effect of a listener's bias on perception is likely a lot larger than the actual differences of interest. Thus, there is a great need to control the listener's bias in order to know that the experimental results reflect actual differences, hence the need for blinding (to reduce bias). As an aside, since with SOTA devices we know that FR/THD/SNR measurements are extremely predictive of perceived sound quality, we can use these measurements as a surrogate for large scale blind listening tests.

In contrast, the large magnitude of audible differences across loudspeakers is likely a lot greater than the effect of a listener's bias on perception. (And the magnitude of sighted bias on speaker preferences has been quantified in Olive's work.) So, in the case of loudspeakers, it is not absolutely necessary to control listener bias to achieve usable results, but of course the experimental results would be more reliable if you did.

For these reasons, blinded listening observations are usually necessary for DACs/Amps, but only "highly desirable" for loudspeakers when trying to demonstrate differences.



I can reconcile this. If two DACs/Amps have nearly perfectly flat FR curves and THD < 0.1%, we can be confident that they sound remarkably similar. However, to date, I have not seen two loudspeaker measurements that are anywhere near that level of similarity. Even when the smoothed FR curves across two loudspeakers look "pretty close," in reality, the measurements obtained are still different by several orders of magnitude compared to typical differences seen in SOTA devices.

If you're looking at spins from two different speakers that you think predict that they will sound exactly the same, post them here, and let the spin doctors here explain to you why and how they are actually different.

Olive doesn’t say “highly desirable.” He says “must be done blind.”

While I fully agree that, on average, differences between transducers are many orders of magnitude larger than those between DACs or amps, I do believe there are people who have reliably differentiated between the latter in blind tests.

However, we’re also not talking about the difference between one of the top-measuring speakers and one of the bottom-measuring speakers. We’re talking about the case of several of the top-performing speakers here, heard weeks apart in sighted listening.

Insofar as Amir’s “recommendeds” are valuable and can impact a company’s business, it seems that some evidence that he can make such distinctions under such circumstances is needed.
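
One way such evidence is commonly scored is a blind identification run checked against chance. The sketch below assumes a hypothetical protocol in which the listener has a 50% chance of guessing each trial correctly; the trial counts are invented examples, not results from anyone in this thread.

```python
from math import comb

def guessing_p_value(correct, trials, chance=0.5):
    """One-sided probability of getting at least `correct` right out of `trials` by guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical scores: 12 of 16 is unlikely by pure guessing, 9 of 16 is not.
print(f"12/16: p = {guessing_p_value(12, 16):.4f}")
print(f" 9/16: p = {guessing_p_value(9, 16):.4f}")
```

A low p-value only says the listener told the two apart more often than guessing would; it says nothing about which one is better or by how much.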
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,703
Location
California
By 'many people' do you include audio journalists and online audio pundits? Because if you think they routinely report DAC and amp differences as 'small' you have not been reading them for decades like some of us have.

Sure, that's a fair question. By "many people," I'm referring to the audience here on ASR, which generally consists of people who reject these types of "the sky opened up" claims published in audio magazines about SOTA devices. But that doesn't mean I don't read them for entertainment purposes.
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,193
Likes
3,754
Then we are, once again, talking about a very limited reality, not the wider reality experienced by typical audio hobbyists. We are talking about an ideal.

The 'trained listeners' that Amir would trust for a month of Sundays in sighted tests versus blind, untrained listeners... who are these people, exactly? What training, exactly, imparts this privilege? What proportion of audiophiles do they comprise? I don't say they don't exist. Obviously Harman has a listener training program, for example. Obviously some standards have been proposed.

I do say that every blessed one of the hundreds of tedious 'audiophiles' blathering on about unit X's sound versus unit Y's sound in any given month in audioland is implicitly claiming to be a 'trained listener' -- they are implicitly claiming to ALREADY HAVE the 'chops' to make these distinctions reliably.


How do we tell who are *really* 'trained listeners'?

Who is conducting sighted tests with such 'trained listeners', whose bona fides are provided, along with results?

I suspect that the actual 'trained listeners' will turn out to be so rare, and so rarely documented properly, that we 'objectivists' have been right all along to start from a position that treats 'sighted' reports as being hopelessly....frail.
 