
Catalogue of blind tests

Oh okay. I thought there was some CSV file with the data for the plots somewhere that I am missing.

1. There’s a woefully small amount of data to draw any conclusions
You mean 40 listeners are woefully small?

Is there any statistics-based reason to think the sample size is too low, or is it your personal conviction that it is too small?

I am not an expert in statistics, but when samples are not interrelated, 20-30 samples should give you enough data to calculate the mean, control limits, and sigma. In any case, I think two PhDs would have the necessary statistics knowledge between them to ensure they had enough data points before publishing, don't you think?

Or maybe you think 4 speakers are too few? What would change if there were 20?
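As a rough illustration of the sample-size point (a sketch with an assumed rating spread, not the study's numbers): the precision of a mean rating scales with the square root of the number of listeners.

```python
# Illustrative only: how the standard error of a mean rating shrinks with
# sample size. The 1.0-point standard deviation is an assumption, not a
# figure from the study.
import math

sigma = 1.0  # assumed listener-to-listener standard deviation of ratings
for n in (20, 30, 40):
    se = sigma / math.sqrt(n)  # standard error of the mean rating
    print(f"n={n}: SE = {se:.2f}, 95% CI half-width ~ {1.96 * se:.2f} points")
```

Under that assumption, 40 listeners pin the mean rating down to about a third of a point, which is small next to the 1-2 point shifts discussed below.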

2. The results in rankings between the blind and sighted tests were mostly the same. I’m not saying blind protocols aren’t important. But I don’t see how his data supports the extreme conclusion that the listeners were largely unresponsive to real changes in sound quality. So out of four speakers, first and second place were just coincidences?
As I understand it, the conclusion is that nobody is immune to the biases created by sighted listening, regardless of their "expertise" and training. Whether you think you are immune to biases is also irrelevant. Maybe it was an extreme conclusion in 1994, but today, 30 years later, I'd say it is common sense.

The way I understand the graph is this: when people can see the speakers they are listening to, they rate them a full rating point higher or lower than they would if they could not see them. The order indeed does not change much, except that S is the worst-rated speaker sighted and T is the worst blind, yet the ratings change significantly. People rated D 2 rating points better than S when they could see it; they rated them more or less the same when they could not.

[Attached graph: mean speaker ratings, sighted vs. blind]
 
I mean 4 speakers is woefully small. What would change with 20 speakers is the clarity of the correlation, or lack thereof, between the rankings under blind and sighted conditions. It’s pretty hard to tell whether 1st and 2nd place were the same by coincidence when there are only 4 speakers. For there to be *no* correlation between audible differences and perceived sound quality under sighted conditions, the identical rankings of the top two speakers would have to be pure coincidence. There is no way to tell with just 4 speakers.
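For what it's worth, the coincidence question can be put in rough numbers. A minimal sketch, assuming a sighted ranking completely independent of the blind one (the speaker counts are the only inputs taken from the discussion):

```python
# If the sighted ranking were pure chance relative to the blind one, how
# likely is it that the same two speakers finish 1st and 2nd, in that order?
def p_same_top_two(n_speakers):
    # A random permutation puts a given speaker 1st with probability 1/n,
    # then another given speaker 2nd with probability 1/(n-1).
    return 1.0 / (n_speakers * (n_speakers - 1))

for n in (4, 20):
    print(f"{n} speakers: P(top two match by chance) = {p_same_top_two(n):.3%}")
```

With 4 speakers a chance match of the top two comes out around 8%, which is not negligible; with 20 speakers it would be well under 1%, which is the extra clarity argued for above.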
 
If there were more speakers, it might indeed have a more significant impact on the rankings. I don't think they were investigating that in this study; as the article explains, it was done to convince people at Harman that the tests should be blind. I think there are other studies that compare a larger number of speakers.

First and second place are not the same by coincidence. The article explains that G and D are the same speaker with different crossovers:
note: G and D were identical loudspeakers except with different cross-overs, voiced ostensibly for differences in German and Northern European tastes, respectively.

The psychological biases in the sighted tests were sufficiently strong that listeners were largely unresponsive to real changes in sound quality caused by acoustical interactions between the loudspeaker, its position in the room, and the program material

Maybe you are referring to this statement originally highlighted by BDWoody?

That conclusion is from the other graph, I believe.

When people can see what they are listening to, where you put the speakers makes very little difference to the sound quality rating they give. When they cannot see them, however, placement becomes as important as the speaker itself; hence the biases are sufficiently strong that listeners were largely unresponsive to real changes in sound quality.

Unless you think there are speakers that are unaffected by placement, adding more speakers to the test would not change the outcome, in my view.

[Attached graph: speaker ratings by room position, sighted vs. blind]
 
I totally agree that tests should be blind. I was only questioning whether the data Olive presents supports his assertion that under sighted conditions listeners were “largely *unresponsive*” to real audible differences. It seems to me that they were somewhat responsive, but I think you would need more speakers to tell. Correlated rankings would indicate some meaningful responsiveness under sighted conditions. I don’t think there’s enough data to determine whether the rankings have some correlation or none.
 

I disagree.

First, let's establish that the real audible differences to which the listeners were largely unresponsive are the differences caused by speaker placement.

[Attached graph: blind ratings across speaker positions]


I believe this can be concluded from the graph above. When you change the position of the speaker, ratings change significantly, pointing to real audible differences.

The graph below, however, shows that people rate the speakers the same if they can see them, regardless of placement, which we established causes real audible differences.

[Attached graph: sighted ratings across speaker positions]


I don't think adding more speakers would make any difference to the conclusion that (as I put it) if people can see what they are rating, they rate what they see, not what they hear.
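To put numbers on "they rate what they see, not what they hear", here is a minimal sketch with invented ratings (not the study's data) showing the pattern the two graphs depict: the spread across positions collapses when the speaker is visible.

```python
# Invented ratings of one speaker in three room positions under the two
# conditions; the values are assumptions for illustration only.
ratings = {
    "sighted": {"position 1": 6.8, "position 2": 6.7, "position 3": 6.9},
    "blind":   {"position 1": 5.2, "position 2": 6.6, "position 3": 6.0},
}

for condition, by_position in ratings.items():
    spread = max(by_position.values()) - min(by_position.values())
    print(f"{condition}: rating spread across positions = {spread:.1f} points")
```

A near-zero spread under sighted conditions, against a spread of a point or more blind, is exactly what "unresponsive to real changes" means here.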
 
Enter the audiophiles to claim that their hearing was *less* sensitive when blinded, for instance that their sensitivity was blunted or changed by the *pressure* of blinded tests.
 
Ignorance really has no bounds. When you focus, as you would need to in order to hear the minute differences between different gear, your peripheral vision diminishes; the brain narrows your vision naturally.

In any case, if someone is so deeply affected by seeing a curtain instead of a speaker that they can no longer hear properly, and refers to the experience as “pressure”, I think they have more serious clinical issues than being an audiophile.
 
Being an audiophile is a symptom of clinical issues. We all have them to a certain extent, resulting in all sorts of odd behavior, often involving purchases and consumption.
 
I remember following the Zipser challenge in 'real time' on that audio board. Fun days!
Not long after that, Steve Zipser and his wife came through Colorado.

I had a lovely dinner with him and his wife. He was actually a really nice guy. I think we all were, once we got offline.
 
Here’s a nice test of DACs by Tom’s Hardware. It also reveals issues with volume calibration, which yielded a false positive; once the issue was recognized, the results were predictable.

 
Did I link this one on high definition audio? 318 participants.


Hi-Res Audio or HD-Audio provides no perceptible fidelity improvement over a standard-resolution CD or file. CD-spec and hi-res audio versions sound identical to the vast majority of listeners through systems of all kinds. I’ll present the track-by-track breakdown over the next few articles, but the responses present a picture that is undeniable. In fact, over 25% of the listeners that submitted their results indicated “No Choice” when asked to pick the hi-res track. People were honest and acknowledged that they could not tell the two versions apart. And those that made a selection admitted that it “was virtually impossible” to detect any differences or that “they were essentially guessing” which was which.
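As a side note on that sample size, a quick normal-approximation sketch (the 50% guess rate is the chance baseline, not a figure from the article, and it ignores the "No Choice" responses):

```python
# With 318 responses and pure guessing between the two versions, how many
# "correct" picks would still be consistent with chance?
import math

n, p = 318, 0.5                  # responses; probability of a correct guess
mean = n * p
sd = math.sqrt(n * p * (1 - p))  # binomial standard deviation
low, high = mean - 1.96 * sd, mean + 1.96 * sd
print(f"expected {mean:.0f} correct; 95% chance band: {low:.0f} to {high:.0f}")
```

Anything between roughly 142 and 176 correct picks out of 318 is indistinguishable from guessing, so a genuinely audible difference had plenty of room to show itself.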
 
Hi, I’m new to this hobby and I’m trying to do some research. I’ve tried to read most of the links in this thread, but I was unable to find the answer to:

Is there any good evidence in support of premium headphone amps/DACs? Is a $50 amp the same as a $2000 tube amp? I guess there are really two questions here: is it possible to tell the difference in a double-blind test? And is the more expensive one preferred in a blind test?
 
Good questions.

DACs are rarely different in an audible way. The main differences in the amps are drive capability (power, voltage, current) and source impedance.

Price is irrelevant; it's all about the ability to drive your headphones, and about frequency response errors from high source impedance. Expensive tube amps may have worse performance (or at best be audibly equivalent to a less expensive solid state amp); they are mostly for the bling factor.
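A minimal sketch of the source-impedance point (the headphone impedance values are assumed for illustration; real headphones vary): the amp's output impedance forms a voltage divider with the headphone, so a load impedance that swings with frequency turns into a frequency response error.

```python
# Voltage divider between the amp's output impedance and the headphone.
# The 32-60 ohm swing is an assumed example of a headphone whose impedance
# varies with frequency.
import math

def level_db(z_load, z_source):
    return 20 * math.log10(z_load / (z_load + z_source))

z_low, z_high = 32.0, 60.0     # assumed headphone impedance extremes (ohms)
for z_source in (0.5, 120.0):  # low-impedance amp vs. a high-impedance output
    deviation = level_db(z_high, z_source) - level_db(z_low, z_source)
    print(f"source {z_source:>5} ohm: response variation {deviation:.2f} dB")
```

At 0.5 ohm the error is a few hundredths of a dB; at 120 ohm it is about 4 dB, which is well into clearly audible territory.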
 
Thank you for the guidance! This is what I suspected. Are there any data or tests I can review?
 
There's a ton of them on this site; use the Review Index. Good luck!
 
I hear what you are saying; logically it makes sense. On the other hand, it's hard to believe there are so many companies, and so many amp products even within one company, and they all sound the same? They don't advertise just 'more features'; they advertise that the sound is superior.
 
Of course they do - they are in the market for selling amplifiers, not blind testing.

It's what the law calls "puffery": the advertising practice of making a product sound better than its competitors. Certain claims ("sounds better", "unequaled") are to be expected and are not capable of factual dispute.
 
A true catalog of blind test should be written in braille.
 