
Double Blind Testing FAQ Development

OP
C

CMOT

Active Member
Joined
Feb 21, 2021
Messages
147
Likes
114
This is why the FAQ/Guide needs to be very simple and clear. If we make it too high a bar, no one will do it and few will believe. The perfect IS the enemy of the good.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,052
Likes
36,427
Location
The Neitherlands
So simple and clear.... start drafting and then refine from the comments you get.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,663
Likes
241,011
Location
Seattle Area
Let's get a simple scenario done. Person changes power cord and "hears" an improvement. Such a person is not going to be able to get a switchbox for this, and probably wouldn't accept one anyway. So let's cover this scenario for now.

Intro:
Our ears receive audio signals, but it is our brain that interprets them. Unfortunately the brain uses more than sonic input to decide what it is perceiving. It routinely uses our past experiences and other sensory input to draw conclusions. This is why high-end restaurants serve food with pretty presentation and use the saying, "you eat with your eyes." When assessing audio differences, it is important that we don't "hear with our eyes."

0. This FAQ is for devices that don't have variable output, such as cables, digital sources, and various tweaks (e.g. cable lifters, outlets).

1. Since we are interested in just the sound waves that hit your ears, it is critical that any testing removes your knowledge of which item is in use. Note that not wanting an outcome does not fix this problem. You listen differently from moment to moment when testing, and even if the sound has not changed, you can perceive a difference.

2. Run the test as you have been but this time have someone else switch the test item with another. You decide how long you want to listen. If you need a full day, then have the switching happen once a day. If you think the difference is very immediate and large, then have the switching happen quickly in sequence as you ask for it.

3. It is extremely easy to get lucky and guess correctly which item is which once, twice or maybe even five times in a row. Try this with a coin flip and you will see the reality of this. As such, the test needs to be run enough times that we can rule out chance. So repeat the experiment at least 10 times and aim to get 9 instances right. Anything less unfortunately means that the difference is not likely to be there even though your intuition says otherwise. Remember, if the difference is large, then you should be able to get 10 out of 10 right.

4. Have the person doing the switching keep a log and randomly switch or not switch the samples. You can use a random number generator like: https://www.random.org/. Put in 0 for minimum and 1 for maximum. Hit generate and it will give you a random 0 or 1. Assign 0 to one sample and 1 to the other. Have the person doing the switching (proctor) do this 10 times to generate the sequence for the test. Once the test is under way, have the proctor write down your answers without any kind of communication/feedback to you.

5. It goes without saying that you have to hide the nature of change. Simple solutions like turning your back to the device, or using a cover may be effective.

6. Note that you can run the test with no stress as no one is looking at you and the outcome is just between you and the equipment.
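
As a rough illustration of steps 3 and 4, here is a minimal Python sketch. It is not part of the proposed FAQ: the function names are made up for this example, and Python's own random generator stands in for random.org. It builds the proctor's 0/1 sequence, scores the listener's answers, and works out how likely a given score would be from pure guessing.

```python
import random
from math import comb


def make_sequence(trials=10, rng=None):
    """Return a random list of 0/1 labels, one per trial (0 = one sample, 1 = the other)."""
    if rng is None:
        rng = random.SystemRandom()  # OS-provided randomness, standing in for random.org
    return [rng.randint(0, 1) for _ in range(trials)]


def count_correct(sequence, answers):
    """Count the trials where the listener's answer matches the proctor's log."""
    if len(sequence) != len(answers):
        raise ValueError("sequence and answers must have the same length")
    return sum(s == a for s, a in zip(sequence, answers))


def chance_probability(correct, trials):
    """Probability of getting at least `correct` of `trials` right by guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print("Proctor's sequence:", make_sequence(10))
    for score in (5, 8, 9, 10):
        print(f"{score}/10 or better by guessing: {chance_probability(score, 10):.4f}")
```

Under these assumptions, 9 or more out of 10 happens by pure guessing about 1.1% of the time (11 in 1024) and 10 out of 10 about 0.1% of the time (1 in 1024), whereas five correct in a row happens about 3% of the time (1 in 32).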

@CMOT, please focus on getting this done as this is the most pressing need I have in dealing with myriads of people coming to us with this type of scenario.
 
OP
C

CMOT

Active Member
Joined
Feb 21, 2021
Messages
147
Likes
114
Hi Folks. Been busy. But won't drop this, just need to take it as I have time. Amir's outline above is a good start. I have been thinking about this in down time and here is a proposed structure:

1. Introduction: Goals, Principles, etc (1 page or so)
2. Stimuli - what signals should be used (ideally). If @j_j is willing, maybe he could create a suite of test stimuli ramped the way he suggested (most people out there won't have access or skills to use Matlab or even Audacity). We should think about how many total samples and their lengths - should be long enough to be "musical" but not too long. Since I am proposing a 2IFC design, the combinatorics for all trials goes up quickly given all pairwise combinations. So maybe 10-20 total signals would be more than sufficient as stimuli. Should represent a range of different sorts of music - classical, jazz, rock, even country (ack), vocals, etc. I think the short snippets would be okay to use under fair use copyright rules and so we could put the signals somewhere where people could easily download.
3. Design - basic structure of the experiment - explain 2IFC and how to construct a full on experiment relatively easily
4. Methods - specifics about controlling for other factors, etc. Basically how to carefully do double blind 2IFC so that the experiment is well run - maybe a sheet with fill in the blanks so everyone is on the same page
5. Results - how to record results, maybe a scoring sheet...
6. Analysis - two straightforward analyses with links to simple web interface applets to run the analyses. Basically some version of d-prime and some significance tests as well
7. Discussion - what might one's results mean. How to interpret. How to report back to the community - sharing methods sheet, results sheet, and analyses

Yes, it is structured a bit like a scientific paper....
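
To give a feel for items 2 and 6, here is a small Python sketch under some loud assumptions: it treats "all pairwise combinations" as every stimulus paired once with every other, it uses the textbook relation d' = sqrt(2) * z(proportion correct) for a two-interval forced-choice task, and it clamps perfect or zero scores by half a trial so that d' stays finite. The function names are hypothetical and this is not the analysis applet the outline refers to.

```python
from math import comb, sqrt
from statistics import NormalDist


def pairwise_trials(n_stimuli):
    """Number of distinct stimulus pairs if each stimulus meets every other once."""
    return comb(n_stimuli, 2)


def dprime_2ifc(correct, trials):
    """Estimate d' from the proportion correct in a 2IFC task.

    Assumption: scores of 0% or 100% are pulled in by half a trial,
    a common convention that avoids infinite z-scores.
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials and trials > 0")
    pc = correct / trials
    pc = min(max(pc, 0.5 / trials), 1 - 0.5 / trials)
    return sqrt(2) * NormalDist().inv_cdf(pc)


if __name__ == "__main__":
    for n in (10, 20):
        print(f"{n} stimuli -> {pairwise_trials(n)} pairs")
    for score in (5, 7, 9, 10):
        print(f"{score}/10 correct -> d' = {dprime_2ifc(score, 10):.2f}")
```

With 10 stimuli there are 45 pairs and with 20 there are 190, which is why the trial count grows quickly. A score at chance (5 of 10) gives a d' of zero under this formula.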
 

MarkS

Major Contributor
Joined
Apr 3, 2021
Messages
1,077
Likes
1,514
What IS your goal?

If it's to persuade subjectivists that they actually can't hear differences that they think they can, this is most definitely not going to do it.
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,703
Location
California
What IS your goal?

If it's to persuade subjectivists that they actually can't hear differences that they think they can, this is most definitely not going to do it.

Was just about to say the same thing.
 

oldsweatyman

Member
Joined
Oct 26, 2020
Messages
16
Likes
15
Can’t say I’m qualified (I’m a mere medical student) but even the potential “goals” here seem quite biased to begin with. I’m in the category of a comparatively ignorant audio-newcomer who finds these discussions useful to consider for further purchases.

Repeatedly, the idea of “disproving” outside claims of differences and whether or not such evidence will be convincing to the people who claim these differences is presented as unidirectional… if the point is to be scientific instead of, in all honesty, coming off as scolds desperate to say “I told you so,” there should be a discussion on the standard criteria for your own claims as well.

What are the minimum criteria an “objectivist” would accept/require before they cease to say that the subjectivist is incorrect? The hounding and shifting of goalposts occurs on ASR just the same, especially given that blind testing standards (as we can see on this thread) are ultimately rather arbitrary. There is effectively no limit to stringency, and the acceptability of leniency can become subjective.

My point is that pre-defining criteria for what the people on here perceive as “defeat” is equally important if you want to be “scientific.”

Seems to me that “You can’t hear a difference because measurements demonstrate that” can also be answered with “Please refer to the blind test FAQ to provide evidence for that statement.”
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,052
Likes
36,427
Location
The Neitherlands
My point is that pre-defining criteria for what the people on here perceive as “defeat” is equally important if you want to be “scientific.”

Seems to me that “You can’t hear a difference because measurements demonstrate that” can also be answered with “Please refer to the blind test FAQ to provide evidence for that statement.”

It's not about being 'right', nor is that the intent.

It's about outlining what is needed to do more valid testing (using the ears).
The problem is that you will need 2 pages.
One page showing what is needed when you want evidence to hold up in 'objectivists court' for publishing papers on it that can pass peer reviews and are repeatable.
As this is not feasible for mere mortals (it really isn't) you will need a second description on how to test 'better' for your own personal opinion.
This cannot be used to 'prove' you can or can't pass this test. The reason is that there can still be give-aways or it can be borked (by accident or on purpose).

But at least with a 'blind testing guide for dummies' you can point to it. Those interested in truth finding for themselves can test acc. to the outlined protocol and make up their own minds a bit more 'rigidly'.

Hoping at least the dummy guide comes to fruition.
 

oldsweatyman

Member
Joined
Oct 26, 2020
Messages
16
Likes
15
One page showing what is needed when you want evidence to hold up in 'objectivists court' for publishing papers on it that can pass peer reviews and are repeatable.

That's what I'm sayin. I was essentially pointing out that there hadn't been discussion on this until I brought it up. Instead, it was focused on the opposite, which is will it hold up in "subjectivists court" as if the intention was to be "right." This, of course, muddies the scientific idealism here. I went further to say that the discussion becomes meaningless rather quickly when one realizes that stringency and leniency are rather arbitrary. I do believe that there is a likely consensus point for "good enough," just that attempting to please everyone is impossible because the goals and chosen definition of "objective" are arbitrary.

Anyway, +1 for the page on a FAQ for mere mortals/dummies on easy and cheap improved testing for our own personal opinions, looking forward to it.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,052
Likes
36,427
Location
The Neitherlands
The whole issue is that most people that prefer to use sight and hearing are convinced any 'objective' testing is flawed. The reasons are obvious. It takes away their abilities which are beyond question.

The FAQ is needed to point people to when making claims so that, when they really want to know (most don't, they already are convinced), they can try to test that way. It would only have to be for personal enlightenment. It can't be used for anything else.
 

oldsweatyman

Member
Joined
Oct 26, 2020
Messages
16
Likes
15
The whole issue is that most people that prefer to use sight and hearing are convinced any 'objective' testing is flawed. The reasons are obvious. It takes away their abilities which are beyond question.

The FAQ is needed to point people to when making claims so that, when they really want to know (most don't, they already are convinced), they can try to test that way. It would only have to be for personal enlightenment. It can't be used for anything else.

Essentially any given 'objective' testing is ultimately flawed to some degree though, as I mentioned before, despite the implications on countless posts here that seem to desperately believe otherwise. Regardless, there should be no value attributed on a "scientific" forum (theoretically) to addressing the boogeyman of the inconsolable subjectivist. The only value that would have would be to say "I told you so," and the fact that the boogeyman is repeatedly brought up leads one to think that science isn't the goal here, it's to say "I told you so."

So, I agree with the part where you said the FAQ should only be for personal enlightenment. Case in point, we may end up finding that so-called "subjectivists" will point to this FAQ to show that "objectivists" are wrong and that even with the consensus standards, differences and decisions of "better" or "worse" are routinely made. Isn't finding out which way it will go on an individual level the actual whole point here?

With that said, I would like to make an attempt to answer two of the fundamental questions previously asked.

The intended audience of this FAQ should be the members of this forum whose goal is to have conversations with a pre-determined baseline standard degree of "objectivity" to discuss audio gear and how they compare. This would lead to further rational and productive discussion while acknowledging the flaws and benefits of the chosen minimum standard. This acknowledgement must also include the fact that all "objective" testing is ultimately flawed one way or another because actual true objectivity isn't possible, practical, or important anyway. I like the idea of iterations of rigidity on the FAQ. This approach is in direct contrast to pointing someone to a FAQ to prove some kind of point one way or another.
 

shuppatsu

Active Member
Joined
Apr 27, 2023
Messages
135
Likes
185
The target for matching levels is 0.1 dB across the full audible range although this is more strict than research tends to indicate.
Is there any non-AES-walled research available on this topic? I’ve seen many places on ASR saying that 0.2dB is probably fine and higher than that can affect preferences, but I haven’t seen reference to any studies along those lines.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,789
Location
My kitchen or my listening room.
Is there any non-AES-walled research available on this topic? I’ve seen many places on ASR saying that 0.2dB is probably fine and higher than that can affect preferences, but I haven’t seen reference to any studies along those lines.
They're pretty old, and you might want to look in JASA, but I don't recall them any more.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,052
Likes
36,427
Location
The Neitherlands
Is there any non-AES-walled research available on this topic? I’ve seen many places on ASR saying that 0.2dB is probably fine and higher than that can affect preferences, but I haven’t seen reference to any studies along those lines.
Take a good audio file.

Normalize it to -1dB and save it as ref. (0dB)
Normalize the original file to -1.1 save as -0.1
Normalize the original file to -1.2 save as -0.2
Normalize the original file to -1.3 save as -0.3
Normalize the original file to -1.4 save as -0.4
Normalize the original file to -1.5 save as -0.5

Then use ABX comparator to determine your personal threshold. Educational and you'll know what matters to you and if you come to the same conclusion (around 0.2dB) then you will know why 0.1dB is important. (safety margin)

Have fun... these kinds of simple tests are highly enlightening.
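
For anyone who would rather script the file set than click through an editor, here is a minimal Python sketch of the same idea. It is only an illustration: it works on a plain list of float samples (full scale = 1.0) instead of real audio files, the function names are invented for this example, and reading or writing WAV files is left out. Every version is derived from the same source and keeps the same number of samples, so the versions stay sample-aligned.

```python
import math


def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def peak_dbfs(samples):
    """Peak level of the samples in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak)


def normalize_peak(samples, target_dbfs):
    """Scale the samples so that their peak sits at target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("cannot normalize a silent signal")
    gain = db_to_gain(target_dbfs) / peak
    return [s * gain for s in samples]


def make_test_set(samples, ref_dbfs=-1.0, steps=(0.0, 0.1, 0.2, 0.3, 0.4, 0.5)):
    """Return {step_in_dB: samples}, each normalized from the SAME original."""
    return {step: normalize_peak(samples, ref_dbfs - step) for step in steps}


if __name__ == "__main__":
    # Hypothetical stand-in for "a good audio file": 0.1 s of a 1 kHz tone at 48 kHz.
    tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]
    for step, version in make_test_set(tone).items():
        print(f"-{step:.1f} dB version: peak {peak_dbfs(version):.2f} dBFS, {len(version)} samples")
```

The step 0.0 entry is the reference (peak at -1 dBFS); the others sit 0.1 to 0.5 dB below it, ready to be compared against the reference in an ABX tool.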
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,789
Location
My kitchen or my listening room.
Take a good audio file.

Normalize it to -1dB and save it as ref. (0dB)
Normalize the original file to -1.1 save as -0.1
Normalize the original file to -1.2 save as -0.2
Normalize the original file to -1.3 save as -0.3
Normalize the original file to -1.4 save as -0.4
Normalize the original file to -1.5 save as -0.5

Then use ABX comparator to determine your personal threshold. Educational and you'll know what matters to you and if you come to the same conclusion (around 0.2dB) then you will know why 0.1dB is important. (safety margin)

Have fun... these kinds of simple tests are highly enlightening.
This is a good idea, but I should add: make sure that the two files are exactly time-aligned, to the sample.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,052
Likes
36,427
Location
The Neitherlands
Yep, that's why I suggested to use the same source file and also create (normalize) the new reference.
Sometimes, indeed, after some operations empty samples are added.
This way this should be the same for all files one plays with.
 

shuppatsu

Active Member
Joined
Apr 27, 2023
Messages
135
Likes
185
Take a good audio file.

Normalize it to -1dB and save it as ref. (0dB)
Normalize the original file to -1.1 save as -0.1
Normalize the original file to -1.2 save as -0.2
Normalize the original file to -1.3 save as -0.3
Normalize the original file to -1.4 save as -0.4
Normalize the original file to -1.5 save as -0.5

Then use ABX comparator to determine your personal threshold. Educational and you'll know what matters to you and if you come to the same conclusion (around 0.2dB) then you will know why 0.1dB is important. (safety margin)

Have fun... these kinds of simple tests are highly enlightening.
Thanks, I’ll try that when I get back home in a week or so. Or maybe just download that Java-based comparator on my laptop.
 

GaryH

Major Contributor
Joined
May 12, 2021
Messages
1,351
Likes
1,859
Is there any non-AES-walled research available on this topic? I’ve seen many places on ASR saying that 0.2dB is probably fine and higher than that can affect preferences, but I haven’t seen reference to any studies along those lines.
Dr. Floyd Toole cites Kommamura and Mori (1983), New Measurement of Frequency Response Flatness Limen for the ~0.1 dB (per octave) figure. In fact they found even smaller differences could be distinguished in some cases:
[attached figure]
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,789
Location
My kitchen or my listening room.
Yep, that's why I suggested to use the same source file and also create (normalize) the new reference.
Sometimes, indeed, after some operations empty samples are added.
This way this should be the same for all files one plays with.
That *should* work.

I've had interesting experiences with a variety of audio processing systems. I can't name names, but yes, that SHOULD work.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,052
Likes
36,427
Location
The Neitherlands
Dr. Floyd Toole cites Kommamura and Mori (1983), New Measurement of Frequency Response Flatness Limen for the ~0.1 dB (per octave) figure. In fact they found even smaller differences could be distinguished in some cases:
[attached figure]
There are about 10 octaves in the audible band (20Hz to 20kHz) so with music this means a 1dB tilt was audible. This is something different than just an overall level difference.
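
The arithmetic behind that can be checked with a few lines of Python. This is just a sketch: the 0.1 dB per octave slope is the figure quoted above, and the function names are made up for the example.

```python
import math


def octaves(f_low_hz, f_high_hz):
    """Number of octaves between two frequencies (one octave = a doubling)."""
    return math.log2(f_high_hz / f_low_hz)


def total_tilt_db(slope_db_per_octave, f_low_hz, f_high_hz):
    """Level difference between band edges for a constant dB-per-octave tilt."""
    return slope_db_per_octave * octaves(f_low_hz, f_high_hz)


if __name__ == "__main__":
    span = octaves(20, 20000)
    tilt = total_tilt_db(0.1, 20, 20000)
    print(f"20 Hz to 20 kHz spans {span:.2f} octaves")
    print(f"0.1 dB/octave over that span is a {tilt:.2f} dB tilt end to end")
```

20 Hz to 20 kHz is a factor of 1000, which is about 9.97 octaves, so a 0.1 dB per octave slope adds up to roughly 1 dB between the two ends of the band.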
 