
Double Blind Testing FAQ Development

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,570
Likes
239,101
Location
Seattle Area
No, really you don't. The wild ass claim is just that. You only need to disprove the claim for the specific condition set, not for every condition imaginable.
Please take caution in your tone as you address our audio luminaries in their field of expertise. This is not a fighting thread. If you want to contribute to the FAQ, do so. Otherwise, don't post in such manner or I will give you a reply ban.
 
OP

CMOT

Active Member
Joined
Feb 21, 2021
Messages
147
Likes
114
If you go to loudness differences (in partial loudness, of course) over time, and the ability to distinguish loudness differences inside an ERB, you will find that anything but a smooth, quick transition hides your data. This is not about masking, rather about short-term auditory memory.

Masking occurs at many different levels. It can be purely signal (e.g., at the basilar membrane), but it can also occur at the level of auditory short-term memory. Perhaps we are just using the term differently due to different backgrounds. And agreed that abrupt offsets and onsets will produce clicks that will then produce signal masking. But even if one correctly smoothed the signal so that this wasn't the case, if the interval between Signal 1 and Signal 2 were too brief one would get masking at the level of auditory short-term memory (or you can call it interference or whatever, but basically the first signal isn't sufficiently processed and isn't well retained, the result of which is that it is difficult to compare against the second signal). Maybe this is what some here refer to as "decision processes". Different terminologies. In any case, the main point is that one needs to attend to the shape of the signals (no abrupt offsets), but also the time durations of the signals and the time duration between the signals.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,570
Likes
239,101
Location
Seattle Area
Adding on, such an FAQ can be used in many instances. When someone said 320 kbps MP3 is transparent, I took an ABX test to show it was not. This was not a "wild ass claim" on my part indicating it was not transparent. Just because you think something is inaudible doesn't mean it is.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,570
Likes
239,101
Location
Seattle Area
I agree with most of this stuff but this one I can't let go. First I can tell you the tests need to be "can you hear ANY difference"? If this is the case, why would there need to be feedback during the test?
Because not all impairments are constant throughout the selected music. A lossy compression algorithm, for example, may disturb transients (adding "pre-echo"), and this is only audible under certain circumstances. One needs to find where this happens by trial and error, playing different segments of the clip and having the system verify that the detected difference is real. If not, then one can seek out another segment.

If you were to guarantee that the impairment shows up during any part of the music, then sure, you don't need to do this, but then again, that is very hard, if not impossible, to do.

Also, it is important to have this feedback during the test as well. Otherwise, you will waste time running a trial, spending 10 minutes only to find out that you had gotten it wrong from the start.

Most tests are ruined by "tells".
They can be, but we need to keep in mind that this FAQ will be used generally for both outrageous claims and real ones. Examples of the latter are generational loss through ADC/DAC, or lossy compression. These are the types of tests I have run and passed using double blind ABX. Interestingly enough, the generational loss test had a simple tell I found after the fact: the metadata was only in the original and not in the generational loss file! I let the test creator know but I think I was the only one to find this. Others tried and failed miserably.

All in all, I would rather see us finish this FAQ than argue about it and have nothing.
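
For the FAQ, it may help to show how an ABX score is usually judged against guessing. The sketch below is not from any post in this thread: it assumes the standard one-sided binomial test with a 50% chance per trial, and the function name `abx_p_value` is made up for illustration.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting `correct` or more right out of `trials` by pure guessing.

    Assumes each ABX trial is an independent coin flip (p = 0.5).
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Sum the upper tail of the binomial distribution.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


if __name__ == "__main__":
    # Hypothetical scores, only to show how the numbers move with trial count.
    for correct, trials in [(8, 10), (9, 10), (12, 16), (14, 16)]:
        p = abx_p_value(correct, trials)
        print(f"{correct}/{trials} correct: p = {p:.4f}")
```

Which threshold counts as a "pass" is a choice the FAQ would have to state; the code only reports the probability.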
 
OP

CMOT

Active Member
Joined
Feb 21, 2021
Messages
147
Likes
114
One thought here as we all hash this out. We are going to end up with a (hopefully) short FAQ/guide that people can use in doing comparisons between components. Maybe we should also have a short form that could be completed. A sort of checklist of things such as "I ensured that the entire audio chain was identical except for the two comparison items" and "I matched level differences at sound playback by using _______" and "I ran Y trials each of the following types" and so on. Plus a standardized results table. So that all tests can be compared/understood using the same parameters. And then we can look at these parameters and decide for ourselves if we accept the results. A lot like science and peer review. Not every experiment will be perfect but all of the testing conditions will be made explicit. Then we can each judge for ourselves whether we accept the results or feel there is a flaw in the design (and explain why we think that flaw changes our interpretation)
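
One possible shape for such a form, as a rough sketch only. Every field name here is hypothetical and the list of checks is certainly incomplete; the point is just that a fixed record makes the conditions explicit and lets readers see what was left unstated.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BlindTestReport:
    """Hypothetical checklist-plus-results record for one comparison."""
    items_compared: Tuple[str, str]
    chain_identical_except_items: bool
    level_match_method: str              # e.g. "voltmeter, test tone"; "" if none
    level_mismatch_db: Optional[float]   # residual mismatch, None if not measured
    trials: int
    correct: int
    notes: List[str] = field(default_factory=list)

    def open_issues(self) -> List[str]:
        """List the checklist items that were not satisfied or not reported."""
        issues = []
        if not self.chain_identical_except_items:
            issues.append("audio chain differed beyond the two items under test")
        if not self.level_match_method:
            issues.append("no level-matching method stated")
        if self.level_mismatch_db is None:
            issues.append("residual level mismatch not reported")
        if self.trials <= 0:
            issues.append("no trials recorded")
        return issues

    def results_row(self) -> str:
        """One line for a standardized results table."""
        a, b = self.items_compared
        return (f"{a} vs {b} | trials={self.trials} | correct={self.correct} "
                f"| open issues={len(self.open_issues())}")


if __name__ == "__main__":
    report = BlindTestReport(("DAC A", "DAC B"), True, "", None, 16, 12)
    print(report.results_row())
    for issue in report.open_issues():
        print(" -", issue)
```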
 

audio2design

Major Contributor
Joined
Nov 29, 2020
Messages
1,769
Likes
1,830
You contend wrong. If there is an audible difference, there is an audible difference and claiming otherwise is a farce. You can ask for qualification from the tester on how significant that difference was and make some inferences, but not otherwise.

I consider that answer both hypocritical (for reasons below) and inaccurate. I am well aware of the research w.r.t. this figure and echoic memory. Inui's paper in 2010 really brought this to the forefront again, but at this point, you are turning music into a test signal, and audibility of test signals is always much higher than audibility of real music. A 0.2 dB shelving or boost over a small frequency range in music is pretty much, if not totally, inaudible. A fast transition coincident with a dominance of that frequency range is audible. Again, you took music and turned it into a test signal.

It is no different from your claims w.r.t. audibility in the thread on phase, where you claim masking from reflections. You can't make that claim and then state emphatically that my claim is incorrect.

And last, as I stated before, you can't actually transition in 10 msec, windowing or not, without fundamentally introducing something into the signal that was not there. It is literally impossible. You can soften it, but you have still introduced something that is not there, and hence, if a detection is made, you also cannot state definitively why the change was detected. I asked that question in 2010, and the answer is still the same: you cannot state definitively what was detected.
 
OP

CMOT

Active Member
Joined
Feb 21, 2021
Messages
147
Likes
114
One other thought (then that is it for me for at least a day since I will be offline): we should aim to keep things clear and simple if at all possible. A long document or overly complex procedure won't get used. We would prefer that more subjectivists be willing to try such tests and see what happens. Maybe they will be surprised sometimes. If we make the requirements too demanding, no one will use it and everyone will just be back to square one. I am concerned about easy ways for interested participants to: match sound level differences at output, efficiently switch in a manner that doesn't provide any "tells", and use signals that don't produce artifacts (e.g., clicks from abrupt offsets).
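
On the level-matching item, here is a minimal sketch of the arithmetic, assuming you already have two captures (lists of samples) of the same test tone through each device. The function names are invented for this example, and how close is "close enough" is left to the FAQ.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def level_difference_db(a, b):
    """Level of capture b relative to capture a, in dB (positive: b is louder)."""
    ra, rb = rms(a), rms(b)
    if ra == 0 or rb == 0:
        raise ValueError("cannot compare against digital silence")
    return 20 * math.log10(rb / ra)


def matching_gain(a, b):
    """Linear gain to apply to b so that its RMS level equals that of a."""
    rb = rms(b)
    if rb == 0:
        raise ValueError("cannot scale digital silence")
    return rms(a) / rb


if __name__ == "__main__":
    rate = 48000
    # Stand-in captures: the same 1 kHz tone, the second one 5% higher in amplitude.
    a = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(rate // 10)]
    b = [1.05 * x for x in a]
    print(f"mismatch before: {level_difference_db(a, b):+.3f} dB")
    g = matching_gain(a, b)
    print(f"mismatch after:  {level_difference_db(a, [g * x for x in b]):+.3f} dB")
```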
 

audio2design

Major Contributor
Joined
Nov 29, 2020
Messages
1,769
Likes
1,830
Please take caution in your tone as you address our audio luminaries in their field of expertise. This is not a fighting thread. If you want to contribute to the FAQ, do so. Otherwise, don't post in such manner or I will give you a reply ban.

Just because I don't post my qualifications on here, does not mean I am not qualified. You had no issue telling me point blank I was wrong, and your answer was questionable at best. Well, I have no issue telling a "luminary" they are wrong, and I have the experience, research, and qualifications to do that. I was called "wrong" by quite a few luminaries in a thread I posted about, where they misapplied several fundamental aspects of circuit theory and physics. If we cannot challenge "luminaries" then this is not a science forum, but another echo chamber. I won't feel too bad about a reply ban if that is the case.
 
OP

CMOT

Active Member
Joined
Feb 21, 2021
Messages
147
Likes
114
I consider that answer both hypocritical (for reasons below) and inaccurate. I am well aware of the research w.r.t. this figure and echoic memory. Inui's paper in 2010 really brought this to the forefront again, but at this point, you are turning music into a test signal, and audibility of test signals is always much higher than audibility of real music. A 0.2 dB shelving or boost over a small frequency range is pretty much, if not totally, inaudible. A fast transition coincident with a dominance of that frequency range is audible. Again, you took music and turned it into a test signal.

It is no different from your claims w.r.t. audibility in the thread on phase, where you claim masking from reflections. You can't make that claim and then state emphatically that my claim is incorrect.

And last, as I stated before, you can't actually transition in 10 msec, windowing or not, without fundamentally introducing something into the signal that was not there. It is literally impossible. You can soften it, but you have still introduced something that is not there, and hence, if a detection is made, you also cannot state definitively why the change was detected. I asked that question in 2010, and the answer is still the same: you cannot state definitively what was detected.

I *think* I know a bit about perception. But I am confused by your point here. What do you mean by "transition in 10msec"? Do you mean that short-duration signals or intervals inject noise into the audible range in the signals? As in the point above that an abrupt onset or offset can introduce a click? Also, by "audibility of test signals is always much higher than audibility of real music", are you saying that one can find perceived differences in experimental conditions that one would never perceive under real-world listening conditions?
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,278
Likes
4,784
Location
My kitchen or my listening room.
I *think* I know a bit about perception. But I am confused by your point here. What do you mean by "transition in 10msec"?

I have no idea what he means, but the right way to do a transition between two gain-equalized, time-aligned signals of very, very similar characteristics is windowing. About the safest way is to use a half-Hann window and its complement to window between the two signals.

This will only add new frequency content when the two signals are quite different to start with. If they are quite different to start with, well, ok, yes, you can hear that. That's the point, too.
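
A minimal sketch of that kind of switch, for anyone who wants to try it. It assumes the two signals are already level-matched, time-aligned and of equal length; the function names and the 10 ms fade length in the demo are choices made for this example, not a reference implementation.

```python
import math


def half_hann_fade_in(n):
    """Rising half of a Hann window: n gains going from 0.0 up to 1.0."""
    if n < 2:
        raise ValueError("fade needs at least 2 samples")
    return [0.5 - 0.5 * math.cos(math.pi * i / (n - 1)) for i in range(n)]


def crossfade(a, b, start, fade_len):
    """Play a, then move to b over fade_len samples beginning at index start.

    The gain on b is the half-Hann ramp w and the gain on a is its
    complement 1 - w, so the two gains always sum to one.
    """
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    if start < 0 or start + fade_len > len(a):
        raise ValueError("fade does not fit inside the signals")
    w = half_hann_fade_in(fade_len)
    out = list(a[:start])
    for i in range(fade_len):
        out.append((1.0 - w[i]) * a[start + i] + w[i] * b[start + i])
    out.extend(b[start + fade_len:])
    return out


if __name__ == "__main__":
    rate = 44100
    fade = rate // 100  # 10 ms
    a = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(rate // 10)]
    b = [0.99 * x for x in a]  # stand-in for a nearly identical second signal
    same = crossfade(a, a, 2000, fade)
    diff = crossfade(a, b, 2000, fade)
    print("max change when a == b:", max(abs(x - y) for x, y in zip(same, a)))
    print("max change when b = 0.99 a:", max(abs(x - y) for x, y in zip(diff, a)))
```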

BUT masking really is not the issue here. Allen and a bunch of other people have demonstrated clearly with self-training signal-detection tests, that smooth, quick transition is what leads to the most sensitive results. I'll have to stick with that.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,278
Likes
4,784
Location
My kitchen or my listening room.
And last, as I stated before, you can't actually transition in 10msec, windowing or not, without fundamentally introducing something into the signal that was not there.

Remember, we are crossfading, with sum to one windows (half-Hann window is a good choice), between two signals that are very close to each other. These two signals are time and level matched to one sample (at 44.1) or less. (If a signal is 10 ms offset or so, and there's some percussion involved, you're going to notice that anyhow if you've practiced for doing that.) If we are not doing time and level alignment, the whole test is superfluous, anyhow. No, you can't do this with speakers, but I'm not very worried about speakers NOT sounding different, after all.

But yes, it is possible to do a clean crossfade. This isn't "off then on", it's a crossfade or, more particularly, windowing between two very similar signals. Under such circumstances, there should be very, very little "introduced". Quite obviously, between two signals that are identical, there is no error introduced. It is not until substantial differences are involved that substantial error is added.
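
Written out, the argument looks roughly like this. The notation is introduced here for illustration only (a and b are the two aligned signals, w is the rising window gain, y is the output) and does not come from the posts above.

```latex
\begin{align*}
y[n] &= \bigl(1 - w[n]\bigr)\,a[n] + w[n]\,b[n], \qquad 0 \le w[n] \le 1 \\
     &= a[n] + w[n]\,\bigl(b[n] - a[n]\bigr) \\
\Rightarrow\quad \lvert y[n] - a[n] \rvert &\le \lvert b[n] - a[n] \rvert
\end{align*}
```

With a = b the output equals a exactly. Otherwise the only thing present besides a is the windowed difference w(b - a), which stays small as long as the two signals are close.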
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,469
Likes
25,171
Location
Alfred, NY
I'm seeing the usual issues with talking about experimental design without considering what the question being asked is. That's causing some unnecessary friction.

"Is XYZ audible?" is an entirely different question than, "Person A claims to hear XYZ in his system. Can he actually hear this?" Not surprisingly, the approach to experimental design and controls will also be different. Insisting that appropriate controls for the first example must be applied to the second (or vice versa) is rather Procrustean.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,278
Likes
4,784
Location
My kitchen or my listening room.
I'm seeing the usual issues with talking about experimental design without considering what the question being asked is. That's causing some unnecessary friction.

"Is XYZ audible?" is an entirely different question than, "Person A claims to hear XYZ in his system. Can he actually hear this?" Not surprisingly, the approach to experimental design and controls will also be different. Insisting that appropriate controls for the first example must be applied to the second (or vice versa) is rather Procrustean.

It will always evolve to the second, as the claimant rejects the first in some fashion, by dismissing the test. I understand your point, but I've been in these debates once or thrice myself, and there's always an excuse.

I prefer to remove all excuses, as a result.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,278
Likes
4,784
Location
My kitchen or my listening room.
I've generated a file that windows between two signals, audibly different, in 10 milliseconds. While audio2design is doing that, he can also tell me what was "added" that was not in one or the other of the two signals. Nothing. That's why time alignment and level alignment are necessary. I've recorded the exact transfer points (but this was done with randomization of the two points of the switch) so ...

https://we.tl/t-OqI44WZjz2 Up at wetransfer for a week. Enjoy. Well, at least observe "nothing added"
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
15,982
Likes
36,177
Location
The Neitherlands
Q1: How to achieve time alignment when comparing 2 DACs at home ?
Q2: How to achieve crossfade between 2 analog signals and how long should the fade be ?
Q3: When you do hard switching (analog time/level aligned no fade available) should it be make before break at line level (not recommended for speaker amps) ?
Q4: When doing break before make; what would be the maximum allowed 'silence' in ms ? The max timing difference between L and R ?
Q5: When designing an AB box that can do this for speaker and line-level inputs the relay types should be fixed ?
Q6: When using the same box for DAC outputs how to achieve time alignment between DACs ? Would that require specific software ? Is that freeware ?
Q7: Is the FAQ meant to encourage folks to test properly at home or to show it is complicated and requires specialized equipment that is expensive/hard to find for the test to be valid (to discourage and think again before one tries) or is it to show that even when you think you AB'ed correctly you still might not have done so. ?
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,570
Likes
239,101
Location
Seattle Area
Just because I don't post my qualifications on here, does not mean I am not qualified.
State them then. Without it, you have not earned the right to bring this type of attitude. My qualifications in this topic are extensive as are JJ's. Conducting, taking and creating blind tests were part and parcel of our professional careers.

You had no issue telling me point blank I was wrong, and your answer was questionable at best. Well, I have no issue telling a "luminary" they are wrong, and I have the experience, research, and qualifications to do that. I was called "wrong" by quite a few luminaries in a thread I posted about, where they misapplied several fundamental aspects of circuit theory and physics. If we cannot challenge "luminaries" then this is not a science forum, but another echo chamber. I won't feel too bad about a reply ban if that is the case.
Your doctor is not going to have much patience with you if you challenge his qualifications with the attitude you are showing. Ask questions, disagree but do it with utmost respect as you are addressing our few luminaries. If this is too much to ask, then you don't belong in this forum. I won't warn you again.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,570
Likes
239,101
Location
Seattle Area
And last, as I stated before, you can't actually transition in 10msec, windowing or not, without fundamentally introducing something into the signal that was not there. It is literally impossible. You can soften it, but you have still introduced something that is not there, and hence, if a detection is made, you also cannot state definitively why the change was detected. I asked that question 2010, and the answer is still the same, you cannot state definitively what was detected.
So if I have two analog sources you are claiming I have to have an artificially long switchover time? I can't switch at zero crossing?

I have done plenty of AB tests with analog sources (e.g. headphone amps) with no tell in switching. Glitches if any will be bidirectional and random.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,278
Likes
4,784
Location
My kitchen or my listening room.
Q1: How to achieve time alignment when comparing 2 DACs at home ?
Q2: How to achieve crossfade between 2 analog signals and how long should the fade be ?
Q3: When you do hard switching (analog time/level aligned no fade available) should it be make before break at line level (not recommended for speaker amps) ?
Q4: When doing break before make; what would be the maximum allowed 'silence' in ms ? The max timing difference between L and R ?
Q5: When designing an AB box that can do this for speaker and line-level inputs the relay types should be fixed ?
Q6: When using the same box for DAC outputs how to achieve time alignment between DACs ? Would that require specific software ? Is that freeware ?
Q7: Is the FAQ meant to encourage folks to test properly at home or to show it is complicated and requires specialized equipment that is expensive/hard to find for the test to be valid (to discourage and think again before one tries) or is it to show that even when you think you AB'ed correctly you still might not have done so. ?

1) It's hard. You also have to get good level alignment. A scope is really handy there. If you're testing DACs it is possible to use a 4-channel output to the 2 stereo DACs (for instance) and build the delay into the program that loads the audio data into the driver.
2) This can be tough. Make before break can help IF you can ensure you don't cause output misconduct. Like you said, no, do not do this with power amps. That is in the realm of relays and best needs short smooth mutes around the relay "click". Yes, this is a pain in the behind.
3) Going to silence in a discontinuity (i.e. a click) is bad news. It might almost be a pot driven by the two (line level) inputs driven back and forth. That would work for low level signals.

If you HAVE to break before make, smoothly mute the input first, and then bring it up again afterwards. There's no way this can be an ultimate test, of course, but it's what you can do that isn't worse. Clicks really throw things off.
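
A rough sketch of such a mute, expressed as a per-sample gain envelope that would be multiplied into the signal ahead of the relay. The raised-cosine ramp shape, the function names and the sample counts in the demo are all assumptions for illustration; actual ramp and gap lengths would depend on the relay's settling time.

```python
import math


def raised_cosine(n):
    """n gains rising smoothly between 0 and 1 (both endpoints excluded)."""
    return [0.5 - 0.5 * math.cos(math.pi * (i + 1) / (n + 1)) for i in range(n)]


def mute_envelope(total, switch_at, ramp, gap):
    """Gain per sample: 1.0, smooth fade to 0, `gap` silent samples, smooth fade back.

    The fade-out ends at index switch_at, so the relay can change state
    anywhere inside the silent gap without a signal discontinuity.
    """
    start = switch_at - ramp
    end = switch_at + gap + ramp
    if ramp < 1 or gap < 0 or start < 0 or end > total:
        raise ValueError("ramp and gap do not fit inside the signal")
    up = raised_cosine(ramp)
    env = [1.0] * total
    env[start:switch_at] = up[::-1]              # fade out
    env[switch_at:switch_at + gap] = [0.0] * gap  # relay switches in here
    env[switch_at + gap:end] = up                 # fade back in
    return env


if __name__ == "__main__":
    env = mute_envelope(total=1000, switch_at=500, ramp=100, gap=50)
    biggest_step = max(abs(env[i + 1] - env[i]) for i in range(len(env) - 1))
    print("largest gain step between adjacent samples:", biggest_step)
```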

I don't know about the second DAC question. Yes, such software exists, no it's not freeware that I know of, although it would make sense for somebody to do that.
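
For what it's worth, the core of such a tool is not complicated. Below is a bare-bones sketch that estimates the whole-sample delay between two captures by brute-force cross-correlation. It assumes both captures were recorded at the same sample rate from the same source material; it does not handle sub-sample offsets or clock drift, and `estimate_delay` is a name made up here.

```python
import random


def estimate_delay(ref, other, max_lag):
    """Lag in samples (within +/- max_lag) at which `other` best matches `ref`.

    A positive result means `other` trails `ref` by that many samples.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += x * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    rng = random.Random(0)
    # Noise-like material gives a sharp correlation peak; a steady tone would not.
    ref = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    delayed = [0.0] * 7 + ref[:-7]
    print("estimated delay:", estimate_delay(ref, delayed, 20), "samples")
```

The estimated lag could then be fed back as the delay built into the playback program, as in point 1 above.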

As to FAQ - I'd make it a graduated scale, from "just barely double blind" to "as good as you can get". As you have observed, these tests are a pain in the behind.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
15,982
Likes
36,177
Location
The Neitherlands
Trying to figure out how to design an AB tester PCB that can be built by enthusiasts and would be easy to operate (using a PC/USB for d.b. testing) and make gerbers available for free.

But it looks like more serious testing would also require software to control relays, which would have to be encased with damping material (off board), as well as more options. There would also have to be some measuring and waveform checking (using an ADC ?).
I see more bears on the way like the fade thing and how to do this at home comparing 2 DACs with unknown delays which vary depending on the used reconstruction filter as well.

When the goal is to provide an easy and cheap way to AB as properly as possible it would still be a compromise.
I am getting the feeling those making claims that amp A sounds better to them than amp B in any sighted test cannot be persuaded to invest in something that may take away their lovely dream in a lot of cases.

Meaning that no matter what the FAQ will look like it will not solve the problem. The problem being claims made based on incorrect testing where we would point to the FAQ and the answer that will come back is something like: I know what I am hearing and don't need such expensive and complicated test methods.

The FAQ could be useful to show what would be involved to test specific types of devices to get valid test results. This could benefit hardcore objectivists that like to perform this type of testing and show their results incl. completely registering the used process and validation.

When we would be pointing to this FAQ to end one of the many discussions about 'proper testing' it would end all efforts on both sides of the fence as both 'sides' would realize having a conversation about proper testing procedures would be rather pointless when it involves too many aspects if it is to serve as evidence.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,570
Likes
239,101
Location
Seattle Area
When the goal is to provide an easy and cheap way to AB as properly as possible it would still be a compromise.
I am getting the feeling those making claims that amp A sounds better to them than amp B in any sighted test cannot be persuaded to invest in something that may take away their lovely dream in a lot of cases.
I think if we can make a simple switch box for analog sources that costs less than $50, we can always lend them one. Wonder how much work it is to have a simple tone generator and a digital meter for the output.
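
The tone generator half could even be a file rather than hardware. Here is a small sketch that builds a mono 16-bit sine-wave WAV in memory using only the Python standard library. The defaults (1 kHz, -12 dBFS, 48 kHz) are placeholders, and whether a given meter reads accurately at the chosen frequency is something to check against the meter's own specifications.

```python
import io
import math
import struct
import wave


def sine_wav_bytes(freq_hz=1000.0, seconds=5.0, level_dbfs=-12.0, rate=48000):
    """Return the bytes of a mono 16-bit WAV file containing a steady sine tone."""
    if level_dbfs > 0:
        raise ValueError("level above 0 dBFS would clip")
    peak = 32767 * 10 ** (level_dbfs / 20)
    n = int(round(seconds * rate))
    frames = bytearray()
    for i in range(n):
        sample = int(round(peak * math.sin(2 * math.pi * freq_hz * i / rate)))
        frames += struct.pack("<h", sample)  # little-endian signed 16-bit
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(frames))
    return buf.getvalue()


if __name__ == "__main__":
    data = sine_wav_bytes()
    with open("level_match_tone.wav", "wb") as f:  # hypothetical file name
        f.write(data)
    print(f"wrote {len(data)} bytes")
```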
 