Resurrecting this thread, as I just stumbled across a scientific article which seems to me to be pertinent to the discussion here (and to the "Can you trust your ears" and "blind test design" threads as well). It's an article about different kinds of auditory/echoic memory:
"From Sensory to Long-Term Memory: Evidence from Auditory Memory Reactivation Studies"
https://pdfs.semanticscholar.org/2096/25309d5e01183db129fa3a7151945cdf3e8f.pdf
The present "rational consensus" seems to be that
a) the echoic memory is very short, and A/B or ABX-comparisons therefore have to use short excerpts and short time intervals
b) test tones/signals are better suited for finding differences than musical excerpts, since the brain doesn't get fooled by getting drawn into the music
While this is probably valid for many cases, this article nevertheless seems interesting. The claim in the article is that there also exists a long-term auditory memory. They base this on a very intuitive and self-evident fact: that we are able to recognize things like the specific voices of people we know, even after a long time. The article is rather technical and a heavy read, but my take-away is that it is easier to remember tones and sounds for a longer time if we can put them into categories/systems/regularities. When we can put sounds in a specific context, we remember them more easily.
What's the relevance for blind testing? I think that "analytical" blind testing might take away some of our ability to put sounds within a larger context, and thus make it more difficult to identify differences. I assume that this can be overcome by training on the specific task at hand, but I still think that it can easily mask differences for untrained listeners. It also seems to me that ABX tests are way too difficult for the brain to handle, given the limitations of our auditory memory, and that AB tests would be better.
I also wonder about what kind of differences we might expect to find in AB-comparisons. If our short term acoustic memory is so short, it might be able to spot what I would call "static" differences - that would be differences in frequency response, for example. But what about differences in dynamics? Or transients? Or the time domain in general? Everything which has to do with changes in the music over time seems to me to be rather difficult to capture in short-term listening. Might this go some way towards explaining why Toole and others found that frequency response trumped all other differences in their tests?