Frantz posted this on the WBF forum and I thought it also belongs here. It is the most fundamental concept in audio evaluation, and we need to get on the same page about it. The text below is my response to his post.
---
While this is responsible for some of the faulty observations, just as big a factor is the elasticity of our perception of audio. When evaluating new additions to our systems, we become far more attentive. We are dying to know whether the new addition made a difference. We pay far more attention to what is played and, as a result, hear detail, nuances, etc. that we did not when we were just enjoying music. What happens then is what you say: we attach those improvements to the new device, and bias makes sure that when we go back to the "before" configuration, we don't hear those improvements.
This is a very difficult thought exercise, but when faced with this situation, try to see whether you can hear the same differences in the old configuration. Having done that, all of those improvements appear in the old configuration too! And by the same token, you can take them away from the new config/tweak.
This is why blind tests work better. There, you apply the same analytical listening to both samples, not just one. And with the identities hidden, whatever you think is different cannot be attached to the new config/tweak.
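To make the "truth card" idea concrete, here is a minimal sketch in Python of the bookkeeping behind an ABX test. This is a hypothetical illustration, not any particular ABX tool: the function names and the 16-trial default are my assumptions, and actual playback of the hidden sample X is assumed to happen outside the script (your player or an ABX plugin). The point is that identity is randomized and hidden, and only at the end do you see your score against the truth, along with the odds of doing that well by guessing alone.

```python
import random
from math import comb

def guessing_p_value(correct, trials):
    """One-sided binomial test: probability of getting at least
    `correct` answers right out of `trials` by coin-flip guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def run_abx(trials=16):
    """Bookkeeping for an ABX session. 'A' and 'B' are the two
    configurations (e.g. before/after the tweak). Each trial, X is
    secretly assigned; playback of A, B, and X is assumed to be
    handled outside this script."""
    correct = 0
    for i in range(1, trials + 1):
        x = random.choice("AB")          # the hidden truth card for this trial
        guess = input(f"Trial {i}: X is A or B? ").strip().upper()
        correct += (guess == x)          # scored silently; no feedback mid-test
    p = guessing_p_value(correct, trials)
    print(f"Score: {correct}/{trials}. Chance of guessing this well: {p:.3f}")

if __name__ == "__main__":
    run_abx()
```

If those odds come in below a conventional threshold like 0.05, you have some basis for claiming an audible difference; otherwise, all that "extra detail" was likely the attentive-listening effect described above.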
The above explains why our evaluation of audio products can be faulty even when "we didn't expect to hear a difference," or "I expected it to sound worse." In both cases, we still listen more attentively and as a result hear more detail, whether we expected it or not. This happens to me all the time, even though I am hyper-aware of listener bias. The above thought exercise and blind testing are the only ways I can pull myself out of false conclusions.
I can't tell you how many times I have run a blind test and read more detail, higher resolution, or a lower noise floor into one sample, only to have all of those observations turn out to be false. And then been able to hear all of that in the other sample and not the first!
Without the checks and balances that blind testing provides, i.e. holding the truth card, we can lead ourselves to completely wrong conclusions about the products we are evaluating and about our own ability to evaluate them. And once lost in the forest, anything goes from there on.
BTW, I hope you don't mind me stealing your post for the ASR Forum. I won't be posting more in this thread but would like to continue the discussion there.