OK, let's dig into the study and clear up some confusion. I was going to tell this story as Sean told it to us in person, but I see that it is right there in his blog post:
Focusing on the first sentence: there was essentially a revolt against the new people coming in and saying that controlled tests are necessary for proper evaluation of speakers. The "old guard" immediately took this as a challenge to their expertise, stating that they already knew what the speakers sounded like and had no use for academics telling them otherwise. So Sean and crew set out to prove them wrong.
Now the key part is the last section I have highlighted. These "trained audio professionals" were not trained/critical listeners in the sense I am talking about. They were existing sales, marketing, and engineering staff who had been in the audio business for a long time, at Harman and elsewhere. Heaven knows we all agree that such jobs do not automatically qualify someone as a trained listener.
A trained listener in my book gets paid solely to evaluate audio fidelity. He is properly taught what to look for and makes a living performing such tests. No one at Harman would have had such experience prior to Sean's arrival, and certainly no one holding a position in sales, marketing, etc.
Here is the next bit:
All testers -- experienced or otherwise -- were Harman employees. No one like me, with no stake in the company or its products, was in the study. Remember, the goal was to prove the Harman employees wrong.
Here is the punchline, and why you don't want to extrapolate from this study to what I am doing at ASR:
Of course Harman employees would have strong views regarding their own products in sighted tests, views that would disappear in blind tests. You all keep saying "we are talking about biases all humans have." Well, no. None of us has an employee relationship with the products being tested, myself included. None of us has pride of brand or engineering in the products being tested either. So what impacted these employees most in sighted tests would not be in play in ours.
That is NOT to say that was the only bias. Seeing an expensive speaker like T next to a cheap one, even one from Harman, does bias people. Am I affected by that? Maybe sometimes. But I am on record recommending cheap $99 speakers as well as ugly DIY speakers from people I am not necessarily fond of. In other words, I have demonstrated isolation from such biases to a high degree.
And of course there was silliness like this:
This is a very strong bias: people take positions in a company on what sound they want, and then go on to create such products. Separating themselves from their own children becomes very hard. That is why my compression team, despite having their own trained listeners/testers, would come to me for decisions on listening tests. I was not biased like the team that created the technology. As a result, I would routinely be harder on the technology and find flaws that needed to be fixed.
Likewise, that engineer needed to be proven wrong and let go.
Discussion and Conclusions
This study was not aimed at determining usefulness of professionally and formally trained listeners in sighted evaluations.
It was designed to solve an internal problem where people did not want to believe in controlled testing, period. What the researchers showed is that being in the audio industry doesn't make you good at evaluating speakers in sighted testing. This "industry experience" was described as the person having "training," which is why people here keep citing this study even though it doesn't apply.
There are also other differences. I come fully armed into most listening sessions with objective measurements of the speaker, which give me a strong direction as to what it may do and what faults it has. I am trained to listen critically, past the beauty of the music, across a wide range of domains and audio impairments. Someone like me was not tested in the study.
Note that, as I have corrected people countless times, I am not infallible. I may be biased, but the biggest source of error is that I am not sitting there comparing four speakers at once that get the same score. I do go back and forth manually at times, but that is cursory. So my reference for what is good is somewhat variable. A person in the Harman tests is in a better situation than me, with four speakers to compare against each other.
So some amount of error exists in my subjective assessments. But with 75% or better correlation between those assessments and good measurements, whatever error there is must be small. It has to be, or our measurements are quite wrong!
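For anyone unfamiliar with what a correlation figure like that means in practice, here is a minimal sketch. The numbers below are entirely made up for illustration (they are not data from my reviews or from the Harman study); the point is only to show how agreement between subjective ratings and measurement-predicted scores is expressed as a Pearson correlation coefficient:

```python
# Sketch: Pearson correlation between subjective listening ratings and
# scores predicted from measurements. All data here is hypothetical.
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up scores for six speakers: listening rating vs. measurement-based prediction
subjective = [8, 6, 9, 4, 7, 5]
predicted  = [7.5, 6.2, 8.8, 4.5, 7.1, 5.3]

r = pearson(subjective, predicted)
print(f"correlation: {r:.2f}")
```

A coefficient of 1.0 would mean the ratings and the measurements rank the speakers identically; 0.75 or better means the measurements predict the listening impressions most of the time, with only modest disagreement.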