I have several first-generation CD players in my collection, and I would be prepared to bet nobody here could pick them from any flavour-of-the-month DAC they care to name in a genuine, blind, real-time, level-matched comparison. It's a fantastic reality check to pull out the world's first CD player (the Sony CDP-101) and A-B it with some ASR-approved DAC and hear absolutely zero difference, 41 years on.
If that mouthful actually means "Audio ABX per Clark/Krueger et al. (AABX from here on)", I would bet money that it is so.
Yet back in the 90s in olde Blighty (England), the HaiFai Rags used to run group tests of the latest and greatest, mainly Japanese, CD players, integrated amplifiers and so on, every month, month after month. Usually at least two SeeDee player group tests at different price points, plus another two sets each for amplifiers and speakers.
I was part of the listening panel a fair few times while doing my 2nd degree in London. Listeners were paid, not well, but paid. Hell, I even made money queuing for ticket sharks. And yes, there were consistent preferences for specific products, both within individual listeners and across the group of listeners (usually ~10). And yes, the tests were completely blind, level matched etc.
My own later tests used a similar format, because those magazine tests had reliably shown that stable preferences exist. I also found it interesting that some technical "outliers" polarised preferences strongly, one example being the Pioneer Legato Link DAC-based CD players. They were like Marmite: many listeners preferred them, while to some they were absolute anathema.
The main issue I have with AABX (other than that it is cargo-cult science [pseudoscience], is designed to reliably return "null hypothesis not rejected" in the presence of anything audible short of "night and day", and is closely related to the fraudulent confidence trick called the "shell game") is that, whatever the outcome, there is nothing actionable in it for anyone.
If we find that no difference was confirmed (with a 99% likelihood of that conclusion being in error, specifically a type 2/B statistical error), what is the next step?
If we find that a difference was confirmed (with a 99% likelihood of that conclusion being correct, avoiding a type 1/A statistical error), what is the next step?
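The type 2 risk of short ABX runs is easy to put in numbers with nothing more than standard binomial arithmetic. A minimal sketch (my own illustration, not derived from any particular published test): assume a listener who genuinely hears the difference 70% of the time, scored against the usual p < 0.05 guessing criterion, and see how often such a run still comes back "null".

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def abx_power(n_trials, p_hear, alpha=0.05):
    """Probability that a listener with true hit rate p_hear
    reaches significance in an n_trials ABX run."""
    # Smallest score that is significant at level alpha under
    # the null hypothesis of pure guessing (p = 0.5).
    k_crit = next(k for k in range(n_trials + 1)
                  if binom_sf(k, n_trials, 0.5) <= alpha)
    return binom_sf(k_crit, n_trials, p_hear)

for n in (10, 16, 25, 50):
    print(f"{n:2d} trials: power = {abx_power(n, 0.7):.2f}")
```

For a 10-trial run the power comes out well under 20%, i.e. a listener who hears the difference 7 times out of 10 will still "fail" the test the great majority of the time; only around 50 trials does the test become reasonably likely to catch them.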
I am, in part, an engineer. As such, I want something that can be actioned.
An example of something that can be actioned would be: "Out of approx. 20 different Chinese dongle DACs using a mix of DAC chips (including Apple, Cirrus Logic, ESS and others), auditioned sighted and scored for preference by around 30 listeners, a large majority of listeners preferred one DAC chip regardless of the brand and looks of the dongles, while another created a mostly negative reaction among listeners." (I had a chance to run this one; it informed the choice of DAC chip for commercial products.)
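The aggregation behind such a test is simple: pool each listener's scores by the DAC chip inside the dongle, ignoring brand and looks. A minimal sketch, where the chip names echo the ones mentioned above but every brand name and score is an invented placeholder, not data from the actual test:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical results: (dongle_brand, dac_chip, preference_score 1-10).
# Brands and scores are placeholders for illustration only.
results = [
    ("BrandA", "ESS", 8), ("BrandB", "ESS", 7),
    ("BrandC", "Cirrus", 5), ("BrandD", "Cirrus", 4),
    ("BrandE", "Apple", 6),
]

# Pool scores by chip, discarding brand/cosmetics.
by_chip = defaultdict(list)
for brand, chip, score in results:
    by_chip[chip].append(score)

# Rank chips by mean preference across all dongles that use them.
for chip, scores in sorted(by_chip.items(),
                           key=lambda kv: mean(kv[1]), reverse=True):
    print(f"{chip}: mean preference {mean(scores):.1f} "
          f"over {len(scores)} dongle(s)")
```

The point of the design is that the chip, not the dongle, is the unit of comparison, which is what makes the result actionable for a product engineer choosing a part.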
Another might be:
"Such and such distortion is reliably inaudible and after reaching this threshold concentrate on other parameters, including efficiency, price, profit margins etc.". Similar things are interesting also for other objective parameters. And it would be REALLY useful for consumers if there are clearly established "red lines" where, as long as a product being considered for purchase stays on the other side of the red lines it can be considered "blameless".
In fact, that is the normal regulatory approach to food additives, pollutants and the like: thresholds based on scientific evidence. If we are doing audio scientifically, the least one could sensibly expect is to have such "red lines" grounded in reliable science.
So I am not interested in AABX, not just because it is likely one of the worst possible tests for confirming or rejecting subtle audible differences, but because it has no informational value for what interests me in audio.
Thor