
Interesting blind test comparisons of CD players

I found this video interesting.


Why this is interesting to me:
* Blind testing
* Multiple long sessions
* Level matched
* Consistent room, rest of the gear, and a single CD which was used for all sessions
* Young listeners (hearing not degraded much)
* Listeners trained in music or acoustics, or musicians themselves, not audiophiles chasing crystal-aligned cables
* I couldn't detect any commercial motive on the organiser's part to promote a specific model of CD player.

They heard differences, took detailed notes, and actually couldn't differentiate between two CD players but could differentiate among the others. Both of these are very interesting to me: what they could differentiate and what they couldn't.

I'm posting in the DAC section here because I, in my ignorance, believe that the differences between CD players are in the DACs and analog electronics, not in the transports.

I'd LOVE to see a similar well-organised comparison test for just CD transports, all played back through a common DAC. This test also includes a pure-digital comparison in the second part of the video, where all the players are used purely as transports, and there all the differences seem to disappear.
 
I found this video interesting.
(...)
They heard differences, took detailed notes, and actually couldn't differentiate between two CD players but could differentiate among the others. Both of these are very interesting to me: what they could differentiate and what they couldn't.
Well, I have a few questions/comments...

How did he actually level the outputs?
Manually, with the analog preamp volume control?
That may introduce differences not related to the DUT.

Did he measure them at all (apart from the levels)?
Old CD players may have some kind of modified frequency response. And when he says one is warmer: is that visible in a basic measurement?
What noise shaping is at work in each?
...
If there are obvious differences showing up in measurements, then maybe they really do sound somewhat different.

Also, if he kept the preferred one every time, that's not really random. It may lead to a biased conclusion, IMO.

So, I'd say:
Good (and, at first glance, fair) try.
But should do better...
 
He explains in the video that he levels the signals by using his preamp, which has a numeric readout of gain. He first chose one CD player and set the preamp gain to a good listening level, which corresponded to a reading of 70 on his preamp. He then switched in each of the other CD players with a 1 kHz tone, measured the voltage at his speaker terminals to match the first player's, and noted the preamp gain setting for each.
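If anyone wants to reproduce that matching step, the arithmetic is just a voltage ratio converted to decibels. A minimal sketch in Python, with made-up voltage readings (the numbers are mine, not from the video):

```python
from math import log10

# Hypothetical speaker-terminal readings with a 1 kHz test tone
# (the voltages are invented for illustration, not taken from the video):
v_ref    = 2.83  # volts, reference player at preamp setting 70
v_player = 2.95  # volts, next player at the same preamp setting

# Gain change needed to bring the second player onto the reference level:
offset_db = 20 * log10(v_ref / v_player)
print(f"trim the preamp by {offset_db:+.2f} dB")  # ~ -0.36 dB
```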

This is not a double-blind listening test; it's only a single-blind one. So one or two other biases may have come through.

The rest of your comments are possibly right, possibly wrong, but they are speculation. And while really old, 30-year-old CD players may have measurable frequency-response differences in the 20 Hz-20 kHz band, enough of the players in his test were not that old.
 
How many trials per person?
 
I think his preamp has adjustable level controls for each input. So he can set a reference level and then adjust each input accordingly.

I do wonder, though: are those full-range speakers capable of good resolution at high frequencies?

And finally, "all CD players sounded the same when fed into a DAC" - what a surprise!!!
 
How many trials per person?
It was up to the test subjects to decide whether they heard a difference or not; that wasn't actually verified in any way.

I suspect in a proper ABX there would be few if any conclusive scores. At best, differences can only be very marginal.
 
How do you "verify" if the test subjects heard a difference or not? The statistical correlation of the reported differences in a blind test is itself supposed to be verification. What else are you expecting to find? The organiser may not have explained all the details very clearly, but he did share some very clear consistencies along the results from the different listeners.
 
How do you "verify" if the test subjects heard a difference or not? The statistical correlation of the reported differences in a blind test is itself supposed to be verification. What else are you expecting to find? The organiser may not have explained all the details very clearly, but he did share some very clear consistencies along the results from the different listeners.
In an ABX test the subject must identify whether X is A or B. So we know whether he really can distinguish A from B, or not.

In this test he gets to decide for himself if he can distinguish or not. It's a critical flaw.
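To make the distinction concrete, here is a minimal sketch of the one thing an ABX loop adds over a plain A/B comparison: a hidden assignment with a verifiable right answer. The `listener` callback is a stand-in for a human response, not anything from the video:

```python
import random

def abx_session(trials: int, listener) -> int:
    """Run ABX trials: X is secretly A or B each time, so every answer
    can be scored right or wrong, unlike an open-ended A/B comparison."""
    correct = 0
    for _ in range(trials):
        x = random.choice("AB")   # hidden assignment of X
        if listener() == x:       # listener answers "A" or "B"
            correct += 1
    return correct

# A listener who can't actually hear a difference is reduced to guessing:
guesser = lambda: random.choice("AB")
print(abx_session(10, guesser), "of 10 correct")  # ~5 on average
```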
 
he levels the signals by using his preamp, which has a numeric readout of gain.
That's what I said.
The analog volume control may introduce some unwanted differences, like a change of balance between channels.

This is not a double-blind listening test; it's only a single-blind one. So one or two other biases may have come through.
Indeed.

The rest of your comments are possibly right, possibly wrong, but they are speculation.
It's no speculation.
It's just saying that even if he can prove some of those devices are systematically selected as better, we still don't have a clue WHY.

If I had done such a test, I would for sure also have measured each device, to detect any actual (measurable) difference.
Not doing that is what actually opens the door to speculation!

I would for sure also have recorded the outputs, to allow another (real ABX) test using the recordings for further confirmation.
 
How do you "verify" if the test subjects heard a difference or not? The statistical correlation of the reported differences in a blind test is itself supposed to be verification. What else are you expecting to find? The organiser may not have explained all the details very clearly, but he did share some very clear consistencies along the results from the different listeners.


I would normally expect a statistically significant number of trials for each participant - minimum 10 per comparison - with 8 correct in each comparison. Otherwise an individual's evaluation can't be distinguished from guesswork.
 
minimum 10 per comparison - with 8 correct in each comparison.
The problem is that there is no "correct" answer here, like there is in an ABX test.

The only thing you could assess is whether the listeners systematically prefer the same device.

Which is far more unlikely than just finding a difference: not only must you identify one, but you must also always prefer the same device.
 
There could be differences with a "bad" CD player. We have to be careful when we say ALL sound the same. And the preference could be for one with an imperfection like boosted bass or boosted treble, or even distortion!*

we still don't have a clue WHY.
Right. Even if the person can't clearly explain what they're hearing, measurements should be able to define it. With electronics, we can "measure better" than we can hear. (With speakers and acoustics it's trickier... You can't measure soundstage, which is obviously an illusion, with the sound really coming from a pair of speakers.)

Bad methodology and no statistics. And there's no "X". If I think #1 sounds "warmer" I'll probably be consistent in my perception or preference. Or if listeners notice details in #2 that they missed on the 1st listening (common), that perception of #2 being "better" will probably stick with the listener.

Note that an ABX test is NOT a preference test. It only confirms that you can (or can't) statistically-reliably hear a difference. It can be the 1st step before doing a preference test or measuring to discover what the difference is. With speakers or headphones there's no need for ABX because we expect that there will be a difference and you'll probably always be able to identify X. And with headphones they will feel different and have different weight, etc.

He tried and he did a lot of things right, and he put in a lot of time & effort. It was level-matched and blind, and I like the fact that he had a listening panel so it's not just one person's ears, and they did the tests individually so they couldn't accidentally influence each other.

It's just saying that even if he can prove
You can't really "prove" that you're hearing a difference. Even in a proper controlled ABX test you only get a probability that you aren't "guessing". If you do one trial there is a 50/50 chance of guessing X correctly. With 8 out of 10 correct, there's only about a 5% chance of doing that well by guessing, and that's considered a "statistically significant" result, but it's not absolute proof. That also means that if there are 20 participants, somebody will likely guess at least 8 correctly. Or if one listener repeats the test 20 times, they'll probably get 8 of 10 (or more) correct in one of them. 10 of 10 correct has only a 0.1% probability of being luck, so that's "nearly proof".
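Those figures are easy to check with a few lines of Python, assuming independent trials and a 50% chance of guessing each one right:

```python
from math import comb

def p_guessing(k: int, n: int) -> float:
    """Probability of k or more correct out of n trials by pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(f"P(>=8 of 10) = {p_guessing(8, 10):.4f}")   # 0.0547 -> the ~5% figure
print(f"P(10 of 10)  = {p_guessing(10, 10):.4f}")  # 0.0010 -> the ~0.1% figure

# Odds that at least one of 20 guessing participants scores 8/10 or better:
print(f"P(any of 20) = {1 - (1 - p_guessing(8, 10))**20:.2f}")  # ~0.67
```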

Amir has a good video about Controlled Audio Blind Listening Tests and HydrogenAudio has a good write-up about Blind ABX Tests.



* Dan Clark (headphone manufacturer) has a video about measuring headphones and he mentions that more distortion is often described as "more detailed".
 
Can someone just list the CD players under test and the trends that were observed?

Extraordinary claims require extraordinary evidence, but the claims I’m seeing summarized in posts above don’t seem that extraordinary. But I do expect boutique brands to tinker with spectral output in the analog section, either on purpose or by accident, so evidence of that wouldn’t offend my expectations.

My Naim CD5 puts out 0.1 V more signal than the 2-volt standard, a fact which JA noted in his measurements 20 years ago. He warned users that this might bias comparisons. I doubt any level display in a consumer preamp would make it easy to see that 0.4 dB difference, but it would certainly affect perceived sound. That's a key flaw that will leave a tell in an ABX and will make the louder unit seem a bit more dynamic.

That the test is an AB rather than ABX test reduces the reliability but does not invalidate the results.

That they were all the same when using an external DAC proves that whatever differences there may be reside in the DAC and analog sections. That alone kills one of the usual sacred cows, and it’s also not that wild a claim, depending on the players being tested. That’s why listing them would be nice.

Rick “can’t watch videos where I am” Denney
 
I doubt any level display in a consumer preamp would make it easy to see that 0.4 dB difference,
The preamp he's using seems to have a "laser trimmed" stepped-resistor attenuator, with 0.5 dB steps (probably with a tolerance).

So, indeed, you may very well be left with a residual level difference of up to a few tenths of a dB that you can't compensate for with this method.

That may induce a perceived difference.

From the M8s pre manual:
The red LED display shows the current dB step of the laser-trimmed electronic attenuator. The display increments by accurate 0.5dB steps.
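As a worked example tying this back to the 0.4 dB Naim figure above (a sketch assuming the attenuator steps are exact):

```python
from math import log10

# The CD5's 2.1 V output against the 2.0 V standard:
mismatch_db = 20 * log10(2.1 / 2.0)        # ~0.42 dB
# Closest correction available in 0.5 dB attenuator steps:
trim_db = round(mismatch_db / 0.5) * 0.5   # 0.5 dB
print(f"residual after trim: {abs(mismatch_db - trim_db):.2f} dB")  # ~0.08 dB
# Worst case with 0.5 dB steps is half a step, i.e. 0.25 dB.
```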
 
From my experience (I'm 68): do not trust the sound quality judgements of people above 50. The ears no longer hear above 14 kHz, and the high-frequency character, which is the most important difference in digital audio, gets neglected.
The first CD player I could listen to at will was the Akai CD1, the one where the CD is loaded vertically, with a TDA1540 DAC. Its high frequencies sounded very harsh, like breaking glass, but I was in my mid-twenties. I listened to the same player two decades later and it sounded very nice.
 
From my experience (I'm 68): do not trust the sound quality judgements of people above 50. The ears no longer hear above 14 kHz, and the high-frequency character, which is the most important difference in digital audio, gets neglected.
The first CD player I could listen to at will was the Akai CD1, the one where the CD is loaded vertically, with a TDA1540 DAC. Its high frequencies sounded very harsh, like breaking glass, but I was in my mid-twenties. I listened to the same player two decades later and it sounded very nice.

Could be your hearing changed - far more likely your biases changed. I'd also be prepared to bet that two decades on you weren't listening with the same speakers, in the same room with the same furnishings, or that some other significant factor had changed.

Distrust people's sound quality judgements regardless of their age, including your own. Bias is always a confounding factor if not controlled for, and is far more significant than loss of high frequency sensitivity.
 
My friend visited a few weeks ago. He is half-interested in hifi, but he thought it would be fun to blind test two different players that I have:

Blu-ray Denon DBP-2012UD and Blu-ray Sony BDP-S570.

Two CDs from the same master (I was careful to check it) were used.
In any case, he tested twenty-five times. He failed to pinpoint the right player, so nothing close to p < 0.05. He also failed to determine whether or not the same player was used when I was alternating A/B.
I was then going to take the test myself, but we lost interest and listened to music and had a beer instead. :) I would most likely also have failed to pinpoint correctly.

We didn't need to level match, because he failed to pinpoint the right player anyway; both players have 2 Vrms output.

Two inputs on my HK amp, CD and AUX, were used when we blind tested.

Could he/we have succeeded if he/we:
Had trained extensively in detecting differences.
Had top notch high end hifi (with the same two Blu-Ray players).
Had young ears that could hear high frequencies.
Had used different music that would have made it easier to detect differences.
What do I know. Maybe so, but I don't think so.
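For what it's worth, under the usual independent-guessing assumption, a few lines of Python show what score out of 25 forced-choice trials would have been needed to reach p < 0.05:

```python
from math import comb

# Tail probability of k or more correct out of n trials by guessing:
tail = lambda k, n: sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# Smallest score out of 25 forced-choice trials that beats p < 0.05:
needed = next(k for k in range(13, 26) if tail(k, 25) < 0.05)
print(f"{needed} of 25 needed (p = {tail(needed, 25):.4f})")  # 18, p ~ 0.0216
```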
 
The video author meant well, but this is a flawed test:
  • The test is only single blind: the author knows which player is active and could impart his bias to the listeners
  • The predetermined answers will "nudge" listeners to answer in a certain way, further increasing the influence of bias
  • It is an A/B test, not A/B/X
  • The volume matching appears to be to within 0.5 dB, which is not good enough (0.2 dB would be OK, 0.1 dB would be ideal)
  • The test CDs were "duplicated". I assume that means the author burned either CD-Rs or CD-RWs, which may result in playback errors including interpolation in older CD players. The burned CDs might also not be bit perfect to begin with, causing playback differences even in modern players. This should have been checked before using them
  • The players are not time-aligned very well: by manually starting some of them, there will be a couple of seconds of playback offset between them during songs, which makes listening comparisons difficult to useless
Many of these flaws are very hard to correct, which is why reliable, valid listening tests are so difficult. But they could simply capture the output of each player using a high-end audio interface and compare the results that way. This would be orders of magnitude more reliable, and would also be simpler, cheaper, and quicker than repeating subjective tests with multiple listeners... :rolleyes:
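A minimal sketch of that capture-and-compare (null test) idea, assuming two captures that share a sample rate and have already been trimmed to the same start sample; the file names and the no-clock-drift assumption are placeholders, not anything from the video:

```python
import numpy as np
import soundfile as sf  # pip install soundfile

# Placeholder file names: captures of the same track from two players,
# recorded through the same interface.
a, fs = sf.read("player_a.wav")
b, _  = sf.read("player_b.wav")

n = min(len(a), len(b))   # trim to a common length
a, b = a[:n], b[:n]

rms = lambda x: np.sqrt(np.mean(np.square(x)))
residual_db = 20 * np.log10(rms(a - b) / rms(a) + 1e-12)
print(f"null residual: {residual_db:.1f} dB relative to the signal")
# A residual far below the 16-bit noise floor (around -96 dB) would strongly
# suggest there is nothing left to hear; real captures would also need level
# matching and clock-drift correction before the subtraction is meaningful.
```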
 