Is this substantiated by their relative measurements?
Of course not. The measurements show the DACs are audibly identical. They sound the same.
Keith
5,4,3,,,
Guess I didn't make the point clear enough.
A: In an ABX test, the attendee got 43 correct out of 60. How likely is it that there is really a difference between A and B?
ChatGPT:
In an ABX test, the goal is to determine if the attendee can reliably distinguish between two stimuli, A and B. To assess how likely it is that the attendee's correct guesses are due to a real difference between A and B (rather than just chance), we can perform a binomial test or calculate the p-value for the result.
In this case:
- The number of trials n=60
- The number of correct answers k=43
- Under the null hypothesis (assuming no difference between A and B), the probability of a correct answer in each trial is p=0.5, since it would be just a random guess.
The p-value will help us understand how likely it is that a person could have gotten 43 or more correct answers purely by guessing.
Let me calculate this for you.
The p-value is approximately 0.00053. This indicates that the probability of achieving 43 or more correct answers purely by chance (under the assumption that there's no difference between A and B) is very low—about 0.05%.
Since this p-value is well below the common significance threshold of 0.05, it suggests that it is very likely there is a real difference between A and B in the test, and the attendee is not simply guessing.
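For anyone who wants to check that figure themselves, here is a minimal sketch of the one-sided exact binomial calculation described above, assuming each of the 60 trials is an independent 50/50 guess under the null hypothesis. The function name `abx_p_value` is made up for illustration, and the result only addresses guessing; it says nothing about confounders such as a level mismatch.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided exact binomial p-value for an ABX run.

    Returns P(X >= correct) where X ~ Binomial(trials, 0.5),
    i.e. the chance of doing at least this well by pure guessing.
    """
    # Count the outcomes with `correct` or more hits among all 2**trials
    # equally likely guess sequences.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


p = abx_p_value(43, 60)
print(f"p = {p:.5f}")
```

Run with 43 correct out of 60, this should land on roughly the 0.00053 quoted above.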
So it's not that 72% isn't representative enough. You need to deny the whole experiment.
They're talking about statistics in general, not about any particular test.
It is true that technical measurements (distortion, noise, and frequency response) indicate minimal or insignificant differences, so the Topping and the WiiM Ultra should perform very similarly and sound audibly identical.
However, according to this post, the probability that the listeners in the test would have obtained at least 43 correct answers out of 60 by pure chance was 0.053%.
Undeniably, this suggests that it is more likely than not that the participants genuinely noticed a difference in sound quality, rather than the results being due to random guessing.
That is true only in the absence of confounders.
Or that the controls were insufficient.
"Better bass" and "slightly clearer" are symptomatic of one level being slightly louder than the other.
Keith
If the existence of confounders in the test is dogmatically assumed, the possibility of any meaningful discussion is undermined.
Not dogmatically assumed. But if the controls are insufficient (and I think we have determined that to be the case), then confounders are possible.
Something to try if you ever do it again: use a millivolt meter on the units' outputs.
It is true that, according to psychoacoustic principles, volume affects sound perception. When a system is louder, bass and treble tend to stand out more, giving the impression of better quality or clarity. However, in our test, none of the participants mentioned that one system sounded louder than the other, nor did they base their responses on volume differences.
That said, clearly there is the possibility that, despite balancing volume levels with an SPL meter, small variations in volume could still have influenced the perception of sound quality.
That is the point - a small level difference (say from as low as 0.2 dB to <1 dB) is NOT perceived as a volume difference. It will be perceived only as a quality difference.
Without denying that confounding factors may have influenced the results, completely dismissing the participants' ability to detect differences could overlook subtle nuances that technical measurements don't fully capture.
Listening tests often reveal small subjective differences in sound that may not appear in purely technical measurements.
People often notice subtle tonal variations, dynamics, or soundstage that can differentiate equipment, even when objective measurements indicate near-identical performance.
If technical measurements were the only factor considered, many DACs and speakers would sound exactly the same, and that’s clearly not the general opinion.
Therefore, I believe subjective listening experiences should also be taken into account when evaluating audio equipment, alongside the technical data.
Thanks to everyone for the feedback. I have decided to redo the same test, purely for instructional and recreational purposes, with the output levels of the DACs electrically balanced, and compare the results. Any further suggestions are welcome.
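As a rough aid for the electrical level matching, here is a small sketch that turns two RMS voltage readings into a level difference in dB using the standard 20·log10 voltage ratio. It assumes both DAC outputs are measured with the same meter while playing the same steady test tone. The readings in the example are hypothetical, not measurements from this test, and the function name `level_difference_db` is made up for illustration.

```python
import math


def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of output A relative to output B in dB, from RMS voltages."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("voltages must be positive")
    # Voltage ratios convert to decibels with a factor of 20.
    return 20 * math.log10(v_a / v_b)


# Hypothetical readings in volts RMS (assumed values for illustration only).
diff = level_difference_db(2.05, 2.00)
print(f"{diff:+.2f} dB")
```

With these made-up readings, a 50 mV mismatch on a 2 V output already works out to a little over 0.2 dB, which is the order of difference mentioned earlier in the thread as enough to be heard as a quality change rather than a volume change.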