A normal distribution for a null result? SHOCKING!
"I agree - the results from the report look suspiciously like coin flips to me too."
Note that the probabilities column in the table is not in percentages, so 0.056 is actually 5.6%, i.e. 56/1000, and not 6/1000 as you suggested.
"Granting the 5% was used vs. 1%, a result like 0.056% (6/1000 chance of the result being due to random chance) is too low to be blown off."
Sure - nominal output level is a valid reason to select one device vs. another, as it can be important for appropriate and optimal gain staging. A DAC sending 1 Vrms and another sending 2 Vrms will sound different; adjusting a pot to level match is not a solution IMHO.
I don't see any results here that appear to be anything but random. The highest rate of correct answers is 65%, and at that number of trials it is not enough to be statistically significant. Wearisome issue for sure, but the same old story.
I'm not a scientist and I can't explain why ABX-style blind tests would not refute subjective claims of sound differences. It seems they ought to, but there might be something about that testing method such that, while it may prove that differences do exist, it cannot conclusively prove that they don't, nor rule out that under different listening conditions some people, at least, can hear those differences.
Some might recall the famous testing reported in the January 1987 Stereo Review that compared a series of amplifiers. The general conclusion was that people cannot reliably tell the difference between amps. However, when you look at the detailed results, you can see it documented that some people were able to tell the difference between certain pairs of amps at well above chance probability.
My own subjective experience is that things that measure well do tend to sound the same -- or at least the differences are utterly insignificant. I replaced my Schiit Gungnir Multibit DAC with a Topping DX7s DAC/headphone unit because I could hear no significant difference -- this despite the fact that the Gungnir MB doesn't measure all that well. I replaced the DX7s with a D90 because the latter has slightly better measurements, but frankly I can't really hear any difference.
OTOH, I know a member on another forum who has decades of experience listening to many different amps and DACs. I have come to respect his wholly subjective impressions far more than most. He replaced his Topping D90 with the RME ADI and claims that they sound different and that he prefers the latter -- but I'm not rushing out to do the same switch.
View attachment 190139
I see that the most significant one from post #3313 is 74/128. To get an idea, here is a spreadsheet simulation of 10 coin tosses repeated about 999 times. I had to run it about 3 times to get either an all-heads or an all-tails run (and then got 1 all-tails and 2 all-heads).
In an ABX test, each of these would equate to a person getting it right 10/10 times (say, heads) or wrong 10/10 times (tails) - even if there were no difference. So across roughly 3 x 999 = 2997 runs, 2 runs came up all heads, i.e. "guessed correctly" 10/10.
So based on this data, about 0.067% of runs (2/2997) can produce 10/10 correct guesses by blind chance alone.
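Rather than simulating, the chance of a 10/10 run can also be computed exactly. A minimal sketch in plain Python, using only the trial counts from the simulation above:

```python
from math import comb

n = 10                                # tosses per run (a 10-trial ABX)
p_all_heads = comb(n, n) * 0.5 ** n   # P(all 10 correct by guessing) = 1/1024
print(f"P(10/10 by chance) = {p_all_heads:.6f}")   # 0.000977, i.e. ~0.098%

runs = 2997                           # ~3 x 999 simulated runs above
print(f"Expected all-heads runs: {runs * p_all_heads:.2f}")   # ~2.93
```

Observing 2 all-heads runs when about 2.9 are expected is entirely consistent with pure chance.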
View attachment 190149
Why? This is a textbook normal distribution. If Gorgonzola doesn't understand that, he needs to do more studying rather than you tossing work at someone else, which will be meaningless to him.
Can you do another simulation, but with each trial being 128 tosses instead of just 10, like what you did? Repeat that for 100 rounds and share your results?
Thanks!
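For what it's worth, here is a rough sketch of that requested simulation in Python (the seed is an arbitrary choice of mine, and it assumes plain fair-coin guessing):

```python
import random

random.seed(42)          # arbitrary seed, for reproducibility only
TRIALS = 128             # tosses per round, as requested
ROUNDS = 100

# "Correct guesses" per round under pure 50/50 chance
counts = [sum(random.random() < 0.5 for _ in range(TRIALS)) for _ in range(ROUNDS)]

mean = sum(counts) / ROUNDS
print(f"min={min(counts)} max={max(counts)} mean={mean:.1f}")
```

With p = 0.5 the counts cluster around 64 with a standard deviation of about 5.7, so a 74/128 round (roughly 1.7 sd above the mean) should show up in about 5 of the 100 rounds just by chance.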
Haha, indeed... So at the end he will most likely find 5 out of 100 that do better.
Thinking about it more: the 74/128 is from 8 participants combined, and we don't know the breakdown of the results.
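To put the pooled 74/128 in perspective, here is a small sketch of the one-tailed binomial calculation. This assumes all 128 answers are pooled as independent 50/50 guesses; the unknown per-participant breakdown could tell a different story:

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k correct by guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(f"P(>= 74/128 by chance) = {binom_tail(74, 128):.3f}")
```

Pooling can hide a single listener who genuinely hears a difference among seven who guessed, which is why the breakdown matters.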
You are correct. However, a 5.6% chance of a random result in a psychoacoustic test seems very low. I don't have the statistical knowledge to aggregate all the results to demonstrate whether or not this single result falls within a normal distribution with a mean of 50/50 correct for the study as a whole. That was the assertion of the principals of the study, to be sure.
You can also easily check the calculations here (example for the 0.056 result you mention):
View attachment 190147
Humm ... well, I won't dispute with you statisticians. However, I wonder how the participants feel about their guesses?
How they feel doesn't really matter; the issue is audibility. And the test results indicate close to random guesses. Yeah, people's feelings get hurt when they find out that the $40,000 DAC performs as well as a $9 dongle. Hurt feelings all around - that's what ASR is all about.
Do you have anything more to say about this one? At the end, the OP and his friend could pass 7 out of 8 rounds, once comparisons were reduced to 2 DACs at a time instead of 4.
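As a rough sanity check, treating each round as a single 50/50 pass/fail call (which may oversimplify the actual protocol), the chance of passing at least 7 of 8 rounds by pure guessing is:

```python
from math import comb

# P(at least 7 of 8 fair 50/50 calls correct) = (C(8,7) + C(8,8)) / 2^8 = 9/256
p = (comb(8, 7) + comb(8, 8)) / 2 ** 8
print(f"{p:.4f}")   # 0.0352
```

So 7/8 would sit below the usual 5% significance threshold, assuming the rounds are independent.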
They didn't level match properly, though. They used a mic.
DAC ABX shootout - unable to distinguish between 10$ and 15k$
"Hello All, First of all, thank you for the forum contributors who are sharing their knowledge here. I started reading ASR two month ago and just can't get enough of it. I am sharing my experience with my first ABX testing. Last Friday me, together with a friend performed a double blind test on..." (www.audiosciencereview.com)
Please read through to the end. The beginning used a mic; by the end they were matching voltages with a scope.
"Silly enough I can 100% reliably say which one is better when I see what system is connected."
What you quoted is from the first post. After that, many members provided testing suggestions, and the OP improved his testing method and reran the tests a few days later.
Anyway, I didn't find the statistical nugget you cited. Do you have a post #?
Links don't work; just give me the post #s in this thread, OK?
The statistics of the second round were described in this post:
DAC ABX shootout - unable to distinguish between 10$ and 15k$
"I know its probably me, but I cannot fathom an apple dongle sounding the same as any Chord or Topping DAC. Yet... it does. The performance of that little thing is astonishingly good. Not state of the art measurement, but damn excellent measurements. When I did a comparison to an RME ADI-2 Pro..." (www.audiosciencereview.com)
Please ask more questions to the OP in that thread if you need further information.
; )
Anyway, I think the difference is likely due to the DAC+amp+headphone combinations used. I also observed differences with different combos when doing online blind timing tests. You can see my chart here:
DAC and amp combos did not give same clues when running online blind tests. Why? What would be the desired clue?
"When running online blind timing tests yesterday, I heard different clues when using multiple combination of dac and amps. I wonder why. I thought, as long as I am using transparent DAC and amp, I can mix and match them and any combo will sound the same. But not this case. Is it okay? More..." (www.audiosciencereview.com)