
The ABX test of ABX tests

JSmith

Master Contributor
Joined
Feb 8, 2021
Messages
5,152
Likes
13,207
Location
Algol Perseus
Are you referring to something I said?
Nah mate, or I would have quoted you. It was a general thought about those throwaway comments in other threads and how those same comments relate to blind ABX. You know the types: they say measurements can't pick up differences their ears can, yet when presented with a blind ABX challenge they suddenly find all the excuses in the world as to why they can't perform such a test, or why the testing process doesn't work properly. If they trust their ears so much, they should be salivating at the chance to show how great their ears are.


JSmith
 
OP
Talisman

Talisman

Addicted to Fun and Learning
Forum Donor
Joined
Mar 27, 2022
Messages
897
Likes
2,546
Location
Milano Italy
When you perform an ABX test, you evaluate several steps, right?
Take lossy files as an example.
Say you start with a 16 kbps MP3 and work up to 320 kbps, comparing each one against a FLAC file in ABX.
At first you will find obvious differences, then they become increasingly hard to find, until each listener reaches the level at which they can no longer detect a statistically significant difference (the best ears may still hear one). OK?
Now let's say my perception threshold is 256 kbps: at 256 kbps I can still detect a statistically significant difference from FLAC in ABX, but at 320 kbps I no longer can. Are you with me?
There is certainly no sharp wall between these two levels; somewhere in between, my ability to hear a difference gradually fades.
If we could divide the range between MP3 at 256 and MP3 at 320 into 10 further steps, we might find that at exactly 256 I hear a difference 13 times out of 15, maybe 11/15 halfway, ending up at 7 or 8 out of 15 at 320 kbps. OK?
So presumably there may still be (can there? can't there? I don't know, I'm not asserting anything, which is why I'd like a test like this) some LIGHT SHADES that I could potentially perceive, but so faint that I can't pick them out in a statistically significant way.
Suppose for a moment that these differences do not exist, and that any single ABX can establish absolutely and incontrovertibly what I can and cannot hear. In that case, even if I chain together several elements that each sit right at my threshold of inaudibility, the result will still be indistinguishable in ABX from a top chain. Do we agree on this?
Now suppose instead that I build an audio chain from all my thresholds of inaudibility and compare it with a top chain. If I could detect a statistically significant difference in a blind ABX, this could mean that there are subtleties that are not detectable in the single tests but that add up to an audible difference. Always with a blinded ABX as validation of the result.
I'm not talking about magic ears, or "things that can be heard but not measured", or other similar nonsense.
I hope I have explained myself.
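
For anyone who wants to put numbers on the 13/15, 11/15 and 7-or-8/15 scores above, here is a minimal Python sketch. It assumes each trial is an independent coin flip under pure guessing (a one-sided binomial test) and the usual 0.05 significance level. The function names `abx_p_value`, `min_correct` and `pass_probability` are made up for illustration, and the 60% "true" hit rate is just an assumed example of a faint difference, not a measured figure.

```python
from math import comb
from typing import Optional


def abx_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of at least `correct` hits in `trials` if each hit has probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def min_correct(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest score whose guessing probability is at or below `alpha` (None if unreachable)."""
    for k in range(trials + 1):
        if abx_p_value(k, trials) <= alpha:
            return k
    return None


def pass_probability(true_rate: float, trials: int, alpha: float = 0.05) -> float:
    """Chance that a listener with an assumed true hit rate reaches the passing score."""
    needed = min_correct(trials, alpha)
    if needed is None:
        return 0.0
    # Same binomial tail, but with the listener's assumed hit rate instead of 0.5.
    return abx_p_value(needed, trials, chance=true_rate)


# Scores mentioned in the post, all out of 15 trials.
for score in (13, 11, 8, 7):
    print(f"{score}/15: p = {abx_p_value(score, 15):.4f}")

print("minimum score to pass at 0.05:", min_correct(15))
print(f"pass rate with an assumed 60% true hit rate: {pass_probability(0.6, 15):.3f}")
```

Under these assumptions 13/15 comes out clearly significant, while 11/15 does not quite reach the 0.05 level, and a listener who is genuinely right 60% of the time would pass a 15-trial run only about one time in eleven. So a failed short run cannot, on its own, rule out a faint difference; more trials are needed for that.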
 

Bernard23

Addicted to Fun and Learning
Joined
Nov 25, 2020
Messages
527
Likes
389

Try that, it's easy to complete, and answers all of your questions from a test methodology PoV I think.

BTW, I suspect most folks are also influenced more by location than by bit rate....
 

napfkuchen

Active Member
Joined
Mar 9, 2022
Messages
294
Likes
392
Location
Germany
That's far beyond what I might be willing to take part in. Got tired of a regular ABX test very quickly. Maybe you ask this ABX guy, no ..., XBA, ... no, XZ(i)B(it) guy and stop worrying about flawed tests. :D

I therefore thought of a possible ABX test of the ABX tests, ...
YO DAWG ... sorry I just had to :facepalm::p
 

ahofer

Major Contributor
Forum Donor
Joined
Jun 3, 2019
Messages
4,947
Likes
8,694
Location
New York City
Not exactly the same, but there’s a thread here (and also some on Archimago) with multiple-generation copies of files, i.e. multiple runs through ADC-DAC-ADC.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,383
Likes
24,749
Location
Alfred, NY
Keep in mind that there are other formats. ABX is just one way.

Double-blind and level-matched, i.e., ears only. That's the sole non-negotiable part.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,521
Likes
37,049
One version of blind tests I prefer to ABX is one I used on those multi-generation samples. You are given three files; two are the same and one is different. You always know which file you are listening to. You listen to them any way you prefer. You select which of the three is different.

This is called a triangle test, and is commonly used in the food industry.

I also prefer 2AFC tests, but those are rarely applicable to audio properties.
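
As a rough sketch of how a triangle test could be scored: it assumes independent trials and a 1-in-3 chance of picking the odd file by pure guessing (versus 1-in-2 for ABX or 2AFC). The helper names `triangle_p_value` and `triangle_min_correct` are hypothetical, and the 0.05 significance level is an assumed convention.

```python
from math import comb
from typing import Optional

GUESS_RATE = 1 / 3  # three files, one of them is the odd one out


def triangle_p_value(correct: int, trials: int) -> float:
    """Probability of picking the odd file at least `correct` times by guessing."""
    return sum(
        comb(trials, k) * GUESS_RATE**k * (1 - GUESS_RATE) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def triangle_min_correct(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest number of correct picks that is significant at `alpha` (None if unreachable)."""
    for k in range(trials + 1):
        if triangle_p_value(k, trials) <= alpha:
            return k
    return None


print("minimum correct picks out of 15:", triangle_min_correct(15))
print(f"p-value for 9 of 15: {triangle_p_value(9, 15):.4f}")
```

With 15 trials this gives 9 correct as the minimum for significance at 0.05, compared with 12 of 15 for a test with a 50% guessing rate.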
 

JSmith

Master Contributor
Joined
Feb 8, 2021
Messages
5,152
Likes
13,207
Location
Algol Perseus
Rather pricey... has been mentioned in another thread a while back I think;


JSmith
 

NikJi

Member
Joined
Sep 9, 2022
Messages
81
Likes
12
Hello, I have always thought, even before I started reading ASR, that the only way to choose my audio gear correctly is to test the various components blind, without being influenced by preconceptions.
Recently, however, I have also had some doubts: do nuances really exist that we cannot grasp in an instant ABX but that could influence long-term listening? I'm not asking because I'm against science; on the contrary, I ask because I like to check every statement.

I therefore thought of a possible ABX test of the ABX tests. Let's take it for granted that if we cannot distinguish two versions of a song in ABX, we must consider them equivalent for us. It follows that if we stack up several modifications, each of them indistinguishable in ABX, the result should still be indistinguishable.

Let's start with the listening file: take a good FLAC file and compare it in ABX with gradually worse lossy files (MP3 at 320, 256, 128, 96, etc.), find the first file where we can hear the difference, and go one step up from it (if we can hear the difference with the MP3 at 256, we take the MP3 at 320 as the file for the test).
We move on to the next level, the transmission of the file to a DAC. Using a cable as the reference, we test Bluetooth with LDAC 990, then aptX HD, then aptX / AAC, then SBC. Here too we identify the level at which we can hear the difference and go one step up: if we hear a difference with aptX HD, we take LDAC as the protocol for the test.
Next, the DAC: we start with a definitely transparent DAC and then try gradually lower-performing DACs until we get to the cheapest, worst-measuring converters sold for a few euros on Amazon. Same procedure here: we use the poorest one we cannot distinguish.
Finally we come to the amp, starting with something that is certainly excellent (perhaps a Hypex module) and gradually working down through the many small amps that many of us surely have at home, ending with the worst implementation of a TDA7498 or similar.
At this point we build one listening chain that has everything at the highest level (FLAC file, cable connection, transparent DAC, top amplifier) and another chain with all the poorest elements that I cannot identify in ABX (for example MP3 at 320 kbps, LDAC, a 45 euro Nobsound DAC, a Breeze amp with TPA3116D2), and we send both to the same speakers used for the individual tests.
Then we proceed to the real test: with all these differences accumulated, are the two listening chains still indistinguishable? If yes, then we can be reasonably sure our ABX tests are worth it. But what if we could hear the difference? We would have to consider the possibility that there are nuances that we cannot grasp immediately or individually but which add up to an audible difference. At that point, what should we do?
Do you think it makes sense? Could you try?
That is the core foundation of "Design of Experiments 101". Change one parameter only at a time if you really want to get to the root cause.
 

NikJi

Member
Joined
Sep 9, 2022
Messages
81
Likes
12
No. People usually cannot tell the difference between 320k mp3 and FLAC.
It is a scientific fact that there is a difference. But you cannot prove it in a blind test.
So any blind testing is an invalid, failed, wrong method.

"being influenced by preconceptions"

Why is that a problem? If you buy a golden colored amplifier and you hear better sound quality because of that, be happy, and enjoy the music.
Do not listen to these "have a bad hearing" people here.
I got a gold colored one and was not impressed. So I got a gold plated one, and that really made a big difference. Someday I hope I can buy one made of 22K Gold.
 

Spkrdctr

Major Contributor
Joined
Apr 22, 2021
Messages
2,212
Likes
2,934
A little searching will turn up strong evidence that rapid switching gives better detection sensitivity than long-term listening with slower changeovers. And several people have already done the "compare an entire chain of electronics with front-to-back cheap to front-to-back high end" test.

General advice: Pick a hypothesis, state it clearly and unambiguously, then run an experiment to attempt to falsify it before you decide what the follow-up is. It's often hard, sometimes impossible, to get people to do this. But it's the key to doing good and useful experiments. See this for some examples.
Gee, that Stuart guy who wrote that paper sure is smart. I think everyone should be forced to read it multiple times. Written for a layman and very easy to understand. I tip my hat to him!
 

ahofer

Major Contributor
Forum Donor
Joined
Jun 3, 2019
Messages
4,947
Likes
8,694
Location
New York City
People usually cannot tell the difference between 320k mp3 and FLAC.
It is a scientific fact that there is a difference. But you cannot prove it in a blind test.
So any blind testing is an invalid, failed, wrong method.
God is love
love is blind
Stevie Wonder is blind
Therefore, Stevie Wonder is God
 