How can I perform a completely proper ABX test?

nyxnyxnyx

Addicted to Fun and Learning
Joined
May 22, 2019
Messages
506
Likes
475
I've been in this hobby for a few years, but mostly on the subjective and biased side. This year I feel like I want to learn more and be more knowledgeable and rational in this aspect.... And I realized that I've never done a proper ABX test.

Can anyone here teach me how to perform a true ABX test? Like, what devices, resources do I need to set up? If possible, please explain it in layman terms so I can understand better, as I'm not too fluent in English.
 

restorer-john

Grand Contributor
Joined
Mar 1, 2018
Messages
12,703
Likes
38,836
Location
Gold Coast, Queensland, Australia
ABX is a useless joke. (A or B or 'x')

Mostly the proponents of ABX have never compared anything, let alone 2 or 5 amplifiers all at once or more than 2 pairs of speakers at once, let alone 10 pairs.

It's useful for "is (a) different from (b)" comparisons, that's all. Nothing else.
 

eddantes

Addicted to Fun and Learning
Joined
May 15, 2020
Messages
715
Likes
1,411
Well... let's start small.

Let's say you want to compare two speakers (you will do so in mono). You will need to ensure that both speakers are equally loud. To do so you might want to use a microphone and a pink noise generator: play pink noise through each speaker, measure it with your mic, and adjust until the two speakers produce the same level. To make the adjustment, you might employ something like a MiniDSP 2x4 to make one louder as needed, but you can use the balance knob and the A/B switch too. Now you need a way to rapidly switch between the two, so hopefully you can use the aforementioned A/B switch, or you will need to procure a switching device on Amazon (or wherever). Now just set the two speakers close to each other, play music, close your eyes, and use that A/B switch to rapidly cycle.

Better still, if you have a partner, let them do it, while you keep yourself ignorant of what's playing and simply ask them to call out "a" or "b" so you can make your preference blind.

The key to comparing, for me: equal loudness and rapid switching. I mean you should be able to switch from A to B within a second or two, otherwise it's just too hard to remember.
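
As a rough sketch of the level-matching arithmetic: if the mic gives you a pink-noise reading in dB SPL for each speaker, the difference between the two readings is the trim to apply to the quieter one. The function name and the example readings below are made up for illustration; they don't come from any particular tool.

```python
def trim_for_quieter(spl_a_db, spl_b_db):
    """Return (speaker_to_boost, trim_db, linear_gain) that matches the two levels.

    spl_a_db / spl_b_db are hypothetical mic readings in dB SPL,
    taken from the same mic position with the same pink noise.
    """
    diff_db = spl_a_db - spl_b_db
    quieter = "B" if diff_db > 0 else "A"
    trim_db = abs(diff_db)
    # A level change in dB maps to an amplitude ratio of 10^(dB/20).
    linear_gain = 10 ** (trim_db / 20)
    return quieter, trim_db, linear_gain


quieter, trim_db, gain = trim_for_quieter(78.4, 76.9)
print(f"Boost speaker {quieter} by {trim_db:.1f} dB (x{gain:.3f} in amplitude)")
```

After applying the trim, measure both speakers again; the point is to confirm the match rather than trust the arithmetic.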
 

somebodyelse

Major Contributor
Joined
Dec 5, 2018
Messages
3,753
Likes
3,049
ABX has a very specific purpose - can you really tell the difference between A and B? The hydrogenaudio forum uses it to verify people can actually hear claimed problems in perceptual audio codecs, and has a short list of software options to perform testing with. The foobar2000 ABX plugin is probably the most widely used option.
https://hydrogenaud.io/index.php?topic=16295.0
https://wiki.hydrogenaudio.org/index.php?title=ABX
https://hydrogenaud.io/index.php/topic,107354.0.html
https://www.head-fi.org/threads/set...ping-tagging-transcoding.655879/#post-9268096
There are also some online options, assuming your browser doesn't do any untoward audio processing.
https://abxtests.com/

There have been a number of threads where members have posted recordings or synthetic examples for people to ABX, such as:
https://www.audiosciencereview.com/...-are-put-in-place-spoiler-probably-yes.29353/
https://www.audiosciencereview.com/...dac-loop-vs-the-original-can-you-hear-it.448/
https://www.audiosciencereview.com/...-files-recorded-for-download-disclosed.25646/
You can also use something like DISTORT to add specific distortions and check whether you can hear them.

If you want to use analog switching between hardware devices it's a whole lot more difficult, in some cases virtually impossible - think about trying to repeatably swap between different VTA settings on most turntables as an example. However, if you can show you can't hear the difference between the original source and a 1st generation loopback (out of the DAC and into the ADC), you can record the output from each of the VTA settings and then just use the files to ABX.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,499
Likes
25,313
Location
Alfred, NY
I've been in this hobby for a few years, but mostly on the subjective and biased side. This year I feel like I want to learn more and be more knowledgeable and rational in this aspect.... And I realized that I've never done a proper ABX test.

Can anyone here teach me how to perform a true ABX test? Like, what devices, resources do I need to set up? If possible, please explain it in layman terms so I can understand better, as I'm not too fluent in English.
It's a very broad question, much like asking, "How do I drive a car?"

But a couple things up front:

ABX is just one of many test formats for ears-only evaluation. There are other formats, but the most important part is double blind and level-matched.

It is VITAL to define up front exactly what it is you're trying to determine. That is almost never done in amateur tests and results in a lot of flailing about and unnecessary arguments.
 

DVDdoug

Major Contributor
Joined
May 27, 2021
Messages
3,023
Likes
3,975
HydrogenAudio is "the place" for learning about ABX.

If you are comparing files (i.e. comparing WAV to MP3 or CD-quality to high-resolution) there is software for that and you can do a blind test by yourself.

For hardware, somebody has to help you in order to make a blind test. It's supposed to be double-blind but usually the person taking care of the switching will know if X is A or B so it will usually be single-blind. Generally, your assistant should operate the switch every time or, if they are swapping connections, disconnect every time, even if no switching is actually being done, so there are no "hints".

It's a statistical test/experiment so there have to be multiple trials and you have to decide on the number of trials in advance. But for each trial you can request A, B, or X multiple times until you choose A or B and record the results, before moving on to the next trial. X has to be truly random, so your assistant should flip a coin. The coin flips can be done in advance.
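
If you'd rather not flip a real coin, here is a minimal sketch of preparing the X assignments in advance and scoring the answers afterwards. The function names, the trial count, and the placeholder answers are all just for illustration.

```python
import random


def make_x_schedule(n_trials, seed=None):
    """Decide in advance, by a simulated coin flip, whether X is A or B on each trial."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]


def score(schedule, answers):
    """Count how many of the listener's answers match the hidden schedule."""
    if len(schedule) != len(answers):
        raise ValueError("one answer is needed per trial")
    return sum(x == a for x, a in zip(schedule, answers))


schedule = make_x_schedule(16)   # the number of trials is fixed before the test starts
answers = ["A"] * 16             # placeholder answers, for illustration only
print(f"{score(schedule, answers)} correct out of {len(schedule)}")
```

Only the assistant should see the schedule. Passing a `seed` makes the list reproducible, which is handy for checking the script but is best left out for a real session.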
 

BDWoody

Chief Cat Herder
Moderator
Forum Donor
Joined
Jan 9, 2019
Messages
7,049
Likes
23,322
Location
Mid-Atlantic, USA. (Maryland)
This year I feel like I want to learn more and be more knowledgeable and rational in this aspect.... And I realized that I've never done a proper ABX test.

Neither have I.

When I wanted to put what I was reading here to a personal test comparing some DACs, I used a multimeter to match the output levels on each DAC I was comparing, sync'd each via optical connections from two CCAs, mixed up the cables so I didn't know which was connected to what input on my preamp, and went back and forth.

That was as far as I needed to go. If I had heard differences, I would have added the double blind element and asked a friend to help, but I couldn't tell them apart. At all.
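
For anyone wanting to put a number on "matched" with a multimeter, here is a small sketch. It assumes you read the AC voltage at each DAC's output while playing the same test tone; the 0.1 dB tolerance is an assumed target, not a figure from this post.

```python
import math


def level_mismatch_db(v_a, v_b):
    """Level of device A relative to device B in dB, from two AC voltage readings."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("voltages must be positive")
    return 20 * math.log10(v_a / v_b)


def is_matched(v_a, v_b, tolerance_db=0.1):
    """True when the two readings are within the assumed tolerance of each other."""
    return abs(level_mismatch_db(v_a, v_b)) <= tolerance_db


# Hypothetical readings in volts RMS.
print(f"{level_mismatch_db(2.05, 2.00):+.2f} dB")
print(is_matched(2.05, 2.00))
```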

Do you have gear in mind you want to compare?
 

Sgt. Ear Ache

Major Contributor
Joined
Jun 18, 2019
Messages
1,895
Likes
4,162
Location
Winnipeg Canada
Neither have I.

When I wanted to put what I was reading here to a personal test comparing some DACs, I used a multimeter to match the output levels on each DAC I was comparing, sync'd each via optical connections from two CCAs, mixed up the cables so I didn't know which was connected to what input on my preamp, and went back and forth.

That was as far as I needed to go. If I had heard differences, I would have added the double blind element and asked a friend to help, but I couldn't tell them apart. At all.

Do you have gear in mind you want to compare?

That's where I'm at more or less. I've done some simple A/B comparisons between different things over the past couple years. In several cases, it was a situation where I thought I'd heard "something" between the items. But in every case when I've done a direct comparison it's revealed to me pretty clearly that whatever I thought I was hearing (and had seemed "obvious" in general listening) just didn't actually exist. And these were cases where you'd think there might actually be an identifiable difference. I haven't felt the need to go any further along that path...
 

charleski

Major Contributor
Joined
Dec 15, 2019
Messages
1,098
Likes
2,240
Location
Manchester UK
'ABX' is just short-hand. These tests should really be called double-blind randomised controlled trials, which covers the critical elements.
Double-blind: Neither the test subject nor the experimenter knows which is which. So one person hooks everything up and then goes away and has no contact whatsoever with the person doing the test. Someone else (who doesn't know how everything is connected) then comes in and administers the test, if needed.
Randomised: Fairly obvious, the two items are presented repeatedly in a random order.
Controlled: This is often the hardest part. Any factors apart from sound quality that might interfere have been removed. The most important of these is volume, since it's been shown we're quite sensitive to small volume changes and will reliably prefer whatever is slightly louder. But other factors that might interfere need to be considered as well, an example of this can be found here, in which the relays doing the switching had slightly different clicks.

So the exact nature of the test depends on what you're comparing. Comparing audio files is relatively easy, and the hydrogenaudio links provide good coverage of how to do a test comparing, say, compressed vs uncompressed audio. You can do these with the computer acting as the blinding mechanism. Comparing electronics is harder, needs a good switch box, and needs other people to participate to make sure it's blind. Comparing loudspeakers blind is very hard if you really want to do it properly (you can't just sit them down next to each other, as the inactive speaker will interfere with the soundfield) and ends up needing commercial-level resources like the test facility at Harman.

Then you need to consider what test you're actually going to do. This is where 'ABX' comes in, as most tests are basic forced-choice discrimination: "Do you think X is A or B?" This gets directly to the heart of the matter, but it would be perfectly valid to consider other test questions.

As you might have noticed by now, inviting some friends over for drinks and asking if they can tell the difference while you swap cables between your new shiny amp and the old one that's fallen out of favour is NOT an ABX test. Though that doesn't stop some people pretending that it is.

It might be worth reading this AudioXpress article as well, that covers amplifiers.
 

nyxnyxnyx

Addicted to Fun and Learning
Joined
May 22, 2019
Messages
506
Likes
475
It's a very broad question, much like asking, "How do I drive a car?"

But a couple things up front:

ABX is just one of many test formats for ears-only evaluation. There are other formats, but the most important part is double blind and level-matched.

It is VITAL to define up front exactly what it is you're trying to determine. That is almost never done in amateur tests and results in a lot of flailing about and unnecessary arguments.
Can you explain to me like I'm 5 about other methods? And I'm curious about non-ears-only evaluation methods as well.
If I understand it right, level-matched means that I should test my devices at the exact same dB, right? Double-blind is when I, as the listener, don't know which device I'm listening to, and the host of the test doesn't know either (leave it to a third party to run the test, I presume)?

I'm trying to determine whether a lot of stuff in this audio hobby can be distinguished reliably or not. That includes any kind of accessories and items that are often associated with snake oil and vagueness (le "audiophile" mumbo jumbos). Based on the great information I've learnt from ASR and other objective-based communities I'm already a believer that you guys make more sense, but I want to experience that for myself as well as to make an attempt to let other local audiophile friends of mine think about it (I'm not trying to convince them, just wanna let them experience both sides).

So based on my goal, do you have any suggestions of a method that will work well for me? For example, I want to host an offline event where we'd do multiple DACs testing, cables at the same lengths, etc... I'm thinking of buying a high-quality switcher and educating myself to know how to do volume-matched testing and have a better understanding of technical knowledge and software applications for the test.
 

JSmith

Master Contributor
Joined
Feb 8, 2021
Messages
5,215
Likes
13,445
Location
Algol Perseus
Can anyone here teach me how to perform a true ABX test?
My suggestion is to watch @amirm's video on the subject too;


JSmith
 

nyxnyxnyx

Addicted to Fun and Learning
Joined
May 22, 2019
Messages
506
Likes
475
Neither have I.

When I wanted to put what I was reading here to a personal test comparing some DACs, I used a multimeter to match the output levels on each DAC I was comparing, sync'd each via optical connections from two CCAs, mixed up the cables so I didn't know which was connected to what input on my preamp, and went back and forth.

That was as far as I needed to go. If I had heard differences, I would have added the double blind element and asked a friend to help, but I couldn't tell them apart. At all.

Do you have gear in mind you want to compare?
I have a lot of things I want to compare and run tests on. Not just amplifiers and DACs but other boutique accessories as well. I'm doing this to settle the last bits of my belief and to make what I believe in more grounded. Subjective audiophiles usually tend to be very entrenched on one side of the field, and rarely or never bother to find out what the other side is like.

I don't believe that a boutique, exotic, or supposedly very special DAC built with the highest-quality components can sound vastly different from other well-designed DACs, and currently I'm a little bit under fire in my local community for thinking this way. So, as a way to confirm what I believe as well as to give others an opportunity to try for themselves, I want to educate myself on this subject. Plus, I've been interested in audio gear for quite a few years, so I think it's time I became more knowledgeable for my own sake.

The method you used is similar to what I thought, but I don't know what is a good extent for the duration of the test. Like, let's say if I run the blinded, volume-matched test (using a high-quality switcher to switch between dacs instantly) 10 times. Then I don't know if that 10 times is sufficient or not or should it be shorter/longer than that. Besides that, I don't have enough knowledge to check on the progress to see if I'm doing everything right and that's the tricky part in my opinion. If I miss some aspects and run the test wrongly then the end result cannot be validated because it was misled from the beginning. If I have to sum it up my goal is like:
1) Find a very effective method to do blind testing
2) Invite participants, who bring DACs whose "sound signature" they are supposed to know and remember very well.
3) Run the test with their DAC(s) and other instrument-grade DAC(s) (I'm thinking of using their/my EXACT system minus the DAC so it'd be fair). The listener is not informed which DAC is currently in use.
4) Conclude the test with certain metrics like how many times they got it right, how hard/easy was it to distinguish the sound quality, if there were any reliable differences (not just SQ but maybe things like hissing, humming noises etc...) during the test.
Moreover, there's a possible issue that I cannot control in this matter, if the test is taking quite long and the listener is not honest, he/she might start to guess the device in use not by what that person is hearing, but by thinking or other ways. It's quite a headache when I think about it.

Anyway, that's what I'm trying to figure out and it's what I'm interested in right now, more than other newly-arrived audio gears at the moment.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,499
Likes
25,313
Location
Alfred, NY
Can you explain to me like I'm 5 about other methods? And I'm curious about non-ears-only evaluation methods as well.
If I understand it right, level-matched means that I should test my devices at the exact same dB, right? Double-blind is when I, as the listener, don't know which device I'm listening to, and the host of the test doesn't know either (leave it to a third party to run the test, I presume)?

I'm trying to determine whether a lot of stuff in this audio hobby can be distinguished reliably or not. That includes any kind of accessories and items that are often associated with snake oil and vagueness (le "audiophile" mumbo jumbos). Based on the great information I've learnt from ASR and other objective-based communities I'm already a believer that you guys make more sense, but I want to experience that for myself as well as to make an attempt to let other local audiophile friends of mine think about it (I'm not trying to convince them, just wanna let them experience both sides).

So based on my goal, do you have any suggestions of a method that will work well for me? For example, I want to host an offline event where we'd do multiple DACs testing, cables at the same lengths, etc... I'm thinking of buying a high-quality switcher and educating myself to know how to do volume-matched testing and have a better understanding of technical knowledge and software applications for the test.
Try this: https://linearaudio.net/sites/linearaudio.net/files/LA Vol 2 Yaniger(1).pdf

"Non-ears-only evaluation methods" are absolutely unreliable.
 

charleski

Major Contributor
Joined
Dec 15, 2019
Messages
1,098
Likes
2,240
Location
Manchester UK
I don't know if that 10 times is sufficient or not
Not. Amir posted on the statistics involved here:

I would plan on doing at least 20 trials for each person, ideally more. While rapid-switching has been shown to be the most reliable way to distinguish primary audio features which rely on low-level neural processing, some audiophiles may well want to listen to each choice for longer periods of time to build up a high-level perception of the audio stream, and that's equally valid. You will also need to allow breaks so they can recover from listener fatigue (listening to the same section of music over and over again gets tedious). This will all take quite a lot of time. If you have a group of people who can come over on a regular basis it might be an idea to give each person a block of five or ten trials an evening and then repeat that across several days.
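
To see why the number of trials matters, here is a minimal sketch of the usual arithmetic: the chance of reaching a given score by pure guessing, using the binomial distribution with a 50% chance per trial. The 5% cut-off is an assumption (a common convention), not something specified in this thread.

```python
from math import comb


def p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def min_correct(trials, alpha=0.05):
    """Smallest score whose guessing probability falls below alpha (None if unreachable)."""
    for correct in range(trials + 1):
        if p_value(correct, trials) < alpha:
            return correct
    return None


for n in (10, 20):
    c = min_correct(n)
    print(f"{n} trials: need {c} correct (p = {p_value(c, n):.3f})")
```

Under those assumptions, 10 trials need 9 correct and 20 trials need 15, so a short test leaves almost no room for a slip even if the listener really does hear a difference.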

he/she might start to guess the device in use not by what that person is hearing, but by thinking or other ways.
Well, if we're talking about differences between DACs, most people here will say they're just guessing anyway :). You will need to make sure there aren't any extraneous cues that might indicate which is in operation (such as the different-sounding relays that I mentioned earlier), but a lot will depend on the precise nature of your setup. It might be an idea to get someone else in to test the system before you start the trials and see if they notice anything that would give the game away. I wouldn't be worried about discrimination "by thinking" so much. If someone believes that 'A' makes them feel happy while 'B' makes them feel a bit annoyed then it's perfectly fine for them to use that as a basis for discrimination (though I suspect it would turn out to be rather unreliable).
 