What to do about the ABX test?

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,168
Likes
3,712
There seems to be increased commentary in recent weeks about ABX tests, much of it stemming from people who come to ASR to set us straight about trusting our ears. I do agree with those who have said that calls of "ABX or it didn’t happen" have become almost like a club to beat people over the head with, and nearly cultish in how the call rains down upon some new posters. Not that I haven’t been guilty of it myself.

Some comments by @restorer-john have caused me to think about this situation. With this approach we stand little chance of convincing people, or of engaging in meaningful discussion with them. Like restorer-john, I think there is a lot more talk about ABX listening tests than participation in or use of them among most posters. For most audiophiles it is impractical in most situations.

Some who don’t like ABX tests complain they are stressful. Only if you feel challenged by it or think you’ll suffer loss of face. After you have done it a couple or three times it isn’t stressful. It is major league TEDIOUS and BORING. Most of us do them with Foobar ABX or similar software. That isn’t very useful for amps and not at all for speakers.

So what is a next best alternative? What is a friendlier way to get the point across? How do regular ASR members pick their gear?

We don't need an 'alternative'. All anyone needs is to understand and acknowledge the *fact* that their 'sighted' audio claims of difference are going to be subject to cognitive bias. Which means their claims should be accordingly tempered, qualified, or supported with excellent proof.

It's a matter of language, in other words.
 

Sgt. Ear Ache

Major Contributor
Joined
Jun 18, 2019
Messages
1,894
Likes
4,150
Location
Winnipeg Canada
In most cases we're talking about distinctions that could best be described as "infinitesimal." Differences between cables or DACs or (competent) amps...even if we allow that there might possibly be some difference (even if only measurable), the chances of audibility are minute. So, my first suggestion to anyone who thinks they are hearing readily identifiable differences (because that's what is often claimed, right? "I just hooked up my new DAC and OMG it's like night and day!") is to go take a few of the online ABX tests that are easy to find and see if, for instance, they can readily tell 320 kbps MP3s apart from FLAC files. Because the differences between those are orders of magnitude larger than the differences between almost any two DACs or between cables or whatnot. For myself, that's the sort of thing that clearly convinces me that I'm not, under any sort of normal listening circumstances, distinguishing differences between any two things that measure as close to the same as DACs and amps and cables do.

Also, a simple hearing test can go a long way toward enlightening anyone who thinks they are hearing ultrasonics...lol. I can't hear anything above 15 kHz, so I don't waste much of my time worrying about a 1 dB roll-off from 19 kHz up.

I mean, it's not like the claims go like this: "under extremely careful listening circumstances I could hear subtle differences between this and that." Instead it's people hooking up gear in their living room and, based on what they remember their old gear sounding like, deciding they can hear remarkable new dimensions in clarity and soundstage and blah blah blah. There's such a degree of ridiculousness to the claims that they basically fall apart if one is open enough to apply even the simplest logic to them...
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,168
Likes
3,712
Blind tests are the best, most discriminating method. I find I can detect some very small differences with 100% reliability when using two segments of 5 seconds or less and rapid switching. OTOH, on some of those I score 50/50 if the segments are 15 or 30 seconds long. I have found that anything I only hear using very short segments, both of which can fit inside my echoic memory, is so small that it has zero relevance to normal music listening. So on one hand, if you cannot hear something using short, rapid-switching listening tests, it is a pretty sure bet you cannot hear it. On the other, if the difference isn’t large enough to hear with 30-second segments, it isn’t big enough to matter for music listening.
And I think this is a very, very important consideration. Far too often debates about blind audio tests bog down in 'best case' results...results that pretty much can only be obtained using optimized protocols that are FAR in excess of the sensitivity available during typical listening. (Or, in one infamous case, require meta-analysis to 'extract' from decades of published data).

Yet it is 'typical listening' (not to mention 'sighted') that audiophiles usually base their claims on. Aka 'real world' listening.

So, if a rigorous test, using careful level matching and instant switching and short, maximally revealing snippets of sound, and possibly after some 'training' on same*, does prove that it is possible to do better than p=0.05 telling A from B ....so effing what? Does it mean we should believe you, Joe Audiophile, are hearing it in your La-Z-Boy playing Diana Krall? Nope. Don't kid yourself.



(*I am very much thinking of Amir's tests of high-bitrate mp3s vs lossless here, in case you are wondering)
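
For readers wondering what "better than p=0.05" means in practice, here is a minimal Python sketch of the usual one-sided binomial calculation for an ABX run. It assumes each trial is an independent 50/50 guess under the null hypothesis; the function names `abx_p_value` and `min_correct_for` are made up for this illustration and do not come from Foobar ABX or any other tool.

```python
from math import comb
from typing import Optional


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Sum the upper tail of a Binomial(trials, 0.5) distribution.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def min_correct_for(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest score whose guessing probability is at or below `alpha`.

    Returns None when even a perfect score cannot reach `alpha`
    (which happens when the run is too short).
    """
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) <= alpha:
            return correct
    return None


for n in (10, 16):
    print(n, "trials: need at least", min_correct_for(n), "correct")
```

Under these assumptions, 12 correct out of 16 gives p of roughly 0.038, while 11 out of 16 gives roughly 0.105 and would not clear the 0.05 bar. Passing says only that guessing is an unlikely explanation for that run; it says nothing about how large the difference is or whether it matters in ordinary listening.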
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,168
Likes
3,712
I have no idea what you are talking about. Here is an example of ABX testing that was done by the very people who popularized such tests:


[Attached image: i-NVbTMcL-XL.png]

How is this not "comparing one device to another to determine if you can tell them apart?"

Amps can sound different...e.g., when one or more of them is clipping. No one claims otherwise. I'm invoking the ghost of Arny Krueger to chide you for not noting that here (you noted it in the original ASR post).
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,168
Likes
3,712
ABX testing is not the correct test to discover a preference.

Blind conditions are certainly crucial if you want to pin the basis of the preference on the audio alone.

It's why blind testing is, of course, used in Toole/Olive speaker preference research. Speakers do sound different, but preference can still be biased by non-audio factors.
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,168
Likes
3,712
Exactly. I made this point only a few days ago. It means absolutely nothing to anyone else, except the person taking the test. It's not the equivalent of some "preference curve"...
Well, no, if the test is actually well-controlled, and the reporting source is not lying, it provides evidence that the difference under test can be heard. That it is 'real'.

What it doesn't mean, by itself, is that you will hear it.
 

Sgt. Ear Ache

Major Contributor
Joined
Jun 18, 2019
Messages
1,894
Likes
4,150
Location
Winnipeg Canada
A blind test on one individual isn't some sort of "preference curve." But a whole bunch of blind tests on a whole bunch of people can certainly be used to establish a preference curve. Much like blind testing food (pepsi vs coke for instance) can establish a taste preference. You can do a thousand tests and then say 66% preferred Pepsi. Such a test would obviously be invalidated if you could see the brand of each as you tasted them. In the same way, you can conduct a whole series of blind tests on speakers, record the results of those tests, and then examine the speakers to determine what audible characteristics the "most-preferred" models share...from which you might be able to say something like "66% of people seem to prefer such and such a tonality."
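
The arithmetic behind a statement like "66% preferred Pepsi" can be sketched in a few lines of Python. This is only an illustration using a normal-approximation 95% interval, assuming independent tasters and a simple two-way choice; the counts are the hypothetical ones from the post, not real data, and `preference_share` is a made-up helper name.

```python
import math


def preference_share(preferred: int, total: int, z: float = 1.96):
    """Return (share, low, high) for a two-way blind preference tally.

    `low` and `high` bound an approximate 95% interval based on the
    normal approximation to the binomial distribution.
    """
    if total <= 0 or not 0 <= preferred <= total:
        raise ValueError("need total > 0 and 0 <= preferred <= total")
    share = preferred / total
    margin = z * math.sqrt(share * (1 - share) / total)
    return share, max(0.0, share - margin), min(1.0, share + margin)


# Hypothetical tally: 660 of 1000 blind tasters picked the same drink.
share, low, high = preference_share(660, 1000)
print(f"{share:.0%} preferred, roughly {low:.1%} to {high:.1%}")
```

With a thousand tasters the interval is only a few points wide, which is why a large pool of blind trials can support a population-level statement that a single listener's result cannot.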
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,834
Likes
16,496
Location
Monument, CO
It used to be a practical solution. From 2002 until about 2015, we held numerous meetings of audio fans here, usually for 2-3 days at some hotel, where we hired a conference room for listening sessions. However, with the expansion of social networks, this activity unfortunately ceased. Now we have what we have: social networks open to everyone, with all their pros and cons.
Professionals versus con-artists?

Good things versus bad things?

Works either way...

ABX is hard to do well when needing to switch HW and not just files in Foobar or whatever. I have now and then wished I had kept my random AB tester built in college with relays and SSI/MSI chips on a breadboard. Push a button to select a random output, and the logic inside kept track of the order for a few (forgotten how many, maybe 16) trials. I had to manually look at the switcher to get the order, then collate the lists (on paper) from the listeners. These days a bright puppy would probably use a Raspberry Pi or something to drive the relays and keep track, maybe give each listener a couple of buttons to choose A or B for each trial and take most of the manual recording out of it.
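
As a rough illustration of the bookkeeping such a box would do, here is a minimal Python sketch. It is not the original breadboard logic and it does not drive real relays: the `switch` callback is a hypothetical stand-in for whatever GPIO code a Raspberry Pi build would use, and the class and method names are invented for this example.

```python
import random
from typing import Callable, Dict, List, Optional


class BlindABSession:
    """Keeps the hidden A/B order and the listeners' guesses for one session."""

    def __init__(self, trials: int = 16,
                 switch: Optional[Callable[[str], None]] = None,
                 seed: Optional[int] = None) -> None:
        rng = random.Random(seed)  # pass a seed only for repeatable dry runs
        # Hidden order: one randomly chosen device label per trial.
        self.order: List[str] = [rng.choice("AB") for _ in range(trials)]
        # Hypothetical hook that would energise the relay for "A" or "B".
        self.switch = switch or (lambda label: None)
        self.answers: Dict[str, Dict[int, str]] = {}

    def run_trial(self, index: int, guesses: Dict[str, str]) -> None:
        """Select the hidden device for this trial and store each guess."""
        self.switch(self.order[index])
        for listener, guess in guesses.items():
            if guess not in ("A", "B"):
                raise ValueError(f"guess must be 'A' or 'B', got {guess!r}")
            self.answers.setdefault(listener, {})[index] = guess

    def scores(self) -> Dict[str, int]:
        """Number of correct guesses per listener."""
        return {
            listener: sum(self.order[i] == g for i, g in given.items())
            for listener, given in self.answers.items()
        }


# Dry run with two imaginary listeners who always press the same button.
session = BlindABSession(trials=16)
for trial in range(16):
    session.run_trial(trial, {"listener_1": "A", "listener_2": "B"})
print(session.scores())
```

Keeping the order inside the object until `scores()` is called mirrors the idea of the logic tracking the sequence so that nobody has to read it off the switcher by hand during the session.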
 

kemmler3D

Major Contributor
Forum Donor
Joined
Aug 25, 2022
Messages
3,008
Likes
5,604
Location
San Francisco
A few thoughts on this.

First, what is the goal (explicit or implicit) when a new person shows up, and posts "I heard a clear difference between ______" - and we say to go do an ABX test?

1. Discourage / chase off a possible troll?
2. Enlighten someone who is mistaken about what they heard?
3. Feel superior to the n00b?

I think #1 is best left to mods and #3 is an unfortunate tendency but not a legitimate reason for response. The only proper goal here is #2.

What we're trying to do is bring people around to the science-based point of view - working from most likely explanations for what they experienced, to least.

If I tell you "Hey, I just heard X" and you tell me "Pfft, doubt it, ABX?" enlightenment is not a likely outcome. I think a better approach would be to send them to a noob-friendly guide to common mistakes in listening tests / a guide to proper listening test methodology. More flies with honey than vinegar, etc.

Anyone who has actually conducted a test of any kind is probably open to a scientific mindset. They're already trying harder than 99% of the people out there. But I don't think we can get them there by frustrating and discouraging them in the first post. So I do agree that the "ABX or it didn't happen" approach isn't a good one.

Yes, that message does have to get across eventually. But you will not successfully get it across before you have helped them clear a few more hurdles of understanding.
 

Rednaxela

Major Contributor
Joined
Mar 30, 2022
Messages
2,051
Likes
2,673
Location
NL
So what other things can we do, or do some of you already do, that are useful? What is a more effective way to engage people who don’t understand what can and cannot be heard, without chiming in over and over with “hey, do an ABX test or it didn’t happen”?
Maybe we should simply formulate what the S in ASR means to us. A manifesto of sorts to refer to instead of asking for the dreaded ABX tests.

It’s not like we try and silence everybody about everything all the time in the name of science. It is a very specific kind of claim, combined with a certain argumentation pattern, that triggers the "ABX or else" response. These things are and have to be challenged here for very specific reasons.

Perhaps we can try and put these reasons into words? Might be an interesting exercise.
 

kemmler3D

Major Contributor
Forum Donor
Joined
Aug 25, 2022
Messages
3,008
Likes
5,604
Location
San Francisco
Perhaps we can try and put these reasons into words? Might be an interesting exercise.
Agreed, a friendly, informative introductory post, a "stock reply", seems like it would be helpful. It's not uncommon on different forums (Reddit has this a lot) for new posters to get hit with an auto-reply that covers the FAQs and common issues with first posts.
 

voodooless

Grand Contributor
Forum Donor
Joined
Jun 16, 2020
Messages
10,223
Likes
17,799
Location
Netherlands
Perhaps we can try and put these reasons into words? Might be an interesting exercise.
How horrible it would be for unsuspecting members to be shown a standard response to their very specific question ;) You can probably train a chatbot to do this :cool:.

But seriously, a list of standard responses for ever-recurring questions would be great. An ASR FAQ of sorts:

Q: I hear the difference between speaker cable A and B, now what?
A: Don’t worry, this is a completely normal and human thing…. Blablablabla… etc…

Obviously we can eternally bicker over the exact content and phrasing, but at least we’ll be annoying each other, and not new members ;)
 

DonR

Major Contributor
Joined
Jan 25, 2022
Messages
2,968
Likes
5,611
Location
Vancouver(ish)
How horrible it would be for unsuspecting members to be shown a standard response to their very specific question ;) You can probably train a chatbot to do this :cool:.

But seriously, a list of standard responses for ever-recurring questions would be great. An ASR FAQ of sorts:

Q: I hear the difference between speaker cable A and B, now what?
A: Don’t worry, this is a completely normal and human thing…. Blablablabla… etc…

Obviously we can eternally bicker over the exact content and phrasing, but at least we’ll be annoying each other, and not new members ;)
I think an FAQ would be a great idea. @amirm has made lots of teaching videos that would be useful there.
 

kemmler3D

Major Contributor
Forum Donor
Joined
Aug 25, 2022
Messages
3,008
Likes
5,604
Location
San Francisco
Q) Some jerk keeps telling me to do an ABX test. What is an ABX test and why is it such a big deal?

A) You are being asked about ABX because you're probably trolling, please GTFO N00B!

OK, great start! :D

But in all seriousness, if there isn't a good new-member-oriented FAQ yet, one could go a long way. Perhaps we could start drafting it collaboratively in Google Docs or similar.
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,834
Likes
16,496
Location
Monument, CO
IME/IMO the problem is that without actually doing the AB(X) testing a listener is unlikely to be convinced a difference he (she, whatever) heard is not really there. A default answer about the fallibility of hearing and links to perceptual studies would be good to have but likely the listener will be unconvinced even if he bothers to read it. Decades ago I was absolutely certain about what I heard and convinced it would be readily discerned in a blind AB test. I was wrong, and it was a humbling introduction into what I thought I heard, versus what was actually there. But "I know what I heard" leads to cementing bias into place and "unhearing" it is virtually impossible, again IME.

That said there are plenty of things that can be heard and pass an AB(X) difference test, including the way an amplifier interacts with a speaker and so forth. Even then the things mentioned earlier can muck up a test... For example, if one amplifier has a higher noise floor, it may be easy to pick that out, even if the actual musical signals are identical between two (or more) amplifiers. That is a problem I was never able to fully resolve way back when I was running tests.

FWIWFM - Don
 

ahofer

Major Contributor
Forum Donor
Joined
Jun 3, 2019
Messages
4,947
Likes
8,694
Location
New York City
I am NOT saying there are NO DIFFERENCES in a literal academic sense. I am saying the correct advice to 99.99% of the people looking for recommendations is "get a basic level of performance from your electronics and then STOP READING ABOUT IT. Instead, focus on speakers/room treatment/room correction/bass management, any of which is far more important."
Agree.

Even the difference between well-measuring speakers is pretty small, one that 99.9% of people could easily live with. Part of this hobby is splitting hairs, though, even hairs inside the brain, apparently.
 

Shadrach

Addicted to Fun and Learning
Joined
Feb 24, 2019
Messages
662
Likes
947
Suppose a difference can be found in the upper treble (so young ears only), or requires very good headphones/speakers and/or training, and 10 people take 'the test'. Nine fail, and the one who passes was trained, had the right gear, and could reliably detect the difference. Does that one test prove audibility, or are the other 9 truly showing no audibility?
It depends on the test. The case you put forward is not a comparison between two products and not a fair test. Everyone would need to listen to the same system.
 

kemmler3D

Major Contributor
Forum Donor
Joined
Aug 25, 2022
Messages
3,008
Likes
5,604
Location
San Francisco
a listener is unlikely to be convinced a difference he (she, whatever) heard is not really there.
This is true. However, IMO in most cases perceived differences aren't actually pure imagination, they're usually because of imperfect level matching. Telling someone "yes, you heard something real, but it's not what you thought it was" is far easier to accept than "nope, just your brain playing tricks on you".

Getting someone up to speed on Fletcher-Munson and loudness effects is pretty doable via a FAQ or something, but is a pain in the ass to do every time someone pipes up with "massive differences between DACs" or whatever.
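
To make the level-matching point concrete, here is a small Python sketch that estimates the level difference between two captures of the same passage and the gain needed to match them. It assumes you already have both as lists of samples on the same scale; `rms`, `level_difference_db` and `matching_gain` are names invented for this example, and the 1 kHz tone and 0.9 scale factor are arbitrary illustrations rather than measurements of any device.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square value of a block of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def level_difference_db(a: Sequence[float], b: Sequence[float]) -> float:
    """Level of `a` relative to `b` in dB (positive means `a` is louder)."""
    rms_a, rms_b = rms(a), rms(b)
    if rms_a == 0 or rms_b == 0:
        raise ValueError("cannot compare against a silent signal")
    return 20 * math.log10(rms_a / rms_b)


def matching_gain(a: Sequence[float], b: Sequence[float]) -> float:
    """Linear gain to apply to `b` so that its RMS level equals that of `a`."""
    return 10 ** (level_difference_db(a, b) / 20)


# Illustration: the second "device" plays the same tone about 10% quieter.
tone = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]
quieter = [0.9 * x for x in tone]

print(f"difference: {level_difference_db(tone, quieter):.2f} dB")
matched = [matching_gain(tone, quieter) * x for x in quieter]
print(f"after matching: {abs(level_difference_db(tone, matched)):.2f} dB")
```

A 10% amplitude difference works out to a little under 1 dB, which is the kind of small mismatch the post suggests can be heard as a difference in quality rather than in loudness.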
 

ahofer

Major Contributor
Forum Donor
Joined
Jun 3, 2019
Messages
4,947
Likes
8,694
Location
New York City
Suppose a difference can be found in the upper treble (so young ears only), or requires very good headphones/speakers and/or training, and 10 people take 'the test'. Nine fail, and the one who passes was trained, had the right gear, and could reliably detect the difference. Does that one test prove audibility, or are the other 9 truly showing no audibility?
This is sort of the case with digital resolution trials, which you can do all over the internet. Only a few trained listeners can tell the difference between Redbook and higher resolutions. Not that many can even distinguish higher-bitrate MP3 from Redbook. Those who can, however, can do it reliably.

It comes back to what we were discussing in my last post (right advice for 99.9%). The difference may be "audible" but so small that one wonders why we care.
 