I personally prefer AA, AB, BA, BB testing, where the pair presented on each trial is chosen at random and the test is whether I can distinguish A from B often enough for the result to be statistically significant. ABX testing, where X has to be identified as either A or B, doesn't work for me: by the time I've heard X, I've forgotten what A and B sound like, so I end up guessing. The AA, AB, BA, BB test, with rapid switching back and forth, is exquisitely sensitive to small differences. It also requires only the answer 'same' or 'different'.
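For anyone who wants to try this, here is a rough sketch in Python of how the trials could be randomised and scored. It is only an illustration under my own assumptions: the function names are made up, each of the four pairs is taken to be equally likely on every trial, and significance is judged with a one-sided binomial test against a 50% guessing rate (half the pairs are 'same', so a pure guesser is right half the time on average).

```python
import random
from math import comb

# The four possible presentations; each is assumed equally likely per trial.
PAIRS = ["AA", "AB", "BA", "BB"]


def make_trials(n, rng=None):
    """Return a random sequence of n pairs drawn from AA, AB, BA, BB."""
    rng = rng or random.Random()
    return [rng.choice(PAIRS) for _ in range(n)]


def is_different(pair):
    """True for AB or BA, False for AA or BB."""
    return pair[0] != pair[1]


def score(trials, answers):
    """Count correct answers; each answer is 'same' or 'different'."""
    return sum(
        (answer == "different") == is_different(pair)
        for pair, answer in zip(trials, answers)
    )


def binomial_p_value(correct, n, p_chance=0.5):
    """One-sided probability of getting at least `correct` right by guessing."""
    return sum(
        comb(n, k) * p_chance**k * (1 - p_chance) ** (n - k)
        for k in range(correct, n + 1)
    )
```

As a worked example, 9 correct out of 10 trials gives `binomial_p_value(9, 10)` of about 0.011, which would usually be taken as good evidence that the difference is audible. With only a handful of trials even a perfect score proves little, so more trials are better.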
S