Given that audio memory is only around 7 seconds, I need to be able to switch quickly.
You are correct that quick (or instant) switching is "better" (a more-sensitive test), BUT as a PRACTICAL MATTER, I'd argue that if you can't hear a difference from one day to the next there's no point in choosing a different DAC or replacing the one you're using. And it makes no economic sense to spend more money for a "better one" if you can't hear the difference the next day.
I'd agree it's probably better to compare 2 devices at a time with an ABX test. Maybe choose one as the "standard" (for example "A") and compare all of the others to "A". If you can't statistically-reliably hear a difference between "A" and any of the others, you're done. If you can hear a difference, things are getting interesting and you can switch-out "A" and repeat, and/or do more advanced quality/preference tests.
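To put a number on "statistically-reliably", here is a minimal sketch of the usual guessing check: if you were only guessing, each trial is a coin flip, so you can work out how likely your score would be by chance. The function name `abx_p_value` and the 16-trial example are my own choices for illustration, not part of any formal protocol.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials`
    by pure guessing (each trial is a 50/50 coin flip)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, out of 2**trials outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session: 12 right out of 16.
    print(f"{abx_p_value(12, 16):.4f}")
```

A small result (people often use 0.05 as the cutoff, but that's a convention, not a law) suggests you're probably not just guessing. A score near half right gives a large value, which is what "can't hear a difference" looks like.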
The protocol for ABX tests is well defined, and it's OK to "cheat" a bit and do it single-blind as long as you're not publishing a formal study. (A double-blind test can be very difficult to do.) Or with certain experiments (like when comparing MP3 to WAV) the computer can do the switching, so it's not truly double-blind (the computer knows if X is "A" or "B") but it can still be as good as a double-blind test.
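As a rough sketch of what "the computer does the switching" means: the program secretly picks whether X is "A" or "B" on each trial, you answer, and it scores you at the end. This leaves out the actual audio playback, and the names `make_abx_trials` and `score_abx` are made up for this example.

```python
import random


def make_abx_trials(n_trials: int, seed=None) -> list:
    """Secretly decide, for each trial, whether X is 'A' or 'B'."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]


def score_abx(hidden: list, answers: list) -> int:
    """Count how many of the listener's answers match the hidden assignment."""
    if len(hidden) != len(answers):
        raise ValueError("need exactly one answer per trial")
    return sum(h == a.upper() for h, a in zip(hidden, answers))


if __name__ == "__main__":
    hidden = make_abx_trials(16)
    # In a real session each answer would come from the listener
    # after hearing A, B, and X; here they are just random guesses.
    guesses = make_abx_trials(16)
    print(score_abx(hidden, guesses), "of", len(hidden), "correct")
```

The important part is that the hidden list is never shown until the session is over, so neither you nor anyone in the room can give the answer away.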
An ABX test doesn't require you to decide which one sounds better. It's simply a test to confirm that you can actually hear a difference, by checking whether you can reliably identify "X" as "A" or "B". So again, if you can't hear a difference you're done! If you can hear a difference you may want to do further listening tests or measurements.
Or the difference may be something obvious like noise (hum, hiss, or whine in the background) and in that case you don't really need a blind test! The problem is, in many cases people just THINK they are hearing a "night and day" difference. See my signature below.
BTW - An audible filter difference is a valid difference so that's something worth investigating if you're reliably hearing a difference.
A level difference is also a valid difference, but it's not a very interesting one. And (again assuming an ABX test) if it allows you to identify "X" every time, people often think they are hearing other differences too, but then those other differences have to be validated by eliminating the level difference and repeating the test...
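If you can capture the output of both devices playing the same test signal, a quick way to check for a level difference is to compare RMS levels in dB. This is only a sketch: it assumes you already have the samples as plain lists of numbers, and `level_difference_db` is a name I made up. People often quote matching to within roughly 0.1 dB as the target, but treat that as a rule of thumb.

```python
import math


def rms(samples: list) -> float:
    """Root-mean-square level of a list of audio samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(a: list, b: list) -> float:
    """How much louder capture `a` is than capture `b`, in dB (negative if quieter).
    Both captures must contain some signal (non-zero RMS)."""
    return 20 * math.log10(rms(a) / rms(b))


if __name__ == "__main__":
    # Hypothetical captures: `louder` has twice the amplitude of `quieter`.
    louder = [0.5, -0.5, 0.5, -0.5]
    quieter = [0.25, -0.25, 0.25, -0.25]
    print(f"{level_difference_db(louder, quieter):.2f} dB")
```

Once the levels are matched, repeat the ABX test and see whether you can still pick out "X".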
Also see Controlled Audio Blind Listening Tests (video).