I'm saying that when someone asks "I want a short delay added," I give them many options. My point is that when I've gone as low as 10 ms, they look at me like I have two heads when I say I added a "delay". I'm not saying they couldn't hear a difference between the original signal and the delayed signal if I asked them to sit down and concentrate. They just don't accept it as a "delay" until it gets up near 100 ms and above; that time range is where it counts as an acceptable "delay" effect. When asked to richen up a signal I'll often add a very short delay, like 10 ms, and yes, it's audible, but the artists don't consider it a delay. So non-professional sound engineers (musicians) have essentially defined for us, empirically, where a "delay" begins and where the effect of a small time delay starts to count as a different style of effect.
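For anyone who wants to try the 10 ms vs 100 ms comparison themselves, here is a rough sketch of what "adding a delay" means at the sample level. The 44100 Hz sample rate, the 0.5 mix level, and the function names are my own assumptions for illustration, not anything from a particular plugin. The notch helper assumes the delayed copy is summed in phase with the dry signal, in which case the lowest comb-filter notch sits at 1/(2 x delay).

```python
def ms_to_samples(delay_ms, sample_rate=44100):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate / 1000)


def add_delay(dry, delay_ms, sample_rate=44100, mix=0.5):
    """Return the dry signal plus one delayed copy scaled by `mix`."""
    d = ms_to_samples(delay_ms, sample_rate)
    out = list(dry) + [0.0] * d          # make room for the delayed tail
    for i, x in enumerate(dry):
        out[i + d] += mix * x            # delayed copy lands d samples later
    return out


def first_notch_hz(delay_ms):
    """Lowest comb-filter notch for dry + in-phase delayed copy."""
    return 1000.0 / (2.0 * delay_ms)


if __name__ == "__main__":
    for ms in (10, 100):
        print(ms, "ms =", ms_to_samples(ms), "samples,",
              "first notch near", first_notch_hz(ms), "Hz")
```

At 10 ms the copy is only 441 samples behind at that sample rate, which is consistent with it being heard as a change in tone rather than as a separate repeat.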
I'm not sure you're comparing this test with the right example. Here, it's not really "do you hear a delay?" but more "do you hear a difference between a kick and hi-hat with attacks aligned and a kick and hi-hat with attacks not aligned?", and you have both to compare.
It would be harder if there were only one file and you were asked to find the delay between the two attacks in it.
Not the same thing, but it's a bit like how some people can identify notes after hearing an A 440 but not without that reference first, while others can identify any note with no reference at all.
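To make the "aligned vs not aligned" comparison concrete, here is a minimal sketch of how such a pair of test signals could be built. This is not how the actual test files were made; the kick and hi-hat are crude synthetic stand-ins, and the sample rate, decay constants, and function names are all assumptions of mine.

```python
import math
import random

SAMPLE_RATE = 44100  # assumed sample rate


def kick(n=4000, freq=60.0):
    """Crude kick stand-in: a decaying low-frequency cosine."""
    return [math.cos(2 * math.pi * freq * i / SAMPLE_RATE) * math.exp(-i / 800)
            for i in range(n)]


def hihat(n=1500, seed=0):
    """Crude hi-hat stand-in: a short burst of decaying noise."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) * math.exp(-i / 300) for i in range(n)]


def mix_with_offset(offset_ms):
    """Mix kick and hi-hat, with the hi-hat attack starting offset_ms later.

    Returns the mixed buffer and the hi-hat start index in samples.
    """
    start = round(offset_ms * SAMPLE_RATE / 1000)
    k, h = kick(), hihat()
    out = [0.0] * max(len(k), start + len(h))
    for i, x in enumerate(k):
        out[i] += 0.5 * x
    for i, x in enumerate(h):
        out[start + i] += 0.5 * x
    return out, start


if __name__ == "__main__":
    aligned, _ = mix_with_offset(0)
    shifted, start = mix_with_offset(1)
    print("hi-hat starts", start, "samples after the kick")
```

With both versions available you are doing an A/B comparison, which is a much easier task than judging the offset inside a single file with nothing to compare against.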
Just using my computer monitor's speakers, I found the test pretty easy, especially at 1 ms, which to me sounded more different than 5 ms or 2 ms. Is it possible that crappy speakers make this easier? This test doesn't seem right to me.
Depends on your speakers, but I found this test much easier on 2-way speakers (even small ones) than on headphones. The first reason I can think of, and maybe I'm wrong, is that most of the hi-hat comes from the tweeter and most of the kick comes from the woofer, so it adds source separation.