We are actually extremely good at that, which is a large part of why I used to believe it was very important for the interconnect cables and speaker wires to be the same length... otherwise, the left and right channels would be slightly skewed and the sound stage would be off. The result was 10' or more of extra speaker wire coiled near whichever speaker was closest to the amp.
Then someone on ASR pointed out the math. So while we are very good at locating objects based on timing differential (relative positioning and movement velocity), I don't believe we are that good.
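For anyone who wants to see roughly what that math looks like, here is a minimal Python sketch. The velocity factor of 0.7 is my own assumed ballpark for signal speed in a cable relative to light, not a figure from the thread, and the function name is just for illustration.

```python
# Rough estimate of the extra delay from unequal speaker wire lengths.
# ASSUMPTION: the signal travels at about 0.7 times the speed of light in
# the cable. That velocity factor is a guessed ballpark, not a measured value.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
FEET_TO_METERS = 0.3048


def cable_delay_ns(extra_feet, velocity_factor=0.7):
    """Return the propagation delay in nanoseconds for extra_feet of cable."""
    length_m = extra_feet * FEET_TO_METERS
    delay_s = length_m / (velocity_factor * SPEED_OF_LIGHT_M_PER_S)
    return delay_s * 1e9


delay_ns = cable_delay_ns(10.0)
print(f"10 ft of extra wire adds roughly {delay_ns:.1f} ns")
```

Under that assumption, 10' of extra wire works out to about 14.5 ns, which is on the order of a thousand times smaller than the microsecond-scale differences discussed in this thread.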
Anyone have research references on how small that timing differential can be and still be recognized? Microseconds (as mentioned by @DonH56) is a huge difference from milliseconds as referenced in the Reddit conversation, but both are still much larger than nanoseconds.
I do not have the reference handy; it was something I found many years ago when wondering how important speaker placement was before we had DSP to correct delays. It has to do with how we locate (localize) sounds. The number I had in mind was around 5 or 6 us but don't take that as gospel... This Wikipedia article found with a quick search just says <10 us; chasing the references might provide a better answer. I'd have to dig for the papers I found way back then (may not be in electronic form).
en.wikipedia.org
Edit: I looked through some old posts and found this (from September 2010), but do not have the reference for Gary's assertions. He is a very sharp guy so I tend to believe him. This is a little higher (larger time difference) than I remembered from other sources, however, which were below 10 us. This is the delay in the acoustic (audio, audible) signal reaching the ears. Note TOA is Time Of Arrival.
Lab experiments have shown that the majority of test subjects can detect a source movement of less than 4 inches at 10 ft. If the distance between the ears is 6 inches, what is the difference in source distance between the two ears if the source moves 4 inches sideways?
A challenge! I’ll play…
I’ll work in inches, so 10 feet is 120”. Starting with the person directly in front, the distance to each ear is 120.037494” assuming ears 6” apart. At 1127 fps, TOA is 8.87588688 ms to each ear.
Now move her over 4” to the left, still 120” away. Distance to the left ear is 120.004167”, with TOA 8.87342255 ms. It’s now 120.203993” to the right ear, or 8.88819826 ms, a difference of 0.0147757 ms (about 14.78 us).
I am impressed! Assuming I did the math right (I did not double-check it), that implies we can resolve a time delay of under 15 us, much less than I would have guessed, and the difference in distance to our ears in the latter case is only 0.200”, or 1/5”. Not quite 1/16”, but very small relative to everything else. I still wonder how much that matters in the real world with music, room interaction, and listener movement, but something to think about. Music has much higher-frequency signals than average speech, making time differences (and thus positioning) more critical, one would think.
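Since the math above was not double-checked, here is a short Python sketch that redoes it with the same numbers from the post (120" to the listener, ears 6" apart, sound at 1127 ft/s, source moved 4" to the left). The function names and the flat 2D geometry with the ears on a straight line are my own simplifications for illustration.

```python
import math

# Speed of sound from the post, converted from ft/s to inches per second.
SPEED_OF_SOUND_IN_PER_S = 1127.0 * 12.0


def arrival_times_ms(source_x, distance=120.0, ear_spacing=6.0):
    """Return (left, right) time of arrival in ms.

    The ears sit at x = -ear_spacing/2 and x = +ear_spacing/2, and the
    source sits at (source_x, distance). All lengths are in inches.
    Negative source_x means the source is to the listener's left.
    """
    half = ear_spacing / 2.0
    left_path = math.hypot(distance, source_x + half)
    right_path = math.hypot(distance, source_x - half)
    to_ms = 1000.0 / SPEED_OF_SOUND_IN_PER_S
    return left_path * to_ms, right_path * to_ms


def time_difference_us(source_x, distance=120.0, ear_spacing=6.0):
    """Return right-ear TOA minus left-ear TOA in microseconds."""
    left_ms, right_ms = arrival_times_ms(source_x, distance, ear_spacing)
    return (right_ms - left_ms) * 1000.0


centered_us = time_difference_us(0.0)
shifted_us = time_difference_us(-4.0)
print(f"Centered source: {centered_us:.2f} us")
print(f"Source 4 in. to the left: {shifted_us:.2f} us")
```

This reproduces the figures in the post: zero difference for the centered source and about 14.78 us after the 4" move.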
A range finder it is for me this year… - Don