
The Physics of 3-D Soundstage

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
Here's a physics fact: the intended 3-dimensional stereo soundstage is ONLY possible along the center line between precisely placed stereo speakers.

Here’s why: Stereo reproduction uses either time-of-arrival differences (delay) or intensity differences (volume differences) between channels to place a sound at a specific angle within the soundstage (see photo 1).
[Photo 1: graph1-(1).jpg]

For example, to place the image in the center, there must be zero time delay between the sound waves from the left and right channels arriving at your ears, or the sound from each speaker must be exactly the same volume, or a combination of the two. To make the sound source appear 30 degrees off-center, a time delay of approximately 1.1 ms, a level difference of approximately 15 dB, or a combination of the two (e.g. 0.5 ms with a 6 dB volume difference) must be used (see photo 2).
[Photo 2: graph2.jpg]

Long story short, the ONLY way to get the correct time delay or volume difference at your ears is to be EXACTLY mid-line between well-placed speakers. If you're off to the right or left, by the laws of physics, you are changing the time-delay and sound pressure difference at your ears, and the image/soundstage will shift accordingly.
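
For a rough feel for the numbers, here is a quick Python sketch (my own illustration, not from the DPA article) that estimates the inter-speaker time and level differences at a listening seat that slides off the center line. It assumes free-field point sources with simple 1/r level falloff, a 2 m speaker spacing, a 2 m listening distance, and a 343 m/s speed of sound.

Code:
import math

C = 343.0  # speed of sound in m/s (assumed)

def speaker_differences(offset_m, spacing_m=2.0, distance_m=2.0):
    """Time and level difference between the two speakers as heard at a
    seat shifted offset_m to the right of the center line.
    Assumes point sources in free field with 1/r level falloff."""
    left = (-spacing_m / 2.0, 0.0)       # left speaker position (x, y)
    right = (+spacing_m / 2.0, 0.0)      # right speaker position
    listener = (offset_m, distance_m)    # listening seat

    d_left = math.dist(left, listener)
    d_right = math.dist(right, listener)

    time_diff_ms = (d_left - d_right) / C * 1000.0       # >0: left arrives later
    level_diff_db = 20.0 * math.log10(d_left / d_right)  # >0: left is quieter
    return time_diff_ms, level_diff_db

for offset in (0.0, 0.1, 0.25, 0.5):
    dt, dl = speaker_differences(offset)
    print(f"{offset * 100:4.0f} cm off-center: {dt:+.3f} ms, {dl:+.2f} dB")

With these assumed dimensions, sitting about half a metre off-center already skews the arrival times by more than 1 ms, i.e. on the order of the ~1.1 ms that shifts an image fully to one side, while the level difference stays under 2 dB.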

source: https://www.dpamicrophones.com/mic-...v2UiPaHNb0uc7mi30HmoW2nimVctCU-c8wZBC2E5p_5JA
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
Does this account for cross-talk between the ears? In conventional stereo setups, most of the ITDs in the source material will be lost, since both ears hear both speakers. I would suspect that most of the imaging comes principally from ILDs. Obviously some timing differences can be heard, but I think the need to be exactly on the ℄ of the speakers is likely to be an academic argument, no?

I think the above is still important, as you say, but likely to be of most interest when recording and mixing. Most of us don't have setups that are as well behaved as what a recording studio will have. There are headphones, of course, and someone chime in if I'm wrong, but I think most speaker setups will have enough cross-talk and reflections mixed in with the direct sound to obscure the impact of time delays, since each speaker's output reaches both the ipsilateral and contralateral ears.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,521
Likes
37,050
All depends upon the particulars. A setup like the one in the graph actually creates time-of-arrival differences at the ears from level differences between the channels. Some microphone techniques have both, and as long as you are close enough to the sweet spot, level and time-delay differences both factor into where you hear the sound coming from.

If you don't believe it, you can quite easily test it for yourself. Take a mono signal, put it at equal loudness in both speakers, and then delay one channel versus the other. Conversely, with no time delay, make one channel louder than the other. All easily accomplished in something like Audacity to create the test tracks, and all described by Alan Blumlein in his early stereo patents.
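
If anyone wants the test tracks without reaching for Audacity, here is a minimal Python sketch (my own, assuming NumPy and SciPy are installed) that writes two stereo WAV files from a mono tone: one with a pure inter-channel delay and one with a pure level difference. The 0.5 ms delay, 6 dB offset, file names, and 48 kHz sample rate are just illustrative values.

Code:
import numpy as np
from scipy.io import wavfile

FS = 48000  # sample rate in Hz (illustrative)

def mono_tone(freq=440.0, seconds=2.0):
    """Mono test tone with short fades to avoid clicks."""
    t = np.arange(int(FS * seconds)) / FS
    x = 0.5 * np.sin(2 * np.pi * freq * t)
    fade = int(0.01 * FS)
    ramp = np.linspace(0.0, 1.0, fade)
    x[:fade] *= ramp
    x[-fade:] *= ramp[::-1]
    return x.astype(np.float32)

def delay_panned(x, delay_ms):
    """Equal level in both channels, right channel delayed by delay_ms."""
    n = int(FS * delay_ms / 1000.0)
    right = np.concatenate([np.zeros(n, dtype=x.dtype), x[:len(x) - n]])
    return np.column_stack([x, right])

def level_panned(x, level_db):
    """No delay, right channel attenuated by level_db."""
    gain = 10.0 ** (-level_db / 20.0)
    return np.column_stack([x, (x * gain).astype(x.dtype)])

x = mono_tone()
wavfile.write("delay_pan_0p5ms.wav", FS, delay_panned(x, 0.5))
wavfile.write("level_pan_6dB.wav", FS, level_panned(x, 6.0))

Played back over speakers, both files should pull the image toward the left (earlier/louder) speaker, which is exactly the comparison described above.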
 
OP

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
Does this account for cross-talk between the ears? In conventional stereo setups, most of the ITDs in the source material will be lost, since both ears hear both speakers. I would suspect that most of the imaging comes principally from ILDs. Obviously some timing differences can be heard, but I think the need to be exactly on the ℄ of the speakers is likely to be an academic argument, no?

I think the above is still important, as you say, but likely to be of most interest when recording and mixing. Most of us don't have setups that are as well behaved as what a recording studio will have. There are headphones, of course, and someone chime in if I'm wrong, but I think most speaker setups will have enough cross-talk and reflections mixed in with the direct sound to obscure the impact of time delays, since each speaker's output reaches both the ipsilateral and contralateral ears.
This takes into account the distance between the ears. The maximum time delay between the ears (head width d ≈ 17 cm) is t = d / v = 0.17 m / 340 m/s ≈ 0.5 ms. This is why stereo mics are often separated by 17 cm.
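
For anyone curious how that delay scales with angle, here is a tiny Python sketch using the simple d·sin(θ)/v approximation (my simplification; it ignores path bending around the head), with the 17 cm ear spacing and 340 m/s from above.

Code:
import math

D = 0.17   # ear spacing in metres (from the post above)
V = 340.0  # speed of sound in m/s

def itd_ms(azimuth_deg):
    """Inter-aural time difference for a distant source at azimuth_deg,
    using the simple d*sin(theta)/v approximation (head diffraction ignored)."""
    return D * math.sin(math.radians(azimuth_deg)) / V * 1000.0

for angle in (0, 15, 30, 60, 90):
    print(f"{angle:2d} deg: {itd_ms(angle):.2f} ms")

At 90 degrees this lands on the 0.5 ms maximum worked out above; a source 30 degrees off to one side gives roughly half that.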
 

restorer-john

Grand Contributor
Joined
Mar 1, 2018
Messages
12,579
Likes
38,278
Location
Gold Coast, Queensland, Australia
Here’s why: Stereo reproduction uses either time-of-arrival differences (delay) or intensity differences (volume differences) between channels to place a sound at a specific angle within the soundstage (see photo 1).

Right. Forget about deliberate manipulation of phase, huh?
 
OP

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
All depends upon the particulars. A setup like the one in the graph actually creates time-of-arrival differences at the ears from level differences between the channels. Some microphone techniques have both, and as long as you are close enough to the sweet spot, level and time-delay differences both factor into where you hear the sound coming from.

If you don't believe it, you can quite easily test it for yourself. Take a mono signal, put it at equal loudness in both speakers, and then delay one channel versus the other. Conversely, with no time delay, make one channel louder than the other. All easily accomplished in something like Audacity to create the test tracks, and all described by Alan Blumlein in his early stereo patents.
The above graph accounts for both (i.e. it is for microphone techniques that use both time and level differences). I'm 100% aware that (some) stereo mics use BOTH time delay and level difference.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
This takes into account the distance between the ears. The maximum time delay between the ears (head width d ≈ 17 cm) is t = d / v = 0.17 m / 340 m/s ≈ 0.5 ms. This is why stereo mics are often separated by 17 cm.
I agree that it's completely correct from a physics standpoint, and something that needs to be known when one is recording using microphone arrays. But I'm not sure how much this ultimately translates to real-world speaker setups, given all the variables involved. The only time I can get a true 3D soundstage with a 2-channel setup is in the near field, with an RFZ (reflection-free zone) established around the listening position. Once there are significant reflections mixed with the direct sound, and substantial cross-talk, the soundstage collapses down to more conventional stereo imaging, and how much the timing and level differences in the source material itself matter becomes more variable. Something to keep in mind, as things may not translate fully for all listeners; compromises obviously need to be made, and that's where experience comes in on the part of the mixing engineer.
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,523
Likes
3,745
Location
Princeton, Texas
I'm still learning how phase is used to place things. It's very cool physics. That dog in Roger Waters' "The Ballad of Bill Hubbard" is in my backyard. Very clever.

My understanding is that the wrap-around-the-head portion of the signal from the left speaker, at the instant in time it arrives at the right ear, is cancelled by a slightly delayed and phase-inverted signal coming from the right speaker. This effectively mimics a maximum left-azimuth ("9 o'clock") direction cue, so the dog comes from the extreme left, beyond the plane of the speakers, and then the reverberation tail adds the distance cues. (Or maybe it's from the extreme right; I haven't listened to that track in many years.)
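
For the curious, here is a minimal Python sketch of that kind of trick (my own illustration, not necessarily how that recording was actually made): the right channel carries a delayed, attenuated, phase-inverted copy of the left signal, timed to cancel the left speaker's wrap-around sound at the right ear. The 0.25 ms delay, 0.7 gain, and 48 kHz sample rate are placeholder values; in reality they would have to match the listener's head geometry.

Code:
import numpy as np

FS = 48000  # sample rate in Hz (illustrative)

def push_beyond_left(x, wrap_delay_ms=0.25, wrap_gain=0.7):
    """Pan a mono signal x beyond the left speaker.
    Left channel: the dry signal.
    Right channel: a delayed, attenuated, phase-inverted copy meant to
    arrive at the right ear together with the left speaker's wrap-around
    sound and cancel it."""
    n = int(FS * wrap_delay_ms / 1000.0)
    delayed = np.concatenate([np.zeros(n, dtype=x.dtype), x[:len(x) - n]])
    return np.column_stack([x, -wrap_gain * delayed])

Because the cancellation only lines up for one head position, this is another reason such extreme spatial effects are so sweet-spot dependent.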
 
OP

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
I agree that it's completely correct from a physics standpoint, and something that needs to be known when one is recording using microphone arrays. But I'm not sure how much this ultimately translates to real-world speaker setups, given all the variables involved. The only time I can get a true 3D soundstage with a 2-channel setup is in the near field, with an RFZ (reflection-free zone) established around the listening position. Once there are significant reflections mixed with the direct sound, and substantial cross-talk, the soundstage collapses down to more conventional stereo imaging, and how much the timing and level differences in the source material itself matter becomes more variable. Something to keep in mind, as things may not translate fully for all listeners; compromises obviously need to be made, and that's where experience comes in on the part of the mixing engineer.
I’ve been able to get an immersive 3-D soundstage in my room by getting all first reflections >6 ms after the direct sound (and attenuated as much as possible with absorption) and sitting about 30 cm inside the tip of an equilateral triangle (quite near field). This is of course hugely room- and speaker-dependent, but definitely possible. I can place each instrument of an orchestra in 3-D space in, for example, “Jack Sparrow” by the Royal Philharmonic. Very cool effect.
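
To put a number on the ">6 ms" part, here is a quick Python sketch using the mirror-image-source construction (my own illustration, with made-up room dimensions) that estimates how much later a sidewall first reflection arrives than the direct sound.

Code:
import math

C = 343.0  # speed of sound in m/s (assumed)

def first_reflection_delay_ms(speaker, listener, wall_x):
    """Delay of the sidewall first reflection relative to the direct sound.
    Mirror-image-source method: reflect the speaker across the wall at
    x = wall_x; the reflected path length is the straight-line distance
    from that image source to the listener."""
    sx, sy = speaker
    image = (2.0 * wall_x - sx, sy)   # speaker mirrored across the wall
    direct = math.dist(speaker, listener)
    reflected = math.dist(image, listener)
    return (reflected - direct) / C * 1000.0

# Illustrative geometry: speaker 1.5 m out from the left sidewall (at x = 0),
# listener sitting near-field roughly 1.1 m away.
speaker = (1.5, 0.0)
listener = (2.0, 1.0)
print(f"{first_reflection_delay_ms(speaker, listener, wall_x=0.0):.1f} ms")

With that geometry the sidewall bounce lands roughly 7 ms behind the direct sound, so pulling the speakers and seat well away from the walls (plus absorption) is what buys the window.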
 

jae

Major Contributor
Joined
Dec 2, 2019
Messages
1,208
Likes
1,508
The nature of the sound/signal is significant as well, since the psychoacoustic significance of both ITDs and ILDs is frequency-specific. Sensitivity to timing differences is much more of a factor at lower frequencies, while the opposite is true at higher ones.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
I’ve been able to get an immersive 3-D soundstage in my room by getting all first reflections >6 ms after the direct sound (and attenuated as much as possible with absorption) and sitting about 30 cm inside the tip of an equilateral triangle (quite near field). This is of course hugely room- and speaker-dependent, but definitely possible. I can place each instrument of an orchestra in 3-D space in, for example, “Jack Sparrow” by the Royal Philharmonic. Very cool effect.
That's basically similar to the method I use. I keep the first 20 ms free of reflections, and then have the early reflections arrive as a tight group at the end of the fusion window so they get integrated together (which basically means using furniture and whatnot to absorb side reflections). Anything outside of that is heard as an echo, so the whole room more or less has to be sequestered for the purpose of music reproduction, even though I sit right in front of the speakers. It's a pain, but well worth it given how much realism it can add. Ah, the joys of audiophilia...
 

DVDdoug

Major Contributor
Joined
May 27, 2021
Messages
2,917
Likes
3,831
Here's a physics fact: a perfect 3-dimensional stereo soundstage is ONLY possible along the center line between precisely placed stereo speakers.
Some people do experience a soundstage with depth on some setups, some rooms, and some recordings.

It's all an illusion... There is obviously no real center, or anything in between the speakers, unless you have a surround setup with a center speaker.

A delay in a vocal or instrument may help the source be perceived as farther back.

I never get a "clear center" perception. For me, centered sounds seem to be vaguely in the center (except with 5.1 surround, where there is a center channel and a center speaker).


I read something interesting once in a Moulton Labs article about panning as it's normally done (adjusting the levels without delay) and how poorly it actually works for precisely positioning sounds in the stereo soundstage.

...I had found out something quite interesting: that as long as the difference between channels is less than 3 decibels, the phantom image hovers pretty much in the middle point between the two speakers... With between 3 and 6 decibels difference in levels, the phantom quickly and without much stability migrated to the louder speaker, hovering just inboard of that speaker, and once the difference was greater than 7 decibels, the phantom was for all intents and purposes coming from the louder speaker.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
Some people do experience a soundstage with depth on some setups, some rooms, and some recordings.

It's all an illusion... There is obviously no real center, or anything in between the speakers, unless you have a surround setup with a center speaker.

A delay in a vocal or instrument may help the source be perceived as farther back.

I never get a "clear center" perception. For me, centered sounds seem to be vaguely in the center (except with 5.1 surround, where there is a center channel and a center speaker).
That was sort of my take as well. For conventional stereo setups, the results are quite variable. My personal experience has been that if you can arrange to remove the cross-talk and offending reflections, then the spatial cues in the recording drive the imaging, and when both ITDs and ILDs are present the sounds are perceived as originating from the actual sources in the recording and not the speakers themselves. Basically the logical conclusion of spatial effects in headphones. But this, too, I suspect can vary from listener to listener, depending on how well they can reconstruct the respective locations of the sound sources from the cues available. I'm sure there has been substantial research done, but how much is in the public domain, I'm not sure.
 

kongwee

Major Contributor
Joined
Jan 22, 2022
Messages
1,024
Likes
276
Now you can even artificially create the soundstage. Of course, you still need to place your speakers well.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
Now you can even artificially create the soundstage. Of course, you still need to place your speakers well.
The actual nuts-and-bolts part of the program works nicely, judging from the demo and the audio snippets provided. But the samples make everything sound like Animusic :( I'm assuming it can also be used with real recorded sound sources? If so, that is pretty neat.
 

Gringoaudio1

Addicted to Fun and Learning
Forum Donor
Joined
Sep 11, 2019
Messages
585
Likes
742
Location
Calgary Alberta Canada
That's basically similar to the method I use. I keep the first 20 ms free of reflections, and then have the early reflections arrive as a tight group at the end of the fusion window so they get integrated together (which basically means using furniture and whatnot to absorb side reflections). Anything outside of that is heard as an echo, so the whole room more or less has to be sequestered for the purpose of music reproduction, even though I sit right in front of the speakers. It's a pain, but well worth it given how much realism it can add. Ah, the joys of audiophilia...
What do you mean, the first 20 ms? Sound travels 22.5’ in 0.02 seconds. You don’t sit 22.5’ away from the speakers, right? You have absorption on the walls for 22.5’? How do you control the time of arrival of the first reflections? This is making no sense to me.
 