
The Physics of 3-D Soundstage

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
What do you mean the first 20ms? Sound travels 22.5’ in .02 seconds. You don’t sit 22.5’ away from the speakers right? You have absorption on the walls for 22.5’? How do you control the time of arrival of the first reflections? This is making no sense to me.
By using a suitably long room, of course. Basically it's a "Live End Dead End" configuration. Keep in mind that the reflection has to travel out, bounce off of the rearmost wall, and come back, and with roughly 19' each way, it works out to about 30 ms, which is right at the very end of the fusion window. That one is critical to getting the proper tonality since it's a diffuse sound field from both speakers. This stops it from sounding anechoic. I'm sure with suitable EQ that could be eliminated and just the direct sound field used (we do this all the time with headphones as part of the Harman target), but doing it this way means that the radiation pattern of the speaker and acoustic measurements can be used to determine what the target response should be, which is not straightforward from how it's configured below with large speakers close to the wall, not to mention the headaches associated (no pun intended) with sitting so close to a sub. It's hard to get it properly integrated without the bass ending up in just one channel. As far as the speakers, in this case the Polk S55s I have act more or less as a point source above about 1 kHz and have the tweeter at just the right height below ear level, but I have gotten it to work with other two-way speakers. Three-way not so much due to there not being enough distance for the sound to integrate properly. I have the speakers set up such that the desk reflections miss me as well as not being aligned with any axis, which is critical to getting the crosstalk down at the opposing ear. Even so, it's probably close to the bare minimum for the effect to work as I'm only getting around 10 dB of reduction at each contralateral ear.
Not shown in the picture is the rest of the room. I still need to treat the wall behind the speakers, as impulsive sounds can survive multiple trips back and forth, which gives a distinct flutter echo. Overall, it's the only way I have found to easily get binaural audio out of conventional loudspeakers, with the caveat that it's tuned to the characteristics of my hearing. So far, other family members just say it sounds like "really good speakers", so I'm not sure what the limits are for the effect working with regard to psychoacoustics. Edit: I should probably state that the TL;DR of the sound quality is that azimuth (out to about my extreme left and right) and distance are reproduced accurately with suitable miking, with each sound source being separate and distinct. This gives the same deep contrast as large headphones that comply with the Harman target, but with the placing of instruments being more realistic and accurate. With headphones, for me the sound sources more or less end up in a line slightly behind my head along the axis of the ears.
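For anyone who wants to check the arithmetic in this exchange, here is a minimal Python sketch. It assumes a speed of sound of about 1125 ft/s (the figure implied by "22.5' in .02 seconds") and a purely hypothetical 3 ft direct path from speaker to ear, which is not a measurement from the setup above. The function names are made up for illustration.

```python
# Rough delay arithmetic for direct sound and a rear-wall reflection.
# Assumption: speed of sound ~1125 ft/s (room-temperature air).
SPEED_OF_SOUND_FT_PER_S = 1125.0


def travel_time_ms(path_ft, c=SPEED_OF_SOUND_FT_PER_S):
    """Time in milliseconds for sound to cover path_ft feet."""
    return 1000.0 * path_ft / c


def reflection_lag_ms(reflected_path_ft, direct_path_ft,
                      c=SPEED_OF_SOUND_FT_PER_S):
    """Delay of a reflection relative to the direct sound, in ms."""
    return travel_time_ms(reflected_path_ft, c) - travel_time_ms(direct_path_ft, c)


if __name__ == "__main__":
    # 22.5 ft corresponds to the 20 ms mentioned in the question.
    print(round(travel_time_ms(22.5), 1))          # 20.0
    # Roughly 19 ft out and 19 ft back off the rear wall.
    print(round(travel_time_ms(38.0), 1))          # 33.8
    # Lag behind the direct sound, with a hypothetical 3 ft direct path.
    print(round(reflection_lag_ms(38.0, 3.0), 1))  # 31.1
```

The lag relative to the direct sound is what matters for the fusion window, so the result depends on how far the listener sits from the speakers.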
 

Attachments

  • PolkPCSetup.jpg

kongwee

Major Contributor
Joined
Jan 22, 2022
Messages
1,024
Likes
276
The actual nuts and bolts part of the program works nicely from what has been provided in the demo and the audio snippets. But the samples make everything sound like Animusic :( I'm assuming it can also be used with real recorded sound sources? If so that is pretty neat.
Yes, it can be used with recorded sound too. You just need mixing knowledge.
 

puppet

Senior Member
Joined
Dec 23, 2020
Messages
446
Likes
284
Is it physics or more psychoacoustics?
Both, perhaps. You could look at the combined dispersion, i.e. the combing of the actual sound waves from your L&R loudspeakers, which produces the phantom center image. So, physics creates the psychoacoustics.
 
OP

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
Does this account for cross-talk between the ears? In conventional stereo setups most of the ITDs in the source material will be lost since both ears will hear both speakers. I would suspect that most of the imaging is coming principally from ILDs. Obviously some timing differences can be heard, but I think the need to be exactly on the ℄ of the speakers is likely to be an academic argument, no?

I think the above is still important as you say, but likely to be of interest when recording and mixing. Most of us don't have setups that are as well behaved as what a recording studio will have. There are headphones, of course, and someone chime in if I'm wrong, but I think most speaker setups will have enough cross-talk and reflections mixed in with the direct sound to make the impact of time delays more obscure, since it affects both the ipsilateral and contralateral ears simultaneously.
This is why I stated "precisely placed speakers". To get the soundstage to appear as it was intended, you need to be along the exact midline between well-placed speakers in a well-treated room similar to the studio in which it was mixed. I see so many people buy $10,000 DACs to "improve soundstage" when BY FAR the most important factors are speaker placement, speaker-room interaction, and your position relative to those speakers in the room.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
This is why I stated "precisely placed speakers". To get the soundstage to appear as it was intended, you need to be along the exact midline between well-placed speakers in a well-treated room similar to the studio in which it was mixed. I see so many people buy $10,000 DACs to "improve soundstage" when BY FAR the most important factors are speaker placement, speaker-room interaction, and your position relative to those speakers in the room.
That would imply that the soundstage is not very stable for one reason or another. My experience is that if there is no interference, there is quite a bit of freedom to move about the desk (or console in this instance) without having the imaging change appreciably. Obviously if one moves far off to the side the sound collapses into each respective speaker, but ideally the sweet spot should be large enough to be useful. I don't know all of the factors involved, but I would surmise the listening window of the speaker plays an important role. A wider radiation pattern will mean that the SPL heard by each ear does not change much with movement within the listening window since the pattern is flat and wide (edit: but worth noting there will be more side reflections). The ITD will still be altered, of course, but this removes ILDs associated with small head movements. But with a narrow radiation pattern and the need to directly face the speakers, there are potentially large SPL gradients once one moves away from the mid-position, and this induces ILDs as well as ITDs, which is likely to induce additional image shift. Just some food for thought, perhaps.
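To put rough numbers on how arrival time and level change as the listener slides off the midline, here is a small geometric sketch in Python. It is a simplification: the listener is treated as a single point (so these are inter-speaker differences, not true interaural ITDs/ILDs), and the speakers are idealized omnidirectional point sources with 1/r level falloff, which ignores the radiation-pattern effects discussed above. The spacing and distance defaults are hypothetical, not taken from any setup in this thread.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed, air at about 20 degrees C


def interspeaker_offsets(x_m, spacing_m=1.2, distance_m=1.0,
                         c=SPEED_OF_SOUND_M_PER_S):
    """Arrival-time and level offsets between the two speakers.

    Speakers sit at (-spacing/2, 0) and (+spacing/2, 0); the listener is
    at (x_m, distance_m). Returns (dt_ms, dlevel_db): positive values
    mean the right speaker arrives earlier / louder than the left one.
    """
    left_path = math.hypot(x_m + spacing_m / 2.0, distance_m)
    right_path = math.hypot(x_m - spacing_m / 2.0, distance_m)
    dt_ms = 1000.0 * (left_path - right_path) / c
    # Point-source assumption: pressure falls off as 1/r.
    dlevel_db = 20.0 * math.log10(left_path / right_path)
    return dt_ms, dlevel_db


if __name__ == "__main__":
    for x in (0.0, 0.05, 0.10, 0.20):
        dt, dl = interspeaker_offsets(x)
        print(f"offset {x:.2f} m: dt = {dt:.3f} ms, dlevel = {dl:.2f} dB")
```

With these assumed dimensions, a 10 cm shift already produces a timing offset of roughly 0.3 ms, which is the same order as natural interaural delays, while the level offset stays under 1 dB for an ideal wide-pattern source.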
 

onion

Senior Member
Joined
Mar 5, 2019
Messages
342
Likes
383
One can use software to implement cross-talk cancellation in-room (like Bacch4Mac) to dramatically enhance spatial audio. As this involves calibration and measurement of the signal from left and right speakers arriving at the left and right ears, the filter already takes into account the position of the speakers relative to the listener. So the position of the speakers does not matter that much when using this type of solution.
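As a rough illustration of the principle (not BACCH's actual filter design, which is not described in this thread), crosstalk cancellation at a single frequency can be sketched as inverting the 2x2 matrix of speaker-to-ear transfer functions. The Python below uses an idealized, symmetric free-field model in which the contralateral path is just an attenuated, delayed copy of the ipsilateral one. The gain `g` and delay `tau_s` are hypothetical values, and real systems measure these paths and regularize the inversion.

```python
import cmath
import math


def plant(freq_hz, g=0.7, tau_s=0.0002):
    """Idealized speaker-to-ear matrix at one frequency.

    Rows are ears (L, R), columns are speakers (L, R). The ipsilateral
    path is normalized to 1; the contralateral path is attenuated by g
    and delayed by tau_s (both assumed values, not measurements).
    """
    cross = g * cmath.exp(-2j * math.pi * freq_hz * tau_s)
    return [[1 + 0j, cross], [cross, 1 + 0j]]


def invert_2x2(m):
    """Plain 2x2 complex matrix inverse (no regularization)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    h = plant(1000.0)
    xtc = invert_2x2(h)        # filter applied ahead of the speakers
    total = matmul_2x2(h, xtc) # what the ears receive overall
    for row in total:
        print([round(abs(v), 6) for v in row])  # approximately identity
```

Because the filter is derived from the paths to the ears, the speaker geometry is folded into it, which is the point being made above. The cancellation only holds where the assumed or measured paths match reality.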
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,699
Likes
37,434
With large panels you'll know unambiguously when you are in the sweet spot, as it may be less than one foot wide. Move outside this area and the sound will collapse to one speaker only. With such panels you'll get reduced early reflections because they are directional.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
One can use software to implement cross-talk cancellation in-room (like Bacch4Mac) to dramatically enhance spatial audio. As this involves calibration and measurement of the signal from left and right speakers arriving at the left and right ears, the filter already takes into account the position of the speakers relative to the listener. So the position of the speakers does not matter that much when using this type of solution.
I have heard of the BACCH stuff, but have not heard any of their systems directly. I think the stand-alone processors they make cost something like $28k. I'd imagine they may also include some additional processing to enhance the imaging along with the base XTC filters and head tracking, but I don't know for sure.
 

onion

Senior Member
Joined
Mar 5, 2019
Messages
342
Likes
383
There are a few threads on Bacch4Mac, e.g. here.

I suppose the point I was making to the OP is that software can render the physics of speaker placement and its inviolability for stereo imaging moot.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
There are a few threads on Bacch4Mac, e.g. here.

I suppose the point I was making to the OP is that software can render the physics of speaker placement and its inviolability for stereo imaging moot.
I would think it's less sensitive while you are within the region where the XTC is effective. In principle, the sources in the recording are then what you perceive as the origin of the sounds you hear, which is more robust than when the speakers themselves are perceived as the source, as they normally would be.

Out of curiosity, have you had a chance to try that system? I'm curious how it works.
 

Ricardus

Addicted to Fun and Learning
Joined
Mar 15, 2022
Messages
843
Likes
1,153
Location
Northern GA
No. Of course the term shouldn't be retired.
It absolutely should because it has ZERO meaning. ZERO. ZILCH. NONE.

There is only the L-R stereo field that the mix engineer selected when he mixed the record. Any "magic" people hear is from room imperfections, constructive and destructive wave interference, speaker imperfections, and hearing quirks. And none of it is repeatable.
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,558
Likes
3,864
Location
Princeton, Texas
It absolutely should because it has ZERO meaning. ZERO. ZILCH. NONE.

There is only the L-R stereo field that the mix engineer selected when he mixed the record. Any "magic" people hear is from room imperfections, constructive and destructive wave interference, speaker imperfections, and hearing quirks. And none of it is repeatable.

The term "soundstage" has meaning to me.

I understand it to mean the auditory illusion of sound sources believably distributed both laterally and in the depth dimension, and the term can also include the illusion of being within an acoustic space which corresponds to the venue ambience cues on the recording.

In my experience, imperfections in the room and playback chain degrade rather than create or enhance "soundstage".

Also, I find soundstage to be repeatable from one system with good spatial qualities to the next. Consider for example that quite a few people have described a similar soundstage experience from "The Ballad of Bill Hubbard" on Roger Waters' "Amused To Death" album, indicating that the "soundstage" originates with the recording rather than with non-repeatable quirks of hearing or imperfections in the room or playback chain.

Reality is only an illusion, albeit a very persistent one. - Albert Einstein
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
There is only the L-R stereo field that the mix engineer selected when he mixed the record. Any "magic" people hear is from room imperfections, constructive and destructive wave interference, speaker imperfections, and hearing quirks. And none of it is repeatable.
Also, I find soundstage to be repeatable from one system with good spatial qualities to the next. Consider for example that quite a few people have described a similar soundstage experience from "The Ballad of Bill Hubbard" on Roger Waters' "Amused To Death" album, indicating that the "soundstage" originates with the recording rather than with non-repeatable quirks of hearing or imperfections in the playback chain.
In the absence of any additional processing, in my own experience a binaural system can accurately relay distance and azimuthal positions from about -80 to 80 degrees, provided the recording has more than just ILDs. With additional processing to apply HRTFs/HRIRs, elevation can be conveyed, as well as potentially positions behind the listening position. It's all down to getting the transfer function right, which is the hard part. Without that it really is just an illusion of the speakers trying to recreate some form of an acoustic image in the listening area. I think William Snow is paraphrased as saying that binaural systems transport the listener to the scene of the recording whereas stereo systems transport the sound sources to the listener's room, and I think that captures the experience quite well. It's a shame more work has not been done on it, but multi-channel seems to be where most of the effort is going these days thanks to home theater.

Getting back to the dog in Roger Waters' piece, there is some acoustical coloration such as what you would get if it was in a semi-enclosed space, and there is a sense of ambience and distance, like what you might hear if your own dog were barking out on the front porch of your house. But it appears to emerge from in front of my seating position (or as some form of near-IHL with headphones) since there is no way to resolve a 0/180 degree ambiguity without putting ears around the microphone capsules or providing additional post-processing. I suspect the sensation of hearing it from behind may be down to room reflections, or learned expectations of sounds in real life and how they are altered by distance and the like.
 
OP

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
It absolutely should because it has ZERO meaning. ZERO. ZILCH. NONE.

There is only the L-R stereo field that the mix engineer selected when he mixed the record. Any "magic" people hear is from room imperfections, constructive and destructive wave interference, speaker imperfections, and hearing quirks. And none of it is repeatable.
Engineers often add DEPTH and HEIGHT to a mix using pitch, reverb, and of course level. For DEPTH the general rules are:
1.) Louder sounds closer
2.) Higher pitch sounds closer
3.) Less reverb sounds closer
And of course the reverse is true for deeper depth.
Is it all illusory? Of course. Stereo reproduction by definition is the ILLUSION of a 3-D auditory experience from two point sources of sound.
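Two of the depth rules above (level and reverb) have simple free-field counterparts that can be sketched in Python. This assumes an ideal point source whose direct level drops about 6 dB per doubling of distance, and a diffuse reverberant field of roughly constant level, so the direct-to-reverb ratio is 0 dB at the room's critical distance. The 2 m critical distance is a hypothetical value, and the pitch rule is not modeled here.

```python
import math


def direct_level_db(distance_m, ref_distance_m=1.0):
    """Direct-sound level relative to the level at ref_distance_m.

    Assumes an ideal point source in free field (inverse-distance law).
    """
    return -20.0 * math.log10(distance_m / ref_distance_m)


def direct_to_reverb_db(distance_m, critical_distance_m=2.0):
    """Direct-to-reverberant ratio in dB.

    Assumes the reverberant level is constant through the room, so the
    ratio is 0 dB at the (hypothetical) critical distance.
    """
    return -20.0 * math.log10(distance_m / critical_distance_m)


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0, 8.0):
        print(f"{d:4.1f} m: direct {direct_level_db(d):6.1f} dB, "
              f"D/R {direct_to_reverb_db(d):6.1f} dB")
```

A mix engineer works the same relationship in reverse: lowering the level and raising the reverb send on a track moves both numbers in the direction of "farther away".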
 
OP

tallbeardedone

Active Member
Joined
Sep 3, 2022
Messages
102
Likes
215
The term "soundstage" has meaning to me.

I understand it to mean the auditory illusion of sound sources believably distributed both laterally and in the depth dimension, and the term can also include the illusion of being within an acoustic space which corresponds to the venue ambience cues on the recording.

In my experience, imperfections in the room and playback chain degrade rather than create or enhance "soundstage".

Also, I find soundstage to be repeatable from one system with good spatial qualities to the next. Consider for example that quite a few people have described a similar soundstage experience from "The Ballad of Bill Hubbard" on Roger Waters' "Amused To Death" album, indicating that the "soundstage" originates with the recording rather than with non-repeatable quirks of hearing or imperfections in the room or playback chain.

Reality is only an illusion, albeit a very persistent one. - Albert Einstein
I use that Roger Waters "Bill Hubbard" example often. I am still unclear on the physics of how phase change creates the illusion of positioning (that dog is BEHIND my right shoulder) but it's damn cool!
 

Ricardus

Addicted to Fun and Learning
Joined
Mar 15, 2022
Messages
843
Likes
1,153
Location
Northern GA
Engineers often add DEPTH and HEIGHT to a mix using pitch, reverb, and of course level. For DEPTH the general rules are:
1.) Louder sounds closer
2.) Higher pitch sounds closer
3.) Less reverb sounds closer
And of course the reverse is true for deeper depth.
Is it all illusory? Of course. Stereo reproduction by definition is the ILLUSION of a 3-D auditory experience from two point sources of sound.
Yes I know. I'm a recording engineer.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
I use that Roger Waters "Bill Hubbard" example often. I am still unclear on the physics of how phase change creates the illusion of positioning (that dog is BEHIND my right shoulder) but it's damn cool!
In terms of psychoacoustics to me it makes more sense to think in terms of delay rather than phase, albeit the two can be used interchangeably. In a nutshell, the auditory center uses the relative level differences, timing differences, and spectral coloration of the sound that results from the interaction with the pinnae, head, and torso to localize things. An interesting experiment would be to hang a blanket or some other suitably absorptive medium behind the listening position and see if the apparent acoustical image of the dog barking changes. I suspect the early reflections arriving from behind are what gives that effect, but it could be perceptual as well since that is often how we judge distance.
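To make the delay-versus-phase point concrete, here is a short Python sketch. It uses the classic Woodworth spherical-head approximation for the interaural time difference, ITD = (a/c)(θ + sin θ), with an assumed head radius of 8.75 cm, and then converts that delay into the equivalent phase shift at a few frequencies. The approximation applies to far-field sources at azimuths from 0 to 90 degrees, and the function names are mine.

```python
import math


def woodworth_itd_s(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Woodworth spherical-head ITD in seconds (0 to 90 degrees azimuth).

    head_radius_m and c are assumed typical values, not measurements.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))


def delay_to_phase_deg(delay_s, freq_hz):
    """Phase shift in degrees produced by a pure delay at freq_hz."""
    return 360.0 * freq_hz * delay_s


if __name__ == "__main__":
    itd = woodworth_itd_s(90.0)
    print(f"ITD at 90 degrees: {itd * 1e3:.3f} ms")
    for f in (250.0, 500.0, 1000.0, 2000.0):
        print(f"{f:6.0f} Hz: {delay_to_phase_deg(itd, f):6.1f} degrees")
```

The same fixed delay maps to a different phase at every frequency, and above roughly 1.5 kHz it exceeds a full cycle, so the phase by itself becomes ambiguous. That is one reason it is easier to reason in terms of delay.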
 

kongwee

Major Contributor
Joined
Jan 22, 2022
Messages
1,024
Likes
276
Even in the recording industry, we often use the word "soundstage". You just can't beat the majority.
 