
Sound stage depth?

Duke

Major Contributor
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
1,014
Likes
2,382
Location
Princeton, Texas
And I thank you simply for your patience and determination in digesting what I had to say in its entirety. I honestly don't like writing a lot; I just have trouble being confident that I'm coming across clearly to the other person, so apologies for making you go through those walls of text twice.

You're not making this easy. You give me zero excuses to get riled up; quite the contrary in fact. I try blowing you off anyway, and...

Though your posts to others show that we hold sufficient common ground on the matter. Like your discussion about reflection properties and their timings. So it seems you and I already see eye to eye on soundstage from a recording procedure.

... you respond by pointing out where we share common ground. So now it's like I'm trying to break up with my conjoined twin.

How about this: Let's forget about me wanting to "demonstrate something", and let's just have the conversation that seems to be happening anyway.

Just so you know, I know very little about the actual recording process; my comments have been mostly if not entirely about loudspeaker/room interaction. I'm just a speaker geek turned small-time manufacturer.

A couple of times you've mentioned "making drivers have more soundstage" (or something like that) being a position taken by some. I'm much more of a radiation-pattern-obsessed guy, but imo it is possible for some drivers to have less precisely-defined sound images. This involves the short time interval preceding the onset of the Haas effect, so let me explain:

The Haas effect (suppression of localization cues from reflections) does not begin at the instant a new sound arrives; it begins after about .68 milliseconds. That initial .68 millisecond "window" is enough time for a sound which arrives from one side of your head to wrap around and reach the other ear, with the arrival time difference at the two ears informing the ear/brain system of the azimuth (horizontal direction) of the sound source. As you well know, after that first .68 milliseconds the Haas effect kicks in and the ear ignores directional cues from reflections for the next 20 milliseconds or so, but during that interval it is still picking up loudness (and therefore timbre) cues.
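
For a rough sense of scale, that .68 millisecond figure is close to what the standard Woodworth spherical-head approximation for interaural time difference predicts for a sound arriving from 90 degrees to one side. A minimal sketch (the 0.0875 m head radius and 343 m/s speed of sound are assumed textbook values, not figures from this thread):

```python
# Woodworth spherical-head approximation: ITD = (r / c) * (theta + sin(theta)).
# Assumed values: head radius r ~ 0.0875 m, speed of sound c ~ 343 m/s.
import math

HEAD_RADIUS = 0.0875    # m (assumed average)
SPEED_OF_SOUND = 343.0  # m/s

def itd_ms(azimuth_deg: float) -> float:
    """Interaural time difference, in milliseconds, for a source at the given azimuth."""
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta)) * 1000.0

for az in (15, 30, 60, 90):
    print(f"{az:3d} deg -> ITD ~ {itd_ms(az):.2f} ms")   # 90 deg comes out near 0.66 ms
```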

If a driver or loudspeaker baffle has diffraction or a reflection, that is a re-radiation of the initial sound delayed by an inherently short path-length difference, thus typically arriving at the ears well within that initial .68 millisecond time window. This results in a FALSE azimuth cue (recall that the inter-aural arrival time difference is how the ear/brain system computes azimuth). Well actually it results in four false azimuth cues... two from each speaker, each arriving at the two ears. The result is a blurring of the sound image width. So for good sound image definition, we want drivers (and baffles) which do not have reflections or diffractions arriving within that first .68 milliseconds. Next-best would be for any (hopefully fairly weak) reflections or diffractions to arrive as early as possible within that first .68 milliseconds, as this will result in a smaller-angle "smearing" from the resulting false azimuth cues. One implication of this is, narrow-baffle speakers generally should have better sound image definition than wide-baffle speakers (all else being equal).
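
To put numbers on that "inherently short path-length difference": 0.68 milliseconds of travel corresponds to roughly 23 cm of extra path at 343 m/s, so typical driver-to-baffle-edge distances land a diffraction re-radiation well inside the window. A back-of-envelope sketch (the example path lengths are illustrative, not taken from any particular speaker):

```python
# Extra propagation delay from a longer re-radiation path (e.g. a baffle edge),
# compared against the ~0.68 ms pre-Haas window discussed above.
SPEED_OF_SOUND = 343.0  # m/s (assumed)

def extra_delay_ms(extra_path_m: float) -> float:
    """Delay, in milliseconds, added by an extra path length in metres."""
    return extra_path_m / SPEED_OF_SOUND * 1000.0

for extra_path in (0.05, 0.10, 0.20, 0.30):   # illustrative extra path lengths
    d = extra_delay_ms(extra_path)
    where = "inside" if d < 0.68 else "outside"
    print(f"{extra_path * 100:4.0f} cm extra path -> {d:.2f} ms ({where} the 0.68 ms window)")
```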

So I'm not really into "this driver images better than the rest", but I am into "that driver has inherent imaging problems". Diffraction horns, sharp-edged waveguides, and non-flush-mounting of tweeters are examples of things which can cause a driver to have imaging problems arising from false azimuth cues. (This is not the only issue which can arise from these very early reflections and diffractions, but I don't think it's one that is very well known. Strong early reflections or diffractions can also make the location of the loudspeakers obvious to the ears, which is a distraction from the spatial information on the recording.)

The foregoing probably seems like a trivial tangent in the big scheme of things, but imo one has to get a lot of little things right at the loudspeaker (and room-interaction) level in order for the spatial cues on the recording to become perceptually dominant.
 
Last edited:

goat76

Senior Member
Joined
Jul 21, 2021
Messages
332
Likes
282
Lots of talk, not enough music. :)

Have a listen to the following song. Can you hear the recorded three-dimensional room behind your speakers? If so, how far into that room is the drum set positioned, according to you?

Low - Dinosaur Act

 

Tks

Major Contributor
Joined
Apr 1, 2019
Messages
3,158
Likes
5,199
You're not making this easy. You give me zero excuses to get riled up; quite the contrary in fact. I try blowing you off anyway, and...

How about this: Let's forget about me wanting to "demonstrate something", and let's just have the conversation that seems to be happening anyway.
Riled up? I'm a bit confused, did I incite some sort of negative emotion? I hope not.

I read what you wrote, but I'm not sure what you would like to have a conversation about, precisely. Nothing seems to be anything I'd disagree with. You bring up loudspeaker baffles having an effect (obviously, as mounting a driver to a glob of jelly versus an all-metal speaker housing will impart obviously different sonic behavior, one that may sound completely odd in relation to the other; obviously I'm exaggerating the hypotheticals here simply to illustrate that I have no contention with the claim, as it's straightforwardly acceptable in the way you present it).

There was also the Haas Effect, which can be simulated in the digital realm, which is how many of these mono-to-stereo widening examples have been produced (so again, no qualms here). Though some of your numbers run contrary to what I've been exposed to. Since I'm not as deep into these things as a manufacturer might be, I'd simply take your word for it on authority alone on this front. Oh, and by the numbers, I mean that ~20ms number. I can hear differences with channel-panned delay of a mono sound even at ~10ms to 20ms, though at any less than that, it just sounds "louder" on the other channel instead of "wider" overall. Oh, and I was also under the impression that the discernibility of the 'echo effect' starts at ~40ms; I wasn't aware Haas actually is at ~20ms. Though I doubt any of these particulars are of merit because, as I've said, I'll take your claims in virtue of you being a more experienced authority on these matters; I just thought I'd mention it so you're aware where my baseline was prior.
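
A minimal sketch of that channel-panned delay, for anyone who wants to generate it and listen (the 440 Hz test tone, the 15 ms delay, and the numpy/wave usage are illustrative choices, not anything prescribed in this thread):

```python
# Minimal sketch: the same mono signal in both channels, with one channel delayed.
# Vary DELAY_MS from ~1 ms up to ~30 ms and listen to how the image shifts/widens.
import wave
import numpy as np

SR = 44100
DELAY_MS = 15.0                                  # illustrative value
t = np.arange(SR) / SR                           # 1 second of audio
mono = 0.3 * np.sin(2 * np.pi * 440.0 * t)       # plain 440 Hz tone as a stand-in

delay = int(SR * DELAY_MS / 1000.0)
left = mono
right = np.concatenate([np.zeros(delay), mono])[: len(mono)]  # delayed copy

stereo = np.stack([left, right], axis=1)
pcm = (np.clip(stereo, -1.0, 1.0) * 32767).astype(np.int16)

with wave.open("haas_pan_demo.wav", "wb") as f:
    f.setnchannels(2)
    f.setsampwidth(2)      # 16-bit
    f.setframerate(SR)
    f.writeframes(pcm.tobytes())
```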

As far as false azimuth cues go, I'm not sure why we would be evaluating "faulty" baffle or speaker designs that would lead to these false cues based on timings (unless of course this is actually one of the primary aspects of what you take soundstage to be, such that any speaker with the fewest false azimuth cues would equal the speaker with the best soundstage, or should I say imaging). Instead we should be picking speakers with minimal reflections and diffraction. There's nothing really particularly interesting about a driver if most of the directionality is driven by things you mentioned like waveguides and such. That's just trivially true, in the same way a massively distorting woofer will eventually distort enough to ruin "imaging" simply by virtue of basically ruining everything about the sound produced as a byproduct.

I know you opened up by telling me most of your talking points come from the perspective of speaker-room interactions, but if soundstage exists in things like headphones and IEMs, then I find it too unwieldy to start soundstage discussions, since that's yet another thing that needs to be juggled (unless of course, again, room reflections are tied to your notion of soundstage definitionally). What would happen, for instance, if someone asks you: "Yeah, no problem with anything you're saying here, but now start talking to me about soundstage with respect to IEMs"? Unless of course you're ready to say that soundstage is simply tied to bore geometry (in the same way baffle design or waveguides are partly what determine directionality in speakers). And if that's all soundstage actually is, then Y-axis imaging/soundstage is going to need to be explained as to how such a thing is possible in stereo (or better yet, even mono) without the use of visual or pre-contextualized cues (like silly binaural demos that psychologically pre-load you by telling you which sound is coming from the top, or which is coming from the bottom). I won't ask that question though, because I first need a definition to the best of your ability, but we've abandoned soundstage as the main talking point, so sorry for even bringing it up like this again. Though I would like to ask you something before I conclude: do you actually believe something like vertical imaging is remotely possible in a blind evaluation of a sound that was recorded in an anechoic chamber, with you listening to said sound on headphones, also in a chamber? I ask this because I want to remove all factors that contribute to generalizing the concept or bogging it down with too many moving parts. So no reflections in the recording, no reflections during playback, and no reflections off the pinna (though for this it would need to be IEMs).

In conclusion, nothing you mentioned here is anything I'd really contest as far as its effects on soundstage go; better housing design can fully lend itself to a more "expansive" sound-image experience (obviously the actual numbers are for manufacturers to figure out as they think best, but I can't imagine why anyone would opt for anything less than maximal expansiveness for their given form factor, at least where speakers are concerned, unless of course that ruins some other intents for certain metrics).
 

tmtomh

Major Contributor
Joined
Aug 14, 2018
Messages
1,556
Likes
4,770
At the risk of stating the obvious, it appears we are really talking about three things:

1. Recordings contain differing amounts of information that our brains might interpret as soundstage depth: EQ, reverb, and so on. I'd also add that a number of pop/rock recordings contain a combination of close and distant-mic'd voices and instruments. The first thing that comes to mind is the early Led Zeppelin records, which were not first or unique but were somewhat groundbreaking in using distant mic'ing for drums, and (if memory serves) close mic'ing but with small amps on the lead guitar. There's often a palpable illusion of depth with John Bonham's drums clearly sounding farther back than the rest of the band, because of all the extra ambience from the mic arrangement.

2. Different listening rooms have qualities that might - or might not - allow the depth-illusion in recordings to be easily detected. So if one has the "right" room setup and speaker placement, then one might hear soundstage depth in those recordings that "have" it, while if one does not have the "right" setup, one might hear minimal soundstage depth in all recordings.

3. Some listening rooms might have qualities that actually help increase or flat-out create an illusion of soundstage depth. With such rooms, almost all recordings in one's collection would produce an illusion of depth, as the illusion would be added by the room to all recordings (as opposed to scenario #2, where the room could allow for the preservation of the illusion if it happened to be encoded in the recording.)

My own experience includes items 1 and 2, as far as I can tell. A few years ago I moved into a new house with a dedicated listening room that is larger, more symmetrical, and much better damped than my old one. The old room was carpeted, had upholstered furniture, and did not actively sound "live" - but the new room is definitely deader. The new room provides much better soundstage precision, and it's not subtle. The L-R width is a bit greater, but the more dramatic change is the placement precision within the L-R soundstage - and with many recordings, an illusion of some depth that was basically 100% new to me when I started listening in the new room.

So... since my recordings now seem to vary considerably in how much depth they seem to have, and since the new room quite obviously has fewer reflections than the old one, it seems to me that the most logical (albeit not certain or experimentally verified) conclusion is that the illusion of depth is primarily encoded in the recordings and that the new room is allowing me to hear that.

I have no particular investment in being right about this, though, so if someone has a better hypothesis, I'm all ears (pun intended :) ).
 
Last edited:

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
1,975
Likes
1,374
Location
Sweden
At the risk of stating the obvious, it appears we are really talking about three things:

1. Recordings contain differing amounts of information that our brains might interpret as soundstage depth: EQ, reverb, and so on. I'd also add that a number of pop/rock recordings contain a combination of close and distant-mic'd voices and instruments. The first thing that comes to mind is the early Led Zeppelin records, which were not first or unique but were somewhat groundbreaking in using distant mic'ing for drums, and (if memory serves) close mic'ing but with small amps on the lead guitar. There's often a palpable illusion of depth with John Bonham's drums clearly sounding farther back than the rest of the band, because of all the extra ambience from the mic arrangement.

2. Different listening rooms have qualities that might - or might not - allow the depth-illusion in recordings to be easily detected. So if one has the "right" room setup and speaker placement, then one might hear soundstage depth in those recordings that "have" it, while if one does not have the "right" setup, one might hear minimal soundstage depth in all recordings.

3. Some listening rooms might have qualities that actually help increase or flat-out create an illusion of soundstage depth. With such rooms, almost all recordings in one's collection would produce an illusion of depth, as the illusion would be added by the room to all recordings (as opposed to scenario #2, where the room could allow for the preservation of the illusion if it happened to be encoded in the recording.)

My own experience includes items 1 and 2, as far as I can tell. A few years ago I moved into a new house with a dedicated listening room that is larger, more symmetrical, and much better damped than my old one. The old room was carpeted, had upholstered furniture, and did not actively sound "live" - but the new room is definitely deader. The new room provides much better soundstage precision, and it's not subtle. The L-R width is a bit greater, but the more dramatic change is the placement precision within the L-R soundstage - and with many recordings, an illusion of some depth that was basically 100% new to me when I started listening in the new room.

So... since my recordings now seem to vary considerably in how much depth they seem to have, and since the new room quite obviously has fewer reflections than the old one, it seems to me that the most logical (albeit not certain or experimentally verified) conclusion is that the illusion of depth is primarily encoded in the recordings and that the new room is allowing me to hear that.

I have no particular investment in being right about this, though, so if someone has a better hypothesis, I'm all ears (pun intended :) ).
I would agree with most of it, but would also add a factor regarding speaker characteristics and speaker positioning. Factors that "reveal" the speakers (resonances, nearby reflections, stereo system errors etc.) are detrimental to the depth illusion, and vice versa. Dispersion is one factor (as many also report omni speakers to "disappear"). Since positioning and early reflections are part of the equation, the room is as well.

One note though: the D/R in the recording is not what the D/R is in reality, since reflections in real life come from other directions than the direct sound. Our brains interpret a low D/R as if the sound comes from a reverberant room, such as a large hall or church, probably because we and our brains have experienced that before IRL as well. But for the "real thing", multichannel or very clever processing is needed.
 

goat76

Senior Member
Joined
Jul 21, 2021
Messages
332
Likes
282
At the risk of stating the obvious, it appears we are really talking about three things:

1. Recordings contain differing amounts of information that our brains might interpret as soundstage depth: EQ, reverb, and so on. I'd also add that a number of pop/rock recordings contain a combination of close and distant-mic'd voices and instruments. The first thing that comes to mind is the early Led Zeppelin records, which were not first or unique but were somewhat groundbreaking in using distant mic'ing for drums, and (if memory serves) close mic'ing but with small amps on the lead guitar. There's often a palpable illusion of depth with John Bonham's drums clearly sounding farther back than the rest of the band, because of all the extra ambience from the mic arrangement.

2. Different listening rooms have qualities that might - or might not - allow the depth-illusion in recordings to be easily detected. So if one has the "right" room setup and speaker placement, then one might hear soundstage depth in those recordings that "have" it, while if one does not have the "right" setup, one might hear minimal soundstage depth in all recordings.

3. Some listening rooms might have qualities that actually help increase or flat-out create an illusion of soundstage depth. With such rooms, almost all recordings in one's collection would produce an illusion of depth, as the illusion would be added by the room to all recordings (as opposed to scenario #2, where the room could allow for the preservation of the illusion if it happened to be encoded in the recording.)

My own experience includes items 1 and 2, as far as I can tell. A few years ago I moved into a new house with a dedicated listening room that is larger, more symmetrical, and much better damped than my old one. The old room was carpeted, had upholstered furniture, and did not actively sound "live" - but the new room is definitely deader. The new room provides much better soundstage precision, and it's not subtle. The L-R width is a bit greater, but the more dramatic change is the placement precision within the L-R soundstage - and with many recordings, an illusion of some depth that was basically 100% new to me when I started listening in the new room.

So... since my recordings now seem to vary considerably in how much depth they seem to have, and since the new room quite obviously has fewer reflections than the old one, it seems to me that the most logical (albeit not certain or experimentally verified) conclusion is that the illusion of depth is primarily encoded in the recordings and that the new room is allowing me to hear that.

I have no particular investment in being right about this, though, so if someone has a better hypothesis, I'm all ears (pun intended :) ).

As @Thomas_A said, it's important to get the speaker positioning right; the speakers need to "work together" so that the best possible projection of the stereo image can be had. And the room acoustics must be kept at bay; otherwise the reverberation from the listening room will overshadow the small details that are needed to give us the important cues of the recorded room.

#3 in your list doesn't have much to do with high-fidelity reproduction of the recorded signal; that's a coloration created by the listening room that will, to some degree, overshadow the recording and add its own characteristics to everything that is played. It's not an illusion; that's the real reverberation from your listening environment. But still... I don't think we should eliminate all of that; it can contribute some enveloping characteristics which I personally like. :)
 

Tangband

Major Contributor
Joined
Sep 3, 2019
Messages
1,636
Likes
1,469
Location
Sweden
I would agree with most of it, but would also add a factor regarding speaker characteristics and speaker positioning. Factors that "reveal" the speakers (resonances, nearby reflections, stereo system errors etc.) are detrimental to the depth illusion, and vice versa. Dispersion is one factor (as many also report omni speakers to "disappear"). Since positioning and early reflections are part of the equation, the room is as well.

One note though: the D/R in the recording is not what the D/R is in reality, since reflections in real life come from other directions than the direct sound. Our brains interpret a low D/R as if the sound comes from a reverberant room, such as a large hall or church, probably because we and our brains have experienced that before IRL as well. But for the "real thing", multichannel or very clever processing is needed.
Regarding acoustical recordings with only two microphones in a concert hall:

It takes the ability to do your own recordings to fully understand what the sense of depth with two stereo loudspeakers is about - in two-channel playback it's an illusion, because recording a grand piano and creating the illusion of sitting 8 metres from it on playback from two speakers might mean putting your stereo microphones 1 meter from the middle of the inside of the grand piano.

In real life we also listen with our eyes, and the precedence effect + cocktail party effect is very real in the concert hall.
The microphones and the ear/brain work very differently at distances of more than 5 ms of air travel, i.e. 1.7 meters or more, and early reflections in the recording sound really bad - that's why one always has rather high microphone stands when doing acoustic recordings.
 
Last edited:

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
1,975
Likes
1,374
Location
Sweden
It takes the ability to do your own recordings to fully understand what the sense of depth is about - in two-channel playback it's an illusion, because recording a grand piano and creating the illusion of sitting 8 metres from it on playback from two speakers might mean putting your stereo microphones 1 meter from the middle of the inside of the grand piano.
Without any reverb from the hall? (Also, our brains are not very good at absolute distance judgements at long distances, but better at relative ones. E.g. with two voices presented with different D/R, you can judge whether one appears to be more distant than the other.)
 

Tangband

Major Contributor
Joined
Sep 3, 2019
Messages
1,636
Likes
1,469
Location
Sweden
Without any reverb from the hall? (Also, our brains are not very good at absolute distance judgements at long distances, but better at relative ones. E.g. with two voices presented with different D/R, you can judge whether one appears to be more distant than the other.)
The critical distance for a microphone in a concert hall or a church is often less than 1 meter (critical distance, i.e. 50% reverb and 50% direct sound). Longer microphone distances can make the recording sound very distant, with tons of reverb and no direct sound.



This is very different from how you hear sound in the concert hall - the precedence effect attenuates all reverb sound in your brain by more than 10 dB, while your attention, with eyes and ears, is on the performer playing the music.
Because of this, sitting 15 meters from the grand piano can give you a good listening balance in the live situation in the concert hall, while putting the microphone 15 meters away for the recording will sound extremely different, with maybe 95% reverb sound and 5% direct sound.
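
That balance can be put in rough numbers with the usual diffuse-field approximation, in which the direct-to-reverberant energy ratio falls as (critical distance / distance) squared. A small sketch (the formula is the textbook statistical-acoustics approximation and the 1 meter critical distance is just the figure quoted above, so the exact percentages are only indicative):

```python
# Direct-sound fraction vs. microphone distance, using the diffuse-field
# approximation D/R = (Dc / d)^2, where Dc is the critical distance
# (the distance at which direct and reverberant energy are equal).
def direct_fraction(distance_m: float, critical_distance_m: float = 1.0) -> float:
    d_over_r = (critical_distance_m / distance_m) ** 2
    return d_over_r / (1.0 + d_over_r)

for d in (0.5, 1.0, 3.0, 15.0):
    print(f"{d:5.1f} m -> {direct_fraction(d) * 100:5.1f}% direct sound")
```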


”A special appearance of the precedence effect is the Haas effect. Haas showed that the precedence effect appears even if the level of the delayed sound is up to 10 dB higher than the level of the first wave front. In this case, the range of delays, where the precedence effect works, is reduced to delays between 10 and 30 ms.”
 
Last edited:

Frank Dernie

Master Contributor
Forum Donor
Joined
Mar 24, 2016
Messages
6,063
Likes
14,383
Location
Oxfordshire
I would agree with most of it, but would also add a factor regarding speaker characteristics and speaker positioning. Factors that "reveal" the speakers (resonances, nearby reflections, stereo system errors etc.) are detrimental to the depth illusion, and vice versa. Dispersion is one factor (as many also report omni speakers to "disappear"). Since positioning and early reflections are part of the equation, the room is as well.

One note though: the D/R in the recording is not what the D/R is in reality, since reflections in real life come from other directions than the direct sound. Our brains interpret a low D/R as if the sound comes from a reverberant room, such as a large hall or church, probably because we and our brains have experienced that before IRL as well. But for the "real thing", multichannel or very clever processing is needed.
This is my experience.
Over 30 years ago I went to a dealer to listen to the ATC SCM50 speakers. Years before, I had heard ProAc EBS speakers and been very impressed; they used the same mid and bass drivers from ATC but were no longer available. The dealer was in Hitchin iirc and was the nearest ATC specialist to me.
They also had Apogee Duettas on demo, and having read all the Apogee hype in the mags I was keen to listen.
The difference in stereo depth between ATC and Apogee was vast on the same recording in the same room and listening position - though not speaker location.
Recording may make a bit of a difference, but the biggie is speaker and room IME, and it is therefore probably 99% ”fake”.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
1,975
Likes
1,374
Location
Sweden
The critical distance for a microphone in a concert hall or a church is often less than 1 meter (critical distance, i.e. 50% reverb and 50% direct sound). Longer microphone distances can make the recording sound very distant, with tons of reverb and no direct sound.


This is very different from how you hear sound in the concert hall - the precedence effect attenuates all reverb sound in your brain by more than 10 dB, while your attention, with eyes and ears, is on the performer playing the music.
Because of this, sitting 15 meters from the grand piano can give you a good listening balance in the live situation in the concert hall, while putting the microphone 15 meters away for the recording will sound extremely different, with maybe 95% reverb sound and 5% direct sound.


”A special appearance of the precedence effect is the Haas effect. Haas showed that the precedence effect appears even if the level of the delayed sound is up to 10 dB higher than the level of the first wave front. In this case, the range of delays, where the precedence effect works, is reduced to delays between 10 and 30 ms.”
Yes, that is correct. But as mentioned, absolute distance judgements based on the direct-to-reflected sound ratio seem to be quite inaccurate as distance increases, even if those are the best cues we have.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
1,975
Likes
1,374
Location
Sweden
This is my experience.
Over 30 years ago I went to a dealer to listen to the ATC SCM50 speakers. Years before, I had heard ProAc EBS speakers and been very impressed; they used the same mid and bass drivers from ATC but were no longer available. The dealer was in Hitchin iirc and was the nearest ATC specialist to me.
They also had Apogee Duettas on demo, and having read all the Apogee hype in the mags I was keen to listen.
The difference in stereo depth between ATC and Apogee was vast on the same recording in the same room and listening position - though not speaker location.
Recording may make a bit of a difference, but the biggie is speaker and room IME, and it is therefore probably 99% ”fake”.
Do you refer to the very different dispersion? I listened once to the Megatrend speakers, at quite a close distance though. It was like listening to a sound through binoculars: rock-steady and clear, with a narrowly focused phantom image in the center. It sounded like headphones with a frontal, centered sound (rather than a between-the-ears sound).
 

Tangband

Major Contributor
Joined
Sep 3, 2019
Messages
1,636
Likes
1,469
Location
Sweden
About the reverb in the concert hall and recordings of acoustic instruments: if one has a very good hall or church, where you can also find the optimal distance from the microphones to the instrument, one can make a better ambient recording than with the best plugin reverbs in a DAW. This is my experience, but I haven't tried every reverb plugin in the world, only those that work with Logic Pro X and Audacity.

Every conversion in a DAW, be it from 32-bit to 24-bit or 24 to 16, or from 96 kHz to 44.1 kHz, makes a tiny, tiny worsening of the sound. Every plugin program (EQ, compression, PEQ) is even worse.

I wish it were like the slogan "perfect sound forever" that Philips invented in the '80s, but not really. One has to be careful even with 24-bit material not to do SRC many times (as one must with most plugin programs) if one doesn't need to. SRC often degrades the sound in a less forgiving way than good open-reel analog gear does. The latter can even sound "nicer" to the ears, with an analog tape coloration.
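
One hedged way to put a number on how "tiny" a single SRC pass is would be a round-trip test like the sketch below (it uses scipy's generic polyphase resampler rather than any DAW's converter, so it only illustrates the method, not Logic's or Audacity's actual behaviour):

```python
# Round-trip sample-rate-conversion test: 44.1 kHz -> 96 kHz -> 44.1 kHz on a sine,
# then measure the residual error. scipy's resampler is a stand-in for a DAW's converter.
import numpy as np
from scipy.signal import resample_poly

sr = 44100
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 1000.0 * t)            # 1 kHz test tone, 1 second

up = resample_poly(x, 960, 441)               # 44.1 kHz -> 96 kHz
back = resample_poly(up, 441, 960)            # 96 kHz -> 44.1 kHz

err = back[: len(x)] - x
rms = np.sqrt(np.mean(err[1000:-1000] ** 2))  # ignore filter edge effects
print(f"residual RMS error after one round trip: {rms:.2e}")
```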
 
Last edited:

Duke

Major Contributor
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
1,014
Likes
2,382
Location
Princeton, Texas
Riled up? I'm a bit confused, did I incite some sort of negative emotion? I hope not.

Quite the opposite! Conversing with you has been exceptionally positive.

There was also the Haas Effect, which can be simulated in the digital realm, which is how many of these mono-to-stereo widening examples have been produced (so again, no qualms here). Though some of your numbers run contrary to what I've been exposed to. Since I'm not as deep into these things as a manufacturer might be, I'd simply take your word for it on authority alone on this front. Oh, and by the numbers, I mean that ~20ms number. I can hear differences with channel-panned delay of a mono sound even at ~10ms to 20ms, though at any less than that, it just sounds "louder" on the other channel instead of "wider" overall. Oh, and I was also under the impression that the discernibility of the 'echo effect' starts at ~40ms; I wasn't aware Haas actually is at ~20ms.

I may have been mis-remembering that 20 ms. And I agree with your observations - the Haas effect is NOT a perfect suppression of directional cues. This is demonstrated to those familiar with Floyd Toole's work by the widening of the Apparent Source Width caused by the first sidewall reflections (which have their downsides as well, including a reduction in the precision of sound image locations).

As far as false azimuth cues go, I'm not sure why we would be evaluating "faulty" baffle or speaker designs that would lead to these false cues based on timings (unless of course this is actually one of the primary aspects of what you take soundstage to be, such that any speaker with the fewest false azimuth cues would equal the speaker with the best soundstage, or should I say imaging). Instead we should be picking speakers with minimal reflections and diffraction. There's nothing really particularly interesting about a driver if most of the directionality is driven by things you mentioned like waveguides and such. That's just trivially true, in the same way a massively distorting woofer will eventually distort enough to ruin "imaging" simply by virtue of basically ruining everything about the sound produced as a byproduct.

Agreed, I was just pointing out a case where there was variation in spatial quality which could be traced to characteristics of some drivers.

I know you opened up by telling me most of your talking points come from the perspective of speaker-room interactions, but if soundstage exists in things like headphones and IEMs, then I find it too unwieldy to start soundstage discussions, since that's yet another thing that needs to be juggled (unless of course, again, room reflections are tied to your notion of soundstage definitionally). What would happen, for instance, if someone asks you: "Yeah, no problem with anything you're saying here, but now start talking to me about soundstage with respect to IEMs"? Unless of course you're ready to say that soundstage is simply tied to bore geometry (in the same way baffle design or waveguides are partly what determine directionality in speakers). And if that's all soundstage actually is, then Y-axis imaging/soundstage is going to need to be explained as to how such a thing is possible in stereo (or better yet, even mono) without the use of visual or pre-contextualized cues (like silly binaural demos that psychologically pre-load you by telling you which sound is coming from the top, or which is coming from the bottom). I won't ask that question though, because I first need a definition to the best of your ability, but we've abandoned soundstage as the main talking point, so sorry for even bringing it up like this again. Though I would like to ask you something before I conclude: do you actually believe something like vertical imaging is remotely possible in a blind evaluation of a sound that was recorded in an anechoic chamber, with you listening to said sound on headphones, also in a chamber? I ask this because I want to remove all factors that contribute to generalizing the concept or bogging it down with too many moving parts. So no reflections in the recording, no reflections during playback, and no reflections off the pinna (though for this it would need to be IEMs).

I know virtually nothing about soundstaging with headphones and/or IEMs. My opinion is that pinna transforms are worth including somewhere in the signal path for IEMs, but I don't know how to do so if the recording was not made binaurally using dummy ears (or one's own ears).

In conclusion, nothing you mentioned here is anything I'd really contest as far as its effects on soundstage go; better housing design can fully lend itself to a more "expansive" sound-image experience (obviously the actual numbers are for manufacturers to figure out as they think best, but I can't imagine why anyone would opt for anything less than maximal expansiveness for their given form factor, at least where speakers are concerned, unless of course that ruins some other intents for certain metrics).

I make the assumption that the "package" of spatial cues on the recording is what we want to hear, which in turn implies that we want to present those cues effectively while suppressing the effectiveness with which the playback room's "package" of spatial cues is presented.

When I say "package of spatial cues", I'm thinking primarily in terms of four things: The first-arrival sound; the first reflections; the reverberation tails (and yes I know the term "reverberation" is not the most precise in the context of small room acoustics, but it does apply to the spatial cues on the recording); and the "temporal center of gravity" of the reflections. The first .68 milliseconds is primarily what gives us the sound image direction; the first reflections tell us about room size and liveliness or deadness; the reverberation tails inform us of distance and room size and room liveliness or deadness; and the temporal center of gravity of the reflections informs us of distance and room size.

My premise is that the ear/brain system tends to select the most plausible "package" of spatial cues, choosing between the "small room signature" package of the playback room and the "venue spatial cues" package on the recording.

My understanding is that the first reflections are the strongest indicators of room size, so we can DISRUPT the "small room signature" cues by suppressing and/or delaying the onset of the first reflections. Doing so ALSO pushes the temporal center of gravity of the reflections back in time. The net result is that the ear/brain system is no longer presented with a convincing package of "small room signature" cues, as now they are somewhat scrambled and self-contradictory.

My understanding is that the reverberation tails on the recording are effective indicators of venue size and venue acoustics and sound source distance, and the later-arriving in-room reflections are the most effective way of delivering those reverberation tails from many directions. The better we preserve those reverberation tails and present them to the listener from all around, the stronger the presentation of the "venue spatial cues" package. (This is what a good multi-channel recording does, as the rear channels deliver the reverberation tails from multiple desirable directions.) In two-channel the in-room reflections function as the "carriers" of those reverberation tails, so in my opinion we want to preserve the later-arriving in-room reflections rather than eliminate them (as some advocate; I'm not saying that YOU do).

IF this combination of room acoustic characteristics sounds a bit like "live end-dead end", well that's because this concept of what constitutes desirable loudspeaker/room interaction is not a new idea.

In this context, I believe there is room for improvement over "conventional" loudspeaker radiation patterns. I believe there are better "starting points" for how the energy is radiated out into the room, given that the desired "end goal" is minimal early reflections + plenty of spectrally-correct late reflections. And apparently my opinion is in the small minority, given the degree to which "conventional" loudspeakers dominate the marketplace!
 
Last edited:

Duke

Major Contributor
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
1,014
Likes
2,382
Location
Princeton, Texas
Factors that "reveal" the speakers (resonances, near-by reflections, stereo system errors etc) is detrimental to the depth illusion and vice versa. Dispersion is one factor (as many also report omni speakers to "disappear"). Since positioning and early reflections are part of the equation, room is as well.

YESSS!!

One note though: the D/R in the recording is not what the D/R is in reality, since reflections in real life come from other directions than the direct sound. Our brains interpret a low D/R as if the sound comes from a reverberant room, such as a large hall or church, probably because we and our brains have experienced that before IRL as well.

YESSS again!!

But for the "real thing", multichannel or very clever processing is needed.

Imo there is hope for two-channel, which implies hope for recordings which will probably never be available in a multi-channel format.

Un-attended-to playback room acoustics tend to "mask" the venue acoustic cues on the recording. If we can shift the in-room reflections from "predominantly early arrival" to "predominantly late arrival", we can in effect "unmask" the venue acoustic cues on the recording.

In my opinion.
 

goat76

Senior Member
Joined
Jul 21, 2021
Messages
332
Likes
282
My premise is that the ear/brain system tends to select the most plausible "package" of spatial cues, choosing between the "small room signature" package of the playback room and the "venue spatial cues" package on the recording.

My understanding is that the first reflections are the strongest indicators of room size, so we can DISRUPT the "small room signature" cues by suppressing and/or delaying the onset of the first reflections. Doing so ALSO pushes the temporal center of gravity of the reflections back in time. The net result is that the ear/brain system is no longer presented with a convincing package of "small room signature" cues, as now they are somewhat scrambled and self-contradictory.

My understanding is that the reverberation tails on the recording are effective indicators of venue size and venue acoustics and sound source distance, and the later-arriving in-room reflections are the most effective way of delivering those reverberation tails from many directions. The better we preserve those reverberation tails and present them to the listener from all around, the stronger the presentation of the "venue spatial cues" package. (This is what a good multi-channel recording does, as the rear channels deliver the reverberation tails from multiple desirable directions.) In two-channel the in-room reflections function as the "carriers" of those reverberation tails, so in my opinion we want to preserve the later-arriving in-room reflections rather than eliminate them (as some advocate; I'm not saying that YOU do).

IF this combination of room acoustic characteristics sounds a bit like "live end-dead end", well that's because this concept of what constitutes desirable loudspeaker/room interaction is not a new idea.

In this context, I believe there is room for improvement over "conventional" loudspeaker radiation patterns. I believe there are better "starting points" for how the energy is radiated out into the room, given that the desired "end goal" is minimal early reflections + plenty of spectrally-correct late reflections. And apparently my opinion is in the small minority, given the degree to which "conventional" loudspeakers dominate the marketplace!

While your theory is interesting indeed, I'm not too sure it really works like that, that the in-room reflections function as the "carriers" of the reverberation tails of the recording. I believe all those cues of the recorded space come exclusively with the direct sound from the two sound sources to our ears, and those signals are combined in our minds and create the stereo illusion. But that doesn't mean the late reflections from our listening room aren't important at all, I think they contribute to the enveloping sound which is also an important factor for a more convincing reproduction of the recorded music event.

As you say, the ear/brain can never process two different room sizes at the same time; only the dominating one will be processed as "a room" - either the recorded room or the listener's room, not both at any single time. For that reason, I really can't see how any reflections in the listening room could function as a "carrier" of the reverberation tails of the recording and convincingly help to portray the recorded room, when they hardly share any similarities when it comes to the directions and such.

So I don't share your "carrier" theory; I think it will always be a competition between the recorded room and the listening room. The recorded room must be the dominant one, carried only by the direct sound, while the listening room can contribute the effect of a more enveloping sound, which the two speakers in a stereo system can't do on their own.
 
Last edited:

Matthias McCready

Active Member
Joined
Jul 2, 2021
Messages
186
Likes
213
As you say, the ear/brain can never process two different room sizes at the same time; only the dominating one will be processed as "a room" - either the recorded room or the listener's room, not both at any single time. For that reason, I really can't see how any reflections in the listening room could function as a "carrier" of the reverberation tails of the recording and convincingly help to portray the recorded room, when they hardly share any similarities when it comes to the directions and such.

I do agree with you that we can only listen in one space at a time; however, I would make the addendum that reverb is cumulative, which is why I prefer rooms to be fairly dead.

For example, if I am mixing in a larger "sneaky" room - one of those spaces that physically appears to be a large ballroom but secretly is a hockey arena in disguise (acoustically at least) - the room will have its own slap-backs, reflections, and acoustical oddities. Even with all of that room sound that I am hearing, I can still hear and discern reverb on my sources, although I am less likely to use it. Or if I have a video sent my way (political gigs) where the audio was not well produced and was recorded with the wrong type of mic in a small but live room (drywall with no absorption), I can certainly distinguish that.

Granted this is large-room acoustics, which are a different animal than a listening room or mix/master studio.

----

Where this still applies to small-room acoustics is that things are cumulative.

In my systems I prefer linearity and neutrality. As an example, if I put an extreme 12 dB low-shelf boost at 80 Hz on my system, there certainly are some songs where that could possibly be pleasing; however, for most music it would not be. I would rather have something flatter that translates the music as intended; I am going to trust that the mix/mastering stages of the record added those types of boosts where appropriate.

In this vein, having a listening room add a sense of space could perhaps be pleasing for some genres or on some songs; however, it won't be beneficial for all material. Personally, I trust the creation process: that the recording engineers chose the room or venue, mic type, mic placement, and mic distance to capture that room how they wanted, or that the mix engineer added "space" where they wanted to.

I suppose at the end of the day people add all sorts of devices or things to their signal chain. Maybe it is a tube amp intentionally designed to add harmonics, or a multi-band compressor with some quite audible settings (I have even met a gent who did that because he wanted his music to sound "slammed" like a radio station). Similarly, I suppose if someone wants a listening room that is fairly live, that is their prerogative, but it wouldn't be my first choice. :)
 
Last edited:

Duke

Major Contributor
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
1,014
Likes
2,382
Location
Princeton, Texas
@goat76, THANK YOU for your well thought-out and considerate reply. You and I may not end up agreeing, but at least we will end up understanding one another's positions, and without having slogged our way through an argument to get there.

While your theory is interesting indeed, I'm not too sure it really works like that, that the in-room reflections function as the "carriers" of the reverberation tails of the recording. I believe all those cues of the recorded space come exclusively with the direct sound from the two sound sources to our ears, and those signals are combined in our minds and create the stereo illusion. But that doesn't mean the late reflections from our listening room aren't important at all, I think they contribute to the enveloping sound which is also an important factor for a more convincing reproduction of the recorded music event.

The in-room reflections carry the exact same signal as the direct sound, spectrally modified by the speaker's radiation pattern and the room's acoustics. So the recording venue's reverberation tails should be present in the in-room reflections until the in-room reflections have decayed to the point where they are no longer recognizable and/or detectable. Whether or not the presence of the recording venue's reverberation tails in the in-room reflections is of audible consequence IS open to debate, and I think THAT is where you and I disagree.

As you say, the ear/brain can never process two different room sizes at the same time; only the dominating one will be processed as "a room" - either the recorded room or the listener's room, not both at any single time.

YESSS!!

For that reason, I really can't see how any reflections in the listening room could function as a "carrier" of the reverberation tails of the recording and convincingly help to portray the recorded room, when they hardly share any similarities when it comes to the directions and such.

I agree that the carrier paradigm's SPECIFIC arrival directions for the in-room reflections do not match those of the venue's reverberation tails, but in GENERAL they approximate the real-world situation more closely than is envisioned by the competing paradigm - which is, that the direct sound is the exclusive carrier for the recording venue's reverberation tails.

(Just for the record, my guess is that reality doesn't follow our paradigms as closely as we might hope - mine included!)

So I don't share your "carrier" theory; I think it will always be a competition between the recorded room and the listening room. The recorded room must be the dominant one, carried only by the direct sound, while the listening room can contribute the effect of a more enveloping sound, which the two speakers in a stereo system can't do on their own.

One way to test my "carrier" theory might be to compare a setup having relatively little late-onset reflection energy versus the same setup except now with a lot of late-onset reflection energy, without changing either the direct sound or the early reflections or the overall spectral balance. If the recording venue cues dominate in one case but not the other, then we know which approach works best. If the recording venue cues dominate in both cases, then we might further inquire whether one sounds better than the other. If the recording venue cues dominate in neither case, then it's back to the drawing-board for both of us!
 
Last edited:

Axo1989

Addicted to Fun and Learning
Joined
Jan 9, 2022
Messages
735
Likes
572
Discussion of soundstage in audio reproduction tends to be wide-ranging and discursive. Partly because people use it to refer to stereo image or to envelopment, or both. I avoid the term soundstage when discussing reproduction, as the other two terms are more specific/less confusing.

In audio production, a soundstage refers to the performance stage or, more specifically, the space where the audio event is recorded. This can be large enough to accommodate an orchestra, or a small space for recording foley and special effects. Now that digital production overlaps with physical sound stages, the original term is a bit anachronistic.

Audio reviews use the term a lot, because it sounds good, but I don't think it's helpful. At least I prefer to consider stereo image, which can have width, depth and height—along with focus/shape and placement of individual sonic elements—and derives from the recorded and/or assembled/manipulated sound collage. Or envelopment which is usually a product of loudspeaker dispersion and listening room setup and acoustics gelling with recorded sonic characteristics.

Creating the stereo image in assembled music means applying sonic manipulations—amplitude, timing/phase, frequency shaping, reverb, contrast, etc.—that mimic characteristics of natural sound in physical space. You can listen to these yourself by moving a sound source like snapping fingers or a clicker around your head with your eyes closed, or getting a friend to do the same while you wear a blindfold for added drama. Azimuth, depth and even height can be mimicked effectively in stereo. The latter is harder to reproduce because the sonic effects of the pinnae are subtle and vary between individuals. And obviously the phase/filtering tricks that place a sound beside or behind you with sound reproduced by a stereo triangle are hit-and-miss, and generally more effective via binaural headphone or multi-channel speaker production/reproduction.
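
As one concrete example of the amplitude part of that list, a constant-power pan is about the simplest manipulation that places a mono element across the stereo image (a sketch assuming a roughly -3 dB-centre pan law and numpy, not a description of any particular DAW's panner):

```python
# Constant-power (~ -3 dB centre) amplitude panning of a mono signal.
import numpy as np

def pan_constant_power(mono: np.ndarray, pan: float) -> np.ndarray:
    """pan: -1.0 = hard left, 0.0 = centre, +1.0 = hard right."""
    angle = (pan + 1.0) * np.pi / 4.0          # map [-1, 1] onto [0, pi/2]
    left = np.cos(angle) * mono
    right = np.sin(angle) * mono
    return np.stack([left, right], axis=1)     # shape (samples, 2)

# Example: a short noise burst placed halfway to the right.
burst = np.random.default_rng(0).normal(scale=0.1, size=4410)
stereo = pan_constant_power(burst, pan=0.5)
```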

I mostly listen to constructed music rather than performance recordings, so I favour mitigating listening room defects as far as possible (but I do allow a little side reflection for the euphonic width—aiming my particular speakers at the LP is a nice balance; crossing in front goes a bit too far). The ability of a system to reproduce the recorded sonics, and certain defects that get in the way, have been covered adequately by a number of posts above.
 
Last edited:

Axo1989

Addicted to Fun and Learning
Joined
Jan 9, 2022
Messages
735
Likes
572
For posters like @Tks who are sceptical of certain aspects, I recommend LEDR, for example via audiocheck.net or similar:

LEDR stands for Listening Environment Diagnostic Recording, a test to subjectively evaluate the accuracy of stereo image reproduction.

In the eighties, psychoacousticians began researching what are called pinna transforms, the way in which the shape of the outer ear filters the incoming sounds and permits our brain to infer their location. By embedding the filtering characteristics of the pinna into the audio signal, sound can be moved around the listener's head from a single pair of loudspeakers.

The LEDR test generates pinna-filtered audio that will literally float around your speakers, assuming your sound reproduction system is neutral enough to preserve the original signal characteristics.

That's a speaker test, so results will depend on your hardware/room and on a reasonable match of your pinnae to the average. This also relates to aspects of personalised spatial audio discussed in the iOS 16 thread.
 
Last edited: