
Sound stage depth?

Duke · Major Contributor · Audio Company · Forum Donor · Joined Apr 22, 2016 · Messages: 1,523 · Likes: 3,745 · Location: Princeton, Texas
@Tks, thanks for your in-depth response.

If I understand correctly, you consider the word "soundstage" to describe an attribute of a recording, not an attribute of a speaker. Is that correct?

If so, what wording would you use to describe the ability of speakers (or speakers + room, or the entire playback system) to convey the soundstage on a recording?

If we're comparing soundstage between two devices, all other factors must be made as equal as possible, so we can't have things like different frequency responses; those must be equalized before such comparisons can proceed if one is to demonstrate soundstage as a distinct concept.

If we're talking about loudspeakers, seems to me we ought to leave the designer's choices alone if we want to compare abilities to convey the soundstage on a recording. For example, equalizing the on-axis responses to be the same is going to also change the off-axis response of at least one of them, and the designer(s) may intend a particular off-axis response for good reasons.

So when you ask for clarification of what sort of demonstration of soundstage I'm asking for: I'm asking for things that can be toggled or put on a slider to increase or decrease soundstage...

So, something where you can do a quick A/B comparison? Like the Harman Speaker Shuffler system, or the Bang & Olufsen speaker which can switch between different radiation patterns?

... using a concept or concepts that are as distinct as possible from already existing concepts.

Seems to me this requirement greatly limits what you would be willing to have demonstrated.

But before demonstrations of soundstage are made, more than anything I'd like an actual definition.

Good point. Do you have one?

Maybe we should be calling it "spatial quality". Maybe I have been careless to use the word "soundstage" instead of "spatial quality". Here are aspects of "spatial quality" as found in Figure 7.13, page 178 of Toole's book, third edition:

- Definition of sound images
- Continuity of the sound stage
- Width of the sound stage
- Impression of distance/depth
- Abnormal effects
- Reproduction of ambience, spaciousness, and reverberation
- Perspective (artificial or contrived; they are here; outside looking in; close but still looking in; you are there; other)
 

goat76 · Major Contributor · Joined Jul 21, 2021 · Messages: 1,269 · Likes: 1,385
My understanding (from Floyd Toole's book) is that the WORST direction for a reflection to arrive from is the exact same direction as the first-arrival sound. So imo we don't want to minimize ALL of the reflections, only the FIRST ones. If we go overboard with absorbing reflections, we can end up with an overly dead presentation.

The first reflections are the worst offenders as far as degrading imaging precision, soundstage depth, and tonal balance, and the later reflections (assuming they are spectrally correct) are effective as "carriers" for the reverberation tails on the recording, delivering them from many directions, which is desirable if we wish to have a sense of immersion.

In the text you quoted, I was purely talking about hearing the recorded direct sound, hearing the recorded early reflection, and hearing the recorded reverberant tails, which will all give us the necessary clues that contain the depth and spaciousness of the recorded space. And for that, we must maximize the direct sound from our speakers and minimize the reflections from our own listening environment.

The early reflections that will come from the wall behind the speakers will only contain the lowest frequencies, which are omnidirectional. Whether those frequencies are fundamental to getting a good sense of the three-dimensional space of the recording, I do not know, but I don't think that frequency range will be significantly more harmful than the early reflections that contain the rest of the frequency spectrum, which will be part of the reflections from the side walls, back wall, floor, and ceiling.

I'm not saying we should go overboard with absorbing all reflections, just minimize them to the point where they will not overshadow the direct sound that contains the recorded spaciousness and reverberation. Some of the reflections are needed to hide some of the shortcomings of the simplistic stereo system and help to get a better sense of a more enveloping sound sensation, and some of them will simply make us feel more comfortable just being in the room. Other than that, the reflections will just "amplify" that "small room signature" you are talking about, and overshadow the recorded direct sounds, the recorded early reflections, and the recorded reverberant tails.

I don't know how common it is for the wall behind the speakers to constrain the soundstage depth.

The wall behind the speakers can only contain information about the depth of the listening environment and the speaker's relation and distance to that wall; it will not carry any information about the soundstage depth of the recording. The information about the depth of the recording is in the data of the recording and can only be better heard with a higher ratio of direct sound to reflected sound from the listening environment.

IMO, the only things that we really want our own listening environment to add are to hide the shortcomings of the stereo system, add some sensation of enveloping sound, and make us feel comfortable just being in the room. And besides those points, to give a sense of distance to the speakers; without that, dry recorded sounds will sound too dry and "in our face", like listening to headphones. The rest beyond that will just be destructive to our ability to hear the recorded information.
 

goat76 · Major Contributor · Joined Jul 21, 2021 · Messages: 1,269 · Likes: 1,385
What I personally don't understand is when people talk about which speaker has the better soundstage. That question seems like a category error to me until the entailments of soundstage are explained by the person asking the question. And just to be more clear, I don't want the discussion to devolve into "well this speaker has a higher max SPL before distortion sets in, so the soundstage is great with this speaker if you like to really crank your system up". No. If we're comparing soundstage between two devices, all other factors must be made as equal as possible, so we can't have things like different frequency responses; those must be equalized before such comparisons can proceed if one is to demonstrate soundstage as a distinct concept.

So when you ask for clarification of what sort of demonstration of soundstage I'm asking for: I'm asking for things that can be toggled or put on a slider to increase or decrease soundstage, using a concept or concepts that are as distinct as possible from already existing concepts. I don't want a demo of a Spatial Audio remastered album, because that type of soundstage I understand already; that's wholly a post-processing toolbag being used. What I want to understand are those people who make claims about devices having more soundstage than others, yet don't use already existing concepts as the primary account for the phenomenon as I have. If you hold a similar definition of soundstage to mine, then you won't be doing this; you'll admit that all soundstage is composed of are the things I mentioned (or, where the untouched recording is concerned, it's simply reflections and time-delay aspects inherent to the setting where it was recorded). But if you think there's soundstage inherent in a driver or something to that effect, please show me what I would need to tweak on paper, technically speaking, in order to increase my driver's soundstage, or decrease it to nothing (witty replies like "low-pass most of the sound" don't count for obvious reasons).

If you think of it more with the view that different loudspeakers can only degrade the incoming signal to various degrees, then it makes more sense that a particular speaker can have a better representation of the soundstage than another. The speakers can never increase the soundstage beyond what's in the actual recording; if something is heard beyond that, it's the listening environment you hear instead of the recorded spaciousness.
 

Thomas_A · Major Contributor · Forum Donor · Joined Jun 20, 2019 · Messages: 3,422 · Likes: 2,407 · Location: Sweden
Here is a report about distance cues that finds that the D/R ratio is the main cue for distance perception. IMO, the D/R present in the recording should be preserved when reproduced in-room (no early reflections arriving from the speakers or front wall). At the same time, we have speakers in the room as well. Besides summing localisation within the 2 ms window and phantom imaging within the stereo triangle, I think that side-wall reflections in particular are important to have at a certain level (but outside the 2 ms range), mainly to reduce errors of the stereo system. Such errors cause "identification" of the loudspeakers as the sound sources, and if dispersion is "correct" the reflections will also help fill in the 1.5-2 kHz comb-cancellation dip that is present in stereo. This may also include an on-axis dip at 2.5-5 kHz to counteract the comb-filter errors. Finally, wall reflections create some artificial sound sources outside the speakers, leading to a widening of the sound stage and making the speakers even more "invisible" to the listener. So there are D/R cues to be preserved (mainly in the recording), and there are reflections to add to minimise stereo errors and counteract speaker localisation cues for the perception of "depth".
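For a rough sense of where that 1.5-2 kHz comb-cancellation dip comes from, here is a back-of-the-envelope sketch (an illustration only; the ±30° speaker angle, head radius, and Woodworth ITD approximation are assumptions, not taken from the report mentioned above):

```python
import numpy as np

c = 343.0                  # speed of sound in m/s (assumed)
r = 0.0875                 # effective head radius in m (assumed)
azimuth = np.radians(30)   # speaker angle in a standard stereo triangle

# Woodworth approximation of the interaural time difference for one speaker
itd = (r / c) * (azimuth + np.sin(azimuth))   # seconds

# At each ear, a centered (correlated) signal arrives from the near speaker plus
# the far speaker delayed by roughly one ITD; summing a signal with a delayed
# copy of itself puts the first comb notch at 1 / (2 * delay).
first_notch_hz = 1.0 / (2.0 * itd)

print(f"ITD ~ {itd * 1e3:.2f} ms, first comb notch ~ {first_notch_hz:.0f} Hz")
# With these assumptions: ~0.26 ms and ~1.9 kHz, i.e. the 1.5-2 kHz region above.
```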

 

goat76 · Major Contributor · Joined Jul 21, 2021 · Messages: 1,269 · Likes: 1,385
In processing land this can be further manipulated; for example, a source can be artificially widened in the left/right domain. If you take a source, such as two mics on a mono guitar cab, hard-pan the mics, and delay one side 8-13 ms, this will "widen" things out. Now is this decision creative and artistic, or simply mucking about and causing phase problems? That probably depends.

A musical example of this would be this album: the mix feels spacious on headphones but can feel artificially wide on many systems. While I cannot state it with certainty, I would guess there is time manipulation going on (probably a decision made in the recent release of it, for better or worse) or, at minimum, that the mics were placed quite far apart in the original recording.

Also, interestingly, if you took the same source, doubled it, hard-panned the copies, and did this, the one that arrived first would be perceived as louder and closer, even though the volume is the same. Even a 1-2 ms difference is quite perceivable as a timing relationship between sources.

Another interesting example of phase manipulation would be from Roger Waters' album Amused to Death, specifically the track "Perfect Sense, Part 1". Listen to when the piano comes in at 40 seconds: where do you hear it coming from, and where does it move to? On a reasonably good system, in a decent room, it is jaw-dropping, and while I don't know what wizardry was committed in the mixing process, I would guess it has to do with the phase relationship between the L/R speakers on this particular source.

That's exactly what I do on my own recordings, which are highly guitar-driven. Two close microphones with different characteristics (one dark, one bright) at exactly the same distance from the guitar cab, panned 100% left and right, with a delay on one side separating them further, leaving space for the other sound sources centrally placed in the mix. The delay time will be different depending on what's being played, what best suits the song, and which compromise sounds the best, because some phase issues will of course be introduced by the delay.

But all that will of course just affect the width of the guitar, so a third microphone placed at a distance will add some sense of depth and spaciousness to the overall guitar sound.

And like you say, it only takes 1-2 ms to separate the channels, but sometimes I use 20-60 ms to really add some sense of delay to the sound.
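The hard-pan-plus-delay widening described above can be sketched in a few lines. The following is a minimal illustration, assuming a mono NumPy array and a sample rate; the `widen` helper is hypothetical and not anyone's actual plugin chain:

```python
import numpy as np

def widen(mono: np.ndarray, sr: int, delay_ms: float = 10.0) -> np.ndarray:
    """Return an (N, 2) stereo array: left = dry signal, right = delayed copy."""
    delay_samples = int(round(sr * delay_ms / 1000.0))
    left = np.concatenate([mono, np.zeros(delay_samples)])     # dry, hard left
    right = np.concatenate([np.zeros(delay_samples), mono])    # delayed, hard right
    return np.stack([left, right], axis=1)

# Example: a one-second 1 kHz tone widened with a 10 ms inter-channel delay.
sr = 48000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 1000 * t)
stereo = widen(tone, sr, delay_ms=10.0)
print(stereo.shape)   # (48480, 2)
```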

And for the "out of phase" thing used in the Amused to Death album, I had a plugin that I, unfortunately, don't recall the name of, that did the same thing. I played with that to get some handclapping to appear to be coming from a spot on the outside of the speakers, but it only works convincingly with sounds like that, with guitars and similar instruments where the mid frequencies are important, the lack of body is very apparent in a similar way as 3D effects in movies look somewhat transparent. :)
 

Duke · Major Contributor · Audio Company · Forum Donor · Joined Apr 22, 2016 · Messages: 1,523 · Likes: 3,745 · Location: Princeton, Texas
In the text you quoted, I was purely talking about hearing the recorded direct sound, hearing the recorded early reflection, and hearing the recorded reverberant tails, which will all give us the necessary clues that contain the depth and spaciousness of the recorded space.

That's what I thought you were talking about.

I'm not saying we should go overboard with absorbing all reflections, just minimize them to the point where they will not overshadow the direct sound that contains the recorded spaciousness and reverberation.

Agreed!

Note that those reflections contain "the recorded spaciousness and reverberation" because they are repetitions of the direct sound. If they begin neither too early nor too late, are neither too loud nor too soft, decay neither too fast nor too slow, arrive from many directions and are spectrally correct, then they will be beneficial.

Maybe we're saying more or less the same thing but emphasizing different aspects?

The wall behind the speakers can only contain information about the depth of the listening environment and the speaker's relation and distance to that wall; it will not carry any information about the soundstage depth of the recording.

Agreed!

The information about the depth of the recording is in the data of the recording and can only be better heard with a higher ratio of direct sound to reflected sound from the listening environment.

Sufficient time delay before the onset of strong reflections enables the spatial information on the recording to be perceptually dominant. I prefer this approach over maximizing the direct-to-reverberant sound ratio.

Quoting Siegfried Linkwitz:

"Reflections generated by the two loudspeakers should be delayed copies of the direct sound to the listener (across all frequencies, with at least 6 ms delay) - under these conditions the direct sound dominates perceptually, the cognitive faculty of the brain better able to separate the static listening room acoustics from the acoustics embedded in the recording.”
 

goat76 · Major Contributor · Joined Jul 21, 2021 · Messages: 1,269 · Likes: 1,385
Maybe we're saying more or less the same thing but emphasizing different aspects?

Yes, it's possible we would land on the exact same spot in the ballpark, even if we approach this from somewhat different angles. :)
 

Tks · Major Contributor · Joined Apr 1, 2019 · Messages: 3,221 · Likes: 5,494
@Tks, thanks for your in-depth response.

If I understand correctly, you consider the word "soundstage" to describe an attribute of a recording, not an attribute of a speaker. Is that correct?
No problem, but no, that is not solely what soundstage is definitionally to me. There's that of course (multi-channel recording, binaural, etc.), but then there's soundstage that can be added in with post-processing effects after recording (reverb, channel panning, and other DSP).

If so, what wording would you use to describe the ability of speakers (or speakers + room, or the entire playback system) to convey the soundstage on a recording?

Hypothetically, if I didn't believe soundstage was also a post-processing effect, the only relevance a speaker or room itself would have for soundstage would be the basics that make a speaker objectively good as the object at the end of a sound-reproduction chain, and that's basically fidelity (fidelity to me simply means a driver as distortion- and noise-free as possible while being capable of reproducing the typical 20 Hz-20 kHz range of sound); for anything under 100 Hz, though, I'm sure you'll forgive me for saying a subwoofer would have to help a speaker, in virtue of what driver designs are capable of today. And of course, SPL. There's not going to be much "soundstage" or much of anything if I can't hear what's being played, so the speaker will need to maintain fidelity as volume increases. Naturally, volume can't increase infinitely for every driver size (which is why the soundstage of calamitous weather will never sound as good as it would on something like a movie-theater setup).

So, something where you can do a quick A/B comparison? Like the Harman Speaker Shuffler system, or the Bang & Olufsen speaker which can switch between different radiation patterns?

Maybe; it would either be ideal or pointless, depending on whether that's what soundstage is being defined as. But if radiation pattern is the primary causative factor, then soundstage isn't really interesting, because that would mean any head movement of mine would be me editing the soundstage effect. Which is why I would really like to see soundstage demonstrated on headphones or with IEMs, since there are fewer factors to have to take account of. Though if you can get two speakers with identical frequency response and distortion metrics at the same SPL output, I'd be okay with doing an A/B between those speakers to gauge what this soundstage thing is that is being referenced.

Seems to me this requirement greatly limits what you would be willing to have demonstrated.

Of course it would; that's precisely the goal, if at all achievable. But again, I want a definition first, and then this request could be given more leeway, or be made even more restrictive. I don't want to juggle broad generalities to where this phenomenon can't be pinpointed or identified with appreciable precision. Truly, ideally speaking, what I would appreciate most is if someone could boil soundstage down to a single number in the same way something like SINAD is. Sure, THD+N are two quantities used to eventually arrive at the SINAD figure, so it's not wholly distinct from other metrics (it's also not distinct because it can be represented on a dB scale, which is in itself its own concept, obviously). But what I won't settle for is the current nondescript subjective gesturing, which varies with whoever you ask what soundstage is, where no technical agreement can be reached as to how best to represent it on paper.

So, it doesn't need to be distinct, but it needs to have parameters that can be toggled to achieve it with granularity and consistency.

Good point. Do you have one?
Umm, yeah, but it's not a distinct concept, simply an amalgamation. My definition was presented broadly in the 3rd paragraph. I think that encapsulates all notions of what people are describing with natural language when they try to gesture at what soundstage is. Anyone who doesn't hold to this definition (like the aforementioned people who believe a driver has inherent soundstage properties distinct from the concepts I mentioned) would need to provide a more robust definition.

I personally don't think you're going to get "soundstage" in the driver design/engineering phase of product creation. If one believes that is the case, I'd love to hear the case being made.

Maybe we should be calling it "spatial quality".
Maybe, but that raises the question of why. If it's the case that spatial quality is primarily distinct from soundstage, then we perhaps aren't talking about the same concept, so it's not clear what utility is gained from talking about a tangential or, at best, orthogonal concept. But if the argument is that soundstage is composed of components, one of them being spatial quality, then sure, we can talk about spatial quality.

The problem one faces when doing this (not using wholly distinct and clear concepts as the definitional gears for a term) is that one runs the risk of piling so many concepts into a definition that it loses weight as a term. If I throw in so many concepts to describe soundstage that it basically sounds like I could be talking about recordings with great audio quality in general, it's not clear why I wouldn't just simply talk about recordings with great audio quality; I have no reason to be talking about soundstage if it's indistinguishable from great audio recordings.

Here are aspects of "spatial quality" as found in Figure 7.13, page 178 of Toole's book, third edition:

- Definition of sound images
- Continuity of the sound stage
- Width of the sound stage
- Impression of distance/depth
- Abnormal effects
- Reproduction of ambience, spaciousness, and reverberation
- Perspective (artificial or contrived; they are here; outside looking in; close but still looking in; you are there; other)

I don't have the book, but now I see a MASSIVE problem. Just a few sentences ago, I was wondering whether "spatial quality" is mostly distinct from soundstage, or whether spatial quality is a subset of soundstage definitionally. It's actually worse now, it seems: soundstage appears to be a subset of spatial quality, based on the 3rd aspect you mention from the book. Right there it says "width of the sound stage". What this means is that "spatial quality" needs to be tossed out as a talking point for now, because one of the premises that serves as a descriptor for spatial quality is a term we're currently holding under contest or inquiry. Surely you can see why this is a problem. It's like talking about the taste of food when we don't have a description of what a tongue is.

But if there are technical definitions given for each of these 7 aspects of spatial quality, then that's fine. Then we can talk about spatial qualities, I guess. Or anything else for that matter, at that point.
 

dasdoing · Major Contributor · Joined May 20, 2020 · Messages: 4,209 · Likes: 2,674 · Location: Salvador-Bahia-Brasil
Depth is in the reverb of the recording: the more of your own room you hear, the less deep it sounds (since your room is smaller than the one in the recording).
With that being said, not everybody seems to be able to hear a real ambience in a reverb. It is a little like a 3D object on a 2D plane; you need a little imagination for it to work.
 

Thomas_A · Major Contributor · Forum Donor · Joined Jun 20, 2019 · Messages: 3,422 · Likes: 2,407 · Location: Sweden
I would say that depth has few cues, the dominant one being D/R. In relative terms, loudness and HF loss are also factors. This is for distances > 1 m. The loss of visual cues needs to be replaced by imagination.
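To illustrate how the D/R cue tracks distance, here is a textbook-style sketch; the room volume and RT60 are assumed example numbers, not taken from the report linked earlier:

```python
import math

def critical_distance_m(volume_m3: float, rt60_s: float) -> float:
    """Sabine-based critical distance for an omnidirectional source."""
    return 0.057 * math.sqrt(volume_m3 / rt60_s)

def direct_to_reverberant_db(distance_m: float, volume_m3: float, rt60_s: float) -> float:
    """D/R in dB: direct field falls 6 dB per doubling, diffuse field is ~constant."""
    return 20.0 * math.log10(critical_distance_m(volume_m3, rt60_s) / distance_m)

# Example numbers: a 50 m^3 listening room with RT60 = 0.4 s (assumed).
for d in (0.5, 1.0, 2.0, 4.0):
    print(f"{d:>4} m: D/R = {direct_to_reverberant_db(d, 50.0, 0.4):+.1f} dB")
# Each doubling of distance costs ~6 dB of D/R, which the ear reads as "farther away".
```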
 

Duke · Major Contributor · Audio Company · Forum Donor · Joined Apr 22, 2016 · Messages: 1,523 · Likes: 3,745 · Location: Princeton, Texas
Once again, @Tks thank you for your well thought-out and in-depth reply.

It sounds to me like every step of the way you and I would be grappling with definitions of terms, and the definitions of the terms used to describe those terms, and isolating this or that factor under sufficiently controlled conditions, and what is or is not allowed, and what is or is not a theoretically valid concept.

I no longer foresee you and me ever having a meeting of the minds sufficient for you to be comfortable sitting down with a remote control unit in your hand and toggling between Condition A and Condition B. I don't think I could ever satisfy your conditions, so we would never get to the point of an actual demonstration, even if there were no logistical hurdles. If you think otherwise, let me know and we can keep trying.

I thank you for exploring the possibility with me this far.
 

Tks · Major Contributor · Joined Apr 1, 2019 · Messages: 3,221 · Likes: 5,494
Once again, @Tks thank you for your well thought-out and in-depth reply.

I no longer foresee you and me ever having a meeting of the minds sufficient for you to be comfortable sitting down with a remote control unit in your hand and toggling between Condition A and Condition B. I don't think I could ever satisfy your conditions, so we would never get to the point of an actual demonstration, even if there were no logistical hurdles. If you think otherwise, let me know and we can keep trying.

And I thank you simply for your patience and determination in digesting what I had to say in its entirety. I honestly don't like writing a lot; I just have trouble being confident that I am coming across clearly to the other person, so apologies for making you go through those walls of text twice.

Though your posts to others show that we hold sufficient common ground on the matter, like your discussion about reflection properties and their timings. So it seems you and I already see eye to eye on soundstage as far as the recording procedure is concerned. I don't believe there's any mystery there (besides, of course, the particulars, as even I don't hold to some insanely good and foolproof definitions). My summary of what I take soundstage to be is born out of trying to put into general terms the things people are subjectively describing. If it's not due to inherent conditions of the recorded sounds based on the setting, and if it's not the post-processing of the sound during mastering, then I truly haven't the faintest clue what people are talking about when they invoke the soundstage property of the audio they hear. Though I haven't actually had someone explicitly tell me soundstage has nothing to do with the recording setting and also nothing to do with the post-processing phase of editing. Thus I don't think most people hold to some crazy idea of soundstage, though I have seen a few claim soundstage is its own thing (though they stop short of actually producing a compelling descriptor that demonstrates such).

Oh, and really fast: besides doing A/B demoing, another avenue would be to show me what industry engineers and professionals are doing during design where they're directly attempting to make a driver have "more soundstage" (if you actually hold to the idea that the driver itself is the actual thing that is responsible for the effect). If those procedures could be revealed, we could basically skip A/B demos entirely.
 

Cote Dazur · Addicted to Fun and Learning · Joined Feb 25, 2022 · Messages: 619 · Likes: 758 · Location: Canada
I'm not sure what they're talking about.
3D, in stereo image, may be created by our brain, but it is beside the point, as for many, including me, it is as real as anything.
In my main system I have excellent depth perception on many (most) recordings. In my other systems, the depth can also be perceived, but to a lesser extent. To me, it is all about speaker placement, where you sit, and how the room reacts. The electronics or ancillaries have very little effect, almost none.
The recording will make the most difference once the room and speaker are optimized.
In my experience, having the speakers well away from any boundary, with nothing between the speakers or between you and the speakers, is a huge help.
 

Duke · Major Contributor · Audio Company · Forum Donor · Joined Apr 22, 2016 · Messages: 1,523 · Likes: 3,745 · Location: Princeton, Texas
And I thank you simply for your patience and determination in digesting what I had to say in its entirety. I honestly don't like writing a lot; I just have trouble being confident that I am coming across clearly to the other person, so apologies for making you go through those walls of text twice.

You're not making this easy. You give me zero excuses to get riled up; quite the contrary in fact. I try blowing you off anyway, and...

Though your posts to others show that we hold sufficient common ground on the matter, like your discussion about reflection properties and their timings. So it seems you and I already see eye to eye on soundstage as far as the recording procedure is concerned.

... you respond by pointing out where we share common ground. So now it's like I'm trying to break up with my conjoined twin.

How about this: Let's forget about me wanting to "demonstrate something", and let's just have the conversation that seems to be happening anyway.

Just so you know, I know very little about the actual recording process; my comments have been mostly if not entirely about loudspeaker/room interaction. I'm just a speaker geek turned small-time manufacturer.

A couple of times you've mentioned "making drivers have more soundstage" (or something like that) being a position taken by some. I'm much more of a radiation-pattern-obsessed guy, but imo it is possible for some drivers to have less precisely-defined sound images. This involves the short time interval preceding the onset of the Haas effect, so let me explain:

The Haas effect (suppression of localization cues from reflections) does not begin at the instant a new sound arrives; it begins after about .68 milliseconds. That initial .68 millisecond "window" is enough time for a sound which arrives from one side of your head to wrap around and reach the other ear, with the arrival time difference at the two ears informing the ear/brain system of the azimuth (horizontal direction) of the sound source. As you well know, after that first .68 milliseconds the Haas effect kicks in and the ear ignores directional cues from reflections for the next 20 milliseconds or so, but during that interval it is still picking up loudness (and therefore timbre) cues.

If a driver or loudspeaker baffle has diffraction or a reflection, that is a re-radiation of the initial sound delayed by an inherently short path-length difference, thus typically arriving at the ears well within that initial .68 millisecond time window. This results in a FALSE azimuth cue (recall that the inter-aural arrival time difference is how the ear/brain system computes azimuth). Well actually it results in four false azimuth cues... two from each speaker, each arriving at the two ears. The result is a blurring of the sound image width. So for good sound image definition, we want drivers (and baffles) which do not have reflections or diffractions arriving within that first .68 milliseconds. Next-best would be for any (hopefully fairly weak) reflections or diffractions to arrive as early as possible within that first .68 milliseconds, as this will result in a smaller-angle "smearing" from the resulting false azimuth cues. One implication of this is, narrow-baffle speakers generally should have better sound image definition than wide-baffle speakers (all else being equal).
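A quick numeric check of the window described above; the baffle distances below are assumed examples, not measurements of any particular speaker:

```python
c = 343.0  # speed of sound in m/s

window_ms = 0.68
window_path_m = c * window_ms / 1000.0
print(f"0.68 ms corresponds to ~{window_path_m * 100:.0f} cm of path difference")  # ~23 cm

# Crude model: the extra path of an edge diffraction is about the tweeter-to-edge distance.
for edge_distance_cm in (8, 15, 22):
    delay_ms = (edge_distance_cm / 100.0) / c * 1000.0
    print(f"tweeter {edge_distance_cm} cm from the baffle edge -> ~{delay_ms:.2f} ms")
# All of these arrive inside the 0.68 ms window; the narrower the baffle, the earlier
# the false cue and the smaller the implied azimuth error, per the argument above.
```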

So I'm not really into "this driver images better than the rest", but I am into "that driver has inherent imaging problems". Diffraction horns, sharp-edged waveguides, and non-flush-mounting of tweeters are examples of things which can cause a driver to have imaging problems arising from false azimuth cues. (This is not the only issue which can arise from these very early reflections and diffractions, but I don't think it's one that is very well known. Strong early reflections or diffractions can also make the location of the loudspeakers obvious to the ears, which is a distraction from the spatial information on the recording.)

The foregoing probably seems like a trivial tangent in the big scheme of things, but imo one has to get a lot of little things right at the loudspeaker (and room-interaction) level in order for the spatial cues on the recording to become perceptually dominant.
 

goat76 · Major Contributor · Joined Jul 21, 2021 · Messages: 1,269 · Likes: 1,385
Lots of talk, not enough music. :)

Have a listen to the following song. Can you hear the recorded three-dimensional room behind your speakers? If so, how far into that room is the drum set positioned, according to you?

Low - Dinosaur Act

 

Tks · Major Contributor · Joined Apr 1, 2019 · Messages: 3,221 · Likes: 5,494
You're not making this easy. You give me zero excuses to get riled up; quite the contrary in fact. I try blowing you off anyway, and...

How about this: Let's forget about me wanting to "demonstrate something", and let's just have the conversation that seems to be happening anyway.
Riled up? I'm a bit confused, did I incite some sort of negative emotion? I hope not.

I read what you wrote, but I'm not sure what you would like to have a conversation about, precisely. Nothing seems to be anything I'd disagree with. You bring up loudspeaker baffles having an effect (obviously, as mounting a driver to a glob of jelly versus an all-metal housing will impart obviously different sonic behavior, one that may sound completely odd in relation to the other; I'm exaggerating hypotheticals here simply to illustrate that I have no contention with the claim, as it's straightforwardly acceptable in the way you present it).

There was also the Haas effect, which can be simulated in the digital realm, which is how many of these mono-to-stereo widening examples have been produced (so again, no qualms here). Though some of your numbers run contrary to what I've been exposed to. Since I'm not as deep into these things as a manufacturer might be, I'd simply take your word for it on authority alone on this front. Oh, and by the numbers, I mean that ~20 ms number. I can hear differences with channel-panned delay of a mono sound even at ~10 ms to 20 ms, though any less than that and it just sounds "louder" on the other channel instead of "wider" overall. I also was under the impression that the discernibility of the "echo effect" starts at ~40 ms; I wasn't aware Haas actually puts it at ~20 ms. Though I doubt any of these particulars are of merit, because as I've said, I'll take your claims in virtue of you being a more experienced authority on these matters; I just thought I'd mention it so you're aware of where my baseline was prior.

As far as false azimuth cues go, I'm not sure why we would be evaluating "faulty" baffle or speaker designs that would lead to these false cues based on timings (unless of course this is actually one of the primary aspects of what you take soundstage to be, and thus any speaker with the fewest false azimuth cues would equal a speaker with the best soundstage, or should I say imaging). Instead we should be taking speakers with minimal reflections and diffraction. There's nothing really particularly interesting about a driver if most of the directionality is driven by things you mentioned like waveguides and such. That's just trivially true, in the same way a massively distorting woofer will eventually distort enough to ruin "imaging" simply in virtue of basically ruining everything about the sound produced as a byproduct.

I know you opened up by telling me most of your talking points come from the perspective of speaker-room interactions, but if soundstage exists in things like headphones and IEMs, then I find it too unwieldy to start soundstage discussions there, since that's yet another thing that needs to be juggled (unless, of course, room reflections are tied to your notion of soundstage definitionally). What would happen, for instance, if someone asked you: "Yeah, no problem with anything you're saying here, but now start talking to me about soundstage with respect to IEMs"? Unless of course you're ready to say that soundstage is simply tied to bore geometry (in the same way baffle design or waveguides partly determine directionality in speakers). And if that's all soundstage actually is, then Y-axis imaging/soundstage is going to need to be explained as to how such a thing is possible in stereo (or better yet, even mono) without the use of visual or pre-contextualized cues (like silly binaural demos that psychologically pre-load you by telling you which sound is coming from the top and which is coming from the bottom). I won't ask that question though, because I first need a definition to the best of your ability, but we've abandoned soundstage as the main talking point, so sorry for even bringing it up like this again. Though I would like to ask you something before I conclude: do you actually believe something like vertical imaging is remotely possible in a blind evaluation of a sound that was recorded in an anechoic chamber, with you listening to said sound on headphones, also in a chamber? I ask this because I want to remove all factors that contribute to generalizing the concept or bogging it down with too many moving parts. So no reflections in the recording, no reflections during playback, and no reflections from the pinna (though for this it would need to be IEMs).

In conclusion, nothing you mentioned here is anything I'd really contest as far as its effects on soundstage go; better housing design can fully lend itself to a more "expansive" sound-image experience (obviously the actual numbers are for manufacturers to figure out as they think best, but I can't imagine why anyone would opt for anything less than maximal expansiveness for their given form factor, at least where speakers are concerned, unless of course that ruins some other intents for certain metrics).
 

tmtomh · Major Contributor · Forum Donor · Joined Aug 14, 2018 · Messages: 2,636 · Likes: 7,491
At the risk of stating the obvious, it appears we are really talking about three things:

1. Recordings contain differing amounts of information that our brains might interpret as soundstage depth: EQ, reverb, and so on. I'd also add that a number of pop/rock recordings contain a combination of close and distant-mic'd voices and instruments. The first thing that comes to mind is the early Led Zeppelin records, which were not first or unique but were somewhat groundbreaking in using distant mic'ing for drums, and (if memory serves) close mic'ing but with small amps on the lead guitar. There's often a palpable illusion of depth with John Bonham's drums clearly sounding farther back than the rest of the band, because of all the extra ambience from the mic arrangement.

2. Different listening rooms have qualities that might - or might not - allow the depth-illusion in recordings to be easily detected. So if one has the "right" room setup and speaker placement, then one might hear soundstage depth in those recordings that "have" it, while if one does not have the "right" setup, one might hear minimal soundstage depth in all recordings.

3. Some listening rooms might have qualities that actually help increase or flat-out create an illusion of soundstage depth. With such rooms, almost all recordings in one's collection would produce an illusion of depth, as the illusion would be added by the room to all recordings (as opposed to scenario #2, where the room could allow for the preservation of the illusion if it happened to be encoded in the recording.)

My own experience includes items 1 and 2, as far as I can tell. A few years ago I moved into a new house with a dedicated listening room that is larger, more symmetrical, and much better damped than my old one. The old room was carpeted, had upholstered furniture, and did not actively sound "live" - but the new room is definitely deader. The new room provides much better soundstage precision, and it's not subtle. The L-R width is a bit greater, but the more dramatic change is the placement precision within the L-R soundstage - and with many recordings, an illusion of some depth that was basically 100% new to me when I started listening in the new room.

So... since my recordings now seem to vary considerably in how much depth they seem to have, and since the new room quite obviously has fewer reflections than the old one, it seems to me that the most logical (albeit not certain or experimentally verified) conclusion is that the illusion of depth is primarily encoded in the recordings and that the new room is allowing me to hear that.

I have no particular investment in being right about this, though, so if someone has a better hypothesis, I'm all ears (pun intended :) ).
 

Thomas_A · Major Contributor · Forum Donor · Joined Jun 20, 2019 · Messages: 3,422 · Likes: 2,407 · Location: Sweden
At the risk of stating the obvious, it appears we are really talking about three things:

1. Recordings contain differing amounts of information that our brains might interpret as soundstage depth: EQ, reverb, and so on. I'd also add that a number of pop/rock recordings contain a combination of close and distant-mic'd voices and instruments. The first thing that comes to mind is the early Led Zeppelin records, which were not first or unique but were somewhat groundbreaking in using distant mic'ing for drums, and (if memory serves) close mic'ing but with small amps on the lead guitar. There's often a palpable illusion of depth with John Bonham's drums clearly sounding farther back than the rest of the band, because of all the extra ambience from the mic arrangement.

2. Different listening rooms have qualities that might - or might not - allow the depth-illusion in recordings to be easily detected. So if one has the "right" room setup and speaker placement, then one might hear soundstage depth in those recordings that "have" it, while if one does not have the "right" setup, one might hear minimal soundstage depth in all recordings.

3. Some listening rooms might have qualities that actually help increase or flat-out create an illusion of soundstage depth. With such rooms, almost all recordings in one's collection would produce an illusion of depth, as the illusion would be added by the room to all recordings (as opposed to scenario #2, where the room could allow for the preservation of the illusion if it happened to be encoded in the recording.)

My own experience includes items 1 and 2, as far as I can tell. A few years ago I moved into a new house with a dedicated listening room that is larger, more symmetrical, and much better damped than my old one. The old room was carpeted, had upholstered furniture, and did not actively sound "live" - but the new room is definitely deader. The new room provides much better soundstage precision, and it's not subtle. The L-R width is a bit greater, but the more dramatic change is the placement precision within the L-R soundstage - and with many recordings, an illusion of some depth that was basically 100% new to me when I started listening in the new room.

So... since my recordings now seem to vary considerably in how much depth they seem to have, and since the new room quite obviously has fewer reflections than the old one, it seems to me that the most logical (albeit not certain or experimentally verified) conclusion is that the illusion of depth is primarily encoded in the recordings and that the new room is allowing me to hear that.

I have no particular investment in being right about this, though, so if someone has a better hypothesis, I'm all ears (pun intended :) ).
I would agree with most of it, but I would also add a factor regarding speaker characteristics and speaker positioning. Factors that "reveal" the speakers (resonances, nearby reflections, stereo-system errors, etc.) are detrimental to the depth illusion, and vice versa. Dispersion is one factor (many also report that omni speakers "disappear"). Since positioning and early reflections are part of the equation, the room is as well.

One note, though: the D/R in the recording is not what D/R is in reality, since reflections in real life come from other directions than the direct sound. Our brains interpret a low D/R as if the sound comes from a reverberant room, such as a large hall or church, probably because we and our brains have also experienced that IRL. But for the "real thing", multichannel or very clever processing is needed.
 

goat76 · Major Contributor · Joined Jul 21, 2021 · Messages: 1,269 · Likes: 1,385
At the risk of stating the obvious, it appears we are really talking about three things:

1. Recordings contain differing amounts of information that our brains might interpret as soundstage depth: EQ, reverb, and so on. I'd also add that a number of pop/rock recordings contain a combination of close and distant-mic'd voices and instruments. The first thing that comes to mind is the early Led Zeppelin records, which were not first or unique but were somewhat groundbreaking in using distant mic'ing for drums, and (if memory serves) close mic'ing but with small amps on the lead guitar. There's often a palpable illusion of depth with John Bonham's drums clearly sounding farther back than the rest of the band, because of all the extra ambience from the mic arrangement.

2. Different listening rooms have qualities that might - or might not - allow the depth-illusion in recordings to be easily detected. So if one has the "right" room setup and speaker placement, then one might hear soundstage depth in those recordings that "have" it, while if one does not have the "right" setup, one might hear minimal soundstage depth in all recordings.

3. Some listening rooms might have qualities that actually help increase or flat-out create an illusion of soundstage depth. With such rooms, almost all recordings in one's collection would produce an illusion of depth, as the illusion would be added by the room to all recordings (as opposed to scenario #2, where the room could allow for the preservation of the illusion if it happened to be encoded in the recording.)

My own experience includes items 1 and 2, as far as I can tell. A few years ago I moved into a new house with a dedicated listening room that is larger, more symmetrical, and much better damped than my old one. The old room was carpeted, had upholstered furniture, and did not actively sound "live" - but the new room is definitely deader. The new room provides much better soundstage precision, and it's not subtle. The L-R width is a bit greater, but the more dramatic change is the placement precision within the L-R soundstage - and with many recordings, an illusion of some depth that was basically 100% new to me when I started listening in the new room.

So... since my recordings now seem to vary considerably in how much depth they seem to have, and since the new room quite obviously has fewer reflections than the old one, it seems to me that the most logical (albeit not certain or experimentally verified) conclusion is that the illusion of depth is primarily encoded in the recordings and that the new room is allowing me to hear that.

I have no particular investment in being right about this, though, so if someone has a better hypothesis, I'm all ears (pun intended :) ).

As @Thomas_A said, it's important to get the speaker positioning right; they need to "work together" so that the best possible projection of the stereo image can be had. And the room acoustics must be kept at bay; otherwise, the reverberation from the listening room will overshadow the small details that are needed to give us the important cues of the recorded room.

#3 in your list doesn't have much to do with high-fidelity reproduction of the recorded signal; that's a coloration created by the listening room that will, to some degree, overshadow the recording and add its own characteristics to everything that is played. It's not an illusion; that's the real reverberation from your listening environment. But still... I don't think we should eliminate all of that; it can contribute to some enveloping characteristics, which I personally like. :)
 

Tangband · Major Contributor · Joined Sep 3, 2019 · Messages: 2,994 · Likes: 2,789 · Location: Sweden
I would agree with most of it, but I would also add a factor regarding speaker characteristics and speaker positioning. Factors that "reveal" the speakers (resonances, nearby reflections, stereo-system errors, etc.) are detrimental to the depth illusion, and vice versa. Dispersion is one factor (many also report that omni speakers "disappear"). Since positioning and early reflections are part of the equation, the room is as well.

One note, though: the D/R in the recording is not what D/R is in reality, since reflections in real life come from other directions than the direct sound. Our brains interpret a low D/R as if the sound comes from a reverberant room, such as a large hall or church, probably because we and our brains have also experienced that IRL. But for the "real thing", multichannel or very clever processing is needed.
Regarding acoustical recordings with only two microphones in a concert hall:

It takes the ability to do your own recordings to fully understand what the sense of depth with two stereo loudspeakers is about - in two-channel playback it's an illusion, because to record a grand piano and create the illusion of sitting 8 metres from it on playback from two speakers, you might end up putting your stereo microphones 1 metre from the middle of the inside of the grand piano.

In real life we also listen with our eyes, and the precedence effect + cocktail-party effect are very real in the concert hall.
The microphones and the ear/brain work very differently at distances of more than 5 ms of air travel, i.e. 1.7 metres or more, and early reflections in the recording sound really bad - that's why one always has rather high microphone stands when doing acoustical recordings.
 