
Soundstage

Juhazi

Major Contributor
Joined
Sep 15, 2018
Messages
1,725
Likes
2,910
Location
Finland
Here are my measurements of three different speakers at the same location in my living room.

No room EQ applied, but the MR18w and AINO use DSP crossovers. The Ainogradient is a 4-way dipole, but its bass is a downfiring monopole and it forms a cardioid pattern between 100-200Hz - that explains its better clarity in that range! The IR shows that the dipole speaker produces more reflections around 10-20ms. The dipole speaker has the most SPL response wiggles in the midrange, around the Schroeder frequency. The MR18 also has downfiring bass; the ER18 is on a stand, with the woofer 80cm above the floor.

[Attachments: SPL (300ms) + clarity comparison and filtered IR comparison - all three speakers, left channel]
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Regarding the concerns raised by @Cosmik and others about the rearward wave from a dipole being in opposite phase to the front wave, I can't help noting that the path length of the first front wall reflection from a pair of stereo speakers will inevitably differ from the path length of the first front wall reflection of an acoustic source at the location of the phantom image.

To illustrate, the actual path length of the first front wall reflection from the right speaker is shown in red, while the path length of an imagined source in the location of a phantom image is shown in pink (speakers are black circles, phantom "source" is a grey circle, and listener is a blue circle):

View attachment 24674

In other words, there is a mismatch between the path length of the first front wall reflection from the speaker and the path length of the first front wall reflection that a real object in the location of the phantom centre would have created.

As a result, the relative phase of the direct sound and the reflected sound will be different from a pair of stereo speakers than it would have been from a sound source in the location of any phantom image.
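To put rough numbers on this, here is a minimal Python sketch using the image-source method (the positions are invented for illustration, not taken from the attachment):

```python
# Compare the front wall bounce path from a real speaker with the path an
# imagined source at the phantom centre would have produced. All positions
# are assumptions for illustration.
import math

def reflection_path(src, listener, wall_y=0.0):
    """Path length src -> front wall (y = wall_y) -> listener, computed via
    the mirror image of the source in the wall."""
    mirrored = (src[0], 2 * wall_y - src[1])
    return math.dist(mirrored, listener)

right_speaker = (1.2, 1.5)   # 1.5 m out from the front wall (y = 0)
phantom = (0.0, 1.5)         # phantom centre between the speakers
listener = (0.0, 4.0)

for name, src in [("right speaker", right_speaker), ("phantom centre", phantom)]:
    print(f"{name}: {reflection_path(src, listener):.2f} m")
# The bounce paths differ (about 5.63 m vs 5.50 m here), so the reflection's
# arrival time - and hence its phase relative to the direct sound - can never
# match what a real source at the phantom location would have produced.
```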

How different it is will depend on:
  • the location of the phantom image
  • the distances between each speaker and the front wall, the other speaker, and the listener
  • the frequency, since the mismatch will moreover vary across the band
Nevertheless, we interpret sound objects as clearly coming from the phantom centre (and other phantom locations between the speakers).

My point is that having a speaker that produces in-phase output both forward and rearward will never (in stereo) result in the phase of the front wall reflection matching the phase information of a hypothetical sound source in the location of the phantom image - at all or even at most frequencies.

(Of course, the same goes for all other reflections too, but we are discussing dipoles here so I'm limiting this comment to front wall reflections in this case.)

In fact, it is conceivable that dipoles in a certain position relative to each other and the listener, and at certain distances from the walls, will produce a front wall reflection that, at certain frequencies, better matches the front wall reflection that a phantom image would have produced. Indeed, at certain frequencies it's almost certain that this will be the case.

Given this, perhaps it's unsurprising that dipoles don't sound as weird as one might imagine they should.
If I am understanding the point, this is a frequency domain-only conception of how an image is formed.

The real question is how humans hear transients, i.e. the most common form of sound we hear; the 'shape', the envelope, the wavefronts. Humans don't gain much information from continuous tones (see my earlier post about having to create artificial Doppler shift through head movements). Although a nearby wall can give you the same overall phase shift as the front wall etc., the difference between the two will still be signalled by the 'tails' of transients.

If I were designing an artificial hearing system (one that's designed to find the direction of gunfire in a city, perhaps) I would train/evolve/develop/model it on the basis that reflections are derived from the original sound, not an inverted version. To train it for an arbitrary inversion would be to lose discriminatory ability and waste time, effort and resources.

It's a reasonable assumption that humans are likely to have evolved similarly to maximise performance and minimise wasted resources.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
If I am understanding the point, this is a frequency domain-only conception of how an image is formed.

I think you're misunderstanding ;) Nothing to do with frequency domain here (except insofar as phase is concerned).

The point is this: In any stereo (or multichannel) setup, the path length from the rear of a speaker to the front wall (or indeed any other wall) to the listener will inevitably be different from the (imaginary) path length from the location of the phantom image to the front wall to the listener (see the illustration in my earlier post).

It follows from this that, of course, the phase of the real reflection (from the speaker) will inevitably be different from the phase of an imaginary reflection originating from the location of any phantom image (differing path lengths = differing relative phase).

Think of this by comparing a mono setup with a stereo setup. In the mono setup, all sound originates from a single speaker, and there are no phantom images. The path length from the rear of the speaker to the front wall to the listener will always match the path length from the source of the image, ie it will always be correct, and the phase of the reflection will therefore also always match the phase of the reflection of the image.

In stereo, however, the path length from the rear of the speakers to the front wall to the listener will never match the (imaginary) path length from the location of the phantom image to the front wall to the listener (unless of course the image is hard panned L or R).

If the path length doesn't match, the relative phase of the reflection at the listening position will be false (except of course at certain arbitrary frequencies, and then the delay is still going to be false anyway).

Indeed, how its phase differs and the extent to which it differs will depend on the location of the speakers, walls, listener, and phantom image, and will vary with frequency.

Moreover, there will not necessarily be any greater similarity between the phase of the reflection and the phase of an (imaginary) reflected phantom image, regardless of whether the rearward wave from the speaker is in-phase or out-of-phase with the front wave.

In all cases, the fact that the true source of the sound (ie the speakers) differs from the location of the phantom image means that the phase of all reflections is false, and not necessarily any more or less false whether the rearward wave is in-phase or out-of-phase with the front wave.
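To illustrate numerically, reusing the geometry from the sketch in my earlier post (the reflection gain of 0.5 is an assumption), here is a comparison of the comb-filter response the real reflection produces against the one an imagined source at the phantom position would have produced, for both an in-phase and an inverted rear wave:

```python
# Compare |direct + reflected| for the real speaker's bounce delay against
# the hypothetical phantom source's bounce delay. Delays are derived from
# the earlier illustrative geometry; the gain g is assumed.
import numpy as np

C = 343.0                        # speed of sound, m/s
tau_speaker = (5.63 - 2.77) / C  # bounce minus direct path, real speaker (s)
tau_phantom = (5.50 - 2.50) / C  # same for a source at the phantom position
g = 0.5                          # reflection gain, assumed

f = np.array([125.0, 250.0, 500.0, 1000.0, 2000.0])

def response(tau, sign):
    """Magnitude of direct sound plus (optionally inverted) reflection."""
    return np.abs(1 + sign * g * np.exp(-2j * np.pi * f * tau))

reference = response(tau_phantom, +1)   # what the phantom source "would" produce
for sign, label in [(+1, "in-phase rear wave"), (-1, "inverted rear wave")]:
    print(label, np.round(response(tau_speaker, sign) - reference, 2))
# Neither version matches the phantom reference across the band, and which one
# is "less wrong" flips from frequency to frequency.
```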
 
Last edited:

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I think you're misunderstanding ;) Nothing to do with frequency domain here (except insofar as phase is concerned).
Phase is what I'm referring to, though. Phase of the individual components of the sound is the frequency domain way of looking at it. The other is to think of it as a transient with a certain shape in the time domain. Mixing the inverted transient with the non-inverted will give rise to shapes that may not be (so easily) discounted by our hearing as 'reverberation'.

To say that this is not possible is to second-guess human hearing, and to forget the extremely strong effect of listening to antiphase stereo on speakers or headphones. The headphone case is 'spectrally correct' - no physical comb filtering has occurred - yet it still makes your head swim. "Ah, but it's decorrelated in the room" is just words, because no one can say that a room isn't statistically going to give you all kinds of combinations of "correlated" reflections.

For sure, the phantom stereo image is not 'real', but it emanates from two speakers simultaneously and for text-book recordings is based on relative volume level not timing. I don't know, but I imagine there's a fair chance that it will work out that for a symmetrical system/room, the reverberation seems consistent with the location of the phantom image - ish.

I can't claim that the room combines with the recording to create a literal recreation of the original acoustic scene, nor that it creates an acoustically consistent new scene. (Maybe if the sound comes only from one speaker and is a dry recording it does, but not normally.) I see the stereo image as a jewel-like illusion that can be witnessed in a peculiar environment such as an anechoic chamber, or in the comfort of an ordinary room.

Just as I can watch a TV in a darkened, non-reflective room for 'perfect' viewing, I prefer to watch it in an ordinary room even though, technically, the image is not as perfect. However, I notice that I *never* confuse the TV with the room, or get a sense that the room is distorting the picture or vice versa. Similarly, with ordinary stereo I *never* get a sense of the shape of the room changing - even if I apply some of the techniques that are thought of as 'room correction' and adjust them dynamically.

But effectively introduce some new speakers (by inverting the signal from the back of the main ones) and I'm not so sure. I certainly experience uncomfortable 'in-ear' and 'in-head' sensations when there's anti-phase about (my experience with the Kii Three was marred by that, due to careless placement I think). I think my brain is trying to come to terms with what it is hearing - and failing.
 

kevinh

Senior Member
Joined
Apr 1, 2019
Messages
338
Likes
275
The ear brain system will 'ignore' some information and be aware of other information.
Think of why tube amps with bad specs can still sound better than some SS amps with 'better' specs. In the amp case, the ear will 'mostly' ignore 2nd and 3rd harmonic distortion. Lots of tube amps have a larger 2nd harmonic, a smaller 3rd harmonic, and little to no higher-order harmonics. Some SS amps have higher-order harmonics at or above the level of the 2nd and 3rd, and these tend not to sound so good. (Ideally you get something like the NCore amps, with great measurements and a 'good' harmonic profile.)
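As a rough illustration of those two profiles (the harmonic levels below are invented purely for demonstration):

```python
# Synthesise a "tube-like" spectrum (strong 2nd, weaker 3rd, nothing higher)
# versus a sparse higher-order spectrum, for listening comparison.
import numpy as np

fs, f0 = 48000, 440.0
t = np.arange(fs) / fs   # one second

def tone(levels):
    """Sum of harmonics of f0; `levels` maps harmonic order -> amplitude."""
    return sum(a * np.sin(2 * np.pi * f0 * n * t) for n, a in levels.items())

tube_like = tone({1: 1.0, 2: 0.03, 3: 0.01})            # low-order, decaying
high_order = tone({1: 1.0, 5: 0.01, 7: 0.01, 9: 0.01})  # sparse, higher-order

# To listen (assumes the soundfile package is available):
#   import soundfile as sf
#   sf.write("tube_like.wav", tube_like / 2, fs)
#   sf.write("high_order.wav", high_order / 2, fs)
```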

With room acoustics it is important to minimize or eliminate reflections arriving within 10ms (15ms would be better); your ear will ignore the reflections that arrive later. Then a 'good' decay of reflections (T60) will allow the speakers to deliver the most accurate information to your brain. In this respect, controlling the floor and ceiling reflections as well as the wall reflections is very important.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
Phase is what I'm referring to, though. Phase of the individual components of the sound is the frequency domain way of looking at it. The other is to think of it as a transient with a certain shape in the time domain. Mixing the inverted transient with the non-inverted will give rise to shapes that may not be (so easily) discounted by our hearing as 'reverberation'.

I’m struggling to understand the distinction though. It can’t be the timing, since the timing in both cases is the same.

So the only relevant difference between the front wall reflections of dipoles vs monopoles is the phase inversion. You can only possibly be arguing, then, that the problem is that the phase of the dipole reflection is false.

I’m simply pointing out that the timing is false in both cases to begin with, thus the phase must necessarily also be false regardless.

For sure, the phantom stereo image is not 'real', but it emanates from two speakers simultaneously and for text-book recordings is based on relative volume level not timing. I don't know, but I imagine there's a fair chance that it will work out that for a symmetrical system/room, the reverberation seems consistent with the location of the phantom image - ish.

But this is my point. The reflection is never consistent with the phantom image. And I fail to see how it is bound to be any less or more consistent whether the reflection happens to be inverted or not. Indeed which particular reflection happens to be more correct depends upon entirely arbitrary circumstances (see my earlier post).
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,574
Likes
3,887
Location
Princeton, Texas
OTOH, contrary (I think?) to what you're suggesting, research suggests in fact that strong early reflections actually improve speech intelligibility (clarity). Have a look here, in particular at Section 3.3:

"It has long been recognized that early reflections improve speech intelligibility, so long as they arrive within the “integration interval” for speech, about 30 ms [45]. More recent investigations found that intelligibility improves progressively as the delay of a single reflection is reduced, although the subjective effect is less than would be predicted by a perfect energy summation of direct and reflected sounds."

The published literature gives seemingly at-first-glance mixed messages on the desirability of "early" reflections, but I think there is actually a useful overlap region amid what the various researchers are saying.

I didn't make note of which papers I got these quotes from, but David Griesinger has said:

"The earlier a reflection arrives the more it contributes to masking the direct sound."

(Griesinger uses the term "presence" similar to the way we might use the term "clarity", and "envelopment" similar to the way we might use "spaciousness", "ambience", or "you are there".)

"Envelopment is perceived when the ear and brain can detect TWO separate streams:
A foreground stream of direct sound.
And a background stream of reverberation.
Both streams must be present if sound is perceived as enveloping."

"When presence is lacking the earliest reflections are the most responsible."

"Presence depends in the ability of the ear and brain to detect the direct sound as separate from the reflections that soon overwhelm it."

Griesinger is writing about concert halls but I think the same principles apply to home audio, though the time intervals are of course much shorter.

Earl Geddes is focused much more on home audio, and he has said that reflections arriving before 10 milliseconds are generally detrimental, while reflections arriving after 10 milliseconds are generally beneficial. I don't have a quote in front of me but I worked with him for a little while on the Summa so I'm somewhat familiar with his thinking on the subject.

So Griesinger emphasizes that the ear and brain need to be able to separate the direct sound from the reflections. Imo this implies that, in a small room, ideally we'd have a time gap in between the first-arrival sound and the onset of reflections. Recording studio control room design often deliberately deflects or absorbs the first reflections to accomplish this time gap, and based on the path lengths typically involved, it looks to me like they're often shooting for a no-early-reflection gap after the direct sound of something over 10 milliseconds but usually less than 20 milliseconds. (I'm not advocating studio control room acoustics for home audio, just mentioning it).

So, how does what Toole is saying fit in?

Well, the time interval he gives for "early reflections" which improve speech intelligibility extends out to 30 milliseconds. So Geddes is not saying (nor imo do Griesinger's principles imply) that the entire 30 milliseconds is undesirable in a small room; only the first part of it.

My conclusion for home audio is, if possible, we want to minimize or suppress reflections arriving before 10 milliseconds ("early" within the context of a small room), and encourage or even enhance those arriving after 10 milliseconds. This is imo necessary for the ear/brain system to separate the first-arrival sound from the reverberant sound. Splicing in Toole's wisdom, from 10 to 30 milliseconds the reflections are enhancing clarity with no significant downsides. In my experience those "late" (within the context of a home listening room) reflections also enhance timbre and richness, and they carry ambience cues from the recording which the ear picks out of the background noise (under good conditions). All of this is assuming the reflections are "spectrally correct", which is in part a loudspeaker radiation pattern issue (as is the minimizing of "early" reflections).
 
Last edited:

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I’m struggling to understand the distinction though. It can’t be the timing, since the timing in both cases is the same.
The case of the inversion in stereo headphones is where you have exactly the same timing and the same spectrum yet you still hear the difference (and how!). It is this effect that I am talking about.

I was going to write more, but can we clear that one up first? What is your explanation for the immensely strong effect that you hear when you (a) invert one speaker, and (b) invert one channel of headphones?

The words "correlation" and "de-correlation" may crop up, but is there a binary distinction between those conditions? I suspect that for many people, if it looks really 'furry' on a laptop screen when viewing a tone-based frequency response they decide it's "de-correlated" but we don't listen by looking at a laptop screen.
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,574
Likes
3,887
Location
Princeton, Texas
What is your explanation for the immensely strong effect that you hear when you (a) invert one speaker, and (b) invert one channel of headphones?

(Not directed at me, but I hope you don't mind if I reply.)

Inverting the polarity of one channel affects the first-arrival sound, before the Precedence Effect kicks in. That's why it makes such a big difference.

Once the Precedence Effect kicks in the directional cues have already been extracted from the first-arrival sound and false directional cues from the ensuing reflections - regardless of their phase - are suppressed.
 
Last edited:

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
My conclusion for home audio is, if possible, we want to minimize or suppress reflections arriving before 10 milliseconds ("early" within the context of a small room), and encourage or even enhance those arriving after 10 milliseconds. This is imo necessary for the ear/brain system to separate the first-arrival sound from the reverberant sound. Splicing in Toole's wisdom, from 10 to 30 milliseconds the reflections are enhancing clarity with no significant downsides. In my experience those "late" (within the context of a home listening room) reflections also enhance timbre and richness, and they carry ambience cues from the recording which the ear picks out of the background noise (under good conditions). All of this is assuming the reflections are "spectrally correct", which is in part a loudspeaker radiation pattern issue (as is the minimizing of "early" reflections).

I have read Griesinger (a while back) and generally agree with all that you say about him (subject to my potentially faulty memory of course).

In any case, in a small room, 30ms is just too late for any early reflection. This is indeed why Toole asks us to forget about "envelopment" when discussing small rooms.

Regarding Geddes and his 10ms threshold, I'm less convinced. I'm not sure to which listening studies Geddes refers when he singles out 10ms as a relevant threshold. I'm not aware of any studies suggesting any relevant distinction between pre- and post-10ms arrivals.

I realise that Geddes' system design is based on this 10ms rule, which is why he calls for speakers to be placed such that they do not produce first reflections off adjacent walls. I'm just not aware of any underlying psychoacoustic research for this position - perhaps you can direct me to something?
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
The case of the inversion in stereo headphones is where you have exactly the same timing and the same spectrum yet you still hear the difference (and how!). It is this effect that I am talking about.

I was going to write more, but can we clear that one up first? What is your explanation for the immensely strong effect that you hear when you (a) invert one speaker, and (b) invert one channel of headphones?

The words "correlation" and "de-correlation" may crop up, but is there a binary distinction between those conditions? I suspect that for many people, if it looks really 'furry' on a laptop screen when viewing a tone-based frequency response they decide it's "de-correlated" but we don't listen by looking at a laptop screen.

The problem is that you're merging two different effects.

I think the most elegant way to demonstrate this is simply to create some demo files to illustrate the differences. Please find the Dropbox folder here.

The audio is an excerpt from a recording by a friend of mine. The first file to listen to is the one labelled "mono". This is simply a mono mixdown of the recording.

Next, listen to "L phase inverted". This file inverts the left channel of the mono mixdown to demonstrate the classic phase inversion effect you refer to - which of course is obvious and unpleasant.

Next, listen to "L 10ms delay". This file does not invert either channel's phase, but delays the left channel by 10ms relative to the right channel. I chose 10ms as it is about the same delay as that of the first front wall reflection for a speaker placed about 1.5m from it. IMHO it sounds a bit strange, but not as strange as the "L phase inverted" track. And it certainly sounds more spacious in quite a nice way (this effect is often used in mixing to give a spacious effect, albeit usually with one "voice" and not the whole mix).

Finally, listen to "L 10ms delay + phase inverted". This file both inverts the phase of the left channel and delays it by 10ms relative to the right channel.

I think you'll agree that, once a 10ms delay has been added to the left channel, inverting it doesn't sound better or worse, weirder or more normal, than when a 10ms delay is added to one channel but phase is not inverted - it just sounds different. The phase inversion effect you talk about is certainly no more present here than when we delay the channel by 10ms but do not invert the phase.
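For anyone who wants to reproduce these files, here's a sketch of how they can be generated (numpy and soundfile assumed installed; "input.wav" is a placeholder for your own stereo file, not the actual recording):

```python
# Build the four demo files: mono, L inverted, L delayed 10 ms, and
# L delayed 10 ms + inverted.
import numpy as np
import soundfile as sf

audio, fs = sf.read("input.wav")          # stereo file, shape (n, 2)
mono = audio.mean(axis=1)
pad = np.zeros(int(round(0.010 * fs)))    # 10 ms of silence

def write_pair(name, left, right):
    n = min(len(left), len(right))
    sf.write(name, np.column_stack([left[:n], right[:n]]), fs)

delayed = np.concatenate([pad, mono])     # left starts 10 ms late
padded = np.concatenate([mono, pad])      # right padded to match length

write_pair("mono.wav", mono, mono)
write_pair("L_phase_inverted.wav", -mono, mono)
write_pair("L_10ms_delay.wav", delayed, padded)
write_pair("L_10ms_delay_phase_inverted.wav", -delayed, padded)
```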
 
Last edited:

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,574
Likes
3,887
Location
Princeton, Texas
In any case, in a small room, 30ms is just too late for any early reflection. This is indeed why Toole asks us to forget about "envelopment" when discussing small rooms.

I think we can get an enjoyable amount of envelopment from the ambience cues on a good recording if we can "unmask" them by managing the reflections.

Regarding Geddes and his 10ms threshold, I'm less convinced. I'm not sure to which listening studies Geddes refers when he singles out 10ms as a relevant threshold. I'm not aware of any studies suggesting any relevant distinction between pre- and post-10ms arrivals.

I realise that Geddes' system design is based on this 10ms rule, which is why he calls for speakers to be placed such that they do not produce first reflections off adjacent walls. I'm just not aware of any underlying psychoacoustic research for this position - perhaps you can direct me to something?

I'm not aware of any such study either, though the general principle of working with the ear/brain system by suppressing early reflections while encouraging (or at least allowing) later ones has been used in the recording studio world for decades.

My guess is that Earl would say something like "more than 10 ms is better, but try to get a ballpark minimum of 10 ms."

Having spent a lot of time with dipole speakers before I met Earl, his 10 ms recommendation was consistent with my experience (regarding the path-length-induced time delay of the backwave), so I needed no convincing.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
I'm not aware of any such study either, though the general principle of working with the ear/brain system by suppressing early reflections while encouraging (or at least allowing) later ones has been used in the recording studio world for decades.

My guess is that Earl would say something like "more than 10 ms is better, but try to get a ballpark minimum of 10 ms."

Taking up an old thread here, but @Duke I've been thinking about this 10ms figure and I suspect that it has to do with interaural cross-correlation (IACC). Research has shown that reflections with IACCs closer to zero (highly uncorrelated) tend to increase perceived spaciousness, while those with IACCs closer to 1 (highly correlated) tend not to.

Below is a graph reproduced by Toole in his paper Loudspeakers and Rooms for Sound Reproduction - A Scientific Review.

It shows IACC for a 36° lateral reflection at various delays for various types of music program. As delay increases towards 4-10ms (depending on program), IACC falls closer to zero; once delay passes that point, IACC flattens out. If we take 10ms as the more stringent figure, this suggests that 10ms is a good starting point for our minimum lateral reflection delay for optimal IACC.

[Attached graph: IACC vs. delay for a 36° lateral reflection, various music programs (from Toole)]
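For reference, IACC here is the maximum of the normalised cross-correlation between the two ear signals over roughly ±1 ms of lag. A minimal sketch of the calculation (my own simplification, not taken from Toole's paper):

```python
# IACC: max of |normalised cross-correlation| over +/- 1 ms of lag.
import numpy as np

def iacc(left, right, fs, max_lag_ms=1.0):
    max_lag = int(fs * max_lag_ms / 1000)
    norm = np.sqrt(np.sum(left**2) * np.sum(right**2))
    def corr(lag):
        if lag >= 0:
            a, b = left[lag:], right[:len(right) - lag]
        else:
            a, b = left[:lag], right[-lag:]
        return np.sum(a * b) / norm
    return max(abs(corr(lag)) for lag in range(-max_lag, max_lag + 1))

# Identical channels give IACC ~ 1, independent noise gives IACC near 0:
fs = 48000
n = np.random.randn(fs)
print(iacc(n, n, fs))                      # ~1.0
print(iacc(n, np.random.randn(fs), fs))    # close to 0
```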
 
Last edited:

Hipper

Addicted to Fun and Learning
Joined
Jun 16, 2019
Messages
753
Likes
625
Location
Herts., England
Here is how the Precedence Effect works (this is simplified somewhat for the sake of brevity): When a new sound comes along, the ear/brain system grabs its directional cues (among other things) and then puts a copy of the sound into a short-term memory. Then all other incoming sounds are compared with the sounds stored in this short-term memory based on their spectral content. If there is a match, then it is classified as a reflection and its directional cues are suppressed, but it still contributes to loudness, timbre, and spaciousness. After a sound has been in this short-term memory for a little while (about 40 or 50 milliseconds), it is deleted.

By suppressing false localization cues from reflections, the Precedence Effect enables us to localize a sound source in a reverberant environment. But if the time delay for the reflection is longer than the Precedence Effect lasts, the reflection is interpreted as a new sound, and we hear it as an echo.
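The logic described there can be sketched as a toy model - not a real auditory model, just the quoted steps made explicit, with spectral "matching" reduced to comparing normalised magnitude spectra:

```python
# Toy sketch of the described Precedence Effect logic. The threshold and the
# spectral-match test are invented stand-ins, purely illustrative.
import numpy as np

MEMORY_MS = 50.0   # stored sounds survive ~40-50 ms, per the description

class PrecedenceSketch:
    def __init__(self):
        self.memory = []   # list of (arrival_time_ms, normalised spectrum)

    def hear(self, t_ms, signal, direction):
        # Forget sounds older than the short-term memory window.
        self.memory = [(t0, s) for t0, s in self.memory if t_ms - t0 < MEMORY_MS]
        spectrum = np.abs(np.fft.rfft(signal))
        spectrum /= np.linalg.norm(spectrum) + 1e-12
        for _, stored in self.memory:
            if np.dot(spectrum, stored) > 0.9:   # crude "spectral match"
                # Classified as a reflection: directional cue suppressed, but
                # it would still contribute to loudness/timbre/spaciousness.
                return "reflection (direction suppressed)"
        # No match: treated as a new sound; keep its direction, remember it.
        self.memory.append((t_ms, spectrum))
        return f"new sound from {direction}"
```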

What is the difference between 'directional cues' and 'spaciousness'? Surely one leads to the other?

Like you I understand that if a reflection is delayed long enough (40 or 50ms after the arrival of the direct sound) it will become an echo, and, as I see it, its directional information will be the point of reflection, so giving an impression of spaciousness. Is that correct?

Mine is a small room, 386cm x 420cm, and therefore perhaps it's too small to get the envelopment that others seem to get. The distance sound travels in 50ms is about 17 metres. The direct sound path is 1.5 metres. My first reflections vary between 2.9m and 5.1m, and we can add say 4 metres for second reflections, then 8 metres for third reflections, etc. It seems then that to get beyond the 17 metre/50ms point I need fifth reflections and greater. I've no idea how much energy these will have lost before reaching my ears.

This idea leads to two questions about the formation of a soundstage.

1. Which frequencies are most useful?

2. How much energy/dB is lost from each reflection? I appreciate this depends on what it reflects off but let's assume a plastered brick wall (coincidentally that is what I have!).
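To make my arithmetic explicit, here's a quick sketch (the ~4 metres added per bounce, and an absorption coefficient of about 0.03 for plastered brick, are assumptions):

```python
# Per reflection order: cumulative path, delay vs the 1.5 m direct sound,
# and level lost to wall absorption plus 1/r spreading.
import math

C = 343.0        # speed of sound, m/s
direct = 1.5     # direct path, m
alpha = 0.03     # energy absorbed per bounce off plastered brick (assumed)
path, per_bounce = 4.0, 4.0

for order in range(1, 7):
    delay_ms = (path - direct) / C * 1000
    absorb_db = -10 * math.log10((1 - alpha) ** order)
    spread_db = 20 * math.log10(path / direct)
    print(f"order {order}: path {path:4.1f} m, delay {delay_ms:5.1f} ms, "
          f"down {absorb_db + spread_db:4.1f} dB vs direct")
    path += per_bounce
# Fifth-order reflections (~20 m) are the first past the 17 m / 50 ms point,
# and nearly all of the loss per bounce is distance, not wall absorption.
```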
 
Last edited:

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,574
Likes
3,887
Location
Princeton, Texas
Hello Hipper, sorry it has taken me more than a year to respond. I had somehow overlooked your post.

What is the difference between 'directional cues' and 'spaciousness'? Surely one leads to the other?

By "directional", I mostly mean azimuth. When you can close your eyes and point your finger right straight at the person talking on the other side of the room, that's because of the directional cues in the first arrival sound, and the suppression of what would be false directional cues in subsequent reflections.

By "spaciousness", I mean the auditory sensation of being within an acoustic space. When you can close your eyes and tell that the room you and the other person share is a medium-sized living room instead of an auditorium or a closet, it's because of the spatial cues.

Like you I understand that if a reflection is delayed long enough (40 or 50ms after the arrival of the direct sound) it will become an echo, and, as I see it, its directional information will be the point of reflection, so giving an impression of spaciousness. Is that correct?

Yes, but we are also picking up spatial cues while the Precedence Effect is in effect. The room need not be large enough to support a distinct echo. With eyes closed we can easily tell by the sound whether we are in the living room, or have inadvertently wandered into the closet again, and neither of these rooms supports an echo the way an auditorium can.

Two questions about the formation of a soundstage.

1. Which frequencies are most useful?

2. How much energy/dB is lost from each reflection? I appreciate this depends on what it reflects off but let's assume a plastered brick wall (coincidentally that is what I have!).

I don't know which frequencies are the most useful for the formation of a soundstage. I have assumed they all need to be there for the sake of timbre, so I haven't paid any attention to which ones matter most when it comes to soundstage. If I did happen across that information, it was not retained. Sorry!

Nor do I know how many dB (and at what frequencies) are lost with each reflection off a plastered brick wall. My instinct is, not very much energy is lost, and that all audible frequencies are probably reflected with just about the same intensity.

Mine is a small room, 386cm x 420cm, and therefore perhaps it's too small to get the envelopment that others seem to get.

If you have free rein in that room, here are two approaches which might work:

1. Borrow some Maggies or Quads or other fullrange dipoles and set them up along the short wall, about five feet out from the wall behind them. Toe them in very aggressively such that the dipole "null" to the side of the baffle is roughly pointed at the place on the wall where the first sidewall reflection would normally be (resulting in NO early same-side-wall reflections). Notice now that the backwave will bounce two or three times before reaching the listening area, and that the first sidewall reflection for each speaker's frontwave will be the long bounce off the opposite side wall. In other words, we are using the speaker's radiation pattern along with setup geometry to minimize early reflections, while enabling a fair amount of relatively late-onset reflections.

I'm not sure what to do about the reflection off the wall behind the listening area. Maybe build large reflectors angled to re-direct those first reflections away from the central sweet spot. Or diffuse them, or absorb them (if your head is within 60 cm or so of the back wall). Or get advice from someone who knows backwall reflections better than I do. But in general I'm in favor of using no more absorption than you absolutely have to.

2. Hire a professional acoustician and tell him what your goals are. Then laugh at me about how far off my suggestions were!
 
Last edited:

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
I find Archimago's definition interesting, as I'd always considered soundstage to be also about the size of the sonic scene and the images therein. In terms of the placement of these objects specifically, I'd prefer the term "imaging". But perhaps my way of looking at it is idiosyncratic - these words get thrown around a lot without any specific definition given.

I agree with the distinction you're making between "soundstage" and "imaging".


Stereophile's Glossary also uses similar definitions:

ambience The aurally perceived impression of an acoustical space, such as the performing hall in which a recording was made.

phantom image The re-creation by a stereo system of an apparent sound source at a location other than that of either loudspeaker.

imaging The measure of a system's ability to float stable and specific phantom images, reproducing the original sizes and locations of the instruments across the soundstage.

stereo imaging The production of stable, specific phantom images of correct localization and width.

soundstaging, soundstage presentation The accuracy with which a reproducing system conveys audible information about the size, shape, and acoustical characteristics of the original recording space and the placement of the performers within it.


Real-stereo (1-mic to 1-channel to 1-speaker) is two-dimensional. It can produce phantom images which are perceived as closer or more distant, located between the speakers in the azimuth plane.

Soundstage is an effect which can be "enhanced" either in the production stage by manipulating the signal (pan-potting, dynamic compression, phase, reverb, etc.) or by the user through early side-wall reflections to increase envelopment, soundstage width and phantom image width.
Harmonic distortion and other signal-correlated artifacts may also increase the sense of spaciousness, and a dip in the presence region will be perceived as a more distant perspective (as described by the BBC engineers).


One could say that the most accurate reproduction of a real-stereo recording should produce the sharpest images and a soundstage that is limited to the space between the two speakers' axes.



http://www.sengpielaudio.com/Visualization-ORTF-E.htm
 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
But if your channels are playing different signals (which they will be more often than not)

I think that more often than not there'll be a centered phantom image, and even when there isn't, there'll almost always be some portion of the signal common to both channels. Hard-panned phantom images are fortunately not that common.
 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
My understanding is that phase and time consistency are indeed different. Two signals can be in phase, but if one is delayed with respect to the other by exactly one cycle it is 360° out of phase, i.e. back in phase. Similarly, two signals can be a fixed, say, 45° out of phase but time-locked that way, so they stay 45° out of phase. It's easier thinking in terms of single-frequency sine waves, whereas stereo music signals are a random collection of different frequencies, of different amplitudes and with no specific phase relationship, so it's easier to think in terms of time delay rather than phase.

Just as an aside, my old Meridian DSP5000 'speakers achieved balance control not by affecting the level of the signal, but by delaying one channel with respect to the other, thus seemingly pushing back one channel. For an off-centre listener, that had the same effect. They did the same trick in the vertical direction, delaying the drive to each of the 'speakers, to correct for listening height, but I never could tell the difference.

S.
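To make the frequency-dependence concrete, a small numerical sketch (the 0.25 ms inter-channel delay is an arbitrary example):

```python
# One fixed time delay maps to a different phase angle at every frequency.
delay_s = 0.25e-3   # 0.25 ms delay between channels, arbitrary

for f in (100, 500, 1000, 2000, 4000):
    phase = (360.0 * f * delay_s) % 360.0
    print(f"{f:>5} Hz -> {phase:6.1f} deg")
# 9 deg at 100 Hz; at 4 kHz the delay is exactly one cycle, so the phase
# wraps back to 0 deg - i.e. "back in phase".
```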

This video is very illuminating:

 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
That's absolutely right. The "good" recording studios use high quality audio equipment, including speakers such as Genelec, JBL, Dynaudio, Neumann... Some of the control rooms in these studios are large enough to keep a reasonable distance from the speakers.

A huge majority of classical music engineers/producers use large B&Ws. ESLs are quite uncommon, probably because they require distance from the front wall, are fragile (limited max SPL) and not sold by pro audio dealers (perhaps warranty/replacement issues).

Faulkner, BIS (sometimes) use Quads, Diament doesn't do classical but uses Maggies.
 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
By all means justify a dipole speaker as a quirky effect, but in terms of using dipoles as exemplars of soundstage, they are no different from a DSP-based artificial spaciousness effect or whatever.

Is the quirky effect of a dipole conceptually that much different from the quirky effect of a wide-dispersion monopole speaker in a room with untreated early reflection zones?
 