
The quantification of stereo imaging

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,321
Location
Albany Western Australia
The problem you have here is that you need to ask what you are quantifying. Stereo image is a complete illusion. It's fabricated by the choices of the recording engineer, producer and artist. Microphones simply don't capture sound in the way we hear it in a real space. They don't do the processing our brains do, so you will never hear the real positioning of instruments as you would when you are there in person.

Even simple two-mic positioning and spacing can lead to quite different effects. Here are three files of me walking around a pair of stereo mics in different configurations: ORTF, SSP and XY.

https://1drv.ms/u/s!AnQ0c7fb_4zLgQx2nPioseeakE1O?e=ZaGtHE

https://1drv.ms/u/s!AnQ0c7fb_4zLgQtRnKPw21753Gvt?e=4WdaVN

https://1drv.ms/u/s!AnQ0c7fb_4zLgQ2YfRBxAtRrciVR?e=ZLahzJ


Most studio recordings are mixed and panned. The producer places the instrument where they want it and can add effects to give the illusion of more of a 3D positioning.

Beyond this, the acoustics of your room and the dispersion characteristics of your speakers play a role.
 

Sgt. Ear Ache

Major Contributor
Joined
Jun 18, 2019
Messages
1,895
Likes
4,162
Location
Winnipeg Canada
I'm in total agreement with several others in this thread that imaging and soundstage are fundamentally unquantifiable, because they result from a confluence of things all coming together at the same time, and what works for one recording won't for another. It is an illusion. Good speaker positioning and room acoustics, as well as good EQ that allows the recording to be transmitted to your ears as accurately as possible, can help, but there still isn't a magic-bullet mathematical solution.

With my own setup I have a sweet spot that is literally less than a cubic foot in which I get a really nice tight image. Sitting in my listening spot, I can move my head back and forth from that magic spot and hear the image focusing as I do. My go-to test track is Bubbles by Yosi Horikawa, and with that recording I not only get very solid 3D imaging, where I can hear elements that are close to me and farther away, but also a very strong soundstage extending beyond the left and right positions of my speakers. With my eyes closed, the illusion that elements in the recording are originating 4 or 5 feet outside the physical footprint of my speakers is incredibly strong. It's really fun to listen to.

However, other recordings don't achieve anything like that. The sweet spot is still there, but mostly all I get is a really strong sense that the singer is in the room, along with just some very good stereo imaging.
 

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,321
Location
Albany Western Australia
The mixing engineer is not often at the scene of the initial recording, so he will try to transcribe, embellish, or even modify the initial sound scene with spatialization, selection of the frequencies used to place each instrument in the audio spectrum, reverbs, delays, compression, and other more sophisticated processing.
Some mixes are very impressive, like the famous "Perfect Sense, Part I" from Roger Waters' Amused to Death.
Many of them have nothing to do with the reality of the initial sound recording, and you end up with a drum set that is too big or a saxophone that is too big...
Sometimes we are in the circle of confusion, or at least more in the realm of illusionism than reality. What do we expect from high fidelity? The pleasure of an artificial 3D sound stage, or the raw reality of a sound recording?



Sting used Q Sound too: The Soul Cages, "All This Time". The guitar pops out right next to your left ear.
 

Sgt. Ear Ache

Major Contributor
Joined
Jun 18, 2019
Messages
1,895
Likes
4,162
Location
Winnipeg Canada
I'll say this though: when it does come together just right, it's a pretty profound experience! lol. It's similar to when you stare at one of those stereoscopic 3D posters and get your eyes focused just right, so that the hidden image appears in sharp clarity. It's euphoric! I think it causes a little dopamine release in the brain. I listen to Bubbles probably once a day just to get that... :D
 

Putter

Senior Member
Forum Donor
Joined
Sep 23, 2019
Messages
498
Likes
779
Location
Albany, NY USA
Great thread. When the 3D is happening the sound projects in thick layers way into the room and the speakers disappear. Magnificently cool thing about stereo imaging when it's happening. Jaw dropping coolness. One of the reasons I love electronic music that's crafted with spacey 3D effects. It sounds so damn awesome.

Easy test of if/when it's working: can you close your eyes and find the speakers in your mind's eye? IME it's often all too easy to locate the speakers, but it's gratifying to find a system that can make them disappear, and to know it's the program material at fault when it doesn't. I'm convinced many systems can't, and it's a frustrating obstacle when I can't make something I've assembled do it. Maddening, with so many variables to sort. One of which (thanks to ASR) is wondering if a crucial component is crappy, failing, or noisy enough to ruin the resolution of the sound.

My experience is that locating speakers has to do with resonances in the speaker, i.e. the speaker calls attention to itself when it vibrates in sympathy with the music. The more vibration-free, which is to say inert, it is, the more it will 'disappear'.

As far as imaging goes, you need to separate it into subsets.

There's the live recording, or live-in-the-studio recording, with minimal processing, where mic placement and mixing are likely the most important factors.

The other kind is the more conventional recording of separate tracks, which is made to appear as a 3D, or at least 2D, image through various production tricks. It's here that Floyd Toole's 'circle of confusion' starts, with a lack of standardization.
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
It’s measurable, by measuring the distances from the speakers to the rear and side walls as well as to the listening position.
Here’s a calculator that uses the Cardas method of speaker placement:
Speakers placed along the short wall
Speakers placed along the long wall
It's very helpful that there are guidelines for how to achieve a strong stereo image, but it's not quite the same as being able to measure whether or not the intended result has been achieved. Ideally we could have empirical proof beyond what we can hear.
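
For what it's worth, the arithmetic behind calculators like that is trivial to reproduce. A rough sketch, assuming the commonly quoted Cardas golden-ratio factors for speakers along the short wall (0.276 and 0.447 of the room width); treat the constants as an assumption and verify against the linked pages before relying on them:

```python
def cardas_short_wall(room_width_m: float) -> dict:
    """Suggested woofer-centre distances (metres) for speakers along the short wall,
    using the commonly quoted Cardas golden-ratio factors (assumption - verify
    against the linked calculator)."""
    return {
        "to_side_wall_m": round(0.276 * room_width_m, 3),
        "to_front_wall_m": round(0.447 * room_width_m, 3),
    }

print(cardas_short_wall(4.0))   # a 4 m wide room -> {'to_side_wall_m': 1.104, 'to_front_wall_m': 1.788}
```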
 

McFly

Addicted to Fun and Learning
Joined
Mar 12, 2019
Messages
907
Likes
1,880
Location
NZ
Also, in the recently discussed JA interview, he mentions playing pink noise on both channels and checking whether you get a very thin, pinpoint location in the middle, with no sense that it is coming from the sides.

Agreed on the pink noise. There is a test tones/calibration CD on Spotify I use often which has pink noise both in and out of phase (and many more tones etc.), useful for quick and rough setup checks that the speakers are wired in the correct polarity.
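
For anyone without that CD handy, the same polarity check can be knocked together in a few lines. A rough sketch, assuming numpy and scipy are available; it writes one in-phase and one out-of-phase stereo pink-noise file:

```python
# Generate a stereo pink-noise polarity-check pair like the in-phase /
# out-of-phase tracks mentioned above. Assumes numpy + scipy.
import numpy as np
from scipy.io import wavfile

fs, seconds = 48000, 5
n = fs * seconds

def pink_noise(n, fs):
    # Shape white noise to a 1/f power spectrum (1/sqrt(f) amplitude).
    spectrum = np.fft.rfft(np.random.randn(n))
    f = np.fft.rfftfreq(n, 1 / fs)
    f[0] = f[1]                                  # avoid divide-by-zero at DC
    spectrum /= np.sqrt(f)
    x = np.fft.irfft(spectrum, n)
    return x / np.max(np.abs(x)) * 0.5           # normalise with headroom

noise = pink_noise(n, fs)
in_phase = np.column_stack([noise, noise])       # correct polarity: tight centre image
out_phase = np.column_stack([noise, -noise])     # right channel inverted: diffuse, phasey
wavfile.write("pink_in_phase.wav", fs, in_phase.astype(np.float32))
wavfile.write("pink_out_of_phase.wav", fs, out_phase.astype(np.float32))
```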
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
Imaging is much more about the room and positioning than the speakers. I can't see any point in trying to quantify it.
Besides this, some people prefer a more diffuse and wide soundfield, which often also gives a wider listening area (like myself; I'm not a mixing/mastering engineer).
I'm fully aligned that imaging is achieved through the combination of the equipment, positioning, and room. I also agree some listeners may prefer a tighter or more diffuse image; the different perspectives on reflection vs. absorption vs. diffusion may relate to this. However, knowing exactly what you have can help with achieving a preferred outcome, and with developing guidelines for what tight vs. diffuse imaging actually is.

Like all the other qualities of a sound system discussed here, I don't see why measuring imaging or the localization ability of a system should be dismissed out of hand without evidence to support that claim.
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
I was disabused of the value of the general applicability of soundstage when, many years ago, I realised that most recordings are a cut-and-paste of individual artists/instruments in a spatial relationship unknown (to me). Isolation booths, different studios, various cities and countries - different times and spaces. At root, an assortment of isolated/individual 'tracks' manipulated into a 'performance' at a mixing/mastering desk.
I'd like to take this discussion in the direction of localization instead of soundstage, as I think the latter is probably too controversial to adequately define. To your point, a recording does specifically set out to localize the audio across an x-axis relative to the listener. When a system doesn't preserve or communicate that localization, I would propose it has been less successful in achieving accurate reproduction. As an extreme example, a system playing in mono should score very low in its ability to articulate localized sound. Similarly, aiming the speakers away so you hear very little direct sound can result in room-filling sound with poor localization or stereo image; that should also score low if this were a measurable attribute at the listening position.
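
One existing number that behaves the way I'm describing is the interaural cross-correlation coefficient (IACC) borrowed from room acoustics: a mono signal straight ahead measures close to 1, while wide or diffuse material measures lower. A rough sketch of computing it from a two-channel capture made at the ear positions (the file name is just an illustration):

```python
# IACC-style score from a two-channel recording made at the ear positions.
# IACC is the peak of the normalised interaural cross-correlation within +/-1 ms.
import numpy as np
import soundfile as sf

x, fs = sf.read("binaural_capture.wav")      # two channels captured at the ear positions
l = (x[:, 0] - x[:, 0].mean()) / (x[:, 0].std() + 1e-12)
r = (x[:, 1] - x[:, 1].mean()) / (x[:, 1].std() + 1e-12)

max_lag = int(0.001 * fs)                    # search +/- 1 ms, the usual IACC window
corr = []
for k in range(-max_lag, max_lag + 1):
    a = l[max(0, -k): len(l) - max(0, k)]
    b = r[max(0, k): len(r) - max(0, -k)]
    corr.append(np.dot(a, b) / len(l))

iacc = float(np.max(np.abs(corr)))
print(f"IACC: {iacc:.3f}  (~1 = mono-like, no lateral information; lower = wider/more diffuse)")
```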
 

Frank Dernie

Master Contributor
Forum Donor
Joined
Mar 24, 2016
Messages
6,454
Likes
15,810
Location
Oxfordshire
One of the most rewarding attributes of a stereo system that is set up well in a room is the presence of clear stereo imaging. For me, this is the sensation that sound is emanating directly from the space between the speakers, but others go as far as to describe a 3D sound stage that can extend in front of or behind the speakers.

I've listened to high-quality speakers in poor rooms or positioned poorly, and that sense of imaging is lost; it seems more like a wall of sound or, even worse, you sense that the sound is emanating from the speakers. The same happens if you drift too far from the sweet spot in a well-constructed system too.

Given the importance of imaging (at least to me), are there methods for measuring by microphone(s) how well a system is performing in this regard? If not, are there ideas for how this could be achieved? I'd love to be able to measurably optimize this attribute when dialing in a room.
IME it is every bit as much a characteristic of the recording as of the system.
I like to use Q Sound recordings - Roger Waters' Amused to Death has amazing 3D sound if the system is set up OK, and here I mean speaker and furniture position, not electronics.
The only time it was badly influenced by my equipment was when a firmware update for my Devialet dual-mono amps caused some sort of inter-channel latency effect or something, which was quickly corrected - presumably it didn't show up in measurements or the firmware would not have been issued, but it was obvious on Amused to Death.
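
That sort of inter-channel latency is at least easy to check for: capture both channels simultaneously while sending the same test signal to each, and look at where the cross-correlation between them peaks. A rough sketch, with a hypothetical capture file:

```python
# Estimate inter-channel skew from a simultaneous two-channel capture of the
# same test signal sent to both channels (file name is hypothetical).
import numpy as np
import soundfile as sf
from scipy.signal import correlate, correlation_lags

x, fs = sf.read("loopback_capture.wav")      # shape (samples, 2)
left, right = x[:, 0], x[:, 1]

c = correlate(left, right, mode="full")
lags = correlation_lags(len(left), len(right), mode="full")
skew = lags[np.argmax(np.abs(c))]            # positive = left delayed relative to right
                                             # (per scipy's correlation convention)
print(f"Inter-channel skew: {skew} samples ({1000 * skew / fs:.3f} ms)")
```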
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
I'm going to return to my earlier comment on how auditory localization works:
http://acousticslab.org/psychoacoustics/PMFiles/Module07a.htm
We localize sound based on phase (period-related time), intensity level, and spectral differences between the portions of the sound arriving at each ear. Interaural differences in arrival time (phase) and intensity constitute the most important sound localization cues. The theory outlining their contribution to sound localization judgments is referred to as the duplex theory of sound localization and was introduced by Lord Rayleigh (1877-1907).
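
For reference, the two duplex-theory cues can be written down compactly. The ITD expression below is Woodworth's spherical-head approximation, a common textbook model rather than anything specific to the linked article; a is a typical head radius of about 0.0875 m and c about 343 m/s:

```latex
% Interaural time difference for a source at azimuth theta (Woodworth's
% spherical-head approximation), and interaural level difference from head
% shadowing, expressed as a near-ear / far-ear pressure ratio:
\[
\mathrm{ITD}(\theta) \approx \frac{a}{c}\left(\theta + \sin\theta\right),
\qquad
\mathrm{ILD}(f,\theta) = 20\log_{10}\!\frac{p_{\mathrm{near}}(f,\theta)}{p_{\mathrm{far}}(f,\theta)}\ \mathrm{dB}.
\]
% ITD/IPD cues dominate below roughly 1.5 kHz, where the wavelength exceeds the
% head diameter; ILD cues dominate above, where head shadowing becomes significant.
```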

One proposed measurement would be to look at the SPL difference between the left ear and right ear location with sound coming from a single speaker. For high frequency content, this should be fundamental to the ability to localize sound, and if the SPL is identical at both ears the sound cannot easily be localized. I'm not suggesting this differential should necessarily be maximized, but this is a good starting point for investigation.

How phase should vary from ear to ear to achieve localization is less obvious to me (I'm only an enthusiast). The article suggests that the strongest separation in lateralization (i.e. virtual location inside the head) occurs for IPDs of 1/4 of a cycle. So for a test tone, perhaps you could measure the phase differential at the ear locations at, say, 300 Hz from a single speaker. The closer to a 1/4-cycle difference, the stronger the localization; the closer to zero phase difference, the weaker it would be.
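
Something like the following is what I have in mind for the analysis side. A rough sketch, assuming a simultaneous two-channel capture made at the left- and right-ear positions while a single speaker plays; the file name, the 2-8 kHz band and the 300 Hz tone are illustrative choices, not a standard:

```python
# Level difference in a high band (where head shadowing matters) and phase
# difference at a 300 Hz test tone, from one simultaneous two-channel capture.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfilt

def band_spl_db(x, fs, lo, hi):
    """RMS level in dB of x band-limited to [lo, hi] Hz."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    y = sosfilt(sos, x)
    return 20 * np.log10(np.sqrt(np.mean(y ** 2)) + 1e-12)

def phase_at(x, fs, f0):
    """Phase (radians) of the f0 component, via a single DFT bin."""
    t = np.arange(len(x)) / fs
    return np.angle(np.sum(x * np.exp(-2j * np.pi * f0 * t)))

ears, fs = sf.read("ear_capture.wav")        # simultaneous 2-channel capture
left, right = ears[:, 0], ears[:, 1]

ild_db = band_spl_db(left, fs, 2000, 8000) - band_spl_db(right, fs, 2000, 8000)
ipd = np.angle(np.exp(1j * (phase_at(left, fs, 300) - phase_at(right, fs, 300))))

print(f"High-band level difference (ILD): {ild_db:+.2f} dB")
print(f"Phase difference at 300 Hz: {ipd / (2 * np.pi):+.3f} cycles "
      f"(strongest lateralisation expected near +/-0.25)")
```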

Does anyone know if typical microphones like a UMIK-1 are sensitive enough to measure this? Does anyone have ideas on how to actually do these tests?

[Attached image: dIntensity.png]
 

audiophile

Active Member
Joined
Oct 7, 2019
Messages
177
Likes
140
Does anyone know if typical microphones like a UMIK-1 are sensitive enough to measure this?
UMIK is great but you’ll probably need an anechoic chamber to measure small SPL variations accurately. In my room it may show up to 0.5 dB difference at some frequencies between two consecutive runs without changing anything.
 

bluefuzz

Major Contributor
Joined
Jan 17, 2020
Messages
1,071
Likes
1,835
A couple of years ago I built myself a pair of Linkwitz LXminis. As many who have heard these speakers will testify, they have a big, airy soundstage – deep, wide and tall – very much independent of the speakers themselves. However, in my room at least, the soundstage has never been what I would call 'pinpoint accurate'. More a slightly diffuse blob of sound hovering between the speakers. Vocals especially have the character of a huge, disembodied, Cheshire Cat-like mouth in the center of the soundstage. Quite appealing but not especially realistic.

So, just before Christmas, I bought myself a NAD C658 streamer/DAC thingy which, amongst other features, has Dirac Live built in. Running Dirac on the LXminis resulted in a quite astonishing transformation of the soundstage. The diffuse blob became the proverbially 'holographic' presentation we all desire. Quite palpably like putting your glasses on. On good recordings, instruments are now precisely located within the 'blob' and the Cheshire Cat has shrunk to a much more believable size.

I've since experimented with Dirac on a pair of heavily modded vintage JBL L100s (!) – rebuilt with a symmetric driver layout, solid bracing, internal damping and, not least, a proper crossover using good-quality modern components. They already sounded very good (and in some respects have a more precise soundstage than the LXminis), but you probably couldn't find a more 'different' pair of speakers from the LXminis, yet with a bit of tweaking in Dirac I can reproduce a surprisingly similar soundstage.

Whatever Dirac is doing, it certainly seems to be a fairly reliable method of creating a realistic and believable soundstage in my room with wildly different speakers.
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
UMIK is great but you’ll probably need an anechoic chamber to measure small SPL variations accurately. In my room it may show up to 0.5 dB difference at some frequencies between two consecutive runs without changing anything.
That’s a great point, and potentially an issue. Perhaps if simultaneous multi-channel measurements were made, you could look at the correlation over time and see an average dB and phase differential that was still meaningful.
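
A rough sketch of that idea: frame a simultaneous two-channel capture into short blocks, compute the level difference per block, and report the spread as well as the mean, so run-to-run noise shows up as variance rather than hiding inside a single average (file name illustrative):

```python
# Per-frame left/right level difference from one simultaneous two-channel capture.
import numpy as np
import soundfile as sf

x, fs = sf.read("ear_capture.wav")           # simultaneous capture, shape (samples, 2)
frame = int(0.1 * fs)                        # 100 ms frames
n_frames = len(x) // frame

diffs_db = []
for i in range(n_frames):
    seg = x[i * frame:(i + 1) * frame]
    rms = np.sqrt(np.mean(seg ** 2, axis=0)) + 1e-12   # per-channel RMS
    diffs_db.append(20 * np.log10(rms[0] / rms[1]))

diffs_db = np.array(diffs_db)
print(f"Left-right level difference: {diffs_db.mean():+.2f} dB "
      f"(std {diffs_db.std():.2f} dB over {n_frames} frames)")
```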
 

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,321
Location
Albany Western Australia

One proposed measurement would be to look at the SPL difference between the left ear and right ear location with sound coming from a single speaker. For high frequency content, this should be fundamental to the ability to localize sound, and if the SPL is identical at both ears the sound cannot easily be localized. I'm not suggesting this differential should necessarily be maximized, but this is a good starting point for investigation.

Massive problem with that. If you have ever performed any acoustic measurements, you will know that there are massive variations created by just moving the mic a few mm. Our brain's processing of the various acoustic positional cues is infinitely more sophisticated than anything you can capture by measuring with a mic.
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
Massive problem with that. If you have ever performed any acoustic measurements, you will know that there are massive variations created by just moving the mic a few mm. Our brain's processing of the various acoustic positional cues is infinitely more sophisticated than anything you can capture by measuring with a mic.
I'd suggest that those massive variations seen by moving the mic just a few mm are a good thing, and indicate that the measuring equipment has high sensitivity. The ear/brain system is incredibly sophisticated in terms of how it interprets sound, but I don't follow the train of logic that this can't be measured. To the contrary, amplitude and phase are known to be measurable with high fidelity, and the science points to these being the main drivers of auditory localization.

This is how I would propose to begin experimentation, and see whether or not the SPL difference is easily detectable as a first step.

Theoretically, playing a test tone from a speaker off to one side, I should see an SPL difference between the left- and right-ear locations. With the speaker placed dead center, the SPL should be the same at the two locations. If the differential from the off-center setup (60 degrees) is a significant multiple of the differential from the dead-center setup, then I might have a usable measure.

I'd prefer to do this with simultaneous dual-channel measurements using a rig like the miniDSP EARS, so I could look at the time correlation of the difference in SPL and not just the averaged values. I'm also pretty sure that simultaneous recording would be required to measure a difference in phase. But before I go buying new gear, I'll just see what's achievable by moving a UMIK-1 around a dummy head to do the SPL test. If it's successful I can consider making the additional investment needed for measuring phase as well.
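
For the capture itself, something like the sounddevice library should do simultaneous play-and-record; the device selection and channel routing below are assumptions and would need adjusting for an EARS-style two-channel rig or whatever interface is actually used:

```python
# Play a test tone and record two input channels at the same time, so the
# captured pair is sample-synchronous (needed for any phase comparison).
import numpy as np
import sounddevice as sd

fs = 48000
f0 = 300                                     # test-tone frequency
t = np.arange(int(3 * fs)) / fs
tone = 0.25 * np.sin(2 * np.pi * f0 * t)     # 3 s tone at a modest level

# sd.default.device = "EARS"                 # optional: pick the interface by name

capture = sd.playrec(tone, samplerate=fs, channels=2)
sd.wait()
np.save("ear_capture_300Hz.npy", capture)    # analyse offline, e.g. with the ILD/IPD sketch above
```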

I'll report back if I'm able to do some initial experiments this weekend.

[Attached image: Test methodology.png]


https://www.minidsp.com/products/acoustic-measurement/ears-headphone-jig
 
OP
dshreter

Addicted to Fun and Learning
Joined
Dec 31, 2019
Messages
808
Likes
1,258
A couple of years ago I built myself a pair of Linkwitz LXminis. As many who have heard these speakers will testify, they have a big, airy soundstage – deep, wide and tall – very much independent of the speakers themselves. However, in my room at least, the soundstage has never been what I would call 'pinpoint accurate'. More a slightly diffuse blob of sound hovering between the speakers. Vocals especially have the character of a huge, disembodied, Cheshire Cat-like mouth in the center of the soundstage. Quite appealing but not especially realistic.
I haven't had the opportunity to hear any of the Linkwitz designs before, and would really like the chance to experience them for myself. Your description suggests that there could be distinct attributes, and measures, for imaging intensity and focus. The large, airy soundstage suggests high-intensity or large-scale imaging, and perhaps Dirac is able to bring a greater level of focus.

For some of the ideas I've proposed here, headphones are an interesting test case. With their inside-the-head sound, they present a very small-scale but precise stereo image. Any successful method of measuring imaging should be able to quantify both of these qualities.

Thinking further on the problem, channel crosstalk from speakers to ears may not actually be a limitation. Being able to hear the phase shift presented from ear to ear by each individual speaker may actually reinforce the stereophonic image rather than degrade it.
 

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,321
Location
Albany Western Australia
I'd suggest that those massive variations seen by moving the mic just a few mm are a good thing, and indicate that the measuring equipment has high sensitivity. The ear/brain system is incredibly sophisticated in terms of how it interprets sound, but I don't follow the train of logic that this can't be measured. To the contrary, amplitude and phase are known to be measurable with high fidelity, and the science points to these being the main drivers of auditory localization.

This is how I would propose to begin experimentation, and see whether or not the SPL difference is easily detectable as a first step.

Theoretically, playing a test tone from a speaker off to one side, I should see an SPL difference between the left- and right-ear locations. With the speaker placed dead center, the SPL should be the same at the two locations. If the differential from the off-center setup (60 degrees) is a significant multiple of the differential from the dead-center setup, then I might have a usable measure.

I'd prefer to do this with simultaneous dual-channel measurements using a rig like the miniDSP EARS, so I could look at the time correlation of the difference in SPL and not just the averaged values. I'm also pretty sure that simultaneous recording would be required to measure a difference in phase. But before I go buying new gear, I'll just see what's achievable by moving a UMIK-1 around a dummy head to do the SPL test. If it's successful I can consider making the additional investment needed for measuring phase as well.

I'll report back if I'm able to do some initial experiments this weekend.


https://www.minidsp.com/products/acoustic-measurement/ears-headphone-jig

No, it just means it's extraordinarily difficult to take meaningful measurements.

You would have more success in an anechoic chamber.
 

eliash

Senior Member
Joined
May 29, 2019
Messages
410
Likes
211
Location
Bavaria, near lake Ammersee
Many of you still have a "test switch" built into your amp: it's the mono switch.
Just listen to different pieces of music in mono and evaluate where you hear the sound of each instrument and voice. Of course the monophonic image is not perfectly stable and hovers a bit.
To tell good from bad, I (for myself) came to the conclusion that at least voices (especially their upper formants) should not deviate too much from the center position, because it sounds specifically artificial to me when they suddenly pop up from a different direction.
If they do, you can work on the opposite* wall reflections, which helps to control that; then listen in stereo mode afterwards.

*Sometimes it is also the adjacent wall reflections, e.g. even small metallic surfaces such as an equipment front plate off to the side...
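
If the amp has no mono switch, the same check can be done by downmixing a track in software and listening to where each voice and instrument sits. A minimal sketch (file name illustrative):

```python
# Create a mono version of a stereo track for the mono-switch listening test.
import soundfile as sf

x, fs = sf.read("some_track.wav")            # a stereo track (file name is illustrative)
mono = 0.5 * (x[:, 0] + x[:, 1])             # simple L+R downmix
sf.write("some_track_mono.wav", mono, fs)    # listen to this and note where voices sit
```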
 