
How do we perceive “soundstage” and “imaging”?

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,273
Likes
9,789
Location
NYC
I know but just wanted to state the fact that a very convincing soundstage experience with headphones is certainly possible. IME it's more convincing than 2 channel stereo via speakers.
Granted. Unfortunately, most music is not recorded this way. That is why the ATMOS/Sony 360 formats are of interest. They would be single-format, streamable files that would play as "binaural" with the suitable app. See: https://www.stereophile.com/content/immersive-audio-aes
 

ajawamnet

Active Member
Joined
Aug 9, 2019
Messages
288
Likes
460
As to headphones, they fail to take into consideration head related transfer/impulse functions and the auris externa - see the lectures from Dr Land at Cornell below...

Also see this - https://www.jneurosci.org/content/24/17/4163

And this: https://pdfs.semanticscholar.org/0e76/923ed6c85fcdd8d9a2f269d5c7493b3c3abd.pdf
"Clearly, localization is not isolated to simply the sounds heard. Many more effects contribute to
localization than that proposed by the duplex theory. Although Wightman & Kistler have shown that a
virtual auditory space can be generated through headphone delivered stimulus, they are still lacking some
key features. The ability to accurately reproduce elevation localization may be a problem for aircraft
simulations. Other cues such as head movements and learning may also help in sound localization. For
commercial applications where localization does not need such accuracy, an average HRTF can be
created to externalize sounds."


Also see this https://pages.stolaf.edu/wp-content/uploads/sites/406/2014/07/Westerbert-eg-al-2015-Localization.pdf
headcues.png

There's been many a patent that discusses trying to get headphones to accurately mimic human hearing and its interaction with the environment.

In addition - see this https://core.ac.uk/download/pdf/33427652.pdf

I wrote this a while ago on the Hoffman forum:
Typically, bass is pretty much omnidirectional below about 80-100 Hz - the entire structure begins moving.

During studio construction one of the things we do with infinite baffle/soffit mounting designs is to isolate the cabinets from the structure to minimize early energy transfer - this keeps the structure from transmitting bass faster than air to the mix location. Sound travels faster thru solids - recall the ol' indian ear-on-the-rail thing?

Why does sound travel faster in solids than in liquids, and faster in liquids than in gases (air)?

One thing you want to avoid is the bass from the speaker coupling to the building structure and arriving at your ear sooner than the sound from the speakers. This can cause comb filtering, where you lose certain frequencies due to cancellation.
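As a rough illustration of that cancellation: when the same signal arrives twice, separated by a delay, the sum has notches at odd multiples of 1/(2 x delay). The sketch below is illustrative only. The 3 m path and the 3000 m/s structure-borne speed are assumed values, not measurements from any real room, and the function names are hypothetical. It also assumes both arrivals have equal level, which gives full notches; in practice the structure-borne arrival is usually weaker, so the notches are only partial.

```python
import math

SPEED_AIR = 343.0  # m/s, approximate speed of sound in air at room temperature


def arrival_lead(path_m, speed_structure):
    """Seconds by which a structure-borne arrival leads the airborne one.

    Simplifying assumption: both travel the same path length.
    """
    return path_m / SPEED_AIR - path_m / speed_structure


def comb_notches(delay_s, f_max):
    """Notch frequencies (Hz) up to f_max for two equal-level arrivals
    separated by delay_s: f = (2k + 1) / (2 * delay)."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > f_max:
            break
        notches.append(f)
        k += 1
    return notches


def summed_gain(freq_hz, delay_s):
    """Magnitude of 1 + exp(-j*2*pi*f*delay): 2 when in phase, 0 at a notch."""
    return abs(2.0 * math.cos(math.pi * freq_hz * delay_s))


# Hypothetical numbers: 3 m path, 3000 m/s assumed for the building structure.
lead = arrival_lead(3.0, 3000.0)
notches = comb_notches(lead, 500.0)
```

With these assumed numbers the early arrival leads by roughly 7.7 ms, which puts the first notch in the mid-60 Hz range, right where the bass lives.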

Google recording studio monitor isolation and note the tons of isolation devices sold for this reason...

Here's a doghouse design for UREI 813s I did a while ago:
hex6.jpg


As to mixing - I rarely use pan pots for directional info in my mixes. I use various time-based methods to try and simulate the precedence effect as well as directional cues, and to simulate impulse responses / head related transfer functions (HRTF).
Head-related transfer function - Wikipedia
"A pair of HRTFs for two ears can be used to synthesize a binaural sound that seems to come from a particular point in space."

One thing it does is to really open up the mono field, since instruments are now localized and can be sized depending on the early reflections I set up in something like a convolution reverb.
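One very simple time-based approach can be sketched as follows: keep both channels at the same level and delay one of them by a fraction of a millisecond, so the image pulls toward the channel that arrives first. This is only an illustration of the general idea, not the workflow actually used in the mixes mentioned here; the function name `delay_pan` and the 0.5 ms value are assumptions.

```python
def delay_pan(mono, sample_rate, delay_ms, lead="left"):
    """Place a mono signal using an inter-channel delay only (no level change).

    The channel named by `lead` gets the signal first; the other channel gets
    an identical copy delayed by delay_ms. Both outputs have the same length.
    """
    delay = int(round(sample_rate * delay_ms / 1000.0))
    early = list(mono) + [0.0] * delay   # undelayed copy, padded at the end
    late = [0.0] * delay + list(mono)    # delayed copy, padded at the start
    if lead == "left":
        return early, late
    return late, early


# Hypothetical example: a short click, pulled left with a 0.5 ms delay at 48 kHz.
left, right = delay_pan([1.0, 0.5, 0.25], 48000, 0.5)
```

Delays up to roughly the maximum interaural time difference (well under 1 ms) shift the image; longer delays move into precedence-effect territory, where the first arrival dominates localization.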

A great write up here on the Convolvotron:
HRTF-Based Systems – The CIPIC Interface Laboratory Home Page

Part of a great resource for modern sound localization efforts for HMI audio:
The CIPIC Interface Laboratory Home Page – Electrical and Computer Engineering

As to low frequency information:
From Sound localization - Wikipedia :
Evaluation for low frequencies
For frequencies below 800 Hz, the dimensions of the head (ear distance 21.5 cm, corresponding to an interaural time delay of 625 µs) are smaller than the half wavelength of the sound waves. So the auditory system can determine phase delays between both ears without confusion. Interaural level differences are very low in this frequency range, especially below about 200 Hz, so a precise evaluation of the input direction is nearly impossible on the basis of level differences alone. As the frequency drops below 80 Hz it becomes difficult or impossible to use either time difference or level difference to determine a sound's lateral source, because the phase difference between the ears becomes too small for a directional evaluation.[11]
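The numbers in that passage can be checked with a few lines of arithmetic. The sketch below assumes a speed of sound of 343 m/s (the quoted 625 µs implies a very slightly different value) and treats the path between the ears as a straight line of 21.5 cm; the function names are made up for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def max_itd(ear_distance_m):
    """Largest interaural time difference (s) for a sound from directly
    to one side, using a straight-line path between the ears."""
    return ear_distance_m / SPEED_OF_SOUND


def half_wavelength(freq_hz):
    """Half the wavelength (m) of a tone at freq_hz."""
    return SPEED_OF_SOUND / (2.0 * freq_hz)


def max_phase_difference_deg(freq_hz, ear_distance_m):
    """Largest interaural phase difference (degrees) at freq_hz."""
    return 360.0 * freq_hz * max_itd(ear_distance_m)


itd_us = max_itd(0.215) * 1e6                      # about 627 microseconds
half_wave_800 = half_wavelength(800.0)             # about 0.214 m
phase_80 = max_phase_difference_deg(80.0, 0.215)   # about 18 degrees
```

At 800 Hz the half wavelength is almost exactly the ear distance, which is why phase becomes ambiguous above that point. At 80 Hz the largest possible phase difference is only about 18 degrees, consistent with localization getting difficult down there.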

Interesting info here from Dr. Bruce Land on sound localization - end of #25 and into #26


Note the comment concerning using a bus on a DAW to mimic HRTF. Also note his reference to the CIPIC database.

This prevents the "ear pull" associated with unbalanced RMS levels across the ears. As Dr. Land mentions, your ear localizes based on time as well as amplitude. The interaural time difference ITD (Interaural time difference - Wikipedia ) is as critical as Interaural Level Differences (ILD). As he states, humans learn early on to derive directional cues from impulse responses at the two ears.
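To put a number on "unbalanced RMS levels", one can compare the RMS of the two channels and express the ratio in dB. This is a minimal sketch with hypothetical function names; it measures a plain level difference between channels over a whole buffer, which is only a crude stand-in for the frequency-dependent ILD the ear actually uses.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(left, right):
    """Left-minus-right RMS level difference in dB (positive = left louder)."""
    l, r = rms(left), rms(right)
    if l == 0.0 or r == 0.0:
        raise ValueError("at least one channel is silent")
    return 20.0 * math.log10(l / r)


# Hypothetical test signal: same 1 kHz tone, right channel at half amplitude.
tone = [math.sin(2.0 * math.pi * 1000.0 * i / 48000.0) for i in range(4800)]
diff_db = level_difference_db(tone, [0.5 * s for s in tone])  # about +6 dB
```

A purely delay-based placement leaves this figure near 0 dB, which is the point being made about avoiding "ear pull".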

One thing that has to be said is the significant differences in head related transfer function between various people - but note the chart where he mentions the one person with a -48dB notch at 6kHz - the curves up to around 5kHz are fairly close and in the A weighted range...

Another great lecture on sound localization from MIT:
20. Sound localization 1: Psychophysics and neural circuits

I used various time-based techniques on this - a remix of Whole Lotta Love from the original multitracks:
Remix of WLL

Watch/listen to the Comparison video...

Another technique is to use double tracking and artificial double tracking (ADT), which will spread the instrument/spectra across the panorama (Automatic double tracking - Wikipedia), tho this can lead to mono compatibility issues... Some effects that do this use a bunch of bandpass filters where you can set the delays for each band. Note what George Martin and Geoff Emerick mention using older analog style, tape-based ADT during the Anthology sessions. Again, these techniques reduce the unbalanced feeling across the head but still open up the stereo field to allow all the instruments to sit in the stereo image.
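The mono compatibility issue can be shown with a toy version of ADT: the dry signal in one channel, a delayed copy in the other. The sketch below uses a fixed 20 ms delay as an assumed value (real ADT also varies the delay over time, which this ignores), and all function names are hypothetical. Summing the two channels to mono cancels any frequency whose half period fits the delay an odd number of times.

```python
import math


def sine(freq_hz, sample_rate, n):
    """n samples of a unit-amplitude sine tone."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def adt(mono, delay_samples):
    """Toy ADT: dry signal left, fixed-delay copy right (no modulation)."""
    left = list(mono) + [0.0] * delay_samples
    right = [0.0] * delay_samples + list(mono)
    return left, right


def mono_sum(left, right):
    """Fold a stereo pair down to mono."""
    return [0.5 * (l + r) for l, r in zip(left, right)]


def peak(samples):
    return max(abs(s) for s in samples)


SR = 48000
DELAY = 960   # 20 ms at 48 kHz (assumed value)
N = 9600

# 1025 Hz is an odd multiple of 25 Hz = 1 / (2 * 20 ms): cancels in mono.
folded_1025 = mono_sum(*adt(sine(1025.0, SR, N), DELAY))
# 1000 Hz is a whole number of periods in 20 ms: survives in mono.
folded_1000 = mono_sum(*adt(sine(1000.0, SR, N), DELAY))

# Only look at the region where both copies are sounding.
cancelled = peak(folded_1025[DELAY:N])
preserved = peak(folded_1000[DELAY:N])
```

In stereo both tones sound fine; folded to mono, one vanishes and the other does not, which is the comb filtering that makes delay-based widening risky for mono playback.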

Double tracking - both natural and ADT - is prevalent in a lot of the metal mixes - for instance:
Remix of the Curse of the Twisted Tower
Note that the opening of the first clips was locked into what it is since it didn't exist on the multitracks and was flown in on the original release. But on the other samples compare the mixes and notice they don't sound as disjointed across the panorama as do the older, pan-only mixes. One of the band members commented on how he was able to hear his solos better.
 
Joined
Sep 18, 2019
Messages
87
Likes
53
Location
Italy
I would like to ask a sub question in this regard. Let's say we have a mono recording, for simplicity. Will it ever be possible for headphones in the future to make sure that the sound does not form inside the head but in the front with normal recordings? In nature there are very few sounds that we perceive inside our head. Except when we have phlegm in the ears or eat or pat on the forehead
 

thefsb

Addicted to Fun and Learning
Joined
Nov 2, 2019
Messages
796
Likes
657
So, engineers, scientists, and psychoacoustic specialists:
How do we perceive the spaciousness of a recording and how do companies, like headphone or in-ear makers, help create that sense of space?
On the one hand “soundstage” and “imaging” are to me among the many words in the audiophile lexicon that I just cannot get a grip on.

"How do we perceive the spaciousness of a recording..."? sounds more reasonable but don't we first have to make some assumptions about the recording? Like what it is of and how it was recorded?

Most of the recordings I have are stereo and most are not recordings of real acoustic events. Either way I have no expectations of perceiving spaciousness from them and I don't expect much to be gained from trying.
 

ajawamnet

Active Member
Joined
Aug 9, 2019
Messages
288
Likes
460
I would like to ask a sub question in this regard. Let's say we have a mono recording, for simplicity. Will it ever be possible for headphones in the future to make sure that the sound does not form inside the head but in the front with normal recordings? In nature there are very few sounds that we perceive inside our head. Except when we have phlegm in the ears or eat or pat on the forehead

There was an AES paper written about this a while ago:
http://www.aes.org/e-lib/browse.cfm?elib=2591

A paper on out of head Headphone Sound Externalization
http://research.spa.aalto.fi/publications/theses/liitola_mst.pdf

As I mentioned, there's a lot of research and patents that deal with various methods to externalize headphone audio. None that I know of are completely accurate in their representation of sound localization.

http://www.y2lab.com/en/project/3d_headphone_technology/

Video from that page:
Yamaha 3D Headphone Technology : Post-production Demo

https://ieeexplore.ieee.org/document/1461788
 

ajawamnet

Active Member
Joined
Aug 9, 2019
Messages
288
Likes
460
I will also add that I was doing a recording for my kids' middle school - their string ensemble. I had just made an X-Y mic contraption that I wanted to test out. I was using a pair of matched Rode small diaphragm condensers...

I recall putting on the headphones after setting it up - and looking to my right rear to see who was talking behind me... needless to say there was no one there - I was hearing the teacher talking in front of the podium. One of the more bizarre aural experiences I've ever had (and I did sound for stuff like the Pgh Symphony Orchestra at their outdoor gigs - we used a silly expensive Telefunken Stereo mic)

When I got home and looked at it on the Lissajous display, I was surprised at how tight the pattern was - not like the typical scrambled eggs pattern you see in pop or other types of music, where you're seeing phase differences of 90 degrees and upwards between the left and right. Just really tight. And the aural cues were amazing.
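What a Lissajous display shows visually can be summarized as a single number, the inter-channel correlation, which is what a phase correlation meter reads. The sketch below is a minimal version with hypothetical function names: +1 corresponds to a tight diagonal line (channels in phase), 0 to the round "scrambled eggs" blob (around 90 degrees apart), and -1 to the opposite diagonal (out of phase).

```python
import math


def correlation(left, right):
    """Normalized inter-channel correlation in the range -1 to +1."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    if den == 0.0:
        raise ValueError("at least one channel is silent")
    return num / den


def sine(freq_hz, sample_rate, n, phase_deg=0.0):
    """n samples of a unit-amplitude sine with an optional phase offset."""
    phase = math.radians(phase_deg)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate + phase)
            for i in range(n)]


SR = 48000
N = SR // 10  # 4800 samples = exactly 100 cycles of a 1 kHz tone
tone = sine(1000.0, SR, N)

in_phase = correlation(tone, tone)                          # close to +1
quadrature = correlation(tone, sine(1000.0, SR, N, 90.0))   # close to 0
inverted = correlation(tone, [-s for s in tone])            # close to -1
```

A coincident X-Y pair has essentially no time-of-arrival difference between capsules, so its channels differ mainly in level and stay highly correlated, which fits the tight pattern described here.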

And the teacher hated the recording - it brought out every nuance - a lot that were not so good.
 

Darkweb

Active Member
Joined
Dec 3, 2019
Messages
113
Likes
104
On the one hand “soundstage” and “imaging” are to me among the many words in the audiophile lexicon that I just cannot get a grip on.

"How do we perceive the spaciousness of a recording..."? sounds more reasonable but don't we first have to make some assumptions about the recording? Like what it is of and how it was recorded?

Most of the recordings I have are stereo and most are not recordings of real acoustic events. Either way I have no expectations of perceiving spaciousness from them and I don't expect much to be gained from trying.
Do you have a 2 channel system? Frankly I find your post baffling. Recording engineers have numerous tricks to place mic’d instruments or synths all over the place three dimensionally in a mix.

If you aren’t hearing a soundstage and sounds placed within it try pulling your speakers a good 2 or 3 feet off the front wall and listen again.

Only the worst of the worst stereo recordings and gear will be flat with sounds pinned at the speaker baffle.
 

thefsb

Addicted to Fun and Learning
Joined
Nov 2, 2019
Messages
796
Likes
657
Do you have a 2 channel system? Frankly I find your post baffling. Recording engineers have numerous tricks to place mic’d instruments or synths all over the place three dimensionally in a mix.

If you aren’t hearing a soundstage and sounds placed within it try pulling your speakers a good 2 or 3 feet off the front wall and listen again.

Only the worst of the worst stereo recordings and gear will be flat with sounds pinned at the speaker baffle.
I can perceive sounds as coming from somewhere between the loudspeakers. It's not useless but such limited one-dimensional positioning hardly amounts to a perception of spaciousness, such as the experience I have on the street, in a supermarket, in the park, at a basketball game.

I still don't know what it means to hear a soundstage.
 

majingotan

Major Contributor
Forum Donor
Joined
Feb 13, 2018
Messages
1,511
Likes
1,781
Location
Laguna, Philippines
Sorry I should have answered this before. Yes, because virtually all of the recordings I have are 2 channel.

He's not referring to the music files/recordings. He's referring to your setup: 2 channel = 2 speakers, one on the left and another one on the right.


I find that closing my eyes simulates a sound stage, similar to sitting in the front row with the band members singing as if each has their own sense of space. Some engineers are so good at mixing and mastering that recordings can easily show "depth", allowing you to pinpoint exactly where the mics were placed during the recording. A lot of Netflix shows out there have "lifelike" imaging and sound stage as well.
 

thefsb

Addicted to Fun and Learning
Joined
Nov 2, 2019
Messages
796
Likes
657
He's not referring to the music files/recordings. He's referring to your setup: 2 channel = 2 speakers, one on the left and another one on the right.
I answered yes meaning yes I have a two-channel system.

I find that closing my eyes simulates a sound stage, similar to sitting in the front row with the band members singing as if each has their own sense of space.
I don't. I've no recollection of experiencing anything but stereo from stereo.
 

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,273
Likes
9,789
Location
NYC
I still don't know what it means to hear a soundstage.
OK. In an effort to understand the context of your query:
First, name your 2-3 favorite recordings, please. Then answer these questions:
1. Do you ever hear different voices/instruments differently placed laterally across the front from left to right?
2. Do you ever hear different voices/instruments differently placed in depth from nearer to farther back?
3. Are you ever aware of any reverberations, echos or other sounds surrounding the performers?
4. Can you tell us what your 2 speakers are, how are they positioned with regard to your listening position, how close are they to the wall behind them and any details of your room, such as size, surfaces and large contents?
I've no recollection of experiencing anything but stereo from stereo.
Recreation of a "soundstage" is the most important advantage of a properly constructed stereo.
 

majingotan

Major Contributor
Forum Donor
Joined
Feb 13, 2018
Messages
1,511
Likes
1,781
Location
Laguna, Philippines
I don't. I've no recollection of experiencing anything but stereo from stereo.

I don’t know how you perceive stereo, but a lot of us with room-treated 2 channel setups experience this incredible reproduction of instruments in space that makes us feel we’re transported to the actual venue. I know there’s a lot of psychoacoustics involved, but once you hear a good room-treated 2 channel setup, you’ll understand all the flowery subjective words that audiophiles rave about. We here try to understand the science behind those; it's not that we reject what subjectivists perceive, unless the claims are clearly snake oil without scientific explanation.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
You will not hear anything else than left-right localisation of sources (width) + reverb/reflections in the recording venue (which can psychoacoustically mimic "depth"). There is no directional cue in the z-axis other than stereo system errors ("rainbow effect", the center phantom image is usually placed above the speakers). Due to some other speaker tricks you can also get the "imaging" as coming from behind the speaker/front wall.

Important for hearing the details in imaging is that no reflections, other than in the bass region, should arrive within 1-2 ms from the wall on the speaker side. That would corrupt the intended stereo image. So use damping behind the speakers - this will take care of both that problem and the SBIR problems.
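For a sense of scale, the delay of the reflection off the wall behind a speaker, and the first frequency where it cancels the direct sound, follow from simple geometry. The sketch below is a simplification under stated assumptions: 343 m/s for the speed of sound, sound travelling straight back to the wall and straight out again (extra path of twice the distance), and a listener far away on axis. The function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def reflection_delay_ms(distance_to_wall_m):
    """Delay (ms) of the front-wall reflection relative to the direct sound,
    taking the extra path as twice the speaker-to-wall distance."""
    return 2.0 * distance_to_wall_m / SPEED_OF_SOUND * 1000.0


def first_cancellation_hz(distance_to_wall_m):
    """Lowest frequency at which the reflection arrives half a period late
    and cancels the direct sound: f = c / (4 * d)."""
    return SPEED_OF_SOUND / (4.0 * distance_to_wall_m)


def distance_for_delay_m(delay_ms):
    """Speaker-to-wall distance (m) that produces a given reflection delay."""
    return SPEED_OF_SOUND * (delay_ms / 1000.0) / 2.0


delay_half_metre = reflection_delay_ms(0.5)     # about 2.9 ms
notch_half_metre = first_cancellation_hz(0.5)   # about 171.5 Hz
d_for_1ms = distance_for_delay_m(1.0)           # about 0.17 m
d_for_2ms = distance_for_delay_m(2.0)           # about 0.34 m
```

Under these assumptions, a reflection inside the 1-2 ms window corresponds to a speaker roughly 17-34 cm from the wall, which is a common placement and explains the suggestion to put damping there.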
 

thefsb

Addicted to Fun and Learning
Joined
Nov 2, 2019
Messages
796
Likes
657
OK. In an effort to understand the context of your query:
First, name your 2-3 favorite recordings, please.

Favorite in terms specifically of the recording or of the music? Either way it's tremendously difficult. How about: Rohan de Saram playing Nomos Alpha, Charlemagne Palestine ‎playing Schlingen-Blängen, High Rise playing Pop Sicle.


Then answer these questions:
1. Do you ever hear different voices/instruments differently placed laterally across the front from left to right?
2. Do you ever hear different voices/instruments differently placed in depth from nearer to farther back?
3. Are you ever aware of any reverberations, echos or other sounds surrounding the performers?
4. Can you tell us what your 2 speakers are, how are they positioned with regard to your listening position, how close are they to the wall behind them and any details of your room, such as size, surfaces and large contents?
  1. Yes, that's normal.
  2. Occasionally but as I recall when I have it seems unrealistic and annoying. I particularly remember when the Pat Metheny album Rejoicing was released, it was a problem. I listened again now on my workstation system (Tidal with Genelec near-field monitors) and it still sounds bad.
  3. I'm not sure what you mean. Yes, I can hear those sounds but I can't hear sounds "surrounding the performers", I infer. In a good recording it sounds nice and natural but not 3 or even 2 dimensional, just spread out between the speakers.
  4. I could but I have the same experience throughout my life with many systems including "audiophile" stuff set up for demo, and pro systems in recording and broadcast studios.
 

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,273
Likes
9,789
Location
NYC
Favorite in terms specifically of the recording or of the music? Either way it's tremendously difficult. How about: Rohan de Saram playing Nomos Alpha, Charlemagne Palestine ‎playing Schlingen-Blängen, High Rise playing Pop Sicle.
I do not have those on my server but I will scout them out for listening.
  1. Yes, that's normal.
     OK.
  2. Occasionally but as I recall when I have it seems unrealistic and annoying. I particularly remember when the Pat Metheny album Rejoicing was released, it was a problem. I listened again now on my workstation system (Tidal with Genelec near-field monitors) and it still sounds bad.
     I will check this one, too, since it seems so notable to you.
  3. I'm not sure what you mean. Yes, I can hear those sounds but I can't hear sounds "surrounding the performers", I infer. In a good recording it sounds nice and natural but not 3 or even 2 dimensional, just spread out between the speakers.
     Yes and, often, beyond them.
  4. I could but I have the same experience throughout my life with many systems including "audiophile" stuff set up for demo, and pro systems in recording and broadcast studios.
     Well, those are the variables, as well as the recordings themselves. Another is that some individuals have been known not to perceive the stereo illusion.
 

digitalfrost

Major Contributor
Joined
Jul 22, 2018
Messages
1,521
Likes
3,086
Location
Palatinate, Germany
I'm not sure what you mean. Yes, I can hear those sounds but I can't hear sounds "surrounding the performers", I infer. In a good recording it sounds nice and natural but not 3 or even 2 dimensional, just spread out between the speakers.
I had this problem until I started digital room correction. In my case, the speakers were too bright. Just like a picture with the highlights turned all the way up, you cannot see/hear depth anymore.

Nowadays I enjoy recordings where you can hear the air in the room, the space between the instruments very much. See also

https://audiophilestyle.com/ca/reviews/dutch-dutch-8c-loudspeaker-review-r739/#soapbox

https://audiophilestyle.com/ca/revi...ker-comparison-with-binaural-recordings-r768/
 

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,273
Likes
9,789
Location
NYC
Favorite in terms specifically of the recording or of the music? Either way it's tremendously difficult. How about: Rohan de Saram playing Nomos Alpha, Charlemagne Palestine ‎playing Schlingen-Blängen, High Rise playing Pop Sicle.
I could not find the first one on Qobuz but I did find the other two as well as the Pat Metheny track and I understand what you are saying. These are among the most airless and claustrophobic recordings I have come across. There was some hint of ambiance on the Schlingen-Blängen track but only before the music started. I listened for several minutes and all I heard was the sound of 2 speakers with only a bit of difference between them.

As for the Nomos Alpha with Rohan de Saram, I found a couple versions on YouTube and there was one from 1966 [embedded video], which seems to be the same as the one in your link and it, too, sounded as if there was a blanket over the cello (along with the microphones). However, there was a 2012 recording, also with Rohan de Saram and it sounds entirely different. I'll not say more about it but I suggest you give that one a listen [embedded video] and tell us if you hear something different.

There's a wide range of musical taste among these choices but they share a claustrophobic acoustic.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
For x-mas you should listen to

 

Darkweb

Active Member
Joined
Dec 3, 2019
Messages
113
Likes
104
You will not hear anything else than left-right localisation of sources (width) + reverb/reflections in the recording venue (which can psychoacoustically mimic "depth"). There is no directional cue in the z-axis other than stereo system errors ("rainbow effect", the center phantom image is usually placed above the speakers). Due to some other speaker tricks you can also get the "imaging" as coming from behind the speaker/front wall.

Important for hearing the details in imaging is that no reflections, other than in the bass region, should arrive within 1-2 ms from the wall on the speaker side. That would corrupt the intended stereo image. So use damping behind the speakers - this will take care of both that problem and the SBIR problems.
Don’t agree with most of this but I’m too lazy to argue.

However if “rainbow effect” is what gives vocalists a realistic height in my system then I welcome this “error” with open arms. Nothing kills immersion for me quite like hearing a 3 foot tall singer.

Interestingly, some electronics I’ve tried in my system seem to reduce soundstage height cues but I have no explanation why that would be.
 