
What determines the vertical location of instruments and voices in recordings?

Berlin

Active Member
Joined
May 5, 2021
Messages
269
Likes
489
Location
Berlin
I would actually expect instruments and voices to be reproduced at the height of the midrange speakers and tweeters. With many recordings this is the case. Some recordings, however, are so good that I have the illusion that the singers are standing in my living room, i.e. their voices are reproduced at a height that is above the height of my speakers. I would like to understand the psychoacoustic explanation for this phenomenon...
 

Inner Space

Major Contributor
Forum Donor
Joined
May 18, 2020
Messages
1,285
Likes
2,938
Mixing two-channel stereo allows for total precision for side-to-side and front-to-back placement, but there's no knob or fader marked "up and down" - that dimension is outside of the mixer's control.

Sidebar - occasionally you get random phase effects that combine with a speaker's vertical lobing to produce results on the vertical axis, but they're unpredictable and accidental. There are complex EQs, like QSound, that produce a comb-filter effect to counter the natural comb filter at the listener's head, but again the end result, while often intriguing, is fundamentally unpredictable per listener.

In my experience, the result you inquire about is simply great tracking - i.e. microphone placement was great and the vocal was recorded really well, with chest, throat and head sounds all present and correct, such that the live illusion is preserved so well that the listener's brain thinks, yeah, that's a real singer, and therefore pattern recognition and expectation places the apparent mouth at a real-life distance from the floor.

In other words, yes, it's an illusion - convincing, for sure, but not electronically created. It's your brain saying, "That sounds like a person, and most people are between 5' and 6' tall."
 

JayGilb

Major Contributor
Joined
Jul 22, 2021
Messages
1,371
Likes
2,309
Location
West-Central Wisconsin
Inner Space said:
In other words, yes, it's an illusion - convincing, for sure, but not electronically created. It's your brain saying, "That sounds like a person, and most people are between 5' and 6' tall."
5 or 6 feet is probably the typical placement height of a mid-range driver, which radiates in the human vocal range (1-4 kHz).
 

Inner Space

Major Contributor
Forum Donor
Joined
May 18, 2020
Messages
1,285
Likes
2,938
JayGilb said:
5 or 6 feet is probably the typical placement height of a mid-range driver, which radiates in the human vocal range (1-4 kHz).
Really? 60" to 70" seems high to me. I have large-ish speakers, on stands even though they're supposed to be floorstanders, and the top of the cabinet is at 50", the treble driver at 44", and the mid at about 30" max. I think you're way overestimating.
 

EB1000

Senior Member
Joined
Jan 8, 2020
Messages
484
Likes
579
Location
Israel
A two-channel binaural recording heard through headphones should be able to place sounds along the vertical axis. If Dolby/DTS really wanted, they could come up with a listening mode that can decode binaural and pan the sound to the correct channels in an immersive multichannel setup. Unfortunately, no such listening mode exists in mainstream processors. I've tried DSU and DSX NX and Auro 3D with binaural, and it didn't work as expected. My friend has a JBL AVR with Logic16 mode, and he says that it's the only mode that correctly pans binaural recordings into 9.1.6 surround.
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,690
Likes
6,013
Location
Berlin, Germany
Inner Space said:
Mixing two-channel stereo allows for total precision for side-to-side and front-to-back placement, but there's no knob or fader marked "up and down" - that dimension is outside of the mixer's control.

Two-speaker playback of two-channel source material can contain some limited and crude height encoding by triggering HRTF-based cues. This already happens automatically, because the sound from each speaker reaches both ears at its own angle of incidence and the arrivals combine into one sound field at the ears. Center signals create a comb-filter pattern that manifests itself not only as a timbre change but also as increased apparent height: the brain interprets the unfamiliar filter pattern as best it can, and the closest matching pattern is that of a slightly elevated frontal position. Hard left/right panned sources, by contrast, appear right at mid/tweeter height, as does the center in a 3-speaker setup (like Trinaural upmixing).

In the same way, additional height cues can be factored in for parts of the mix by applying specific EQ on top of that, in relation to other broadband or similar image sources, including the use of crosstalk canceling to gain better control of the final sound field near the individual ears.
Further, but not finishing the list, reverb cues can be "blown up" to be more 3D in the source signal, making for a stronger contrast with the dry image sources, notably in the height impression.

It's all very fragile: the mechanisms are rather weak and unstable and easily affected by setup and listening conditions (for example, the comb-filter pattern from the regular floor bounce plays a big role here, too). On top of that, our HRTFs are highly individual, so what works for some people may not work for others.
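
For anyone who wants to see the center-image comb filter in numbers, here is a minimal Python sketch. It models one ear as receiving the near speaker's sound plus a delayed, attenuated copy from the far speaker. The 0.25 ms delay and the 0.7 far-ear gain are illustrative assumptions (roughly a ±30° stereo triangle), not measured values, and the function names are made up for this example. Real HRTFs are far more complex than a single delayed copy.

```python
import cmath
import math


def ear_response_db(freq_hz, delay_s=0.00025, far_gain=0.7):
    """Level at one ear for a center-panned signal, in dB.

    Assumed model: direct sound from the near speaker plus a copy from the
    far speaker that arrives delay_s later and is scaled by far_gain.
    """
    h = 1.0 + far_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20.0 * math.log10(abs(h))


def notch_frequencies(delay_s=0.00025, max_hz=20000.0):
    """Comb-filter notches sit at odd multiples of 1 / (2 * delay)."""
    notches = []
    m = 0
    while True:
        f = (2 * m + 1) / (2.0 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        m += 1
    return notches


if __name__ == "__main__":
    for f in notch_frequencies():
        print(f"notch near {f:.0f} Hz: {ear_response_db(f):.1f} dB")
```

With the assumed 0.25 ms delay, the notches fall at odd multiples of 2 kHz. A different head size or speaker angle shifts them, which is one reason the effect varies so much between listeners and setups.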
 

Soundmixer

Senior Member
Joined
Mar 8, 2021
Messages
433
Likes
296
EB1000 said:
If Dolby/DTS really wanted, they could come up with a listening mode that can decode binaural and pan the sound to the correct channels in an immersive multichannel setup.
Dolby already has a binaural encoding mode in their Dolby Mastering suite.
 

antennaguru

Senior Member
Joined
Jun 16, 2021
Messages
391
Likes
416
Location
USA
I have a couple of STEREO Flamenco music recordings that very clearly demonstrate vertical imaging - the guitar playing and singing are at normal heights, while the shoe tapping/stomping is down at the floor and the castanets clack overhead, above the singing height. It is very system dependent though.
 

youngho

Senior Member
Joined
Apr 21, 2019
Messages
486
Likes
799
Perception of elevation seems to be primarily due to pinna effects and learned head-related transfer functions, which can result in a subjective perception of elevation that depends on frequency. This is discussed in Spatial Hearing by Jens Blauert: "Pratt established that auditory events of high musical pitch, which he called 'high tones,' are localized at a higher elevation angle than are auditory events whose pitch is low, which he called 'low tones.' Trimble varied the fundamental frequency of the sound event continuously...Roffler and Butler measured the same effect with greater exactitude...in every case, the elevation angle of the auditory event was described as varying as a function of the frequency of the sound event...while still aware of these results, the present author made a similar but more general observation."
 

Mart68

Major Contributor
Forum Donor
Joined
Mar 22, 2021
Messages
2,610
Likes
4,862
Location
England
I have the Curtis Mayfield album 'Superfly' on vinyl and CD. On vinyl playback the image is reproduced well above the tops of the speakers; on the CD it isn't. What accounts for this, given that the rest of the system is the same? Different mastering? Or some artefact of vinyl playback?

The deck is nothing fancy, just an SL1200 with a Nagaoka MP50.
 

DDF

Addicted to Fun and Learning
Joined
Dec 31, 2018
Messages
617
Likes
1,355
A friend invited me to sample the incredible height presented by his speakers. It was true: center performers were hovering near the ceiling.

I was intrigued, so I pulled out the screwdriver and found that one tweeter was accidentally wired in reverse polarity! He was red-faced but I thanked him, what a great learning experience. It didn't create a 7 kHz peak as normally associated with height cues; it was a binaural effect.

There's no reason that similar trickery couldn't be accomplished in the recording itself, using sub-band processing to invert the polarity of the higher frequencies in one channel.
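
A rough Python sketch of what such sub-band processing could look like: split one channel into low and high bands with a first-order complementary crossover, flip the polarity of the high band, and sum the bands again. The 2 kHz crossover, the 48 kHz sample rate and the function names are assumptions for illustration only; nothing here is taken from an actual recording or mastering tool. Note that a first-order split gives a gradual phase rotation around the crossover rather than an abrupt flip.

```python
import math


def lowpass_first_order(samples, cutoff_hz, sample_rate):
    """First-order low-pass (bilinear transform) with unity gain at DC."""
    k = math.tan(math.pi * cutoff_hz / sample_rate)
    b0 = k / (1.0 + k)
    a1 = (k - 1.0) / (k + 1.0)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = b0 * (x + prev_x) - a1 * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


def invert_high_band(samples, cutoff_hz=2000.0, sample_rate=48000):
    """Keep the low band as is and flip the polarity of the high band.

    The high band is defined as (input - low band), so the result is
    low - high = 2 * low - input.
    """
    low = lowpass_first_order(samples, cutoff_hz, sample_rate)
    return [2.0 * lo - x for lo, x in zip(low, samples)]


def process_stereo(left, right, cutoff_hz=2000.0, sample_rate=48000):
    """Leave the left channel untouched; process only the right channel."""
    return list(left), invert_high_band(right, cutoff_hz, sample_rate)
```

In this sketch, content well below the crossover stays in phase between the channels, while content well above it ends up in opposite polarity, similar to the miswired tweeter.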
 