
Proximity effect and the psychoacoustically wrong target FR curves

I think there are a few issues, some of which we agree upon, some we don't:
1. Changing the distance of the sound source from the outer ear canal to the inner ear canal, when the ear canal is sealed (think IEM): is there a proximity effect? I think we agree that there is, and you gave an explanation above. My explanation is end correction (see the sketch after this list), but in the end the two may well be compatible.
2. Changing the distance of the sound source from the outer ear to the outer ear canal, when the ear canal is semi-sealed (think open-back headphones): is there a proximity effect? I'd argue yes, because even open-back headphones can create a semi-sealed pressure chamber, and the proximity effect becomes audible as the chamber size decreases.
3. Changing the distance of the sound source from open space to the outer ear, when the ear canal is open (think near-field monitoring): is there a proximity effect? You say no, and I now tend to agree with you.
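To illustrate what I mean by end correction (and only that; this doesn't settle the proximity-effect question), here's a minimal Python sketch using assumed, textbook-ish ear-canal dimensions. The unflanged-pipe end correction of roughly 0.6 times the radius makes the canal behave acoustically slightly longer, which shifts its quarter-wave resonance down:

```python
import math

c = 343.0    # speed of sound in air, m/s (room temperature, approximate)
L = 0.025    # assumed ear-canal length, m (~25 mm; illustrative, not a measurement)
r = 0.0035   # assumed ear-canal radius, m (~3.5 mm; illustrative)

# Quarter-wave resonance of a tube closed at one end (eardrum) and open at the other.
f_uncorrected = c / (4 * L)

# End correction: an open tube end behaves as if the tube were ~0.6*r longer.
delta_L = 0.6 * r
f_corrected = c / (4 * (L + delta_L))

print(f"without end correction: {f_uncorrected:.0f} Hz")
print(f"with end correction:    {f_corrected:.0f} Hz")
```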

And then I think we disagree on the brain's reaction to these proximity effects. Whether the cause of the proximity effect is due to the decrease in distance or the decrease in the size of the chamber, I postulate that when the brain recognizes this change (possibly through the change in temperature, change in airflow or the physical contact of the seal), it anticipates more bass and warmth.
What is important to add is that volume-and-distance binding is relative to the current momentary dynamics, much like contrast. Eliminate the dynamics and you can no longer tell the distance.

One more important aspect is that there actually *are* successful hardware-based solutions that let you approximate a speaker soundscape on headphones to an impressive extent; at least I've heard there are, useful enough to approximate surround systems.

IMO the angle between the ear and the speaker is much more important, because HRTF depends on angle; I don't know of anyone who has listened to speakers aimed directly at the ears (barring, maybe, some 7.1 or similar sound-system examples). Seeing that many vendors (Bose, KRK, and lately Sennheiser) offer products with angled drivers, which are considered to have the widest soundscape, I would look to angle dependence rather than the proximity effect.

It is interesting to consider that a more distant sound source could have more treble than a nearer one simply because it is angled differently. In fact, even with near-fields and the varying dispersion of tweeters, you cannot assume the treble is always correctly present, and yet missing treble doesn't make the brain doubt whether the speakers are nearer than they actually are.
 
Haven't read the entire thread. But I don't think the proximity effect is really relevant in headphone design. For starters, I can't find a reputable source supporting the OP's idea that the human ear has a proximity effect like a cardioid mic... Quite the contrary in fact. The diaphragm in the human ear seems to behave more like an omni mic! DistortingJack on Gearspace describes the difference this way: "the ear is pressure-based, not pressure-gradient..." I don't understand physics well enough to know what that means. But will take him at his word.

A human voice and other sound sources will sound bassier very close to the ear than a bit farther away. But I think that is simply the result of how sound waves propagate, and arrive at the ear, at different distances (i.e. acoustic coupling vs. wave propagation in a room). And it is something which can be easily measured at the eardrum reference point, in dB SPL.

If, for example, you wanted a human voice to sound like it was right by your ear, you could record it on an omni mic and probably add some bass with EQ to achieve that. Or even record it with a cardioid/directional mic, like Wolfman Jack and other DJs would do on their radio shows. But is that how you want to hear everything on a pair of headphones? Like someone is standing by your head, shouting directly in your ears?... I think not. And more to the point, the objective science on headphone frequency response preference doesn't really support it.
 
That still leaves the question of why people prefer more bass than a diffuse field response. Which is a different (and unrelated imo) question.

A true diffuse sound field will have equal energy at all angles and (importantly) all frequencies at the listening position. IOW, the DF sound source is flat at all frequencies and directions. But that's not how we produce or listen to recorded music.

We generally listen and master in semi-reflective rooms, usually with two speakers in front of us. And the speakers don't disperse sound evenly in all directions. They disperse more broadly in the lower frequencies. The air and surfaces in the room also have a timbral contribution, and absorb some of the higher frequencies. In short, it is far from a flat diffuse sound field typically used to measure and calibrate a HATS rig, like the HBK 5128.

The characteristics of the room and speakers affect what we hear. And also what we expect to hear in a recording. Subjective testing of both speakers and headphones confirms this.

The direct sound from the speakers may be nearly flat at the listening position, like a flat diffuse field. But the wider dispersion at low frequencies and absorption at higher frequencies means the reflected and diffused sound from the rest of the room will be tilted appreciably more towards the lower frequencies. And what we ultimately hear when listening in such a space is a blend of these flat and tilted sounds.

If we have good measurements of stereo speakers made from inside the ears of a HATS rig at the sweet spot or listening position of a semi-reflective room, then we can try to calculate or estimate the difference between that and the HATS in-ear response to a diffuse sound field. Imo, the difference is roughly the equivalent of a good loudspeaker's sound power response on a spinorama plot. Or about -1.0 to -1.5 dB per octave.

The sound power curve on a spinorama graph represents a speaker's diffuse response or output in a semi-reflective room at the listening position, measured on a free-standing mic rather than inside the ear with a HATS. And it generally tends to follow a slope in the -1.0 to -1.5 dB per octave range, above F0 in the bass. This is something you can fairly easily see and confirm on the spinorama plots of various good speakers here on ASR, and elsewhere (including Pierre Aubert's spinorama site).
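Just to put numbers on that slope estimate, here's a trivial Python sketch; the 20 Hz and 20 kHz limits are simply the conventional audible band, nothing more:

```python
import math

def total_tilt_db(slope_db_per_octave, f_lo=20.0, f_hi=20000.0):
    """Total level change across a band for a constant dB/octave slope."""
    octaves = math.log2(f_hi / f_lo)   # ~10 octaves from 20 Hz to 20 kHz
    return slope_db_per_octave * octaves

print(total_tilt_db(-1.0))   # about -10 dB across the band
print(total_tilt_db(-1.5))   # about -15 dB across the band
```

So a -1.0 to -1.5 dB per octave slope works out to roughly 10 to 15 dB of total tilt from 20 Hz to 20 kHz.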

We don't have many good in-ear measurements of good stereo speakers in semi-reflective rooms to use for this type of comparison though. So the above is just an educated guess on the difference. It appears to align well though with much of the objective and anecdotal information that's currently available on headphone preferences. And also some of Harman's early in-ear tests of speakers in reference listening spaces.
 
Floyd Toole on speakers, and what happens to them in different kinds of rooms. This is from 2015. But I don't think his views have changed much on most of this, based on other recent interviews and papers he's written.


The size of a room, and the proximity of the walls to the speakers and listener, can also affect the measured bass response at the listening position. So I suppose you could think of that as sort of a "proximity effect". I don't think that's exactly what the OP was describing though. And I'm not sure if this would be analogous to the proximity effect in a cardioid mic.

As I explained above though, imo what's missing from the in-ear response to a flat diffuse field is the timbral contribution of a speaker's dispersion or directivity characteristics, and also the semi-reflective behavior of the room on that sound. And imo these characteristics are fairly well approximated by simply combining the measured DF response of the HATS with a curve or slope approximating a good speaker's sound power curve.

This is the basis of the DF+SP model I'm currently using with HBK 5128 measurements btw.
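For anyone curious, a bare-bones Python sketch of the idea is below. The df_hrtf_db array is just a placeholder for a measured DF HRTF, the -1.25 dB/octave figure is simply the middle of the range quoted above, and the 100 Hz corner is an assumption; this is only the gist of the approach, not an exact procedure.

```python
import numpy as np

f = np.geomspace(20, 20000, 200)        # frequency grid, Hz
df_hrtf_db = np.zeros_like(f)           # placeholder for the rig's measured DF HRTF (dB)

slope = -1.25                           # dB/octave, middle of the range quoted above
f0 = 100.0                              # assumed bass corner; the tilt is flat below it
sp_tilt_db = slope * np.log2(np.maximum(f, f0) / f0)

dfsp_target_db = df_hrtf_db + sp_tilt_db   # candidate eardrum-referenced headphone target
```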
 
That still leaves the question of why people prefer more bass than a diffuse field response.
One answer is that this is supposedly to compensate for the missing tactile bass of sound coming from loudspeakers.
 
Long time HF user here. Decided to start this discussion on ASR since afaik you are more welcoming to science-related discussions.

The proximity effect in audio is an increase in bass or low-frequency response when a sound source is close to a microphone. When you speak closely into a microphone, your voice sounds more chesty. When you speak into a microphone from farther away, your voice sounds more "normal". So far so good. Our eardrums also have proximity effects, because they function much like a microphone. When someone speaks to you at a distance, they sound like what they sound like. When someone speaks next to you with their lips almost touching your outer ear, they sound more chesty and intimate. Whispers sound especially different when heard close vs. far.
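To be clear about the microphone side of this, the textbook proximity effect comes from the near-field term of a spherical wave acting on a pressure-gradient element. For an ideal figure-8 element the low-frequency boost relative to the far field is 10*log10(1 + 1/(kr)^2) with k = 2*pi*f/c; a cardioid, being part pressure and part gradient, gets somewhat less. A quick Python illustration, where the 5 cm and 50 cm distances are just example values:

```python
import math

c = 343.0  # speed of sound, m/s

def gradient_boost_db(f_hz, r_m):
    """Near-field (proximity) boost of an ideal pressure-gradient element,
    relative to its plane-wave response, for a point source at distance r."""
    k = 2 * math.pi * f_hz / c
    return 10 * math.log10(1 + 1 / (k * r_m) ** 2)

for f in (50, 100, 200, 500, 1000):
    print(f"{f:5d} Hz  at 5 cm: +{gradient_boost_db(f, 0.05):4.1f} dB   "
          f"at 50 cm: +{gradient_boost_db(f, 0.5):4.1f} dB")
```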


Our brain is, unfortunately, adjusted to the proximity effect. When someone you know speaks to you at an extremely close distance, you expect them to sound more bassy. And if you've only ever heard someone at an extremely close distance, you'd still expect their "real voice", the voice if they were to speak to you at a normal distance, to be much less bassy than the "close voice" you've heard.

And this is why all IEM and headphone targets that intend to "mimic a 2-channel system in a room" are problematic. When we know the transducer is playing music next to our ears (because we have physical contact with the headphones), our brain expects the music to sound a certain way, a way that is different from how we know this music should sound on a good 2-channel system. We'd expect not to feel the bass through the body, but to hear it in the ears; we'd expect every instrument to sound drastically closer. Having never heard the music in question on a 2-channel system likely won't help, because our brain is used to hearing things at different distances and perceiving them at those distances. Having never heard *any* music on a 2-channel system probably helps, or having lived one's life with hearing aids, but then we are talking about an entirely different mindset of musical enjoyment.

But why can't we just apply a known downward-sloping proximity-effect FR on top of our existing headphone targets, problem solved? It is more complicated, actually. Every proximity effect I've mentioned so far is mono: a microphone picking up the sound of an instrument, a person whispering into *one of* your ears. The "stereo" proximity effect is, by my estimation, a less steep downward slope than the mono one. Imagine a person speaking in front of you. Now imagine a person speaking in front of the tip of your nose. They'd sound chestier, but less so than if they were speaking into only one of your ears. By how much? We don't know. In fact, afaik there isn't a concept of a "stereo proximity effect" in acoustic engineering.

OK, then do some research on this topic and apply this "stereo proximity effect" on top of our existing headphone targets, problem solved? It is more complicated, again. Remember how the images of instruments are always "in your head" when listening to headphones? The distance of things in music reproduced by headphones is neither "in front of you, closer than the tip of your nose" nor "next to you, closer than touching your outer ears". The distance, far or near, is literally "in your head". I'd guess that our brain is initially quite confused by this too. So what does it say about the proximity effect? I don't know. It could be somewhere between the stereo proximity effect and the mono proximity effect. It could be something totally different.

The "tilted target FR curves" we are seeing these days are a great start. I hope in the future we can use tilts for specific frequencies and/or tilts with a curvature; that would let us mimic the proximity effect to a large degree. If you just want to enjoy good sound, inductive science will certainly get you to where you need the target to be. Who is to say our brain will like a proximity-effect-adjusted in-room target anyway? Once you hear it, you'll know it's right! But for the sake of deductive science, a lot more work needs to be done to understand the proximity effect at our eardrums and the psychoacoustics of perceived "distances."
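As a concrete (and purely hypothetical) example of "tilt with a curvature", here's a Python sketch where the per-octave slope itself changes smoothly with frequency instead of being a single straight line; the -1.5 and -0.5 dB/octave values and the 1 kHz transition point are arbitrary placeholders, not a proposed target:

```python
import numpy as np

f = np.geomspace(20, 20000, 400)              # frequency grid, Hz
oct_re_1k = np.log2(f / 1000.0)               # octaves relative to 1 kHz

slope_low, slope_high = -1.5, -0.5            # placeholder slopes, dB/octave
w = 1.0 / (1.0 + np.exp(-oct_re_1k / 0.5))    # smooth transition around 1 kHz
slope = (1 - w) * slope_low + w * slope_high  # frequency-dependent slope

# Integrate the slope over octaves (trapezoidal) to get a curved tilt, then
# reference it to 0 dB at 1 kHz.
tilt_db = np.concatenate(([0.0], np.cumsum(0.5 * (slope[1:] + slope[:-1]) * np.diff(oct_re_1k))))
tilt_db -= np.interp(1000.0, f, tilt_db)

# tilt_db can now be added to an existing target defined on the same frequency grid.
```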
There is some incorrect thinking here.

Proximity effect is specific to directional microphones like cardioids. It is also a property of the recording, not the playback stage.

Headphones should reproduce recorded information neutrally (to your ear). All they really do, as devices, is reproduce a signal as given. As long as timing and intensity are captured well in the recording, the signal will contain all that is necessary.

Directional and distance effects in our hearing are always present. We can rely on them, and we don't need to introduce them again in playback.
 
One answer is that this is supposedly to compensate for the missing tactile bass of sound coming from loudspeakers.

Some interesting tests and results, thewas. Thanks for the links on these.

I know that the tactility thing is an area of ongoing interest for a few folks. But I don't really see too much merit in it.

If you prefer more bass in a pair of headphones than the in-ear response of speakers in a semi-reflective room, then I suppose the lack of tactile response could be one of several potential explanations for that. I think there could be other equally good or better explanations for this though. And don't really see much compelling evidence for the tactility idea in the research.

Most of the evidence that I see and hear (including from my own ears) points to the in-ear response of speakers in a room as a good starting point for the baseline levels in headphones. Whether that holds true for all different types of headphones probably still remains to be seen... But that's where my money is.

I use the DF+SP approach with my over-ear headphones for now, because that's about the closest I can get to the above with my current gear and the available measurements. I'm shopping around for some new speakers to add to my setup though. And maybe that'll change some of my thoughts on this.

I've done informal comparisons between headphones and speakers at Sam Ash and Guitar Center in the past though. So I doubt my views will change much on most of this... It is what it is, as the kids in LBC used to say (until it isn't).


Word.
 
There is also this interesting phenomenon called the "missing 6 dB".
 
You might be interested in what David Griesinger has to say about proximity - link to PDF. This is in a different context, about why musical instruments sound closer or further away, and it is not about headphones. An interesting observation is that EDT (Early Decay Time) and C80 (Clarity at 80 ms, the ratio of early to late sound energy around an 80 ms boundary) do not correlate with preference.
 
There is also this interesting phenomenon called the "missing 6 dB".

This is interesting. But I don't think it would alter my thinking on any of the above. Because the only comparison that really matters to me is the comparison made inside the ear, at the eardrum reference point or DRP. And the article above implies that when you do that in a blind test (with a waveform), there is no mismatch. This might be of some interest to those doing binaural applications though.

You cannot replicate all the temporal and directional qualities of sound in a room though by simply adjusting a headphone's FR to match a speaker's steady-state FR at the eardrum. There's a bit more to it than that!... In a standard pair of headphones though, without the aid of DSP or other effects, that may be about the best you can do.
 
I still recommend using HBK's lower resolution 1/3 octave diffuse field HRTF measurement curve for doing any DF compensated 5128 plots, btw. As opposed to the higher resolution DF HRTF curve developed by Headphones.com and Oratory1990 using free field data. Because the new higher resolution FF-derived DF curve appears to significantly undershoot the correct DF levels in the mid to upper treble (above about 8-10 kHz, give or take).

HBK's 1/3 octave DF HRTF curve has some drawbacks as well though. It's not as detailed. And is really only useful with 1/3 octave headphone measurement data. (Something that I believe was correctly pointed out to me by Oratory1990 and some others on Discord). And it only appears to be useful for compensation below about 15 or 16 kHz in the treble. Above that range, most headphone plots compensated with the HBK HRTF DF curve will drop off rather precipitously, suggesting that the 1/3 octave HBK DF curve may be overshooting correct DF above 16k.

HBK's 1/3 octave DF HRTF also works best with the DF+SP model imo.
 
For those who don't know, the 1/3 octave HBK DF HRTF curve that I'm referring to above is represented by the dashed purple curve on this HD-650 graph (made a while back by Amir on the HBK 5128 HATS)...

 
Converting raw headphone measurement data to 1/3 octave resolution isn't especially difficult btw. It may depend on the starting resolution of the plot, but I use a graphic EQ in EAPO's Configuration Editor for this. And simply switch the GEQ graph from the Variable mode to 31-Band mode.
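If you'd rather do it outside of EAPO, a rough Python equivalent is below; freqs and resp_db stand in for a hypothetical high-resolution measurement (NumPy arrays of Hz and dB), and the function simply averages the dB values falling inside each third-octave band around the usual 31 centre frequencies. That's not exactly what the GEQ view does, but it's close enough for eyeballing:

```python
import numpy as np

def to_third_octave(freqs, resp_db):
    """Reduce a high-resolution response (freqs in Hz, resp_db in dB) to ~31 bands."""
    centres = 1000.0 * 2.0 ** (np.arange(-17, 14) / 3.0)   # ~20 Hz to ~20 kHz
    lo, hi = centres * 2 ** (-1 / 6), centres * 2 ** (1 / 6)
    banded = []
    for c, l, h in zip(centres, lo, hi):
        mask = (freqs >= l) & (freqs < h)
        banded.append(resp_db[mask].mean() if mask.any() else np.nan)
    return centres, np.array(banded)
```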

This is what the HBK 5128 plot of the HD 650 above looks like after conversion to a 31-band (or 1/3 octave) plot...

[Attached image: LORES HD650.jpg]


And this is what the HD 650 plot above looks like after compensation with the inverse of HBK's 1/3 octave DF HRTF (the same purple curve shown above on Amir's original HD 650 graph, only turned upside down)...

[Attached image: LORES HD650 DF COMPENSATED.jpg]


The result is a generally downward sloping curve above F0 in the bass, similar to a loudspeaker's sound power curve.
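In code form, the compensation step is just a subtraction. Assuming hp_db and df_hrtf_db are hypothetical 31-band arrays on the same 1/3-octave centre frequencies (the headphone measurement and the rig's DF HRTF respectively), adding the inverted DF curve is the same as subtracting it:

```python
import numpy as np

def df_compensate(hp_db, df_hrtf_db, ref_index=17):
    """Subtract the DF HRTF from the headphone measurement (same 1/3-octave grid),
    then normalise so the band at ref_index (roughly the 1 kHz band here) sits at 0 dB."""
    comp = np.asarray(hp_db) - np.asarray(df_hrtf_db)
    return comp - comp[ref_index]
```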
 
So another method of proximity and position detection, besides the stereo loudness differential between the two ears, is ear-to-ear timing delay.

Binaural recordings can effectively enhance the stereo image (for headphones/IEMs) because the two microphones detect the same sound waves at slightly different times, which the brain interprets in a way loosely analogous to GPS or cellphone-tower triangulation. A binaural recording therefore preserves an approximation of the timing differential that a real pair of ears on a real human head would perceive.

Generally, the issue is that the microphones are still placed right next to each other; for a proper binaural recording they should be spaced as though there were a head between them, with the microphones as the ears, to properly capture the timing effects that improve stereo-image realism on headphones and IEMs.
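To put a rough number on that timing differential, the classic Woodworth approximation for a rigid spherical head gives ITD of roughly (a/c)*(theta + sin(theta)); the ~8.75 cm head radius below is just a typical assumed value, not a measurement:

```python
import math

c = 343.0    # speed of sound, m/s
a = 0.0875   # assumed head radius, m (~8.75 cm)

def itd_microseconds(azimuth_deg):
    """Woodworth approximation of interaural time difference for a spherical head."""
    theta = math.radians(azimuth_deg)
    return 1e6 * (a / c) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> ITD about {itd_microseconds(az):4.0f} microseconds")
```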
 
I should also mention that the volume drop with distance is not due to some kind of psychological effect, but to physics.

There is a power law at work because sound energy propagates through the medium of air as a wave radiating in all three dimensions from the source. As the wavefronts radiate away from the source, the same energy is spread over an ever larger area, especially in an unconfined space. That's why a light source looks dimmer from a distance and why the wave height from throwing a rock into a still pool shrinks as the waves approach the shore.

Partly this is an effect of friction within the medium as it resists moving with the wave, but mostly it is because, as you move farther from a source of light or sound radiation, you intercept a smaller and smaller fraction of the total energy of the wave, since it radiates in roughly all directions at once. Intensity actually falls with the square of the distance (the inverse-square law), but this is well-known physics, so you can look it up rather than rely on my memory. Either way, it's not a linear reduction in volume with distance; sound level drops by about 6 dB for every doubling of distance, and light intensity follows the same inverse-square fall-off.
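In plain numbers, a free-field point source loses about 6 dB of SPL per doubling of distance; here's a quick Python check, which ignores the reflections a real room would add:

```python
import math

def spl_drop_db(r1_m, r2_m):
    """SPL difference going from distance r1 to r2 for a free-field point source."""
    return 20 * math.log10(r2_m / r1_m)

for r in (2, 4, 8):
    print(f"{r} m vs 1 m: about -{spl_drop_db(1.0, r):.0f} dB")
```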

This is easily demonstrated using an SPL meter to measure the level at various distances without changing the volume of the sound source. As distance increases, the volume, or the energy of the sound waves against the sensor, decreases. As for the increased bass from sealing an IEM in the ear canal, that's because bass has such a long wavelength that it needs confinement, or distance, to be observable. That's why porting speakers for bass works: give the bass waves a longer path and they are better observed with no change in what the sound source is radiating.

Plus there's the issue of destructive interference, which often causes certain types of open headphones to have reduced bass response, especially open planar-magnetics. Let's just say it's a terrible design choice not to confine and port away (or absorb) the 180° out-of-phase bass, and you can see the awful bass response measured in open planar-magnetics quite easily in Amir's measurements. It's not bad driver design per se; the effect in planar-magnetics is a well-known issue attributed to destructive interference from 180° out-of-phase sound waves at bass frequencies, and it's not trivial to solve even in closed planar-magnetics. You have to totally isolate the back side of the diaphragm from the front acoustically and make sure the out-of-phase waves can't get around and interfere destructively with the bass produced by the front of the diaphragm. It's a major challenge, not possible to solve with open planars, and that's why I would never consider buying them or open electrostats: both produce 180° out-of-phase sound waves from the front and back of the moving surface, it's especially bad in the bass range, and it's unsolvable with open designs.
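As a toy illustration of that cancellation, treat the rear radiation as an inverted copy of the front wave arriving with an assumed extra 5 cm of path; the summed magnitude is then 2*|sin(pi*f*d/c)| relative to the front wave alone, which collapses toward low frequencies. This is a crude sketch of open-baffle/dipole roll-off under those assumptions, not a model of any particular headphone:

```python
import numpy as np

c = 343.0   # speed of sound, m/s
d = 0.05    # assumed extra path length for the rear (out-of-phase) wave, m

f = np.array([20, 50, 100, 200, 500, 1000, 2000], dtype=float)
level_db = 20 * np.log10(2 * np.abs(np.sin(np.pi * f * d / c)))

for fi, lv in zip(f, level_db):
    print(f"{fi:6.0f} Hz: {lv:6.1f} dB relative to the front wave alone")
```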

A lot of this stuff is just physics.
 