
Dr. Edgar Choueiri explains BACCH

No, my head does not need to be "in a vice" to enjoy the benefits of BACCH. Not with head tracking, and not without it either. Not even with the u-BACCH intro version (without microphone measurements). But the BACCH effect is strongest with head tracking on.
 
Sorry for the question, but one thing about XTC puzzles me.
XTC systems are said to be particularly suitable for reproducing binaurally encoded audio (by binaural I understand the encoding of sound as it would reach the ear canal in a real listening situation).
But even with an ideal XTC system, with infinite crosstalk cancellation and an anechoic room, the binaural audio emitted by the speakers is subjected to our own HRTF again before arriving at the ear canal (our head is there), and the perceived spatiality is altered.
A possible compensation for this is never mentioned in the explanations of BACCH or other XTC systems (i.e. convolution with the inverted HRTF, which has nothing to do with the use of the HRTF for calculating the crosstalk cancellation).
Is it implied, or does the double HRTF give an equally valid result in spatial terms?
 
Your puzzlement is correct and makes a good point. My take on binaural recording is that we do not need to try to correct or record individual HRTF, because it is all individual for each of us. The dummy head is using nobody's HRTF, just the ITD and ILD of the sound at the head. That gets sent to our ears on playback, and we proceed to use our own HRTF to finish the process on its way to our brains. If the speakers used for the XTC to your ears are flat, then the sound that you get into your ear canal will be exactly as if your ears were substituted for the dummy head's ears at the location, and then will be processed by your hearing mechanism's HRTF.

If your speakers and room are not flat - for a variety of reasons - the BACCH tone sweep will use your head and the little measurement mikes in your ears to measure the speaker/room combination and compensate so that they ARE flat. So all that is happening is that he is using your head as a measurement mike, which is probably a valid idea. It can find anomalies that are otherwise not discoverable with normal measurement systems, such as YOUR pinna effects. Loudspeaker binaural is also great because it avoids the IHL, or In-Head Locatedness, that headphones have when you move your head. BACCH of course compensates for head movement with the head tracking camera.

I think what we need is a diagram, or flow chart, of all of the effects that happen through the whole process from dummy head to your head and what the various systems do about them. Working on it!
 
If you record with a dummy head, like Neumann KU100 that also has ears, the audio will be HRTF encoded because the head will determine ILD, ITD and spectral cues imposed by the diffraction of the sound waves around the torso, head and pinna of the dummy. This is also explained on the first page of this thread. Clearly the HRTF will be a typical one on whose basis the dummy head was modeled.
If you play this back in headphones it's all good, net of the differences between dummy HRTF and personal HRTF. But with speaker + XTC, if you don't have a perfectly flat in-ear response, the sound recorded that way will not reach your ear canal intact because your head and your ears are in between. In this way, for example, the famous boost around 5kHz due to HRTF could be introduced twice to the signal reaching your ear canal.
The only way to preserve the original information (the one containing the correct spatial cues) would be to compensate for this effect by applying its inverse to the signal, either analytically or through calibration with in-ear microphones.
Exactly as you explain. I agree with this.
But, at this point the non-HRTF encoded audio would be reproduced "differently". In fact the in ear headphones are tuned to the Harman target or something similar that emulates typical speaker in ear response because of this.
This is why I am perplexed. But perhaps the practical difference is not so significant.
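To put rough numbers on the double-HRTF worry above, here is a minimal sketch in Python. The two boost values are made-up placeholders rather than measurements, and the function names are only for illustration. It shows that cascaded magnitude responses add in dB, so a recording that already carries a dummy-head boost picks up the listener's own boost on top unless the playback chain removes it.

```python
import math

def db_to_gain(db):
    """Convert a level in dB to a linear amplitude gain."""
    return 10 ** (db / 20)

def gain_to_db(gain):
    """Convert a linear amplitude gain to a level in dB."""
    return 20 * math.log10(gain)

def at_ear_canal_db(recorded_db, playback_path_db, compensation_db=0.0):
    """Level at the ear canal when the stages are cascaded (gains multiply)."""
    gain = (db_to_gain(recorded_db)
            * db_to_gain(playback_path_db)
            * db_to_gain(compensation_db))
    return gain_to_db(gain)

# Hypothetical magnitudes at one frequency (illustrative, not measured).
dummy_head_db = 10.0  # boost encoded into the recording by the dummy head
listener_db = 12.0    # boost added by the listener's own head and ears

# Speakers with no in-ear compensation: both boosts end up in the ear canal.
uncompensated = at_ear_canal_db(dummy_head_db, listener_db)

# Playback chain calibrated flat at the ear: the listener's boost is undone,
# leaving only what the dummy head encoded.
compensated = at_ear_canal_db(dummy_head_db, listener_db,
                              compensation_db=-listener_db)

print(f"uncompensated: {uncompensated:.1f} dB")
print(f"compensated:   {compensated:.1f} dB")
```

Whether the uncompensated or the compensated case is the right target for playback is exactly what the posts here are debating; the sketch only shows the arithmetic.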
 
I am puzzling over whether HRTF should include pinna effects. If you say it includes the HRTF of the dummy head, then why would that be correct for us? I was thinking that the dummy only uses the ILD and ITD of the sound reaching its ears, and then records a flat response for those signals and sends them to us, where we use our own HRTF to listen. That would make sense. If we use over ear or XTC loudspeakers, the signal would then be processed by our own HRTF and would sound like we were there - plus or minus head tracking for the headphone situation or not needed for loudspeaker binaural.

Then, if the degree of success of the you are there effect depends on the degree to which we can eliminate the actual playback room, the loudspeaker binaural could be corrected for the room with the impulse response to the tone sweep using the in-ear microphones.

The steps would be:
ITD and ILD at the dummy head
recorded binaural signal
entrance at your outer ears binaurally with over ear headphones or loudspeaker XTC
Freq response room processing with in-ear microphones
processing by OUR HRTF complete with pinna effects
impression by the brain as if we were at the recording site.

As mentioned, headphone listening would require head tracking for externalization, loudspeaker binaural would need room correction measured at the ears.

As you say, important that all of these steps get processed only once.
 
From Wiki:
[HRTF] it is a transfer function, describing how a sound from a specific point will arrive at the ear (generally at the outer end of the auditory canal).

Therefore it includes all the effects of external human anatomy.

Dummy heads are modelled on a statistically representative human HRTF and are built to record binaural audio or to make measurements in a way that is valid for most people.
 
You must have meant arrive at the pinnae - if it just means arrive at the ear canal it would have bypassed the pinnae.

1741257810814.png


This is why I want to know if HRTF includes the pinnae (outer ear) or just the ear canal and eardrum with cochlea.
 
From ChatGPT:
Yes, HRTF (Head-Related Transfer Function) includes the effect of the pinnae. The pinnae play a crucial role in shaping sound waves before they reach the eardrum, especially for high frequencies. Their unique shape causes frequency-dependent reflections and diffractions that help the brain determine the elevation and front-back position of a sound source. HRTFs capture not only the effects of the pinnae but also the influence of the head, shoulders, and ear canal, making them essential for accurate spatial audio rendering.
 
1741264886867.png


It seems that speakers performing XTC, including Bacch, assume that individual HRTFs are nullified and that users are listening to uncorrected binaural recordings.
In most cases, binaural compensation is applied on the audio source side. If the speakers were to apply inverse compensation (nullifying HRTFs), the system would become incompatible with standard stereo playback, completely disrupting the tonal balance.
It also serves to reduce crosstalk through frequency-dependent normalization, achieving minimal loss while maximizing cancellation in a compromised yet effective manner.

1741265340037.png


And based on that, using the response from another user for whom I applied cancellation as an example, when observed in a normalized state, in a semi-anechoic space of a certain level, it is possible to achieve uniform XTC beyond approximately 700Hz, as described in the paper (around 19–22dB).

1741265432827.png


In this way, it is possible to achieve a uniform XTC level across all frequency bands. However, from my listening experience, this did not seem like a significant advantage. (A high XTC level is only necessary when the recorded audio contains extreme ILD.)
Therefore, bypassing XTC for the bass region and instead placing additional speakers (or drivers) dedicated to bass to achieve a larger ITD proved to be more advantageous and effective.
Personally, I tested this setup using a 450Hz linear-phase 6th-order LR crossover, creating an ITD of approximately 750µs. In this configuration, the L and R channels were set to have a 90-degree phase difference.

So, returning to the original topic: I had no intention of interrupting your conversation, but I wanted to share a brief thought.
It seems that both of you are considering the issue from the perspectives of recording and playback.
Now, instead of speakers, imagine the playback scenario using IEMs or headphones.
Would you nullify the HPTF of the IEMs or headphones you are wearing?
(Of course, IEMs and headphones should have appropriate compensation filters tailored to individuals, but that is a separate matter from this discussion.)
 
Interesting, thanks.
I'm not sure I understand though.
For listening to binaural tracks with XTC, must the diffuse-field EQ be applied to the track or not?
 
I don't know what you are listening to.
However, most binaural recordings are already equalized.
If it’s something you recorded yourself, you would need to apply EQ manually.
Of course, as I just mentioned, you can apply it to the recorded audio itself, or as you and the other person were discussing, you could apply it by nullifying the HRTF/HPTF of the speakers, IEMs, or headphones.
Thinking about the outcome of this should make things clear. (Imagine switching from listening to a binaural recording to a regular stereo recording. It should be evident whether the adjustment should be applied on the playback side or the recording side.)
 
I mean, a binaural track should have the Diffuse Field EQ applied to it in order to sound right on normal speakers.
But is this equalization still valid if you listen to it on speakers with BACCH applied?
 
I'm not sure what nuance you're trying to convey. I don't understand why you're distinguishing Bacch or XTC from regular speakers.
I believe I've already addressed this:
  1. When listening with IEMs/headphones, Bacch, or other XTC → Crosstalk is either nonexistent or reduced to an inaudible level.
  2. When listening with regular speakers → Crosstalk is present.
The presence or absence of crosstalk does not alter your HRTF (or the response of the audio source).
Therefore, whether or not a compensation EQ is applied has no relation to the presence or absence of XTC (or the original binaural state, as with IEMs/headphones) in the context of the current discussion.
If the track you are listening to (or recorded yourself) has not been pre-equalized, then yes, you should apply EQ. This applies regardless of whether you are listening through XTC-enabled speakers, IEMs, or headphones.
If the track has already been pre-equalized, there is no need to think further about the audio source. Strictly speaking, extracting and applying a personalized diffuse field (DF) from an individual's full-sphere HRTF would be the most optimal approach for one's own recordings. However, the difference is not significant and does not have a major impact.
The question you're asking—"Is this pre-equalization EQ (for binaural recordings) valid when listening through Bacch (or other XTC-enabled) speakers?"—is the same as asking, "Is this pre-equalization EQ (for binaural recordings) valid when listening through IEMs or headphones?"
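For anyone wanting to try the personalized diffuse-field idea mentioned above, a rough sketch follows. It assumes the diffuse-field response can be approximated as the power average of HRTF magnitudes over the measured directions. The function names, the optional weights and the sample numbers are hypothetical, and a real full-sphere set would need weights that reflect how the directions are spread over the sphere.

```python
import math

def diffuse_field_db(responses_db, weights=None):
    """Power-average magnitude responses (in dB) over directions.

    responses_db: one list per direction, each holding one dB value
    per frequency bin. weights: optional per-direction weights
    (equal weighting is assumed when omitted).
    """
    if weights is None:
        weights = [1.0] * len(responses_db)
    total_weight = sum(weights)
    n_bins = len(responses_db[0])
    averaged = []
    for k in range(n_bins):
        # Average in the power domain, then convert back to dB.
        power = sum(w * 10 ** (resp[k] / 10)
                    for w, resp in zip(weights, responses_db)) / total_weight
        averaged.append(10 * math.log10(power))
    return averaged

def apply_df_eq(recording_db, df_db):
    """Subtract the diffuse-field curve from a recording's spectrum (dB)."""
    return [x - d for x, d in zip(recording_db, df_db)]

# Two hypothetical directions, two frequency bins each (made-up values).
hrtf_db = [[0.0, 10.0],
           [0.0, 0.0]]
df_curve = diffuse_field_db(hrtf_db)
equalized = apply_df_eq([3.0, 12.0], df_curve)
print([round(v, 2) for v in df_curve])
print([round(v, 2) for v in equalized])
```

The same subtraction could be applied on the playback side instead; as the post above points out, what matters is that it is applied once.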
 
I wasn't trying to convey anything; I just didn't express my doubt clearly, sorry. I understand that the Diffuse Field EQ must be applied to a binaural track to compensate for the HRTF introduced by our anatomy, which comes into play when we listen on normal speakers. Otherwise binaural tracks would sound bad on that kind of setup.
What I don't understand is whether this is equally valid with a system calibrated to have a flat in-ear response, and whether BACCH is actually calibrated to this target (as @geickmei said before). From what you wrote, though, I gather it is not.
 

Okay, let's clarify a few things.

1. "Flat in-ear response" Why make the ear response flat?
2. In BacchORC, binaural room correction is often performed using a microphone calibrated under free-field conditions.
1741307261611.png


1741307119102.png

1741307205675.png

But this is target-corrected, similar to how correction curves are viewed based on RAW measurements and DF/Harman targets in headphone or IEM measurements.

Are you saying that the in-ear flat response you mentioned is simply a correction that flattens all resonances?
Or does it mean that EQ was applied based on such a correction target curve?

Think about it again. I don't understand why you keep assuming that a specific EQ would or wouldn't be valid in XTC and regular stereo.
Don't focus too much on the fact that it's measured with an in-ear microphone—think about it conceptually. (Even without measurements, all of your HRTF is inherently reflected in the signal played through the speakers in real time, which should make it easier for you to understand.)
  1. Sound is played from the left speaker. At this moment, the left ear hears it first.
  2. After a time delay corresponding to the ITD (Interaural Time Difference), the right ear hears it. At this point, the right speaker emits a signal to cancel this. This cancellation signal must be distorted—if it weren’t, undesirable side effects from perfect cancellation would occur, so distortion is necessary to prevent such errors.
  3. At this stage, the right-ear signal from the left speaker is canceled. However, the left ear hears the cancellation signal, and this process repeats.
Are you listening through speakers (or speaker virtualization— IEMs, headphones)?
As you mentioned, try applying inverse compensation for your body response while listening to regular stereo music. That wouldn’t be appropriate for regular stereo, right? The same applies to binaural music.
You and the other user are confusing which compensation belongs to recording and which belongs to playback.
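The cancel-and-recancel steps above can also be viewed as inverting a 2x2 matrix of speaker-to-ear transfer functions at each frequency. The sketch below does that at a single frequency in plain Python. The attenuation, delay and regularization values are hypothetical, and Tikhonov regularization is used here as one common way to keep the inverse from demanding extreme gains; whether that is what the "distorted" cancellation signal refers to is an assumption.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj_transpose2(a):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def inverse2(a):
    """Inverse of a 2x2 matrix via the closed-form determinant formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def xtc_filter(c, beta):
    """Regularized inverse H = (C^H C + beta*I)^-1 C^H of acoustic matrix C."""
    ch = conj_transpose2(c)
    gram = matmul2(ch, c)
    gram[0][0] += beta
    gram[1][1] += beta
    return matmul2(inverse2(gram), ch)

def xtc_level_db(r):
    """Wanted signal over crosstalk at the left ear, in dB."""
    if abs(r[0][1]) == 0:
        return math.inf
    return 20 * math.log10(abs(r[0][0]) / abs(r[0][1]))

def acoustic_matrix(freq_hz, contra_gain, contra_delay_s):
    """Symmetric speaker-to-ear matrix: direct path 1, crosstalk path b."""
    b = contra_gain * cmath.exp(-2j * math.pi * freq_hz * contra_delay_s)
    return [[1 + 0j, b], [b, 1 + 0j]]

# Hypothetical geometry: crosstalk 0.85x as loud and 200 microseconds late.
c = acoustic_matrix(1000.0, 0.85, 200e-6)
print(f"no XTC: {xtc_level_db(c):.1f} dB")
for beta in (0.05, 0.5):
    at_ears = matmul2(c, xtc_filter(c, beta))  # ears = C * H * inputs
    print(f"beta={beta}: {xtc_level_db(at_ears):.1f} dB")
```

With these placeholder numbers, heavier regularization trades cancellation depth for a gentler filter, which is the same kind of compromise between loss and cancellation described earlier in the thread. Note that nothing in this matrix step depends on how the source track was equalized.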
 
As I said, I phrased my question badly. XTC and HRTF compensation in binaural audio are clearly two independent things.
What was escaping me was that diffuse field EQ can/must be applied in the production of binaural tracks so that they can be listened to on speakers, and that BACCH targets a flat, psychoacoustically compensated binaural response.
Thanks for the clarification.
 
XTC and HRTF compensation in binaural audio are clearly two independent things.
To make this easier to understand, I used IEMs and headphones as examples in some of my previous comments.
Aside from the inherent disadvantages of IEMs/headphones, which are internalized, and the advantages of speakers, which can be easily externalized, their purpose remains the same.
So, thinking of XTC as a separate category might make things more complicated, but you can simply think in terms of IEMs/headphones.
 
Yes, of course. Thanks for your explanation.
Out of personal curiosity, do you know by chance which HRTF is used in the u-BACCH version? (Gras Kemar, Neumann, B&K Hats?)
And then, in your opinion, using an HRTF to calculate XTC different from the one used to encode the audio played can involve some kind of artifacts? Or are just uncorrelated things? (ChatGPT thinks it is not an ideal combination)
 
do you know by chance which HRTF is used in the u-BACCH version? Gras Kemar, Neumann, B&K Hats?
1741338430722.png


And then, in your opinion, using an HRTF to calculate XTC different from the one used to encode the audio played can involve some kind of artifacts? Or are just uncorrelated things?
Well, I'm cautious about making definitive statements on this.
However, since you asked for my personal opinion, I'll share just that—everything below is purely my own perspective.
Why do people listen to binaural recordings? While the purpose and intent vary from person to person, traditional stereo recordings often lack spatiality and certain auditory cues. Binaural recordings, on the other hand, capture these elements, making them sound noticeably different from simple stereo recordings.
If we consider this difference as a kind of effect, then even if the HRTF doesn't perfectly match, the fact that the recording itself preserves the ILD (Interaural Level Difference) cues may be enough to provide that effect. You can still perceive proximity-based effects—such as extremely high ILD values from sounds near the face, like the snipping of scissors or a buzzing insect—if you regard them simply as effects.
However, as the sound source gets closer, discrepancies in ITD (Interaural Time Difference) and ILD become more pronounced. If we prioritize accuracy and expect it to perfectly replicate how we hear sounds up close in real life, then such mismatches could be considered distortions.
That said, such cases are relatively uncommon, and whether we perceive something as distortion depends on the reference point we choose.
(Of course, this applies not only to close sounds but also to distant ones—after all, you and I perceive them differently, and so does a dummy head.)
So, even though I study XTC and apply it, I don’t actually listen to binaural tracks as much as one might expect. Even when I do, I don’t analyze or place much significance on the listening experience.
I simply take it as an effect, and for the most part, both professional binaural recordings and the countless binaural tracks on YouTube meet my personal criteria for this kind of effect. That’s my standard.
And since I don’t particularly enjoy this effect, I rarely seek out binaural recordings—except for the ones I record myself.

I hope my personal opinion has been conveyed accurately without any distortion.

Below are some screenshots regarding the BACCH level tiers.
You can also check the link yourself—just scroll down to find the information.



1741339458459.png
 