
Threads of Binaural virtualization users.

I am resorting to the binauralization of Atmos 7.1.4 from Apple Music with Binauralizer Studio 2 in Reaper.
Starting from the Atmos master and binauralizing it would be better, but alas not feasible with streaming music.
Binauralizer Studio allows you to upload SOFA files and quickly switch between them to correctly determine the subjective preference.
I ended up preferring the Neumann KU100 dummy head, and in fact I'm not the only one: it seems to be the statistical favorite.
I am already satisfied with it, so I did not want to resort to personal HRTF estimation (which is possible with some software).
I believe that the path of subjective preference rather than analytical determination of HRTF is still the most valid, because of 3 elements:
- Room divergence factor
- Repeatability of HRTF measurement
- There is no absolute spatial reference for the audio track

On the BRIR side, however, I'm still digging deeper. For some reason APL Virtuoso gets the statistical preference, and I think this has to do with their integrated BRIR.

The binauralization engine shouldn't make any difference; after all, it is a convolver.
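To make the "it is just a convolver" point concrete, here is a minimal sketch in Python with NumPy. It is not how Binauralizer Studio or any other product works internally. The function name `binauralize`, the channel names and the toy impulse responses are all made up for illustration. Each speaker feed is convolved with the left-ear and right-ear impulse response for that speaker position, and the results are summed per ear.

```python
import numpy as np

def binauralize(channels, hrirs):
    """Render speaker feeds to two ear signals by convolution and summation.

    channels: dict name -> 1-D array, one feed per virtual speaker
    hrirs:    dict name -> (left_ir, right_ir) for that speaker position
    """
    n_sig = max(len(x) for x in channels.values())
    n_ir = max(len(ir) for pair in hrirs.values() for ir in pair)
    out = np.zeros((2, n_sig + n_ir - 1))
    for name, x in channels.items():
        for ear, ir in enumerate(hrirs[name]):
            y = np.convolve(x, ir)        # plain linear convolution
            out[ear, :len(y)] += y        # sum contributions per ear
    return out

# Toy data (hypothetical, not a measured HRIR set)
feeds = {"L": np.array([1.0, 0.0, 0.0]), "R": np.array([0.0, 1.0, 0.0])}
irs = {
    "L": (np.array([1.0, 0.5]), np.array([0.0, 0.3])),
    "R": (np.array([0.0, 0.3]), np.array([1.0, 0.5])),
}
ears = binauralize(feeds, irs)   # row 0 = left ear, row 1 = right ear
```

Real engines typically use partitioned FFT convolution for latency and speed, but the output should be the same up to numerical precision, which is why the choice of engine matters less than the choice of impulse responses.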
I know there are some binauralizers that calculate analytically... there was one in beta from Korean guys, that gets good reviews, but then it seemed to be abandoned... Can't find it now.
Welcome to this thread. I’ve often seen your username in other DSP and ART discussions as well. As noted at the start of the thread, I respect and listen attentively to every user’s virtualization experiences, whether they use publicly shared or personalized responses.
And I agree with you. Since most convolution engines perform well, the setup you use will depend on personal preference—what’s most comfortable for you is best. I think the KU100 is great and widely used for various binaural sources, including ASMR. However, when I first started working with BRIR and tried dummy heads like the KU100 or other people’s responses, I didn’t achieve the level of externalization (indistinguishable from reality) I was hoping for, so I continued with personalized measurements. There are a number of complex reasons for that. I clicked one of the software links and saw it’s Impulcifer—the one I use—and Korean users enjoy that too.
 
From head tracking to haptic bass, if you chase realism you end up strapping on gear like Iron Man (just kidding), and it risks undermining the original goal of listening with a light body and mind. That’s why I find myself wrestling with this.
Haha, I like the idea of an Iron Man armour for hifi. But for the whole body bass fanatics this (from pud.com) might be a more practical alternative:
1746537590882.png

Actually, there are quite a few other approaches besides that.
Yes, I understood that. It just explained why the BRIR had so few reflections.
I also have IRs with virtually no reflections within 40 ms (essentially free‑field conditions), but when I’m running these tests I usually apply the technique to other people’s responses first, not just my own.
Interesting. 40ms, that is a long time. Are these from big halls (late reflections) or with no reflections at all?
In my experience a free field HRIR does not work so well. Stereo is a compromise for reproduction in (listening) room, isn't it?
Sure, tweaking a pristine, “perfect” IR is a matter of personal satisfaction, but a method that works reliably on IRs measured in less‑than‑ideal conditions? To me, that’s a genuinely useful approach.
And leaving only the direct sound is just the beginning—reflections mark the start of a whole new journey.
Yes, I fully agree.
The question for me is: what does a BRIR with reflections (see above) look like that is as neutral and restrained as possible, so that the spatial characteristics and cues from the recording present themselves as clearly as possible?
Personally, I incorporate Bacch’s frequency‑dependent regularization concept and implement acoustically transparent crosstalk cancellation in a DIY setup.
This is with speakers?
Yes. Because crosstalk is already inherent in a binaural recording, the playback system must remove or attenuate that crosstalk to hear it correctly. When you say you can’t remove “these ears,” are you referring to the physiological (body’s) response? If you mean someone else’s—or a dummy head’s—response, binaural recordings are generally pre‑equalized using diffuse‑field (DF) compensation.
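As a rough illustration of crosstalk cancellation with frequency-dependent regularization, here is a generic Tikhonov-regularized inversion in Python with NumPy. It is a textbook-style sketch of the concept, not BACCH's actual implementation and not the DIY setup described above. The symmetric two-speaker model, the contralateral gain `g`, the delay `tau` and the regularization values are assumptions chosen only for the example.

```python
import numpy as np

def xtc_filters(C, beta):
    """Regularized crosstalk-cancellation filters per frequency bin.

    C:    (n_bins, 2, 2) complex transfer matrix, C[k, ear, speaker]
    beta: (n_bins,) non-negative regularization, may vary with frequency
    Returns H with H[k] = (C^H C + beta I)^-1 C^H.
    """
    Ch = np.conj(np.swapaxes(C, -1, -2))
    A = Ch @ C + beta[:, None, None] * np.eye(2)
    return np.linalg.solve(A, Ch)

# Hypothetical symmetric setup: ipsilateral path = 1,
# contralateral path = attenuated and delayed copy.
freqs = np.array([200.0, 1000.0, 5000.0])
g, tau = 0.8, 0.0002
x = g * np.exp(-2j * np.pi * freqs * tau)
C = np.empty((len(freqs), 2, 2), dtype=complex)
C[:, 0, 0] = C[:, 1, 1] = 1.0
C[:, 0, 1] = C[:, 1, 0] = x

H_exact = xtc_filters(C, np.zeros(len(freqs)))        # perfect inversion
H_reg = xtc_filters(C, np.array([0.05, 0.0, 0.0]))    # tame only the lowest bin
```

With `beta = 0` the filters invert the speaker-to-ear matrix exactly, which can demand large boosts at frequencies where the two paths are nearly identical. Raising `beta` only in those bins trades some cancellation for less coloration, which is the basic idea behind making the regularization frequency dependent.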
What I mean is that all the characteristics of a BRIR (FR signature, ITD and frequency-dependent ILD) are incorporated in the recorded signal of a binaural recording made with a dummy head. The diffuse-field compensation is a rather broad FR correction that (among other things) is a natural way to prevent the ear gain from piling up when listening either through speakers or through headphones (which create the ear gain themselves, cf. the Harman curve). In other words, it is similar to the EQ I apply after convolving with my BRIR (from ear canal microphones) to get a neutral-sounding signal to use with headphones/earphones in the same way as one would use the direct stereo signal.
My idea would not be to delete the crosstalk from the recording, as this is a natural part of binaural listening. The correction I think is necessary is to "correct" the HRTF from the dummy's to my own, not only in a broad sense but taking the signatures of the pinnae (the dummy's and mine) into account. This is hardly possible, as the dummy's HRTF is normally not known. This correction would be a simple EQ without further crossfeed.
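For what such a "simple EQ" could look like in numbers, here is a small Python/NumPy sketch. It assumes the hard part is already solved, namely that magnitude responses for the same ear and direction are available for both the dummy head and the listener. The function name `correction_eq_db`, the clipping limit and the sample magnitudes are illustrative assumptions.

```python
import numpy as np

def correction_eq_db(own_mag, dummy_mag, max_db=12.0):
    """Magnitude-only EQ (in dB) that maps the dummy head's response onto one's own.

    own_mag, dummy_mag: linear magnitudes on the same frequency grid.
    The result is clipped to +/- max_db to avoid extreme boosts or cuts.
    """
    ratio = np.asarray(own_mag, dtype=float) / np.asarray(dummy_mag, dtype=float)
    return np.clip(20.0 * np.log10(ratio), -max_db, max_db)

# Hypothetical magnitudes at four frequency points
own = np.array([1.0, 2.0, 0.5, 4.0])
dummy = np.array([1.0, 1.0, 1.0, 0.5])
eq_db = correction_eq_db(own, dummy)   # last point is clipped at +12 dB
```

Because only magnitudes are touched and nothing is mixed between the channels, the recorded crosstalk and timing stay as they are in the recording.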

At close distances, mismatches in ITD and ILD become more pronounced, and we may find those discrepancies unappealing, hearing them as mere effects. With in‑ear monitors (IEMs), because they bypass the pinna, they tend to internalize the sound—as noted earlier in the thread—whereas speakers tend to externalize it well.
Well speakers are external sound sources that will be perceived externally in a natural way (unless such trickery as Bacch is used). Somehow I do not get the point of this comparison.
So, as I’ve said before, it’s simply a difference in recording method and format; I never meant to imply that one is inherently higher quality than the other.
Modern recordings are really well made these days, and I enjoy listening to them too.
I agree about many "modern recordings" being quite good, but there are differences and some are not good. And I would prefer a good atmos mix over the corresponding stereo mix (both over virtualization) any time. And a good binaural recording - IF made with my own ears - WILL be better than a good stereo recording (again over virtualisation) in my view (spatiality, auditory envelopment, realism...). I absolutely agree with Toole about stereo. But it will be different too, as the artistic and technical goals are different (not the least being listening over speakers of course). That is a question of preference.
 
Welcome to this thread. I’ve often seen your username in other DSP and ART discussions as well. As noted at the start of the thread, I respect and listen attentively to every user’s virtualization experiences, whether they use publicly shared or personalized responses.
And I agree with you. Since most convolution engines perform well, the setup you use will depend on personal preference—what’s most comfortable for you is best. I think the KU100 is great and widely used for various binaural sources, including ASMR. However, when I first started working with BRIR and tried dummy heads like the KU100 or other people’s responses, I didn’t achieve the level of externalization (indistinguishable from reality) I was hoping for, so I continued with personalized measurements. There are a number of complex reasons for that. I clicked one of the software links and saw it’s Impulcifer—the one I use—and Korean users enjoy that too.
KU100 integrates diffuse field equalization, so perhaps statistical preference has to do with that.
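For readers unfamiliar with diffuse-field equalization, a minimal Python/NumPy sketch of the usual idea: average the power of the HRTF magnitudes over all directions and invert that average. This is a simplification and says nothing about how Neumann actually equalizes the KU100. The sketch assumes the directions are spread evenly over the sphere, whereas real measurement grids need area weights. The function name and the toy numbers are hypothetical.

```python
import numpy as np

def diffuse_field_eq_db(hrtf_mags):
    """Diffuse-field compensation curve in dB.

    hrtf_mags: (n_directions, n_bins) linear magnitudes for one ear,
               directions assumed evenly distributed (no area weighting).
    Returns the inverse of the power average over directions, in dB.
    """
    df_power = np.mean(np.asarray(hrtf_mags, dtype=float) ** 2, axis=0)
    return -10.0 * np.log10(df_power)

# Two hypothetical directions, two frequency bins
eq = diffuse_field_eq_db([[1.0, 1.0],
                          [1.0, 3.0]])
```

In the toy data the second bin has an average power of 5, so the compensation there is about -7 dB while the first bin is left untouched.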
I think the round-robin test (aka Club Fritz) I linked is interesting because it highlights that HRTF measurement uncertainties are well above the limits of perceptual neutrality; therefore many studies that include HRTF measurements must be considered with equal uncertainty.
We cannot say, for example, that we have an absolutely representative KU100 HRTF set.
The topic of the Room Divergence Effect in my opinion is also not negligible, because it plays a more relevant role than expected in binauralization.
Although this whole topic is ultimately about listening pleasure, the technical side has a certain charm for me, and it is perhaps the part of immersive audio that I am most passionate about.
 
Haha, I like the idea of an Iron Man armour for hifi. But for the whole body bass fanatics this (from pud.com) might be a more practical alternative:
Haha, that seems interesting.

Interesting. 40ms, that is a long time. Are these from big halls (late reflections) or with no reflections at all?
In my experience a free field HRIR does not work so well. Stereo is a compromise for reproduction in (listening) room, isn't it?
It was a large space—the very onset of early reflections, not the late ones. With a touch of the tweaks we discussed, you could even achieve a free‑field response. And yet, we all harbor that vague romantic notion of free field, don’t we? I certainly did, and so did other Korean users. My thread—and my experiences—are grounded in the conventional strengths and compromises of typical recording/playback methods, but sometimes the benchmark becomes simply what I want and what I’d love to hear.
For example, I’ve tried directing the speakers toward a high ceiling and tweaking the response intensity to test it out.
I’m not one for subjective descriptions, but sometimes it’s fun to get lost in a background stream that feels like stars falling from the night sky.

Yes, I fully agree.
The question for me is: what does a BRIR with reflections (see above) look like that is as neutral and restrained as possible, so that the spatial characteristics and cues from the recording present themselves as clearly as possible?
That will vary depending on the environment/response each user designs. There are plenty of articles about this on ASR as well. (This applies not only to BRIR but to all speaker playback—after all, BRIR has speaker + space.)
I think you can strike an appropriate compromise that reflects your own preferences and goals—so long as the playback side doesn’t overpower the recording. If you search ASR for “Early Reflection,” “first reflection,” or “Sound stage,” you’ll find many long‑running discussions on exactly this topic.

This is with speakers?
Both methods work. BRIR is essentially the same as speaker + space. I got so deep into HRIR/BRIR that I even got rid of all my speakers, but I still have a small, fist‑sized $15 speaker on my desk. I use it occasionally for testing and to play the news on YouTube.

What I mean is that all the characteristics of a BRIR (FR signature, ITD and frequency-dependent ILD) are incorporated in the recorded signal of a binaural recording made with a dummy head. The diffuse-field compensation is a rather broad FR correction that (among other things) is a natural way to prevent the ear gain from piling up when listening either through speakers or through headphones (which create the ear gain themselves, cf. the Harman curve). In other words, it is similar to the EQ I apply after convolving with my BRIR (from ear canal microphones) to get a neutral-sounding signal to use with headphones/earphones in the same way as one would use the direct stereo signal.
My idea would not be to delete the crosstalk from the recording, as this is a natural part of binaural listening. The correction I think is necessary is to "correct" the HRTF from the dummy's to my own, not only in a broad sense but taking the signatures of the pinnae (the dummy's and mine) into account. This is hardly possible, as the dummy's HRTF is normally not known. This correction would be a simple EQ without further crossfeed.
Yes—that’s essentially an extension of HPCF. The headphones and IEMs used for BRIR need to be equalized.
The problem is that a simple EQ can only compensate to a certain degree and may not be very meaningful. It depends on the microphone insertion method and recording technique—and once you factor in differences between the dummy head and my own ears, variations between different headphones, and the EQ needed for the particular headphones I use, you might see “some” improvement. But the core issue remains: the recorded ITD and ILD aren’t mine. (Of course, that’s less of a problem if the sound sources aren’t close.)
So yes, you can apply all those corrections, but as I said before, while I appreciate the binaural recording approach, I don’t have high expectations for the source material itself.

Well speakers are external sound sources that will be perceived externally in a natural way (unless such trickery as Bacch is used). Somehow I do not get the point of this comparison.
I’m referring to the difference between speakers and headphones/IEMs when listening to binaural recordings. Under the same conditions—speakers (with crosstalk cancellation applied) versus headphones/IEMs (with no crosstalk)—the latter tend to be perceived more “inside” your head. In contrast, speakers—unlike the pinna‑proximate effect of headphones—are heard as coming from a distance and thus externalize the sound much more easily.
Below is an excerpt from the Bacch paper.

1746543272627.png

1746543288055.png



I agree about many "modern recordings" being quite good, but there are differences and some are not good. And I would prefer a good atmos mix over the corresponding stereo mix (both over virtualization) any time. And a good binaural recording - IF made with my own ears - WILL be better than a good stereo recording (again over virtualisation) in my view (spatiality, auditory envelopment, realism...). I absolutely agree with Toole about stereo. But it will be different too, as the artistic and technical goals are different (not the least being listening over speakers of course). That is a question of preference.
Yes, of course preferences vary, and that’s the most important thing—I agree.
Personally, I actually don’t like Atmos mixes. The Atmos versions of the tracks I’ve listened to felt stimulating with lots of panning and dynamic elements, but I’m more interested in spatiality than in individual instruments or sources being panned around, so it didn’t really suit my taste. Above all, the sense of space in my stereo setup—shaped virtually to my own imagination—already surpassed the colorful mixing of Atmos… (But just as I respect your opinion, this is only my personal view. Please don’t misunderstand.)

First of all, I should head to bed soon—it’s quite late. Thank you for sharing so many insights in the thread today. If you happen to reply again, I’ll check it tomorrow.
 
The topic of the Room Divergence Effect in my opinion is also not negligible, because it plays a more relevant role than expected in binauralization.
In my experience this depends a lot on the use case and listening behaviour. If you have your eyes open, e.g. during mixing, then the impact of your room will be very strong, as with any visual cue. With eyes closed it is quite a different story.
Virtuoso looks very interesting.
I think the round-robin test (aka Club Fritz) I linked is interesting because it highlights that HRTF measurement uncertainties are well above the limits of perceptual neutrality; therefore many studies that include HRTF measurements must be considered with equal uncertainty.
Indeed, if preference studies use some kind of out-of-the-box calibration for comparison, then I would take that with a rock of salt.
Timbre/tonality at least has to be tweaked individually as timbre will change the spatial perception quite a bit too.
For some reason APL Virtuoso gets the statistical preference, and I think this has to do with their integrated BRIR.
Do you know how the rendering with the dummy head was performed in the study? Through speaker playback?
 
KU100 integrates diffuse field equalization, so perhaps statistical preference has to do with that.
I think the round-robin test (aka Club Fritz) I linked is interesting because it highlights that HRTF measurement uncertainties are well above the limits of perceptual neutrality; therefore many studies that include HRTF measurements must be considered with equal uncertainty.
We cannot say, for example, that we have an absolutely representative KU100 HRTF set.
The topic of the Room Divergence Effect in my opinion is also not negligible, because it plays a more relevant role than expected in binauralization.
Although this whole topic is ultimately about listening pleasure, the technical side has a certain charm for me, and it is perhaps the part of immersive audio that I am most passionate about.
Yes, I agree with you. The uncertainties and instabilities in measurements are at a level that can’t be ignored. And on top of that, the actual calibration curves of headphones and IEMs also play a significant role.
I’m actually split 50/50. I find the process of exploring and testing all of this more fun than just listening for pleasure. Surprisingly, I don’t listen to that much music. Even among audio hobbyists, some people love DIY speaker builds, while others are into measurements and DSP setups—same field, but everyone has their own focus. Even in this thread I started yesterday, each participant zeroes in on slightly different points. It’s really fascinating and a lot of fun.
As I just mentioned, I need to go to bed now, so I’ll check the thread again tomorrow. Have a good night. o_O
 
And yet, we all harbor that vague romantic notion of free field, don’t we? I certainly did, and so did other Korean users.
I did when I made my first experiments. But I was green then and I had to learn a bit about stereo first. I don't think stereo works well in free field. Multichannel might be different.
That will vary depending on the environment/response each user designs. There are plenty of articles about this on ASR as well. (This applies not only to BRIR but to all speaker playback—after all, BRIR has speaker + space.)
I think you can strike an appropriate compromise that reflects your own preferences and goals—so long as the playback side doesn’t overpower the recording. If you search ASR for “Early Reflection,” “first reflection,” or “Sound stage,” you’ll find many long‑running discussions on exactly this topic.
Sure, I know many of these discussions, but they never lead anywhere, just round and round.
Most of the time they end up discussing bass problems.
In BRIR "design" the world is different. Bass is not a problem; the right amount of neutral, restrained and sufficiently late early reflections in space is. And if one knew the goal, the question would be how to get the IR. You either need this nice room (which might not even exist physically) or a synthesiser.
But the core issue remains: the recorded ITD and ILD aren’t mine.
Hmmh. ITD is not really a problem, is it? A mismatch in ITD will only broaden or narrow the imaging a bit.
And ILD seems to be for me not such a big thing either. It certainly is personal but mainly follows a smooth pattern.
This is my measured FR (1/24 oct) at the left ear, for the left channel (blue) and the right channel (brown).
Ignore below 400Hz, that is mainly room effects.
HRTF.png

My idea is, that the main personal characteristics are in the FR, which is highly personal and depends on head, ear position, pinna and concha.
The sound from the opposite side just has to go around the (rather smooth) head and follows the same trends.
Agreed, this is not 30° azimuth, more like 15°, but still.
I would happily use the ITD and the ILD trend of the dummy head that are the result of the concert hall acoustics (and player positions) and just correct the FR (as mentioned before).
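To check how far a dummy head's ITD actually is from one's own before deciding to keep it, a simple estimate is the lag of the cross-correlation peak between the left-ear and right-ear impulse responses. The Python/NumPy sketch below shows that idea. The function name `itd_seconds` and the toy impulses are made up, and more refined estimators (onset thresholds, low-passed or interpolated correlation) exist.

```python
import numpy as np

def itd_seconds(ir_left, ir_right, fs):
    """Estimate ITD as the lag of the cross-correlation peak.

    Positive result: the right-ear response arrives later than the left one.
    Resolution is limited to one sample (1 / fs) in this simple form.
    """
    ir_left = np.asarray(ir_left, dtype=float)
    ir_right = np.asarray(ir_right, dtype=float)
    xc = np.correlate(ir_right, ir_left, mode="full")
    lag = int(np.argmax(np.abs(xc))) - (len(ir_left) - 1)
    return lag / fs

# Toy impulses: right ear delayed by 3 samples relative to left
left = np.zeros(16)
right = np.zeros(16)
left[5] = 1.0
right[8] = 0.7
itd = itd_seconds(left, right, 48000)   # 3 samples at 48 kHz
```

Three samples at 48 kHz is 62.5 microseconds, so comparing such numbers between a dummy head and one's own measurement gives a feel for whether the mismatch is small enough to ignore.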

I’m referring to the difference between speakers and headphones/IEMs when listening to binaural recordings. Under the same conditions—speakers (with crosstalk cancellation applied) versus headphones/IEMs (with no crosstalk)—the latter tend to be perceived more “inside” your head.
OK, that got me more than a little confused. I read somewhere that you got rid of all your speakers, and yet you mention speakers frequently, while at the same time meaning a speaker-based crosstalk-cancelling system.
Well that is a different thing for sure.

The Atmos versions of the tracks I’ve listened to felt stimulating with lots of panning and dynamic elements, but I’m more interested in spatiality than in individual instruments or sources being panned around, so it didn’t really suit my taste.
Interesting, but I have never heard that. Maybe it's the preferred program, maybe it's me?
My experience is with classical music: Rattle's Mahler or the Boston Symphony (among many others; the list is long and keeps getting longer) sound great in stereo, but Atmos is still a step up for me (bass in particular, but not only bass), and that is with surround channels convolved by the dearVR plugin, which is not the pinnacle of things.

First of all, I should head to bed soon—it’s quite late. Thank you for sharing so many insights in the thread today. If you happen to reply again, I’ll check it tomorrow.
Good night!
 
This is a super interesting thread.

It seems like you guys have put a *ton* of effort into your setups - is there an easy-ish way to try adding binaural virtualisation to regular 2ch recordings, just to see if it’s worth pursuing? Something Mac based would be good.
 
I did when I made my first experiments. But I was green then and I had to learn a bit about stereo first. I don't think stereo works well in free field. Multichannel might be different.
This was also back when I was vaguely chasing only the direct sound—like hunting for treasure in the movie National Treasure. ;)

Sure, I know many of these discussions, but they never lead anywhere, just round and round.
Most of the time they end up discussing bass problems.
In BRIR "design" the world is different. Bass is not a problem; the right amount of neutral, restrained and sufficiently late early reflections in space is. And if one knew the goal, the question would be how to get the IR. You either need this nice room (which might not even exist physically) or a synthesiser.
Either approach can be good, depending on your preferences and goals.

1746577572790.png

1746577605763.png

1746577580432.png

1746577614160.png

Personally, I manage it based on David Griesinger’s ideal form while avoiding exponential ETC, but it likely differs from person to person.
It might sound like I’m holding back what I say. And you’re right: you can’t simply label something as good or bad, since people have been debating this for years, and it clearly depends on individual taste and purpose.

Hmmh. ITD is not really a problem, is it? A mismatch in ITD will only broaden or narrow the imaging a bit.
And ILD seems to be for me not such a big thing either. It certainly is personal but mainly follows a smooth pattern.
This is my measured FR (1/24 oct) at the left ear, for the left channel (blue) and the right channel (brown).
Ignore below 400Hz, that is mainly room effects.
1746579084825.png


The green line is the normalization reference.
The blue and yellow curves show the ILDs of two different listeners measured with the same speaker, at the same angle, in the same space.
And yes, of course, this difference diminishes at greater distances.
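For reference, an ILD curve like the ones plotted can be derived from a pair of impulse responses as the level ratio between the near ear and the far ear per frequency bin. The Python/NumPy sketch below shows only that basic calculation. It does not reproduce the normalization or smoothing used for the plots above, and the function name `ild_db` and the toy responses are assumptions.

```python
import numpy as np

def ild_db(ir_near, ir_far, n_fft=None):
    """ILD per frequency bin in dB: near-ear magnitude over far-ear magnitude."""
    near = np.abs(np.fft.rfft(ir_near, n_fft))
    far = np.abs(np.fft.rfft(ir_far, n_fft))
    eps = 1e-12   # guards against division by zero and log of zero
    return 20.0 * np.log10((near + eps) / (far + eps))

# Toy responses: far ear is a delayed copy at half the amplitude
ild = ild_db([1.0, 0.0, 0.0, 0.0], [0.0, 0.5, 0.0, 0.0])
```

In the toy case the ILD is about 6 dB at every frequency. Subtracting the curves of two listeners measured under the same conditions would then show how much their ILD profiles differ.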

1746579170585.png


Even when I remove the normalization and examine the raw responses, they still differ. Again, this difference shrinks at greater distances. So while you can safely ignore it up close, at close range there’s no avoiding it. And even if you try to correct based on the shape of the un‑normalized response, each person’s ILD profile and their familiar spatial cues are different. (Of course, we can still perceive the sound.)
So, when listening to recorded binaural recordings that have been pre‑equalized for DF compensation, you can generally perceive them without major issues.

My idea is, that the main personal characteristics are in the FR, which is highly personal and depends on head, ear position, pinna and concha.
The sound from the opposite side just has to go around the (rather smooth) head and follows the same trends.
Agreed, this is not 30° azimuth, more like 15°, but still.
I would happily use the ITD and the ILD trend of the dummy head that are the result of the concert hall acoustics (and player positions) and just correct the FR (as mentioned before).
We could certainly give it a try the way you suggested; I’ve tried that myself, too. After all, in the process of manipulation or synthesis, you can make just about anything happen.

OK, that got me more than a little confused. I read somewhere that you got rid of all your speakers, and yet you mention speakers frequently, while at the same time meaning a speaker-based crosstalk-cancelling system.
Well that is a different thing for sure.
It’s not about the speaker disappearing, but whether the sound is perceived inside your head (internalized) or out of head (externalized).
When you listen to ASMR through IEMs that tickle your ear canals, sometimes it feels like the sound isn’t just in your ears but almost drills into your brain. Of course, many people seek out that kind of stimulation.
Yet in a more extreme sense, the sound just bounces around inside your head, and it’s hard to feel like it’s coming from outside.
The example I was referring to is speaker‑based crosstalk cancellation. To fairly compare binaural recordings played over headphones/IEMs (where there’s no crosstalk) with speakers (or BRIR playback), you need to ensure the crosstalk is equally reduced in both cases.
Even when you convolve these pinna responses in their normalized form, there are still noticeable differences.
Interesting, but I have never heard that. Maybe it's the preferred program, maybe it's me?
My experience is with classical music: Rattle's Mahler or the Boston Symphony (among many others; the list is long and keeps getting longer) sound great in stereo, but Atmos is still a step up for me (bass in particular, but not only bass), and that is with surround channels convolved by the dearVR plugin, which is not the pinnacle of things.
Yes, so I think it varies from person to person. It’s not your ears; it’s simply a matter of personal preference. I already have full‑sphere HRIR/BRIR—including elevation, beyond a 9.1.6 layout—but 99.9% of what I use is stereo. I only use multichannel occasionally, for example when watching movies.
 
This is a super interesting thread.

It seems like you guys have put a *ton* of effort into your setups - is there an easy-ish way to try adding binaural virtualisation to regular 2ch recordings, just to see if it’s worth pursuing? Something Mac based would be good.
On Windows, the easiest way to try it is with EQ APO and Hesuvi. Hesuvi comes with its own presets, and you can just click and listen right away. You’ll get a decent sense of binaural virtualization before any personalization, without needing additional VSTs or DAW setup. I’m not familiar with Mac-based options, as I don’t use a Mac.
Btw, Welcome to this thread.
 
This is a super interesting thread.

It seems like you guys have put a *ton* of effort into your setups - is there an easy-ish way to try adding binaural virtualisation to regular 2ch recordings, just to see if it’s worth pursuing? Something Mac based would be good.
On Mac: subscribe to the Apple Music free trial, play spatial audio with in-ear headphones, and toggle the Dolby Atmos option between Always On and Always Off to hear the difference directly. This should give you an idea.
If you have AirPods with your personal calibration, even better.
Alternatively, Amazon Music and Tidal also provide spatial audio, and I believe they work on iPhone (not really sure, because I'm on Android).
The latter two apps, on mobile, take advantage of the latest Dolby codec (AC-4 IMS), developed specifically for binaural audio over headphones. In addition to rendering the Atmos master directly, it also renders distance.
Apple actually has its own binauralizer, which converts audio already rendered to 7.1.4 and can't use distance metadata because it is based on the older E-AC-3 JOC codec. On the other hand, it can use head tracking.
In my opinion, AC-4 with the Dolby renderer sounds better.

PS. You cannot convert stereo tracks to binaural if this is what you mean...
There are some upmixers, which cost a lot, but they don't work miracles.
 
PS. You cannot convert stereo tracks to binaural if this is what you mean...
There are some upmixers, which cost a lot, but they don't work miracles.
Oh, OK - I seem to have gotten completely the wrong end of this stick, then.

I thought the idea was to emulate what it's like to listen to speakers in a room, using headphones. I'm not particularly interested in multi-channel audio. I got that impression (I think) from someone here on ASR who seemed to go to extreme lengths to measure (stereo) speakers using in-ear microphones.

Thanks for the help/clarification though.
 
I'm not particularly interested in multi-channel audio. I got that impression (I think) from someone here on ASR who seemed to go to extreme lengths to measure (stereo) speakers using in-ear microphones.
That was probably me. (But maybe not.)

I thought the idea was to emulate what it's like to listen to speakers in a room, using headphones.
If you want to perform personalized measurements, there are various methods, but the most affordable setup is an in‑ear microphone, an audio interface with two inputs, the headphones you plan to use, and a single speaker.
In‑ear microphones typically come in 3 mm or 6 mm sizes. When you’re just starting out, the 6 mm ones can hurt your ears, and the 3 mm fit more comfortably—but the 6 mm capsules do offer better performance. The Samson LM‑10x is a solid choice (though there are plenty of other mics out there).
For the interface, as mentioned above, you just need something with two inputs. I briefly used a MOTU unit myself, and my main recording interface was from Topping—but since it was released early in Topping’s product line, I ran into quite a few firmware bugs at the time.
As for headphones, just pick whatever feels most comfortable on your ears. And your speaker doesn’t have to be a matched pair—just buy an inexpensive one. If you start factoring in each speaker’s radiation pattern and overall fidelity, that’s a different conversation, but for your first attempt at recording in your room, a budget speaker will work just fine.

You cannot convert stereo tracks to binaural if this is what you mean...
There are some upmixers, which cost a lot, but they don't work miracles.

I’ve briefly come across something like this before—it was quite intriguing. But, as you mentioned, to my knowledge there’s currently no (or virtually no) magic that can transform a standard stereo track into a fully authentic binaural one.
 
I read it a while ago, but IIRC it (in part) involved setting up and recording a pair of expensive speakers outdoors. I'll have to find the thread...
Yep. Maybe it wasn’t me. There are other binaural users around sometimes, too.

This thread is part of one of the threads I started in the early days of BRIR.
 
Beyond Mono to Binaural
I’ve briefly come across something like this before—it was quite intriguing. But, as you mentioned, to my knowledge there’s currently no (or virtually no) magic that can transform a standard stereo track into a fully authentic binaural one.
Yes, if we move from the commercial sector to the R&D one, there are several interesting projects. (As I have long argued, neural networks will be the key to binaural audio).

Oh, OK - I seem to have gotten completely the wrong end of this stick, then.

I thought the idea was to emulate what it's like to listen to speakers in a room, using headphones. I'm not particularly interested in multi-channel audio. I got that impression (I think) from someone here on ASR who seemed to go to extreme lengths to measure (stereo) speakers using in-ear microphones.

Thanks for the help/clarification though.
If you simply want externalisation of stereo audio in headphones, there are various plugins that currently do this. APL Virtuoso seems well regarded and works standalone on Mac, so no need to introduce complicated setup and routing.
Give it a try.
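
For anyone wondering what "listening to speakers in a room, using headphones" boils down to in signal terms, here is a minimal sketch of the general idea, not of how APL Virtuoso or any specific plugin works internally. It assumes you already have four equal-length speaker-to-ear impulse responses (for example from in-ear microphone measurements); the `brir` dict and its keys are hypothetical names chosen for illustration.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution (slow, but fine for short examples)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def virtualize_stereo(left, right, brir):
    """Render a stereo track for headphones from four speaker-to-ear responses.

    Assumptions (illustrative only):
    - left and right have the same length
    - brir is a dict with keys "LL", "LR", "RL", "RR" of equal length,
      where e.g. "LR" is the response from the Left speaker to the Right ear
    """
    # Each ear hears both speakers: the same-side path plus the crosstalk path.
    ear_left = [a + b for a, b in zip(convolve(left, brir["LL"]),
                                      convolve(right, brir["RL"]))]
    ear_right = [a + b for a, b in zip(convolve(left, brir["LR"]),
                                       convolve(right, brir["RR"]))]
    return ear_left, ear_right


# Toy example: a click in the left channel only, with made-up responses.
toy_brir = {
    "LL": [1.0, 0.5], "LR": [0.0, 0.3],
    "RL": [0.0, 0.3], "RR": [1.0, 0.5],
}
ear_l, ear_r = virtualize_stereo([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], toy_brir)
```

A real convolver would use FFT-based partitioned convolution for long room responses, and headphone compensation would still need to be applied on top.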
And on top of that, the actual calibration curves of headphones and IEMs also play a significant role.
Yes, forgot to mention this. Very relevant.
Indeed, if preference studies use some kind of out-of-the-box calibration for comparison, then I would take that with a rock of salt.
Timbre/tonality at least has to be tweaked individually as timbre will change the spatial perception quite a bit too.
The point is related to actual measurement capability. It seems we cannot get measurements with uncertainty low enough to avoid perceptual alteration, so working with absolute HRTFs is not entirely reliable; it is necessary to resort to statistics.
This is why I say that focusing too much on HRTF-related aspects is not really worth it. As long as we are in the entertainment field, an approach that includes verifying subjective preference is better.
 
Yes, if we move from the commercial sector to the R&D one, there are several interesting projects. (As I have long argued, neural networks will be the key to binaural audio).
Yes, I think I’ve seen a few. If such technologies become successfully commercialized, they’ll open up a whole new experience for everyone.

Yes, forgot to mention this. Very relevant.
Yes—these calibration curves are extremely important, but people often overlook or misunderstand them.

The point is related to actual measurement capability. It seems we cannot get measurements with uncertainty low enough to avoid perceptual alteration, so working with absolute HRTFs is not entirely reliable; it is necessary to resort to statistics.
This is why I say that focusing too much on HRTF-related aspects is not really worth it. As long as we are in the entertainment field, an approach that includes verifying subjective preference is better.
I agree to some degree. However, this thread isn’t meant to focus exclusively on HRTF-related factors—I started it so we could cover most of the topics around virtualization together.
The reason I can’t fully agree is that, aside from statistical considerations, I’m neither the measured subject nor a dummy head. In that sense, each person has their own allowable tolerance for mismatches, and similar issues can stem from the uncertainties and instabilities in the measurement methods you mentioned.
Still, even imperfect measurements capture individual traits, and while some variation is tolerable, sometimes the result can sound like it isn’t working at all.
Finally, the way I (or any of us) distinguish front versus back and perceive elevation is inherently different, and when you factor in the HPCF variables we talked about earlier, the confusion only increases.
Of course, this really does vary from person to person. So I won’t share any further personal opinions on the matter, since I respect and take into account my own viewpoint, others’, and yours as well. I’m simply a bit hesitant to label any approach as unnecessary or good/bad. Since you’re so knowledgeable about binaural audio, I’m confident you’ll understand my nuance without any misunderstanding.
 
I agree to some degree. However, this thread isn’t meant to focus exclusively on HRTF-related factors—I started it so we could cover most of the topics around virtualization together.
The reason I can’t fully agree is that, aside from statistical considerations, I’m neither the measured subject nor a dummy head. In that sense, each person has their own allowable tolerance for mismatches, and similar issues can stem from the uncertainties and instabilities in the measurement methods you mentioned.
Still, even imperfect measurements capture individual traits, and while some variation is tolerable, sometimes the result can sound like it isn’t working at all.
Finally, the way I (or any of us) distinguish front versus back and perceive elevation is inherently different, and when you factor in the HPCF variables we talked about earlier, the confusion only increases.
Of course, this really does vary from person to person. So I won’t share any further opinions on the matter, since I respect and take into account my own viewpoint, others’, and yours as well. I’m simply a bit hesitant to label any approach as unnecessary or good/bad. Since you’re so knowledgeable about binaural audio, I’m confident you’ll understand my nuance without any misunderstanding.
Yes, absolutely agreed. The way we use ITD/ILD is obviously common to all of us, so there is room to exploit these cues to create convincing virtual experiences.
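
As a rough illustration of why ITD is a fairly "common" cue, here is a minimal sketch using the classic Woodworth spherical-head approximation, ITD = (a / c) * (theta + sin(theta)). This is a textbook model, not something any of the tools mentioned here are confirmed to use; the head radius of 0.0875 m and speed of sound of 343 m/s are assumed typical values.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference in seconds.

    Woodworth spherical-head model, valid for a distant source at an
    azimuth between 0 (straight ahead) and 90 degrees (fully to one side).
    head_radius_m and speed_of_sound are assumed typical values.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("model is only used here for 0..90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))


# ITD for a typical stereo speaker angle and for a fully lateral source.
itd_30_us = woodworth_itd(30.0) * 1e6
itd_90_us = woodworth_itd(90.0) * 1e6
```

In this model the only personal parameter is head radius, which varies much less between people than pinna shape does. That fits the point above: the time and level cues generalize reasonably well, while front/back and elevation cues are where individual differences bite.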
 
Yep. Maybe it wasn’t me. There are other binaural users around sometimes, too.

This thread is part of one of the threads I started in the early days of BRIR.
Ah yep, that's 100% the thread I was thinking of.

If you simply want externalisation of stereo audio in headphones, there are various plugins that currently do this. APL Virtuoso seems well regarded and works standalone on Mac, so no need to introduce complicated setup and routing.
Give it a try.

Ah, this is exactly what I was after. Well, except that I'd like to work out how to integrate something like this into my normal WiiM-based streaming setup. Cheers!
 
Btw, what RT do you usually listen to (or what’s the RT in your own environment)?
I adjust it depending on the situation. Sometimes I listen with a very clear, anechoic response; other times I use a typical RT of around 200–300 ms. Occasionally, to invite a singer into my personal concert hall, I’ll set a long RT of about 2–2.5 seconds.
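
To put those RT figures in perspective, here is a minimal sketch assuming RT means RT60 (the time for the reverberant level to fall by 60 dB) and an idealized exponential decay. The function names and the noise-based tail are illustrative assumptions, not the method used by any particular BRIR tool.

```python
import random


def decay_level_db(t, rt60):
    """Level of an ideal exponential reverb tail at time t, in dB re. start."""
    return -60.0 * t / rt60


def decay_gain(t, rt60):
    """Same decay expressed as a linear amplitude factor."""
    return 10.0 ** (decay_level_db(t, rt60) / 20.0)


def synthetic_tail(rt60, sample_rate=8000, seed=0):
    """Toy reverb tail: uniform noise shaped by the ideal decay envelope.

    The tail is rt60 seconds long, so it ends at about -60 dB.
    sample_rate and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    n = int(rt60 * sample_rate)
    return [rng.uniform(-1.0, 1.0) * decay_gain(i / sample_rate, rt60)
            for i in range(n)]


# How far has the tail decayed 100 ms after the direct sound?
level_short_room = decay_level_db(0.1, 0.25)  # living-room-like RT
level_long_hall = decay_level_db(0.1, 2.0)    # concert-hall-like RT
tail = synthetic_tail(0.25)
```

With an RT of 250 ms the tail is already about 24 dB down after 100 ms, while with an RT of 2 s it has dropped only about 3 dB, which is why the long setting sounds like a hall rather than a room.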

