
Threads of Binaural virtualization users.

I'm curious to hear more about how your process works. Are you chaining multiple SPARTA plugins? I've only used Binauraliser myself.

I took the 11 64-channel Eigenmike files that were closest to the Atmos bed and height positions (used MATLAB to convert them to WAV first), then ran them through the Eigenmike VST encoder plugin I got from their web page (so they become 64-channel HOA files). I then made 12 tracks in Reaper, routed audio in, and convolved each channel except LFE with the SPARTA Matrix Convolver using the 64-channel HOA WAV files I made previously. After that, every channel is routed to the master bus, which has SPARTA AmbiBIN in 6th-order binaural with my SOFA file on it, and headphone EQ on top. I also kept the LFE separate and just used the binaural plugin to set it ahead of me in space, plus a low-pass filter at 60 Hz. The speakers from the impulse responses roll off around 50 Hz, so the clean bass channel helps out. I also measured the "speakers" with REW and got a flat response with LFE and left/right channels.

Side bonus is that I can now cook eggs on my M1 when all this convolution is going on :) I've never gotten this machine that hot.

But! I also got Virtuoso trial extension, and well... it does amazing things now that I use my sofa. The Reaper solution is fun, but the placement is much more realistic in Virtuoso. At first, I thought the pre-set rooms in Virtuoso use different files in the background, but after a while I figured it's the same engine (just different parameters like size, etc.) Definitely using it going forward.
 
@visualizer that's a pretty intense setup! I had wanted to try setting up some per-channel reverb with Sparta Binauraliser, but I find I like to listen with Virtuoso for some recordings and Binauraliser for others. Both plugins sound better to me when 2-channel tracks are upmixed to 7.1 from JRiver, but I'd also like to experiment with other options such as Penteo.

What kind of tracks are you listening to? Are you routing Atmos tracks from Apple Music? AFAIK this isn't possible on Windows; I currently don't have much in the way of multichannel recordings but am keen to find other sources of multichannel tracks and/or ways to play back Atmos tracks with my SOFA file.

Also, did any of you folks see the new plugin "Orbit Spatial" featured on Michael Wagner's YouTube channel? It's an Atmos renderer (Mac only) that allows you to load in custom SOFA files. I really need to get a Mac at some point this year!
 
@variance Yes, I've been listening to Apple Music Atmos mixes and just soloing speakers to compare the tonality and accuracy. Orbit Spatial looks like a cool option for getting an overview of the mix. If only there were a way to get mainstream music ADM files... Tho it is absolutely possible to get 12-track MP4 audio from Tidal if you need (via "Tidal downloader" with Atmos track download turned on).
 
I’ve found RWTH IKS Lab Eigenmike em64 impulse response database with impressive spherically arranged 36 speaker array....
I took the 11 64-channel Eigenmike files that were closest to ATMOS bed and ...
Let me ask, why this room?
From the paper:
"The results showed that the measurement room exhibits low reverberation times and high clarity, which indicate a suitable environment for simulating acoustic scenes."

Basically the room adds nothing to the reproduction, so it is suitable for simulation. In your case, bypassing the whole simulation gives the same or better result. Why didn't you pick a room where music is really performed?
 
@variance I already tried the mesh2SOFA, wanting to DFE my previously made project. It seems to me it's not possible to skip any steps in the GUI? For example, go straight to step 6 and only feed it the finished NumCalc output?
Just wanted to update you (and anyone else reading this) that I created a new script called `sofa_mastering_tool.py` that can load an existing SOFA file and separately create mastered versions of the input file and generate tilted DFHRTF files. This way, if you have an existing SOFA file and just want either the DFHRTF or a different sample rate, you can use this tool.

EDIT: I forgot to mention that if you want to use the new DFHRTF tool, be sure to either clone the full repo or at least grab the updated `generate_extras` and `generate_sofa_outputs` files, as the previous versions don't work with the new tool.

Some of the publicly available HRTF datasets are already diffuse-field equalized, but for others such as HUTUBS, you can use this tool to generate their DFHRTFs from their simulated results to compare to your own. I started testing with this and the results are pretty interesting so far! It's easy to see how different HRTFs are, even starting from 2 kHz.

[Attachment: 1772394368342.png]
 
I've been experimenting with different audio processing paths:
  • Stereo → JRiver Upmix → sparta_binauraliser or APL Virtuoso
  • Stereo → Ambisonic conversion → COMPASS Binauraliser
So far I like how the ambisonic path sounds the best, but it's a colossal hassle and the settings need to be tweaked carefully to actually work. (It's very easy to cause audio glitches this way.) Let me know if anyone is interested in this and I can post more about the settings I'm using.
 

[Attachment: Stereo2BinauralAudioChain.png]
Let me ask, why this room?
From the paper:
"The results showed that the measurement room exhibits low reverberation times and high clarity, which indicate a suitable environment for simulating acoustic scenes."

Basically the room adds nothing to the reproduction, so it is suitable for simulation. In your case, bypassing the whole simulation gives the same or better result. Why didn't you pick a room where music is really performed?

I figured since the RT60 value was close to that of studio spaces, it would work well for "virtual studio".

I notice you prefer to add more reverberation to your playback, which just seems to be a different approach. Since I produce and mix music that is considered modern, the reference mixes I listen to are all mixed in a room with similar traits. I would even explain this further.

If a mix engineer mixes, for example, a pop record, he then manipulates reverberation behind different elements to make them appear closer or further back. A well-reverberated mix will almost create an illusion of elements being "placed" in the sound field. If I then listen back to such a mix and add, let's say, concert hall reverberation, it would send all the nuanced reverberation balance the mix engineer created into the hall reverb, so I would be playing back reverberation inside reverberation, essentially breaking the intended sound image. I think you see where I am going with this. Adding reverb might work for classical music if the recording is reverberated naturally during the recording process, and there isn't too much reverb already present. But again, if I listen to a good orchestral Atmos mix in 7.1.4 through Virtuoso, it sounds very immersive to me, as if I were sitting in a concert hall.

Now you might say, why add all this very short reverb to the virtual speaker room at all if there is a reverb already added to every instrument in the mix? The thing is, we perceive reverberation differently when we listen to mixed music in a room and when we listen to it through headphones. For example, if I add a certain amount of reverb to a guitar or a vocal when mixing just on headphones without room simulation, it might sound proper to my ears, but if I then go and listen to the same audio in the studio mixing room on speakers, I will hear that there is too much reverb all of a sudden. That is because the natural room reverb makes us feel reverberation levels differently inside the mix. But! If I mix a track on speakers inside a room, the balance always sounds correct on headphones. Hence, the desire to create "virtual mixing rooms" for headphones with very balanced, minimal reverberation.

That's all, of course, if you mix or produce music. For just listening purposes, I think we can all choose what we prefer and whether we want to hear precisely what the artist and mix engineer intended, or whether we want to make listening more fun.

I also understand there is no way of making a stereo classical piece feel like you are sitting inside the concert hall without adding reverb channels to the sides and back. But I also think that's what the Atmos mix does anyway (or maybe the mix engineer decides to pan sources around you and make a soprano run circles around your head, which I hope nobody does :) )

@variance, many thanks, I will check the new scripts out! Also, a silly question - if you plot personal or other HRTFs in REW, do you run a sweep through a Sparta plugin with a SOFA file inserted, or do you have a simpler way of doing it?
 
Hey @visualizer. These plots are generated by my mesh2SOFA tools. I recently added an extra tool to the repo called "sofa_mastering_tool.py". Basically you just drag a SOFA file onto the window, and then you can either resample the SOFA files or export CSV files of the Diffuse-Field HRTF with a configurable tilt (set to 0 for neutral diffuse field). I then load those CSV files into REW and overlay them. I can make a quick video of how to do this if you'd find it helpful!

I'm also thinking of adding the ability to load multiple SOFA files onto this app so that it's easy to batch generate these DFHRTFs.

Note: If the result of exporting the CSVs is a flat line instead of curved, then the SOFA file has already been diffuse field equalized. I'm going to investigate if it's possible to "un-diffuse-field-equalize" these kinds of files. Basically I'm just curious to compare the various diffuse field responses from the various databases out there. To my knowledge nobody has done this, so we really haven't seen how different people's overall hearing is from a headphone point of view - we've all been looking at mannequin head DFHRTFs only!
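
For anyone wondering what a "diffuse-field HRTF" boils down to numerically, here is a minimal sketch. It is not the code from `sofa_mastering_tool.py`; it assumes the usual definition (a power average of the HRTF magnitudes over all measured directions) and it leaves out the tilt option entirely. The function names, the `weights` argument and the FFT size are my own placeholders. It also shows why an already equalized file exports as a flat line: once every direction is divided by the diffuse-field magnitude, the diffuse-field average is 0 dB at every frequency.

```python
import numpy as np

def diffuse_field_db(hrirs, weights=None, n_fft=512):
    """Diffuse-field magnitude in dB for one ear.

    hrirs:   array of shape (directions, samples), one HRIR per direction.
    weights: optional per-direction weights (e.g. solid angle); equal if None.
    """
    spectra = np.fft.rfft(hrirs, n=n_fft, axis=-1)
    power = np.abs(spectra) ** 2
    if weights is None:
        weights = np.ones(hrirs.shape[0])
    weights = np.asarray(weights, dtype=float) / np.sum(weights)
    # Weighted power average across directions, per frequency bin.
    df_power = np.tensordot(weights, power, axes=(0, 0))
    return 10.0 * np.log10(np.maximum(df_power, 1e-20))

def diffuse_field_equalize(hrirs, weights=None, n_fft=512):
    """Divide every direction by the diffuse-field magnitude (magnitude only)."""
    spectra = np.fft.rfft(hrirs, n=n_fft, axis=-1)
    df_mag = 10.0 ** (diffuse_field_db(hrirs, weights, n_fft) / 20.0)
    return np.fft.irfft(spectra / df_mag, n=n_fft, axis=-1)
```

The equalization here is a plain zero-phase division for illustration; a real tool would more likely build a minimum-phase inverse filter to avoid time-domain wrap-around.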
 
I finally tried mesh2hrtf after scanning my head/ears with a borrowed iPhone 12. The externalization was good, but for already mastered stereo music I didn't enjoy the experience that much: too much coloration. But I am very happy using those HRTFs (+30 and -30 degrees) only for crossfeed. I'm sharing this simple setup in case someone wants to give it a try (or improve it):

Setup: Only the contralateral HRTF is used (ear opposite to the sound source). The input signal is split: one stereo branch goes direct, the other stereo branch passes through the contralateral HRTF and is mixed at reduced level.

Latency alignment: To remove the absolute HRTF latency while preserving the natural ITD between ears, do the following: feed an impulse into the ipsilateral HRTF (not the contralateral) and locate the peak sample; this gives you the arrival time at the nearest ear. Trim that same number of samples from the contralateral HRTF output.

EQ with Autoeq: Apply linear-phase FIR convolution EQ after the HRTF: Harman-like target on the direct branch, flat with +3 dB bass boost on the crossfeed branch. This compensates for the tonal coloration introduced by the contralateral HRTF and ensures both branches sum coherently across the spectrum.

Crossfeed level: Attenuate the contralateral branch by around -14 dB before mixing. Calibrated by ear using hard-panned material — enough to narrow the image without collapsing to mono.

Mix: Sum direct and crossfeed branches. Each ear receives its direct signal plus a quieter, HRTF-colored, naturally delayed contribution from the opposite channel, reducing perceived stereo width.
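
The latency alignment, level and mix steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the HRIRs are taken to be already loaded as 1-D NumPy arrays, both input channels have the same length, the function and argument names are made up, and the two EQ stages (Harman-like on the direct branch, flat with bass boost on the crossfeed branch) are left out.

```python
import numpy as np

def align_contralateral(ipsi_ir, contra_ir):
    """Trim the ipsilateral peak delay from the contralateral IR.

    The common latency is removed, while the extra delay of the far ear
    (the ITD) stays in the trimmed response.
    """
    peak = int(np.argmax(np.abs(ipsi_ir)))  # arrival time at the near ear, in samples
    return contra_ir[peak:]

def crossfeed(left, right, contra_for_left_ear, contra_for_right_ear, level_db=-14.0):
    """Direct signal plus an attenuated, HRTF-filtered copy of the opposite channel."""
    gain = 10.0 ** (level_db / 20.0)
    n = len(left)
    # The right channel reaches the left ear through the contralateral path, and vice versa.
    cross_l = np.convolve(right, contra_for_left_ear)[:n]
    cross_r = np.convolve(left, contra_for_right_ear)[:n]
    return left + gain * cross_l, right + gain * cross_r
```

With a hard-panned test signal (one channel silent), the silent side of the output contains only the delayed, filtered crossfeed at about -14 dB, which makes it easy to check the level by ear or in a meter.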
 
Just wanted to let folks know that Mesh2SOFA has been updated and now supports multiple evaluation grids in the same run. The Mesh2Input plugin itself supports it natively but I had to figure out how to get it to work with my particular workflow. The reason I did it is to verify that the Default grid produces the same or very similar result as a true T-design evaluation grid.

I noticed that @korrika is using the SOFA results for crossfeed in a manner very similar to how @fcserei does it with a "true stereo" type crossfeed. Maybe I should work on adding a feature to the "Generate Extras" step in Mesh2SOFA and/or a standalone tool that can pull the IRs needed to do this automatically. Is this a feature that people might use?
 
What is Eigenmike? I think there is a lot to this thread that I've missed.

Room correcting virtual speakers with REW is next level lmao.

Virtuoso is great, probably the best algorithmic reverb I've heard, but it definitely is missing the realism that convolutional reverb can have. As a side note, it recently received an update with a new room that may be different from the standard reverb settings (as well as a free camera headtracking app!). I'm just about to install and test now.

I've been thinking about doing a custom ambisonic setup to record the impulse responses for, then use in HeSuVi real time. The customizability of an ambisonic renderer is obviously the appeal, the ability to customize rendering exactly how you like, but some of those plugins have a steep learning curve and very dubious or esoteric settings (since they are obviously used for research and not consumer focused tech). Getting even a standard 7.1 layout is also very tedious, and I can never seem to get the plugins to work well together (almost always have some clipping).

What is Penteo?

Correct, Windows doesn't have object-based audio output at all (mac or otherwise), all object-based audio goes through the Spatial Sound API which can only be decoded by headphone binauralization (Sonic, Atmos for Headphones, DTS:X) or output to an HDMI decoder (e.g. receiver). Unless the actual application outputs non-object-based encoded audio to linear PCM with an arbitrary channel count (think, decoded Atmos in Reaper pointed at a 16ch WASAPI device, which does work), then all object-based encoded audio goes through this API. So video games, True-HD movies, and other real-time use cases will not be decodable on Windows to a 16ch device (like it can be on Mac) without an update from Microsoft. Which unfortunately, probably will never happen. However, Apple Music can and will output 5.1 "Dolby Audio", and of course, standard 5.1 and 7.1 files (movies, music, or otherwise) can be output by any player that supports it, assuming you are outputting to a multichannel audio device in WASAPI.

What did you do to edit your mesh? I asked this question a while back and received some great replies, but I'm still unsure of the exact process for a "broken" mesh (e.g. a mesh with small holes behind the ears). I'm a complete beginner at Blender, although I know that there are programs like Mesh2SOFA. I'm about to review the rest of the thread, but any advice for meshes would be appreciated. I recently got very good Face ID scans when I was also taking some pictures and anthropometric measurements for EAC_Individualized and whatnot. Unfortunately, I don't think Mesh2HRTF (the original program) produces great localization. With all the AI tools and models out nowadays, I hope they potentially update the model soon.

I'm actually thinking of just biting the bullet and buying some binaural in-ear microphones. I have some decent speakers to use with Impulcifier, and then there are scripts like generate HRTF, which supposedly can give you a decent SOFA HRTF file with just in-ear binaural mics and room calibration.
 
That was my feeling too. I think they voiced it for users watching iDevices at arm's length; that's why the virtual sound is so close.


Apple TV mode only kicks in if you connect your AirPods to an Apple TV. The Apple TV rendering is a BRIR of a bigger room; the target market is the home cinema environment, I guess.
I measured everything within the computer using the Logic Pro's built in plugin which can access the Apple system renderer even when you are not playing through the Apple headphones. So you can get a multichannel directional stimulus into the plugin and record the response.


I add the missing sound field to existing recordings between the source and the binauralizer. I use ambisonic multichannel reverbs, with a selected IR trying to match the original performing venue. I keep the direct sound from the record intact and just add the reverb without the direct part. It will basically recreate the missing ambience channels from stereo, kind of upmixing it.
Check this paper from page 11 onward: https://www.angelofarina.it/Public/Papers/155-AES19th.PDF

I am not familiar with PC solutions.
One of the cheapest and easiest great-great-great ambisonic reverbs with real spaces is the Waves IR-360 (Mac and PC), which is based on Angelo Farina's work. It is 5.1 only, so no height. It works in any plugin host, Logic Pro too, so you can combine it with the Apple renderer. Or you can use Sparta Binauraliser and your HRTF with an OSC head tracker for a complete solution with any headphone.
The Space Designer in Logic Pro is also a great ambisonic reverb. The built-in B-format rooms are good, but single point source, so you have to insert one for each input channel and link them. With the Impulse Response Utility, you can create your own 3D spaces. You can find LR or LCR impulse responses in B-format on the net and build your own multipoint B-format rooms for a single master channel reverb. These are usually eerily realistic and can be fed to the Sparta or Apple renderer within Logic Pro. They successfully mask the Apple renderer's built-in music mode BRIR.



If you assemble the stereo playback - reverb to multichannel - binaural render from multichannel to headphone chain, you can capture a couple of 4 channel full stereo impulse responses with the most used reverb settings in a fixed head position.
I use those IR with a convolver on devices where the whole head tracking chain is not available.
I was also experimenting capturing the multichannel output of the reverb and save it in a multichannel Flac ( there are batch utilities for that). Foobar2000 on an iPhone can play back multichannel files and feed it directly in Mch format to the Apple renderer with head tracking.
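
As a rough illustration of how such a captured 4-channel "full stereo" IR set gets used in a convolver, here is a minimal Python sketch. It assumes the four responses (left input to left ear, left input to right ear, right input to left ear, right input to right ear) have already been split into separate 1-D arrays of equal length, and that both input channels have the same length. The function name and channel ordering are my own; actual convolvers differ in how they order the four channels in the file.

```python
import numpy as np

def true_stereo_convolve(left, right, ir_ll, ir_lr, ir_rl, ir_rr):
    """Apply a 4-channel stereo IR set; ir_xy is the path from input x to ear y."""
    out_l = np.convolve(left, ir_ll) + np.convolve(right, ir_rl)
    out_r = np.convolve(left, ir_lr) + np.convolve(right, ir_rr)
    return out_l, out_r
```

Because the IRs were captured with a fixed head position, the result is static: it reproduces the reverb and binaural render of that one pose, without head tracking.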


My basic workflow is the following, although variance is working on a script to make it easier:

1. Scan your head with an iPhone. Do it once, do it right; you don't have to splice in high-res ear scans. It will make no difference, it just makes things complicated.
I use a swim cap for the hair but no crazy reference grid. I put short earplugs in my ears exactly in the position where the ear canal mic will be for headphone correction. This way I don't have to sculpt my ears and I have the proper mic position reference in the model. Start from the back, do a little up and down around the ears and your face, and finish in the back. This way the accumulated errors will show up in the back, where they are not important. Small holes are OK if smoothing over them is a good approximation.

2. Use Meshmixer to prepare the model. That will solve most of the problems which are much more complicated in other apps. Remove the excess, make it solid, remesh with the finest resolution and check for holes. Any hole can ruin the following steps.

3. Position the head model in the coordinate system in Blender according to the tutorial.

4. Run the hrtf_mesh_grading script for the left and right to the desired resolution. It will reduce the million vertices models to 20-70k which can be handled in the simulation.

5. Import the 2 simplified models to Blender, and set the materials and microphone locations. Export the left and right model separately with the hrtf export script. It will create 2 simulation work directories for the 2 ears.

6. Run NumCalc on each. If you have less memory, you can run NumCalc without the wrapper script. I ran my simulations with 8 GB on a Mac.

7. Run finalize_hrtf_simulation to create the sofa files and diagnostic plots.

8. Measure your ear with an in-ear mic and correct your headphone accordingly. Headphone responses measured on your head with an in-ear mic usually have nothing to do with published standard measurements.
I had a longer reply for this and then the page accidentally reloaded!

Anyways, I just wanted to say thanks for the detailed reply. I have a few things to add.

Firstly, I wanted to mention that in order for the personalized profile to be applied to Logic Pro's Apple Spatial Renderer, you need to have AirPods connected. This frustrated me the first time I recorded it to use with HeSuVi: only after editing did I realize it wasn't personalized.

Second, I have nearly all of the Waves reverb plugins, but is there a way to upmix them into ambisonics? For example, 5.1 reverb is pretty useless for 7.1 content or recording (especially if you want to record the impulse response for HeSuVi). Is there a plugin that can wrap lower channel counts into proper ambisonics so that all channels of a 7.1 ambisonic setup would be affected by a (wrapped/upmixed) 5.1 reverb plugin?
 
I just followed the Mesh2hrtf video tutorial on mesh preparation the best I could, using Meshlab and Blender.
 
@aaronth07 Mesh2SOFA will make it easier and less error prone to use Mesh2HRTF, but it assumes that your mesh is already defect free. I've gone through the scanning and cleanup process several times now and was thinking of doing a video that discusses specifically how to process a very high res mesh that was scanned with a 3d scanner and fixing exactly those issues with holes and other weird defects that can occur. Would this be helpful?

That being said, while I've gotten good at removing defects and producing a mesh that processes efficiently, I'm not getting consistent results between scans. Every run I've done with scanning and processing my own head and calculating DFHRTF has produced a very different result above about 4 kHz. Due to the shape of my ear, the critical valley that runs from the concha up the front of the ear (cymba concha) is occluded and cannot be scanned; there is some degree of variability with this area on each scan which may be the main culprit. I've experimented with making a reverse mold of this area and merging the scans, but I think I need to get a different material (such as smooth-on body double) to make a more accurate mold.

I've posted a large number of Diffuse Field HRTFs from public databases at variancelog.squig.link, including where there are both measured and simulated HRTFs like the FABIAN database. On the whole, there is very poor correlation between measured and simulated HRTFs above even about 2kHz. I'm trying to understand where the main source of error is. The closest match I've seen between simulated and measured DFHRTF is in the FABIAN HRTF database. Fabian's ear has an open cymba concha, which may be a reason for the success? I should reach out to him in case he has any insight.
 
I tried the latest Virtuoso update with the new 2L room. However, am I the only one who finds that Virtuoso tends to sound very colored? I can't seem to even EQ it to sound right to me on most tracks. I prefer running a chain with the SPARTA and/or COMPASS ambisonic plugins instead because it enhances spaciousness and allows head tracking without coloring the sound as much. (I've tried both direct and diffuse-field equalized SOFA files with Virtuoso.)
 
What would be your reference against which Virtuoso is “coloured“? And what kind of colour are you talking about? In stereo or in surround or in Atmos or ...?
I don't think there is such a thing as “uncoloured“. The result can only be an approximation, in binaural or in a room. And sometimes one has to give up a bit of “nice“ tonality to have more spatial realism, where a cello or a voice sounds like a physical object/subject and not like a giant poster of itself.

When I first checked out Virtuoso 2 it became instantly clear to me that HRTF E (based on Gras-Kemar) is the option in the plugin that gives by far the best spatial impression for me. [KU100 (Uni Cologne version) had nicer tonality but sounded mashed to me in comparison.]
So, everything below is about HRTF_E (Kemar).
I would agree that in stereo Virtuoso (HRTF E) does sound somewhat bright and sharp with many recordings, compared to a Harman EQed headphone at the stereo outputs. (I am in the group that agrees to a high degree with the Harman target.)

I, too, have found that it’s not at all easy to adjust the sound for stereo. You have to contend with all these format limitations.
So I tried different things:
- Measuring and EQing the “room measurement“ in Virtuoso.
- Measuring an approximate and weighted average FR for the head rotating through 360 degrees (in steps of 30).
- Windowing the “direct sound“ at the ear and EQing that for flat.

Interestingly, all of this resulted in similar filter curves. However, though the details differed slightly, the result was not fully satisfactory in every case. The sound was uneven for me, some recordings sounded well, others not so much.
Typically the brightness was easy to correct, but the sharpness not so much. And when the sharpness was gone, most of the time the sound got dull or mashed in the process. Reducing the filter gain amplitudes made the sharpness come back ...
My guess is that trying to be exact and using too many and too narrow filters is not the way to go.
And notably, there was a distinctive difference between classical recordings in an acoustical situation and pop recordings that were panned at the console. I could not make both similar to Harman at the same time.

So I played around with eyeballing the results from above and ended up with only two peak filters. (For me + Kemar + 2L)
PK - 1600Hz | Q1.71 | +7.2dB
PK - 2800Hz | Q0.43 | -6.0dB
[This is a starting point, and some recordings seem to need a bit more or a bit less for best results.]
This removes maybe 80% of the tonality difference compared to Harman for me. And about the remaining 20% I am not sure whether Harman is right or wrong.
Anyways, binaural sounds so much better! For me, “in-the-head“ localisation or cloudy sources are an important kind of coloration too.
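
If you want to see what these two peak filters add up to before loading them into an EQ, here is a small Python sketch. It assumes the common "Audio EQ Cookbook" (RBJ) peaking biquad and a 48 kHz sample rate; the EQ actually used for the settings above may define Q differently, so treat the curve as approximate. The function names are placeholders.

```python
import numpy as np

def peaking_biquad(f0, q, gain_db, fs=48000.0):
    """RBJ-cookbook peaking filter coefficients (b, a), normalized so a[0] == 1."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1.0 + alpha * a_lin, -2.0 * np.cos(w0), 1.0 - alpha * a_lin])
    a = np.array([1.0 + alpha / a_lin, -2.0 * np.cos(w0), 1.0 - alpha / a_lin])
    return b / a[0], a / a[0]

def response_db(b, a, freqs, fs=48000.0):
    """Magnitude response of one biquad in dB at the given frequencies (Hz)."""
    z1 = np.exp(-1j * 2.0 * np.pi * np.asarray(freqs, dtype=float) / fs)  # z^-1
    h = (b[0] + b[1] * z1 + b[2] * z1 ** 2) / (a[0] + a[1] * z1 + a[2] * z1 ** 2)
    return 20.0 * np.log10(np.abs(h))

# The two filters quoted above: (frequency in Hz, Q, gain in dB).
FILTERS = [(1600.0, 1.71, 7.2), (2800.0, 0.43, -6.0)]

def total_response_db(freqs, fs=48000.0):
    """Combined curve: cascaded filters add up in dB."""
    return sum(response_db(*peaking_biquad(f0, q, g, fs), freqs, fs) for f0, q, g in FILTERS)
```

Plotting `total_response_db` over a log-spaced frequency axis shows how the narrow boost and the wide cut overlap around 1.6 to 3 kHz.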

For surround/atmos the situation is different for me. The tonality in Virtuoso is rather fine without EQ.
My experience is for classical, acoustic recordings as this is 99% of what I listen to. I wouldn't know which recordings to check.
If anything I would apply a reduced version (30-60%) of the filters above for some recordings, but the effect is small.
Obviously bad recordings will not become good ones.

All this is a bunch of compromises of course and one has to get lucky with a generic HRTF or get a personal one to get a good combination of tonality and spatiality, but I am content at the moment. Good head tracking helps a lot. [The camera head tracker is nice to have, but a gyroscopic one is much faster and smoother. I can only recommend.]

Every run I've done with scanning and processing my own head and calculating DFHRTF has produced a very different result above about 4 kHz.
Maybe, but how do you know that “very different results“ are indeed significantly different? When you look at the FABIAN HRTFs a turning of the torso (not the head!) of 30° makes a big difference in the graphs, but I am quite sure it will sound very similar.
How do your different head scan results sound?
 
@olieb
What would be your reference against which Virtuoso is “coloured“? And what kind of colour are you talking about? In stereo or in surround or in Atmos or ...?
My comparison was against (a) no binauralization and (b) against how I listen through an ambisonic chain of Sparta AmbiENC (1OA) or SpartaAmbiRoomSim (1OA) > Compass Upmixer (upmix to 3OA) > Compass Binaural or Sparta AmbiBIN.

I analyzed the outputs to understand why they sound so different, and why the Ambisonic signal path sounds so much more "natural" to me. It turns out they are VERY different, and the ambisonic signal path does change the frequency response less than Virtuoso. Sparta Binauraliser produces the same result as Virtuoso, minus the room reflections.

Note1: These are all using standard stereo source file with the same -30 / +30 positions for the virtual speakers.
Note2: I used an MTW window on the Virtuoso output in REW to more closely compare to the Sparta Binauraliser output.
 

Attachment: binaural_rendering_comparison.png
Maybe, but how do you know that “very different results“ are indeed significantly different? When you look at the FABIAN HRTFs a turning of the torso (not the head!) of 30° makes a big difference in the graphs, but I am quite sure it will sound very similar.
How do your different head scan results sound?
In this context I'm referring to the diffuse field HRTF derived from the sofa files using Mesh2SOFA's "sofa_mastering_tool". The DFHRTFs of the different FABIAN head rotations are virtually the same, barring some wiggles around 1kHz on some.
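For readers unfamiliar with the term, a diffuse-field response is commonly taken as the power average of the HRTF magnitudes over all measured directions. The sketch below shows that idea with NumPy. It is not the code of the "sofa_mastering_tool"; the unweighted average assumes roughly uniform direction sampling, and the array layout (measurements, ears, samples) is assumed to match the SOFA `Data.IR` convention.

```python
import numpy as np


def diffuse_field_db(hrirs, n_fft=None):
    """Power-average HRIRs over directions; returns dB per ear and frequency bin.

    hrirs: array of shape (n_directions, n_ears, n_samples).
    Assumption: directions are sampled roughly uniformly, so no area
    weighting is applied.
    """
    hrirs = np.asarray(hrirs, dtype=float)
    spectra = np.fft.rfft(hrirs, n=n_fft, axis=-1)   # (directions, ears, bins)
    power = np.mean(np.abs(spectra) ** 2, axis=0)    # (ears, bins)
    return 10.0 * np.log10(np.maximum(power, 1e-20))  # floor avoids log(0)


# Synthetic example data (not real HRIRs): random impulse responses.
rng = np.random.default_rng(0)
fake_hrirs = rng.standard_normal((100, 2, 256)) * 0.01
df = diffuse_field_db(fake_hrirs)
print(df.shape)  # one curve per ear
```

Subtracting two such curves (for example from two different scans) gives the per-frequency difference in dB that the comparison plots are showing.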

Here's a comparison of the DFHRTF derived from three scans I did of my own head and shoulders. There is really poor consistency between them. There are three differences I note between the scans that I'll pay closer attention to in my next attempt:
  1. The scans include my in-ear mics. Each of the scans used a different version I made, each with a slightly different depth.
  2. The areas around the back of the concha and the cymba concha are occluded and cannot be scanned directly. I'm going to mold those areas with silicone and merge the result with the main scan.
  3. In each scan, I merged a higher res scan of the ears with the scan of the head. There is a small variation in rotational alignment between these attempts that I have ideas on how to resolve on my next run.
All that being said, there is ultimately no way to gauge the correctness of the mesh2HRTF results without having measurements to compare it against.
 

Attachment: log of variancelog variances.png
I'm actually thinking of just biting the bullet and buying some binaural in-ear microphones. I have some decent speakers to use with Impulcifer, and then there are scripts like generate HRTF, which supposedly can give you a decent SOFA HRTF file with just binaural in-ear mics and room calibration.
Actually, this is something on my bucket list as well. I've been looking for some microphones, and the Soundman OKM II has caught my eye. I tried to find a Sound Professionals MS-CB-900 that has a much smaller capsule size, but it seems to be discontinued. I also found an adapter that lets me connect the 3.5mm jack to my audio interface's 2 XLR inputs. So, considering ordering both of those. I have access to different studio rooms that have a good response coming out of the speakers.

About Virtuoso and its so-called update, it's basically just another preset with different parameters. Virtuoso basically acts as a simulator, where, depending on the room size and other given factors, you can manipulate the sound; also, the sound coming out of the plugin is always close to flat. You can actually insert a custom SOFA file with a flattened response to the Virtuoso plugin and measure the response coming out of the plugin with REW. I played around with different settings to see how much the frequency response changes and how the room size manipulates the flatness of the "speakers".

What I would also note about Virtuoso is that the custom HRTF plays a huge role in its sound realism. I would like them to talk more about the custom HRTF part, since for me it actually makes up 50% of their plugin's promises. I actually wrote to the creator asking for some tips on headphone equalisation to better suit Virtuoso (that was before I measured the plugin to see what it's doing). I did get a response that recommended sticking to the stock Harman curve EQs, and also stated that measuring with in-ear microphones is not feasible because it should be done close to the eardrum with a small probe. Anyways... for me, measuring headphones with an in-ear microphone gives a flatter and more pleasant response than any of the ready-made EQs that can be recommended. (I already have one very small electret microphone, not a stereo pair, but it was still useful for headphone EQ.)

@variance I'm thinking of scanning the most important parts of my ear with eFit 3D Scanner; one of those services just happens to be close to me :) I could then try to redo my mesh2hrtf process with an even more accurate mesh. Hopefully, merging the iPhone head scan with more accurate 3D models won't be that hard to get right.

I would really like to eventually compare Virtuoso (with my perfected V2 HRTF) to the Impulcifer experience.
 