but my major issue is the room rendering: there is very little reverb and the distance is quite close, particularly for use in gaming or, of course, media consumption
That was my feeling too. I think they voiced it for users watching iDevices at arm's length; that's why the virtual sound is so close.
Firstly, I find it interesting that there is a separate "Movie Mode" (or whatever Apple may call it) for the Apple TV. Have you tested with the Apple TV app on the Mac? What exactly triggers this different mode: only the Apple TV's spatial audio? In addition, how did you manage to record the Apple TV's output? Did you use a coupler or HATS, or did you find a way to record the TV's output directly, in the same way you can use an application like Loopback to record the spatial audio output on a Mac? Re-reading your post, I assume it is unfortunately the former.
Apple TV mode only kicks in if you connect your AirPods to an Apple TV. The Apple TV rendering is a BRIR of a bigger room; the target market is the home cinema environment, I guess.
I measured everything within the computer using Logic Pro's built-in plugin, which can access the Apple system renderer even when you are not playing through Apple headphones. So you can feed a multichannel directional stimulus into the plugin and record the response.
I add the missing sound field to existing recordings between the source and the binauralizer. I use ambisonic multichannel reverbs, with an IR selected to match the original performance venue. I keep the direct sound from the recording intact and just add the reverb without the direct part. This basically recreates the missing ambience channels from stereo, kind of upmixing it.
Check this paper from page 11 onward:
https://www.angelofarina.it/Public/Papers/155-AES19th.PDF
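If it helps to see the idea concretely, here is a rough Python sketch of the reverb-only upmix, assuming a first-order B-Format room IR stored as a 4-channel WAV. The file names, the mono feed into the room, and the 5 ms direct-sound cut are my illustration, not the exact chain above:

```python
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

# Stereo recording plus a first-order B-Format room IR (W, X, Y, Z).
# File names are placeholders.
dry, fs = sf.read("stereo_recording.wav")      # shape (n, 2)
ir, fs_ir = sf.read("venue_bformat_ir.wav")    # shape (m, 4)
assert fs == fs_ir

# Strip the direct sound from the IR so only the room remains; the
# direct path stays in the untouched stereo recording. 5 ms for the
# direct spike is a guess; tune it per IR.
cut = int(0.005 * fs)
ir[:cut, :] = 0.0

# Feed the (summed) stereo signal through each ambisonic channel.
feed = dry.mean(axis=1)
wet = np.stack([fftconvolve(feed, ir[:, c]) for c in range(ir.shape[1])],
               axis=1)

# Keep the dry stereo as-is and hand the ambience channels to the
# binauralizer alongside it; scale the wet level to taste.
sf.write("ambience_bformat.wav", 0.3 * wet / np.max(np.abs(wet)), fs)
```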
I am not familiar with PC solutions.
One of the cheapest and easiest really good ambisonic reverbs with real spaces is the Waves IR-360 (Mac and PC), which is based on Angelo Farina's work. It is 5.1 only, so no height. It works in any plugin host, Logic Pro included, so you can combine it with the Apple renderer. Or you can use the Sparta binauralizer and your own HRTF with an OSC head tracker for a complete solution with any headphone.
The Space Designer in Logic Pro is also a great ambisonic reverb. The built-in B-Format rooms are good, but single point source, so you have to insert one for each input channel and link them. With the Impulse Response Utility, you can create your own 3D spaces. You can find LR or LCR impulse responses in B-Format on the net and build your own multipoint B-Format rooms for a single master channel reverb. These are usually eerily realistic and can be fed to the Sparta or Apple renderer within Logic Pro. They successfully mask the Apple renderer's built-in music mode BRIR.
Assuming you are recording in HeSuVi format (not sure what other format there would even be, although suggestions are welcome)
Applying a reverb/room simulation directly to the file wouldn't work, I believe, since it would be channel independent. Re-recording the response a second time could introduce more artifacts and errors that would be very hard or impossible to completely account for.
If you assemble the whole chain (stereo playback, reverb to multichannel, binaural render from multichannel to headphones), you can capture a couple of 4-channel full stereo impulse responses with the most used reverb settings in a fixed head position.
I use those IRs with a convolver on devices where the whole head tracking chain is not available.
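Applying such a captured 4-channel IR is just a 2-in/2-out convolution. A minimal Python sketch, assuming the four responses are stored as one 4-channel file in LL, LR, RL, RR order (the ordering and file names are my assumption):

```python
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

x, fs = sf.read("music.wav")            # stereo input, shape (n, 2)
h, fs_h = sf.read("chain_ir_4ch.wav")   # captured IR: LL, LR, RL, RR
assert fs == fs_h

# Each ear hears both input channels through its own pair of responses:
# outL = L*h_LL + R*h_RL, outR = L*h_LR + R*h_RR.
out_l = fftconvolve(x[:, 0], h[:, 0]) + fftconvolve(x[:, 1], h[:, 2])
out_r = fftconvolve(x[:, 0], h[:, 1]) + fftconvolve(x[:, 1], h[:, 3])

y = np.stack([out_l, out_r], axis=1)
sf.write("binaural_out.wav", y / np.max(np.abs(y)), fs)
```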
I was also experimenting with capturing the multichannel output of the reverb and saving it to a multichannel FLAC (there are batch utilities for that). Foobar2000 on an iPhone can play back multichannel files and feed them directly in multichannel format to the Apple renderer with head tracking.
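If you would rather script that step than use a ready-made batch utility, something along these lines works in Python with soundfile (the per-channel file naming is my assumption):

```python
import numpy as np
import soundfile as sf
from pathlib import Path

# One mono WAV per speaker channel, named so that sorting yields channel
# order (ch0.wav ... ch5.wav); all captures must have equal length.
paths = sorted(Path("capture").glob("ch*.wav"))
chans = []
fs = None
for p in paths:
    x, fs = sf.read(p)
    chans.append(x)

multich = np.stack(chans, axis=1)           # shape (frames, channels)
sf.write("capture_multich.flac", multich, fs,
         subtype="PCM_24")                  # FLAC carries up to 8 channels
```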
speaking of which, if you were willing to give more details on how you scanned and edited the model
My basic workflow is the following, although variance is working on a script to make it easier:
1. Scan your head with an iPhone. Do it once, do it right; you don't have to splice in high-res ear scans. It will make no difference and just makes it complicated.
I use a swim cap for the hair but no crazy reference grid. I put short earplugs in my ears exactly in the position where the ear canal mic will be for headphone correction. This way I don't have to sculpt my ears, and I have the proper mic position reference in the model. Start from the back, do a few ups and downs around the ears and your face, and finish at the back. This way the accumulated errors will show up at the back, where they are not important. Small holes are OK if smoothing over them is a good approximation.
2. Use Meshmixer to prepare the model. It solves most of the problems that are much more complicated in other apps. Remove the excess, make it solid, remesh at the finest resolution, and check for holes. Any hole can ruin the following steps.
3. Position the head model in the coordinate system in Blender according to the tutorial.
4. Run the hrtf_mesh_grading script for the left and right ear at the desired resolution. It reduces the million-vertex model to 20-70k vertices, which the simulation can handle.
5. Import the two simplified models into Blender and set the materials and microphone locations. Export the left and right models separately with the hrtf export script. It creates two simulation work directories for the two ears.
6. Run NumCalc on each. If you have less memory, you can run NumCalc without the wrapper script. I ran my simulations in 8 GB on a Mac.
7. Run finalize_hrtf_simulation to create the SOFA files and diagnostic plots. (A quick way to sanity-check the resulting SOFA file is sketched after this list.)
8. Measure your ear with an in-ear mic and correct your headphones accordingly. Headphone responses on your head measured with an in-ear mic usually have nothing to do with published standard measurements.
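For checking the result from step 7, here is a small Python sketch that renders a mono test signal through the simulated HRIRs. The sofar package is one option for reading the SOFA file (any SOFA reader works), and the file names are placeholders:

```python
import numpy as np
import sofar
import soundfile as sf
from scipy.signal import fftconvolve

# Render a mono test signal from roughly 30 degrees to the left.
hrtf = sofar.read_sofa("HRIR_mysim.sofa")
fs = int(np.asarray(hrtf.Data_SamplingRate).flatten()[0])
pos = hrtf.SourcePosition          # (M, 3): azimuth, elevation, radius

# Nearest measured direction; SOFA azimuth runs counterclockwise, so
# +30 is to the left. Crude nearest neighbour is fine for a check.
idx = int(np.argmin(np.linalg.norm(pos[:, :2] - np.array([30.0, 0.0]),
                                   axis=1)))

x, fs_x = sf.read("mono_test.wav")
assert fs_x == fs
y = np.stack([fftconvolve(x, hrtf.Data_IR[idx, 0]),
              fftconvolve(x, hrtf.Data_IR[idx, 1])], axis=1)
sf.write("binaural_check.wav", y / np.max(np.abs(y)), fs)
```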