
Crossfeed for headphones

At the end of the day, unless one is listening to a real instrument or audio source such as a vocalist, in a real room, with one's own ears, everything else is a virtual representation. This would include speakers in a listening room, speakers on a live stage or in a theatre, or any other form of artificial reproduction.

Therefore can we accept that pretty much anything we hear on speakers (in a room, or in open air), or headphones, or IEMs, or anything else, is "virtual"; our brains do a job of translating this audio into something more or less true to real life. Every recording we hear is virtual, in the sense that we are NOT in the studio, with the same speakers, where it was mixed. Even where something like Dolby Atmos makes an effort to establish some standards for speaker placement angles, we are not in the same room, nor listening to the same speakers used to mix/master that recording.

Therefore any impressions we form of what we hear will be subjective and personal.

With all that said, let's look at tools like crossfeed, with or without room emulation (by any method), as a listening option for headphones and IEMs. Fortunately we live in a world where there are many options for crossfeed emulation, and we can combine this with one, several, or no options for room/speaker emulation. And then there is EQ, which is also optional.

There is no right or wrong about any of these; let's treat them like salt and pepper shakers at a dining table. Optional, add or don't add to taste. Try them out, see what you like, discard what you do not like. Beyond our main listening tools, the headphones or IEMs themselves, these things are just icing on the cake.

My experience has been that the resolving ability of the headphone or IEM itself makes a huge difference to the end result, and determines whether you can hear, well enough, the salt and pepper of crossfeed and other optional listening aids.

I started this journey on listening enhancers a few years ago with an AKG K702, one of the better over-ear headphones of its time. But the more I listen to other devices, such as some IEMs which, in my opinion, are even better resolving, the more of a difference I can hear between the crossfeed and room emulation options available to me. Let's start with simple stereo placement, without any other bells and whistles. My KZ PRX is more resolving of stereo placement than my KZ SAGA, so it makes sense that the PRX is more revealing of any further listening enhancement I may wish to add via crossfeed or room emulation. By the way, I must add that the stock eartips that came with the PRX do not do them justice, for me. I have to use the Moondrop Spring XL (XL is the size that fits me best) to truly get the best from these IEMs. (I use the XLs for pretty much all my favourite IEMs, and they deliver the best results for my ears of the few eartips I have tried.)

May I suggest that far more important than any augmentation to our listening chain is the quality of the fundamental transducer(s): the headphone or other listening device placed on our head. This is the 70%; anything else is just icing. The eartips in my case definitely help and add another 20%, with other things like crossfeed (removing the extreme left-right separation on some recordings), EQ (for personal preference or simulating a room EQ curve), and room emulation (a personal preference) being just icing, salt and pepper. Add to taste, or do not add.

The million dollar question: are your listening devices able to resolve well enough that you can accurately hear any difference that the icing, salt and pepper of crossfeed and other augments adds? There is no point debating crossfeed if the weak link in the chain is the headphone or IEM (and eartip) itself, limiting our ability to perceive any other enhancements we add to the listening chain!

On crossfeed, there are now so many ways to do this. Research the various options, and try them all out, within your available time and budget. Pick the one you prefer, and be done with it. Or maybe pick none, if the difference is negligible.
 

Well, there really aren't that many of them, if you don't go into experimental, crazy-level subjectivism.


The goal would be to hear the sound from the headphones as much as possible as you would hear it live or from a speaker. In those cases, you sense each sound source quite strongly (almost monophonically) in both ears most of the time, right? The sound has to be really far to the side and really high in frequency for "panning to one ear" to occur. As it quite often does with bare headphone stereo.

Try it yourself: listen to the sounds around you WITH YOUR EARS OPEN. Isn't pretty much everything heard in both ears for the most part?! The L/R balance varies slightly depending on the situation, but neither ear is ever completely muted. Bare headphone-stereo "ear channel separation", i.e. sharply wide stereophonicity that focuses on one ear only, almost never occurs in any practical live hearing situation.

So... as a practical audio-Einstein conclusion, something should really, and rather urgently, be done about this. Or our core aspiration for natural sound falls flat the moment we put on the headphones.

This means that imitating this rather mono-like live reality with headphones requires applying crossfeed with a pretty heavy hand. So heavy that many traditional audiophiles may almost start crying about this amount of mono. But that's just how it is in live reality; the sound is almost always strongly "crossfed". Achieving this with headphones requires a correspondingly strong crossfeed setting from our headphone equipment.
 
I don't like reality that much - I just wanna have a good time. :) Often I probably prefer an illusion. Maybe it's all an illusion.
 
The advanced virtual cyber solutions of the future will only be there in the virtual cyber future... :)
The few personal HRTF solutions currently available are still too impractical or otherwise unsuitable for normal home listening.
Those cyber-looking HRTF measurements in anechoic chambers, using rotatable arrays with dozens of speakers, are no longer required for the purpose discussed here. Nowadays, personalized HRTFs/BRIRs are more available than ever: solutions based on measurements (the Realiser A16 for those who can afford it, the Impulcifer software for DIY) and sufficiently close approximations (like we do it) are available now or coming soon.

All the above-mentioned solutions can create a soundstage in front, just like with speakers.
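To make the mechanism concrete: a BRIR-based solution renders each virtual speaker into BOTH ears by convolution, which is exactly what bare headphone stereo lacks. The sketch below is a minimal illustration under assumed conventions (the `brir` dictionary layout and the equal-length impulse responses are inventions for the example); it is not the actual processing of the Realiser or Impulcifer.

```python
import numpy as np

def render_binaural(left_src, right_src, brir):
    """Render a stereo signal to headphones through a set of binaural
    room impulse responses: every virtual speaker is heard by BOTH ears,
    which is what puts the soundstage out in front.

    `brir[(speaker, ear)]` holds one impulse response; all IRs are
    assumed to be the same length here for simplicity."""
    ir_len = len(next(iter(brir.values())))
    n = len(left_src) + ir_len - 1
    left_ear = np.zeros(n)
    right_ear = np.zeros(n)
    for src, spk in ((left_src, "L"), (right_src, "R")):
        # direct path to the same-side ear, cross path to the far ear
        left_ear += np.convolve(src, brir[(spk, "left")])
        right_ear += np.convolve(src, brir[(spk, "right")])
    return left_ear, right_ear
```

With measured BRIRs for a pair of front speakers, the cross terms `brir[("L", "right")]` and `brir[("R", "left")]` are what move the image out of the head and in front.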
 

The Smyth Realiser has been featured in internet talk as the Jesus of headphone playback for probably 10 years already. Expensive as hell and only available to buy online. A manufacturer that exudes the feeling of an internet start-up tech company, plus a technically really exotic/complex device, adds up to a pretty high predicted risk of bugs. "In case of a problem, come to our service center in the Cayman Islands with your device." Yes, it immediately inspires a lot of confidence... or not.

You would also have to record HRTFs for the Realiser yourself with ear mics, in some acoustics WORTHY OF MODELING. In practice, probably in some top studio (estimated rate around 500 €/h). So you would need to know where to travel to go on site with the Realiser for HRTF recordings. Preferably less than 500 km away.
My techno crystal ball predicts that there will be quite a few problems, and especially Technical Problems, with all this. A damn lot of them, actually.

Because of these things, the Realiser is quite clearly in the category of "impractical or otherwise unsuitable" solutions.



A DAW + software solution is also out of the question.

Controlling an interconnected software puzzle, connecting it to playback programs and making it work with them, verifying the operation of the whole, and keeping all of this in working order is practically impossible. At least for me.
There could also be quite a lot of latency, and especially latency fluctuation. This is critically important, because video audio playback is my main purpose.
This would also be completely dependent on the host PC as the system's only audio source, which would be a massive usability penalty.

Because of these factors, a DAW + software solution is also quite clearly in the category of "impractical or otherwise unsuitable" solutions.
 

I concur that passing audio from a video app through a chain of audio processors (aka plugins) running inside a DAW may take some work, and has a learning curve. Permit me to highlight the pros and cons of this approach.

PROS

1. Flexibility, to choose any combination of plugins, to achieve each aspect of the audio processing.
2. Experimentation. Easy to swap and compare different chains of options.

CONS

1. There is a learning curve.
2. Depending on whether you are running on a Mac, PC, or Linux, not all plugins will be available for your operating system.

If I may address your concerns about latency: it should be possible, with ease, to achieve a total latency as low as 15 milliseconds, all the way from the video app, through the DAW and the plugins running in it, and through the DAC, including the D/A conversion (which typically has a latency of about 1 millisecond or less).

The total latency will be the sum of all the individual latencies:

1. Video app to DAW - I have not measured this, but I do not expect it would be more than 5 milliseconds, most likely less.
2. DAW processing with the plugins running in the DAW. I just checked this aspect of latency in my implementation and was pleasantly surprised to find there is none whatsoever. Absolutely zero latency. This obviously depends on the specific chain of plugins one has chosen to use in the audio chain.
3. DAW buffer, between the DAW and the audio interface or DAC. This is set in the DAW configuration, and could also be set in the audio driver configuration. In today's world it is realistic to expect that this buffer can easily be set as low as 7 milliseconds, or less. I have mine set to 5 milliseconds.
4. The DAC latency during the conversion from digital to analog: we can assume it's no more than 1 millisecond for modern DACs.

Adding all that up: 5 + 0 + 7 + 1 = 13 milliseconds. I've rounded that up to 15 milliseconds as a realistic maximum that any audio chain should be able to achieve via a DAW and well-chosen plugins. And this would be a worst-case scenario. In reality I can envisage the total latency being no more than 10 milliseconds, with a bit of tweaking of the configuration.

In summary, the latency can be negligibly low: typically 10 milliseconds or less, and no more than 15 milliseconds worst case. That's like placing speakers, or a television with speakers, 10 or 15 feet in front of you, since sound takes about 1 millisecond to travel 1 foot from a speaker to our ears. In other words, any latency introduced by a DAW with plugins is no more than the typical delay one would experience listening to audio in a room. So in conclusion, it can be low enough that we need not be concerned about it.
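The budget above is simple enough to sanity-check in code. The per-stage numbers below are the estimates from this post (not measurements), and the buffer term is the standard frames-divided-by-sample-rate calculation; a 256-frame buffer at 48 kHz is an assumed example configuration that lands close to the 5 ms I mentioned.

```python
# Sanity check of the latency budget discussed above.
# Stage values are this post's estimates, not measurements.

def buffer_latency_ms(frames: int, sample_rate: int) -> float:
    """Latency contributed by one audio buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate

FT_PER_MS = 1.1  # sound travels roughly 1.1 ft (0.34 m) per millisecond

stages_ms = {
    "video app -> DAW": 5.0,                      # estimate, not measured
    "plugin chain": 0.0,                          # zero-latency plugins
    "DAW buffer": buffer_latency_ms(256, 48000),  # ~5.3 ms at these settings
    "DAC conversion": 1.0,
}
total = sum(stages_ms.values())
print(f"total ~= {total:.1f} ms, like speakers {total * FT_PER_MS:.0f} ft away")
```

Dropping the buffer to 128 frames, or raising the sample rate, shrinks the dominant term further, which is why "10 ms with a bit of tweaking" is plausible.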
 
I can understand your reluctance in respect of the Realiser, in particular the price and the need to first have access to a room with great acoustics to make a virtual copy of.

But the DAW+ solution is actually much simpler than you think.
If you want to use sources outside the computer you can use an interface, send the signal through a host app for plugins (like Hang Loose Host) and use a single plugin (APL Virtuoso) to output via the interface. Both provide a trial for 14 days (and don't break the bank).
And it works on Windows and Mac.
EDIT: And for those who are overwhelmed with this there is even a standalone solution from APL.
The Realiser is more complex to operate than this.
With rerouting you can use the computer as source (for Atmos) quite easily. (On Mac "Blackhole" does the job nicely and there are tools for Windows, too)
Latency has been explained by @OK1 .
Virtuoso works with headtracking solutions with a few clicks.
If you can get your hands on a personal HRTF (head only, no need for great room acoustics), you can load it into Virtuoso, and this will probably surpass what is possible with the Realiser for most people.
 
I have not tried Virtuoso yet. I did download version 2 a few months ago; hopefully I'll get round to reviewing it as time permits. From a quick look at its features, it seems competent. I will report back here with impressions whenever I finally get round to using it.
 
Ok.

But that's where we're going into the DIY section of the propeller-hat guys. Straight into the deep end, glasses fogged over. That is too much, at least for me. And the final results are very uncertain. To say the least.

It should be a PRODUCTIZED and READY package solution, a separate hardware device for sure. NOT a software solution on a PC.


Like the RME ADI-2 Pro and its crossfeed at the moment, but with advanced HRTF functions and head tracking.
You would just load a ready HRTF file into it, and it would work at maximum level right away.
One would get the HRTF file the way you can already get one from Genelec's Aural ID service: based on a video recording of the head and ears.

Sure, partly sci-fi and just dreaming today. But maybe in the "future" (LOL)... :D
 
It would be wonderful to have a single piece of hardware that you plug into the audio path, which does all this, so it does not have to be done on the computer. That would be simple.

I agree with you. Sadly I do not yet know of such a device.

While this thread focuses on crossfeed, if I may add, from my experience there is also a benefit in adding some kind of "room" effect to the audio. Why?

1. For music and films, just about every track we have ever listened to was mixed on speakers, in a room, with at least some natural reverberation ("echo"). This is how the producers, mixing engineers, artists, or record company executives heard this audio and approved it for release. On headphones or IEMs that reverberation is missing.

2. Especially if one spends a good amount of time listening to more recent mixes, which do not add a lot of reverb in the recording itself, or listens to podcasts or YouTube interviews, where the microphone is close-miked and recorded that way with no ambience, such audio sounds like the person is speaking right next to your head, or ear, or inside your head. It is most unnatural. It feels so strange. Adding some room reverb artificially, in the listening audio path, pushes the audio virtually a little away from our head or ear, so it sounds more like how a human being would sound speaking or singing to us in the real world. In the real world we rarely hear a human voice from just inches away from our head.

Each person is different, so this reverb needs a control, so each person can dial in how much of it they need to remove that "next to the ear, or inside the head" impression one gets when listening to close-miked audio with little or no reverberation cues in the recording.
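As a toy illustration of what such a control does (this is not any particular product's algorithm; the decaying-noise impulse response is a crude stand-in for a measured room): a room effect is essentially a convolution with an impulse response, and the per-person "dial it in" control is just a wet/dry mix.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_room(signal, fs, rt60=0.3, wet=0.2):
    """Mix a simple synthetic room into `signal`.

    The 'room' is exponentially decaying noise used as a stand-in
    impulse response; `wet` (0..1) is the user's 'how much room' dial
    that pushes close-miked voices away from the ear."""
    n = int(fs * rt60)
    t = np.arange(n) / fs
    ir = rng.standard_normal(n) * np.exp(-6.9 * t / rt60)  # ~-60 dB at rt60
    ir /= np.sqrt(np.sum(ir ** 2))                          # normalize energy
    reverb = np.convolve(signal, ir)[: len(signal)]
    return (1.0 - wet) * signal + wet * reverb
```

In my experience a small `wet` amount is typically enough to take a dry voice "out of the head" without making it sound like a cathedral.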

I spend hours each week listening to YouTube interviews, and find that adding this room reverb simulation to the audio makes listening so much more comfortable and natural. More lifelike.
 

Here is a "new" KZ IEM which seems to have, within the headphone/microphone cable, some kind of surround sound implementation. The specs do not indicate whether this includes crossfeed or not. It would be one interesting way to simplify such an implementation, i.e. embed it in the cable...
 
Hi folks, I've just started using UAPP on my Android phone when listening with IEMs. Today I noticed there is a crossfeed setting which has a couple of options as per the screenshot. I have zero experience with crossfeed before today and I think I like it, but would really like to know what these settings do and if possible, some guidance on how to set them best.
 

Attachment: Screenshot_20260201_192928_USB Audio Player PRO.jpg (UAPP crossfeed settings screen)
Crossfeed mixes some of the L channel into the R channel, and vice versa, with a delay. In UAPP, the top slider controls the strength of the crossfeed signal. It's a bit counter-intuitive, since sliding it right makes the effect weaker: the middle setting "12 dB" means -12 dB. The bottom slider controls the top of the frequency range to mix; here, sliding it right makes the effect stronger, as it will work on frequencies up to 2 kHz.

How to set them "best" is entirely subjective. I prefer the minimal effect necessary because it does slightly degrade the overall sound quality. While playing music that has instruments hard-panned to the R and L (a lot of jazz recordings from the 60s are like this), I set them so that it sounds less wonky on headphones. For me that's around -14 dB and 1200 Hz.
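To make the two sliders concrete, here is a minimal sketch of the kind of processing they control: an attenuated, delayed, low-passed copy of each channel is mixed into the opposite one. This is a generic crossfeed in the Linkwitz/Meier spirit, not UAPP's actual implementation; the 0.3 ms delay and second-order filter are assumptions for illustration.

```python
import numpy as np
from scipy.signal import butter, lfilter

def crossfeed(left, right, fs, level_db=-12.0, cutoff_hz=700.0, delay_ms=0.3):
    """Mix an attenuated, delayed, low-passed copy of each channel
    into the opposite channel (a generic headphone crossfeed)."""
    gain = 10.0 ** (level_db / 20.0)            # e.g. -12 dB -> 0.25
    delay = int(round(fs * delay_ms / 1000.0))  # interaural delay in samples
    b, a = butter(2, cutoff_hz / (fs / 2.0))    # feed only lows across

    def feed(src):
        x = lfilter(b, a, src) * gain
        return np.concatenate([np.zeros(delay), x[: len(x) - delay]])

    return left + feed(right), right + feed(left)
```

The top slider corresponds to `level_db` (more negative = weaker feed), and the bottom slider to `cutoff_hz` (higher = more of the spectrum is fed across).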
 
I know this is a late reply, but a dedicated mini PC with a stereo (or multichannel) in/out interface (analog or digital) would be less expensive than a nice hardware ADC/DAC with built-in processing like the RME ADI-2 Pro.
The hardware would probably be around $500-700, and the software cheap to free, depending on what solution you go with. There are free VST hosts that let you chain any number of plugins, and there are free EQs and convolvers in the form of VSTs, or CLAPs if you prefer Linux.
You could add a USB controller and trigger presets without ever looking at the screen, after the initial setup.

I have a multi headphone setup at work that runs on my work laptop, but I don't need to interact with it, unless I want to make setup changes. It starts up automatically, passes audio from all my apps to three different headphones and one headset, with full DSP processing, and is controlled by my Stream Deck.
It could have been a dedicated mini PC, but I don't feel the need to have another piece of hardware on my desk.
I have a Babyface Pro FS, two headphone amplifiers, one SMSL DAC that is being fed over optical SPDIF from the Babyface, and one USB DAC (TRN Black Pearl) for the headset. All of the outputs have at least EQ, and I'm running two instances of the Realphones VST inside Cantabile Lite. I don't need to touch anything other than the volume knobs and buttons on the Stream Deck.
This whole setup, not including the headphones, cost me probably around $1200, but the Babyface was the bulk of it. It could have been a much less expensive solution; I just wanted the Babyface.
 
Thanks, that's very helpful.
 