Every serious researcher agrees that a large soundstage cannot be achieved by manipulating FR alone.
Almost everybody agrees that the HD800 has the largest soundstage of all headphones.
As I said, there is no generally accepted, worked-out theory for soundstage, but large and angled drivers, large cups and the distance between cups and ears seem to play a major role.
This all sounds quite cognitively dissonant to me.
But I'm glad to know that I'm now qualified enough to write an AES paper and back up my conclusions with "almost everybody agrees".
Although the exact mechanism for creating soundstage is not known
Question : if it is not known, how then are we even attempting binauralisation of object-oriented formats ?
What some people call "soundstage", in the context of stereo recordings, is probably unknown indeed - if only because there is no operational definition of the term to begin with. But if the goal is to create a virtual space and make sound A appear to come from x angle at y distance, then I'm tempted to think that we know more about the theoretical mechanisms at play than about how to concretely realise it in a practical and robust way for most individuals.
it is definitely related to pinna interaction.
I was expecting that answer - and I wasn't expecting a source to back it up, since to my knowledge it has yet to be truly evaluated whether or not over-ears can reliably vary across individuals at higher frequencies in a way that, for the most part, corresponds to how their anatomy would influence the response of loudspeakers (or natural sound sources) at their eardrum.
The problem I have with this idea is that it doesn't really add up with the data we already have, at least to a degree that any desirable results out of large over-ears interacting with one's pinna wouldn't be swamped by nuisance variables (including undesirable results out of the same type of interaction). I can see a few issues already such as :
- the relative difference between headphones varying inconsistently between individuals and, by extension, measurement rigs
- some over-ears being significantly impacted by positional and / or coupling variation - even when you remove the pinna from the equation (flat plate).
With regard to the first point, my reasoning is as follows : pick two listeners A and B, whose DF HRTF you know (we'll assume that DF HRTF + tilt or shelf is the reference for “sounds good to individual A or B”, but it could be how their HRTF influences how decent loudspeakers in a decent room measure at these individuals' eardrum, or anything else of the kind for the sake of the argument). Let’s imagine a theoretical pair of headphones that varies across listeners in a way that perfectly matches their DF HRTF differences. It then means that if you’ve measured the response in situ for listener A, you can calculate the response in situ for listener B - and don't need to measure it.
Now pick two such “ideal” headphones. It also means that, regardless of their basal FR (we’ll consider for the example that they’re different), the variation between listeners will be constant across both of them.
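To make that concrete, here is a minimal sketch of the thought experiment, working in dB at a handful of frequency points. Every number is made up for illustration, and the helper `predict_in_situ` is a hypothetical name, not something from any measurement software. It only encodes the assumption stated above : for an "ideal" pair, the response on B equals the response on A plus the difference between the two listeners' DF HRTF.

```python
# All values are hypothetical dB levels at a few frequency points (illustration only).
freqs_hz = [1000, 2000, 4000, 8000]

# Assumed DF HRTFs of listeners A and B, in dB.
df_hrtf_a = [0.0, 3.0, 10.0, 4.0]
df_hrtf_b = [0.5, 5.0, 8.0, 7.0]


def predict_in_situ(measured_on_a, hrtf_a, hrtf_b):
    """For an 'ideal' headphone: response on B = response on A + (HRTF_B - HRTF_A)."""
    return [r + (hb - ha) for r, ha, hb in zip(measured_on_a, hrtf_a, hrtf_b)]


# Two "ideal" headphones with different basal FR, both measured in situ on listener A.
hp1_on_a = [1.0, 2.0, 9.0, 3.0]
hp2_on_a = [-2.0, 4.0, 12.0, 0.0]

# No need to measure on B: the response is calculated from the HRTF difference.
hp1_on_b = predict_in_situ(hp1_on_a, df_hrtf_a, df_hrtf_b)
hp2_on_b = predict_in_situ(hp2_on_a, df_hrtf_a, df_hrtf_b)

# Inter-listener variation (B minus A) for each headphone.
delta_hp1 = [b - a for a, b in zip(hp1_on_a, hp1_on_b)]
delta_hp2 = [b - a for a, b in zip(hp2_on_a, hp2_on_b)]

# Despite the different basal FR, the variation between listeners is identical.
print(delta_hp1 == delta_hp2)
```

The point of the sketch is only that the B minus A curve is a property of the two listeners, not of the headphones, as long as both pairs are "ideal" in the sense above.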
Now, let’s imagine that you pick a selection of six large, open over-ears, but this time you don’t know whether or not any one of them can perfectly vary across listeners in a way that matches the difference in their DF HRTF. If they all are inconsistent in terms of how they vary across listeners, then it means that at least 5 out of 6 are incapable of varying across listeners in a way that matches their variance in DF HRTF - if not all of them.
You’re none the wiser in terms of knowing which headphones captured the DF HRTF variance best (a very interesting question indeed), but you can at least rule out that this is a common characteristic, even for these large, open over-ears - and possibly even quantify how incapable they are of doing just that.
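One possible way to put a number on that, sketched under assumptions : take each headphone's B minus A difference, then look at how much those differences spread around their average at each frequency. The function name `inter_listener_spread`, the use of a population standard deviation and all the values below are my own illustrative choices (and I'm using three headphones instead of six to keep it short).

```python
from statistics import mean, pstdev


def inter_listener_spread(responses_a, responses_b):
    """Return (mean_delta, spread) per frequency point.

    responses_a / responses_b: dict of headphone name -> list of dB values,
    measured in situ on listener A and B at the same frequency points.
    """
    deltas = [
        [b - a for a, b in zip(responses_a[name], responses_b[name])]
        for name in responses_a
    ]
    per_freq = list(zip(*deltas))  # regroup the deltas by frequency point
    return [mean(col) for col in per_freq], [pstdev(col) for col in per_freq]


# Hypothetical in situ measurements at 2, 4 and 8 kHz (dB).
on_a = {"hp1": [2.0, 9.0, 3.0], "hp2": [4.0, 12.0, 0.0], "hp3": [3.0, 10.0, 1.0]}
on_b = {"hp1": [4.0, 7.0, 6.0], "hp2": [6.0, 13.0, 1.0], "hp3": [5.0, 6.0, 8.0]}

mean_delta, spread = inter_listener_spread(on_a, on_b)

# A spread of 0 dB at a given frequency means all headphones vary identically
# between A and B there; anything above 0 means at least some of them don't.
for f, m, s in zip([2000, 4000, 8000], mean_delta, spread):
    print(f"{f} Hz: mean delta {m:+.2f} dB, spread {s:.2f} dB")
```

A non-zero spread tells you that not all of the headphones can be tracking the DF HRTF difference, but on its own it doesn't tell you which one (if any) does.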
Presented differently, that presentation could possibly provide the sort of data that would allow us to evaluate just that, but some questions remain, notably whether or not all headphones were measured during the exact same sessions, without moving the mics (or if the individuals could reliably position the mics in the same place), and if all individuals wore them truly past the ear canal entrance. If we assume that this was done, then even just eyeballing the response past a few kHz it's fairly easy to see that the headphones vary inconsistently across listeners, even for the larger, open headphones :
The same sort of reasoning can apply to headphone measurement rigs : if over-ears reliably interacted with them in a way that matched how their pinnae affect the HATS DF HRTF (and / or other reference for what "sounds good"), we wouldn't see that much noise around the average when measuring the difference between one fixture and another :
You'll see the same sort of noise around the average on any other test of the kind I've digitised the traces for, even when only the pinna varies.
You could tell me : yep but these ear simulators don't just have different pinnae, they also have a different ear canal, fixture geometry, etc... yep and so will humans.
What you do notice from such fixture-to-fixture tests is that some (but not all !) large over-ears tend to cluster closer to the average trend above a few hundred Hz (below that, the average is too influenced by leakage-sensitive headphones to be a good baseline).
But :
- you'd then need to know first and foremost if the difference captured by measuring a cohort of headphones on fixture A over fixture B is similar to the difference between fixture A's and fixture B's DF HRTF. Clustering closer to the average difference does not necessarily mean that.
- some smaller over-ears, including some closed ones, or ones that deform the pinna, still tend to land quite close to the average : is it a fluke ? I'm starting to think that you'd need to test a larger sample size of fixtures and / or pinnae to know that.
I absolutely am interested in knowing if some over-ear designs are more capable than others at interacting with individuals' anatomical features in a way that better matches the inter-individual variation seen with sound sources such as loudspeakers, and more capable at reducing other nuisance variables, but it's already quite clear to me that for most of them, that's not happening to a sufficient degree. Not even remotely close. And we haven't even arrived at the issue of positional variation.
Now it's true that IEMs certainly won't interact with one's pinna. But since the above should cast doubt on whether over-ears can reliably interact with one's pinna in an exclusively desirable way, rather than also (if not entirely) in a random and undesirable way, I'm quite skeptical that pinna interaction alone can explain whatever someone would call "soundstage".
It is definitely not a sufficiently explored subject, but what is clear is that FR alone cannot be responsible for it. This can be demonstrated simply by two headphones with the same FR (achieved with or without EQ), which still have a differently perceived soundstage.
Ah, the classic.
How do you know that two headphones with the same FR on a measurement rig will still have the same FR on your own head (even more so when relying on measurements performed with other samples) ?
These are all headphones EQed to the Harman target according to Oratory's profiles, measured with the same blocked ear canal entrance mics on my own head (which by now I think that I can position consistently well) - please don't compare these to measurements performed at the eardrum, look only at the difference between the traces :
Only one pair of headphones in the lot is quite sensitive to leakage issues (not the red trace interestingly), hence the rather tight grouping below 1kHz.
So, what happened ? Among other potential issues :
- for a start, in the 1.5-8kHz band blocked ear canal entrance mics will introduce errors in terms of the relative difference between headphones, but that won't explain the above (if I were to re-do these measurements with open ear canal entrance mics, I'd get a similar spread). Above 8kHz the errors are too large, hence no valid data (and I think that this should possibly apply to Harman's latest presentation).
- for one pair, a volume-dependent EQ, which means that measurements on an ear simulator, quite likely not performed at the same volume as the one you'll be listening at, are not representative (yet that is not the cause of the main deviation from the rest for this model).
- for some of these traces, the profiles don't fully correct some of the nulls
- sample variation (but it won't explain the difference for all traces as the above contains several samples of the same model)
- and coupling issues.
For obvious reasons, the same presentation linked above should also put a damper on being too enthusiastic about EQing one pair of headphones to sound like another (and should make the above results unsurprising), particularly if they're anything but large open backs with tight manufacturing tolerances, and without the help of in-ear microphones (which have their own limitations to perform that task).
This is no subjective opinion, but an objective fact. You may not agree with it, but it still remains true.
It's an objective fact that, at the very least, the FR of passive IEMs can vary quite significantly and non-linearly past 3-4kHz with insertion depth. So how do you know that the insertion depth in your ears matches the one used to measure them in an ear simulator ?
I've also already provided you with an interesting example when it comes to IEMs with a feedback mechanism, whether you're interested in it or not is up to you.
Now let's have a bit of fun :
This is the exact same IEM, in the exact same clone coupler from Aliexpress, measured at the exact same volume, with the exact same signal (a sweep), and the exact same seating (the IEM wasn't moved between measurements), and no normalisation. Yet I was able to consistently and repeatedly get three different results - of the kind that is consistent enough that some people would think any one of these traces would be representative of the actual FR (and in fact some reviewers have been duped by that IEM's behaviour). What happened ? How do you know which one is representative of what happens in your ears (if any) ? Hint : they're active IEMs.