Some good questions here.
In broad terms, I think the premise that high-frequency equalization should mostly take the form of shelf or very low Q peak filters (broad adjustments in level, essentially) is quite sound. I personally take this to a relative extreme; I don't even aim to notch the peak from my HD800, because I've found that doing so without putting a hole in the surrounding treble somewhere is troublesome. For most people's practical purposes, so long as it doesn't sound bad, I'd say that relatively fine/high Q equalization is reasonable up to 8-10 kHz, although it depends on a number of variables (particularly how the headphone's response varies with position on the wearer's head).
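To put rough numbers on "broad" versus "fine" filters, here is a minimal sketch that compares a low Q and a high Q peaking cut at the same centre frequency. It assumes the peaking-EQ biquad from the RBJ Audio EQ Cookbook, which is my choice for illustration and not something specific to any particular EQ software. The 48 kHz sample rate, 6 kHz centre, -6 dB depth and the two Q values are made-up illustrative numbers, not measurements of any headphone.

```python
import cmath
import math


def peaking_coeffs(f0_hz, gain_db, q, fs_hz=48000.0):
    """Biquad coefficients for a peaking filter (RBJ Audio EQ Cookbook form)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a


def gain_db_at(f_hz, b, a, fs_hz=48000.0):
    """Magnitude response of the biquad at f_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


# Illustrative values only: same centre and depth, different Q.
broad = peaking_coeffs(6000.0, -6.0, q=0.7)
narrow = peaking_coeffs(6000.0, -6.0, q=6.0)

for f in (4000.0, 6000.0, 8000.0):
    print(f"{f:7.0f} Hz  low Q: {gain_db_at(f, *broad):6.2f} dB"
          f"  high Q: {gain_db_at(f, *narrow):6.2f} dB")
```

Both filters reach the same depth at the centre, but the high Q cut is concentrated in a narrow band. That is what makes it unforgiving if the feature it targets sits somewhere slightly different on your own head.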
Regarding the intersection of HRTF and HpTF, let's consider two extremes and how they would impact things:
First, let's imagine a world where headphones are "HRTF chameleons" - by some process, whether acoustic or computer controlled, they perfectly approximate the individual wearer's HRTF in the target sound field. In this world, you could see extremely wide variation in HRTFs and HpTFs, but have zero variation in subjective timbre of headphones, because headphone subjective frequency response is equal to HpTF minus HRTF. This world is almost entirely reconcilable with Hammershøi & Møller's data.
Second, let's imagine a world where headphones are "HRTF blind" - perhaps in this world a trend of very deeply inserted in-ear monitors dominates, but for whatever reason, the HpTF is absolutely constant between wearers, even as individual HRTF varies. Subjective frequency response would track roughly according to the scenario you're outlining in this post - up to around the peak of the ear resonance there'd be agreement, and then it would all go south. A headphone that sounds peaky to one person would be smooth to another, and we'd have a great deal of trouble comparing our subjective impressions of headphones at all in the higher frequencies.
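As a toy illustration of these two extremes, here is a small sketch that treats subjective response as HpTF minus HRTF, band by band in dB. Everything numeric in it is an assumption for the sake of the example: the listeners' HRTFs are random numbers, not measured data, the four bands are arbitrary, and the "HRTF blind" headphone is given a flat 0 dB HpTF purely for convenience.

```python
import random


def subjective_response_db(hptf_db, hrtf_db):
    """Per-band subjective response in dB, taken as HpTF minus HRTF."""
    return [hp - hr for hp, hr in zip(hptf_db, hrtf_db)]


def make_listener_hrtfs(n_listeners, n_bands, spread_db, seed=0):
    """Hypothetical per-listener HRTF deviations in dB (random, not measured)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, spread_db) for _ in range(n_bands)]
            for _ in range(n_listeners)]


def spread_across_listeners_db(responses):
    """Largest max-minus-min difference between listeners in any single band."""
    return max(max(band) - min(band) for band in zip(*responses))


hrtfs = make_listener_hrtfs(n_listeners=5, n_bands=4, spread_db=4.0)

# "HRTF chameleon": each listener's HpTF equals that listener's own HRTF.
chameleon = [subjective_response_db(h, h) for h in hrtfs]

# "HRTF blind": one fixed HpTF for everyone (flat 0 dB here, an assumption).
fixed_hptf = [0.0] * 4
blind = [subjective_response_db(fixed_hptf, h) for h in hrtfs]

print("chameleon spread (dB):", spread_across_listeners_db(chameleon))
print("blind spread (dB):    ", spread_across_listeners_db(blind))
```

In the chameleon case the HRTF variation cancels out entirely, so every listener hears the same thing. In the blind case all of the HRTF variation shows up directly as disagreement between listeners.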
Now, in reality, we don't live in either of those worlds - some of the aspects of individual anatomy that influence HRTF also influence HpTF, so they aren't uncorrelated, but they also don't coincide perfectly. There's also the "x factor" of whether headphones have individualized and atypical interactions with some anatomy that doesn't relate in any way to HRTF - unarguably this happens at low frequencies with headphones with high acoustic impedance when a leak is present in the pad volume, but it could also happen at higher frequencies, and this would introduce another source of response variation.
Pragmatically, I don't think that we can use measurements of headphones on population average measurement fixtures to make some of the projections we can make about other types of equipment - e.g. "DUT A will sound the same as DUT B". Non-individualized equalization is inevitably going to be an area where caution is wise, and erring towards broader filters and leaving the high Q features mostly alone is likely to yield better results on average. Equally, though, headphones are such radically audibly different devices that we don't need the degree of consistency or coherence between test and in situ that we have with amplifier or DAC measurements to make very reasonable extrapolations about what will sound better.
This all said, I still don't entirely grasp what Amir's primary goal with this headphone testing project is. If it's to redefine headphone metrology and take it out of the dark ages... well, we weren't in the dark ages to begin with, so that'd be pretty hard. If it's to present an additional impartial source of reliable headphone measurements alongside what presently exists (Oratory, Resolve/Headphones.com, Clarityfidelity/Speakerphone, Keith Howard/HeadphoneTestLab, Brent Butterworth/Soundstage Solo, etc.), then we've already got validation of concept. If it's to improve on the current state of the art in headphone metrology, that's very conceivable - the 5128 can reasonably claim to be the most accurate way to measure headphones that presently exists, and there's still plenty of room for innovation in methodology if Amir is interested in that.