The Harman target curves are based on subjective preferences gathered under 'laboratory conditions'. There's nothing ideal about them; they just tell you which frequency response an average person would prefer - hence their commercial use to tune speakers, headphones and IEMs in the way that most people will prefer, to secure the most sales.
If I put imaging aside, I get more or less the same tonality out of speakers after they are set up to have a flat in-room response, so the Harman target is more than just "what sells." They know that as well, which is why they used trained listeners and, among other things from the literature, the response of a speaker in a semi-reflective room as a reference. That does add a bit of a recursive reference, since you are relying on something else which also needs to be tuned to a measured target. The only potentially controversial part is the bass: there may be an implicit boost there to make up for the lack of tactile bass that you would have with speakers, and subjectively it can be excessive for some users. The rest of it, I would say, will be reasonably neutral for a lot of listeners. There will, of course, always be the "chicken and the egg" problem, as pablolie says: "what is neutral?" How can we really know? This is further complicated by what's on the recording end as well. That said, without anything other than double-blind testing, I don't think we can do much better than the Harman targets for conventional headphone and IEM tuning.