I'm a coward so I voted for two of them. A lot of that is down to the playback device. Even when tuned to Harman (whether by design or by using EQ presets based on ear simulator measurements), headphones will exhibit remaining differences, and those were enough to make me hesitate between two options after listening to the samples over several HPs. It may (or may not) be interesting to discuss playback devices once the results are revealed.
Is this turning into a situation where people think there is a "best" choice (the speaker closest to the preferred FR) that needs to be recognized? That would be a bit funny imho.
As far as the Bach piece is concerned, this post prompted me to listen to several recordings on my big systems yesterday evening.
That made me realize that the differences between recordings were in some cases much bigger than the differences between the test files here. The recording I think I had recognized (see above) sounds dull and one-dimensional compared to Gardiner's 2012 version. It is a matter of taste of course, but on that more recent version, I can localize and follow everything accurately. It is a live recording that better fits what I expect to hear. That's definitely a bias.
But this got me thinking.
First, the original recording matters a lot. Not big news, we all know that. It is also a matter of taste. I focus on the "being there" feeling, which a decently recorded live version clearly satisfies better, but I can see how someone focused on the execution of the music might prefer a studio/non-live "perfectly executed" version.
Second, given the big differences between recordings, I can see how people could prefer "bright boxes" (say the archetypal house B&W curve) on what sounds like duller recordings to me. On the other hand, on a recording that is intrinsically brighter, people might prefer a more pronounced downslope in the FR. There are recordings I like a lot musically but that are technically awful (again, to my subjective ears), and I can only listen to them on my lower-end speakers, out of the sweet spot, because those don't magnify the technical issues.
Third, as many have noted, the binaural recording seems to have issues. Or maybe something was lost between the recording and the files we get. It's really hard to say of course.
Finally, what we use to listen to the test files has a major impact. I wouldn't worry too much about having a Harman-matched device to attempt to recognize which speaker matches Harman's FR better. The question is which test file one actually prefers, not a conformity test that identifies the file one should prefer according to this forum's opinion.
It is a bit strange to see that Harman's curve is so heavily described as some kind of ultimate standard here that people start to be mildly ashamed of their possibly diverging tastes.
I guess I will end up voting because
@Blumlein 88's comment, in typical "you'll be very surprised if you click" Internet bait style, triggered my curiosity.
