This is certainly one of the stranger ways of assessing headphones and speakers that I've seen.
It was deliberately done not so much for headphones, but because my near-field speaker setup is a binaural system that uses passive XTC (cross-talk cancellation, in this case via head shadowing). As with conventional XTC, there is inherent coloration because some form of filtering is in place. Here, below about 1 kHz, where head shadowing is ineffective, both ears hear both speakers. Above that, however, the contralateral speaker is no longer audible. Given the speakers have separate terminals, this was easy enough to demonstrate. The woofers don't show much interaural level difference between the ears, which is pretty much what one would expect from normal head diffraction. But the response of the tweeter is noticeably attenuated. The issue is that this artificially imposes my own HRTF on the speakers' response, which makes them sound dull and recessed.

Using music was not terribly effective because of the perceptual bias of one's expectations. I couldn't get close enough to see a consistent pattern in the measurements that would show what corrections were needed. With normal non-musical sounds, though, the coloration was much easier to hear. Once I had a rough response, semi-anechoic measurements of the speakers combined with some estimates of my own HRTFs allowed me to finally pin down what the target response needed to be. There is still some residual coloration in the crossover region between the woofers and the tweeter, but this, surprisingly, is not really audible when listening to music. It's only with non-musical content that has spectral energy within that region that the remaining slight coloration becomes apparent.

As far as the imaging goes, if I use ambisonic or binaural content that also has conventional stereo provided (quick example with synthesized instruments and choral vocals:
MIR Pro 3D Binaural "Sonic Explorations"), the imaging between the speakers and the headphones becomes essentially identical. This, of course, only works when listening about 45 degrees off-axis, in the direct sound field of a speaker with a wide dispersion pattern; otherwise diffraction again becomes an issue. Even then I had to take semi-anechoic measurements, since I also have to correct for the off-axis response of the speakers. Surprisingly, even though the null is fairly shallow at around 8 dB or so, it still works reliably, but I had to establish an RFZ (reflection-free zone) to ensure the reflections did not interfere. I did take some additional measurements using a mockup head form, as well as some crude measurements at each ear opening using an iMM-6, and they agree at least reasonably well with the respective corrections in the target response.

Still, the sound shows why XTC has been pursued. It works really, really well, providing immersive imaging that even exceeds headphone listening. But it's also quite sensitive, and it requires substantial investment in setup to get optimal results. In the end, multi-channel will probably be the better option for most listening setups once some form of upmixing or a new format is fully adopted in the future. It's more flexible with respect to the room and listening position. XTC setups are still very much a single-user affair, and professional implementations like the BACCH system do require head tracking for conventional setups.
But, getting back to the use of unconventional means, it's because that was all I had to start with. You can't get somewhere without first finding a way to actually get there...
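
For anyone curious about the bookkeeping behind a target response like this, here is a minimal sketch in Python. It is not the exact procedure used here, and every number in it is a made-up placeholder rather than a measurement. It assumes the correction in each band is simply the gain that offsets the speaker's off-axis deviation plus the estimated head-shadow loss at the ear, capped at a maximum boost. The function names and the 6 dB cap are hypothetical choices for illustration.

```python
# Illustrative only: all values below are placeholders, not measured data.

def correction_db(speaker_db, shadow_db, max_boost_db=6.0):
    """Per-band EQ gain (dB) offsetting speaker deviation plus head-shadow loss.

    speaker_db:   off-axis speaker response relative to flat, per band (dB)
    shadow_db:    estimated level change at the ear from the listener's own
                  head/HRTF, per band (dB, negative means attenuation)
    max_boost_db: assumed cap on boost so deep dips are not over-corrected
    """
    if len(speaker_db) != len(shadow_db):
        raise ValueError("band lists must be the same length")
    gains = []
    for spk, shd in zip(speaker_db, shadow_db):
        needed = 0.0 - (spk + shd)  # gain that would restore 0 dB at the ear
        gains.append(min(needed, max_boost_db))
    return gains


def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20.0)


# Hypothetical band centers and placeholder responses.
bands_hz = [500, 1000, 2000, 4000, 8000]
speaker_off_axis_db = [0.0, -0.5, -1.0, -2.0, -4.0]
head_shadow_db = [0.0, -1.0, -2.0, -3.0, -4.0]

eq = correction_db(speaker_off_axis_db, head_shadow_db)
for freq, gain in zip(bands_hz, eq):
    print(f"{freq:>5} Hz: {gain:+.1f} dB")

# An 8 dB null leaves the crosstalk at roughly 0.4 of the direct amplitude.
print(f"-8 dB as amplitude ratio: {db_to_amplitude_ratio(-8.0):.2f}")
```

The cap matters because the placeholder top band would otherwise ask for 8 dB of boost. In practice the ear-opening measurements and listening with non-musical sounds would decide how much of that is worth applying.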