It seems to me one (rough) solution would be to fix the mic position, run a low-level sweep to capture the in-room response, then use that response to adjust the relative levels in the distortion sweep. The in-room response should not change with volume (until you knock something around, or down), so measure it at low level, then do the higher-power distortion measurements. This could be automated: for each test frequency, measure the levels at the fundamental and at each harmonic frequency using low-level test tones, then fold those levels into the distortion reading.
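To make the bookkeeping concrete, here is a rough Python sketch of that correction. It is only my reading of the idea, not a tested measurement tool: the function names and the list layout (index 0 is the fundamental, index k is the tone at (k+1) times the fundamental) are made up for illustration, and it assumes the low-level response and the high-level distortion run were taken with the same mic position and gain settings.

```python
import math


def corrected_harmonics_db(measured_db, response_db):
    """Return harmonic levels relative to the fundamental, in dB,
    with the low-level response removed.

    measured_db[k]: level at (k+1)*f0 from the high-power distortion run.
    response_db[k]: level at (k+1)*f0 from low-level test tones.
    Both layouts are assumptions made for this sketch.
    """
    if len(measured_db) != len(response_db):
        raise ValueError("need one response value per measured tone")
    corrected = []
    for k in range(1, len(measured_db)):
        raw = measured_db[k] - measured_db[0]     # harmonic re fundamental, as recorded
        tilt = response_db[k] - response_db[0]    # how the response favors that harmonic
        corrected.append(raw - tilt)
    return corrected


def thd_percent(relative_db):
    """Root-sum-square of harmonic levels (dB re fundamental), as a percentage."""
    power_sum = sum(10 ** (d / 10) for d in relative_db)
    return 100 * math.sqrt(power_sum)


if __name__ == "__main__":
    measured = [90, 50, 40]   # fundamental, 2nd, 3rd harmonic (made-up numbers)
    response = [70, 76, 64]   # low-level response at the same three frequencies
    rel = corrected_harmonics_db(measured, response)
    print(rel, round(thd_percent(rel), 3))
```

With those made-up numbers, the 2nd harmonic sits on a 6 dB response peak, so its corrected level drops from -40 to -46 dB; the 3rd sits in a 6 dB dip, so it rises from -50 to -44 dB.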
This would not correct problems in the recording (sensing) chain, however, be it mic overload, input buffer or ADC distortion, etc. And moving the mic or anything else that could change the response will of course impact the results.
In the distant past I used a Matlab program that sequentially generated and recorded the fundamental and ten harmonic tones (or up to 20 kHz, the highest frequency I chose to test) of the desired test tone in short bursts at roughly 60 or 70 dB SPL, then did a burst power sweep at the fundamental. A simple loop stepped through multiple test frequencies. That setup used GPIB control of a generator and spectrum analyzer; these days you would just use a sound card or something like REW.
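For anyone who wants to try this with a sound card, the tone-planning part might look something like the Python sketch below. This is not the old Matlab program, just an illustration of the same loop. The names, the 48 kHz sample rate, the 0.1 amplitude, and the reading of "ten harmonic tones" as the 2nd through 11th harmonics are all my assumptions; playback and recording are left out because they depend on your hardware and software.

```python
import math


def harmonic_frequencies(f0, n_harmonics=10, f_max=20000.0):
    """Fundamental plus up to n_harmonics harmonics, stopping at f_max (Hz)."""
    freqs = [f0]
    for order in range(2, n_harmonics + 2):
        f = order * f0
        if f > f_max:
            break
        freqs.append(f)
    return freqs


def tone_burst(freq, duration_s, sample_rate=48000, amplitude=0.1):
    """Samples of a short sine burst (no windowing; a real test would fade in and out)."""
    n = round(duration_s * sample_rate)
    step = 2 * math.pi * freq / sample_rate
    return [amplitude * math.sin(step * i) for i in range(n)]


def measurement_plan(test_frequencies):
    """Map each test fundamental to the low-level tones to play and record."""
    return {f0: harmonic_frequencies(f0) for f0 in test_frequencies}


if __name__ == "__main__":
    for f0, tones in measurement_plan([100, 1000, 5000]).items():
        print(f0, tones)
```

Each entry in the plan is the set of low-level bursts to record for one fundamental; the burst power sweep at the fundamental would follow as a separate step.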
Can someone provide a link to the GedLee measurements for the thread-weary?