I don't have any relevant experience, sorry.
The variable was the loudspeaker. This was before 1986.
Read the paper.
Yet consider the preference testing between the Revel Salon 2 and the JBL Array 1400 described on page 399 of the second edition of Toole's book:
"When they are put against each other in double-blind tests, the audible differences are small, somewhat program dependent, and listener ratings tend to vary slightly and randomly around a high number. In the end there may be no absolute winner that is revealed with any statistical confidence; the differences in opinion are of the same size as those that could occur by chance." What interests me is which factors, other than the on-axis response from the transition zone up to 5 kHz or so, produce such similar preference ratings. Here are the measurements:
View attachment 157120
Don't forget the effects of diffraction, which is part of why Kevin Voecks of Revel had advocated for use of the listening window rather than the on-axis response. I believe that this is why Harman replaced the on-axis response with LW in their predicted in-room response. Toole wrote an interesting article on the measurement and calibration of sound systems.
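For what it's worth, here is a rough sketch of how a predicted in-room curve can be blended from spinorama data. It assumes the commonly cited CTA-2034-A weighting (12% listening window, 44% early reflections, 44% sound power) and assumes the blend is done on squared pressure rather than directly on dB values. The function names and the sample numbers are mine, purely for illustration, and this is not Harman's code.

```python
import math

# Assumption: the commonly cited CTA-2034-A weights for the estimated in-room response.
WEIGHTS = {"listening_window": 0.12, "early_reflections": 0.44, "sound_power": 0.44}


def db_to_power(db):
    """Convert a level in dB to a relative squared-pressure value."""
    return 10.0 ** (db / 10.0)


def power_to_db(power):
    """Convert a relative squared-pressure value back to dB."""
    return 10.0 * math.log10(power)


def predicted_in_room(lw_db, er_db, sp_db):
    """Blend three dB curves, sampled on the same frequency grid, into one curve."""
    if not (len(lw_db) == len(er_db) == len(sp_db)):
        raise ValueError("curves must share the same frequency grid")
    result = []
    for lw, er, sp in zip(lw_db, er_db, sp_db):
        blended = (
            WEIGHTS["listening_window"] * db_to_power(lw)
            + WEIGHTS["early_reflections"] * db_to_power(er)
            + WEIGHTS["sound_power"] * db_to_power(sp)
        )
        result.append(power_to_db(blended))
    return result


# Hypothetical levels at two frequencies, not measurements of any real speaker.
print(predicted_in_room([85.0, 84.0], [83.0, 81.0], [80.0, 77.0]))
```

Note that the on-axis curve does not appear anywhere in the blend; the direct sound is represented by the listening window only.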
Instead of asking more and more questions, perhaps you could answer this yourself for, say, the Neumann KH 310 or Genelec 8351B.
The variable is the loudspeaker. Consider allergy testing in medicine, where the variable is the allergen. One could quibble endlessly about why specific varieties, partially or extensively hydrolyzed allergens, or partially or extensively denatured versions were not also tested.
One would first have to show that the anechoic on-axis or listening window measurements of the Beolab 90 were identical. It's possible, but I haven't seen that shown, and I'd like to see it first.
I don't know when Toole began working for Harman, but at the time of the paper under discussion he was at the NRC. You're standing decades later and attributing false motivations, commercial or otherwise, to what seems to me an earnest attempt to use the available tools to address interesting questions about loudspeakers at a time when the body of literature was quite limited.
That's possible. It's been discussed elsewhere.
I've only seen several. Look again at the first one you posted. Nothing about the shuffler prevents placing speakers against the front wall, and in fact, Harman has special testing for on-wall speakers.
The first commercially available QRD diffuser appeared in 1983. It would be nice if Toole et al. had had access to a time machine at the NRC. Unfortunately, we cannot go back and ask them to accommodate a seemingly infinite number of requests. We simply have what has been published, and we're stuck interpreting and extrapolating from it, despite our wishes to the contrary.
In my opinion, the most interesting method for addressing individual parameters of loudspeaker performance, as opposed to the loudspeaker itself as the variable, only became remotely feasible in the recent past through simulated testing:
https://users.aalto.fi/~ktlokki/Publs/JASMAN_vol_146_iss_5_3562_1.pdf
However, I doubt that the interest and resources would be brought to bear, plus the usual criticisms regarding the listeners would apply, plus the question of whether simulation adequately models reality, etc.
Sorry, I rather doubt that I have adequate knowledge and resources to begin to address the multiple posts and questions with which you're likely to respond.