This response is really two topics.
I have had some time to think about this topic. Based on what I have read recently and in this thread, the articles that publish blind-test results are irrelevant to me. If this testing is supposed to be scientific, then for a result to be valid anyone should be able to repeat it. The problem I see is that to repeat it you need the same amplifiers, the same speakers, the same speaker cables, the same preamp and the same source hardware, not to mention the same music.
Frankly, I just don't get why I need to know whether people can or cannot hear the difference between one model of a brand of amplifier and another. That is like telling you I have a rosebush with wilting roses. Is that relevant to you? Almost never do I see the hearing test charts of the people listening. What if they are half deaf and can't hear half of the arbitrarily defined spectrum?
On the other hand, why exactly would I need to do that at my house with my equipment? If I don't have different equipment to test that may or may not sound different, why would I want to do this? Now I have to buy even more equipment. This is my workbench. Let's just put this out there: I am a retired RF engineer who developed and designed broadband RF amplifiers for CATV systems, the kind that hang on the pole, for almost 30 years. Never did I or any other engineer do any blind (no pun intended) testing to discern the difference between equipment. All faults that you see on your TV have measurable causes. I designed audio amplifiers in the late 80s. Every one is still working today because I did not use crappy components. For me it is easy to determine whether an audio amplifier is a crappy design.
Besides the frequency-domain measurements that Amir mostly uses, I also like to use time-domain measurements. A digital scope and a decent signal generator cost a fraction of what Amir paid for his distortion analyzer. One thing I always test an amp with is a square wave, to look for high-frequency oscillation when I use a slightly capacitive load. A 100 MHz scope is a good thing to have. In the picture I have a nice classic 100 MHz HP scope, and the little lunchbox beside it is the latest Siglent 14-bit 100 MHz oscilloscope.
Frankly, the ergonomics on the new digital stuff are complete crap. To do simple things you need to push so many buttons, and some functions are nested too many button pushes down. It does have an auto measurement button, but that is only good for the first time, as it resets everything. Sorry for the rant. What the digital stuff is good for is measuring parameters that are preprogrammed.
For an amplifier it is good to see the phase response over the 20 Hz to 20 kHz bandwidth. This gives you an idea of what kind of speaker load will give it trouble. With the signal generator and scope, it can be done in a few minutes. I also like to test an amp at full power at 20 kHz to see how hot it gets after a few minutes and whether it goes into current limiting or shutdown.
Back when the first high-power solid-state amplifiers came out, the output stage would blow because the transistors were too slow: as one turned on, the other had not yet shut off, so you had mutual conduction and the transistors went into avalanche destruction as you exceeded the safe operating area. Of course the manufacturers claimed that was not how the amplifier was going to be used, but I just call it a crappy design and cost cutting. Don't claim it can produce full power from 20 Hz to 20 kHz if it cannot.
The class G amps switch at high frequencies. The output transistors are stressed, as they have a finite bandwidth and much less margin than a transistor being used at audio frequencies.
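Going back to the phase measurement with a signal generator and a two-channel scope: it comes down to reading the time delay between the input and output waveforms at each test frequency and converting that to degrees. Here is a minimal Python sketch of that conversion. The function name and the delay readings are made up for illustration; they are not measurements from any real amplifier.

```python
def phase_shift_deg(freq_hz, delay_s):
    """Convert a scope time delay into phase shift in degrees.

    delay_s is positive when the output lags the input, so a lag
    comes out as a negative phase. Result is wrapped to [-180, 180).
    """
    phase = -360.0 * freq_hz * delay_s
    return (phase + 180.0) % 360.0 - 180.0


# Hypothetical readings: (frequency in Hz, output delay in seconds)
readings = [(20, 0.0), (1_000, 25e-6), (20_000, 5e-6)]

for freq, delay in readings:
    print(f"{freq:>6} Hz: {phase_shift_deg(freq, delay):7.1f} deg")
```

With these made-up numbers, a 25 microsecond lag at 1 kHz works out to about -9 degrees, and a 5 microsecond lag at 20 kHz to about -36 degrees. A phase that swings a lot toward the top of the band is the sort of thing I would look at more closely with a reactive load.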
If you look at the specs of power MOSFETs, from the early devices made by Toshiba to the more recent ones developed for switching, the difference is not in the same ratio as 1 MHz to 20 kHz. All class G amps have to put in a dead zone where both transistors are cut off so the space charge can dissipate, so that when one turns on the other is guaranteed to be off. I would worry that over a decade or longer the timing circuit may become unstable, and when both conduct, it's game over.
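To put the dead zone worry into numbers, here is a rough Python sketch of the margin between the dead time the timing circuit provides and the time the outgoing transistor needs to fully turn off. The function names and all values are hypothetical, picked only to show the arithmetic, and the model ignores temperature and load effects.

```python
def dead_time_margin_ns(dead_time_ns, turn_off_ns):
    """Margin between the programmed dead time and the time the
    outgoing transistor needs to stop conducting, in nanoseconds."""
    return dead_time_ns - turn_off_ns


def both_conduct(dead_time_ns, turn_off_ns):
    """True when the dead time no longer covers the turn-off time,
    i.e. both output transistors can conduct at the same moment."""
    return dead_time_margin_ns(dead_time_ns, turn_off_ns) <= 0


# Hypothetical values: 40 ns dead time when new, 25 ns turn-off time
print(dead_time_margin_ns(40, 25), both_conduct(40, 25))

# Same transistor after the timing circuit has drifted down to 20 ns
print(dead_time_margin_ns(20, 25), both_conduct(20, 25))
```

The point of the sketch is only that the margin is a small difference between two small numbers, so a modest drift in the timing circuit can use it up.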
When I buy audio equipment I try to get a schematic, either before or after the purchase. This way I can fix it if there is a problem. Some of the newer audio equipment is disposable, so it does not really earn my respect if it is not repairable.
I think this mostly covers what I wanted to say in this post.