There's no need for a large sample population when the goal is for one person to see if they can tell a difference. The sample population can be one, since only that one person cares about the result. What does matter, IME/IMO, is doing enough test runs over enough time to determine (or convince myself) that there is, or is not, a difference that can be heard. How well a test is controlled is not determined by how many people take it, though how far the outcome generalizes might be. In the case of a guy (gal, whatever) sitting at home trying to decide if DAC A or DAC B sounds better, all it takes is careful level matching and someone else's hand on the switch (could be a computer-controlled switch, natch). For example, you could listen in mono with one DAC (or whatever) on the left channel and one on the right, just toggling between left and right. I used to have a simple relay box for switching power amp outputs (actually a pair of DPDT relays, with a load resistor on each so the unselected amp was not driving into an open circuit) controlled by a simple logic circuit that randomly selected the output and kept track inside the box with LEDs. These days I'd use a Raspberry Pi or something and a program to keep track for me (back then, several decades ago, I had to manually open the box and record the LED settings).
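As a rough illustration of the "program to keep track for me" idea, here is a minimal Python sketch, not the original relay box logic. It randomly picks A or B for each trial, logs the listener's guess, and estimates how likely the score would be from pure guessing. The function names (`run_trials`, `score`, `p_value_by_chance`) and the `get_guess` callback are made up for this example, and the actual relay switching is left as a comment because it depends on the hardware.

```python
import random
from math import comb

def run_trials(n_trials, get_guess, rng=None):
    """Pick A or B at random for each trial and log (actual, guess) pairs.

    get_guess is a hypothetical callback that returns the listener's
    answer ("A" or "B") for trial i without ever seeing the selection.
    """
    rng = rng or random.Random()
    log = []
    for i in range(n_trials):
        actual = rng.choice("AB")
        # On real hardware, the relay would be switched to `actual` here.
        guess = get_guess(i)
        log.append((actual, guess))
    return log

def score(log):
    """Count the trials where the guess matched the hidden selection."""
    return sum(1 for actual, guess in log if actual == guess)

def p_value_by_chance(correct, n_trials):
    """One-sided binomial probability of at least `correct` hits
    out of `n_trials` when every answer is a coin flip (p = 0.5)."""
    ways = sum(comb(n_trials, k) for k in range(correct, n_trials + 1))
    return ways / 2 ** n_trials
```

For a session at the keyboard, `get_guess` could simply wrap `input()`. As a feel for the numbers, 12 correct out of 16 would happen by guessing a bit under 4% of the time, which is why a handful of trials is not enough to convince yourself either way.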
Since this thread is about power cords, I think they are hard to test. Not because measuring a power cord is hard (that's easy), but because of all the "what if" conditions people will ask for and claim make a difference: current draw, amplitude and frequency of direct (from the wall socket) and impinging noise, interaction with the wall source and the component (impedance and so on), etc. So many ways for people to say "well, you did not test my system, so it doesn't apply, and I know what I heard". That is where a single-person test comes in handy.