Well randomized is the key. If you only use college students, or only volunteers at an AES conference, or even only people from the US, the sample is unlikely to be randomized well enough to represent the population of the world. A well randomized test...
Actually, the problem is more difficult than it may first seem. Here is a famous and well-regarded paper on medical research (and medical research is what this would be), although it is not without some criticism too: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/