I’d be curious to know what a power and sample size calculation for the smallest meaningful difference looks like.

For those who are not familiar, the power of a study is the probability of detecting a difference that is actually present. Failing to detect a real difference is a type II error, and power is one minus the probability of that error.

In biomedical research, large controlled trials are typically designed with 80% or 90% power.

Even after accepting a 10% or 20% chance of missing a true difference between groups, sample size gets big fast when the meaningful difference is small, often requiring thousands of people.
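To give a rough sense of how fast it grows, here is a minimal sketch of the standard normal-approximation formula for comparing two group means. The function name `n_per_group` is made up for illustration, and it assumes a two-sided test at alpha = 0.05, equal group sizes, and a difference expressed in standard deviation units (Cohen's d). A real trial would use a more careful calculation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate people needed per group to compare two means.

    effect_size is the difference in means divided by the common
    standard deviation (Cohen's d). Normal approximation, two-sided test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Per-group sample size at 80% and 90% power as the difference shrinks.
for d in (0.8, 0.5, 0.2, 0.1):
    print(d, n_per_group(d, power=0.80), n_per_group(d, power=0.90))
```

Under these assumptions, halving the difference roughly quadruples the sample size: about 63 per group at d = 0.5, but about 1,570 per group at d = 0.1 with 80% power.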

So, if we really wanted to know whether speaker x was, say, 5% better than speaker y, how many people, or how many blinded trials for a single individual, would it take to confidently conclude there is a difference?

I don’t have the stats skills to figure this out but I bet it’s more than most would think.
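As a back-of-envelope version of the single-listener case, here is a sketch that treats each blinded trial as a forced choice where pure guessing is right 50% of the time. Reading "5% better" as "the listener picks the better speaker 55% of the time" is my assumption, not something the question pins down, and `trials_needed` is a hypothetical helper. It uses the normal approximation for a one-sided test of a single proportion at alpha = 0.05.

```python
from math import ceil, sqrt
from statistics import NormalDist


def trials_needed(p_true, p_chance=0.5, alpha=0.05, power=0.80):
    """Approximate blinded trials needed to show one listener beats chance.

    p_true is the assumed rate at which the listener picks the better
    speaker; p_chance is the rate expected from guessing.
    Normal approximation, one-sided test of a single proportion.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # one-sided critical value
    z_power = z.inv_cdf(power)
    numerator = (z_alpha * sqrt(p_chance * (1 - p_chance))
                 + z_power * sqrt(p_true * (1 - p_true)))
    return ceil((numerator / (p_true - p_chance)) ** 2)


# Trials needed as the listener's true hit rate gets closer to chance.
for p in (0.75, 0.60, 0.55):
    print(p, trials_needed(p))
```

With those assumptions, a listener who is right 55% of the time needs on the order of 600 trials for 80% power, compared with a couple of dozen for a listener who is right 75% of the time.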