Well, I imagine that a commercial entity supporting or funding research about "what most people prefer" is obviously interested in the secondary outcome, "what most people would buy". To maximize sales, you maximize the probability that your speaker will receive a high average preference score among the target audience. I believe the research can be, and was, broken down further by age group and sex.
To take a medical analogy, very good models of survival (and, even better, of survival with good quality of life) can be developed if one takes physical exercise, smoking, alcohol consumption, and social contacts as parameters. Those are statistically very valid, on a much bigger scale and with a higher degree of confidence than speaker preference models. That doesn't mean that you will not find many individual outliers.
The same happens with speakers: outliers, in terms of preferences, abound. There's nothing intrinsically wrong with that.
Where, imho, some ASR enthusiasts jump the gun is when they treat a "best fit" model between a population of listeners and a population of speakers as a strong predictor of individual preference. As quite a few others have noted here before, the model and its derived scores are quite weak, in both value and confidence level, as predictors of an individual's preference.
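For anyone who wants to see the population-versus-individual gap in numbers, here is a toy simulation. It is only a sketch on made-up synthetic data, not anything from the actual research: the `simulate` function, the 0 to 10 score range, and the `taste_sd` spread of personal taste are all my own assumptions. Each simulated listener rates each speaker as the model score plus a random personal offset, and we compare how well the score tracks the average rating versus a single listener's ratings.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def simulate(n_speakers=40, n_listeners=200, taste_sd=4.0, seed=0):
    """Toy model (all numbers are assumptions, not measured data).

    Returns (r_population, r_individual):
      r_population: correlation of model score with the mean rating
      r_individual: average correlation of model score with one
                    listener's own ratings
    """
    rng = random.Random(seed)
    # Hypothetical model score for each speaker, on a 0..10 scale.
    scores = [rng.uniform(0, 10) for _ in range(n_speakers)]
    # Each listener's rating = model score + personal taste offset.
    ratings = [
        [s + rng.gauss(0, taste_sd) for s in scores]
        for _ in range(n_listeners)
    ]
    mean_rating = [
        statistics.fmean(r[i] for r in ratings) for i in range(n_speakers)
    ]
    r_population = pearson(scores, mean_rating)
    r_individual = statistics.fmean(pearson(scores, r) for r in ratings)
    return r_population, r_individual


if __name__ == "__main__":
    r_pop, r_ind = simulate()
    print(f"score vs. average rating:   r = {r_pop:.2f}")
    print(f"score vs. one listener:     r = {r_ind:.2f} (mean over listeners)")
```

The personal offsets largely cancel out in the average, so the score looks like an excellent predictor of the panel mean while being a much weaker predictor for any one listener. How big the gap is depends entirely on the `taste_sd` you assume.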
One can hardly blame Toole, Olive, and others for that over-enthusiastic extension.
I think that, just as there are subjectively inclined people willing to embrace any nice story because it is nice, there are objectively inclined people willing to trust hard numbers simply because they are numbers.