There are a few technical reasons why someone eyeballing the spinorama charts might be able to come up with a better estimate of listener preference than the Olive model's 74% correlation. They come down to known shortcomings of the input variables that Olive defined. For example, the deviation variables (AAD, NBD) are arguably too simplistic: they do not take into account the fact that some frequency ranges are more critical than others (the entire 100 Hz-12 kHz range is weighted equally), nor the fact that peaks tend to be more audible than dips. NBD also splits the frequency range into fixed, discontinuous 1/2-octave bands, which is not great from a perceptual perspective.
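To make the "fixed, discontinuous bands" point concrete, here is a rough sketch of how I understand the NBD computation from the published description. The band layout and range limits here are my assumptions, not a reference implementation:

```python
import numpy as np

def nbd(freqs, spl, f_lo=100.0, f_hi=12000.0):
    """Sketch of Narrow Band Deviation (NBD), as I read the published
    definition: split 100 Hz-12 kHz into fixed 1/2-octave bands, take
    the mean absolute deviation of the response from each band's own
    mean, then average across bands. Note the shortcomings discussed
    above: every band counts equally, and a +1 dB peak is penalized
    exactly as much as a -1 dB dip."""
    n_bands = int(np.ceil(np.log2(f_hi / f_lo) * 2))     # 1/2-octave steps
    edges = f_lo * 2.0 ** (0.5 * np.arange(n_bands + 1))  # band boundaries
    devs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = spl[(freqs >= lo) & (freqs < hi)]
        if band.size:
            devs.append(np.mean(np.abs(band - band.mean())))
    return float(np.mean(devs))
```

Because each band is scored against its own local mean, a response that steps discontinuously at a band boundary can score better than one with the same energy error spread across a boundary, which is another way the fixed-band scheme is perceptually questionable.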
Edechamps, I agree with you that the Olive model isn't perfect in the ways you mention. So how would an individual go about considering critical freq ranges? Do the spinorama analysts here memorize which bands are more important and place more "mental" weight when looking at the curves? And do they also memorize what Q's and amplitudes are more audible than others (and at what frequency) and also mentally weight the peaks higher? I guess I'm not clear on how one could use this additional information to then eyeball a FR curve and decide which one is "smoother" in the right ways. And for the sake of argument, I'm talking about visually inspecting the curves taken from several subjectively well-regarded loudspeakers and making firm predictions without listening to them (not a maldesigned speaker with a wild curve vs. a studio monitor).
Another example is the SM variable whose definition is incredibly weird, to the point where it looks more like a mistake than something deliberate.
Curious what's weird about it - isn't it simply the r^2 correlation of the frequency response against a flat target? It should "correlate" with the NBD statistic, except it's calculated continuously instead of via discrete 1/2-octave bands.
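Here's my rough attempt at coding up the definition as I read it in the paper (my reading, could well be wrong): it seems to be the r^2 of a linear regression of level against log-frequency, rather than a comparison to a flat line as such:

```python
import numpy as np

def smoothness(freqs, spl, f_lo=100.0, f_hi=16000.0):
    """Sketch of the SM (smoothness) metric as I read the published
    definition: the r^2 of an ordinary linear regression of level (dB)
    against log-frequency. Because ANY straight line fits itself
    perfectly, a heavily tilted but smooth response still scores
    SM close to 1 -- it rewards straightness, not flatness."""
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    x = np.log10(freqs[mask])   # regress against log-frequency
    y = spl[mask]
    r = np.corrcoef(x, y)[0, 1]
    return float(r * r)
```

If that reading is right, a speaker with a strong but perfectly smooth downward tilt gets nearly the same SM as a flat one, which is maybe the weirdness you mean?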
It is quite possible that, if a new model were designed without these shortcomings, we might end up with a correlation higher than 74% from the same raw spinorama data. Or we might not. It's impossible to know without doing more research, which sadly requires a lot of expensive and time-consuming blind testing.
This answers my original question, which was whether there exists evidence of a formula/technique using spinorama charts that outperforms the Olive model. In the absence of such evidence, I'm wondering why people think their own formula/technique might be more accurate/predictive, and how they know that it is (i.e., have people done their own controlled listening tests to validate their preferences?).
The other reason why people might prefer looking at the curves themselves rather than trusting the Olive preference rating model is because they might want to use the speaker in a different way than the listening setup used to calibrate the model. For example, they might want to use a subwoofer (all subjective tests for the Olive model were done without a subwoofer). Or they intend to EQ the speaker, so they are looking for EQability rather than out-of-the-box performance.
Sure, that's fair - and there are likely many other reasons why individuals might learn more by looking at individual spinorama charts. And individual circumstances might make the Olive formula less applicable, to your point. Don't disagree.