Does it, though? I wouldn't be so sure. I spent the last few hours thinking about this, rewriting this post several times, and I came up with a somewhat counter-intuitive conclusion. Here goes…
Recall what
@daverosenthal wrote some time ago:
That's already an amazing insight in and of itself, but I'm not sure we carefully considered all the implications.
Let's reflect on this for a moment. It helps to go back to the
definition of r². To avoid confusion over the term "flat", in the following I use the term "horizontal" to mean a response that shows no overall trend/tilt and the term "well-behaved" to mean a response with no local (high/medium-Q) deviations.
Let's enumerate the possible cases:
- The response is perfectly horizontal and perfectly well-behaved. In this case the measurement, the regression line, and the mean all coincide; there is zero variance to explain either way, and therefore SM=r²=0/0, i.e. undefined.
- The response is perfectly horizontal but not well-behaved. In this case the regression line is horizontal, matching the mean of the measurement. The regression doesn't explain any of the variance and we end up with SM=r²=0.
- This is an egregious example where "smoothness" as defined by Olive is very different from "well-behaved" as I defined it, which is misleading. This is why @daverosenthal said "we shouldn't think of it as measuring smoothness", and he's absolutely right.
- The response is tilted and perfectly well-behaved. In this case the regression line is identical to the measurement itself, explains all the variance, and we have SM=r²=1.
- The response is tilted and not well-behaved. In this case SM will land somewhere between 0 and 1: if the local deviations are large and the tilt small, SM moves closer to 0; if the local deviations are small and the tilt large, SM moves closer to 1.
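To make the four cases concrete, here's a minimal numerical sketch. The `sm()` helper and the synthetic responses are mine (an illustration of computing SM as the r² of a straight-line fit, per the definition discussed here), not Olive's actual code:

```python
import numpy as np

def sm(response: np.ndarray) -> float:
    """r² of a linear fit: the fraction of variance explained by the slope."""
    x = np.arange(len(response))
    fit = np.polyval(np.polyfit(x, response, 1), x)
    ss_res = np.sum((response - fit) ** 2)
    ss_tot = np.sum((response - response.mean()) ** 2)
    return 1.0 - ss_res / ss_tot  # 0/0 when the response is constant

x = np.arange(100)
ripple = np.sin(x / 3.0)   # local (high-Q) deviations
tilt = -0.05 * x           # overall downward slope

print(sm(np.zeros(100)))   # horizontal + well-behaved: 0/0, undefined
print(sm(ripple))          # horizontal + wiggly: close to 0
print(sm(tilt))            # tilted + well-behaved: 1
print(sm(tilt + ripple))   # tilted + wiggly: somewhere in between
```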
So far most people here have assumed that this is too weird to have been designed that way on purpose, and that Olive must have gotten confused about the implications of defining SM as r². But let's assume, for the sake of argument, that the behaviour I just described is working exactly as intended. (After all, SM made it into the model over a whole bunch of other candidate variables, so it must be doing something right.) Why, then, define a variable that way?
Here's my hypothesis:
the effect of SM is to counteract NBD in the presence of a slope.
To see how, let's run through the cases again but this time we'll look at NBD at the same time:
- The response is perfectly horizontal and perfectly well-behaved: SM undefined, perfect NBD.
- The response is perfectly horizontal but not well-behaved. SM zero, bad NBD.
- The response is tilted and perfectly well-behaved: perfect SM, somewhat bad NBD.
- The response is tilted and not well-behaved. This case is more interesting. The more tilted the response, the worse NBD becomes. But at the same time, with the local deviations kept the same, an increased tilt improves SM! Which, depending on the weights, might cancel out the worsening of NBD.
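The compensation effect in the last case can be sketched numerically. One caveat: `nbd_like()` below is a deliberately simplified stand-in (mean absolute deviation from the overall mean), not Olive's actual NBD formula, which averages deviations within ½-octave bands; for the purpose of showing the opposing trends against tilt it should behave the same way:

```python
import numpy as np

def sm(y: np.ndarray) -> float:
    """SM as r² of a linear fit (as discussed above)."""
    x = np.arange(len(y))
    fit = np.polyval(np.polyfit(x, y, 1), x)
    return 1.0 - np.sum((y - fit) ** 2) / np.sum((y - y.mean()) ** 2)

def nbd_like(y: np.ndarray) -> float:
    """Simplified stand-in for NBD: mean absolute deviation from the mean."""
    return float(np.mean(np.abs(y - y.mean())))

x = np.arange(100)
ripple = np.sin(x / 3.0)                 # local deviations, kept constant
for slope in (0.0, 0.02, 0.05, 0.1):
    y = ripple - slope * x               # increasing downward tilt
    # as tilt grows, the NBD-like penalty worsens while SM improves
    print(f"slope={slope:5.2f}  nbd_like={nbd_like(y):5.2f}  sm={sm(y):.3f}")
```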
In light of this I'm wondering if SM, instead of being called "smoothness" should instead be called "tilt compensation factor" or something like that. Its effect (whether intentional or not) is to
compensate for the effect of tilt on other variables, most notably NBD.
This leads me to think that it doesn't make sense to look at SM in isolation. Instead, SM should always be considered in combination with NBD. Looking at the overall score weights:
- NBD_ON is used, but not SM_ON. This suggests listeners preferred speakers with a well-behaved and horizontal direct sound.
- NBD_PIR is used in combination with SM_PIR. This suggests listeners preferred speakers with a well-behaved but not necessarily horizontal PIR.
The above results are consistent with what we would expect considering what we know about loudspeaker preference in general. It makes a lot of sense.
Here's yet another way to phrase this to ensure I get my point across:
- NBD can be thought of as something similar to the variance of the measurement.
- SM quantifies how well a linear regression line, i.e. the slope, explains the variance of the measurements.
- Therefore, SM quantifies how well the slope explains NBD.
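The three bullets above are just the standard variance decomposition of a least-squares fit: total variance splits exactly into the part explained by the slope (which is SM) and the residual part (which is roughly what an NBD-style penalty is left to measure). A quick sketch, on a made-up response, purely for illustration:

```python
import numpy as np

x = np.arange(100)
y = np.sin(x / 3.0) - 0.05 * x             # tilted, wiggly response
fit = np.polyval(np.polyfit(x, y, 1), x)   # regression line
resid = y - fit                            # local deviations around the line

var_total = np.var(y)
var_slope = np.var(fit)
var_resid = np.var(resid)
sm = var_slope / var_total                 # == r² of the fit

# total variance = slope variance + residual variance (up to float error)
print(var_total, var_slope + var_resid, sm)
```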
Revisiting earlier posts:
This is technically correct, but it's not the whole story. NBD_PIR does treat slope as a deviation, but a flat PIR is not necessarily rewarded in the final score, because SM_PIR counteracts that penalty.
It is true that changing the target slope (SL) does nothing to the SM value. But that doesn't mean the score ignores slope. Slope is taken into account in a much more subtle, implicit, hidden way: through the respective weights of SM_PIR and NBD_PIR.
Bottom line: there are good reasons to believe that the behaviour of SM in the Olive paper, while counter-intuitive, is valid and not a typo or omission. Therefore, it can be argued that attempting to "fix" the formula (e.g. by hardcoding an offset) could do more harm than good.
(Whether Olive actually intended for SM to behave that way is a good question. The paper states "Smoothness (SM) values can range from 0 to 1, with larger values representing smoother frequency response curves", which definitely reads like Olive does not understand what SM actually is. But that doesn't really matter - Principal Component Analysis doesn't care about intent, and it's the resulting model and its performance that matters, not what went through Olive's brain when he defined the variables.)
I believe this might also explain why it's quite hard to work out how a given speaker obtained a given score by looking at the score components. Indeed, NBD_PIR and SM_PIR mean nothing in isolation - it's the
combination of them that matters. This also means that the "breakdown" chart that
@MZKM publishes is difficult to interpret when it comes to the PIR components. One way to solve that problem could be to add SM_PIR and NBD_PIR together and present it as a single variable on the radar chart.