Speaker Equivalent SINAD Discussion

MZKM · Jan 18, 2020

daverosenthal said:
Guys, I'm reading the Olive AES paper and it's really weird. Yes, I know this is heresy. The SM_* "smoothness" model feature appears to use the Pearson 'r' regression correlation coefficient in a way that's, well, charitably, counter-intuituve. To wit: A speaker that had a highly-flat-and-smooth (i.e. desirable) frequency response would have a very low "smoothness" by this measure whereas a speaker that had a bumpy response with a distinct frequency-dependent tilt would score highly. The stats intuition here is that r is high when the variation in dB is well explained by the variation in frequency. In layman's terms, given a fixed amount of natural "wobble" in frequency response, the "smoothness" number will be much higher if there is a non-flat slope to the frequency response. Weird.

(In the patent, Olive notices the effect of this in the regression: "Variables that have small correlations with preference are smoothness (SM) and slope (SL) when applied to the ON and LW curves", but doesn't seem to realize the cause--on-axis (ON) and listening window (LW) frequency responses tend to be flat, not downward sloping, so the 'r' coefficient disappears.)

The final model is fit from many features with mutual correlation so the use of this weird SM feature doesn't invalidate the model, it just means that we shouldn't think of it as measuring smoothness(!). My guess is that the more fundamental effect of SM_PIR in the final score is steering preference to speakers with gradually downward-tilting response. Finally, the "NBD_* feature captures a similar concept but appears to be better engineered, which is perhaps why the "smoothness" factor plays only a small role in the final model.

Forgetting for a second about producing one number to rule them all... At this point we know how to measure a few key numerical attributes of a speaker in a way that can 1) be sorted best to worst and 2) are very likely to related to listener preference. These are:

Low frequency extension (lower better)

Narrow-band frequency response variations (less better. Per Olive: "the narrow band deviation (NBD) metric yields some of the highest correlations with preference...")

Overall slope close to Harmon in-room target (closer better)

Are there others that fall in this category?

Well, fuck me; I just looked and SM is r2.

I may have to look at my calculation again:

For the Revel, the r2 with linear regression/spacing is a rounded version of my calculation, but log spacing gives a slightly higher score.

For the NHT, ~~neither linear nor log spacing is a match for my calculation; I get ~0.175, where linear r2 = 0.107 and log r2 = 0.232.~~ Plugging the NHT data into my auto-calc version gives r2 of 0.175. Need to look at what when wrong in my original chart.

Now, the paper mentions using a log transform on the data points to linearly space them, so I’ll try that and see if it matches the r2 of the log regression line.

Looks like I didn’t have to use that crazy SM formula

, which I just recalled is cov(xy)/(sqrt(var(x)var(y)) from when I took statistics.

EDIT: Also just noticed that the slope is |actual-target|

MZKM · Jan 18, 2020

daverosenthal said:
Guys, I'm reading the Olive AES paper and it's really weird. Yes, I know this is heresy. The SM_* "smoothness" model feature appears to use the Pearson 'r' regression correlation coefficient in a way that's, well, charitably, counter-intuituve. To wit: A speaker that had a highly-flat-and-smooth (i.e. desirable) frequency response would have a very low "smoothness" by this measure whereas a speaker that had a bumpy response with a distinct frequency-dependent tilt would score highly. The stats intuition here is that r is high when the variation in dB is well explained by the variation in frequency. In layman's terms, given a fixed amount of natural "wobble" in frequency response, the "smoothness" number will be much higher if there is a non-flat slope to the frequency response. Weird.

(In the patent, Olive notices the effect of this in the regression: "Variables that have small correlations with preference are smoothness (SM) and slope (SL) when applied to the ON and LW curves", but doesn't seem to realize the cause--on-axis (ON) and listening window (LW) frequency responses tend to be flat, not downward sloping, so the 'r' coefficient disappears.)

The final model is fit from many features with mutual correlation so the use of this weird SM feature doesn't invalidate the model, it just means that we shouldn't think of it as measuring smoothness(!). My guess is that the more fundamental effect of SM_PIR in the final score is steering preference to speakers with gradually downward-tilting response. Finally, the "NBD_* feature captures a similar concept but appears to be better engineered, which is perhaps why the "smoothness" factor plays only a small role in the final model.

Forgetting for a second about producing one number to rule them all... At this point we know how to measure a few key numerical attributes of a speaker in a way that can 1) be sorted best to worst and 2) are very likely to related to listener preference. These are:

Low frequency extension (lower better)

Narrow-band frequency response variations (less better. Per Olive: "the narrow band deviation (NBD) metric yields some of the highest correlations with preference...")

Overall slope close to Harmon in-room target (closer better)

Are there others that fall in this category?

Thanks a bunch, now my calculations should be correct. And indeed it was exactly the same as the r2 that Sheets spits out for log regression, meaning I didn't even have to go through all that.

MZKM · Jan 18, 2020

daverosenthal said:
In layman's terms, given a fixed amount of natural "wobble" in frequency response, the "smoothness" number will be much higher if there is a non-flat slope to the frequency response. Weird.

Smoothness for the measurement uses Sound Power, which naturally tilts down for monopole speakers.

If you wish to calculate SM for other curves, it has a different target slope. It’s page 13 of the patent (labeled page 6).

I’ll play around with it, but for the Revel, using slope vs abs(target - measured slope) didn’t alter the final result at all, not even a tiny decimal amount.

daverosenthal · Jan 18, 2020

MZKM said:
Thanks a bunch, now my calculations should be correct. And indeed it was exactly the same as the r2 that Sheets spits out for log regression, meaning I didn't even have to go through all that.

Glad I could help, though I still think the SM measure is bunk

I updated my tool to compute NBD (which aligns nicely with yours, @MZKM, only a tiny difference in exactly how we bucket the bands). I also updated the tool to compute the precise -6db point accounting for the target curve and solving the linear interpolation to avoid the "first less or first greater" issue:

MZKM · Jan 18, 2020

daverosenthal said:
Glad I could help, though I still think the SM measure is bunk

I updated my tool to compute NBD (which aligns nicely with yours, @MZKM, only a tiny difference in exactly how we bucket the bands). I also updated the tool to compute the precise -6db point accounting for the target curve and solving the linear interpolation to avoid the "first less or first greater" issue:

View attachment 46408

The -6dB point for the calculation is for SP Hz relative to avg LW dB from 300Hz-1kHz.
My calculation for the NHT has ~73.5dB for avg dB and ~69.6Hz for -6dB.

Or, relative to avg SP dB from 300Hz-1kHz; Sean stating this could be used for rear-porter speakers.

daverosenthal · Jan 19, 2020

MZKM said:
Smoothness for the measurement uses Sound Power, which naturally tilts down for monopole speakers.

If you wish to calculate SM for other curves, it has a different target slope. It’s page 13 of the patent (labeled page 6).

View attachment 46407
I’ll play around with it, but for the Revel, using slope vs abs(target - measured slope) didn’t alter the final result at all, not even a tiny decimal amount.

I think might be confusing SL and SM. SM (smoothness) is just the r^2 coefficient of the linear fit to the curve. SL (slope) is the one which is computed relative to a target, which is different for the various curve types.

MZKM · Jan 19, 2020

daverosenthal said:
I think might be confusing SL and SM. SM (smoothness) is just the r^2 coefficient of the linear fit to the curve. SL (slope) is the one which is computed relative to a target, which is different for the various curve types.

You can calculate Smoothness for any curve, the preference formula uses SM_PIR, but you can calculate it for the on-axis, listening window, etc.

Slope is used to calculate smoothness, and has the different targets based on which curve you are using.

As for r^2, it is linear, but based on log spacing of the data, I calculated it and it’s the same as what Sheets states for when I use log regression instead of linear regression; I wish is knew this earlier as then I wouldn’t have needed to calculate it, I could have just had Sheets generate it via chart.

bobbooo · Jan 19, 2020

daverosenthal said:
I also updated the tool to compute the precise -6db point accounting for the target curve and solving the linear interpolation to avoid the "first less or first greater" issue

Although that's great, and could be posted in each speaker/subwoofer review to give the reader an idea of its bass extension, I don't think it should be used in the preference rating formula. It may be the more ‘precise’ value, but that’s irrelevant if that’s not the value and method Olive used in his AES paper. He says ‘the first frequency x_SP’ for a reason, rather that just 'the frequency', or using any other qualifier before frequency, like 'exact'. We have to match Olive’s calculation method as close as possible, as the formula correlating these variables to actual preference scores was optimised using this method – we have no idea how changing the method would change the correlation. So I believe we should stick with the 'closest Hz less than' value to be sure we'll get the same correlations.

bobbooo · Jan 19, 2020

MZKM said:
Or, relative to avg SP dB from 300Hz-1kHz; Sean stating this could be used for rear-porter speakers.

As I said previously, this is not correct – you’ve misunderstood what Sean Olive meant in his description of LFX. Here it is again from the full AES paper (scroll down in that link for the paper):

LFX is the log10 of the first frequency x_SP below 300 Hz in the sound power curve, that is -6 dB relative to the mean level y_LW measured in listening window (LW) between 300 Hz-10 kHz.

The next two sentences in the paragraph are just explanations for the choices made above:

LFX is log-transformed to produce a linear relationship between the variable LFX and preference rating. The sound power curve (SP) is used for the calculation because it better defines the true bass output of the loudspeaker, particularly speakers that have rear-firing ports.

He is explaining why the -6 dB frequency is taken from the sound power curve and not from any of the other curves i.e. why he is using x_SP and not x_LW or x_ON – because it better defines bass output of all speakers, but particularly (not only) speakers with rear ports (presumably because bass and sound power are not directional measures, unlike e.g. the on-axis and listening window curves). The listening window however better defines the average level in the range 300 Hz-10 kHz (where we can localize sound), so this is used for the reference level ybar_LW for all speakers.

I think your misunderstanding comes from the language used in the patent application (which is generally less clear and easy to follow), in which he says ‘may be used’ instead of 'is used' in bold that I've highlighted in the above quote. This is merely because the patent application is describing techniques that may be used to calculate predicted speaker preference ratings, so he uses 'may be' throughout the paper (Adobe pdf reader says 47 times!

) for all techniques he actually used when creating the model. Here's just one example from the end of the paragraph marked [0046] in the patent:

Spatial averaging may be used for all curves (except the on-axis (ON) response curve) to remove interference and diffraction effects from the measurements.

Contrast this with the same sentence from the end of the second paragraph of Section 3.2 of the AES paper:

Spatial averaging is used for all 7 curves (except the on-axis curve) to remove interference and diffraction effects from the measurements.

I suspect this change is just some legal requirement to use very precise language when applying for a patent on a method that may be used for a particular purpose, nothing more.

So, it is definitely the mean level of the listening window (and not the sound power curve) between 300 Hz and 10 kHz that should be used for the reference level of the LFX calculation, for all speakers, including, (but not only) rear-ported speakers.

amirm · Jan 19, 2020

bobbooo said:
@amirm is there any way of exporting your measurement data at a higher frequency resolution, with 1/20-octave smoothing instead of the current 1/10-octave data? As the preference ratings will not be very accurate as things stand.

I think the JBL 305P was with 20 points/octave. Unfortunately the default is 10 points and the rest of the measurements are gathered that way. Changing it requires remeasuring the speaker again.

I don't think there is much difference between 1/10 and 1/20 though. 1/3 octave is too low and wider than ERB of our hearing. But 1/10 should be fine. 1/20th will easily overstated our hearing resolution above a few hundred hertz.

bobbooo · Jan 19, 2020

Looking over Olive’s AES paper again, I’ve noticed a major problem with these calculations we have. The data we’re using from the measurements has 1/10-octave smoothing i.e. each frequency data point is a factor of 2^(1/10) times the previous one. Olive’s formula however is based specifically on 1/20-octave smoothed data. In Section 6 of his AES paper he explains that models based on 1/20-octave smoothed data show a markedly higher correlation between predicted and reported preference rating than those based on 1/3-octave smoothing (correlation coefficient r = 0.94 for a 1/20-octave model based on predicted in-room response of 13 speakers vs r = 0.75 for a 1/3-octave model based on in-room measurements).

Although not as bad as 1/3-octave would be, this suggests the correlation between our predicted preference ratings, based on 1/10-octave smoothed data, and actual preference will be significantly diminished from the r = 0.86 correlation it should be from Olive’s calculations.

Is there any way of exporting the measurement data at a higher frequency resolution, with 1/20-octave smoothing instead of the current 1/10-octave data? As the preference ratings will not be very accurate as things stand.

bobbooo · Jan 19, 2020

amirm said:
I think the JBL 305P was with 20 points/octave. Unfortunately the default is 10 points and the rest of the measurements are gathered that way. Changing it requires remeasuring the speaker again.

I don't think there is much difference between 1/10 and 1/20 though. 1/3 octave is too low and wider than ERB of our hearing. But 1/10 should be fine. 1/20th will easily overstated our hearing resolution above a few hundred hertz.

The problem is Olive's algorithm was optimised specifically for 1/20-octave smoothed data. We have no idea how lowering the resolution would affect the correlation between the predicted preference ratings from his formula and actual preference (apart from lowering it significantly, but not by how much or in what ways). It's not merely a matter of hearing resolution.

Here on page 12 Klippel say the raw data is taken and then processed to the desired frequency resolution. Do you not still have the raw data for the speakers you've measured so far, and so could you somehow use the Klippel software to reprocess it at a higher resolution?

daverosenthal · Jan 19, 2020

bobbooo said:
Looking over Olive’s AES paper again, I’ve noticed a major problem with these calculations we have. The data we’re using from the measurements has 1/10-octave smoothing i.e. each frequency data point is a factor of 2^(1/10) times the previous one. Olive’s formula however is based specifically on 1/20-octave smoothed data
...
the preference ratings will not be very accurate as things stand.

Hence my earlier statement that "From what I've seen in quick glance I'm skeptical that the coefficients he uses will work with the data we have (differences in number of measurements per octave, etc.)".

But honestly, after reading the paper fairly carefully, getting hung up is probably kind of silly here. The model itself is based on a fit to a relatively small set of observations to start with and the fit is not particularly close. Heck, the patent even proposes two models that are totally different. It seems to me the key thing here is that we preserve the basic directional correctness of the metrics in a way that lets us compare rather than exactly match some "golden" model down to the third significant figure.

Blumlein 88 · Jan 19, 2020

amirm said:
I think the JBL 305P was with 20 points/octave. Unfortunately the default is 10 points and the rest of the measurements are gathered that way. Changing it requires remeasuring the speaker again.

I don't think there is much difference between 1/10 and 1/20 though. 1/3 octave is too low and wider than ERB of our hearing. But 1/10 should be fine. 1/20th will easily overstated our hearing resolution above a few hundred hertz.

I think considering what we are (what you are) doing, yes, we need that 1/20th octave data. I don't remember the ins and outs of how they decided this, but seem to remember 1/10th and 1/12th was good enough which is why they went with 1/20th octave. Certainly in the future you'll want to go with 1/20th octave.

daverosenthal · Jan 19, 2020

amirm said:
I don't think there is much difference between 1/10 and 1/20 though. 1/3 octave is too low and wider than ERB of our hearing. But 1/10 should be fine. 1/20th will easily overstated our hearing resolution above a few hundred hertz.

Agree. The narrow band deviation (NBD) metric which correlates with quality relies on analyzing the variations within 1/2 octave bands. You can see why something like 1/3 octave smoothing messes with that measure. But moving from 1/20th to 1/10th seems like no big deal.

bobbooo said:
...was optimized specifically for 1/20-octave smoothed data. We have no idea how lowering the resolution would affect...

Or is it?

Actually, I think we can empirically answer that question. I re-ran the NBD calculations for both speakers (NHT and Revel) with both native 1/10th and 1/5th octave smoothing. The NBD metric moved from 0.41 to 0.41 for the NHT and 0.32 to 0.34 for the Revel. The effect of moving from 1/20th to 1/10th should be much smaller. This appears to be in the noise compared to all of the other factors at play here.

amirm · Jan 19, 2020

bobbooo said:
The problem is Olive's algorithm was optimised specifically for 1/20-octave smoothed data. We have no idea how lowering the resolution would affect the correlation between the predicted preference ratings from his formula and actual preference (apart from lowering it significantly, but not by how much or in what ways). It's not merely a matter of hearing resolution.

Oh it better not have that kind of dependency because if it does, by definition it is not perceptually correct. And couldn't correlated so well with listening tests.

Here on page 12 Klippel say the raw data is taken and then processed to the desired frequency resolution. Do you not still have the raw data for the speakers you've measured so far, and so could you somehow use the Klippel software to reprocess it at a higher resolution?

The setting is in two places one of which wanted to delete the data but the other not. I am recomputing the Yamaha just now. Looks like it is working so far.

bobbooo · Jan 19, 2020

amirm said:
Oh it better not have that kind of dependency because if it does, by definition it is not perceptually correct. And couldn't correlated so well with listening tests.

I don't see why not - the algorithm was calculated based on 1/20-octave data by using principal component and multiple regression analysis, in order to find the variables and coefficients that optimally correlate measurements with reported preference ratings, out of all possible combinations of 23 variables, while minimising collinearity between them.

Running this complex analysis on the same data with 1/10-octave smoothing may well result in a very different combination of optimum variables and coefficients, meaning the current ones we're using could be unpredictability inaccurate. We have no way of knowing without re-doing the principal component and regression analysis, which we can't do because we don't have Sean Olive's data.

amirm said:
The setting is in two places one of which wanted to delete the data but the other not. I am recomputing the Yamaha just now. Looks like it is working so far.

Great! I hope you can do the same for the other speakers measured

Krunok · Jan 19, 2020

amirm said:
Oh it better not have that kind of dependency because if it does, by definition it is not perceptually correct. And couldn't correlated so well with listening tests.

I fully agree with this. Something is very wrong if preference rating would differ with 1/10 vs 1/20 smoothing.

Krunok · Jan 19, 2020

bobbooo said:
I don't see why not - the algorithm was calculated based on 1/20-octave data by using principal component and multiple regression analysis, in order to find the variables and coefficients that optimally correlate measurements with reported preference ratings, out of all possible combinations of 23 variables, while minimising collinearity between them.

Running this complex analysis on the same data with 1/10-octave smoothing may well result in a very different combination of optimum variables and coefficients, meaning the current ones we're using could be unpredictability inaccurate. We have no way of knowing without re-doing the principal component and regression analysis, which we can't do because we don't have Sean Olive's data.

Because from the perspective of linear regression 1/10 smoothed graph should get the same result as 1/20 smoothed graph.

bobbooo · Jan 19, 2020

Krunok said:
Because from the perspective of linear regression 1/10 smoothed graph should get the same result as 1/20 smoothed graph.

I'm not talking about the linear regression lines applied to the measurement curves when calculating the individual variables here, I'm talking about the multiple regression analysis applied to all possible configurations of the independent variables of the model in order to find an algorithm with optimum correlation between measurement and preference. It's all in Olive's AES paper (scroll down for the correct paper).

Speaker Equivalent SINAD Discussion

Major Contributor

Major Contributor

Major Contributor

Member

Major Contributor

Member

Major Contributor

Major Contributor

Major Contributor

Founder/Admin

Major Contributor

Major Contributor

Member

Grand Contributor

Member

Founder/Admin

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Similar threads