# Steve Guttenberg compares subjective and objective reviews

#### dragonspit4

##### Member
Hi Dragon,

@Blumlein 88 and @amirm may have been a little careless with some word choices, but the thrust of what they are saying is that "relative" choices are generally preserved between groups. "Relative" means the rank is preserved, but not the absolute value or absolute difference. The P - I distinction deserves special consideration.

One can see from the graph the significant result:
P,I >* B >** M
* for 15/16 individual groups, and the combined group (p<0.0001)
** for all individual groups, and the combined group (p<0.0001)

Even though individual scores and differences between scores vary, the rank is well preserved.

The difference between P and I is less clear. Since the tabular data equivalent to the graph is not in the paper, one has to eyeball it. But it is clear that for 9/16 groups there is not a statistically significant difference. For 4/16 groups there is clearly a statistically significant difference, with 3 preferring P and 1 I. For 3/16 the significance is unclear from the graph (since I don't trust pixel measurements). When all the groups are combined the mean difference between P and I is 0.336 (from the paper), with a significance of p=0.0214.

Some will argue that is significant, but from my experience with this type of data, since the p=0.05 cutoff is arbitrary, other ways of viewing the data are relevant, and I would prefer further tests to draw a conclusion about P vs. I.

Cheers, SAM
Thx for the write up SAM.

I agree with you.
My interpretation of data presented here is as follows:
1. In absolute terms, people generally have different preferences in speakers.
2. The relative scoring of each speaker remains roughly the same no matter which group is selected.
3. All groups rated the speakers in the same order most of the time, but not always. (This means that people, generally speaking, have the same preferences, but not always.)
4. Trained listeners were pickier, so their ratings are generally lower than those of the other groups.
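
To make the absolute-versus-relative point concrete, here is a minimal sketch. The scores are made up for illustration (they are not the values from the paper); only the speaker labels P, I, B, and M come from the discussion. Two hypothetical listener groups give different absolute ratings yet rank the speakers in the same order:

```python
# Hypothetical mean preference ratings (0-10 scale) for two listener groups.
# These numbers are invented for illustration; they are NOT from the paper.
ratings = {
    "trained":   {"P": 6.1, "I": 5.8, "B": 4.0, "M": 2.2},
    "untrained": {"P": 7.4, "I": 7.1, "B": 5.9, "M": 4.5},
}


def rank_order(scores):
    """Return speaker labels sorted from most to least preferred."""
    return sorted(scores, key=scores.get, reverse=True)


# Rank order per group, then check whether every group agrees.
orders = {group: rank_order(scores) for group, scores in ratings.items()}
same_rank = len({tuple(order) for order in orders.values()}) == 1

for group, order in orders.items():
    print(group, " > ".join(order))
print("rank preserved across groups:", same_rank)
```

Here the trained group scores every speaker lower, but the ordering is identical, which is what "relative choices are preserved" means.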

#### dragonspit4

##### Member
I think what the graph shows is that all the groups showed similar preferences.

I don't have a copy of Dr Toole's book and in any case I'm not sure how much is explained by the research in it, but unlike @dragonspit4 my concern is not so much about the relative scores varying (sure, they do, but not very significantly, at least in the case of B vs M vs P/I). My concern is that the tests all took place in the same room.

I'd like to see how different rooms affected preferences, and to what extent the results hold when the room is varied.

Perhaps there is more on this in the book, or in Harman's published research?
That's interesting.
How about no room at all?
If the speakers were tested outside, would the results still be the same?

#### amirm

Staff Member
CFO (Chief Fun Officer)
I don't have a copy of Dr Toole's book and in any case I'm not sure how much is explained by the research in it, but unlike @dragonspit4 my concern is not so much about the relative scores varying (sure, they do, but not very significantly, at least in the case of B vs M vs P/I). My concern is that the tests all took place in the same room.

I'd like to see how different rooms affected preferences, and to what extent the results hold when the room is varied.
Harman has three separate rooms where this type of research has been performed. I showed the smallest one we sat in when we tested. Here is another where they test in-wall speakers and EQ products:

That is Dr. Olive there and the session I mentioned where he tested us as a group. Behind the screen is a triangular section of the wall that rotates with different speakers mounted on it. You can see it in this private Harman presentation:

Then there is a larger room where they test multichannel speaker systems:

All of this research is extensively documented across countless papers from Dr. Toole and Olive over a number of decades. I have quoted many of them in the past and can quote more. It all points to the same consistent story that when tested blind, most of us like the same type of speaker response: ones with smooth and well behaved frequency response. Make variations from this and in controlled tests, listeners don't like the sound.

I have taken the blind test twice and both times voted the same as what their research indicates.

#### dragonspit4

##### Member
Harman has three separate rooms where this type of research has been performed. I showed the smallest one we sat in when we tested. Here is another where they test in-wall speakers and EQ products:

That is Dr. Olive there and the session I mentioned where he tested us as a group. Behind the screen is a triangular section of the wall that rotates with different speakers mounted on it. You can see it in this private Harman presentation:

Then there is a larger room where they test multichannel speaker systems:

All of this research is extensively documented across countless papers from Dr. Toole and Olive over a number of decades. I have quoted many of them in the past and can quote more. It all points to the same consistent story that when tested blind, most of us like the same type of speaker response: ones with smooth and well behaved frequency response. Make variations from this and in controlled tests, listeners don't like the sound.

I have taken the blind test twice and both times voted the same as what their research indicates.
Nice pictures!

Can we measure the frequency response of different speakers (from the same brand or different brands) and determine which one sounds best? That is, can we determine the best speaker by frequency response and make an objective claim that one speaker sounds superior to the others?

#### amirm

Staff Member
CFO (Chief Fun Officer)
Can we measure the frequency response of different speakers (from the same brand or different brands) and determine which one sounds best? That is, can we determine the best speaker by frequency response and make an objective claim that one speaker sounds superior to the others?
You can, but you need an anechoic chamber with measurements every few degrees in both the horizontal and vertical directions. You then apply a special weighting based on which sounds come at you directly versus reflected from horizontal and vertical directions (indirect sounds). Once there, you get extremely high correlation to listening tests. It is not perfect and listening tests need to confirm it, but it is a very good predictor.

Here is a picture of the setup I took while visiting Harman in one of their anechoic chambers:

That is the vertical arc with a microphone array placed at precise angles. The speaker is put on a turntable that is spun, and the output of the microphones is captured. Once done, you have a sphere of response all around the speaker, from which you can begin to create the composite score of all the direct and indirect sounds per above.

#### RayDunzl

##### Major Contributor
Central Scrutinizer
Is the floor missing something in that photo?

#### amirm

Staff Member
CFO (Chief Fun Officer)
To paint the larger picture, what you hear from a speaker is the sum total of direct sound coming at you from the speaker, and reflections from the room surfaces. Speaker response can vary from direct to reflected so any proper model of the speaker needs to include those. Using perceptual and practical modeling of rooms, Harman has created the weighting factor for those reflections by correlating the objective measurements against subjective listening tests.
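
As a rough sketch of the weighting idea described above: take a direct (on-axis) curve and curves for the reflected sound, and combine them with weights. Everything numeric here is a placeholder chosen for illustration, including the weights, which are not Harman's actual values. Averaging dB values directly is also a simplification made to keep the example short.

```python
# Sketch: combine direct and reflected-sound responses into one estimate.
# All numbers are hypothetical; the weights are NOT Harman's actual weights.
freqs_hz = [100, 1000, 10000]

# Response in dB at each frequency for each (hypothetical) component.
direct_db = [0.0, 0.0, 0.0]               # on-axis, flat
early_reflections_db = [0.0, -1.0, -4.0]  # off-axis energy arriving early
late_reflections_db = [0.0, -2.0, -8.0]   # energy radiated in all directions

# Assumed weights for illustration only; they sum to 1.
weights = {"direct": 0.2, "early": 0.4, "late": 0.4}


def weighted_response(direct, early, late, w):
    """Weighted average of the three curves, frequency by frequency."""
    return [
        w["direct"] * d + w["early"] * e + w["late"] * l
        for d, e, l in zip(direct, early, late)
    ]


estimate_db = weighted_response(
    direct_db, early_reflections_db, late_reflections_db, weights
)
for f, level in zip(freqs_hz, estimate_db):
    print(f"{f:>6} Hz: {level:+.1f} dB")
```

The point of the sketch is only that a speaker with a flat on-axis response can still produce a tilted result at the listening position if its off-axis output falls away.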

#### amirm

Staff Member
CFO (Chief Fun Officer)
Is the floor missing something in that photo?
Anechoic chambers don't have floors. There is usually a mesh instead. That room was small and I could not get far enough back to show the floor.

#### dragonspit4

##### Member
You can, but you need an anechoic chamber with measurements every few degrees in both the horizontal and vertical directions. You then apply a special weighting based on which sounds come at you directly versus reflected from horizontal and vertical directions (indirect sounds). Once there, you get extremely high correlation to listening tests. It is not perfect and listening tests need to confirm it, but it is a very good predictor.

Here is a picture of the setup I took while visiting Harman in one of their anechoic chambers:

That is the vertical arc with a microphone array placed at precise angles. The speaker is put on a turntable that is spun, and the output of the microphones is captured. Once done, you have a sphere of response all around the speaker, from which you can begin to create the composite score of all the direct and indirect sounds per above.
This is pretty interesting.
How about headphones? Can the same thing be done?
Can we measure the frequency response of different headphones (from the same brand or different brands) and determine which one sounds best? That is, can we determine the best headphone by frequency response and make an objective claim that one headphone sounds superior to the others?
I am assuming that measuring a headphone's frequency response would be easier, since there is no reverberation (echo) of sound coming from the walls?

#### Cosmik

##### Major Contributor
...what you hear from a speaker is the sum total of direct sound coming at you from the speaker, and reflections from the room surfaces.
What reaches your ear is what you describe above. What you hear is dependent on functions of the ears and brain, and possibly what you consciously choose to focus on.

#### flipflop

##### Senior Member
Can we measure the frequency response of different headphones (from the same brand or different brands) and determine which one sounds best?
Yes, to a large degree. From 'A Statistical Model That Predicts Listeners’ Preference Ratings of In-Ear Headphones: Part 2 – Development and Validation of the Model' (Sean E. Olive, Todd Welti, and Omid Khonsaripour):
The correlation between the predicted and measured preference ratings was r = 0.91 with a residual error of about 5.5% or 5.5 points on a 100-point preference scale.
I am assuming that measuring a headphone's frequency response would be easier?
It poses different obstacles due to leakage effects and driver position variability.
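
For anyone unfamiliar with the r value quoted above, here is a minimal sketch of how a Pearson correlation between predicted and measured ratings is computed. The ratings below are invented for illustration and are not data from the paper.

```python
from math import sqrt

# Hypothetical predicted vs. measured ratings on a 100-point scale.
# Invented numbers, NOT taken from the Olive, Welti, Khonsaripour paper.
predicted = [30.0, 45.0, 55.0, 70.0, 85.0]
measured = [35.0, 40.0, 60.0, 65.0, 90.0]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(predicted, measured)
print(f"r = {r:.3f}")
```

A value near 1 means the model ranks and spaces the headphones much like listeners did; it says nothing by itself about how large the individual prediction errors are, which is why the paper also reports the residual error.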

#### vert

##### Member
According to Paradigm's website, their research with the Canadian institute had similar results: listeners expressed some definite preferences in speaker profiles.

#### amirm

Staff Member
CFO (Chief Fun Officer)
According to Paradigm's website, their research with the Canadian institute had similar results: listeners expressed some definite preferences in speaker profiles.
Paradigm is a spin-off from the NRC research team (which Dr. Toole also came from). PSB is another such company.

#### Arnandsway

##### Active Member
Paradigm is a spin-off from the NRC research team (which Dr. Toole also came from). PSB is another such company.
A little bit off-topic, but I just looked up PSB and saw that they just released a new Alpha series of budget-friendly speakers. They look nice.

#### HuskerDu

##### Member
Patreon Donor
...All of this research is extensively documented ... It all points to ... smooth ... [speaker] frequency response.
Finally found this!

So I'll look around to see if I can find a smoothness ranking, but if anybody already has it, I would of course welcome a link.

I did find the Zu measurements using the search suggested early in this thread. Ouch, I think. I'ma look for my receipt next, to check whether 60 days are up already.

Glad to have the speaker performance target nailed down: flatness per dollar, assuming the (presumably axiomatic) engineering for getting the signal from the amp to the speaker (ohms I gather...) is competently addressed.

#### HuskerDu

##### Member
Patreon Donor
...see if I can find a smoothness ranking, but if anybody already has it, I would of course welcome a link.
Found Atkinson. Can't find a set of charts that are like the ones in the Zu speaker article. Smooth frequency response is much too vague for Google, as I've deployed it so far.

Found one post that points out a (now) obvious tidbit: put the speaker on the grass outside and you've got a low-tech anechoic test bench. (I'ma think about this. Flat freq resp can't be that hard to test, in pragmatic terms. Sound waves travel pretty slowly... Still feels like I'm catching up with the "obvious" though.)
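
To put rough numbers on the outdoor idea: with the speaker and microphone raised above the ground, the ground reflection arrives slightly later than the direct sound, and the gap follows from simple geometry (mirror the speaker below the ground plane and measure the straight line to the mic). The heights and distance below are arbitrary assumptions, and 343 m/s is the approximate speed of sound in air at around 20 °C.

```python
from math import hypot

SPEED_OF_SOUND_M_S = 343.0  # approximate, air at about 20 degrees C


def reflection_delay_ms(speaker_height_m, mic_height_m, distance_m):
    """Delay of the ground reflection relative to the direct sound, in ms."""
    direct = hypot(distance_m, speaker_height_m - mic_height_m)
    # Image-source trick: the reflected path equals the straight line from
    # the speaker's mirror image below the ground to the microphone.
    reflected = hypot(distance_m, speaker_height_m + mic_height_m)
    return (reflected - direct) / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical setup: speaker and mic both 1 m up, 2 m apart.
print(f"{reflection_delay_ms(1.0, 1.0, 2.0):.2f} ms")
# Hypothetical setup: microphone at ground level, so both paths are equal.
print(f"{reflection_delay_ms(1.0, 0.0, 2.0):.2f} ms")
```

In this idealized geometry the delay is only a couple of milliseconds with the mic raised, and it shrinks to zero as the mic approaches the ground. Real grass is neither a perfect reflector nor a perfect absorber, so treat this as geometry only.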

#### Blumlein 88

##### Major Contributor
I think what you are looking for is a Directivity Index or DI from the spin-o-rama data.

Here is a good slideshow presentation with some explanation.

Here is a 7 page article by Floyd Toole explaining DI and spinorama results.
https://www.edn.com/design/audio-de...ring-the-essential-properties-of-loudspeakers

It is explained here also.
https://www.audioholics.com/loudspeaker-design/understanding-loudspeaker-measurements

Sorry for the scattershot approach; some explanations are easier for a given person than others.

Here is another explanation in terms of the ANSI standard.
https://speakerdata2034.blogspot.com/2019/02/spinorama-cea-2034-2015-ansi-data-format.html
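
As a rough sketch of what the DI mentioned above expresses: it is a difference in dB between a direct-sound curve (the listening window in the spinorama format) and the sound power curve, so a larger value means the speaker is more directional at that frequency. The curves below are made-up numbers and the function name is just for illustration; the links above give the exact definitions.

```python
# Hypothetical spinorama curves in dB SPL at a few frequencies.
# Invented values for illustration; not measurements of any real speaker.
freqs_hz = [200, 1000, 5000, 10000]
listening_window_db = [85.0, 85.5, 85.0, 84.0]
sound_power_db = [84.0, 82.0, 78.5, 76.0]


def directivity_index(direct_db, power_db):
    """Per-frequency difference (dB) between direct sound and sound power."""
    return [d - p for d, p in zip(direct_db, power_db)]


di_db = directivity_index(listening_window_db, sound_power_db)
for f, di in zip(freqs_hz, di_db):
    print(f"{f:>6} Hz: DI = {di:.1f} dB")
```

In this made-up example the DI rises steadily with frequency, meaning the speaker gradually narrows its radiation pattern as frequency goes up.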

#### Daverz

##### Active Member
Found Atkinson. Can't find a set of charts that are like the ones in the Zu speaker article. Smooth frequency response is much too vague for Google, as I've deployed it so far.
Also check out the measurements for the Devores they all love at Stereophile.

Found one post that points out a (now) obvious tidbit: put the speaker on the grass outside and you've got a low-tech anechoic test bench. (I'ma think about this. Flat freq resp can't be that hard to test, in pragmatic terms. Sound waves travel pretty slowly... Still feels like I'm catching up with the "obvious" though.)
Should the mic go on the grass, too? I live on an abandoned golf course, and have thought of schlepping speakers out there for measurement.