
Master Preference Ratings for Loudspeakers

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
As for listening distance, none of the papers mention it. The only hint is that the Test 1 paper indicates the room used is this one. Still no mention of listening distance, but that last paper does mention the room dimensions and there are a few photographs of the setup, so I can take a guess that the listening distance is between 5 and 10 meters. I think it's not actually a constant because of the loudspeaker shuffler system, which might be why they don't specify the distance in the papers.

If all Harman tests are performed at a distance between 5 and 10 meters then the dispersion characteristics do have a major impact because one will be listening to the room instead of the speakers and wide dispersion may in fact become important, particularly when you have several listeners evaluating at the same time (which requires a wide sweet-spot).

On the other hand, I am willing to bet that no more than 20% of European audiophiles listen at a distance of 5, let alone 10, meters...
 

bobbooo

Major Contributor
Joined
Aug 30, 2019
Messages
1,479
Likes
2,079
If all Harman tests are performed at a distance between 5 and 10 meters then the dispersion characteristics do have a major impact because one will be listening to the room instead of the speakers and wide dispersion may in fact become important, particularly when you have several listeners evaluating at the same time (which requires a wide sweet-spot).

On the other hand, I am willing to bet that no more than 20% of European audiophiles listen at a distance of 5, let alone 10, meters...

Take a look at the previous page of this thread - we've confirmed the listening tests for the study Dr. Olive's preference formula is based on were done at more like 3 meters.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,701
Likes
241,420
Location
Seattle Area
Could you measure the speakers again much louder (say, +20 dB) and see how the measurements hold up? That would give a good idea of how well composed they stay as you crank them up. Would that risk damaging some of the weaker speakers?
With nearly 1000 measurements, no way you want to run the standard test at elevated levels. There will be nothing left of the speaker, or the ears of the people in residence here. :) If you want a single sweep in-room I can do that but I just have not had time to focus much on distortion measurements. It would be something I will focus on later.

For now, my listening tests are focused very much on power handling.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,701
Likes
241,420
Location
Seattle Area
If all Harman tests are performed at a distance between 5 and 10 meters then the dispersion characteristics do have a major impact because one will be listening to the room instead of the speakers and wide dispersion may in fact become important, particularly when you have several listeners evaluating at the same time (which requires a wide sweet-spot).
Having taken the test, that did not enter the equation. You are immediately taken aback by the difference in tonality and grade based on that.
 
OP

MZKM

Major Contributor
Forum Donor
Joined
Dec 1, 2018
Messages
4,251
Likes
11,557
Location
Land O’ Lakes, FL
With nearly 1000 measurements, no way you want to run the standard test at elevated levels. There will be nothing left of the speaker, or the ears of the people in residence here. :) If you want a single sweep in-room I can do that but I just have not had time to focus much on distortion measurements. It would be something I will focus on later.

For now, my listening tests are focused very much on power handling.
The Spinorama likely wouldn’t change much. Measuring distortion at 10W (8ohm) or 20W (8ohm) would be more helpful. I doubt any speaker would get damaged from 20W.

If that would be too difficult to work out, maybe pick an SPL (e.g., 95dBC) at your measuring position. This may be better as it’s a level playing field; a 99dB-sensitive speaker doesn’t need 10W to get loud enough for most music.
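To put rough numbers on the fixed-power vs fixed-SPL point, here is a minimal Python sketch. It assumes sensitivity is quoted as dB SPL at 1 W / 1 m and that output scales ideally with power (no compression); it ignores C-weighting and the actual measuring distance. The function names and the 85dB example speaker are hypothetical, chosen only for illustration.

```python
import math

def spl_at_power(sensitivity_db: float, power_w: float) -> float:
    """Ideal SPL at 1 m for a given input power (assumes no power compression)."""
    return sensitivity_db + 10 * math.log10(power_w)

def power_for_spl(sensitivity_db: float, target_spl_db: float) -> float:
    """Input power in watts needed to reach a target SPL at 1 m."""
    return 10 ** ((target_spl_db - sensitivity_db) / 10)

# Fixed power: the same 10 W gives very different levels.
for sens in (85, 99):  # 85 dB is a hypothetical low-sensitivity speaker
    print(f"{sens} dB speaker at 10 W: {spl_at_power(sens, 10):.1f} dB")

# Fixed SPL: the power needed differs instead.
for sens in (85, 99):
    print(f"{sens} dB speaker needs {power_for_spl(sens, 95):.2f} W for 95 dB")
```

Under these assumptions the high-sensitivity speaker needs well under 1 W for 95 dB, which is why a fixed SPL target compares speakers on more equal terms than a fixed wattage.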
 
Last edited:

QMuse

Major Contributor
Joined
Feb 20, 2020
Messages
3,124
Likes
2,785
The Spinorama likely wouldn’t change much. Measuring distortion at 10W (8ohm) or 20W (8ohm) would be more helpful. I doubt any speaker would get damaged from 20W.

Testing at fixed SPL(s) produced at a defined distance would be a much better idea than testing at fixed power. ;)
 
Last edited:

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,337
Likes
6,709
With nearly 1000 measurements, no way you want to run the standard test at elevated levels. There will be nothing left of the speaker, or the ears of the people in residence here. :) If you want a single sweep in-room I can do that but I just have not had time to focus much on distortion measurements. It would be something I will focus on later.

For now, my listening tests are focused very much on power handling.

I figured as much. I often listen pretty loud (90-110 dB), so I'd be curious how speakers measure at those levels.
 

Sancus

Major Contributor
Forum Donor
Joined
Nov 30, 2018
Messages
2,926
Likes
7,643
Location
Canada
Dunno if anyone else finds this useful, but I used a thing called ScanIt to scan the (apparently) 75 points on the Measured : Predicted Preference ratings graph in Olive's study. I had to manually pick some of the really clustered points. I've attached the data as an Excel sheet.

image of scanned points:
1583444978678.png
 

Attachments

  • olive study pg11 pref data points.zip
    9.1 KB · Views: 182
Last edited:

bobbooo

Major Contributor
Joined
Aug 30, 2019
Messages
1,479
Likes
2,079
Would it help to grade speakers by letter grade? You could still also include the number, but maybe a letter grade would stop people from obsessing over 6.3 vs 6.6 if both have a B rating.


100-90: SS
89-80: S
79-70: A
69-60: B
59-50: C
49-40: D
39-0: F

or maybe go by 8?

They're already graded by colored tiers, in keeping with the SINAD charts, so in addition to the scores, I don't see any need to have a third grading by letters. Here's how @MZKM defines the tiers:

8-10: Blue
6-8: Green
4-6: Yellow
2-4: Orange
0-2: Red
<0: Black

EDIT: @MZKM maybe it would be a good idea to write the score in the color of its corresponding tier though when you post them after each review?
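For anyone who wants to apply the tiers programmatically, a minimal Python sketch follows. The tier list does not say which tier a score exactly on a boundary (e.g. 8.0) belongs to, so this assumes it goes to the higher tier; the function name `color_tier` is made up for illustration.

```python
def color_tier(score: float) -> str:
    """Map a preference score to its color tier.

    Assumption: a score exactly on a boundary goes to the higher tier.
    """
    if score < 0:
        return "Black"
    # (lower bound, tier name), checked from the highest tier down
    tiers = [(8, "Blue"), (6, "Green"), (4, "Yellow"), (2, "Orange"), (0, "Red")]
    for lower, name in tiers:
        if score >= lower:
            return name
    return "Black"  # not reached for numeric input; kept as a safe default

for s in (8.1, 7.9, 6.4, 3.0, -0.5):
    print(s, color_tier(s))
```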
 
Last edited:

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,337
Likes
6,709
They're already graded by colored tiers, in keeping with the SINAD charts, so in addition to the scores, I don't see any need to have a third grading by letters. Here's how @MZKM defines the tiers:

8-10: Blue
6-8: Green
4-6: Yellow
2-4: Orange
0-2: Red
<0: Black

EDIT: @MZKM maybe it would be a good idea to write the score in the color of its corresponding tier though when you post them after each review?

Maybe I don't understand the 0.8 error rating, but aren't those colors too few in number? An 8 and 6 can't be grouped together as being more or less equally preferred. The purpose of the letter grade idea was to group the speakers in such a way that one could say any speaker within that group would more or less be "equally" preferred in a blind test. You could do the same thing with colors if you like that better, but wouldn't you need more colors?
 
OP

MZKM

Major Contributor
Forum Donor
Joined
Dec 1, 2018
Messages
4,251
Likes
11,557
Location
Land O’ Lakes, FL
Maybe I don't understand the 0.8 error rating, but aren't those colors too few in number? An 8 and 6 can't be grouped together as being more or less equally preferred. The purpose of the letter grade idea was to group the speakers in such a way that one could say any speaker within that group would more or less be "equally" preferred in a blind test. You could do the same thing with colors if you like that better, but wouldn't you need more colors?
It’s +/-0.8, meaning max 1.6 deviation, so an 8.0 could be equivalent to a 6.4.
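A quick way to see the arithmetic: 8.0 - 0.8 = 7.2 and 6.4 + 0.8 = 7.2, so the two error bands just touch. A minimal Python sketch of that check, treating +/-0.8 as a hard band around each score (the helper names are hypothetical, and the tiny tolerance is only there to absorb floating-point rounding):

```python
MARGIN = 0.8  # the +/-0.8 error quoted for the preference score

def score_band(score: float, margin: float = MARGIN) -> tuple[float, float]:
    """Lower and upper end of the error band around a score."""
    return (round(score - margin, 1), round(score + margin, 1))

def could_be_equivalent(a: float, b: float, margin: float = MARGIN) -> bool:
    """True if the +/-margin bands around the two scores overlap or touch."""
    return abs(a - b) <= 2 * margin + 1e-9

print(score_band(8.0), score_band(6.4))
print(could_be_equivalent(8.0, 6.4))
print(could_be_equivalent(8.0, 6.3))
```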
 

bobbooo

Major Contributor
Joined
Aug 30, 2019
Messages
1,479
Likes
2,079
Maybe I don't understand the 0.8 error rating, but aren't those colors too few in number? An 8 and 6 can't be grouped together as being more or less equally preferred. The purpose of the letter grade idea was to group the speakers in such a way that one could say any speaker within that group would more or less be "equally" preferred in a blind test.

And what about two speakers just either side of the margins of two adjacent tiers? For example, a speaker with a score of 7.9 would be more or less equally preferred to one scoring 8.1, yet they would be in two separate tiers in both the current color-tier gradings and your letter tiers. It's impossible to have true 'equally preferred' tiers with static gradings.

Anyway, I think the color-tier spacing of 2 preference-rating points is good as it is: the tiers are currently 2.5 standard deviations (σ) apart (0.8 * 2.5 = 2), which means ~99% of speakers scored in a given range would receive listener ratings within that same range. That is just shy of the 3σ, 99.7% level often considered near-certainty in a lot of the sciences.
 
Last edited:

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,337
Likes
6,709
And what about two speakers just either side of the margins of two adjacent tiers? For example, a speaker with a score of 7.9 would be more or less equally preferred to one scoring 8.1, yet they would be in two separate tiers in both the current color-tier gradings and your letter tiers. It's impossible to have true 'equally preferred' tiers with static gradings.

Anyway, I think the color-tier spacings of 2 preference ratings are good as they are, as they are currently 2.5 sigmas (standard deviations) apart (0.8 * 2.5 = 2), which means ~99% of listeners would rate them in that range, just shy of the 3-sigma, 99.7% level often considered near-certainty in a lot of the sciences.

I considered the margins, but couldn't think of a better way to deal with that. You still have the same problem with the current scheme. Makes more sense though with the 1.6.
 

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,337
Likes
6,709
It’s +/-0.8, meaning max 1.6 deviation, so an 8.0 could be equivalent to a 6.4.

What about listing the score with the +-0.8? So instead of a 6.4 rating it would have a (5.6-7.2) rating. I'm just throwing out ideas to try and lessen the decimal point comparisons currently going on. Personally, I'm 100% content with the way it is right now.
 
OP

MZKM

Major Contributor
Forum Donor
Joined
Dec 1, 2018
Messages
4,251
Likes
11,557
Location
Land O’ Lakes, FL
What about listing the score with the +-0.8? So instead of a 6.4 rating it would have a (5.6-7.2) rating. I'm just throwing out ideas to try and lessen the decimal point comparisons currently going on. Personally, I'm 100% content with the way it is right now.
Too much info reduces ease of understanding for the common reader (the same reason I state “w/ subwoofer” rather than “w/ ideal subwoofer”: too many people would continually ask what “ideal” means). The notes about +/-0.8, as well as other info, are under the Notes section of my preference score master list.
 

bobbooo

Major Contributor
Joined
Aug 30, 2019
Messages
1,479
Likes
2,079
I considered the margins, but couldn't think of a better way to deal with that. You still have the same problem with the current scheme. Makes more sense though with the 1.6.

The only way to properly estimate whether two particular speakers would be 'equally' preferred would be to see how many standard deviations (represented by the Greek letter σ, sigma) their scores are from each other. But you would still need to specify a confidence interval, so if the score difference is:

≤ 0.8 (1σ) => 68% confidence higher score preferred
≤ 1.6 (2σ) => 95% confidence higher score preferred
≤ 2 (2.5σ) => 98.8% confidence higher score preferred
≤ 2.4 (3σ) => 99.7% confidence higher score preferred
≤ 4 (5σ) => 99.9999% confidence higher score preferred
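The percentages in this list can be reproduced with a minimal Python sketch. It takes the same assumption the list does, namely that 0.8 is one standard deviation of a normally distributed error, and computes the share of a normal distribution lying within +/- k standard deviations. The names `SIGMA` and `coverage` are just for illustration.

```python
import math

SIGMA = 0.8  # assumption: the +/-0.8 error is treated as one standard deviation

def coverage(k: float) -> float:
    """Fraction of a normal distribution lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))

for diff in (0.8, 1.6, 2.0, 2.4, 4.0):
    k = diff / SIGMA
    print(f"score difference {diff:.1f} ({k:.1f} sigma): {100 * coverage(k):.4f}%")
```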

5σ confidence qualifies as an official scientific discovery in particle physics by the way :D, and according to Wikipedia's numbers, a score difference of 5.2 (6.5σ) would mean no-one alive today would prefer the lower scoring speaker. In fact, the difference between the Revel M16 and the Realistics MC100 is greater than 7σ, which means not a single person in the whole of human history would have preferred the Realistics :oops:
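The "1 in N people" figures can be estimated the same way. This is a rough Python sketch under the same normal-distribution assumption, using the two-sided tail (the share falling outside +/- k sigma); `one_in` is a made-up helper name. It uses `math.erfc` rather than `1 - math.erf(...)` because the latter loses precision when the tail is this small.

```python
import math

def one_in(k: float) -> float:
    """Return N such that about 1 in N normal samples falls outside +/- k sigma."""
    return 1.0 / math.erfc(k / math.sqrt(2))

for k in (5, 6.5, 7):
    print(f"{k} sigma: roughly 1 in {one_in(k):,.0f}")
```

Comparing the 6.5 sigma and 7 sigma results against population figures is left to the reader, since the conclusion depends on which population estimate is used.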

Ok I'll stop now.
 
Last edited:
