
Crinacle: Why (most) Ratings Suck: An Analysis

raistlin65

Major Contributor
Forum Donor
Joined
Nov 13, 2019
Messages
2,278
Likes
3,382
Location
Grand Rapids, MI
This is an article purely on the execution of rating and ranking systems of various websites, though of course also critiquing the overly-positive vibe of many publications that may be leading our hobby to its downfall if not corrected.
....

TL;DR
You gotta read between the lines.

WhatHiFi: Statistically, only 5-star awards are worthy of consideration.

SoundGuys: A score of 7.5/10 is statistically average, and only those rated 8/10 are truly significantly ahead of the pack (top 25%).

Headfonics: Virtually nothing is rated under 6/10, and the average is set at roughly 8.4/10. The worst of the bunch in terms of rating skew and bias.

Headphonesty: The best of the bunch, but still terrible. 4/5 stars is the average rating, so really only those rated at 4.5 stars and above are worthy of consideration.

MajorHiFi: Distribution puts a “B” grade at the median, with the highest grade being “A”. Statistically speaking, only A-rated models are worthy of consideration.
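
The common thread in the summaries above is that a score only means something relative to the distribution of all scores a site hands out. A minimal Python sketch of that check follows. The function name `summarize` and the sample scores are hypothetical, made up purely for illustration, and are not data from any of the sites listed.

```python
from statistics import median, quantiles


def summarize(scores):
    """Return the median and the top-quartile cutoff of a list of scores."""
    # quantiles(..., n=4) returns the three quartile cut points.
    _, _, q3 = quantiles(scores, n=4, method="inclusive")
    return {"median": median(scores), "top_quartile_cutoff": q3}


# Hypothetical scores on a 10-point scale; NOT taken from any real site.
hypothetical_scores = [6.5, 7.0, 7.0, 7.5, 7.5, 7.5, 8.0, 8.0, 8.5, 9.0]
summary = summarize(hypothetical_scores)
print(summary)
```

With a skewed scale like this made-up one, the median sits well above the nominal midpoint of 5/10, so only scores at or above the top-quartile cutoff separate a product from the pack.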

https://crinacle.com/2021/07/06/why-most-ratings-suck-an-analysis/
 

Berwhale

Major Contributor
Forum Donor
Joined
Aug 29, 2019
Messages
2,443
Likes
2,896
Location
UK
I notice that he didn't calculate the distribution of his own ranking system :)
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
14,657
Likes
23,746
https://crinacle.com/2021/06/20/the-update-where-the-headphone-rankings-get-shuffled/

He looks at the stats of his own ratings here.

1625522641412.png


1625522654333.png
 

Berwhale

Major Contributor
Forum Donor
Joined
Aug 29, 2019
Messages
2,443
Likes
2,896
Location
UK

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
14,657
Likes
23,746
Re-ranking past headphones to keep a normal distribution seems odd, but I do appreciate that Crinacle is trying to maintain the overall value of the list to consumers.
Well, I don't know that I'd consider it odd. Upon evaluating the situation, he realized he had the same problem as others: too narrow a range. Of course, mass online rating systems seem to be headed for the up-or-down vote format. One-bit ranking that works like DSD instead of multi-bit, I suppose.
 

tmtomh

Major Contributor
Joined
Aug 14, 2018
Messages
1,408
Likes
4,235
"WhatHiFi: Statistically, only 5-star awards are worthy of consideration."

I routinely read their reviews. Their ratings have zero meaning.

"WhatHiFi: Statistically, only 5-star awards are worthy of consideration... and pay no attention to our tendency to give 5-star ratings only to products from UK brands."

:):confused:
 

ReaderZ

Addicted to Fun and Learning
Joined
Apr 14, 2020
Messages
566
Likes
368
I take audio equipment very seriously so I actually read the full review and don't rely on ratings. I do look at scores for video games.
 

Berwhale

Major Contributor
Forum Donor
Joined
Aug 29, 2019
Messages
2,443
Likes
2,896
Location
UK

pozz

Слава Україні
Forum Donor
Editor
Joined
May 21, 2019
Messages
4,036
Likes
6,622

Berwhale

Major Contributor
Forum Donor
Joined
Aug 29, 2019
Messages
2,443
Likes
2,896
Location
UK
@pozz - Can you enable the export of data from the Power BI report? Ideally to an .xlsx with embedded hyperlinks to the review articles? (or maybe just PM me a copy).
 

pozz

Слава Україні
Forum Donor
Editor
Joined
May 21, 2019
Messages
4,036
Likes
6,622
@pozz - Can you enable the export of data from the Power BI report? Ideally to an .xlsx with embedded hyperlinks to the review articles? (or maybe just PM me a copy).
I'll PM you.
 

Berwhale

Major Contributor
Forum Donor
Joined
Aug 29, 2019
Messages
2,443
Likes
2,896
Location
UK
There doesn't appear to be an ISO standard for Panthers, so I took a look at this thread... ASR Panthers | Audio Science Review (ASR) Forum

...and came up with the following Panther rankings:

1 = Golfing, Soccer, Saluting Soldier
2 = Lounging, Sitting
3 = Shrugging, Shrugging Postman, Chilling
4 = Pewter (World's Worst Cook), Shrugging & Missing Arm (I can't find a picture of this one)
5 = Headless, Piggybank
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
37,969
Likes
162,429
Location
Seattle Area
There doesn't appear to be an ISO standard for Panthers, so I took a look at this thread... ASR Panthers | Audio Science Review (ASR) Forum
It is too good for ISO standards. ISO standards were created by a committee and we all know how that turns out. The Panther rating, in sharp contrast, was created by just me, so objectively it is a million times better....
 

Phorize

Major Contributor
Forum Donor
Joined
Apr 26, 2019
Messages
1,113
Likes
1,372
Location
U.K
"WhatHiFi: Statistically, only 5-star awards are worthy of consideration."

I routinely read their reviews. Their ratings have zero meaning.
Don’t they mean that the marketing people at the Acme Cable Company bought the reviewer lunch? Unless they are Naim of course, then it’s 5 stars all round, no lunch required.
 

Berwhale

Major Contributor
Forum Donor
Joined
Aug 29, 2019
Messages
2,443
Likes
2,896
Location
UK
I think this may say more about the elusiveness of the Pewter and Shrugging & Missing Arm panthers than Amir's Panther-o-meter...

1625597279943.png


If we group the 3rd and 4th ranks together (they are all friends, after all), then the 2nd, 3rd/4th and 5th groupings each account for roughly 29% of reviews, with the 1st rank reserved for 14% of the 49 entries, which seems nice!

There are only 7 data points for second Panthers with EQ - 4 with 1st rank and 3 with 2nd rank.

Note: Reviews and headphone measurements with no panther assigned were ignored (e.g. HD600, initial tests with GRAS, etc.).
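
The grouping arithmetic above can be sanity-checked with a short Python sketch. The per-group counts are an assumption: they are inferred from the rounded percentages quoted in the post (14% and roughly 29% of 49 entries), not read from the chart, and the helper name `shares` is hypothetical.

```python
def shares(counts):
    """Return each group's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {rank: round(100 * n / total) for rank, n in counts.items()}


# Assumed counts, inferred from the rounded percentages in the post;
# the original chart may differ slightly.
counts = {"1st": 7, "2nd": 14, "3rd/4th": 14, "5th": 14}
result = shares(counts)
print(sum(counts.values()), result)
```

Under that assumption the four groups sum to the 49 entries, and the shares come out to 14% for the 1st rank and 29% for each of the other three groupings.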
 