"You're not quite addressing the point of what I've proposed here."

But I am. I addressed it in a recent post:

"You can't standardize subjectivity, so this is a pointless exercise."
What's the problem with that?
Well, look at it like this. Currently we have lots of people out there listening to speakers and delivering nothing but subjective reviews, including a good handful of "professionals". Year in, year out, we get reviews of potentially interesting speakers with no objective data. It feels like a big waste.
I think most of the people in this thread would agree that these subjective reviews are worthless or nearly so. However, sometimes the only reviews available for a given piece of equipment are subjective.
So, we do this: We take everything a certain reviewer has ever written about speakers, and correlate the language they use to objective measurements where available.
This would yield a model where, if said reviewer uses the term (say) "strident", we could (for example) correlate that with a 92% probability that there is excess energy in the 7-9 kHz range.
Such a tool could also be used to evaluate the credibility of subjective reviewers. If the ML model finds that their use of terms is very consistent across speakers (for example, they always use the words "buttery", "smooth", or "round" to describe a bass boost of +2 dB or more in the 55-80 Hz region), we can then put more stock in those reviews, since we will know that their use of certain words consistently refers to real, specific variations in performance.
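Just to make that concrete (none of this exists yet; the reviews, the term list, and the band thresholds below are all invented for illustration), the core of such a model could be as simple as counting how often a measured condition holds whenever a given term shows up in a review:

```python
# Hypothetical sketch: correlate a reviewer's descriptor terms with measured
# frequency-response features. All data below is invented for illustration.
from collections import defaultdict

# Each entry: (terms the reviewer used, measured band levels in dB relative to flat)
reviews = [
    ({"strident", "detailed"}, {"7-9kHz": +3.2, "55-80Hz": -0.5}),
    ({"smooth", "buttery"},    {"7-9kHz": -0.8, "55-80Hz": +2.6}),
    ({"strident"},             {"7-9kHz": +2.1, "55-80Hz": +0.3}),
    ({"round", "warm"},        {"7-9kHz": -1.5, "55-80Hz": +3.1}),
]

# A candidate mapping from descriptor to an objective condition we want to test.
conditions = {
    "strident": lambda m: m["7-9kHz"] >= 2.0,   # excess energy in 7-9 kHz
    "buttery":  lambda m: m["55-80Hz"] >= 2.0,  # bass boost of +2 dB or more
    "smooth":   lambda m: m["55-80Hz"] >= 2.0,
    "round":    lambda m: m["55-80Hz"] >= 2.0,
}

hits, uses = defaultdict(int), defaultdict(int)
for terms, measurements in reviews:
    for term, condition in conditions.items():
        if term in terms:
            uses[term] += 1
            hits[term] += condition(measurements)

for term in conditions:
    if uses[term]:
        print(f"{term!r}: condition true in {hits[term]}/{uses[term]} uses "
              f"({100 * hits[term] / uses[term]:.0f}%)")
```

With real review text you would obviously need far more data, plus some way to pull the descriptor terms and the measured bands out automatically, but the counting logic itself wouldn't be much more complicated than this.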
Conversely, it could be used to discredit reviewers who just make s*** up for their reviews. If their use of language correlates poorly or not at all to objective measurements, we can show (objectively) that their reviews are meaningless.
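A rough way to put a number on "correlates poorly or not at all" would be a permutation check: shuffle which reviews used the term and see whether the reviewer's real usage predicts the measurement any better than chance. Again, every number below is made up; this is only a sketch of the idea:

```python
# Hypothetical permutation check: is a reviewer's use of a term any more
# predictive of a measured feature than chance? Data below is invented.
import random

random.seed(0)

# 1 = reviewer used the term, paired with the measured band deviation in dB.
term_used =   [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]
band_dev_db = [3.1, -0.4, 2.6, 0.2, 2.9, -1.0, 0.5, 3.4, -0.2, 2.2]

def mean_gap(used, dev):
    """Mean measured deviation when the term is used minus when it is not."""
    with_term = [d for u, d in zip(used, dev) if u]
    without   = [d for u, d in zip(used, dev) if not u]
    return sum(with_term) / len(with_term) - sum(without) / len(without)

observed = mean_gap(term_used, band_dev_db)

# Shuffle the term labels many times to see how big a gap chance alone produces.
null_gaps = []
for _ in range(10_000):
    shuffled = term_used[:]
    random.shuffle(shuffled)
    null_gaps.append(mean_gap(shuffled, band_dev_db))

p_value = sum(g >= observed for g in null_gaps) / len(null_gaps)
print(f"observed gap: {observed:.2f} dB, permutation p-value: {p_value:.3f}")
```

If the gap a reviewer's actual word choice produces is no bigger than what the shuffled labels produce, the term is telling us nothing about the measurement.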
To me, this would be a little bit useful and very interesting. I'm not really proposing more than that, a mostly-just-interesting study.
It seems that people in this thread are arguing such a tool or study would be worse than nothing, which is odd to me, but I'm still considering the possibility I haven't made the concept clear.
You could also use this technique to find trends in how people in general use these words, but maybe that's less interesting or useful. I think it would work for any corpus where there is enough use of flowery language. That could be "people in general" or, for long-standing reviewers, it could be one person.
"I simply doubt that you could get any reasonably useful percentage of correlation (much less something in the 90-ish category)"

If so, then so much the better: we would have data showing that subjective reviews are unreliable, and HOW unreliable they are.
"this whole idea seems to be a crutch that would, for many, many people, destroy any interest or reliance on measurements."

I don't know, it doesn't work at all without the objective measurements, so you could just as well argue that this could be the final subjugation of subjectivity to objectivity! Just think - people would skip reading the review and just wait for the "ML scoring model summary" of the review to come out on ASR. Then, if the scores look good, they petition Amir to measure the device for confirmation.
"Yes, I would limit this to descriptions of speakers, to at least avoid measuring known-zero differences..."

But are they zero differences?
"Conversely, it could be used to discredit reviewers who just make s*** up for their reviews. ..."

How many reviewers do that, though? Their brain processed the input from the ears, along with whatever other inputs there are and whatever experience was deemed relevant - working, by current accounts, across several centres in the brain on different aspects of the music - and created the experience that is described as "hearing" particular aspects of the "sound".
In retaliation, the subjective reviewers would either start publishing measurements to stay in the game, or double down and switch up their vocabulary to throw off the model, which would backfire and kill off their remaining traffic / readership.
"How many reviewers do that though?"

Quite a few, if you ask the typical ASR poster.
"But maybe the bigger speaker, or the one with the traditional look, or the famous brand name, is seen as better but it comes out as a difference in the sound once the brain's finished its processing."

Certainly, and actually I would not argue that subjective reviews that are technically about nothing more than the look of the equipment are worthless. They enable people to enjoy music (even if it's on a questionable basis) and they keep hi-fi manufacturers in business. "But they're WRONG!!" is a common feeling around here, but at the end of the day, homeopathic audio solutions aren't just fraud, they're effective placebos. If you hear an improvement that isn't actually there, well, you still hear it, and that's what counts.
"So the subjective model is doing something. Maybe it would be more scientific to understand what it is doing, ..."

Indeed, that's the real goal of this proposed idea: to brute-force a link between measurable performance and subjective description.

"... condemn it for not matching up with objective measurements (if indeed it doesn't, for whatever boundaries we put around what the sound waves are doing)."

If there are any objective correlations between the actual measured sound and what people say, then we can be comforted in the fact that there is real value in these reviews.
If the terms used in reviews turn out to be indistinguishable from random, we have 2 possibilities left:
1) The reviews are hogwash and should be ignored, unless you want someone to talk you into buying gear for no reason whatsoever
2) The reviewers are actually hearing something that is not measured. We could infer this by finding that two or more subjective reviewers' terminology matches closely but doesn't correlate with measurements. So either they are collaborating (not unlikely) or they're actually hearing something we don't measure (much less likely). A toy version of that comparison is sketched below.
Both outcomes would be valuable info, I think!
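For possibility 2, the same kind of toy comparison works: check whether two reviewers' use of a term tracks each other more closely than either tracks the measurement. All of the data here is invented, purely to show the shape of the test:

```python
# Toy sketch for possibility 2: do two reviewers' descriptions track each other
# more closely than either tracks the measurements? All data below is invented.

# For ten speakers: did each reviewer use the term, and does the measured
# condition (e.g. a treble excess) actually hold?
reviewer_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
reviewer_b = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
measured   = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]

def agreement(x, y):
    """Fraction of speakers on which the two binary judgements coincide."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

print(f"A vs B:            {agreement(reviewer_a, reviewer_b):.2f}")
print(f"A vs measurements: {agreement(reviewer_a, measured):.2f}")
print(f"B vs measurements: {agreement(reviewer_b, measured):.2f}")

# High A-vs-B agreement combined with low agreement against the measurements
# would point to either collaboration or something real that isn't measured.
```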
There's a third option at least. The reviewers are reacting to some other feature of their experience of the system. For example, the name on the boxes.
And a fourth. A reviewer may be confused about the experience, decide that something is a bit different in some way, and rationalise it. Yes, that is "making stuff up", but what editor will let a reviewer come to the conclusion "haven't the faintest"? So a conscious conclusion based on what might be expected will fit the bill.
Many of us throw up our hands when equipment is described as "fast", "slow", "crisp", "warm", etc. It seems impossible to relate these terms to measurable characteristics.
The issue is that when it comes to facts, people think more like lawyers than scientists, which means they 'cherry pick' the facts and studies that back up what they already believe to be true.
So if someone doesn't think humans are causing climate change, they will ignore the hundreds of studies that support that conclusion, but latch onto the one study they can find that casts doubt on this view. This is also known as confirmation bias, a type of cognitive bias.
"7. Wovon man nicht sprechen kann, darüber muss man schweigen."
Literal.
We should defer to measurements when performance assessment is the goal. A lot can be done outside of that to speak better, to be more informative, but the object of description is phantasmagorical. It asks for precision and study. It resists casualness.
"There are audible differences, even if standard measurements are suggesting to some people there are none."

The problem is this. If you eliminate differences by level matching between devices where no audible difference is predicted by measurements, and test blind, the result is that no audible difference is heard. It's kind of "by definition", but it's strong evidence that we don't have to allow for anything else.
Now. Fact is.
There are audible differences, even if standard measurements are suggesting to some people there are none.
It's been proven a million times. Not just by the "audiophile golden ears" out there.
Manufacturers, professional reviewers, reviewers, audio professionals, more or less experienced users, ... they can all hear it.
Everybody can - even via poorly recorded YouTube videos you can tell fuse A from fuse B, cable A from cable B, and so forth, apart.
Simply ignoring the facts is the worst thing a scientist can do.
"Curious: How do you think words get into dictionaries in the first place?"

Etymologists use resources to standardize the meaning of words, so others can use those words with some level of consistent meaning.
And does some level of imprecision render words useless?
"smooth" "sharp" "dull" "sweet" "bitter" and on and on?
Would you plead ignorance if anyone used such terms...or any of countless such examples...because those words are not measurements, or don't come with measurements, and un-quantified language represents such a subjective morass it's just useless?
How many measurements do you see in dictionary definitions?
"How we could finally pin down flowery audiophile subjective descriptions?"

YES
The answer is simple - ignore it.