
The frailty of Sighted Listening Tests

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,679
Likes
241,130
Location
Seattle Area
Apart from this being a test of four, and only four, loudspeakers, simple 'ranking' is a crude measure. Alternatively, a takeaway from that graph is that blinding the test reduced the strength of preference for the first two, and dislike for the third, enough that the four actually become remarkably similar to each other, preference-wise. In fact, the error bars for A, B, C, and D all appear to overlap once the test is blinded.
Once more, the testing included three Harman speakers and all the listeners were Harman employees. That they had strong biases in sighted listening is without question. Such is not the case in my listening tests. Or at least not nearly so.

The study says don't drink your own Kool-Aid if you are designing speakers. Make sure you perform controlled studies when designing speakers so that you are not predisposed to like or dislike your own designs and products. We are in violent agreement on that front.

And no, simple ranking is not a crude measure. That is what I am tasked to do: give or not give a recommendation. I am not providing subjective numerical scoring for speakers. So your point is a non sequitur on that front.
 

MattHooper

Master Contributor
Forum Donor
Joined
Jan 27, 2019
Messages
7,332
Likes
12,293
I find it so. Except as anthropology.

Yes, I think you were one of the members I was thinking of in writing that ;-)



Maybe you 'often find' that because you tend to remember when it happens more than when it does not. That, too, is a form of bias.

Possible, of course. As we've all repeated here until we are red in the face: blind testing is the gold standard when you *really* want to be sure.

And, of course, absent blind testing you also can't presume the situations I'm describing were due to that selection bias. You have no reason to take my word for it. But from my perspective, I've seen quite accurate subjective descriptions of pretty much every speaker I own from reviewers and other audiophiles. It also reminds me that I did a long thread on Audiogon describing my more recent speaker hunt, where I auditioned a ton of different well-known speakers and gave my subjective descriptions of their character. Many people replied to the thread and there was almost no dissent from my descriptions, with most saying essentially "I know X or Y speakers and you've described what I hear."

And as I have said before, I've also been led to speakers I adore by the subjective reports of others. I kept encountering from reviewers and audiophiles a consensus on the general sound character of DeVore O speakers, and the characteristics described were very much what I was looking for. I was also aware of the reasons why Revel speakers were highly regarded, and generally understood what type of sound they produced through both measured and subjective descriptions. When I heard the Revels it was "Yup, no surprises there, competent in all the ways I expected." When I heard the DeVores they exhibited just the slightly eccentric characteristics I'd read about, and I found myself loving my music through them more than through the Revels.

So in both cases the subjective reviews by others seemed to really capture the character of each speaker brand. And I was actually led to a speaker I really liked by subjective reviews. Though, for various reasons, I ended up with another speaker for which, again, there was a high level of consensus on the sonic characteristics — exactly what I hear from them at home. So...yes, I personally find carefully parsing reviews and reports from other audiophiles to be somewhat helpful.

Are there possible biases operating here? Of course. That's possible. But whatever may happen under blinded conditions, it remains the case that strictly using sighted listening and exchanging notes with others using the same method has led to a sufficient level of consistency to be of use to me. Whatever may change under blind conditions, under the sighted conditions in which I listen, the speakers I get continue to sound "the same" as when I first heard them, and "the same" as others have described them, making for satisfying purchases.


Really, there are excellent, excellent reasons to require blind protocols.

Really? A year and a half on this site, and here we are 26 pages into this thread and... I had no idea. ;)


Sean Olive, a Harman-trained listener, sometimes uses sighted tests...but is well aware of the limitations and biases involved.

Yes, just like pretty much everyone here.

Sighted listening is less reliable than blind trials.

But "less reliable" is not the same as "wholly unreliable."

Sighted listening, even given the ever-present possibility of bias, is not necessarily useless. There is an ever-present possibility of bias in your every perception, your every inference, in judging your every action. Yet, somehow, without blinded protocols to vet every inference, you seem to navigate the world in a predictable-enough manner.
 
OP

patate91

Active Member
Joined
Apr 14, 2019
Messages
253
Likes
137
MattHooper wrote: "Sighted listening, even given the ever-present possibility of bias, is not necessarily useless. ... Yet, somehow, without blinded protocols to vet every inference, you seem to navigate the world in a predictable-enough manner."

A roll of dice is not what I would call reliable. Even at a 25% failure rate I wouldn't call it reliable. Say you want to buy a car and the seller tells you: a quarter of our cars fail in the first year. Trained/experienced/critical listeners, by contrast, seem to be very reliable (consistent) during blind tests.

But again, at this point, I would change my position if someone shared science-based proof.
 

amirm

Even at a 25% failure rate I wouldn't call it reliable.
You don't understand statistics then.

I have tested about 80 speakers. I would need to get 48 of those right for the probability that I was merely guessing to fall below 5% (p < .05). That is only 60% right. Or inverted, I can get 40% wrong and the results would still pass the standard of validity in such studies.

Controlled tests routinely contain errors from people randomly guessing and the like. That is why we perform an analysis like the one above to determine the likelihood of the results being correct. I suggest reading this article I wrote: https://www.audiosciencereview.com/forum/index.php?threads/statistics-of-abx-testing.170/

And this chart specifically: (chart from the linked article, not reproduced here)


As I said, many of these protests arise because the subject itself is not understood. Blind tests do not generate reliable results just because they are blind. You must understand the nature of testing -- sighted and blind -- to understand the power of their conclusions. Walking blind into either, pun intended, gets you into trouble fast.
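The arithmetic behind that threshold is a one-sided binomial tail, and it is easy to double-check. A minimal sketch in plain Python (standard library only; the 80-speaker count comes from the post above, everything else is illustrative):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right
    by pure guessing (p = 0.5 per trial): one-sided binomial tail."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Smallest score out of 80 evaluations that rejects pure guessing at p < .05
threshold = next(k for k in range(81) if abx_p_value(k, 80) < 0.05)
print(threshold, round(abx_p_value(threshold, 80), 3))
```

The same function reproduces the running percentages that foo_abx prints in the logs quoted later in this thread; for example, an 11/11 run corresponds to 1/2048 ≈ 0.05%, which the log rounds to 0.0%.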
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
Amir wrote: “At the end of the day, you can't know the limits or effectiveness of what I do. So best to take a back seat and not try to pontificate based on studies that I keep telling you does not read on this situation.”

If he wants to clarify that Harman trained listeners are his equals, then that will clarify things and negate what I said. Otherwise, if we “can’t know the limits or effectiveness” of his skills, it seems it’s pointless to bring research to bear on this question.

To @Thomas savage ’s point above, Amir just wrote: “In other areas such as distortion, you [quoting Olive] are now in my wheelhouse. I have shown my skills in this area many times including in challenges from likes of you and late Arny Krueger. That is my resume. There is no such proof for the listeners in Harman studies. Trained in that context meant tonality of speakers, not ability to hear very small impairments.”

I think this pretty definitely answers your question. Yes, Amir is asserting that, even though (as he’s said many times) he can only reach level six in How to Listen and can’t hear much above 12 kHz anymore, his skills are above and beyond anyone else’s, including those of Harman-trained expert listeners and (it seems) Mr. Olive himself.

Everyone can evaluate that claim for themselves. I’m dubious, to say the least. (I’d also like to correct the record: How to Listen doesn’t only train to listen for tonality.)

But regardless, such a stance makes clear that even if Harman had done a test with trained listeners that exactly mimicked Amir’s evaluation process and found that the listeners struggled, he’d say it didn’t apply to him. So there’s not much point in trying to bring more research to bear on this issue, insofar as it’s specifically about Amir.

My broader view, based on things such as the reviewer/measurement correlations presented by Mr. Olive above, is that some people are better than others at providing accurate sighted subjective evaluations. I tend to think training does matter in that regard. How to Listen is a great program for that, and I think others have value, too, such as Professor Jason Corey’s book and course/software.

Ultimately, I think each reader needs to decide how much they trust any given reviewer. But I think insofar as people on ASR trust Amir’s subjective evaluations and no longer assert that only blind evaluations are valid, they need to resist rejecting other reviewers’ subjective evaluations out of hand because, to quote Amir, we “can’t know the limits or effectiveness” of each reviewer’s training or skills. This is simple fairness and logical consistency. Moreover, if there’s only one person (rather than all trained listeners or all critical audiophiles, etc.) who can “hear very small impairments,” it raises the question of why we should even care about those “impairments” at all.
 

amirm

I think this pretty definitely answers your question. Yes, Amir is asserting that, even though (as he’s said many times) he can only reach level six in How to Listen and can’t hear much above 12 kHz anymore, his skills are above and beyond anyone else’s, including Harman-trained expert listeners and (it seems) Mr. Olive himself.
I said nothing about "beyond anyone else." I said that I am trained in hearing distortion artifacts and Harman listeners are not. Without training, folks are in no position to hear what I am able to hear. And I have demonstrated that in countless online challenges.

Here is one from another champion of double-blind tests, Ethan Winer. He set up a test to see if people can tell whether a signal has been digitized over and over again. Here is how I did:

-------
Here is an example test you can take to show us you do have good hearing acuity. https://ethanwiner.com/loop-back.htm

It is a piece of music that has gone through a DAC, then an ADC, then back through a DAC and so on. And on a really bad DAC/ADC as audiophile standards go: a $25 Soundblaster X-Fi.

This is me finding the difference double blind with just one pass through DAC/ADC:
foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/18 06:40:07

File A: C:\Users\Amir\Music\Ethan Soundblaster\sb20x_original.wav
File B: C:\Users\Amir\Music\Ethan Soundblaster\sb20x_pass1.wav

06:40:07 : Test started.
06:41:03 : 01/01 50.0%
06:41:16 : 02/02 25.0%
06:41:24 : 03/03 12.5%
06:41:33 : 04/04 6.3%
06:41:53 : 05/05 3.1%
06:42:02 : 06/06 1.6%
06:42:22 : 07/07 0.8%
06:42:34 : 08/08 0.4%
06:42:43 : 09/09 0.2%
06:42:56 : 10/10 0.1%
06:43:08 : 11/11 0.0%
06:43:16 : Test finished.

----------
Total: 11/11 (0.0%)

And of course with 20 loops:

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/18 05:38:16

File A: C:\Users\Amir\Music\Ethan Soundblaster\sb20x_original.wav
File B: C:\Users\Amir\Music\Ethan Soundblaster\sb20x_pass20.wav

05:38:16 : Test started.
05:39:05 : 00/01 100.0%
05:39:27 : 00/02 100.0%
05:39:44 : 01/03 87.5%
05:40:01 : 02/04 68.8%
05:40:18 : 02/05 81.3%
05:40:30 : 03/06 65.6%
05:40:58 : 04/07 50.0%
05:41:09 : 05/08 36.3%
05:41:19 : 06/09 25.4%
05:41:28 : 07/10 17.2%
05:41:38 : 08/11 11.3%
05:41:53 : 09/12 7.3%
05:42:02 : 10/13 4.6%
05:42:18 : 11/14 2.9%
05:42:29 : 12/15 1.8%
05:42:42 : 13/16 1.1%
05:42:53 : 14/17 0.6%
05:43:03 : 15/18 0.4%
05:43:16 : 16/19 0.2%
05:43:27 : 17/20 0.1%
05:43:40 : 18/21 0.1%
05:43:53 : 19/22 0.0%
05:43:58 : Test finished.

----------
Total: 19/22 (0.0%)

As you see, a 0.0% probability of guessing.

---------

Above is not the hardest test I have passed but it is a challenge put forward by someone who thinks people can't pass these tests.

Ultimately, the fact that someone can detect impairments better than you is a source of severe angst for a few of you. That is all this is. Everything else is "smoke-filled coffeehouse crap," as the great line in the great movie A Few Good Men goes. Get over it. Sometimes you have to accept the truth no matter how distasteful to you.
 
OP

patate91

amirm wrote: "You don't understand statistics then. ... Blind tests do not generate reliable results just because they are blind. You must understand the nature of testing -- sighted and blind -- to understand the power of its conclusions."

Blind tests are reliable because they are controlled and repeatable, and because they remove a lot of biases and variables (not all).
 

Racheski

Major Contributor
Forum Donor
Joined
Apr 20, 2020
Messages
1,116
Likes
1,702
Location
Chicago
I vote to close this thread. At this point we are re-hashing the same arguments over and over; it is taking up Amir's time, and nobody is going to change their mind. It's probably worth revisiting this topic if new research is published.
 

amirm

But I think insofar as people on ASR trust Amir’s subjective evaluations and no longer assert that only blind evaluations are valid, they need to resist rejecting other reviewers’ subjective evaluations out of hand because, to quote Amir, we “can’t know the limits or effectiveness” of each reviewer’s training or skills.
What nonsense and FUD. As I have said, I follow strict protocols in my subjective tests:

1. Mono listening. No other reviewer does this. Research from Olive et al. shows that the ability to tell differences shrinks substantially when stereo is used, as the "other" reviewers do.

2. Training. I am trained with both the Harman software and years of on-the-job training to hear impairments. None of the "other" reviewers are.

3. Strict protocol: every speaker in the same spot, the same listening clips, and the listener in the same position.

4. No stake in the outcome: I don't get free loaner equipment. I don't make money from sponsored links. I don't get paid to do any of this. Other reviewers are not remotely situated like this.

So please cut out the debating tactics and fear mongering. There is a huge difference between what I do and what other reviewers do.
 

Rusty Shackleford

amirm wrote: "I have tested about 80 speakers. ... That is why we perform analysis like I just did above to determine the likelihood of the results being correct. ... Blind tests do not generate reliable results just because they are blind."

With all due respect, I think this is a misleading framing of the issue. In all studies (and, though I'm purposely pseudonymous here, I can assure you I have more than lay knowledge, so please don't retort that I "don't understand"), you need to specify the sample properly.

The whole sample of speakers, across which your recommendations show a 75% correlation with measurement-calculated preference scores, isn't the proper sample if our question is about being able to differentiate between speakers with near-identical scores. (Doubly so when you have access to the measurements before issuing your recommendations.)

There are several ways we could go about this, but one would be a large, blind study with trained listeners evaluating similar-scoring (by measurements) speakers to see if there’s a pattern of trained listeners preferring certain of those speakers to the others. Then we could compare any subsequent listener’s preferences to that baseline to see how valid they are.

But, of course, if the listener being compared asserts that their skills trump those of the trained listeners in the initial study, the point becomes moot and we’re again in superlative golden ear territory.
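That kind of reviewer-versus-measurement agreement is straightforward to quantify. As an illustration only (the speaker data below is made up, not taken from any study), here is a stdlib-only Python sketch of a Spearman rank correlation between measurement-based preference scores and a reviewer's sighted ranking:

```python
def spearman_rho(x, y):
    """Spearman rank correlation for two equal-length sequences without ties."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical data: measurement-based preference scores for five speakers,
# and a reviewer's sighted ranking of the same five (1 = best).
scores = [6.2, 4.8, 5.5, 3.9, 5.9]
reviewer_rank = [1, 3, 4, 5, 2]

# Negate ranks so both scales point the same way (higher = better).
print(spearman_rho(scores, [-r for r in reviewer_rank]))  # prints 0.9
```

Rank correlation fits this debate better than Pearson correlation because, as noted earlier in the thread, the reviewer's task is ordinal (recommend or not, better or worse), not a numerical score.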
 

scott wurcer

Major Contributor
Audio Luminary
Technical Expert
Joined
Apr 24, 2019
Messages
1,501
Likes
2,822
amirm wrote: "Ultimately the fact that someone can detect impairments better than you is a source of severe angst for the few of you."

Difference and preference are two different things. "Impairment" has implications of preference; "detecting differences" would be a better term, IMO.
 

amirm

Blind tests are reliable because they are controlled and repeatable, and because they remove a lot of biases and variables (not all).
Not at all. You are confusing "blind" with "blind controlled tests," and even then your conclusions are way off. If I did not match levels, for example, the fact that a test is blind would not at all mean that the results are reliable. This is why you need to read and understand the research and not just run with buzzwords. If you had read the research, you would have known what the word "trained" meant in it, and that they used Harman employees instead of the general population.

All studies have degrees of validity. Characteristics of the test alone don't make them reliable.

On repeatability, no one has repeated this research so you don't have that either.
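Level matching, mentioned above as a prerequisite for a meaningful blind test, is easy to verify numerically. A minimal sketch (illustrative only: synthetic tones rather than real program material, and not tied to any particular test rig) that compares RMS levels in dB:

```python
import math

def rms_db(samples):
    """RMS level of a list of float samples (full scale = 1.0), in dBFS."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

# Synthetic 1 kHz tones at a 48 kHz sample rate; `b` is deliberately
# scaled to be exactly 1 dB quieter than `a`.
a = [0.5 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]
b = [s * 10 ** (-1 / 20) for s in a]

mismatch = rms_db(a) - rms_db(b)
print(round(mismatch, 2))  # 1.0 dB -- easily enough to bias a "blind" comparison
```

Blind-test practice commonly calls for matching levels to within about 0.1 dB, since even small level differences are readily perceived as quality differences rather than loudness differences.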
 

Rusty Shackleford

Active Member
Joined
May 16, 2018
Messages
255
Likes
550
So please cut out the debating tactics and fear mongering. There is huge difference between what I do and other reviewers do.

As you have told us, you “can’t know the limits or effectiveness” of other reviewers’ training or skills.
 

Rusty Shackleford

amirm wrote: "If I did not match levels for example, the fact that a test is blind does not at all mean that the results are reliable."

Do you honestly think @patate91 doesn’t know that level matching is essential? It goes without saying here. He’s writing in shorthand. No one on this thread is typing out every single element of the proper method in every single post.
 
OP

patate91

amirm wrote: "You are confusing "blind" with "blind controlled tests" ... On repeatability, no one has repeated this research so you don't have that either."

But there are other studies about bias and blind tests.

At this point I don't know why you conclude that I'm confused about blind versus controlled.
 

amirm

As you have told us, you “can’t know the limits or effectiveness” of other reviewers’ training or skills.
Oh you can. Just ask them to demonstrate the same skills they say they have sighted, blind. As I have been asked. I think it was actually Steven (krabapple) who challenged me to pass MP3 at 320 kbps against CD. Right on the spot, I took a clip that ArnyK had put forward and showed that I could pass it:

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/19 19:45:33

File A: C:\Users\Amir\Music\Arnys Filter Test\keys jangling 16 44.wav
File B: C:\Users\Amir\Music\Arnys Filter Test\keys jangling 16 44_01.mp3

19:45:33 : Test started.
19:46:21 : 01/01 50.0%
19:46:35 : 02/02 25.0%
19:46:49 : 02/03 50.0%
19:47:03 : 03/04 31.3%
19:47:13 : 04/05 18.8%
19:47:27 : 05/06 10.9%
19:47:38 : 06/07 6.3%
19:47:46 : 07/08 3.5%
19:48:01 : 08/09 2.0%
19:48:19 : 09/10 1.1%
19:48:31 : 10/11 0.6%
19:48:45 : 11/12 0.3%
19:48:58 : 12/13 0.2%
19:49:11 : 13/14 0.1%
19:49:28 : 14/15 0.0%
19:49:52 : 15/16 0.0%
19:49:56 : Test finished.

----------
Total: 15/16 (0.0%)
---

At that time, there was also a challenge on AVS about passing high res vs CD. So I took those very same clips and tested the MP3 hypothesis:

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/31 15:18:41

File A: C:\Users\Amir\Music\AIX AVS Test files\On_The_Street_Where_You_Live_A2.mp3
File B: C:\Users\Amir\Music\AIX AVS Test files\On_The_Street_Where_You_Live_A2.wav

15:18:41 : Test started.
15:19:18 : 01/01 50.0%
15:19:30 : 01/02 75.0%
15:19:44 : 01/03 87.5%
15:20:35 : 02/04 68.8%
15:20:46 : 02/05 81.3%
15:21:39 : 03/06 65.6%
15:21:47 : 04/07 50.0%
15:21:54 : 04/08 63.7%
15:22:06 : 05/09 50.0%
15:22:19 : 06/10 37.7%
15:22:31 : 07/11 27.4%
15:22:44 : 08/12 19.4%
15:22:51 : 09/13 13.3%
15:22:58 : 10/14 9.0%
15:23:06 : 11/15 5.9%
15:23:14 : 12/16 3.8%
15:23:23 : 13/17 2.5%
15:23:33 : 14/18 1.5%
15:23:42 : 15/19 1.0%
15:23:54 : 16/20 0.6%
15:24:06 : 17/21 0.4%
15:24:15 : 18/22 0.2%
15:24:23 : 19/23 0.1%
15:24:34 : 20/24 0.1%
15:24:43 : 21/25 0.0%
15:24:52 : 22/26 0.0%
15:24:57 : Test finished.

----------
Total: 22/26 (0.0%)

Show me these kinds of results from "other reviewers" and then we can talk. Until then, they are totally unqualified to test anything related to audio. They need to take blind tests, as I have, to demonstrate their critical listening ability.

So don't lump me in with other people as a debating stunt. It doesn't work. I have proof on my side; you don't.
 

amirm

But there are other studies about bias and blind tests.
Then post a link regarding speakers. That is the only topic of interest. Don't show a lack of common sense by constantly generalizing from other areas.
 

Thomas savage

Grand Contributor
The Watchman
Forum Donor
Joined
Feb 24, 2016
Messages
10,260
Likes
16,306
Location
uk, taunton
Rusty Shackleford wrote: "Ultimately, I think each reader needs to decide how much they trust any given reviewer. ... it raises the question of why we should even care about those “impairments” at all."
Tbh I've no clue what your point is, beyond grasping at every straw to try and win some battle whose origin no one can even remember.

Every time, you add a bit more nonsense to back your extravagant claim that Amir proclaims himself the greatest listener in the universe, impervious to bias and all other potential error.

You're expending a huge amount of time and effort on this mission. God knows why.

It's quite dull. Just harassment really, and I'm at the point where I'd suggest you just ignore the parts of these free reviews you don't like.

There's a thread open to discuss how we can improve/qualify these listening tests. I don't suspect you to have contributed to that.
 

Rusty Shackleford

amirm wrote: "Oh you can. Just ask them to demonstrate the same skills they say they have sighted, blind. ... Show me these kinds of results about "other reviewers" and then we can talk. ... I have proof on my side, you don't."

Yes, you’ve quoted these in many other threads for many years. But, as you just told @patate91, this is a thread “regarding speakers.”

More importantly, what happens when someone outscores you in How to Listen or another test? You have already rejected the skills of Harman-trained listeners who have done just that.

Who gets to decide which tests or skills are important? The answer seems to be “only Amir.”
 
OP

patate91

Then post a link regarding speakers. That is the only topic of interest. Don't show a lack of common sense by constantly generalizing from other areas.

It seems that you don't want to recognize that bias affects every human being in all situations. The audio world is no different from other fields. Again, it comes down to a claim that there are "special" humans.

Foobar ABX is a blind test. But again, no one refutes your abilities.
 