Blind Test Results: Benchmark LA4 vs Conrad Johnson Tube Preamp

lashto · Nov 25, 2022

charleski said:
Read what? It’s all been taken down and Hertsens didn’t bother to link to the post describing his methods, so finding it would require scrabbling around in the archive (neither of your links have been stored on Wayback). He doesn’t say that he used the same procedure as earlier, so we’d just be making an assumption anyway. All we’re left with is a cool story.

Maybe this guy does have preternatural auditory discrimination and can resolve very subtle differences. That in itself would be interesting and definitely worth investigating further in a rigorous manner. But … no, he doesn’t bother. It’s just an anecdote that he used for a blog post, and as worthless as all the other anecdotes littering the web.

Looks like you are very 'rigorous' about others doing the work but you won't even bother digging the archive to find the links & details. Admittedly that's zero fun, but still..

The 3amps was just one of the tests, the first and supposedly the easiest one. IIRC, there was one guy who screwed all tests but all others had good results with the 3amps. At least better than the expected 50-50. That guy was the only one who "aced" it but he was not a huge outlier, he just did a bit better. And considering how different those amps measured, the audibility claims do not seem particularly wild to me.
The cables test was the expected blalala and, also as expected, no one could reliably diff the DACs. Nothing really wild happened there.

Could that test have been done better or at least documented better? Of course! Same as (almost) any other blind test.
Does the test fail to satisfy your criteria X/Y for rigurosity? Could easily be, nothing is perfect. If you want 'total science', anything not published in a peer-reviewed journal (and actually reviewed/reproduced X times) is "anecdote". Even then it's not possible to 100% eliminate the "anecdote" part.

Just keep whatever convictions/beliefs you currently have about amps/etc and don't bother reading any of those "anecdotes". At least not until they were (re)tested for 50+ years and reached the sureness level of the theory of relativity.
Good luck & good sound while waiting...

lashto · Nov 25, 2022

MattHooper said:
And one reason that is interesting to me concerns how subtle the difference is, given the much poorer measurements of the CJ preamp.

Yes the differences seem to me to be distinct, and for my goals they are significant. But in the bigger picture they are very subtle. Most of the super fine detail heard through the Benchmark is heard via the CJ preamp as well. Recordings retain their very individual characteristics. In that sense it could be surprising how alike they sound!
...

That sounds pretty much like the expected result. At least to me.

Distortion is not easy to test/hear and probably impossible without some training: one needs to know what/how to listen for.
Most people expect missing sounds/details/air or some sort of 'sounds bad' effect but there should not be any of that ... at least not before the tube amp goes beyond its 'comfort zone'

charleski · Nov 25, 2022

lashto said:
Looks like you are very 'rigorous' about others doing the work but you won't even bother digging the archive to find the links & details. Admittedly that's zero fun, but still..

The 3amps was just one of the tests, the first and supposedly the easiest one. IIRC, there was one guy who screwed all tests but all others had good results with the 3amps. At least better than the expected 50-50. That guy was the only one who "aced" it but he was not a huge outlier, he just did a bit better. And considering how different those amps measured, the audibility claims do not seem particularly wild to me.
The cables test was the expected blalala and, also as expected, no one could reliably diff the DACs. Nothing really wild happened there.

Could that test have been done better or at least documented better? Of course! Same as (almost) any other blind test.
Does the test fail to satisfy your criteria X/Y for rigurosity? Could easily be, nothing is perfect. If you want 'total science', anything not published in a peer-reviewed journal (and actually reviewed/reproduced X times) is "anecdote". Even then it's not possible to 100% eliminate the "anecdote" part.

Just keep whatever convictions/beliefs you currently have about amps/etc and don't bother reading any of those "anecdotes". At least not until they were (re)tested for 50+ years and reached the sureness level of the theory of relativity.
Good luck & good sound while waiting...

I notice you can't be bothered to find the links, so why should I? Now you say the amps measured 'different', earlier you said they were all in the same 80-90 SINAD ballpark. Looks like neither of us know what was actually going on here. If Hertsens had meant this to be taken seriously he'd have bothered to write it up properly. Clearly he didn't and it was just a blog post to fill his quota.

No, any result that can't be reproduced and hasn't undergone some sort of peer review is not science. If these anecdotal reports were demonstrating a real effect, then reproducing them should be trivial. Maybe this is just laziness, or maybe it's because the phenomenon disappears when they try to repeat it and inadvertently remove the cue that was confounding their initial result.

lashto · Nov 25, 2022

charleski said:
I notice you can't be bothered to find the links, so why should I?

why should anyone?

But looks like I did a lot more effort than you. And oops, my bigsound links were indeed bad, should work now. They were just 1-2 clicks away from the initial article but anyway, you have em now.

charleski said:
Now you say the amps measured 'different', earlier you said they were all in the same 80-90 SINAD ballpark. Looks like neither of us know what was actually going on here. If Hertsens had meant this to be taken seriously he'd have bothered to write it up properly. Clearly he didn't and it was just a blog post to fill his quota.

You can study the measurements. 80-90 SINAD is already a big ballpark but my bad again, it was more like 50-90 (check the tube-amp). So, even less of an audibility 'surprise'

charleski said:
No, any result that can't be reproduced and hasn't undergone some sort of peer review is not science. If these anecdotal reports were demonstrating a real effect, then reproducing them should be trivial. Maybe this is just laziness, or maybe it's because the phenomenon disappears when they try to repeat it and inadvertently remove the cue that was confounding their initial result.

Don't think that Herstens planned to write phd/AES papers from that test. But he did a giant lot of useful work, much more than most of us. Also doubt that he had some 'hidden agenda' about the audibility/results

P.S.
Looks like my threshold for acceptance/accountability is lower than yours. Otherwise, not sure why you seem somewhat upset with my posts. I did not do those tests and also did not make any particularly wild claims about them.
It was what it was, some people will be convinced some people will not be...

charleski · Nov 25, 2022

lashto said:
not sure why you seem somewhat upset with my posts

Because we need to be more discriminating if we're ever going to get any genuine answers. A slapdash old blog post that was clearly only meant to be a bit of entertainment doesn't provide any help. Claiming that someone can tell the difference between 80dB and 90dB SINAD is a pretty wild claim. Now that you've gone and checked properly it turns out the low end was 50dB instead, which is a massive difference. Checking that first would have been a good idea.

MattHooper · Nov 25, 2022

I just came across this photo I took of the results for this blind test. It's the two different sheets of paper: the two columns on each sheet represent the 2 sets of trials we performed. The sheet on the left was my son's sheet. As described, he did the switching using an on-line random number generator to determine the order of the switching. The sheet on the right - listener sheet - is my sheet where I wrote down my "guesses" from the listening room, as to which pre-amp was being used. You can see the correlation there. But the reason I'm posting is that I find the second column - second trial - interesting. To me it sort of indicates the usefulness of using a random generator to determine the switching. Look at the right column and you'll see starting at #6, there are six Cs in a row - so the random generator had my son NOT switch for 6 times in a row! Pretty tricky!

I actually remember that point in the test when I'd yelled "switch" and I could tell I was hearing the CJ preamp, but then yelled switch again and thought..ok sounds the same, still the CJ...then I yelled switch again...."huh? No change in sound...still CJ".... yelled switch again...expecting a change. But again, sounded the same like the CJ so...ok...then yelled switch again..."what? sounds the same. It couldn't REALLY be the CJ again could it? Started to doubt myself but decided to just keep writing down what it sounded like to me, rather than trying to second-guess the switching. So....6 CJ's in a row. And it turned out I was accurate.

In previous blind tests my helpers have used a coin flip to randomize switching, but the on-line number generator made it easier.

BLIND TEST RESULTS SHEET CJ BENCHMARK copy 2.jpg

lashto · Nov 28, 2022

charleski said:
Because we need to be more discriminating if we're ever going to get any genuine answers. A slapdash old blog post that was clearly only meant to be a bit of entertainment doesn't provide any help. Claiming that someone can tell the difference between 80dB and 90dB SINAD is a pretty wild claim. Now that you've gone and checked properly it turns out the low end was 50dB instead, which is a massive difference. Checking that first would have been a good idea.

only the lower HD number needs to be audible. And -80 dB does not look so inaudible to me.
Here's a test which shows -75 dB to be quite audible. And that is not 'normal' THD but a so called THD-equivalent made out of pure H2/H3 .. supposed to be the least audible form of HD (by far).

If you use a pair of headphones with very low HD (e.g. DC Stealth), 80dB should be in the audible ballpark. Make those 80dB out of 'offensive' H5/H7/H9 and I'll even bet some money on audibility

MarkS · Nov 28, 2022

lashto said:
Here's a test which shows -75 dB to be quite audible.

No. The results of that test were that -75dB was indistinguishable from -175dB:

BLIND TEST RESULTS Part II: "Is high Harmonic Distortion in music audible?" Respondent Results

A blog for audiophiles about more objective topics. Measurements of audio gear. Reasonable, realistic, no snakeoil assessment of sound, and equipment.

archimago.blogspot.com

lashto · Nov 28, 2022

MarkS said:
No. The results of that test were that -75dB was indistinguishable from -175dB:View attachment 246534

BLIND TEST RESULTS Part II: "Is high Harmonic Distortion in music audible?" Respondent Results

A blog for audiophiles about more objective topics. Measurements of audio gear. Reasonable, realistic, no snakeoil assessment of sound, and equipment.

archimago.blogspot.com

congratulations, you did find the one graph that best matched your beliefs

And look, I can 'prove' the exact opposite. With 3 graphs!

If you want to 'give it a chance', take your time and read the entire part2 & part3 articles. There are 10+ graphs in there and a pretty clear (and I guess surprising for some) trend: the 75 dB variant was preferred by 'everyone'. Plus a lot of very useful comments/interpretations/conclusions/etc by the author and also the blog commenters.

Spkrdctr · Nov 28, 2022

MattHooper said:
I just came across this photo I took of the results for this blind test. It's the two different sheets of paper: the two columns on each sheet represent the 2 sets of trials we performed. The sheet on the left was my son's sheet. As described, he did the switching using an on-line random number generator to determine the order of the switching. The sheet on the right - listener sheet - is my sheet where I wrote down my "guesses" from the listening room, as to which pre-amp was being used. You can see the correlation there. But the reason I'm posting is that I find the second column - second trial - interesting. To me it sort of indicates the usefulness of using a random generator to determine the switching. Look at the right column and you'll see starting at #6, there are six Cs in a row - so the random generator had my son NOT switch for 6 times in a row! Pretty tricky!

I actually remember that point in the test when I'd yelled "switch" and I could tell I was hearing the CJ preamp, but then yelled switch again and thought..ok sounds the same, still the CJ...then I yelled switch again...."huh? No change in sound...still CJ".... yelled switch again...expecting a change. But again, sounded the same like the CJ so...ok...then yelled switch again..."what? sounds the same. It couldn't REALLY be the CJ again could it? Started to doubt myself but decided to just keep writing down what it sounded like to me, rather than trying to second-guess the switching. So....6 CJ's in a row. And it turned out I was accurate.

In previous blind tests my helpers have used a coin flip to randomize switching, but the on-line number generator made it easier.

View attachment 245955

Matt, I know this is an old thread and I just read the entire thread for the first time. Since i have had experience with numerous blind testing I was going to offer up some help. Well, my best intentions hit a brick wall. All of the tests I was involved in were solid state only. No one at that time thought of tubes as being anything but out of date old fashioned stuff. So, since I have ZERO experience with tubes (I have never heard a tube preamp and amp in my life) I can't offer any help or advice. You shut the Spkrdctr down. So, I can give my personal thoughts on your project. Here goes...
1. For a non-engineering type of test you did a very good job with everything you had at hand. I say that you went the extra mile and commend you for it.
2. You kept mentally checking yourself and tried to do the absolute best you could.
3. I recommend that many people follow in your footsteps and perform testing at home.
4. I always suggest using speakers rather than headphones. Speakers in your room are the great "leveler". The speakers go a long way to making most any equipment "sound the same".
So, all in all, you did a good job. It was enjoyable to read and it is obvious you are very open to learning. Keep up learning! You are already ahead of 90% of most people who like audio. Enjoyable thread, perfection is not always required. Sometimes the "process" itself can be a learning experience.

MarkS · Nov 28, 2022

lashto said:
congratulations, you did find the one graph that best matched your beliefs
And look, I can 'prove' the exact opposite. With 3 graphs!

Wrong again. Try looking at those graphs! The first two both show the -75dB distortion as sounding better than the the -175dB distortion. The 3rd shows a slight preference for -75 over -175, but has only 7 people; this small a difference is not statistically significant with that few people.

lashto · Nov 28, 2022

MarkS said:
Wrong again. Try looking at those graphs! The first two both show the -75dB distortion as sounding better than the the -175dB distortion. The 3rd shows a slight preference for -75 over -175, but has only 7 people; this small a difference is not statistically significant with that few people.

I just said that the -75dB sample was audible. "Sounding better" means the exact same thing: it's audibly different.
Plus, you do not need 7 people to prove that it's audible, a single one is more than enough. Apparently there were 5 in that test, also statistically relevant:

In total there were 5 respondents who selected the order correctly from lowest to highest distortion. ... we would expect 2-3 based on pure chance alone.

anyway, just keep your beliefs and be happy, no need to yell "wrong" at graphs who do not 'believe' the same.

MarkS · Nov 28, 2022

lashto said:
I just said that the -75dB sample was audible.

And that is wrong, as the first graph proved.

The second graph ("the" in your set of 3 links) shows that those who believed they heard a difference (no proof that they actually heard a difference) liked -75db distortion over -175db distortion by a slight margin. No calculation of statistical significance of the margin is supplied, and I'm too lazy to do it myself. My guess is that it's within one standard deviation of chance, which would make it meaningless.

lashto · Nov 29, 2022

actually, I only stated "quite audible" originally. Not exactly the same as "audible" and very far from the "surely audible for everyone" that you seem to be debating as "wrong". But anyway...

Someone recently posted an old (arab?) proverb going like:

"The truth will greatly annoy those whom it does not convince".

Looks like it does not even have to be "the truth", even the slightest hint/evidence pointing in a different direction will greatly annoy many people.
The two most vocal crowds, the everything-sounds-different 'audiophiles' and the everything-sounds-the-same 'objectivists', look quite the same to me. At least same as easily annoyed... also same as annoying

Gorgonzola · Nov 29, 2022

MarkS said:
Wrong again. Try looking at those graphs! The first two both show the -75dB distortion as sounding better than the the -175dB distortion. The 3rd shows a slight preference for -75 over -175, but has only 7 people; this small a difference is not statistically significant with that few people.

I guess I wonder about the relevance of statistical insignificance if only 7 people or indeed only 1 person can consistently hear a difference (in blind testing). If the trials involving those 7 people was too small, what if the number of trials were increased for these folks and the results were still that these folks could still hear differences, now at a statistical significance?

Would using the more selective group of participants be so-called "p hacking"?

To be clear, are we saying that because 97% of the general population, ("19 times of 20" or whatever), do not hear a difference, that there is none? Or, if only, say, 3% of people can reliably hear differences at level of statistical significance, can we assert differences can't be heard?

lashto · Nov 29, 2022

Gorgonzola said:
I guess I wonder about the relevance of statistical insignificance if only 7 people or indeed only 1 person can consistently hear a difference (in blind testing). If the trials involving those 7 people was too small, what if the number of trials were increased for these folks and the results were still that these folks could still hear differences, now at a statistical significance?

Would using the more selective group of participants be so-called "p hacking"?

To be clear, are we saying that because 97% of the general population, ("19 times of 20" or whatever), do not hear a difference, that there is none? Or, if only, say, 3% of people can reliably hear differences at level of statistical significance, can we assert differences can't be heard?

The "only 7 people" were actually a special case. A bunch of testers maintained that they can hear (all) differences and the ranking by preference does not make sense. So they got an extra chance to re-test by HD. And they ~nailed it. No idea how to calculate the relevance of that, but it's surely way higher than "only 7 people" or "some 7 out of 57".

Generally, I would consider something audible if a single person can hear it. Repeatedly!
But that is of course not the only (reasonable) interpretation.

My analogy would be the 100m run. You can organize 1000+ tests with random people and you'll 'prove' that it's impossible to run 100m under 10 seconds. But there are ~10 people in this world (out of 10 billions) who proved that it can be done. And nowadays it is considered Doable. And it became Doable after just one guy did it once.
~Same for the audibility of 20kHz. Test any 20 random people and chances are very good that none will hear 20kHz. But it's still considered Audible...

MarkS · Nov 29, 2022

OK, I did some calculations: if everyone was just guessing, the expected difference between the -75 rank (1, 2, 3, or 4) and the -175 rank would (of course) be zero, and the standard deviation would be Δ = √(10/(3n)), where n is the number of people guessing. (If anyone cares, I will provide details of the calculation.) For n=7, this is 0.69, which is much larger than the actual ranking difference of 2.29 - 2.00 = 0.29. So the results of the 7 are completely consistent with guessing.

For n=55, Δ = 0.24, to be compared with 2.38 - 2.09 = 0.29, which would happen p=11% of the time if everyone was just guessing, larger than the usual p=5% standard for statistical significance.

But by slicing and dicing the data in multiple ways, we are in p-hacking territory. The more ways you slice and dice, the more likely you are to find a large fluctuation in some particular slicing of random data.

A good follow-up would have been to re-test (multiple times) the people scoring well, and see how they do. Were they just lucky the first time, or could they really hear a difference?

Testing this would not be p-hacking, it would be a legitimate follow-up experiment.

Sadly it was not done.

ahofer · Nov 29, 2022

Gorgonzola said:
Nice work indeed, @MattHooper, and thanks.

That the sonic difference between these two preamp should be audible is, of course, what any dyed-in-the-wool subjectivist audiophile would expect. The nature of the differences would also be predicted by audiophiles. The "richer, more full bodied, more 'relaxed'" quality of the CJ would be fully predictable; likewise the greater transparency of the Benchmark. (On another forum I've been hearing from a subjectivist person who has had both CJ and Benchmark preamps whose impression are exactly these.)

For my part the the differences are predicted by measurements: thanks, @charleski, for the link to JA's CJ measurements. There the salient point is the very high 2nd order harmonic. What is also comment-worthy are the relatively low higher order harmonics beginning, maybe surprisingly, with the 3rd order.

Interesting. It would cost a lot less than the price of a boutique preamp to simulate/add this. Should be a "goodness" knob to turn up the 2nd harmonic (borrowing from .... Dartziel?). That would make a great follow-up test, if we could find the right device. Sort of like the Carver challenge.

rwortman · Dec 1, 2022

New to this thread. I find it interesting that we assume that the LA4 is essentially a wire and a switch. It is such a low distortion device, that’s probably correct. This was not a comparison of the Benchmark to the CJ, it was a test to see if one could hear the sound of the CJ being inserted into the chain, one preamp vs two in series. If the LA4 has any “sound” the other preamp’s sound was added to it. I doubt that using another switch would have changed the preference but that would have made the test more “pure”.

I used a tube preamp for about a decade. I sold it and bought an AVR. Soon after, I decided I wanted a dedicated two channel preamp again and bought a solid state one. No more need to have spares in the house and no more fooling around chasing line frequency hum. I know that it’s not a universal problem but an awful lot of tube gear has slightly audible 60hz noise that grates after a while.

xaxxon · Dec 1, 2022

pkane said:
20+ years ago I tried a few high-end CJ preamps that a local dealer let me borrow against no preamp (a DAC with a digital volume control). I could also tell the difference, including in a blind test with an audiophile buddy helping me out. In the audiophile parlance, CJ preamps sounded veiled to me, while no-preamp sounded natural and transparent. I felt like the sound became muddier, thicker, denser with CJ. I've not used a preamp since.

why people would want to "hear" their volume knob and source selector has always been beyond me. If you want to introduce distortion somewhere then I guess that's up to you, but why would you want your volume knob to do that?

Blind Test Results: Benchmark LA4 vs Conrad Johnson Tube Preamp

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Grand Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Major Contributor

Master Contributor

Addicted to Fun and Learning

Senior Member

Similar threads