
Amplifier Bakeoff: Purifi Eval1, McIntosh MA252 & Benchmark AHB2

Rottmannash

Major Contributor
Forum Donor
Joined
Nov 11, 2020
Messages
2,969
Likes
2,606
Location
Nashville
I bought the parts and assembled it myself. Purifi modules from Purifi, Ghent Case from Ghent and Hypex PS from Hypex.
Totally off topic but what did you end up spending on the amp?
 
OP

PJ2000

Member
Joined
Aug 19, 2021
Messages
26
Likes
86
Totally off topic but what did you end up spending on the amp?
I think I ended up paying around $1400, which is really dumb considering that VTV sells it for a few hundred dollars less. I didn't realize that they had a Purifi Eval-1 based design as they started out with just the Hypex buffer.
 

pogo

Major Contributor
Joined
Sep 4, 2020
Messages
1,242
Likes
382
Yeah, but we recalibrated each amplifier on each sample; as a result this would introduce only random variation.
Would this process have been necessary at all, i.e. are the differences not audible at any volume (possibly hidden details, impulse response, ...)?
I love my Purifis for their speed, their rendering of detail, and their separation.
 
Last edited:
OP

PJ2000

Member
Joined
Aug 19, 2021
Messages
26
Likes
86
Would this process have been necessary at all, i.e. are the differences not audible at any volume (possibly hidden details, impulse response, ...)?
I love my Purifis for their speed, their rendering of detail, and their separation.

It was interesting to see if a 'blind' test would result in the same subjective preference that I had.

Subjectively I preferred the Purifi in casual listening, and then quasi-objectively again in the test we did, since I ranked the amplifiers the same in all trials and the Purifi was always in position 1. My friend, who owns the Benchmark, argues that I have now been trained to prefer the sound of the Purifi because I use it most of the time, and as a result didn't like the Benchmark. Of course, if his assertion is true, it still implies that there are audible differences.
 

pogo

Major Contributor
Joined
Sep 4, 2020
Messages
1,242
Likes
382
On a few run of the mill speakers, I really couldn't tell much of a difference as they themselves just don't have the detail resolution of the S5
And that's probably why people don't notice such differences between amplifiers: the speakers they use are often a bottleneck that degrades the actual signal, masking the amplifier's potential. It looks like the Magico is not one of them and was a very good choice for this test.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,386
Likes
24,752
Location
Alfred, NY
And that's probably why people don't notice such differences between amplifiers: the speakers they use are often a bottleneck that degrades the actual signal, masking the amplifier's potential. It looks like the Magico is not one of them and was a very good choice for this test.
Can't you even come up with an original excuse? The "your equipment isn't resolving enough" line is old enough to collect Social Security.
 
OP

PJ2000

Member
Joined
Aug 19, 2021
Messages
26
Likes
86
Can't you even come up with an original excuse? The "your equipment isn't resolving enough" line is old enough to collect Social Security.

It is clearly a poor excuse if someone is just looking for an excuse, but if there were no case for improving the quality of some component in the chain, and no chance that those improvements could lead to better sound, then what exactly is the point of this forum? Could we reach the point where changes are inaudible? Sure, but we would need the facts to support that we were there, and to know what 'there' looks like.

I am sure that if anyone argued that all speakers sound alike, everyone would dispute the claim: it is easy to test and disprove, and more importantly, we all know it subjectively. Nor is it intellectually surprising why that is.

Speaking subjectively and from my own personal experience, I've had the luxury of spending time with many audio components of various types over the past 18 months, and I can hear differences; I find some sound 'better' to me. For me, better is defined by what I perceive to be closer to a real voice or a real instrument (I love stringed instruments such as guitar and violin, and piano) in a real venue. For day-to-day listening I have converged on some relatively inexpensive gear that frankly has piqued my interest in audio in a way that hasn't happened for a long time. I have also been fortunate enough to test some relatively expensive gear, and I could acquire it if I wanted to, but it hasn't been enough better in quality of experience to justify doing so. What I find most exciting are the new entrants that have driven down the cost/performance curve, making it possible for almost anyone to have experiences that would have cost an order of magnitude more less than a decade ago.

For under $1,000 anyone can buy what is objectively, and subjectively to me, one of the best amplifiers on the market. For a couple of hundred dollars one can purchase a DAC that has phenomenal specs and sounds fantastic, built around some of the most sophisticated audio processors on the market. For another couple of hundred dollars you can buy software that does room and headphone correction, which will make a bigger difference to your listening experience than anything else once you have any decent gear. My new favorite speaker is the Magnepan LRS. Yes, there are constraints such as room size, and you need to be willing to spend time tweaking placement, but it sounds amazing for voice and instruments; while it is obviously missing anything below 40 Hz in my room and can't get really loud, even Van Halen and Def Leppard have a clarity and spaciousness that is pretty amazing, let alone from a $650 speaker. Pretty much any hobbyist can afford to play with one and listen for themselves.

This forum is ultimately about the 'practice' of audio. So what I have noticed since I posted this thread is the plethora of responses from folks offering a lot of theory in a forum focused on the practical (testing of real gear). Why don't more of you test some of these items and see whether your experience in fact matches your expectations?

If you live in the SF Bay area, drop by and we can do some 'blind' testing.
 

eddantes

Addicted to Fun and Learning
Joined
May 15, 2020
Messages
707
Likes
1,385
I haven't had the patience to read through all of it... but has this thread grown to over 100 posts because the OP refuses to do a DBT (or simply a BT) correctly?

Dude - feel free to remain in the warm embrace of your presumption, but if you want to be sure of the truth there is no replacement for a proper blind test, which many here have advised you on how to do.

Anyways - best of luck to you, I'm unwatching.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,386
Likes
24,752
Location
Alfred, NY
I haven't had the patience to read through all of it... but has this thread grown to over 100 posts because the OP refuses to do a DBT (or simply a BT) correctly?
You got it.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,386
Likes
24,752
Location
Alfred, NY
If you live in the SF Bay area, drop by and we can do some 'blind' testing.
I would if I were still there; I escaped in 2009. :D

I've done a reasonable amount of testing of gear over the decades. Ditto a reasonable amount of controlled listening. So this isn't theoretical, this is Russell's Teapot.
 

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
3,454
Likes
4,218
Again, thanks for the comments and ideas; we will definitely factor those in next time we do a test.

But while the problems that have been pointed out are valid, they don't change the fundamental problem of explaining how I ended up with exactly the same sequence of 1, 2, 3 six times in a row. I would love for a true statistician on this thread to do a better job, but let me take a quick stab at it.

Hypothesis: it is unlikely, if not impossible, to discern audible differences between amplifiers whose bench-test results are beyond the thresholds of audibility. In other words, the standard suite of tests we perform (frequency response, noise, IMD, etc.) fully characterizes the audio response, such that two amplifiers (or DACs) with similar results should be indistinguishable. (Of course, we assume the listening test is performed in a linear response region, i.e. not clipping.)

Here are the real-world limitations in the testing that have been pointed out, and, more importantly, how each could SKEW the results in a particular direction. Random skew doesn't matter: with a sufficient number of samples, it averages out.

1. Inaccurate measurement of level due to the use of an acoustic reference instead of an electrical one. Random skew factor, as this would affect each amplifier measurement (1/36) equally.

2. Acoustic memory limitations. Random skew factor, as this would affect each amplifier measurement (1/36) equally.

3. Preamp impedance issues: this could be a systematic error that skews in a particular direction, but it seems highly unlikely in today's world of modern DACs. The D90SE has an output impedance of 100 ohms, while the Benchmark's input impedance is 50 kΩ and the Eval1's is 10.2 kΩ. It isn't clear why this would matter (a quick calculation follows this list).

4. Clipping: none of the amplifiers were clipping during the selections we listened to; levels were all moderate, since our interest was in hearing differences, not in how loud they could get. That doesn't seem a plausible reason to invalidate the results.
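
To put rough numbers on item 3, here is a minimal Python sketch of the level loss from the voltage divider formed by the DAC's output impedance and each amplifier's input impedance, assuming purely resistive loading (the impedance values are the ones quoted above):

```python
import math

def loading_loss_db(z_out: float, z_in: float) -> float:
    """Level loss in dB from the divider formed by the source output
    impedance and the amplifier input impedance."""
    return 20 * math.log10(z_in / (z_in + z_out))

Z_DAC = 100.0  # D90SE output impedance in ohms (from the post above)

for name, z_in in [("Benchmark (50 kOhm)", 50_000.0), ("Eval1 (10.2 kOhm)", 10_200.0)]:
    print(f"{name}: {loading_loss_db(Z_DAC, z_in):+.3f} dB")
```

Both losses are flat and under 0.1 dB (about -0.017 dB vs. -0.085 dB, roughly 0.07 dB apart), consistent with the point that the impedance difference shouldn't matter.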

So what is the probability of randomly picking the same order in all 6 tests? It is 1 in 46,656 (a sketch of the arithmetic follows below). In this case the results strongly favor an audible difference that can't be explained by chance. Even if the odds were reduced because only 3 of the 6 tests were valid, the method's errors being random, that is still 1 in 216, which still doesn't favor the explanation that the ranking is random and the differences therefore inaudible.

Here is what I would strongly recommend: why don't a few others repeat the tests and see what you get? There is nothing like actually using the scientific method and testing a hypothesis, versus theorizing about it. Remember, the basis for this hypothesis is that bench testing can measure anything we can discern audibly. How well have we tested that hypothesis? If anything, with the specs of every new DAC and amplifier approaching the limits of test equipment, testing it should be easier and easier.
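
Here is a minimal Python sketch of the chance arithmetic, assuming that under the null hypothesis of inaudibility each trial independently produces one of the 3! = 6 rank orders with equal probability:

```python
from math import factorial

orders = factorial(3)  # 3 amplifiers -> 6 possible rank orders per trial

# Null hypothesis: rankings are random guesses, each order equally likely.
p_specific = (1 / orders) ** 6    # one pre-specified order in all 6 trials
p_consistent = (1 / orders) ** 5  # any order, as long as all 6 trials agree
p_three = (1 / orders) ** 3       # the reduced case: only 3 valid trials

print(f"pre-specified order, 6 trials:  1 in {round(1 / p_specific):,}")
print(f"any consistent order, 6 trials: 1 in {round(1 / p_consistent):,}")
print(f"pre-specified order, 3 trials:  1 in {round(1 / p_three):,}")
```

Note that 1 in 46,656 applies when the winning order is specified in advance; if any repeated order counted as a hit, the null probability would be 1 in 7,776, still far from likely by chance.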
Why did you omit the problem pointed out that you didn't bench test the actual amps that were used? Including the electrical FR when hooked up to the speakers used, measured at the speaker terminals? (Although you could do this acoustically too, with a bit less precision.)

After all, that's the elephant in the room.

As for methodology, someone asked early on (#10) for the logbook/notes data from the test. This might reveal a breakdown in randomness or method. But see the paragraph above: you don't even know whether the amps measured near enough to the same in your setup. This is effectively a deal-breaker; any methodology issues are secondary.

cheers
 

pogo

Major Contributor
Joined
Sep 4, 2020
Messages
1,242
Likes
382
This forum is ultimately about the 'practice' of audio and practically is. So, what I have noticed since I posted this thread is the plethora of responses from a lot of folks who seem to have a lot of theoretical responses in a forum which is focused on the practical (testing of real gear). Why don't more of you test some of these items out and see if your experience in fact matches what your expectations are?
Correct.
But even the theory does not always seem to be fully reflected here, if that is possible at all. A good example is the repeated reference to Benchmark's DF paper.
The short paragraph on '7.4.3 Damping factor' is sufficient to understand how complex the theory can be, and I think this one is closer to reality:
Link
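
For a feel of the purely resistive part of that theory, here is a minimal Python sketch of how damping factor translates into response deviation at the speaker terminals. The speaker impedance extremes are hypothetical, standing in for a typical 8-ohm-nominal speaker, and the DF values of 100 and 300 are the ones discussed elsewhere in this thread:

```python
import math

def level_db(z_speaker: float, z_out: float) -> float:
    """Level at the speaker terminals relative to an ideal zero-ohm source."""
    return 20 * math.log10(z_speaker / (z_speaker + z_out))

# Hypothetical impedance extremes of an 8-ohm-nominal speaker
Z_MIN, Z_MAX = 4.0, 30.0

for df in (100, 300):  # damping factors relative to 8 ohms
    z_out = 8.0 / df
    ripple = level_db(Z_MAX, z_out) - level_db(Z_MIN, z_out)
    print(f"DF {df}: Zout = {z_out * 1000:.0f} mOhm, worst-case deviation = {ripple:.3f} dB")
```

Even at DF 100 the resistive deviation stays around 0.15 dB with this load; the frequency-dependent reactive effects the linked chapter discusses come on top of that.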
 

peng

Master Contributor
Forum Donor
Joined
May 12, 2019
Messages
5,615
Likes
5,168
Correct.
But even the theory does not always seem to be fully reflected here, if that is possible at all. A good example is the repeated reference to Benchmark's DF paper.
The short paragraph on '7.4.3 Damping factor' is sufficient to understand how complex the theory can be, and I think this one is closer to reality:
Link

Fully reflected or not, one relevant point to me is that Benchmark's article does make logical and practical sense. I don't know how special your speaker is that it would be so sensitive to a DF that seems high enough for most other speakers.

I hope you will watch the AH YouTube interview with Dr. Sean Olive by Gene and @Matthew J Poes (hopefully he might chime in :)). If you do, I would like your comments on why the participants could not easily identify the difference between the German- and Danish-voiced speakers, which were identical except that their crossovers were voiced differently, when the FRs were visually quite different, definitely much more different than one would expect if the differences were due to, say, an amp DF of 100 vs. 300. To Harman's credit, they apparently let the German-voicing designer go because they couldn't justify the higher cost of that voicing (more crossover parts cited as one reason).


In addition to Benchmark's, you may be interested in the following (IIRC, Benchmark's article made reference to the second one, D. Pierce's):

 

Willem

Major Contributor
Joined
Jan 8, 2019
Messages
3,659
Likes
5,277
I don't think it's because of superhuman hearing (at my age I certainly don't have that anymore), but the setup used plays a very big role. And it is not really an unexpected observation, because well-known manufacturers talk about it, and they should know best.
It does in fact assume superhuman hearing acuity, because all properly conducted listening tests show that the measured differences between excellent amplifiers such as these are below the threshold of human hearing.
You put a great deal of faith in amplifier manufacturers. Quad's Peter Walker disagreed with you, and so does RME with regard to their DACs.
 
Last edited:

pogo

Major Contributor
Joined
Sep 4, 2020
Messages
1,242
Likes
382
You put a great deal of faith in amplifier manufacturers.
Recording studios and their engineers also swear by this product, especially because of its high DF in the mid-to-high range:
Link
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,386
Likes
24,752
Location
Alfred, NY
Recording studios and their engineers also swear by this product, especially because of its high DF in the mid-to-high range:
Link
You do love your advertisements, don't you?
 

Newman

Major Contributor
Joined
Jan 6, 2017
Messages
3,454
Likes
4,218
I would like your comments on why the participants could not easily identify the difference between the German- and Danish-voiced speakers, which were identical except that their crossovers were voiced differently, when the FRs were visually quite different,
This is interesting, because it is inconsistent with something in Toole's book first edition.

[Attached image: blind vs. sighted preference ratings for the speakers, from Toole's first edition]


Look at the first 2 speakers (black and white). When blind, they score quite closely together. Even the error bars are longer than the difference in score. Yet when sighted, they score exactly the same relative to one another. Why? Because they were visually identical. The only difference between them was a tweaked crossover to cater for presumed regional variations in sonic tastes.

Now, one could argue that this is not inconsistent with what you reported above from Olive, because the error bars overlap and are bigger than the difference in ratings, so maybe the speakers are indeed difficult to tell apart. But the perfect consistency between the sighted and blind tests, i.e. the same one was rated better and by the same amount, is interesting and hints at an ability to tell them apart. (I suppose what it really tells us is that we needed to run that test with more subjects, until the error bars are smaller than the difference in ratings; see the sketch below.)
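
On that parenthetical, here is a minimal Python sketch of how the error bars shrink with subject count. The rating standard deviation and the true difference are made-up numbers, not Toole's data:

```python
import math

SD = 1.0     # hypothetical std dev of individual preference ratings
DIFF = 0.25  # hypothetical true difference between the two speakers' mean ratings

for n in (10, 40, 160):
    sem = SD / math.sqrt(n)  # standard error of the mean shrinks as 1/sqrt(n)
    verdict = "resolves the difference" if sem < DIFF else "error bars too long"
    print(f"n = {n:3d}: SEM = {sem:.3f} -> {verdict}")
```

Halving the error bars takes four times the subjects, which is why underpowered panels leave this kind of ambiguity.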

But one generic answer to your query is that there is no reason to expect a measured difference to be audible. We know perfectly well that we can often measure things below our ability to hear differences. Also, differences at the crossover frequencies might not apply equally to differences at the frequencies DF affects.

cheers
 

pogo

Major Contributor
Joined
Sep 4, 2020
Messages
1,242
Likes
382
You do love your advertisements, don't you?
It was not intended as an advertisement, but simply to show that this behavior also plays a major role in the professional sector.
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,554
Likes
1,701
Location
California
To save me the trouble of reading all 6 pages, was the trial data posted with basic statistical analysis?

Edit: looks like the answer is no. I mean, even an entry in a junior high science fair requires SOME data.
 
Last edited:

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,386
Likes
24,752
Location
Alfred, NY
It was not intended as an advertisement, but simply to show that this behavior also plays a major role in the professional sector.
It's an advertisement. Oh, sorry, it's a LINK to an advertisement, sooooo different.
 