
Integrated Amplifier Comparison: $90 vs $9000

This is an interesting and thoughtful thread.

I've certainly measured things which I thought should have been audible, but were not audible to me (especially distortion), and I believe I've heard things which in theory should not be significantly audible, but I felt they were. I've also taken part in professionally run double-blind tests and have had training on detecting artefacts.
 
A 2 dB difference at specific frequencies is *very* audible.

Thanks for the detailed experiment. Great work.
 
View attachment 302917

View attachment 302918


Interestingly, the spike in the 1:23 region is when the audience is shouting/clapping or the guitars start playing for the first time, and 2:10 is exactly when the lyrics come in on my recording of the music. Not sure if that matters, but they are definite correlates to changes in the music. Meanwhile, the first minute of Hotel California is a lot of the same instrumentals.

I forgot to mention, this is the Hell Freezes Over XRCD mastering that was digitized, not the original.

Is there an RMS null graph? I pasted the logs in the original post

Difference (rms) = -58.08dB [-60.46dBA]
Correlated Null Depth=49.58dB [43.33dBA]
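For readers unfamiliar with null testing, here is a minimal sketch of what a difference (rms) figure represents: subtract two time-aligned, gain-matched recordings and express the RMS of the residual in dB. This is only an illustration on a synthetic tone; the function names are made up here, and DeltaWave's own alignment, drift correction and "Correlated Null Depth" calculation are more involved than this.

```python
import numpy as np

def rms_db(x):
    """RMS level of a signal in dB relative to full scale (1.0)."""
    return 20.0 * np.log10(np.sqrt(np.mean(np.square(x))))

def difference_rms_db(reference, comparison):
    """RMS level in dB of the sample-by-sample difference of two aligned signals."""
    return rms_db(np.asarray(comparison) - np.asarray(reference))

# Synthetic illustration (not real data): a 1 kHz tone, where the
# comparison copy has a 0.1 dB gain error and is otherwise identical.
fs = 48000
t = np.arange(fs) / fs
reference = 0.1 * np.sin(2 * np.pi * 1000 * t)
comparison = reference * 10 ** (0.1 / 20)

diff_db = difference_rms_db(reference, comparison)
# Simple null depth: how far the residual sits below the reference level.
null_depth_db = rms_db(reference) - diff_db
```

Even a 0.1 dB level mismatch on its own limits this simple null to roughly 39 dB, which is why gain matching matters before reading anything into the residual.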

View attachment 302919

View attachment 302920
View attachment 302921
View attachment 302922
View attachment 302923
View attachment 302924
View attachment 302925

Sorry, missed the "spoiler" hiding the DeltaWave log in the OP.

Yes, this looks correct to me, I don't see anything in the results that would point to an error in the nulling operation.
 
You really need to do a blind test. I've lost count of the number of times I heard "significant differences" when I was doing direct comparisons and was proven wrong when someone did the change without me knowing about it. In most of the tests, some of which I prepared myself for, I went from having a near 100% certainty to being uncomfortable and then shocked by the numerical result.
Yawn
 
I still see a blind test as essential.
I did a level-matched double-blind test of the AIYIMA A07 vs. a linear amp and was able to tell the difference; I think the test can be found by searching here at ASR. The difference was simply in FR, because the A07's FR depends on the complex load. That is a fact that does not need to be doubted.
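As a rough illustration of load-dependent FR, the amplifier's output impedance and the speaker's impedance can be treated as a voltage divider. The sketch below assumes a purely resistive 0.5 ohm output impedance and two made-up speaker impedance values; none of these numbers come from the A07 or any measured speaker, and a real class-D output filter has a reactive, frequency-dependent impedance that this ignores.

```python
import numpy as np

def level_at_speaker_db(z_out, z_load):
    """Level at the speaker terminals relative to the unloaded amplifier output, in dB."""
    return 20.0 * np.log10(np.abs(z_load / (z_out + z_load)))

# Hypothetical values, for illustration only.
z_out = 0.5              # ohms, assumed amplifier output impedance
z_load_dip = 3.0 + 0j    # ohms, assumed speaker impedance minimum
z_load_peak = 20.0 + 0j  # ohms, assumed speaker impedance maximum

dip = level_at_speaker_db(z_out, z_load_dip)
peak = level_at_speaker_db(z_out, z_load_peak)
# Level swing across the band caused purely by the impedance variation.
swing = peak - dip
```

With these assumed numbers the swing comes out a little over 1 dB, which is the kind of FR change a level-matched blind test could plausibly pick up.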
 
I should be able to get this with a simple multimeter at the outputs. Let me do this later this week.

A multimeter won’t do it, unless it’s a 6.5-digit meter with wide bandwidth and a mV scale. You are talking microvolts, up to a mV.

Your ADC can, if you’ve calibrated it.
 
I continue to hear differences between amplifiers that might be expected to be inaudibly different. That is, for two high-SINAD systems that I'm not running into clipping, I hear differences. I also believe in measurements. My perspective has been consistent: "Everything that can be measured may not be audible. Everything that is audible is measurable -- but you need to select the right measurement."
Very well said.

But objectivists are far too quick to dismiss subjectivists’ impressions. That’s a real problem.
 
Very well said.

But objectivists are far too quick to dismiss subjectivists’ impressions. That’s a real problem.

Sort of. I think it is best when people try not to categorize an entire group of people. At the end of the day we listen, so experience always matters. Likewise, we are all humans and subject to bias and placebo effect. It’s the fundamental nature of experiences. Finding that balance is important.
 
Two amplifiers with "good electrical performance" sounded different to my ears. While we know the Fosi V3 is load dependent, it translated into more of an issue than might be predicted.
That’s a subjective assessment.

I was able to measure differences using a UMIK-2 that seem to correlate with my subjective experience
Correlation does not mean causation. This is really my biggest issue with these conclusions. There is nothing to prove causation. It's like a cable manufacturer putting up a random plot and then saying: see, this is the difference you hear! You will need to try to falsify your claims. That is where the real knowledge lies.

And you can say: but every time I do this I find something that correlates! That may be, but this is like a self fulfilling prophecy: you’ll always find something that you can point to and say: look this is it! But without proof this remains just speculation. While it’s fun to do that, it doesn’t bring the knowledge we seek.

we scrutinize the vocal region, there are differences in the recorded, un-corrected comparisons. Again, RMS volume mismatch is 0.05 dB different, but the biggest delta in this region is as high as 2 dB which could be audible.
Looks like the high frequency differences. 2 dB difference in signal does not need to translate into a 2dB difference in the frequency domain. But as we see below, in this case that may just be true.


Looking at the difference in spectra, you can see that the spectra are pretty consistent from 30 to 300 Hz, again suggesting that my measurements are reasonably done, but you do see a bigger difference as you move up the frequency range which is even bigger than what is seen in the bass region.
This is what you expect to see when measuring in-room. Measure twice with the Marantz, and you’ll see similar noise in HF. But what is clear from this is the high frequency deviation, which seems to be up to 2 dB if you look through the noise.

No wonder this is audible…

Next, try to falsify your conclusions: EQ the Fosi to the Marantz response, and redo the blind test. See how well you fare with that.
 
Again: Triple YES ;)

Again polluting another topic. Why don't you create your own topic where you can go to the core of the matter, and consolidate all knowledge on damping factor? Same effort with a much better outcome. Unless there's nothing to report, and we can leave it there.
 
Very well said.

But objectivists are far too quick to dismiss subjectivists’ impressions. That’s a real problem.
If people are dismissing properly controlled and carefully executed listening impressions that is a real problem.

But uncontrolled and sighted listening impressions don't bring any useful information to the discussion. They don't mean anything to anyone other than the subjectivist who listened. In these cases, the main problem comes when people accept uncontrolled listening impressions as valid data - that just takes everyone off down a rabbit hole.
 
That’s a subjective assessment.


Correlation does not mean causation. This is really my biggest issue with these conclusions. There is nothing to prove causation. It's like a cable manufacturer putting up a random plot and then saying: see, this is the difference you hear! You will need to try to falsify your claims. That is where the real knowledge lies.

And you can say: but every time I do this I find something that correlates! That may be, but this is like a self fulfilling prophecy: you’ll always find something that you can point to and say: look this is it! But without proof this remains just speculation. While it’s fun to do that, it doesn’t bring the knowledge we seek.


Looks like the high frequency differences. 2 dB difference in signal does not need to translate into a 2dB difference in the frequency domain. But as we see below, in this case that may just be true.



This is what you expect to see when measuring in-room. Measure twice with the Marantz, and you’ll see similar noise in HF. But what is clear from this is the high frequency deviation, which seems to be up to 2 dB if you look through the noise.

No wonder this is audible…

Next, try to falsify your conclusions: EQ the Fosi to the Marantz response, and redo the blind test. See how well you fare with that.
I'm late to the party, but just to add to this.

I'm really struggling with the comparison data being recorded via a microphone. It seems to me that doing this there are just too many environmental variations that can cause measured differences that are not due to the amplifiers under test.

A plane going overhead (or dog barking in the distance, or... any of a million other possible environmental noises) - a different recording.

Same for any changes in the room causing changes to reflections - such as the person carrying out the test not being in exactly the same position, or small movements of the measurement mic.

What about heating devices switching on/off - (speculation warning) does sound passing through a volume of warm/less dense air get measurably impacted? Even if only enough for a small phase shift to alter interference patterns?
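To put a rough number on the speculation above, here is a back-of-envelope sketch using the common linear approximation for the speed of sound in air, c ≈ 331.3 + 0.606·T m/s with T in °C. The 3 m path, 10 kHz frequency and 1 °C change are made-up example values, and the sketch assumes the whole path warms uniformly, which a heater cycling in one corner of a room would not do.

```python
def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c

def phase_shift_deg(freq_hz, path_m, temp_a_c, temp_b_c):
    """Phase change at freq_hz when the air along path_m goes from temp_a_c to temp_b_c."""
    t_a = path_m / speed_of_sound(temp_a_c)  # travel time before the change
    t_b = path_m / speed_of_sound(temp_b_c)  # travel time after the change
    return 360.0 * freq_hz * (t_b - t_a)

# Hypothetical example: 3 m speaker-to-mic path, 10 kHz, air warming from 20 to 21 degrees C.
shift = phase_shift_deg(10000.0, 3.0, 20.0, 21.0)
```

Under these assumptions the shift at 10 kHz is tens of degrees, so in principle temperature drift between two recordings could show up in a phase comparison at high frequencies. Whether it matters in a real room over a few minutes is a separate question.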

Surely if we are looking for differences in electronics it would be far more accurate to measure electrically at the speaker terminals. Probably easier too once an appropriate filter has been built.
 
That’s a subjective assessment.
Well yeah, that’s the point of a conclusion/interpretation…

Correlation does not mean causation. This is really my biggest issue with these conclusions. There is nothing to prove causation. It's like a cable manufacturer putting up a random plot and then saying: see, this is the difference you hear! You will need to try to falsify your claims. That is where the real knowledge lies.

Yes. That's why I used the word correlation. But cable manufacturers don’t put a random plot measured at the acoustic level. You cannot place measurements as the gold standard if you are not willing to accept measurements that seem to disagree with your world view. If you look at our ability to measure cables *electrically* we can do so and achieve statistical significance. Clearly inaudible, but we would expect the difference to be inaudible given the t

And you can say: but every time I do this I find something that correlates! That may be, but this is like a self fulfilling prophecy: you’ll always find something that you can point to and say: look this is it! But without proof this remains just speculation. While it’s fun to do that, it doesn’t bring the knowledge we seek.

The scientific method involves testable hypotheses. Forming the hypothesis is based upon initial observations. Forming testable hypotheses after an observation is exactly the right way to explore this.

Measure twice with the Marantz, and you’ll see similar noise in HF.
I no longer have the Marantz SA-10/PM-10 setup, but a negative control that records the same speaker/amplifier twice to see run-to-run variability makes a lot of sense. Let's record some music twice with a UMIK-2. I'll use the same Hotel California opening. This time I'm using my Yamaha CX-A5100 which, based upon my measuring level, is around 80 dB SINAD.

1698503924299.jpg


Deltawave Log
DeltaWave v2.0.10, 2023-10-28T07:32:16.6750696-07:00
Reference: Recording1.flac[L] 1824000 samples 48000Hz 24bits, mono, MD5=00
Comparison: Recording2.flac[L] 1824000 samples 48000Hz 24bits, mono, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -500dB
Correct Non-linearity: False
Correct Drift:True, Precision:30, Subsample Align:True
Non-Linear drift Correction:False
Upsample:False, Window:Kaiser
Spectrum Window:Kaiser, Spectrum Size:32768
Spectrogram Window:Hann, Spectrogram Size:4096, Spectrogram Steps:2048
Filter Type:FIR, window:Kaiser, taps:262144, minimum phase=False
Dither:False bits=0
Trim Silence:True
Enable Simple Waveform Measurement: False

Discarding Reference: Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s

Initial peak values Reference: -23.521dB Comparison: -23.543dB
Initial RMS values Reference: -41.494dB Comparison: -41.491dB

Null Depth=29.663dB
Trimming 0 samples at start and 0 samples at the end that are below -90.31dB level

X-Correlation offset: -3 samples
Trimming 0 samples at start and 0 samples at the end that are below -90.31dB level

Drift computation quality, #1: Excellent (0.06μs)


Trimmed 56720 samples ( 1181.666667ms) front, 10170 samples ( 211.875ms end)


Final peak values Reference: -23.521dB Comparison: -23.55dB
Final RMS values Reference: -41.626dB Comparison: -41.631dB

Gain= 0.0071dB (1.0008x) DC=0 Phase offset=-0.072184ms (-3.465 samples)
Difference (rms) = -71.33dB [-79.73dBA]
Correlated Null Depth=48.16dB [51.41dBA]
Clock drift: -0.01 ppm


Files are NOT a bit-perfect match (match=2.23%) at 16 bits
Files are NOT a bit-perfect match (match=0.01%) at 24 bits
Files match @ 50.0011% when reduced to 11.28 bits


---- Phase difference (full bandwidth): 2.09392702550831°
0-10kHz: 2.19°
0-20kHz: 2.11°
0-24kHz: 2.09°
Timing error (rms jitter): 2.8μs
PK Metric (step=400ms, overlap=50%):
RMS=-75.7dBFS
Median=-76.0
Max=-72.1

99%: -72.74
75%: -74.72
50%: -76.04
25%: -77.26
1%: -80.04

gn=0.999185202751952, dc=0, dr=-7.19272397044365E-09, of=-3.4648298187

DONE!

Signature: c887dfc97a45cb753f99842786672a03

RMS of the difference of spectra: -119.444168903886dB
DF Metric (step=400ms, overlap=0%):
Median=-31.9dB
Max=-11.5dB Min=-38.3dB

1% > -38.33dB
10% > -36.48dB
25% > -34.02dB
50% > -31.9dB
75% > -27.23dB
90% > -18.32dB
99% > -2.49dB

Linearity 24.3bits @ 0.5dB error
---- Phase difference (full bandwidth): 7.94038835242244°
0-10kHz: 12.29°
0-20kHz: 8.70°
0-24kHz: 7.94°
Linearity 24.3bits @ 0.5dB error
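A few of the figures in the log above are the same quantity in two units, and they can be sanity-checked with simple arithmetic. The sketch below only uses numbers that appear in the log (48000 Hz sample rate, gain, phase offset, trimmed samples); the variable names are my own.

```python
fs = 48000  # sample rate reported in the log (Hz)

# Gain: 0.0071 dB expressed as a linear voltage ratio (log reports 1.0008x).
gain_linear = 10 ** (0.0071 / 20)

# Phase offset: -0.072184 ms expressed in samples (log reports -3.465 samples).
offset_samples = -0.072184e-3 * fs

# Trimmed samples expressed in milliseconds (log reports 1181.666667 ms and 211.875 ms).
trim_front_ms = 56720 / fs * 1000
trim_end_ms = 10170 / fs * 1000
```

The values agree with the log to the precision it prints, so the gain and timing figures are internally consistent.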

You can see with the different speaker, the high-pass filter really attenuates the response in the bass
1698504485025.png


1698504587849.png


You do see the HF difference, but most of it stays below +/- 2 dB whereas the earlier comparison was +6 dB/-4 dB at the extremes. If you say that the spikes at 3 and 5 kHz are ambient noise, it stays under +/- 1 dB to about 6 kHz. In the original comparison you got to +/- 1 dB at around 2 kHz

Most important, the PK Metric shows much greater similarity.
1698504728894.png


This might be approaching the limit of my CX-A5100. I will try recording at a similar dBFS as the original.

Interestingly, the phase comparisons between the original comparison and this one show that the phase was more variable between the two amplifiers than between the repeated measurements of the same setup.

1698504889400.png


Next, try to falsify your conclusions: EQ the Fosi to the Marantz response, and redo the blind test. See how well you fare with that.

I've done a blind ABX test with the un-EQ'd recordings. See this post. Since I no longer have the Fosi or Marantz (but I do have the recordings), what do you suggest I use as the EQ? Since the differences were FR based, I am willing to bet that EQ’ing the recordings will make it harder to hear the difference. That would strengthen the conclusion that two high 1kHz/5W SINAD amplifiers can sound different into actual speakers at levels well below clipping.
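For anyone repeating the ABX with EQ'd files, a minimal sketch of how to judge the score: the one-sided binomial probability of getting at least that many trials right by pure guessing. The 12-of-16 score below is a made-up example, not the result from the linked post.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` ABX trials by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical score: 12 correct out of 16 trials.
p_12_of_16 = abx_p_value(12, 16)
```

For this hypothetical score the probability is about 0.038, under the usual 0.05 threshold. If the EQ'd recordings push the score back toward chance, that would support FR as the cause.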
 
I'm late to the party, but just to add to this.

Surely if we are looking for differences in electronics it would be far more accurate to measure electrically at the speaker terminals. Probably easier too once an appropriate filter has been built.

Measuring electrically definitely makes it easier to see differences. I have measured electrical output through a speaker load.
I have done this for DACs which are different and DACs where the results are similar.

The criticism of measuring electrically is that when I measure differences between amplifiers or DACs, the claim is that it’s inaudible…
The criticism of measuring acoustically is that when I measure differences between amplifiers, the claim is that it’s just run-to-run-variability…

So how else would someone quantify a perceived subjective difference between amplifiers through speaker loads at realistic volumes?
 
The criticism of measuring electrically is that when I measure differences between amplifiers or DACs, the claim is that it’s inaudible…

Are you saying if you measure the effect you are demonstrating here with a mic, but electrically instead, that the differences will be much lower and below the accepted level of audibility?

If so, then that tells you that measuring acoustically is creating bigger differences than actually exist electrically (from the amps). If this is the case it is proof of the unsuitability of the measurement you are doing.

The criticism of measuring acoustically is that when I measure differences between amplifiers, the claim is that it’s just run-to-run-variability…

Well, yes - if the acoustic measurements show greater differential than the electrical measurements of the same change, then that is exactly what they are.
 
Are you saying if you measure the effect you are demonstrating here with a mic, but electrically instead, that the differences will be much lower and below the accepted level of audibility?
No.

I am saying that I have only so much time to do the measurements when I am lent something like the Fosi. So I haven’t taken a comparison electrically and acoustically of the same setups. That is a good idea and something to try next time.

What I am saying is that I started doing electrical measurements then switched to acoustic measurements.

If so, then that tells you that measuring acoustically is creating bigger differences than actually exist electrically (from the amps). If this is the case it is proof of the unsuitability of the measurement you are doing.

Clarified above. That does not apply but is a great thought for the next round of testing.

I would say that acoustically the differences are smaller. Notice that comparisons of the same speaker twice have a PK Metric in the -70s dBFS, but electrical measurements of the same amplifier twice (1 day apart) will be nearly -120 dBFS.

IMG_0152.png


So when I have the electrical differences between these two amps, it’s real.

The Fosi vs PM10 electrical measurement has a PK Metric of -47 dBFS for a recording with these levels:

Initial peak values Reference: -10.514dB Comparison: -10.419dB
Initial RMS values Reference: -29.814dB Comparison: -29.761dB

EDIT
In other words:

Measuring the same amplifier twice, one day apart, results in a null of nearly -120 dBFS. The E1DA Cosmos ADC and a good reference Marantz amplifier are consistent from day to day.

Measuring two amplifiers one immediately after the other results in certain electrical areas having a PK Metric in the -40s dBFS for a recording that is -10 dBFS peak.

When measured electrically, people still complain that this is an artifact of testing.

Likewise, I don’t have any magic. I am using a relatively affordable E1DA Cosmos and UMIK-2 along with @pkane's donationware software. Other readers are free to try to replicate my experiments…

Additionally, there is probably selection bias. When I don’t hear differences, I haven’t gone out and taken the time to measure (except the early comparison between the UB9000 and SA-11s1). When I hear a difference, I spend my free time measuring.
 
But cable manufacturers don’t put a random plot measured at the acoustic level.
It doesn’t matter what graph you show. As long as there is no proof of causation it’s just speculation.

You cannot place measurements as the gold standard if you are not willing to accept measurements that seem to disagree with your world view.
I see nothing that agrees with my world view. This is about the scientific method and trying to figure out why things happen. You generally tend to stop at speculation. And I think that is a problem, because it invites others to speculate and extrapolate, leading to conclusions that might not be correct.


The scientific method involves testable hypotheses. Forming the hypothesis is based upon initial observations.
I have no problem with this. I also have no problem with speculation. I have a problem when it stops there, because people will run with it and try to proclaim it as fact.
Forming testable hypotheses after an observation is exactly the right way to explore this.
Yes, let’s go then!
You do see the HF difference, but most of it stays below +/- 2 dB whereas the earlier comparison was +6 dB/-4 dB at the extremes. If you say that the spikes at 3 and 5 kHz are ambient noise, it stays under +/- 1 dB to about 6 kHz. In the original comparison you got to +/- 1 dB at around 2 kHz
I said there is a clear variation in the HF… totally expected given the load dependence, and totally within the realm of audibility.

I've done a blind ABX test with the un eq'd recordings. See this post. Since I no longer have the Fosi or Marantz (but I do have the recordings), what do you suggest I use as the EQ? Since the differences were FR based, I am willing to bet that EQ’ing the recordings will make it harder to hear the difference.
Can one extract the impulse response from the recordings in DeltaWave? If so, it should be trivial to create a filter that converts the Fosi impulse to the Marantz impulse.
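A minimal sketch of that idea, assuming the two impulse responses are already available as arrays (whether and how DeltaWave exports them is not something this sketch covers). It uses regularized division in the frequency domain; `matching_filter`, `n_fft` and `eps` are names and defaults chosen here for illustration, and the short arrays stand in for real measured responses.

```python
import numpy as np

def matching_filter(source, target, n_fft=65536, eps=1e-6):
    """FIR filter that, convolved with `source`, approximates `target`.

    Uses regularized spectral division so that frequencies where the
    source has almost no energy do not blow up the filter gain.
    """
    S = np.fft.rfft(source, n_fft)
    T = np.fft.rfft(target, n_fft)
    H = T * np.conj(S) / (np.abs(S) ** 2 + eps)
    return np.fft.irfft(H, n_fft)

# Toy check with made-up responses: build a target by filtering the source
# with a known filter, then see whether that filter is recovered.
source_ir = np.array([1.0, 0.5])
known = np.array([1.0, -0.2, 0.1])
target_ir = np.convolve(source_ir, known)
recovered = matching_filter(source_ir, target_ir, n_fft=64)
```

In the toy case the first three taps of `recovered` match `known` closely. With in-room recordings the HF noise discussed earlier would leak into the filter, so smoothing or limiting the correction to the band where the two runs are repeatable is probably wise.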
 