OK, understood. Since the digital part of the test setup wasn't changed between any of these tests, any constant clock drift between DAC and ADC should not be an issue, as it would be the same in all captures. In fact, it's safe to turn off drift measurement/correction in DeltaWave (and I've done it for @GXAlan's recordings) -- it made no measurable or audible difference in the result. There are much larger analog errors than the 1-2 ppm clock drift. This was also verified by capturing two files from each of the amps and then comparing them to each other -- the RMS null was well below -100 dB.
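For anyone wanting to reproduce that RMS-null figure outside DeltaWave, here is a minimal sketch of the computation (my own illustration, not DeltaWave's exact math; file names and the use of numpy/scipy are my assumptions). It presumes two mono, sample-synced, level-matched captures of the same material:

```python
# Minimal sketch: RMS null depth between two time-aligned captures.
import numpy as np
from scipy.io import wavfile

fs_a, a = wavfile.read("amp_capture_1.wav")   # hypothetical file names
fs_b, b = wavfile.read("amp_capture_2.wav")
assert fs_a == fs_b, "sample rates must match"

a, b = a.astype(np.float64), b.astype(np.float64)
n = min(len(a), len(b))                       # trim to common length

residual = a[:n] - b[:n]
null_db = 20 * np.log10(np.sqrt(np.mean(residual**2)) /
                        np.sqrt(np.mean(a[:n]**2)))
print(f"RMS null: {null_db:.1f} dB")          # well below -100 dB = deep null
```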
IME this is a rare case where the clock stability of two independent clocks is good enough for a deep null (despite any clock rate mismatch, which is irrelevant when it remains constant, as you said). The devices need to be fully settled thermally, and there must be zero disturbance during the test (vibration and, of course, changing thermal conditions). Basically, we have verified the clock stability of the SACD player and the ADC as being truly excellent. It also, of course, depends on the length of the recording.
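To put a number on that length dependence: a constant rate mismatch only cancels when it is common to both captures; any residual mismatch that does *not* cancel accumulates into a timing offset that directly caps the achievable null. A back-of-the-envelope sketch (all numbers are illustrative assumptions of mine):

```python
# Sketch: how uncancelled clock drift and recording length limit the null.
import math

drift_ppm = 1.0        # assumed residual clock rate mismatch
duration_s = 120.0     # length of the recording
freq_hz = 1000.0       # test frequency of interest

# Timing offset accumulated by the end of an uncorrected capture
tau = drift_ppm * 1e-6 * duration_s          # 120 us for 1 ppm over 2 min

# Null floor of a sine at freq_hz against a copy delayed by tau:
# the difference amplitude is 2*sin(pi*f*tau) relative to the signal.
null_db = 20 * math.log10(2 * math.sin(math.pi * freq_hz * tau))
print(f"offset {tau*1e6:.0f} us -> null floor ~{null_db:.0f} dB at {freq_hz:.0f} Hz")
```

With these numbers the floor is only about -3 dB at 1 kHz, which is exactly why the drift must either cancel between captures or be corrected, and why longer recordings demand more stable clocks.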
@GXAlan, any chance you might publish the recordings?
Excellent idea, using a splitter cable. The ADC's channel differences can be checked beforehand for a baseline, together with @dualazmak's idea of double-checking criss-cross. The sync issue is addressed by using just one channel (say L) from each amplifier into the ADC in stereo mode.
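In that configuration the two amps land on the L and R channels of a single capture and share the same ADC clock, so they are sample-synced by construction and only a gain match is needed. A minimal sketch (file name and the least-squares gain match are my assumptions):

```python
# Sketch for the splitter-cable setup: both amps in one stereo capture.
import numpy as np
from scipy.io import wavfile

fs, data = wavfile.read("amp_a_left_amp_b_right.wav")  # hypothetical
amp_a = data[:, 0].astype(np.float64)   # L channel: amp A
amp_b = data[:, 1].astype(np.float64)   # R channel: amp B

g = np.dot(amp_a, amp_b) / np.dot(amp_b, amp_b)   # least-squares gain match
residual = amp_a - g * amp_b
null_db = 20 * np.log10(np.sqrt(np.mean(residual**2)) /
                        np.sqrt(np.mean(amp_a**2)))
print(f"gain ratio {g:.6f}, null {null_db:.1f} dB")
```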
To dig in really deep, we would need to compare a DUT's output directly to the digital source signal. This will also include the errors from the DAC and ADC used, but the baseline resolution of this can be determined, and the contribution of the DAC and ADC can even be factored out, by comparing DAC-->ADC to DAC-->DUT-->ADC (preferably both recorded sample-synced). Some prerequisites must be fulfilled, the main one being identical levels at the ADC (and identical polarity, etc.). When the differences are large enough, DW can handle this nicely.

If so, without placing enormous additional workload on Amir, what additional testing would you suggest including? Preferably it should be possible to condense the result into a single-figure metric for easy comparison between devices.
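A sketch of that baseline idea: null both the loopback capture (DAC-->ADC) and the through-DUT capture (DAC-->DUT-->ADC) against the digital source, then compare the two figures. File names, and the assumption that the captures are already aligned and level-matched to the source, are mine:

```python
# Sketch: separate the converter chain's floor from the DUT's contribution.
import numpy as np
from scipy.io import wavfile

def null_db(ref, x):
    n = min(len(ref), len(x))
    r, x = ref[:n].astype(np.float64), x[:n].astype(np.float64)
    return 20 * np.log10(np.sqrt(np.mean((r - x)**2)) /
                         np.sqrt(np.mean(r**2)))

_, source = wavfile.read("digital_source.wav")       # hypothetical
_, loopback = wavfile.read("dac_adc_loopback.wav")   # DAC->ADC
_, via_dut = wavfile.read("dac_dut_adc.wav")         # DAC->DUT->ADC

baseline = null_db(source, loopback)   # resolution floor of the chain alone
dut_null = null_db(source, via_dut)    # chain + DUT errors combined
print(f"chain baseline {baseline:.1f} dB, with DUT {dut_null:.1f} dB")
# The DUT's own contribution only dominates where dut_null is clearly
# worse (higher) than the chain baseline.
```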
This is what I did in my linked tests, and by this I uncovered an ever-so-slight compression effect in my DAC (the signal slightly modulating the reference voltage of the DAC). In hindsight it also manifested itself in the standard THD measurement at low frequencies, where the distortion is higher, but from that alone it is impossible to infer that it came from dynamic Vref modulation rather than a simple static nonlinearity.
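One way such a subtle compression can be made visible (this is my own sketch of the general idea, not the exact analysis used above): estimate the capture/source gain in short windows and check whether it trends with signal level. Window length and file names are assumptions:

```python
# Sketch: detect level-dependent gain (compression) in an aligned capture.
import numpy as np
from scipy.io import wavfile

_, src = wavfile.read("digital_source.wav")    # hypothetical
_, cap = wavfile.read("dac_adc_capture.wav")   # aligned, level-matched capture
n = min(len(src), len(cap))
src = src[:n].astype(np.float64)
cap = cap[:n].astype(np.float64)

win = 4800                                     # ~100 ms at 48 kHz
levels, gains = [], []
for i in range(0, n - win, win):
    s, c = src[i:i+win], cap[i:i+win]
    rms = np.sqrt(np.mean(s**2))
    if rms > 0:
        levels.append(rms)
        gains.append(np.dot(s, c) / np.dot(s, s))  # per-window gain estimate

# A purely linear device shows no trend; gain falling as level rises
# (negative correlation) suggests compression.
corr = np.corrcoef(np.log(levels), gains)[0, 1]
print(f"gain/level correlation: {corr:+.2f}")
```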
A single figure of merit seldom works as a general concept from which general behavior could be extrapolated, IMHO. For one specific test (with one specific test signal -- music), though, it is certainly possible to get a single value, like the PK metric, which has sufficient validity. Looking at -- and actually listening to -- the residual offers more insight, IME.
Still, I think a verification run is necessary to have a reliable result.
- So, you have two DUTs and get a positive DBT result (with level matching, a training period, etc.). I would think we all agree that a DBT is required to establish that there actually *is* a difference worth further investigation.
- As explained, then try to emulate one DUT with the other by adding the residual to the input signal, and do a sanity check whether it now measures the same (with the same test) to good enough precision (a difference not much larger than run-to-run variations); see the sketch after this list.
- Do another DBT (e.g., but not restricted to, an ABX), and when it is positive again (differences still reliably heard), chances are high that something else is going on and we have missed it in this test.
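A minimal sketch of the emulation step in the second item (gains are assumed already matched, file names are hypothetical, and the headroom scaling is my addition):

```python
# Sketch: emulate DUT A via DUT B by adding the A-minus-B residual
# to the stimulus, so that playing the modified stimulus through B
# should measure like A in the verification run.
import numpy as np
from scipy.io import wavfile

fs, stimulus = wavfile.read("stimulus.wav")
_, out_a = wavfile.read("dut_a_capture.wav")   # aligned to stimulus
_, out_b = wavfile.read("dut_b_capture.wav")

n = min(len(stimulus), len(out_a), len(out_b))
stimulus = stimulus[:n].astype(np.float64)
residual = out_a[:n].astype(np.float64) - out_b[:n].astype(np.float64)

emulated_input = stimulus + residual           # "B + residual" ~ A
emulated_input /= np.max(np.abs(emulated_input)) / 0.9   # playback headroom
wavfile.write("stimulus_plus_residual.wav", fs,
              (emulated_input * 32767).astype(np.int16))
```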
The DBT is the critical part -- the testers must not know what is being played -- because otherwise prior knowledge might easily bias the result ("we have established that the emulation of amp A via amp B measures the same as the real amp A, so why should there be any perceptual difference?").