
Measuring the "sound signature" of two different integrated amplifiers.

pkane

Major Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
4,487
Likes
7,752
Location
North-East
@GXAlan,
very good effort, we certainly need more tests on this level of thoroughness and precision.

Alas, in my experience, one can never achieve the best possible resolution, robustness and, most importantly, a reliable verification of such difference tests with un-synced recording. That is, sample-synced recording during playback (i.e. one integrated ADC+DAC device, both sections running from a single clock) is required to really expose the fine grain of differences. As good as the Cosmos ADC is as a standalone tester, lacking any means of syncing to a source is a big drawback (and the reason I didn't buy one). For the same reason, a standalone source like a SACD player is not the best possible option.

While DeltaWave is quite good at eliminating the dynamic differences from clock mismatch and drift, it can only do so much.
Notably, the difference file is not very clean and thus not directly usable for a verification test, where we "eliminate" the difference by adding (resp. subtracting) it as a pre-correction to the input signal of one (resp. the other) amp. Basically, you emulate one amp with the other by pre-conditioning its input signal to force its output to match that of the other amp.

IMHO, and from long experience, the verification test is the most important test to make sure that the difference that was found is really responsible for the observed changes, and it is absolutely mandatory to technically substantiate the claims. It is also the only way to make such tests 100% repeatable.
It can even be extended to check for the influence of linear differences (frequency response changes, magnitude and phase) and non-linear effects like distortion in isolation, separated from each other. Uncorrected linear differences are often the dominant differences, but in the end they are trivial as they can be inverted. Non-linear effects like the compression you seem to have found cannot be inverted. In many tests I've made, the linear differences dominated the results even though they would seem irrelevant at first glance, whereas striking differences in distortion often were actually inaudible.

With the sample-synced technique, using the difference signal directly for pre-correction, amp A should measure and sound close to identical to B (and vice versa). Further, one can actually listen to the difference signal, as it is not disturbed by processing artifacts that would give false clues. Finally, a sample-synced process allows for easy time-domain averaging, which is always welcome to reduce noise/hum/buzz and any other components not strongly correlated to the signal.
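The time-domain averaging described here can be sketched in a few lines. The tone, noise level, and capture count below are arbitrary synthetic values chosen only to show the effect, not anything from the actual tests:

```python
import math
import random

def time_domain_average(captures):
    """Average several sample-synced captures of the same playback.
    Signal-correlated content is preserved; uncorrelated noise drops
    by ~6 dB for every 4x increase in the number of captures."""
    n = len(captures[0])
    return [sum(c[i] for c in captures) / len(captures) for i in range(n)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

# Synthetic demo: a 1 kHz tone at fs = 48 kHz plus random noise.
fs = 48000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(fs // 10)]
captures = [[s + random.gauss(0, 0.01) for s in tone] for _ in range(16)]
avg = time_domain_average(captures)

noise_single = rms([a - b for a, b in zip(captures[0], tone)])
noise_avg = rms([a - b for a, b in zip(avg, tone)])
# Averaging 16 captures cuts the noise floor by roughly a factor of 4 (~12 dB).
```

Note this only works when the captures are sample-aligned to begin with, which is the point being made about synced clocks.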

EDIT: Link to the outcome of some in-depth difference tests I made:

Since the digital part of the test setup wasn't changed between any of these tests, any constant clock drift between DAC and ADC should not be an issue, as it would be the same in all captures. In fact, it's safe to turn off drift measurement/correction in DeltaWave (and I've done it for @GXAlan 's recordings) -- it made no measurable or audible difference in the result. There are much larger analog errors than the 1-2 ppm clock drift. This was also verified by capturing two files from each of the amps and then comparing them to each other -- the RMS null was well below -100 dB.
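The RMS-null figure quoted here is a simple calculation; the signals below are synthetic stand-ins, not the actual captures:

```python
import math

def rms_null_db(a, b):
    """RMS level of the difference (a - b) in dB relative to full scale (1.0)."""
    diff = [x - y for x, y in zip(a, b)]
    rms = math.sqrt(sum(d * d for d in diff) / len(diff))
    return float("-inf") if rms == 0.0 else 20 * math.log10(rms)

# Two captures differing only by a constant 1e-6 offset null to -120 dBFS:
a = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
b = [s + 1e-6 for s in a]
print(round(rms_null_db(a, b), 1))  # -120.0
```

A real comparison would first align and level-match the two captures (which is what DeltaWave automates) before taking the difference.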
 
Last edited:

MaxBuck

Major Contributor
Forum Donor
Joined
May 22, 2021
Messages
1,243
Likes
1,625
Location
SoCal, Baby!
Also... speaker loads are not the same as resistive loads.

One could, for instance, do the same measurements once with a resistor as a load and once using a speaker and null those. This could potentially give you the influence a speaker has on the amplifier output.
I think that what people perceive as "different sounding amplifiers" may reflect the different interactions amplifiers may have to the complicated impedance characteristics of their speakers. Impedance of speakers is monstrously nonlinear, and many otherwise excellent amplifiers will react quite differently to the nonlinearities.
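As a toy illustration of that interaction (all impedance values below are invented for the sketch, not measurements of any real speaker), the amplifier's output impedance and the speaker's frequency-dependent impedance form a voltage divider, so a higher output impedance imprints the impedance curve on the frequency response:

```python
import math

# Illustrative speaker impedance curve (ohms at a few spot frequencies).
speaker_z = {30: 20.0, 100: 6.0, 1000: 5.0, 3000: 12.0, 10000: 7.0}

def response_db(z_out):
    """Level deviation (dB) vs an ideal zero-impedance source, per frequency,
    from the Z_load / (Z_load + Z_out) voltage divider."""
    return {f: 20 * math.log10(z / (z + z_out)) for f, z in speaker_z.items()}

low_z_amp = response_db(0.05)   # low output impedance: well under 0.1 dB spread
high_z_amp = response_db(2.0)   # high output impedance: a couple of dB of spread
```

The same two amps driving a purely resistive dummy load would show none of this spread, which is the caveat being raised about resistor-based tests.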
 
OP

GXAlan

Major Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
1,576
Likes
2,395
How about running a comparison of the recordings against the source material to see which amp system is more transparent?

Shouldn't nonlinearity of the peak response be captured as distortion in Amir's normal test sequence? (Especially in the multitone test).

If the volume knob is the issue that @restorer-john brings up, then it might not show up in the normal test sequence. That’s the best explanation I have seen so far since it would explain why all of the data is valid.

Hopefully @pkane can confirm my methodology.

I was able to extract the DSF from the SACD and then use DeltaWave to do the comparison. The only default setting I changed was telling DeltaWave to do its DSD-to-PCM conversion at 176 kHz instead of 192 kHz. DeltaWave only works on one channel at a time, so I went with the left channel, and I trimmed the first 2 seconds as I have for my other comparisons.

PK Metric
System A: -44.2 dBFS / -51.05 dBA
System B: -43.3 dBFS / -49.83 dBA

System C: -43.9 dBFS / -50.16 dBA

Overall, this means that A is better than C which is better than B.

But looking at the graphs, you see some interesting data:
System A has only two big spikes; it's usually below -39 dB.
Systems B/C have more spikes above -39 dB.

At the TWO second mark, System A is less transparent than System B or C.
At the TWELVE second mark, System A is more transparent than System B or C.


It's not a clear "always more transparent" or "always less transparent". It depends on what part of the music!

[Attached: DeltaWave difference plots for Systems A, B, and C]



For validation, here are some digital to digital comparisons

[Attached: digital-to-digital validation plots]


(and I've done it for @GXAlan 's recordings)

Thanks for the comments. For everyone else, I did put my non-public recordings on a shared drive for Paul to see prior to posting. It was also Paul who suggested the validation testing in the first post.
 
OP

GXAlan

Major Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
1,576
Likes
2,395
I think that what people perceive as "different sounding amplifiers" may reflect the different interactions amplifiers may have to the complicated impedance characteristics of their speakers. Impedance of speakers is monstrously nonlinear, and many otherwise excellent amplifiers will react quite differently to the nonlinearities.

What's neat about these tests is that I'm running it through a simple non-inductive resistor! The real differences with real speakers are likely to be different!
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
17,127
Likes
29,825
What do the phase and FR differences look like?
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
17,127
Likes
29,825
Also if you let Deltawave fix FR and phase what results do you have?
 

-Matt-

Senior Member
Joined
Nov 21, 2021
Messages
335
Likes
244
What should be the conclusion here?

That the current suite of tests run by Amir doesn't show, in detail, all aspects of the potentially nonlinear response of an amplifier? (Leaving some possibility for audible differences between high-scoring amplifiers)?

I don't think that was ever the aim; the intention is to provide a set of simple metrics that allow direct comparisons between devices.

If an amplifier has a nonlinear response (i.e. compressing the output), then this should surely add distortion in the FFT and multitone tests? The nonlinearity of the volume control should also be sampled in the THD+N vs Measured Level plots, shouldn't it?

Is the problem you are addressing that all types of distortion are lumped together? I.e., two amplifiers that each have a SINAD of 100 dB may sound different because the nature of that distortion (and noise) is different? In that case the magnitude of these distortions (relative to the source material) is already captured in Amir's tests.

...Or is the claim that the differences presented here are completely unsampled by Amir's existing tests? If so, without placing enormous additional workload on Amir, what additional testing would you suggest to include? Preferably it should be possible to refine the result to a single figure metric for easy comparison between devices.
 
Last edited:
OP

GXAlan

Major Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
1,576
Likes
2,395
What should be the conclusion here?

1) Anyone who claims that there are things that can be heard but cannot be measured is wrong. “Golden earred” audiophiles should try to quantify what they think they hear — the tools (both hardware and software) are good enough now.

2) Anyone who claims they can hear differences between two devices that should be identical (DACs, amplifiers, etc) shouldn’t be immediately ostracized. Maybe their exact setup is revealing *something*

3) I like @restorer-john ‘s explanation that the volume knob can play a role.

4) I don’t think @amirm needs to do anything different. He doesn’t shoulder the burden of all measurements and all tests for the industry or hobby. He has graciously provided the best advertiser free science driven audio website. We, the readers, should support Amir with extra tests if we want extra tests and rely on the group expertise to help us make sure that the approaches are done properly.

5) Have you heard of the PK Metric? If not, now you have.
 

Sokel

Major Contributor
Joined
Sep 8, 2021
Messages
2,243
Likes
1,755
The nonlinearity of the volume control should also be sampled in the THD+N vs Measured Level plots shouldn't it?
That test does not include the device's volume control; the gain is applied by the analyzer. As @restorer-john explained, with an actual volume control there's a lot at play there.
But even that way we get a hint of what the SINAD would be at the level we listen to, if we had a perfect volume control.
 
Last edited:

sq225917

Major Contributor
Joined
Jun 23, 2019
Messages
1,292
Likes
1,437
Does anyone here doubt that two amps with specs beyond audibility, run below clipping, into a resistive load can sound different into a real life reactive load? I don't.

Nice test, thanks for doing it
 

Sokel

Major Contributor
Joined
Sep 8, 2021
Messages
2,243
Likes
1,755
Does anyone here doubt that two amps with specs beyond audibility, run below clipping, into a resistive load can sound different into a real life reactive load? I don't.

Nice test, thanks for doing it
I was nearly mocked over acoustical measurements (REW) showing elevated distortion in the lows that changed only when I swapped (otherwise perfectly capable) amps driving the low region of my actives.
Abyss...
 

McFly

Addicted to Fun and Learning
Joined
Mar 12, 2019
Messages
845
Likes
1,666
Location
NZ
The "Super Audio Check CD" from CBS/Sony, containing super-high precision test signals, would fit for these tests and calibrations. If you would be interested, please refer to my post here. You may find the PDF booklet (English translation by myself) of the CD there. If you would be seriously interested, please simply PM me writing your wish.
Thanks, but these test CDs can be easily made with REW
 

Doodski

Grand Contributor
Forum Donor
Joined
Dec 9, 2019
Messages
16,342
Likes
15,957
Location
Canada
Thanks, but these test CDs can be easily made with REW
The optical integrity of the test discs may be of the superior grade that Sony uses for its YEDS disc(s). Where that really shines is when adjusting an RF amp for the laser pickup. The crosshatch, diamond-shaped RF eye pattern needs to be calibrated, and to do that the CD optics must be calibrated to spec; otherwise the calibration of the RF amp is not correct and it throws everything out of whack. Very important for a CD calibration procedure.
 

-Matt-

Senior Member
Joined
Nov 21, 2021
Messages
335
Likes
244
I think the efforts you are putting into making measurements of these hard to quantify differences is great. So, thanks, and I agree with most of your conclusions above except...

2) Anyone who claims they can hear differences between two devices that should be identical (DACs, amplifiers, etc) shouldn’t be immediately ostracized. Maybe their exact setup is revealing *something*

...the "should be identical" part.

Just because two devices measure the same SINAD does not mean that they generate identical output waveforms. So there is potential for difference between devices with the same SINAD number. I'm fully onboard with that.

It's just that when you compare already very high-performing devices, both must be so close to the source signal that the differences between them are likely to be below the threshold of audibility.

Someone smarter than me can probably calculate the maximum possible difference between two signals that are each -120 dB relative to a common source.
 

McFly

Addicted to Fun and Learning
Joined
Mar 12, 2019
Messages
845
Likes
1,666
Location
NZ
The optical integrity of the test discs may be of the superior grade that Sony uses for its YEDS disc(s). Where that really shines is when adjusting an RF amp for the laser pickup. The crosshatch, diamond-shaped RF eye pattern needs to be calibrated, and to do that the CD optics must be calibrated to spec; otherwise the calibration of the RF amp is not correct and it throws everything out of whack. Very important for a CD calibration procedure.
Yes that disc has many uses beyond measuring audio output.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
17,127
Likes
29,825
I think the efforts you are putting into making measurements of these hard to quantify differences is great. So, thanks, and I agree with most of your conclusions above except...



...the "should be identical" part.

Just because two devices measure the same SINAD does not mean that they generate identical output waveforms. So there is potential for difference between devices with the same SINAD number. I'm fully onboard with that.

It's just that when you compare already very high-performing devices, both must be so close to the source signal that the differences between them are likely to be below the threshold of audibility.

Someone smarter than me can probably calculate the maximum possible difference between two signals that are each -120 dB relative to a common source.
-120 dB means a difference of 1 ppm or less. Of course, that is at the signal levels and loads tested.
 

-Matt-

Senior Member
Joined
Nov 21, 2021
Messages
335
Likes
244
-120 dB means a difference of 1 ppm or less. Of course, that is at the signal levels and loads tested.
So does that mean the largest possible difference between two signals each at -120 dB relative to the same source would be 2 ppm (equivalent to -114 dB)? ... and therefore likely inaudible?
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
17,127
Likes
29,825
So does that mean the largest possible difference between two signals each at -120 dB relative to the same source would be 2 ppm (equivalent to -114 dB)? ... and therefore likely inaudible?
If you had one harmonic of distortion and the noise were even lower, that harmonic would be 1 millionth of the fundamental. Or, if there were zero distortion and only noise, the noise over the measured bandwidth would add up to only 1 millionth of the signal level at the test frequency. Again, at a given signal level into a given load; change those and other things may change. Inaudibly different under those conditions.
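The dB/ppm arithmetic in this exchange is easy to verify directly:

```python
import math

def db_to_ratio(db):
    """Convert an amplitude level in dB to a linear ratio."""
    return 10 ** (db / 20)

def ratio_to_db(ratio):
    """Convert a linear amplitude ratio to dB."""
    return 20 * math.log10(ratio)

# A residual 120 dB below the fundamental is one part per million:
ppm = db_to_ratio(-120) * 1e6          # ~1.0
# Worst case, two such residuals in opposite directions sum to 2 ppm:
worst_case_db = ratio_to_db(2e-6)      # ~-114.0 dB
```

So the 2 ppm / -114 dB bound quoted above checks out.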
 

HarmonicTHD

Major Contributor
Forum Donor
Joined
Mar 18, 2022
Messages
1,631
Likes
1,980
Measuring the "sound signature" of two different integrated amplifiers.
Extraordinary claims require extraordinary evidence.

I have two high-performance integrated amps. The identities of these amps are not important; both reach "blue region" status for SINAD with 1 kHz test tones at 5 watts/4 ohms. I am presenting these measurements to showcase how @pkane's DeltaWave tools can be used to correlate subjective listening impressions with actual measurements/science. Thank you @pkane for reviewing my methodology.

1. Background
When comparing the two integrated amps at my normal quiet listening levels, I felt as if one system had better "attack" or "PRAT" to the music, and I wanted to try to measure it. My subjective impression is what started this process; I had no measurements beyond the sighted bias of having two units with very high 1 kHz tone performance. The TL;DR is that I succeeded in showing that differences can be measured, and it turned out to be some sort of volume compression on transients that would be within the threshold of audibility.

Music on test: Concerto symphonique No. 4 in D minor, Op. 102: Scherzo from SFS0060 - Masterpieces in Miniature (physical SACD)

System A: SACD player -- balanced--> Integrated Amp 1
"subjective greater attack to the piano notes"

System B: SACD player -- balanced--> Integrated Amp 1 --> tape out --unbalanced--> Integrated Amp 2

System C: SACD player --unbalanced--> Integrated Amp 2.

Recording system:
E1DA Cosmos ADC Grade A with 4.48 ohm Dale Vishay 1% NH-250 resistors, 32-bit/176 kHz
Windows Laptop on battery power for recordings

Classical musical content was used for measuring the sound signature.

2. Matching the Volume

Using System B, I attached real bookshelf speakers (~87 dB/2.83V) and set the volume to the actual level I prefer listening at. It's a small room, and this was around 68-70 dB at the listening position for the soft portions of the music. Normal piano practice is quoted as 60-70 dB. System B was chosen as the reference volume setting since Integrated Amp 2 has no display indicating the volume, so once it's set, it's set. I then detached the speakers, replaced them with the E1DA Cosmos ADC setup, and made a recording of the first minute. I switched to System A with real speakers, set to the same volume by ear, made a recording, and then evaluated peaks at various frequencies in Audacity (not just 1 kHz). The volume difference was not uniform across frequencies, so based on my best estimate I adjusted the volume of System A to properly match System B.

System A and System B were matched to 0.056 dB for the SPL peaks in analog (prior to any digital correction).

3. Validation of test environment precision

System A was measured twice, one day apart.
System B was measured twice, one day apart.
System C was measured twice, 3 hours apart.
DeltaWave was used to compare the run-to-run variability, trimmed at the start and end of the music for the middle ~50 seconds of analysis.

The PK Metric was created by Paul K, the inventor of DeltaWave, and is designed to "more directly answer the question of whether the difference between two devices is likely to be audible or not."

These results show that the test environment (the amplifiers being tested, the cables, the ADC, etc.) were all very precise and are able to generate reproducible results.

PK Metric showed very high precision of the test environment
System A repeated one day later: -120.4 dBFS
System B repeated one day later: -117.2 dBFS
System C repeated 3 hours later: -117.8 dBFS

View attachment 234841


View attachment 234843
View attachment 234844



4. System A vs. System B
Initial peak values Reference: -28.017dB Comparison: -28.073dB
Initial RMS values Reference: -50.67dB Comparison: -49.022dB

Final peak values Reference: -28.017dB Comparison: -29.707dB
Final RMS values Reference: -50.67dB Comparison: -50.647dB

Recall that I matched volume in analog by making a recording and then using the Peak dB tool at various frequencies in Audacity. That's the initial value. I got my peaks within 0.056 dB of each other, yet the RMS values were off by more than 1.5 dB. DeltaWave digitally corrects the level based on RMS and got the two recordings matched to <0.03 dB. In doing so, the peak values end up differing by about 1.7 dB.
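The peak-vs-RMS divergence described above is exactly what mild compression produces. Here is a minimal synthetic sketch; the tanh soft-clip and its drive level are illustrative assumptions, not a model of either amplifier:

```python
import math

def peak_db(x):
    return 20 * math.log10(max(abs(v) for v in x))

def rms_db(x):
    return 20 * math.log10(math.sqrt(sum(v * v for v in x) / len(x)))

# A sine and a soft-clipped (tanh) copy of it: after matching RMS levels,
# the peaks still differ -- the same peak/RMS divergence seen above.
fs = 48000
sine = [0.5 * math.sin(2 * math.pi * 1000 * t / fs) for t in range(fs // 100)]
squashed = [math.tanh(1.2 * v) / 1.2 for v in sine]  # mild compression

# Gain that exactly matches the RMS of the compressed copy to the original:
g = 10 ** ((rms_db(sine) - rms_db(squashed)) / 20)
matched = [g * v for v in squashed]

rms_delta = rms_db(matched) - rms_db(sine)    # ~0 dB by construction
peak_delta = peak_db(matched) - peak_db(sine) # negative: peaks are squashed
```

In other words, RMS level-matching two signals with different crest factors necessarily leaves a peak-level mismatch, which is the pattern DeltaWave surfaced here.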

The differences in volume are non-linear. Sometimes System A (blue) is louder than System B (white) and sometimes it's not. It's not consistent to a single channel either.
View attachment 234845
View attachment 234846
View attachment 234847


Subjectively, I liked System A better because it had better "attack." When analyzing the two recordings, we see that
System B actually has a subtle compression effect relative to System A
or
System A was adding artificial impulse relative to System B.


The opposite could be true: System A could be adding artificial volume to impulses. We don't know which is actually more correct. It just shows that there is a measurable difference between these two systems, of a kind that could plausibly be audible.

The PK Metric is -48.6 dBFS.

5. System A vs. System C

System B has the limitation of running through the tape out of the integrated amplifier in System A. Is the compression effect from the circuitry in the tape out, or from the amplifier? To answer this question, the SACD player was connected directly to Integrated Amp 2. The resulting recording was quieter, so the volume knob on Integrated Amp 2 was raised to attempt to match the volumes. These recordings were a day apart!

Initial peak values Reference: -28.017dB Comparison: -28.571dB
Initial RMS values Reference: -50.605dB Comparison: -49.532dB

Final peak values Reference: -28.017dB Comparison: -29.563dB
Final RMS values Reference: -50.605dB Comparison: -50.443dB

The matching wasn't as good in the analog realm, with the peak values differing by around 0.5 dB. However, once DeltaWave corrected the RMS values to within 0.2 dB, the peaks still differ by >1.5 dB.

The tape out circuitry was not responsible for any of the perceived compression effect.
The PK Metric is -48.6 dBFS.


View attachment 234848
View attachment 234849

6. System B vs. System C
The difference between B and C is the tape loop; the same integrated amplifier is used in both. Once the two recordings were calibrated by DeltaWave, the RMS match was ultra-precise at 0.004 dB and the peaks really aren't different (<0.01 dB). These recordings were a day apart!

Initial peak values Reference: -28.073dB Comparison: -28.571dB
Initial RMS values Reference: -49.02dB Comparison: -49.532dB

Final peak values Reference: -28.073dB Comparison: -28.137dB
Final RMS values Reference: -49.02dB Comparison: -49.024dB

Going straight from the SACD player to an integrated amplifier versus having a tape out in the signal chain made very little difference, subjectively or objectively. The PK Metric is -93.0 dBFS.


View attachment 234850
View attachment 234851

7. Conclusions
"Attack" isn't a precise description. "PRAT" (pace, rhythm and timing) isn't a precise description. These are simply words to describe subjective experiences where we like one audio system more than another and cannot articulate the differences with more precise language. What you consider "PRAT," I might consider "attack."

But what we can agree upon are volume differences, and non-linear differences in volume can change the sound. I won't be able to convince everyone that I heard these differences between the two amplifiers; I'm sure there are those who will say these measured differences are not audible. All I will say is that I heard a difference, which is why I embarked on this test, and my original "matching volume by ear" got me to ~1 dB matching, which I attribute to my extensive formal training as a classical pianist.

I won't be able to convince everyone that I ran these tests perfectly or setup each component in the chain to its very best. This is why I have separated these as "systems" rather than naming specific components. Running the E1DA Cosmos without optimized gains should actually make it harder to detect differences between the systems. The E1DA Cosmos does not have an input buffer, and maybe that makes a difference from one system to the other.

I won't be able to convince everyone that my selected music or preferred volume is universal.

I'm not trying to convince you that these differences are the most efficient/meaningful uses of your money. There is clear consensus that speaker/furniture placement is the best bang for the buck (free), and when it comes to gear, speakers/subwoofers make the biggest impact on sound.

Anything that can be heard can be measured.
Maybe 30-40 years ago, the test equipment wasn't good enough to capture everything audiophiles thought they could hear. In 2022, hobbyist level ADCs are so good that you should expect/demand claims to be backed by measurements. I very easily can substitute my term "attack" with "PRAT." We have seen time and time again that a lot of audiophile tweaks prove to be useless.

Everything measured cannot be heard.
We're still human. The PK Error Metric is something I just learned about this week. If someone says they can hear something -300 dB away from reference, that's not really believable. Or better stated to be generous, that difference is not going to be meaningful to you unless you have a genetic mutation allowing you to hear something most humans cannot. The threshold for audibility of the PK Error metric has been stated as -50 dB and my reported differences met this threshold.

Two high-performance amplifiers seem to have measurable differences in one real-world condition that also meets the threshold for audibility.

Both of these amps are rated at triple-digit watts into 4 ohms, and I was running roughly ~50 milliwatt to ~5 W peaks, given that the volume was originally set with an ~87 dB/2.83V, 5-6 ohm stereo speaker pair near a wall at a 7 ft listening distance, resulting in 68-70 dB RMS and presumably 90-92 dB peaks based on analysis of the recordings.
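Those power figures can be sanity-checked with a free-field back-of-the-envelope calculation. The sketch below treats the ~87 dB/2.83V sensitivity as dB/1 W/1 m and ignores room/boundary gain and the second speaker, so it will overshoot the real power somewhat:

```python
import math

sens_db_1w_1m = 87.0          # quoted sensitivity, treated as dB / 1 W / 1 m
distance_m = 7 * 0.3048       # 7 ft listening distance in meters

def watts_for_spl(target_spl):
    """Amplifier power needed for a target SPL at the seat, assuming
    free-field inverse-square falloff from a single speaker."""
    spl_at_seat_1w = sens_db_1w_1m - 20 * math.log10(distance_m)
    return 10 ** ((target_spl - spl_at_seat_1w) / 10)

soft = watts_for_spl(69)   # the ~68-70 dB RMS passages: tens of milliwatts
peak = watts_for_spl(91)   # the presumed 90-92 dB peaks: around ten watts
```

With two speakers and typical room gain these estimates drop by several dB, landing in the same ballpark as the ~50 mW to ~5 W range above.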

Consider adding Track 1 from Masterpieces in Miniature when testing gear.
It's the kind of classical music that can bring a smile to aficionados of the symphony while still being a piece that can be appreciated by those who rarely listen to classical music. I was very surprised that I thought I heard a difference on this track between two integrated amplifiers and maybe this happens to be a very good test.



@amirm
Thanks for your work.

Are there any of the other tests or specs in these amps, which would give also an indication to the difference you measured?

How different are the SINAD values? IMDs?

Can you say which amps you measured? And if not what is your reason?

Thx.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
17,127
Likes
29,825
@GXAlan I'd still like to see the phase and FR difference graphs. As an example: if one device is very slightly drooping in frequency it might look like compression if those places where the signal level differs are also places of high frequency content.

Or make your files available for me to download and I'll look myself.
 