
Alternative method for measuring distortion

eliash

Senior Member
Joined
May 29, 2019
Messages
407
Likes
209
Location
Bavaria, near lake Ammersee
Here's a comparison between a sine wave with some distortion applied to it (harmonic and jitter) and, below it, the same distortion applied to a 32-tone signal. Which is better?

View attachment 42430


View attachment 42431

I see what you mean, but as I wrote, the discrete lines (and, in this case, the general noise floor) in the spaces between the multitone components are the part that makes me uneasy, though probably the less annoying one.
The harmonic distortion is of course not nice either, but my turntable's cartridge actually has a slightly higher d2. In any case, for pure electronics I would demand lower figures in both respects.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,632
Likes
10,205
Location
North-East
I am supposed to be in stealth mode, but this is too interesting.

There are still some points that a pure technician like me is not sure he follows:
1. As already mentioned, the correlation between df and perceived audio quality is not clearly explained, at least in this thread. I have to admit I didn't dig in further.
2. The metric is based on a measurement device. How is this device itself assessed? What would be the equivalent of the loopback we are used to? How can the metric be universal if the measurement device affects it?
3. The measurements themselves: the examples shown are recorded at 96 kHz. This could mean that more than half of the analyzed bandwidth is outside the audible band. This brings us back to point 1.
4. The very bad result that all DUTs show with square waves makes me wonder whether the bandwidth limitation + DC removal is also applied to the reference signal. If not, this can be quite misleading.

@Serge Smirnoff and @pkane , I am sure these points have already been addressed, could you please clarify?

Hi Fred,

1: Need Serge to jump in here. As I said, I can add the df metric to DeltaWave so others can play with it and evaluate its effectiveness.
2: Good question. Normally, it will have to be a high-quality, well-measuring ADC. Obviously, any recording of a DUT will combine the distortions of the DUT with those of the ADC, so the ADC must be pretty good for this to work right. I usually use a digital interface with DAC and ADC built in to measure the device in loopback, and then compare it to other devices using just the ADC. This gives me a standard point of reference.
3: At least with DeltaWave, you can apply filters to limit the bandwidth of the signal being measured/compared.
4: Again, speaking just for DeltaWave, both DC and out-of-band components can be removed, if desired.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
1. Demonstrate that df metric is indeed correlated with perceived audio quality (I think you had some studies referenced on your site -- can you please link them and describe in more detail?) Is there a sufficient evidence that df is better correlated than, say THD or THD+N or other common metrics?
Such evidence can be obtained by comparing the results of listening tests with df levels. As the df-metric uses a black-box concept, I started the research [http://soundexpert.org/articles/-/b...asurements-to-predict-listening-test-results-] with listening tests of psychoacoustic encoders performed on the HydrogenAudio forum. Results of these five cases:
kd74_regress.png

c2_native_regress.png

c3_native_regress.png

c4_native_regress.png

c5_native_regress.png

Red DUTs have substantially different artifact signatures, so they cannot be used for the regression analysis. This research takes a lot of work, as at least 10-15 cases of various DUTs (and not only encoders) must be examined in order to have reliable results. I will find the time; promise.

Another set of cases is in the AES paper I already mentioned [http://soundexpert.org/documents/10179/11017/DiffLevel_AES118.pdf]. These test items were created from real sound excerpts using artifact amplification. Afterwards they were evaluated in normal listening tests:

aes_psy-metric.png


These cases show better correlation because the five test items in each plot are derived from the same natural sound sample (the dots without confidence intervals) and consequently have very similar artifact signatures.

Histogram medians on the df-slides of portable players [post #69] also give some idea of the correlation to subjective quality, though it is less evident, as I don't have corresponding listening-test results.

In all cases that I have examined so far, the relation between df levels and subjective scores is at most 2nd order. In most cases it is linear.

The relation needs to be researched further, but what can be stated for sure is the extremely low correlation between df(m-signal) and df(Sine).
 
Last edited:
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
2. Is there a sufficient difference between the df metric and the RMS null difference as computed by DeltaWave (and, similar software like AudioDiffMaker)?
Sorry for self-citing:

aes_01.png


aes_02.png


In short:

aes_df-thd.png


These are just different methods of measuring the power of the difference signal. The df level is measured in the same units as THD.
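The "same units as THD" point can be made concrete with a toy example. A minimal numpy sketch (the 1 kHz tone and 1% second harmonic are hypothetical, not one of the actual SoundExpert test signals): the RMS of the difference signal relative to the reference, expressed in dB, lands exactly on the figure THD would report for the same distortion.

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
ref = np.sin(2 * np.pi * 1000 * t)               # clean 1 kHz sine
dut = ref + 0.01 * np.sin(2 * np.pi * 2000 * t)  # 1% 2nd harmonic added

diff = dut - ref
# Power of the difference signal relative to the reference, in dB
diff_db = 20 * np.log10(np.sqrt(np.mean(diff**2)) / np.sqrt(np.mean(ref**2)))
# diff_db is -40.0 dB, i.e. 1% -- the same number THD reports for this case
```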
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
2. The metric is based on a measurement device. How is this device itself assessed? What would be the equivalent of the loopback we are used to? How can the metric be universal if the measurement device affects it?
The issue was partly addressed in my first post (below the Apple adapter): for all measurement interfaces above a certain accuracy, df levels are equal ("semi-absolute"). I'm still thinking about the issue.

As the test signals in the df-metric are in digital form, refDAC-refADC measurements could serve as a reference, as @pkane already mentioned.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
3. The measurements themselves: the examples shown are recorded at 96 kHz. This could mean that more than half of the analyzed bandwidth is outside the audible band. This brings us back to point 1.
The output signal is precisely low-passed at 22050 Hz before computing df levels.
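For illustration, a band-limiting step like that can be sketched with scipy. The filter length and the forward-backward application are my assumptions for the sketch, not Serge's actual processing:

```python
import numpy as np
from scipy import signal

fs = 96000  # recording rate used in the examples in the thread
t = np.arange(fs) / fs
# One in-band and one out-of-band tone standing in for a DUT recording
x = np.sin(2 * np.pi * 10_000 * t) + np.sin(2 * np.pi * 30_000 * t)

# Linear-phase FIR low-pass at 22050 Hz, applied forward-backward
# (zero phase), so the audible band keeps its timing
taps = signal.firwin(1001, 22050, fs=fs)
y = signal.filtfilt(taps, 1.0, x)

# Project the result onto each tone to check what survived
a10 = 2 * np.mean(y * np.sin(2 * np.pi * 10_000 * t))  # ~1.0 (kept)
a30 = 2 * np.mean(y * np.sin(2 * np.pi * 30_000 * t))  # ~0.0 (removed)
```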
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
4. The very bad result that all DUTs show with square waves makes me wonder whether the bandwidth limitation + DC removal is also applied to the reference signal. If not, this can be quite misleading.
I use band-limited tech signals without DC (except white noise; see http://soundexpert.org/test-signals), but this is not critical, as df is computed with the correlation coefficient, which is by definition insensitive to DC and RMS differences between the signals.

The square signal is really difficult for DUTs.
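That insensitivity is easy to verify numerically. A small sketch with synthetic signals (not the actual test signals): scaling one signal and adding a DC offset leaves the correlation coefficient, and hence a df built on it, untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)             # stand-in reference signal
y = x + 0.05 * rng.standard_normal(10_000)  # same signal with a small error

r1 = np.corrcoef(x, y)[0, 1]
r2 = np.corrcoef(x, 3.7 * y + 0.25)[0, 1]   # gain and DC offset applied

# r1 == r2 up to float rounding: the correlation coefficient ignores
# pure level and DC differences between the signals
```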
 

boXem

Major Contributor
Audio Company
Joined
Jun 19, 2019
Messages
2,014
Likes
4,852
Location
Europe
The relation needs to be researched further, but what can be stated for sure is the extremely low correlation between df(m-signal) and df(Sine).
Engineer speaking: are there other technical signals showing better correlation between df(m-signal) and df(t-signal)?
The issue was partly addressed in my first post (below the Apple adapter): for all measurement interfaces above a certain accuracy, df levels are equal ("semi-absolute"). I'm still thinking about the issue.

As the test signals in the df-metric are in digital form, refDAC-refADC measurements could serve as a reference, as @pkane already mentioned.
You clearly need to be bulletproof on this side: if cheating is possible, cheating will be done.
The output signal is precisely low-passed at 22050 Hz before computing df levels.
I use band-limited tech signals without DC (except white noise; see http://soundexpert.org/test-signals), but this is not critical, as df is computed with the correlation coefficient, which is by definition insensitive to DC and RMS differences between the signals.

The square signal is really difficult for DUTs.
Thanks for your clarifications.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
Engineer speaking: are there other technical signals showing better correlation between df(m-signal) and df(t-signal)?
Yes. If you look at the df-slides, the df levels with program-simulation noise reliably follow the medians of the histograms. The difference between them increases only when a DUT applies some psychoacoustic processing. Other tech signals (including the sine) are far behind. BTW, you can design any signal you want, and the df level will show to what extent it was distorted by a DUT.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
You clearly need to be bulletproof on this side: if cheating is possible, cheating will be done.
At this stage of the research I just clearly state which measurement interface was used. And yes, I always keep the cheating issue in mind; cheating by audio manufacturers, first of all.
 
Last edited:

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,632
Likes
10,205
Location
North-East
Such evidence can be obtained by comparing the results of listening tests with df levels. As the df-metric uses a black-box concept, I started the research [http://soundexpert.org/articles/-/b...asurements-to-predict-listening-test-results-] with listening tests of psychoacoustic encoders performed on the HydrogenAudio forum. Results of these five cases:
View attachment 42436
View attachment 42437
View attachment 42438
View attachment 42439
View attachment 42440
Red DUTs have substantially different artifact signatures, so they cannot be used for the regression analysis. This research takes a lot of work, as at least 10-15 cases of various DUTs (and not only encoders) must be examined in order to have reliable results. I will find the time; promise.

Another set of cases is in the AES paper I already mentioned [http://soundexpert.org/documents/10179/11017/DiffLevel_AES118.pdf]. These test items were created from real sound excerpts using artifact amplification. Afterwards they were evaluated in normal listening tests:

View attachment 42445

These cases show better correlation because the five test items in each plot are derived from the same natural sound sample (the dots without confidence intervals) and consequently have very similar artifact signatures.

Histogram medians on the df-slides of portable players [post #69] also give some idea of the correlation to subjective quality, though it is less evident, as I don't have corresponding listening-test results.

In all cases that I have examined so far, the relation between df levels and subjective scores is at most 2nd order. In most cases it is linear.

The relation needs to be researched further, but what can be stated for sure is the extremely low correlation between df(m-signal) and df(Sine).

This is good information, thanks for posting, Serge. It seems to me that there is a good correlation between perceived quality and df score.

Interestingly, my DISTORT software in combination with DeltaWave could perhaps be used to further validate this claim. DISTORT can add harmonic and phase-noise distortion (more distortion types coming soon) to any digital file, while DeltaWave could provide a simple mechanism to compute a df score, along with other difference metrics (correlated null depth, RMS difference, phase error/jitter, etc.).
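DISTORT's internals aren't shown in this thread, but one standard way to synthesize harmonic distortion for experiments like this is a memoryless polynomial nonlinearity. A sketch under that assumption (the function name and the 1 kHz test tone are made up for illustration):

```python
import numpy as np

def add_second_harmonic(x, amount=0.01):
    """Memoryless quadratic nonlinearity: y = x + amount * x^2.
    For a sine input this creates a component at twice the frequency
    (plus a DC term) with amplitude roughly amount/2."""
    return x + amount * x * x

fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
y = add_second_harmonic(x, 0.02)

# Amplitude spectrum; with 1 s of signal, bin k corresponds to k Hz
spectrum = np.abs(np.fft.rfft(y)) / (len(y) / 2)
# spectrum[1000] ~ 1.0 (fundamental), spectrum[2000] ~ 0.01 (2nd harmonic)
```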
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,632
Likes
10,205
Location
North-East
Sorry for self-citing:

View attachment 42446

View attachment 42447

In short:

View attachment 42448

These are just different methods of measuring the power of the difference signal. The df level is measured in the same units as THD.

Yes, these are very similar, so it's not surprising that the results produced by the RMS difference and df appear to correlate well. The main distinction is the span of signal used to compute the metric: for the DeltaWave RMS difference it is the whole file, while for df it's a short, 0.4 s interval. This can potentially cause the two metrics to diverge under some conditions, possibly with non-linear amplitude and phase distortions. DW does provide a corrective filter for those cases, so with the non-linear amplitude and phase EQ enabled in DeltaWave, I would expect the two metrics to stay mostly in sync.

Like I said originally, I see the value of the short-window analysis used in df; that might be a good enough reason to simply add it as another metric to DeltaWave :)
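For concreteness, a windowed df computation along these lines takes only a few lines of numpy. The exact SoundExpert formula isn't restated in the thread; the form 10*log10(1 - r) used here is an assumption, chosen because it reproduces the theoretical -5.33 dB figure Serge derives for the mix signal later in the thread:

```python
import numpy as np

def df_windowed(ref, out, fs, win_s=0.4):
    """Per-window difference level: df = 10*log10(1 - r), where r is the
    correlation coefficient of reference and DUT output in each window.
    (Assumed form, consistent with the mix-signal example in this thread.)"""
    n = int(win_s * fs)
    dfs = []
    for i in range(0, len(ref) - n + 1, n):
        r = np.corrcoef(ref[i:i + n], out[i:i + n])[0, 1]
        dfs.append(10 * np.log10(max(1 - r, 1e-30)))  # floor avoids log(0)
    return np.median(dfs), np.min(dfs), np.max(dfs)

rng = np.random.default_rng(1)
x = rng.standard_normal(5 * 48000)            # 5 s stand-in reference
y = x + 0.01 * rng.standard_normal(len(x))    # mildly degraded copy
med, lo, hi = df_windowed(x, y, 48000)        # median df around -43 dB
```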
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
Interestingly, my DISTORT software in combination with DeltaWave could perhaps be used to further validate this claim. DISTORT can add harmonic and phase-noise distortion (more distortion types coming soon) to any digital file, while DeltaWave could provide a simple mechanism to compute a df score, along with other difference metrics (correlated null depth, RMS difference, phase error/jitter, etc.).
This will be really helpful. I also see the potential of your software for determining the df level with the m-signal at which distortions (whatever they are) become inaudible (the s-level).
 
Last edited:
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
Yes, these are very similar, so it's not surprising that the results produced by the RMS difference and df appear to correlate well. The main distinction is the span of signal used to compute the metric: for the DeltaWave RMS difference it is the whole file, while for df it's a short, 0.4 s interval. This can potentially cause the two metrics to diverge under some conditions, possibly with non-linear amplitude and phase distortions. DW does provide a corrective filter for those cases, so with the non-linear amplitude and phase EQ enabled in DeltaWave, I would expect the two metrics to stay mostly in sync.

Like I said originally, I see the value of the short-window analysis used in df; that might be a good enough reason to simply add it as another metric to DeltaWave :)
I think we need just one special test signal to sync our computations of df. I will prepare it. Give me a few days.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
@pkane

Here are signals for testing df-measurements [https://www.dropbox.com/s/2tyhv2lw7y22ksz/se-pwn44-48-v1.0.rar?dl=0]:

- se-pwn44.wav - band-limited (20 Hz - 20 kHz) pseudo-white noise @44.1 kHz sampling rate
- se-pwn48.wav - band-limited (20 Hz - 20 kHz) pseudo-white noise @48 kHz sampling rate; its waveform is mathematically identical to that of se-pwn44.wav
- se-pwn48flip.wav - the same as se-pwn48.wav but time-reversed (flipped end to end)
- se-pwn48mix.wav - a mix of se-pwn48.wav and se-pwn48flip.wav

All signals are 30 s long, mono, 32-bit. Below are diffrograms (100 ms) of the signals, with se-pwn44.wav used as the reference; the min, median and max of the 300 df values are indicated.

(1) se-pwn44.wav vs. se-pwn48.wav
se-pwn48.wav(48)__se-pwn44.wav(44)__mono_100-100.5928-100.1370-99.6344.png


Theoretically this signal has df = -Inf, but in practice the df level with this signal shows the accuracy of the computation and time-warping. In my case it is -100 dB, which is a reasonable trade-off between accuracy and computation time, as real DUTs usually reproduce white noise far less accurately [Hugo 2: df(wn) = -28.4 dB].


(2) se-pwn44.wav vs. se-pwn48flip.wav
se-pwn48flip.wav(48)__se-pwn44.wav(44)__mono_100-0.3213-0.2064-0.1019.png


The df level for this signal is not 0 dB because there is a small dependency between the signal and its reversed copy (corr. coeff = 0.0057) due to the pseudo-randomness (a sum of 20000 sine waves with random frequencies and phases).


(3) se-pwn44.wav vs. se-pwn48mix.wav
se-pwn48mix.wav(48)__se-pwn44.wav(44)__mono_100-5.7015-5.3648-5.0305.png


The df level is slightly better than the theoretical value 10*log10(1 - 1/sqrt(2)) = -5.3329 dB, again due to the small dependency between the signals in the mix.

You can safely use these df levels for calibrating your df computation with these signals (while I generate “more random” signals, which will be closer to the theoretical values).
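These theoretical values are easy to reproduce numerically. A quick check with generic white noise standing in for the pseudo-white-noise files, assuming df = 10*log10(1 - r) with r the correlation coefficient (the form consistent with the 10*log10(1 - 1/sqrt(2)) figure above):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal(1_000_000)   # stand-in for the pseudo-white noise
f = x[::-1]                          # time-reversed copy
m = x + f                            # the "mix" signal

r = np.corrcoef(m, x)[0, 1]          # ~1/sqrt(2): the mix is half "x"
df = 10 * np.log10(1 - r)            # ~-5.33 dB, matching the theory above
```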
 
Last edited:

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,632
Likes
10,205
Location
North-East
@pkane

Here are signals for testing df-measurements [https://www.dropbox.com/s/2tyhv2lw7y22ksz/se-pwn44-48-v1.0.rar?dl=0]:

- se-pwn44.wav - band-limited (20 Hz - 20 kHz) pseudo-white noise @44.1 kHz sampling rate
- se-pwn48.wav - band-limited (20 Hz - 20 kHz) pseudo-white noise @48 kHz sampling rate; its waveform is mathematically identical to that of se-pwn44.wav
- se-pwn48flip.wav - the same as se-pwn48.wav but time-reversed (flipped end to end)
- se-pwn48mix.wav - a mix of se-pwn48.wav and se-pwn48flip.wav


Just so I'm clear, Serge, you are comparing two files at different sampling rates and computing the df metric for these?

This is going to be harder to do with DeltaWave (at least until I make some changes) because DW uses an automatic resampler that will match the sampling rates when they are not the same. I can add an option to not resample and instead try to correct using linear phase correction. Let me play around with it.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
Just so I'm clear, Serge, you are comparing two files at different sampling rates and computing the df metric for these?
Exactly. My algorithm measures the difference between waveforms no matter what sample rate is used; in other words, df levels do not depend on the sampling rate.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,632
Likes
10,205
Location
North-East
Exactly. My algorithm measures the difference between waveforms no matter what sample rate is used; in other words, df levels do not depend on the sampling rate.

That might be a problem: the DW drift-detection algorithm is designed for a few hundred ppm, not close to 1000. That's why it resamples to match rates. I'll run the test with the resampler first.
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
If your resampler is accurate, it will not spoil the df measurements. I also use resampling before measurement. Test 44 vs. 48 (1) shows how accurate your processing is (resampling + warping + df computation), as both files contain mathematically identical waveforms sampled at different rates.
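That check can be imitated with scipy's polyphase resampler. A rough sketch with a single band-limited tone standing in for the pseudo-white noise (resample_poly and the exact 147/160 rational factor are my choices for the sketch, not necessarily what either tool uses internally):

```python
import numpy as np
from scipy import signal

# The "same" band-limited waveform sampled at both rates (1 s each)
f0 = 997.0
t44 = np.arange(44100) / 44100
t48 = np.arange(48000) / 48000
x44 = np.sin(2 * np.pi * f0 * t44)
x48 = np.sin(2 * np.pi * f0 * t48)

# 48 kHz -> 44.1 kHz: 44100/48000 reduces to the rational factor 147/160
y = signal.resample_poly(x48, 147, 160)   # now 44100 samples

# Compare away from the edges, where the polyphase filter has settled
n = min(len(y), len(x44))
sl = slice(2000, n - 2000)
r = np.corrcoef(x44[sl], y[sl])[0, 1]     # extremely close to 1
```

For this r, 10*log10(1 - r) is well below -50 dB, i.e. an accurate resampler is not the bottleneck in such a measurement.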
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,632
Likes
10,205
Location
North-East

Some initial DeltaWave results for these test files, using the DeltaWave resampler to match rates and then aligning and correcting linear timing errors:

(1) se-pwn44.wav vs. se-pwn48.wav
DF Metric (step=100ms, overlap=0%): Median=-147.5dB Max=-99.6dB Min=-300dB 1% > -300.0dB 10% > -300.0dB 25% > -276.59dB 50% > -147.5dB 75% > -143.21dB 90% > -124.56dB

(2) se-pwn44.wav vs. se-pwn48flip.wav
DF Metric (step=100ms, overlap=0%): Median=-0.1dB Max=0.1dB Min=-0.3dB 1% > -0.28dB 10% > -0.19dB 25% > -0.16dB 50% > -0.1dB 75% > -0.05dB 90% > -0.01dB

(3) se-pwn44.wav vs. se-pwn48mix.wav
DF Metric (step=100ms, overlap=0%): Median=-5.4dB Max=-5dB Min=-5.7dB 1% > -5.62dB 10% > -5.52dB 25% > -5.44dB 50% > -5.36dB 75% > -5.27dB 90% > -5.2dB

I can now vary the time interval and the overlap, if desired. I see small changes from varying these parameters, but nothing huge, maybe +/- 1 dB or so.
 