# Alternative method for measuring distortion

#### Tks

##### Major Contributor
@Serge Smirnoff

Very happy to see you here presenting your work to some of the folks here. I was wondering if you would ever swing by.

I've tried reading the technical papers, but sadly, as I went exploring what certain things meant, I got into a rabbit hole I never managed to get out of >_<


#### Serge Smirnoff

##### Active Member
> Very happy to see you here presenting your work to some of the folks here. I was wondering if you would ever swing by.

Thank you! I'm trying hard. There will be something interesting ahead.

#### pkane

##### Major Contributor
Forum Donor

Hi Serge,

Some preliminary results using the test files you provided and DeltaWave. Each run took about 2 minutes with a 100ms non-overlapping window. Please ignore the vertical scale -- I didn't change the display for DF yet. Comments?

1. White noise:
(your results were Df.max = -7.8717 dB, Df.min = -8.4315 dB, Df.median = -8.1572 dB)

2. Sine wave (1000Hz vs 1000.002Hz):
(your results were Df.max = -141.9116 dB, Df.min = -Inf dB, Df.median = -149.1320 dB)

3. Glockenspiel
(your results: Df.max = -8.1194 dB, Df.min = -68.3725 dB, Df.median = -41.8556 dB)


#### Serge Smirnoff

##### Active Member
Great results! You probably use the 20·lg definition of the Df level while I use 10·lg, so your values are double mine. Otherwise the results are very close, though some deviation from my Df levels is possible due to the linearity issue discussed above and the accuracy of the FFT computation in your case. Mathematically, the problem of linear time warping has only one possible solution, which is what my algo implements. In the linear case the precision of df computation can be compared between our methods. By additionally applying a non-linear correction of the time scale you can achieve better (lower) df values. I tried a similar correction method (in the time domain) at the beginning of my research. Such a non-linear operation is known as canonical time warping. A similar approach is used in image registration, which aims at matching images in biomedicine and computer vision [https://en.wikipedia.org/wiki/Image_registration]. I returned to the linear transformation for two reasons:

- I have some doubts that canonical time warping is a deterministic operation with only one solution. I could not develop a reliable algo for the purpose.
- It turned out that in audio such non-linear correction is not necessary, due to the nature of distortions in audio and the practical convenience of computing df levels piece-wise with some window. Moreover, linear correction makes slow time deformations of the signal visible on diffrograms. In practice they are very rare and usually point to some drawback in the circuit design. Most modern audio solutions do not have this problem.

So, I returned to linear time warping as simpler, more interpretable, and more helpful in audio.

Concerning the precision of computation using FFT, I think some accuracy loss is acceptable in favor of computational efficiency. My algo processes the white-noise test vector in 7 min on my notebook, and the sine one takes around 25 min, as the search for the minimum is deeper in that case (lower df levels).
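From the discussion in this thread (a Df level computed per window as 10·lg of one minus the correlation coefficient), a minimal Python sketch of the windowed computation might look like the following. The function name and the clamping constant are my own, and the time-warping search that precedes this step in the real algo is omitted; the signals are assumed already time-aligned:

```python
import numpy as np

def df_levels(ref, dut, fs, window_s=0.1):
    """Windowed Df levels: 10*log10(1 - r) for each non-overlapping
    window, where r is the correlation coefficient between the
    reference and DUT samples (signals assumed already aligned)."""
    n = int(fs * window_s)
    levels = []
    for start in range(0, min(len(ref), len(dut)) - n + 1, n):
        a = ref[start:start + n]
        b = dut[start:start + n]
        r = np.corrcoef(a, b)[0, 1]
        # clamp the argument: r can reach exactly 1.0 for identical windows
        levels.append(10 * np.log10(max(1.0 - r, 1e-30)))
    return np.array(levels)
```

The max, min, and median of the returned sequence would correspond to the Df.max, Df.min, and Df.median figures quoted earlier in the thread.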


#### pkane

##### Major Contributor
Forum Donor
> Great results! You probably use the 20·lg definition of the Df level while I use 10·lg, so your values are double mine. Otherwise the results are very close. [...]

Actually, I realized I was using 20log(1-corrcoeff) when I got -500dB results on the sine test file. I found the difference looking at your MATLAB code, so I changed it to 10log to be consistent. The results I posted were already with 10log.

By the way, I'm very familiar with image registration algorithms, having created astronomical software for aligning multiple frames that may be deformed in various ways and differently scaled, with noise and distortions. Astronomy and astrophotography are my other hobbies.

DeltaWave is performing a global analysis on the whole file to determine three parameters: initial sample offset, clock drift (linear time error) and amplitude. In this case, I didn't use the non-linear time (or amplitude) error correction since the files are too short. This operation requires quite a few more samples to find a repeating pattern. Most files I normally test with are 60 to 120 seconds in length.
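To make that global (linear) alignment concrete, here is a hypothetical, stripped-down sketch: the integer sample offset from the cross-correlation peak and a least-squares amplitude scale. The function name is my own, and DeltaWave's actual fit also estimates clock drift and sub-sample offsets, which this sketch omits:

```python
import numpy as np

def align_linear(ref, dut):
    """Estimate the integer sample offset (cross-correlation peak) and
    the best-fit amplitude scale (least squares) between two captures.
    Clock-drift estimation is omitted for brevity."""
    # Lag of dut relative to ref that maximizes cross-correlation
    corr = np.correlate(dut, ref, mode="full")
    offset = int(np.argmax(corr)) - (len(ref) - 1)
    # Overlap the two signals at that offset
    if offset >= 0:
        a, b = ref[:len(ref) - offset], dut[offset:offset + len(ref)]
    else:
        a, b = ref[-offset:], dut[:len(dut) + offset]
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    # Least-squares gain g minimizing ||b - g*a||
    gain = float(np.dot(a, b) / np.dot(a, a))
    return offset, gain
```

Adding drift as a second parameter of a least-squares fit over time would turn this into the three-parameter global alignment described above.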

For the next steps I'll run the analysis on a few real test tracks with known amounts of distortion to see how the df metric fares against this. Do you recommend using simple test signals (sine waves, white noise, etc.) or real music files? I can do both.


#### digicidal

##### Major Contributor
I'll be the first to admit that much of the math is a bit "beyond my pay-grade"... but I'm slowly working through the paper and the site. It's very interesting and I look forward to the day a more "layman-digestible" format exists (though the site does go a decent way towards this). Thanks for this.

@Tks - I'm right there with you on the rabbit hole thing... happens more than I want to admit - but it usually ends up in some level of comprehension. Significantly more productive than watching a TV show, in any case.


#### Serge Smirnoff

##### Active Member
> Actually, I realized I was using 20log(1-corrcoeff) when I got -500dB results on the sine test file. I found the difference looking at your MATLAB code, so I changed it to 10log to be consistent. The results I posted were already with 10log.
Still, I'm pretty sure we lost a ×2 multiplication somewhere. We'll find it later for sure.

> For the next steps I'll run the analysis on a few real test tracks with known amounts of distortion to see how the df metric fares against this. Do you recommend using simple test signals (sine waves, white noise, etc.) or real music files? I can do both.
You can choose/design any samples you like (be creative!). I think 30s tracks of any sample rate are OK. When the test vectors are ready, please share them so I can process them too. We'll compare the results.


#### Serge Smirnoff

##### Active Member
> I'll be the first to admit that much of the math is a bit "beyond my pay-grade"... but I'm slowly working through the paper and the site. It's very interesting and I look forward to the day a more "layman-digestible" format exists (though the site does go a decent way towards this). Thanks for this.

#### pkane

##### Major Contributor
Forum Donor
> Still, I'm pretty sure we lost a ×2 multiplication somewhere. We'll find it later for sure.

> You can choose/design any samples you like (be creative!). I think 30s tracks of any sample rate are OK. When the test vectors are ready, please share them so I can process them too. We'll compare the results.

Sounds good. I can also post one of the test files already corrected by DeltaWave, so that you can compute the DF value directly, without warping. This will tell us whether the difference is due to the df value computation or the warping algorithm. I don't have MATLAB, otherwise I'd run the test myself.

#### Serge Smirnoff

##### Active Member
Yes, good idea.

#### pkane

##### Major Contributor
Forum Donor
> Yes, good idea.

Give this a try. This is the #5_out_wn.wav file corrected by DeltaWave. I saved it as 64-bit floating point WAV, since that's the internal format used by DeltaWave: https://www.dropbox.com/s/wikgo7cxibetbkj/#5_out_wn_DW.zip?dl=0

Let me know what you find. My computation produced -18.8dB median df value.

(EDIT: I realized that I posted a delta file instead of the corrected one. Now fixed)


#### Serge Smirnoff

##### Active Member
NoWarp df measurements:

-14.3661, -14.0887, -13.8006

Warped df measurements:

-14.3709, -14.0912, -13.8007

I think if you reduce the number of FFT points you will get even better (lower) Df measurements, as the reduction hides some details of both signals and their correlation increases.

#### pkane

##### Major Contributor
Forum Donor
> NoWarp df measurements: -14.3661, -14.0887, -13.8006
>
> Warped df measurements: -14.3709, -14.0912, -13.8007
>
> I think if you reduce the number of FFT points you will get even better (lower) Df measurements, as the reduction hides some details of both signals and their correlation increases.

Great! So it looks like my df calculation is off by about -4dB then. Let me double-check. (I am discarding the first and the last 100ms segments -- are you doing the same?)


#### Serge Smirnoff

##### Active Member
The concept of artifact signatures is an important part of the df-metric.

As we all like to say, not all distortions are equally annoying. The Df level is no exception. Equal average df levels with some m-signal do not necessarily mean that the DUTs will sound the same. And vice versa: a better df level does not always indicate better perceived audio quality. In other words, the dependency holds in some cases but not in others. The df-metric has a method to discover the cases where the relationship between df levels and subjective scores is strong, and the cases where such a relationship is absent.

Usually df levels for an m-signal are measured with some window (400ms), resulting in a sequence of df values which shows how the signal is distorted by a DUT over time. Such sequences can be visualized with diffrograms. Below are diffrograms of the Chord Hugo 2 and FiiO M11 with the whole track “A Day in the Life” by The Beatles:

Each color corresponds to some df level according to the color scale, which is algorithmically defined. We can see that the Hugo 2 is “greener” in general, and in some parts of the track substantially so. The Hugo 2 is more accurate in reproducing this waveform.

Such a sequence of df levels for some DUT with some signal can be used as the artifact signature of the DUT, because its df values precisely (with 400ms resolution) register the character of distortion of the signal. The latter can be a set of various t-signals as well; for some “investigations” t-signals work even better than m-signals. But from a listener's perspective, an artifact signature computed using an m-signal is much more interesting and informative. At this stage of research I use two hours of various music material as the m-signal. The artifact signatures of our DUTs look as follows:

Now we can compare them and measure their similarity. I tried several measures of similarity and they all give close results, so at the moment I prefer to use a distance measure based on the mean absolute error, as it is the simplest and most easily interpretable, having its scale in dB units. The resulting distance between the artifact signatures of the Hugo 2 and FiiO M11 is 1.24dB.
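Treating an artifact signature simply as the sequence of windowed df levels in dB, the distance measure described here reduces to a mean absolute error (the function name below is my own):

```python
import numpy as np

def signature_distance(sig_a, sig_b):
    """Mean absolute error between two artifact signatures
    (equal-length sequences of windowed Df levels, in dB)."""
    sig_a, sig_b = np.asarray(sig_a, float), np.asarray(sig_b, float)
    assert sig_a.shape == sig_b.shape, "signatures must cover the same windows"
    return float(np.mean(np.abs(sig_a - sig_b)))
```

Because the inputs are already in dB, the resulting distance is directly interpretable in dB as well, which is the convenience mentioned above.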

Distances between artifact signatures can be visualized in 2D by means of a dendrogram. For 11 measured devices it looks as follows:

Devices naturally fall into groups according to the similarity of their artifact signatures.

3D visualization gives an even better representation of the clusters:

(interactive plot for different DUTs is here - http://soundexpert.org/portable-players#artsig)

Based on measurements of portable players and psychoacoustic encoders, I can preliminarily conclude that a distance of 1.5dB - 2.0dB is critical for relating df levels to subjective scores. For a group of DUTs with similar artifact signatures (below the critical distance), the relation of df levels to the corresponding subjective quality scores is linear or quadratic (not higher). In other words, within such a group of DUTs a lower df level indicates better perceived audio quality.
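As a toy illustration of that grouping rule, the sketch below clusters DUTs whose signatures lie within the critical distance of one another, using a single-linkage flood fill over the pairwise mean-absolute-error matrix. The helper names and the single-linkage choice are my own assumptions, not necessarily how the SoundExpert dendrograms are built:

```python
import numpy as np

def group_by_critical_distance(signatures, critical_db=1.5):
    """Group DUTs whose artifact signatures are within critical_db of
    each other (single-linkage flood fill over the pairwise MAE matrix).
    `signatures` maps DUT name -> sequence of windowed Df levels (dB)."""
    names = list(signatures)
    sigs = [np.asarray(signatures[n], float) for n in names]
    k = len(names)
    # Pairwise mean-absolute-error distance matrix, in dB
    dist = np.array([[np.mean(np.abs(sigs[i] - sigs[j])) for j in range(k)]
                     for i in range(k)])
    groups, unassigned = [], set(range(k))
    while unassigned:
        seed = unassigned.pop()
        group, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unassigned if dist[i, j] <= critical_db}
            unassigned -= near
            group |= near
            frontier.extend(near)
        groups.append([names[i] for i in sorted(group)])
    return groups
```

Within each resulting group, the claim above is that df levels can be ranked against subjective quality; across groups they cannot.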

A more reliable value of the critical distance can be derived from the results of listening tests of various DUTs (I need to complete this research - http://soundexpert.org/articles/-/b...asurements-to-predict-listening-test-results-).

It is worth adding that the method works for any DUTs regardless of their nature (black box); yes, for psychoacoustic encoders as well )).


#### Tks

##### Major Contributor
> I'll be the first to admit that much of the math is a bit "beyond my pay-grade"... but I'm slowly working through the paper and the site. It's very interesting and I look forward to the day a more "layman-digestible" format exists (though the site does go a decent way towards this). Thanks for this.

> @Tks - I'm right there with you on the rabbit hole thing... happens more than I want to admit - but it usually ends up in some level of comprehension. Significantly more productive than watching a TV show, in any case.

and

DRINKS!

Is it better then? ;P

#### pkane

##### Major Contributor
Forum Donor
> Great! So it looks like my df calculation is off by about -4dB then. Let me double-check. (I am discarding the first and the last 100ms segments -- are you doing the same?)

On a whim, I tried engaging the non-linear phase and amplitude correction. This resulted in a much improved df metric. Not very conclusive, as a larger file (more samples) is needed to compute this properly, but interesting nevertheless:

I found the issue with the differences between our results. I looked over the code multiple times, and only at the last moment noticed that I was using the natural log instead of base 10! That's now fixed.

The results for just the linear correction, for the test I posted earlier, are now much closer to your values.

#### pkane

##### Major Contributor
Forum Donor
@Serge Smirnoff :

Continuing to test with real music recordings. I ran a few tests against recorded DAC/ADC loop files from the collection posted here. This is with non-linear amplitude/phase corrections enabled. What I find is that the median DF value is well correlated with the overall RMS of the difference (shown at the bottom of each screen capture), so I'm not sure the metric is sufficiently different from a simple RMS delta value.

#### eliash

##### Senior Member
Interesting approach, especially regarding the significant differences with white noise. Looks to me like a circuit or chip design issue (what do they do differently inside, or is it a measurement issue after all?).

...After stumbling across all the class D amp evaluations elsewhere in the forum, it would be interesting to compare good class AB and class D amps in that manner...

#### RayDunzl

##### Grand Contributor
Central Scrutinizer
I've lost the story line here.

What are the graphs above displaying?
