
Alternative method for measuring distortion

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,849
Likes
37,801
Normally, with the df-metric there is no need to weight the various distortions, because all of them are already naturally weighted in the output music signal (m-signal) of a DUT, and we can measure and research the distortion of that m-signal directly. Technical signals (t-signals) are only useful during the development of audio equipment; there is no need to use them for assessing audio quality. All the required information about the latter is in the output m-signal.

There is another reason why the distortion of the m-signal has the highest status: its distortion scale has a point (around -50 dB, to my current understanding) at which the DUT becomes transparent for any listener. In other words, the output signal follows the input one so accurately that human hearing cannot discern them. In the end, a listener wants to have at the output of his amplifier exactly the same waveform that he has in the file.

So the color scale is absolute, as is the scale of Df values. And yes, some signals are badly distorted even in high-quality audio devices, so they are red, in accordance with the measurements )).
To ask a question: when you say a Df level is -50 dB, is that dBFS of the difference, or the RMS level of the difference vs. the RMS level of the original? In other words, what is -50 dB the ratio of, in your way of listing it?
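For concreteness, one common null-test convention (an assumption here - the exact Df definition may differ) expresses the level as the RMS of the difference signal relative to the RMS of the reference:

```python
import numpy as np

def df_level_db(reference, output):
    """RMS level of the difference signal relative to the RMS level of
    the reference, in dB. This is one common null-test convention, not
    necessarily the exact Df definition used by SoundExpert."""
    diff = output - reference
    rms = lambda x: np.sqrt(np.mean(x**2))
    return 20 * np.log10(rms(diff) / rms(reference))

# A 1 kHz reference tone plus an artifact 50 dB below it:
t = np.arange(48000) / 48000
ref = np.sin(2 * np.pi * 1000 * t)
out = ref + 10**(-50 / 20) * np.sin(2 * np.pi * 3000 * t)
print(round(df_level_db(ref, out)))  # → -50
```

Under this convention, -50 dB means the residual carries 1e-5 of the reference's power, regardless of absolute playback level (unlike dBFS).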
 
OP
S

Serge Smirnoff

Active Member
Joined
Dec 7, 2019
Messages
240
Likes
136
OK, so most of the info is "by engineers for engineers", fair enough. I was thinking more in terms of the customer's/average user's point of view: someone who would possibly be interested in a synthetic SQ measure/estimate that is grounded in science, and who would also like to know which particular characteristic "fails" an otherwise well-performing device.
Yes, I offer that average user an aggregated natural (not synthetic) audio parameter - the level of distortion of two hours of various music material (the median of the histogram). Hardly any other parameter could be better for predicting perceived audio quality.
 
Serge Smirnoff (OP) · Active Member
I think I understand. He takes two level-matched, time-aligned files, applies an algorithm to minimize phase differences, takes the difference, and gets something like this:
Correct.
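The pipeline described above can be sketched as follows - a simplified stand-in using plain NumPy (crude integer alignment and a least-squares gain, not the actual DeltaWave or SoundExpert processing):

```python
import numpy as np

def null_residual(ref, out):
    """Sketch of a basic difference (null) test:
    1) align the signals by the integer lag found via cross-correlation,
    2) match levels with a least-squares gain,
    3) subtract to obtain the residual (difference) signal."""
    lag = np.argmax(np.correlate(out, ref, mode="full")) - (len(ref) - 1)
    aligned = np.roll(out, -lag)                            # integer-sample alignment
    gain = np.dot(aligned, ref) / np.dot(aligned, aligned)  # least-squares level match
    return ref - gain * aligned
```

For a DUT that only delays and attenuates the signal, the residual nulls out almost completely; real devices leave distortion and noise behind, and fractional delays and phase shifts require the finer correction discussed in this thread.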
 
Serge Smirnoff (OP) · Active Member
Are the listener testing results "preference" or "objective" data - or a combination of the two?
SE listening tests of codecs are normal blind tests with artifact amplification for high-bitrate codecs.
 
Serge Smirnoff (OP) · Active Member
Might be interesting if @Serge Smirnoff could look at Deltawave, and try it with his algorithm on some given files and see if the results are the same or very similar to Deltawave results. You can download Deltawave here:
https://deltaw.org/

Oh, and welcome to ASR Serge Smirnoff.
Yes, the core idea is the same - to measure and research the difference signal in order to make inferences about audio quality. I will check it out for sure.

Thanks, Blumlein.
 

msmucr

Member
Joined
Aug 16, 2019
Messages
34
Likes
53
Location
Prague, Czech Republic
I admit I haven't read all the accompanying technical articles linked before. Honestly, I'm not really sure about such metrics.

Similar methods have been around for quite a long time - I recall, for example, a thread at the Gearslutz site where someone collects AudioDiffmaker results from various AD/DA loops. I personally don't find it very useful, rather misleading. This seems to be similar, just with a defined set of various test signals and statistics.
I commonly use residual analysis after subtraction for various small isolated comparisons of changed components in a common chain, debugging of issues, quick checks of transparency, etc., and it is indeed very useful for that.
However, I was always a bit skeptical about a single all-encompassing figure based on difference-signal level or correlation depth for the generic evaluation of audio devices - or, more precisely, about the relation of such a single figure to perceptual differences.

Such a difference level is a rough indication of the similarity of the input and output signals, and in general, of course, the less difference the better. But in practice it's not such a good indicator for overall comparisons, IMO, because there can be a multitude of reasons why the signals differ (all kinds of distortions with very different characteristics, phase shift...), and not all of those reasons carry the same weight in perceptible difference.
So two DUTs can have very similar overall levels of diff signal, yet one of them will sound more transparent - for example, because its distortion has a different frequency distribution and different characteristics. Or take a different example: some device might have, say, a 30 dB lower level of diff signal than another one. At first look this seems to indicate that the second device is vastly inferior to the first; however, there may simply be a minimum-phase interpolation filter in its DAC, which causes a big difference in diff level but is not necessarily perceived as problematic in a listening test.
There can also be issues with clock drift between the source and capture devices if they aren't synced. There are certainly ways to compensate for that in the measurement, with varying degrees of effectiveness, but then you're also altering the effects of clocking at the source device, which is still an important aspect for comparisons.
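To illustrate the compensation mentioned above: a constant clock-rate mismatch can in principle be undone by resampling. A minimal sketch (assuming the drift in ppm is already known - real tools must estimate it from the signals themselves):

```python
import numpy as np

def compensate_drift(captured, ppm):
    """Illustrative sketch only: undo a known, constant clock-rate
    mismatch of `ppm` parts-per-million by resampling with linear
    interpolation."""
    # Time, in source-clock samples, at which each captured sample was taken.
    src_times = np.arange(len(captured)) * (1 + ppm * 1e-6)
    # Re-evaluate the captured waveform at integer source-clock times.
    return np.interp(np.arange(len(captured)), src_times, captured)
```

Estimating the drift, and the fact that it is rarely perfectly constant, is exactly where such compensation gets lossy.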

I'll definitely check those papers - and please don't take my initial comment as sheer negativity towards your work or "diffing" in general. It's just that, in my previous experience, the method had several natural limitations, which is why I personally wouldn't use it to make any general chart of various audio devices.

Michal
 
Serge Smirnoff (OP) · Active Member
Such a difference level is a rough indication of the similarity of the input and output signals, and in general, of course, the less difference the better. But in practice it's not such a good indicator for overall comparisons, IMO, because there can be a multitude of reasons why the signals differ (all kinds of distortions with very different characteristics, phase shift...), and not all of those reasons carry the same weight in perceptible difference.
So two DUTs can have very similar overall levels of diff signal, yet one of them will sound more transparent - for example, because its distortion has a different frequency distribution and different characteristics. Or take a different example: some device might have, say, a 30 dB lower level of diff signal than another one. At first look this seems to indicate that the second device is vastly inferior to the first; however, there may simply be a minimum-phase interpolation filter in its DAC, which causes a big difference in diff level but is not necessarily perceived as problematic in a listening test.
There can also be issues with clock drift between the source and capture devices if they aren't synced. There are certainly ways to compensate for that in the measurement, with varying degrees of effectiveness, but then you're also altering the effects of clocking at the source device, which is still an important aspect for comparisons.
Right you are - the amount of diff signal can be misleading in some cases. And I have a simple criterion for discovering such cases: the artifact signatures of the tested devices differ too much (by more than 2-3 dB). In other cases, Df levels correlate well with perceived quality scores - http://soundexpert.org/articles/-/b...asurements-to-predict-listening-test-results-
A real example of this approach is in the use case - http://soundexpert.org/articles/-/blogs/audio-quality-of-sbc-xq-bluetooth-audio-codec
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,732
Likes
10,413
Location
North-East

Michal, all valid points, but AudioDiffMaker is a bit out of date. Please take a look at DeltaWave, which can adjust for variable group delay, as well as for filter frequency-related attenuation, etc., before computing a null. The resulting nulls are significantly lower than with DiffMaker, since various additional distortions can be taken into account during processing.
 
Serge Smirnoff (OP) · Active Member
Or take a different example: some device might have, say, a 30 dB lower level of diff signal than another one. At first look this seems to indicate that the second device is vastly inferior to the first; however, there may simply be a minimum-phase interpolation filter in its DAC, which causes a big difference in diff level but is not necessarily perceived as problematic in a listening test.
A precise time-warping algorithm completely removes all linear deformations of the time axis of the output signal; they are not counted. So a minimum-phase interpolation filter will not cause a big difference in diff level.
 
Serge Smirnoff (OP) · Active Member
The resulting nulls are significantly lower than with DiffMaker, since various additional distortions can be taken into account during processing.
BTW, Df levels are the lowest possible values for any two given waveforms (in the digital domain), thanks to an iterative search for the global minimum, which is unique and can be found to any required accuracy (currently 1e-4 dB).
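The iterative search can be sketched roughly as follows - a hypothetical coarse-to-fine refinement over a fractional time offset, with a closed-form gain match at each candidate (an illustration of the idea, not the actual Matlab implementation):

```python
import numpy as np

def fractional_shift(x, d):
    """Delay signal x by d samples (fractional, circular) via an FFT phase ramp."""
    f = np.fft.rfftfreq(len(x))
    return np.fft.irfft(np.fft.rfft(x) * np.exp(-2j * np.pi * f * d), len(x))

def min_df_level(ref, out, span=2.0):
    """Coarse-to-fine search for the time offset (and best-fit gain)
    that minimizes the difference level, approximating its global minimum."""
    best_d, step = 0.0, span
    for _ in range(30):                                # refine geometrically
        cands = best_d + step * np.linspace(-1, 1, 9)
        energies = []
        for d in cands:
            s = fractional_shift(out, d)
            g = np.dot(s, ref) / np.dot(s, s)          # optimal gain, closed form
            energies.append(np.sum((ref - g * s) ** 2))
        best_d = cands[int(np.argmin(energies))]
        step /= 2                                      # halve the search window
    s = fractional_shift(out, best_d)
    g = np.dot(s, ref) / np.dot(s, s)
    return 10 * np.log10(np.sum((ref - g * s) ** 2) / np.sum(ref ** 2))
```

With a wide-enough starting grid and geometric refinement, this converges to the global minimum whenever the error surface is unimodal near it, which is what makes a "lowest possible value" claim testable in practice.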
 

pkane · Master Contributor, Forum Donor
A precise time-warping algorithm completely removes all linear deformations of the time axis of the output signal; they are not counted. So a minimum-phase interpolation filter will not cause a big difference in diff level.

You'd think so, but not all devices have linear deformations. Here is a good example - I don't know what kind of filter caused this, but it certainly isn't correctable by a simple time-warping algorithm. Blue is the phase-difference plot between the original file and the file played back through the DUT:

1575849438599.png
 
Serge Smirnoff (OP) · Active Member
Here is a good example - I don't know what kind of filter caused this, but it certainly isn't correctable by a simple time-warping algorithm. Blue is the phase-difference plot between the original file and the file played back through the DUT:
Yes, in most cases the deformation of the time scale is not linear. But with a 400 ms window for computing Df this is not a problem in real life, because those non-linear time deformations are usually slow, and within a 400 ms window they can be considered linear. If such a non-linearity occurs within the 400 ms window, then it really does increase the Df level. Technically, it is possible to find the real Df level in such a case by gradually decreasing the time window. But 400 ms is a pretty long time period for human hearing; there is a high probability that such a time deformation of the signal will be perceived. So it should be registered/accounted for. Further analysis of the artifact signatures will show whether such time distortion is important or not.
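The windowed computation described above can be sketched like this (simplified and hypothetical: non-overlapping windows, a per-window gain match only, and edge windows dropped - an illustration, not the reference implementation):

```python
import numpy as np

def df_histogram(ref, out, fs, window_ms=400):
    """Per-window difference levels: split both signals into
    non-overlapping windows, compute a gain-matched difference level
    in dB for each, and drop the first and last windows, which
    suffer from edge effects. Returns (min, median, max)."""
    n = int(fs * window_ms / 1000)
    levels = []
    for i in range(0, min(len(ref), len(out)) - n + 1, n):
        r, o = ref[i:i + n], out[i:i + n]
        g = np.dot(o, r) / np.dot(o, o)          # per-window level match
        levels.append(10 * np.log10(np.sum((r - g * o) ** 2) / np.sum(r ** 2)))
    levels = levels[1:-1]                        # exclude edge windows
    return min(levels), float(np.median(levels)), max(levels)
```

A real implementation would also apply the per-window time alignment discussed above before subtracting; this sketch only shows how the histogram of window-level Df values is assembled.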
 

pkane · Master Contributor, Forum Donor

Sure. On a short interval the phase will not be nearly as big an issue. What is interesting, though: in testing DeltaWave over hundreds of test files, correcting for non-linear phase differences improved the computed delta dramatically. Off the cuff, many of the recordings produced a difference closer to -80 dB to -90 dB RMS over the entire file, compared to only around -50 to -60 dB without the non-linear phase corrections.

Again, most likely this will not be so dramatic on a 400 ms clip. This should be easy to test. If you don't object, I could try to add the Df type of measure to DeltaWave to see how it performs. All the necessary computations are already done by the software, just on a larger time scale.
 
Serge Smirnoff (OP) · Active Member
What is interesting, though: in testing DeltaWave over hundreds of test files, correcting for non-linear phase differences improved the computed delta dramatically. Off the cuff, many of the recordings produced a difference closer to -80 dB to -90 dB RMS over the entire file, compared to only around -50 to -60 dB without the non-linear phase corrections.
This is how I see the time inconsistency of the signal on a diffrogram:

iBassoDX50_sine12.5k_mono_100-78.9035-64.8288-46.2091.png


The image name: iBassoDX50_sine12.5k_mono_100-78.9035-64.8288-46.2091.png
100 = the time window in ms
-78.9035 / -64.8288 / -46.2091 = the Min / Median / Max of the Df levels (excluding the first and the last Df levels of the signal, as they are almost always erroneous due to edge effects)

Again, most likely this will not be so dramatic on a 400 ms clip. This should be easy to test. If you don't object, I could try to add the Df type of measure to DeltaWave to see how it performs. All the necessary computations are already done by the software, just on a larger time scale.
Cool! If you need any details, I'm ready. The problem I see is the time-warping algo: you definitely use a different one, and this can affect the resulting Df levels. Unfortunately, my algo is computationally very intensive (12 hours to compute the histogram for a df-slide) and I have no idea how to make it more efficient. On the other hand, it works robustly with any signals, does not require any adjustments for time-warping, and returns the lowest possible Df level for two given signals and a time window. This is the beauty of using linear-only phase/pitch correction. So I probably have a so-called reference implementation of the required processing. Matlab code is here - http://soundexpert.org/articles/-/blogs/visualization-of-distortion#part3
 

pkane · Master Contributor, Forum Donor

My implementation is in the frequency domain and takes a minute or less for a 2-3 min file on a modern PC. It'll take longer to do the histogram for multiple 400 ms sections, but I suspect it will not be much more than a few minutes. Do you overlap the sections (and if so, by how much)? And is 400 ms the best size, or should I make it a variable setting?

Also, can you please point me to a few test files with the corresponding Df results using your method? It'll make it easier for me to validate my implementation and to see whether we are getting similar results.
 
Serge Smirnoff (OP) · Active Member
I do not overlap the sections, but I plan to implement this. It does not affect the Df values, but it is important for generating correct (time-warped) audio files.

Different windows are helpful for psycho-acoustic research, so I implemented an "any value" solution; for the purposes of testing audio, 50 and 400 ms are enough.

Will a few pairs of 30 s test signals be OK?
 

pkane · Master Contributor, Forum Donor
Will a few pairs of 30 s test signals be OK?

That would be perfect!
 