
Beta-test: DeltaWave Null Comparison software

pkane (OP), Major Contributor, Forum Donor. Joined Aug 18, 2017. Location: North-East.
This is maybe a more general question, but I'm trying to make sense of the various null depth figures.

I've been used to loading files in a DAW, inverting the phase of one of the two and playing them together.

If I do this for a couple of 24-bit files I have here, I get a signal at -138dB (the only difference between the two is dither).
But DeltaWave says the null depth is 120dB. And the correlated null depth is 193dB.
What's the explanation for that difference? And I guess correlated null depth measures something else... but what?

Null depth is the average (RMS) of the difference waveform between the two tracks.

Correlated null depth is a measure of how well the two tracks match in time: the larger the value, the smaller the timing errors. It makes sense that there are no timing errors between a dithered and an undithered track, so the correlated null is very large.
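The RMS figure is easy to reproduce. A minimal numpy sketch (not DeltaWave's actual code), assuming two float arrays that are already aligned and level-matched, with full scale at 1.0:

```python
import numpy as np

def null_depth_db(reference, comparison):
    """RMS of the difference waveform, in dB relative to full scale (1.0)."""
    diff = reference - comparison
    rms = np.sqrt(np.mean(diff ** 2))
    return 20 * np.log10(rms) if rms > 0 else -np.inf

# Example: identical 1 kHz tones except for noise ~1e-6 of full scale
t = np.linspace(0, 1, 48000, endpoint=False)
a = 0.5 * np.sin(2 * np.pi * 1000 * t)
b = a + 1e-6 * np.random.default_rng(0).standard_normal(48000)
depth = null_depth_db(a, b)  # about -120 dB
```

Since the noise RMS is about 1e-6 of full scale, the null depth lands near 20·log10(1e-6) = -120 dB.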
 

BeerBear, Active Member. Joined Mar 9, 2020.
Null depth is the average (RMS) of the difference waveform between the two tracks.
How is the difference waveform produced or calculated?
And how does it relate to inverting the phase and mixing of the two tracks?
 
pkane (OP):
How is the difference waveform produced or calculated?
And how does it relate to inverting the phase and mixing of the two tracks?

Same idea, except that DeltaWave removes linear differences between the two tracks that would otherwise corrupt the difference file: different signal levels, different starting points (delay, to a tiny fraction of a sample), differences in clock rates, and even differences in sampling rates. DW can also remove some of the non-linear differences, if that's of any interest.
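Two of those linear corrections, constant delay and gain, are simple to illustrate. A simplified numpy sketch, not the actual DeltaWave algorithm (which also handles sub-sample delay, DC, clock drift, and sample-rate differences):

```python
import numpy as np

def align_and_match(reference, comparison):
    """Remove a constant delay (cross-correlation peak) and a gain
    difference (least-squares scale) before subtracting. Sketch only."""
    # Integer delay = lag of the cross-correlation peak
    xc = np.correlate(comparison, reference, mode="full")
    delay = int(np.argmax(xc)) - (len(reference) - 1)
    shifted = np.roll(comparison, -delay)  # np.roll wraps; real code would trim the ends
    # Gain = least-squares scale factor minimizing |ref - g*shifted|
    gain = np.dot(reference, shifted) / np.dot(shifted, shifted)
    return shifted * gain, delay, gain
```

With those two linear differences removed, the residual (the difference file) reflects only what actually differs between the tracks.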
 

BeerBear:
Yeah, I see how those features can help when comparing some files.

I should have asked this before, but if Null Depth is the RMS of the difference waveform, why is its (absolute) value not the same as that of Difference (RMS)?

And is there a way in DW to show the peak difference (not RMS), like the -138dB value in my example that you'd get from phase inverting and mixing?
 
pkane (OP):
(quoting BeerBear's questions above)
I use the terms null depth (RMS) and difference RMS interchangeably. Correlated null is computed differently to emphasize timing errors.

You can see the peak value of the difference on the Δ Waveform plot.
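In numpy terms, both numbers come from the same difference waveform; only the statistic differs. A sketch (again assuming aligned, level-matched float arrays):

```python
import numpy as np

def diff_stats_db(reference, comparison):
    """Peak and RMS of the difference waveform in dB. The RMS figure
    corresponds to null depth / Difference (RMS); the peak is what you'd
    read off the delta-waveform plot, or hear as the worst-case residual
    when phase-inverting and mixing in a DAW."""
    diff = np.abs(reference - comparison)
    peak_db = 20 * np.log10(np.max(diff))
    rms_db = 20 * np.log10(np.sqrt(np.mean(diff ** 2)))
    return peak_db, rms_db
```

A single large glitch dominates the peak while barely moving the RMS, which is why the two figures can sit far apart.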
 

syn08, Active Member. Joined Aug 9, 2019. Location: Toronto, Canada.
Very interesting software... I'm far from understanding all the output, but until I educate myself, a question: what would be the criteria for a "good" remastered version of a pre-digital-era master tape?

I tried a few examples (I have both the original CD and the remastered version) and noticed the pattern below (blue is the original, white is the remastered version). To my ears both sound much the same, which seems about right given that the major differences (if I can call them that) start around 18 kHz.

[attached spectrum comparison plot]


What else should I look at to estimate the remastering quality?
 
pkane (OP):
(quoting syn08's post above)

First, you'll want to perform an alignment that removes level and timing differences. Then look at the Aligned Spectrum to see how the frequency response/EQ differs between the two. You can check the delta of spectra (difference in frequency response) plot to see exactly which frequencies differ and by how much. The PK Metric plot estimates how audible these differences are, based on some estimates of audibility and frequency masking. You can listen to the difference between the two files to hear for yourself what was added or removed in the processing. And then you can try using DeltaWave to undo some of the changes, such as frequency response and level differences, to see if that's all there is to the remastering (using the non-linear EQ functions in DW). There is plenty more to do and check, including the various AB and ABX testing functions, phase difference plots, etc.
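The delta-of-spectra idea can be approximated with a single FFT per file. A rough numpy stand-in (DeltaWave computes its plot on the aligned files, with its own windowing and averaging choices):

```python
import numpy as np

def spectrum_delta_db(reference, comparison, n_fft=32768):
    """Difference in magnitude spectra (dB) between two aligned tracks.
    Positive values mean the comparison has more energy at that frequency.
    Assumes equal length and sample rate."""
    win = np.hanning(n_fft)
    ref_mag = np.abs(np.fft.rfft(reference[:n_fft] * win))
    cmp_mag = np.abs(np.fft.rfft(comparison[:n_fft] * win))
    eps = 1e-12  # guard against log of zero
    return 20 * np.log10((cmp_mag + eps) / (ref_mag + eps))
```

For a real comparison you'd average this over many windows along the track rather than use one frame.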
 

dc655321, Major Contributor. Joined Mar 4, 2018.
What else should I look to estimate the re-mastering quality?

How badly the compression was bungled!

Looks like the remaster has significantly more energy over most of the useful spectrum. Was the dynamic range terribly squashed?
 

syn08:
(quoting pkane's reply above)
I guess I'm missing a sense of what counts as a "big" or "small" difference in all these plots... Again, they sound exactly the same to me after some level matching. I loaded the files into a randomized player and ran a paired preference test on the first 30 seconds or so. I got a 7-3 preference for the remastered version, which is not really statistically significant; I don't have the time right now to increase the number of samples meaningfully.

Not sure if this was previously answered, how is the PK metric calculated?

Note: neither file plays properly in WASAPI exclusive mode; the sound is jagged, as if playback stops every 200 ms or so, long enough to make it impossible to listen to. Works fine without exclusive mode. Win 10, latest patched version.

Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -500dB
Correct Non-linearity: False
Correct Drift:True, Precision:30, Subsample Align:True
Non-Linear drift Correction:False
Upsample:False, Window:Kaiser
Spectrum Window:Kaiser, Spectrum Size:32768
Spectrogram Window:Hann, Spectrogram Size:4096, Spectrogram Steps:2048
Filter Type:FIR, window:Kaiser, taps:262144, minimum phase=False
Dither:False bits=0
Trim Silence:False
Enable Simple Waveform Measurement: False

Discarding Reference: Start=0s, End=0s
Discarding Comparison: Start=0s, End=0s

Initial peak values Reference: -6.563dB Comparison: -0.266dB
Initial RMS values Reference: -23.558dB Comparison: -15.638dB

Null Depth=5.459dB
Phase inverted
X-Correlation offset: -16789 samples
Residual error too large 3.728246898. Trying alternative drift correction method

Drift computation quality #2: Excellent (1.26μs)


Trimmed 0 samples ( 0.00ms) front, 0 samples ( 0.00ms end)


Final peak values Reference: -6.563dB Comparison: -8.889dB
Final RMS values Reference: -23.558dB Comparison: -24.32dB

Gain= 8.7444dB (2.7367x) DC=0 Phase offset=-380.702948ms (-16789 samples)
Difference (rms) = -20.92dB [-24.92dBA]
Correlated Null Depth=14.4dB [20.67dBA]
Clock drift: -0.34 ppm


Files are NOT a bit-perfect match (match=0.02%) at 16 bits
Files match @ 49.9921% when reduced to 3.66 bits


---- Phase difference (full bandwidth): 90.2281195832894°
0-10kHz: 105.78°
0-20kHz: 92.01°
0-24kHz: 90.23°
Timing error (rms jitter): 20.4ms
PK Metric (step=400ms, overlap=50%):
RMS=-21.6dBFS
Median=-22.9
Max=-13.8

99%: -15.47
75%: -19.4
50%: -22.94
25%: -32.04
1%: -57.59

gn=0.365409454656468, dc=0, dr=-3.38575459151237E-07, of=-16789.0000002714

DONE!

Signature: a911b4af92b90891658be9eb050a0679

RMS of the difference of spectra: -86.1671776658777dB
DF Metric (step=400ms, overlap=0%):
Median=0dB
Max=2.3dB Min=-8.4dB

1% > -4.09dB
10% > -0.82dB
25% > -0.36dB
50% > 0.02dB
75% > 0.33dB
90% > 0.78dB
99% > 1.57dB

Linearity 2.6bits @ 0.5dB error

[three attached result plots]
 
pkane (OP):
(quoting syn08's post and results above)

One thing that you need to address: you're comparing the left channel to the right channel. That's not going to help much in determining real differences:

[attached screenshot]


(quoting syn08 on judging "big" vs "small" differences)

The results are meant to measure differences, not to tell you whether you can hear them -- for that you'll need to know your own audibility thresholds. You can tell what was changed in the processing and by how much. That said, differences in certain parts of the spectrum (1-4 kHz, for example) can be audible even down to 0.1 dB, depending on the listener and the material. A 10 dB difference at 20 kHz may make no difference at all unless you're 10 years old.

Not sure if this was previously answered, how is the PK metric calculated?

Here's a discussion on this topic: https://www.audiosciencereview.com/...-error-metric-discussion-and-beta-test.19841/
 
pkane (OP):
A new (for test only) version 2.0.4 of DeltaWave is now available with just a single new feature: the loopback recorder.

The recorder is added by popular demand (well, OK, at least one user requested it :) )

The idea is to make it easy to capture and analyze loopback recordings without ever leaving DeltaWave. The process is simple:

1. Pick a reference file that you want to use to play through the output audio device
2. Click on Record menu
3. From the Recorder screen, pick the output and input devices for loopback capture
4. Press the red Record button

The Recorder will play back the selected file through the output (DAC) audio device, and record it through the input device (ADC). You can always stop the recording by pressing the stop button.

When finished, the recording will be saved to disk and automatically placed in the Compare field on the main DeltaWave window. The match operation will commence immediately if the "Automatic Match" option is checked. The result will be displayed in the Recorder window. Of course, many more detailed results of the comparison can be viewed in the main DeltaWave window under all the different tabs, as usual. Here's a video illustrating the process:

[embedded video]


Note: the correct sampling rates must be selected (as configured) for WASAPI audio drivers under Windows, and the Windows volume control and mute settings must be set correctly. ASIO drivers can be configured with any of the supported sampling rates.

[attached screenshot]
 

BeerBear:
Thanks for the update!
One thing I noticed with the DeltaWave recording feature (but also before with MultiTone) is that it needs a higher ASIO buffer size, otherwise the audio starts to pop and crackle. It's a bit more than what I'm used to in DAWs and my PC is relatively new. Maybe the CPU usage could be lower during playback/recording? It's not a big deal, though.
 
pkane (OP):
Thanks for the update!
One thing I noticed with the DeltaWave recording feature (but also before with MultiTone) is that it needs a higher ASIO buffer size, otherwise the audio starts to pop and crackle. It's a bit more than what I'm used to in DAWs and my PC is relatively new. Maybe the CPU usage could be lower during playback/recording? It's not a big deal, though.

Yes, probably more so with Multitone, since it's doing a significant number of FFT calculations while playing and capturing audio, which can interfere with low-latency ASIO drivers, and even more so on slower/older computers.

EDIT: just posted a bit more optimized version of Multitone for testing
 

Tj99, Member. Joined Dec 16, 2020.
So just to understand it right:

- "Level EQ" and "Phase EQ" corrects changes in frequency and phase differences
- "match gain" corrects level differences
- "correct clock drift" corrects clock drift (which can only happen when comparing files coming from different converters)
- "subsample offset" makes sure the files are aligned properly
- "auto trim start and end" cuts away unnecessary chunk at the beginning/end which could falsify the results
- "remove DC offset" removes potential clicks and pops

And can somebody explain what the point of the following settings is?

- "dither to original bit size"
- "prompt to invert phase"
- "measure simple waveforms"
- "non linear drift correction"
- "Level linearity correction"

Thank you very much.
 
pkane (OP):
(quoting Tj99's questions above)

Warning: all options under "Non-linear calibration" are advanced and should be used carefully, and only if you know what you're doing :) Normally these are best left turned off.

- "Level EQ" and "Phase EQ" corrects changes in frequency and phase differences
EQ settings (level and phase) use deconvolution to fix differences between the reference and comparison files. This is an advanced setting, as deconvolution is a process that depends on very low noise and lots of samples to work with. It corrects errors that may be very significant and could easily be audible. These errors may also be non-linear, such as variable group delay, for example, or frequency-dependent ripple due to a poor filter. For any simple comparisons, I recommend leaving these settings off.

- "match gain" corrects level differences
Gain and DC offset removal are linear operations that simply adjust the level and DC offset of the comparison file to best match it to the reference. This is a must when trying to match two files for best null. DC offset is a constant/fixed level difference between the two files.
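The "best match" here is just a least-squares fit. A minimal numpy sketch of fitting gain and DC offset together (illustrative, not DeltaWave's code):

```python
import numpy as np

def match_gain_dc(reference, comparison):
    """Least-squares gain and DC offset mapping the comparison onto the
    reference: minimize |ref - (gain*cmp + dc)| over all samples."""
    # Design matrix: one column for the signal, one constant column for DC
    A = np.column_stack([comparison, np.ones(len(comparison))])
    (gain, dc), *_ = np.linalg.lstsq(A, reference, rcond=None)
    return gain * comparison + dc, gain, dc
```

Fitting both terms at once avoids the gain estimate being biased by any DC offset, and vice versa.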

- "correct clock drift" corrects clock drifts (which only can happen when comparing files coming from different converters)
Exactly. Any time there are two clocks involved, there will be some drift unless a master-clock is used to synchronize them.

- "subsample offset" makes sure the files are aligned properly
Yes, also a must when comparing two analog captures due to slight differences in timing. DW computes the sub-sample delay to a tiny fraction of the sample to ensure best possible null.
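One common way to estimate delay to a fraction of a sample is parabolic interpolation of the cross-correlation peak; DeltaWave's actual method may differ. A sketch:

```python
import numpy as np

def subsample_delay(reference, comparison):
    """Estimate delay (in samples, possibly fractional) by fitting a
    parabola through the cross-correlation peak and its two neighbours."""
    xc = np.correlate(comparison, reference, mode="full")
    i = int(np.argmax(xc))
    y0, y1, y2 = xc[i - 1], xc[i], xc[i + 1]
    # Vertex of the parabola through the three points around the peak
    frac = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    return (i - (len(reference) - 1)) + frac
```

This works well for band-limited material, where the correlation peak is smooth enough for the parabola to be a good local model.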

- "auto trim start and end" cuts away unnecessary chunk at the beginning/end which could falsify the results
Right. A number of converters I've tested take some time to sync to the incoming signal, leading to initially large errors that then stabilize. DW will check for large errors at the start and the end, and will not use those samples for computation when this is checked.

- "dither to original bit size"
Since DW does the calculations that adjust the comparison file to match the reference in double-precision floating point, some quantization error will occur in the process of converting it back to a WAV file. This option applies dither to the comparison file after these computations, at the number of bits/sample of the reference file. The latest version of DW lets you select the dither size, or pick '(original)' to match the reference sample size:

[attached screenshot]
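For reference, TPDF dither before requantization looks like this in numpy (a generic sketch of the standard technique, not DW's exact implementation):

```python
import numpy as np

def quantize(x, bits, dither=True, rng=None):
    """Quantize float samples (full scale = ±1.0) to `bits` bits,
    optionally adding TPDF dither of ±1 LSB before rounding."""
    rng = rng or np.random.default_rng()
    q = 2.0 ** -(bits - 1)  # one LSB
    if dither:
        # Difference of two uniforms gives a triangular PDF spanning ±1 LSB
        x = x + q * (rng.random(len(x)) - rng.random(len(x)))
    return np.round(x / q) * q
```

The dither converts quantization distortion (correlated with the signal) into a benign, constant noise floor.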


- "prompt to invert phase"
Absolute phase can be flipped between reference and comparison files. Normally DW will automatically detect and fix this difference, but in case you want to know before doing the processing, it'll alert you to this fact. Sometimes noisy data or simple test signals can result in a false inversion indicator. This setting will let you decide if you want to process the reference as inverted or not.

- "measure simple waveforms"
DW was designed to compare complex music/sound files. Simple test tones, such as a sine wave, can be very ambiguous as to where the reference matches the comparison file. In fact, for a sine wave, they may line up at the start of every period, with potentially millions of matches. This option uses a different alignment algorithm that won't be confused by simple periodic waveforms. In many cases DW will detect such waveforms even if this setting is not checked, and will ask if you want to use the alternate alignment algorithm.

- "non linear drift correction"
Another advanced option, not recommended for simple comparisons. This will fit a curve to the clock drift error. Clock drift correction removes only linear drift. This will attempt to measure and remove non-linear differences between clocks.
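Conceptually, this is a curve fit to measured clock-offset-vs-time data, instead of the single slope used for linear drift correction. A tiny numpy illustration (assumed polynomial form, not DW's actual model):

```python
import numpy as np

def drift_curve(times, offsets, degree=2):
    """Fit a smooth curve to measured clock-offset samples.
    degree=1 is ordinary linear drift; higher degrees model
    non-linear drift. Returns a callable polynomial."""
    return np.polynomial.Polynomial.fit(times, offsets, degree)
```

Evaluating the fitted polynomial at each sample time gives the offset to remove when resampling the comparison onto the reference clock.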

- "Level linearity correction"
Yet another advanced option, not recommended for simple comparisons. This will compute a linearity error between the two curves (a transfer function) and will attempt to apply the inverse of this transfer function to correct for any non-linearity differences between them. In theory this can correct for even things like harmonic distortion and IMD. In reality, the linearity error isn't going to be computed precisely enough from noisy data to remove such low-level non-linearities, but larger errors will be corrected given enough samples and clean data.

Hope this helps!
 

Tj99:
(quoting pkane's answer above)
Thank you very much for this detailed information!
Much appreciated :)
 