
Does Phase Distortion/Shift Matter in Audio? (no*)

but most members consider the difference inaudible
Inaudible, or just not irritating? I can listen to digitized vinyl with pleasure, even though everyone will hear its noise. A slight noise at the limit of audibility may not interfere; but whether that holds for every listener, always, we do not know. Therefore the boundary requirement is set at -110 dB noise, which with 99.99% probability will never be heard by anyone. This avoids potential problems, especially since it costs nothing. However, this topic is about phase, not noise, so I will not continue.
 
Why are those answers stating 70 dB?
You are getting different *opinions* not consensus. My opinion, backed by research into what is considered a dead silent system, while playing back 120 dBSPL peak, is that you want a SINAD of 115 dB (give or take). This is possible today with very low cost DACs. No one gives you a discount for getting a 90 dB DAC.
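To make the reasoning concrete (my own arithmetic following the logic above, not a quote): for the noise to be inaudible while the system can hit 120 dBSPL peaks, the noise floor has to land at or below the threshold of hearing.

```python
# Implied acoustic noise floor for a given SINAD at a given peak level
# (simple illustration; assumes the noise tracks the SINAD figure directly)
peak_spl = 120.0   # peak playback level, dBSPL
sinad_db = 115.0   # DAC SINAD, dB

noise_floor_spl = peak_spl - sinad_db
print(noise_floor_spl)  # 5.0 dBSPL, around the threshold of hearing
```

By the same arithmetic, a 90 dB DAC under the same conditions would leave a 30 dBSPL noise floor, which could be audible in a very quiet room.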
 
Don't we already have the data resulting from the experiments in this thread and in https://www.audiosciencereview.com/...ility-of-group-delay-at-low-frequencies.8571/ ? I came to the same conclusions as @KSTR.
Not that I see. Go ahead and listen blind a few times and rate which one sounds better and report back. Pure detection is not where it is at.

P.S. I stopped reading the thread after the first few pages. Did the issues with frequency response and such get corrected?
 
You are getting different *opinions* not consensus. My opinion, backed by research into what is considered a dead silent system, while playing back 120 dBSPL peak, is that you want a SINAD of 115 dB (give or take). This is possible today with very low cost DACs. No one gives you a discount for getting a 90 dB DAC.
Got it! In fact, one of the first things I did when joining ASR was watching your YouTube videos regarding DAC measurements, so I had those numbers in mind.

BTW they were excellent; I'm continuing with the other series now. Congratulations on your efforts.
 
You are getting different *opinions* not consensus. My opinion, backed by research into what is considered a dead silent system, while playing back 120 dBSPL peak, is that you want a SINAD of 115 dB (give or take). This is possible today with very low cost DACs. No one gives you a discount for getting a 90 dB DAC.

Would you be able to expand on this, please? I would have thought that -115 dB would be masked by music and be completely inaudible. Even lower SINADs would be masked by music, or even by the noise floor of acoustic recordings.
 
Not that I see. Go ahead and listen blind a few times and rate which one sounds better and report back. Pure detection is not where it is at.

P.S. I stopped reading the thread after the first few pages. Did the issues with frequency response and such get corrected?

With this topic being sidetracked, it's difficult to keep up. But we do have a few people who did blind tests confirming that group delay at low frequencies is audible. In my case it's a DSP filter I'm using to correct the GD of vented bookshelf speakers, which I can easily switch on and off in software. With some tracks the difference is very clear; with others it's not noticeable.

Not sure about that problem with frequency response; I don't recall the whole discussion. For the filter I was using, I know I had to watch out for pre-ringing (so I didn't fully correct the GD, only brought it down considerably). I should gather some more data, like applying the filter in software to one of the tracks where the difference is audible so we can analyze the signals, and make a recording in my room.
 
Risking repeating myself...
When we did phase distortion audibility tests (professional blind testing, mind you) more than a decade ago, the following perceptual effects were noted when introducing a "typical" 4-way speaker XO allpass response to systems that were made (close to) perfect by proper DRC:
  • timbre change in the bass (this is the easiest one)
  • "speed" and punch of bass transients (also easy with the used 4th order sub<-->mains XO)
  • perceived frequency response balance of steady-state signals vs. wide-band transients
  • general soundstage gestalt and coherence, mostly sharpness of discrete phantom sources vs "3D-quality" of diffuse content (reverb tails etc) giving different amounts of listening fatigue

Similar differences were found when switching polarity (with the corrected setup), except for the speed and the frequency response balance things.

I personally repeated all this testing many times over the course of years and the outcome always was the same:
A system without phase distortion never sounded worse to me than one with phase distortion, and typically sounded better, more enjoyable. The differences are subtle at first but do make a difference, notably in the long term and when you have a good starting point (DRC is a must, IMHO).
YMMV, but at least for me the topic is fully settled. I simply don't care anymore if there is any publicly accessible "body of evidence" or not.
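For readers who want a feel for the "4-way speaker XO allpass response" stimulus described above: its phase behavior can be sketched with second-order allpass biquads. The corner frequencies and Q below are placeholder values for illustration, not the ones used in those tests.

```python
import numpy as np
from scipy import signal

fs = 48000

def allpass2(f0, q, fs):
    """Second-order allpass biquad (RBJ audio-EQ-cookbook form)."""
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 - alpha, -2 * np.cos(w0), 1 + alpha])
    a = np.array([1 + alpha, -2 * np.cos(w0), 1 - alpha])
    return b, a

# One allpass per crossover point; 80/400/3000 Hz are placeholder values
sections = [allpass2(f0, 0.71, fs) for f0 in (80.0, 400.0, 3000.0)]

freqs = np.geomspace(20, 20000, 256)
h = np.ones_like(freqs, dtype=complex)
for b, a in sections:
    _, resp = signal.freqz(b, a, worN=freqs, fs=fs)
    h *= resp

# The magnitude stays flat (that's the "allpass" part)...
flatness_db = 20 * np.log10(np.abs(h))
# ...while the phase rotates, i.e. a frequency-dependent (group) delay
phase = np.unwrap(np.angle(h))
print(float(np.max(np.abs(flatness_db))))
```

The point of the sketch: a steady-state frequency-response measurement of such a chain looks perfectly flat, yet the phase rotation redistributes transient energy in time, which is what the blind tests above were probing.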
 
Recently I performed blind tests with the tracks proposed before, with no correction, 20 ms and 100 ms.

Unsurprisingly, I found both no correction and the 20 ms setting very unpleasant; I have a strong reflex (especially with headphones) of leaning toward the delayed side when left-right delays are present.

There is a poorly studied postural echolocation present in some blind people and low-vision patients; apparently we partly maintain our verticality by processing the delay between the sound reflected from the floor to each ear.

Theoretically the brain can detect a 1 ms interaural difference, but I was only able to recognize the 100 ms correction.
 
YMMV, but at least for me the topic is fully settled. I simply don't care anymore if there is any publicly accessible "body of evidence" or not.
That's fine for you, but unfortunately doesn't mean much to anyone else without said body of evidence. If we accepted such anecdotal evidence, I'd need to swap out the cabling in my walls for $$$$ silver, cryogenically treated, directional ones. Get some hand-built wooden boxes full of dirt to ground my system. Replace my speakers with full-range drivers to eliminate evil crossovers. The list goes on.
 
That's fine for you, but unfortunately doesn't mean much to anyone else without said body of evidence. If we accepted such anecdotal evidence, I'd need to swap out the cabling in my walls for $$$$ silver, cryogenically treated, directional ones. Get some hand-built wooden boxes full of dirt to ground my system. Replace my speakers with full-range drivers to eliminate evil crossovers. The list goes on.

I try to assess the quality of such evidence through the person's posting history (that I'm familiar with).
As I've become more knowledgeable about audio, it has become easier to ascertain the validity and potential bias in such evidence.
(Frankly, I think we even have to assess formal academic research in much the same manner.)

I don't think it's fair to equate clear audio snake-oil examples with sincere attempts at good science trying to find real audio improvements.
FWIW, I've learned to appreciate and value @KSTR's posts.

We all have different levels of audio aspirations, and rightly so eh? :)
 
That's fine for you, but unfortunately doesn't mean much to anyone else without said body of evidence.
That's why I want to encourage everyone to conduct these tests and let me emphasize this is really easy to implement compared to ABX'ing anything that involves changed hardware.
The only prerequisite is that you have proper DRC running (and I'm assuming anybody who is serious about HiFi *is* using DRC anyway) or, alternatively, use headphones -- but those have significant drawbacks, above all the completely unnatural spatial representation.
Let me close with the #1 rule when doing audio testing in general: use low volumes. The moment adrenalin kicks in from sheer loudness and the stapedius reflex sets in, almost everything sounds exciting ;-) And rule #2: take your time.
 
You are assuming that the impulse is band-limited and that therefore a band-limited signal can create it.
You can get a digital unit impulse from a band limited continuous time impulse.
I decided to write an Octave script (will probably work in MATLAB as well) to demonstrate what I mean and to get an idea of how close one might be able to get to an ideal digital unit impulse with a real ADC. The script does the following:
  1. Generates a good approximation of an analog impulse with a minimum phase first-order low pass response at a very high sample rate (many MHz).
  2. Generates a windowed sinc filter at the same sample rate with a cutoff of exactly half the final sample rate. This represents the ADC's antialiasing filter.
  3. Convolves the "analog" impulse with the windowed sinc filter.
  4. Decimates to get the digital output samples.
  5. Calculates the residual compared to a digital unit impulse and plots some stuff.
Here's the code:
Code:
os_fact = 100;  # oversampling factor
imp_bw  = 10;   # impulse bandwidth relative to antialiasing filter
# offsets:
#   imp_bw=5:  (~8-bit null)
#     os_fact=100: +0.661
#   imp_bw=10: (~10-bit null)
#     os_fact=100: +0.539
#   imp_bw=82: (~16-bit null)
#     os_fact=10:  +0.4739
#     os_fact=100: +0.498
offset = -os_fact/pi+0.539;

x = [0:os_fact*imp_bw*100-1];
# finite-bandwidth impulse (first-order low pass response)
y_imp = exp(-x/os_fact*pi)/os_fact*pi;
# windowed sinc antialiasing filter
y_aa = sinc((x-length(x)/2-offset)/os_fact/imp_bw) .* blackman(length(x))';

# convolve impulse with antialiasing filter
y_flt = fftconv(y_imp, y_aa);
y_flt = y_flt / max(y_flt);  # normalize
# decimate to get output samples
y_out = y_flt(1:os_fact*imp_bw:length(y_flt));

# residual: every output sample except the unit sample itself (index 51)
residual = horzcat(y_out(1:50), y_out(52:101));
max_residual_dBFS = 20*log10(max(abs(residual)))

fig = figure;
# octave doesn't seem to like plotting a huge number of points...
plot(x(1:imp_bw:length(x))/os_fact/imp_bw, y_aa(1:imp_bw:length(y_aa)));
hold on;
plot([0:100], y_out(1:101));
hold off;
legend('Antialiasing filter', 'Output samples');
title(sprintf('Impulse Response — %dx Analog Bandwidth', imp_bw));
xlabel('Output Sample');
grid on;
grid minor;
print(fig, 'impulse.png', '-S800,600');

out_mag = 20*log10(abs(fft(y_out))(1:length(y_out)/2+1));
semilogx([1:length(out_mag)-1]/(length(out_mag)-1), out_mag(2:length(out_mag)));
title(sprintf('Digital Output Magnitude — %dx Analog Bandwidth', imp_bw));
xlabel('Normalized Frequency');
ylabel('Magnitude (dBFS)');
grid on;
print(fig, 'impulse_mag.png', '-S800,600');

Audio ADCs will generally have a 2nd- or 3rd-order analog low pass with a cutoff of a few hundred kHz. The AK5578 datasheet, for example, suggests a 2nd-order filter with Fc=351kHz. My script uses a 1st-order response for simplicity. What matters, I believe, is the magnitude and phase error around Fs/2 at the final sample rate.

It does turn out that more analog bandwidth is required than I was originally thinking in order to get a result accurate to 16 bits: 82 times the digital bandwidth. For Fs=44.1kHz, this is of course about 1.8MHz, which is beyond what you'd likely find in any audio ADC. Something on the order of 10 bits (around -60dBFS) or a bit more is probably roughly the limit. This requires 10x the digital bandwidth or 220kHz for Fs=44.1kHz. Plots for 10x bandwidth (-59.91dBFS residual):

impulse_10x.png

impulse_mag_10x.png


Obviously, precise alignment to a sampling instant is required to get this result. If the pulse isn't aligned, you'll get something more like this instead:

impulse_10x_misaligned.png

impulse_mag_10x_misaligned.png


The change in magnitude response with time offset is, I think, related to the regularity problems with these kinds of "improper" antialiasing filters that @j_j mentioned previously.

So, in conclusion: Will you get a good approximation of a digital unit impulse from an audio ADC by accident? Not likely (and especially not from an acoustic source). But it'd probably be possible to get something accurate to ~10 bits using a well-timed, wide-bandwidth electronic pulse. If you were to do something weird like pre-warp the pulse to compensate for the ADC's analog response, this could be further improved.
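The alignment sensitivity is easy to show in miniature (a NumPy sketch of the same idea, independent of the Octave script above): sampling an ideal bandlimited (sinc-shaped) pulse exactly on a sample instant yields a clean unit impulse, while a half-sample offset smears the energy across many samples.

```python
import numpy as np

n = np.arange(-50, 51)          # sample indices around the pulse

# Bandlimited impulse at Nyquist bandwidth, sampled with two offsets
aligned    = np.sinc(n - 0.0)   # pulse centered exactly on a sample instant
misaligned = np.sinc(n - 0.5)   # pulse centered between two sample instants

delta = (n == 0).astype(float)  # ideal digital unit impulse

print(np.max(np.abs(aligned - delta)))     # ~0: a perfect unit impulse
print(np.max(np.abs(misaligned - delta)))  # large: energy smeared widely
```

With zero offset the sinc's zero crossings fall exactly on every other sample point; shift by half a sample and every point lands on the ringing instead.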
 
It does turn out that more analog bandwidth is required than I was originally thinking in order to get a result accurate to 16 bits:
You think??? :)
82 times the digital bandwidth. For Fs=44.1kHz, this is of course about 1.8MHz, which is beyond what you'd likely find in any audio ADC. Something on the order of 10 bits (around -60dBFS) or a bit more is probably roughly the limit. This requires 10x the digital bandwidth or 220kHz for Fs=44.1kHz. Plots for 10x bandwidth (-59.91dBFS residual):
I ported your script to Matlab. May have made a typo but this is the results I am getting:

1755917332435.png


Samples are aligned but there is no pure impulse as you showed.
 
Samples are aligned but there is no pure impulse as you showed.
This is similar to the unaligned case I posted, so no, it's not aligned properly. In any case, this whole thing is obviously entirely academic (which I never denied, to be clear). You'd certainly never see anything like what I showed from an acoustic source.
 
how about this:
clc
clear all
close all

% First-stage interpolation filter: passband to 20 kHz, stopband from
% 22.05 kHz (band edges normalized to a 48 kHz Nyquist, i.e. 96 kHz rate)
f=[0 20 22.05 48]/48;
a=[1 1 0 0];
w=[1 15];   % weight the stopband 15x for deeper attenuation

bb=firpm(350,f,a,w);
bbt=fft(bb,16384);
bbt=abs(bbt);
bbt=bbt/max(bbt);   % normalize to 0 dB peak
bbt=20*log10(bbt);

subplot(4,1,1)
plot(bbt(1:8192))
axis([1 8192 -150 5])        % overview: stopband attenuation
subplot(4,1,2)
plot(bbt(1:8192))
axis([1 8192 -.0005 .0005])  % zoom: passband ripple

% Second-stage interpolation filter up to 384 kHz; the wide transition
% band (30 kHz to 192 kHz) allows a very short filter
f=[0 30 192 384]/384;
a=[1 1 0 0];
w=[1 10];

bb=firpm(32,f,a,w);
bbt=fft(bb,16384);
bbt=abs(bbt);
bbt=bbt/max(bbt);
bbt=20*log10(bbt);

subplot(4,1,3)
plot(bbt(1:8192))
axis([1 8192 -150 5])
subplot(4,1,4)
plot(bbt(1:8192))
axis([1 8192 -.0005 .0005])

That's a complete design for an 8x oversampling in 2 steps, 44 to 96, and then up to 384

I think the plot makes it pretty (**(&( clear that this is much better than electronics can do?

Also, there's all of this talk about "phase shift", but when an impulse response is symmetric THERE IS NO PHASE SHIFT, only PURE DELAY, so why is this in the phase shift thread? And yes, interaural phase shift CAN be audible. Not small amounts per ERB, but large amounts, yes.

I suppose I should also mention that I did not add the actual upsampling filter from 96 to 384, but that can also be added into the second filter if you want to do that, at a cost of a bit more flops.
 
Samples are aligned but there is no pure impulse as you showed.
Something that confused me once, relating to impulse response of sharp linear phase lowpass filters (and some commercial attempts to "fix" impulse response):
You can make a frequency-domain data file with amplitude=1 and phase=0 at all frequencies, equally spaced up to the Nyquist frequency fs/2, take the discrete IFFT of that, and see a time-domain sampled result showing only a single impulse at time=0 (that is, zero at each sample point except at t=0). No pre-ring or post-ring! So why can't a DAC do that too if it has flat magnitude and phase response? Are DACs broken?

The punch line is that there really IS pre- and post-ring; you just aren't looking at the mathematically modelled result other than at the exact instants of the sampled data (times equal to n/fs), where the ringing waveform happens to be at its zero crossings. In between those points the amplitude is not 0, as would be seen if the result were low-pass filtered -- i.e., reconstructed -- to reveal what lies between the sample points. How do you show mathematically that the ringing was really there? Just delay your original frequency response by something less than one sample period (modelled by adding a linear phase shift of 2*pi*f*tau at each frequency point, with, say, tau = 1/(2*fs)), then IFFT that: the time-domain result now shows the ringing in its full glory, from the same flat magnitude and still linear phase response, only delayed (or advanced) a bit. The apparent ringing disappears again with any delay that puts the impulse right on a sample time.

Don't forget the assumptions when doing math on sampled data -- you are looking only at certain time points.
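The thought experiment above fits in a few lines of NumPy (my sketch of the argument, not from the thread): a flat spectrum with zero phase IFFTs to a clean impulse at the sample points, while the same flat spectrum with a half-sample linear-phase delay reveals the ringing that was hiding between them.

```python
import numpy as np

N = 64
f = np.fft.fftfreq(N)   # normalized frequencies, cycles/sample

# Flat magnitude, zero phase: the IFFT is a single unit impulse at t=0
x0 = np.real(np.fft.ifft(np.ones(N)))

# Same flat magnitude, linear phase for a half-sample delay
tau = 0.5               # delay in samples
x_half = np.real(np.fft.ifft(np.exp(-2j * np.pi * f * tau)))

print(np.max(np.abs(x0[1:])))      # ~0: no ringing visible at sample points
print(np.max(np.abs(x_half[1:])))  # clearly nonzero: the ringing was there
```

The delayed version has substantial amplitude at every sample index, even though its magnitude response is identical and its phase is still linear.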
 
... The punch line is that there really IS pre- and post-ring; you just aren't looking at the mathematically modelled result other than at the exact instants of the sampled data (times equal to n/fs), where the ringing waveform happens to be at its zero crossings. In between those points the amplitude is not 0, as would be seen if the result were low-pass filtered -- i.e., reconstructed -- to reveal what lies between the sample points. ...
Another way to think about this is that sampling and reconstruction is consistent and doesn't rely on "luck". If you hit "record" and sampling started a moment later, the sample points will hit different spots along the waveform. But this doesn't matter. The entire sequence of sampling points can be shifted left and right (earlier and later) along the waveform, and you get the exact same result when reconstructing it.

Conversely, if you ever see a situation where the reconstructed wave depends on where the sample points hit the waveform, you know something is wrong. Either it wasn't sampled properly (for example not bandwidth limited before sampling) or it wasn't reconstructed properly.
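That consistency is easy to check numerically (my sketch, not from the thread): sample a bandlimited pulse on two grids offset by a fraction of a sample, sinc-interpolate back to an arbitrary time point from each grid, and both reconstructions agree with the true value.

```python
import numpy as np

def f(t):
    # Bandlimited test signal: squared sinc, bandwidth 0.4 cycles/sample
    # (below Nyquist), decaying fast enough for a truncated sinc sum
    return np.sinc(0.4 * t) ** 2

n = np.arange(-400, 401)   # sample indices
t_eval = 0.123             # arbitrary point between samples

def reconstruct(offset):
    """Sample f on the grid n+offset, then sinc-interpolate at t_eval."""
    times = n + offset
    return np.sum(f(times) * np.sinc(t_eval - times))

r0 = reconstruct(0.0)      # one sampling grid
r1 = reconstruct(0.3)      # same signal, grid shifted by 0.3 samples

print(abs(r0 - f(t_eval))) # reconstruction matches the true value
print(abs(r0 - r1))        # and hence the two grids agree with each other
```

Shifting the sampling grid changes every individual sample value, yet the reconstructed waveform is the same, exactly as the sampling theorem promises.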
 
Conversely, if you ever see a situation where the reconstructed wave depends on where the sample points hit the waveform, you know something is wrong. Either it wasn't sampled properly (for example not bandwidth limited before sampling) or it wasn't reconstructed properly.

Oh yeah. I've made that point, oh, about a thousand times or so. It does not stick. But it's true.
 