
Does Phase Distortion/Shift Matter in Audio? (no*)

Francis Vaughan

Addicted to Fun and Learning
Forum Donor
Joined
Dec 6, 2018
Messages
933
Likes
4,697
Location
Adelaide Australia
The problem with audio/video delay goes back a long way, but things have reversed here.
The problem was that there is latency in the video chain, measured in multiples of the frame period. Video gets buffered up frame by frame in many places, and the delay can become irksome. When we used DVD/Blu-ray players as the prime source, both the video and audio came off the disc at the same time in one data stream. The video then went through multiple steps of decompression and various processing steps to improve the quality. Most of these steps work on an entire frame, so it was easy to accumulate quite a few frames' worth of delay. TVs did their own processing on the input and added a frame or two (at least) as well. So AVRs (and TVs, players) all provided a simple delay on the audio to restore sync. This is easy, as the audio data rate is low compared to video.
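For a rough sense of the numbers involved, a minimal sketch (the frame rate and number of buffered frames are purely illustrative, not measured from any particular device):

```python
# Rough lip-sync arithmetic: video latency from frame-based processing,
# and the audio delay needed to compensate. Values are illustrative only.
fps = 60.0                      # video frame rate (Hz)
frames_of_processing = 4        # e.g. decode + deinterlace + TV scaler buffers
video_latency_ms = frames_of_processing / fps * 1000.0
print(f"Video latency: {video_latency_ms:.1f} ms")   # ~66.7 ms
# An AVR "audio delay" / "lip sync" setting of about this size restores sync.
```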
Now we have the opposite issue. Processing of audio becomes the pacing item. So we want to delay the video. AVRs have pretty much moved away from any video processing. There just isn't any need. There is no capability in an HDMI switch (which is all most AVRs have inside now) to delay video. So if you want to delay the video longer than the existing processing chain you have to look elsewhere.
Obviously an HTPC, if it has locally stored content, is able to access that content at arbitrary offsets, and can offset the audio and video in either direction with any desired offset.
Streaming boxes are more interesting. Internally all streamers must include some buffering. Some have a lot. The potential exists for all of them to provide near arbitrary offsets, but whether any provide what is needed, and useful control, is another matter.
An interesting problem is that streaming devices can mess with the video formats, and can uncompress at various rates and resolutions. This can result in different video processing delays, something that seems to cause some amount of grief when aligning audio on some systems.
 

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,335
Likes
6,702
Again, not feasible with that many taps and resulting delays >1000ms. SSPs usually let you delay audio but not video.

Maybe we're talking about different things. What do you mean by not feasible? The audio is perfectly aligned with the video in my system with the AVR. The system can add up to 3(?) full seconds of delay. That should be enough? Maybe I'm misunderstanding the issue. I'll go dig around.
 
Last edited:

Francis Vaughan

Addicted to Fun and Learning
Forum Donor
Joined
Dec 6, 2018
Messages
933
Likes
4,697
Location
Adelaide Australia
The system can add up to 3(?) full seconds of delay.
As above, it can add lots of audio delay, if needed. But use of a big FIR may mean your audio delay is already significantly longer than the intrinsic processing video delay. What the system can't do is reduce audio delay to match the video, which is what is needed. That requires delaying the video, not the audio.
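To put numbers on that, a minimal sketch (the tap count and sample rate are hypothetical, just to show the order of magnitude):

```python
# Latency of a linear-phase FIR is (N - 1) / 2 samples.
taps = 131072          # hypothetical room-correction FIR length
fs = 48000             # sample rate in Hz
latency_ms = (taps - 1) / 2 / fs * 1000.0
print(f"FIR latency: {latency_ms:.0f} ms")   # ~1365 ms, far beyond typical video-chain latency
```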
 

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,335
Likes
6,702
Maybe we're talking about different things. What do you mean by not feasible? The audio is perfectly aligned with the video in my system with the AVR. The system can add up to 3(?) full seconds of delay. That should be enough? Maybe I'm misunderstanding the issue. I'll go dig around.

I'm an idiot. The setting I was thinking of in my AVR actually does the opposite of what I was thinking. It delays the video, not the audio ;)

Doing a bit of googling, it seems that this is actually the more common scenario. It's funny reading old forum posts of people asking how to delay their video, and most of the answers are telling them how to delay the audio :D.

@markus help me understand your situation a bit better. You were trying to use DSP, but couldn't because of this problem? I think you should be able to delay the video using the same player you upload the convolution file to.
 
Last edited:

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,335
Likes
6,702
As above, it can add lots of audio delay, if needed. But use of a big FIR may mean your audio delay is already significantly longer than the intrinsic processing video delay. What the system can't do is reduce audio delay to match the video, which is what is needed. That requires delaying the video, not the audio.

You slightly beat me, check my latest post.

Now I'm wondering why my video is so delayed. I'm going through 3 different DSP engines, yet still having to delay the video to match. Video source is Apple TV or PS5 hooked into AVR.
 

markus

Addicted to Fun and Learning
Joined
Sep 8, 2019
Messages
681
Likes
783
I'm an idiot. The setting I was thinking of in my AVR actually does the opposite of what I was thinking. It delays the video, not the audio ;)

Doing a bit of googling, it seems that this is actually the more common scenario. It's funny reading old forum posts of people asking how to delay their video, and most of the answers are telling them how to delay the audio :D.

@markus help me understand your situation a bit better. You were trying to use DSP, but couldn't because of this problem? I think you should be able to delay the video using the same player you upload the convolution file to.

None of that. You might want to go back where the discussion started. The particular correction approach isn't feasible for content that requires low latency, e.g. gaming. Furthermore there's no SSP to support such long user generated FIR filters. And, in my opinion the proposed approach is psychoacoustically questionable.

All of this is also a bit off-topic here.
 

richard12511

Major Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
4,335
Likes
6,702
None of that. You might want to go back where the discussion started. The particular correction approach isn't feasible for content that requires low latency, e.g. gaming. Furthermore there's no SSP to support such long user generated FIR filters. And, in my opinion the proposed approach is psychoacoustically questionable.

All of this is also a bit off-topic here.

Gotcha.

I think Mitch is proposing EQ via a convolution file in JRiver, so gaming and streaming should still be perfectly synced. Unless I'm mistaken, the only content that would need a 1000 ms video delay would be the content run through that EQ engine (i.e. just local music and movies), and JRiver can provide that.

You'd need a separate EQ solution for gaming/streaming, as that's outside of JRiver. I don't think you should need a video delay with most EQ systems, though (I actually have to delay my audio :p).
 
Last edited:

Compact_D

Member
Joined
Jun 10, 2021
Messages
47
Likes
17
More likely the perceived position change is because of frequency-dependent directivity combined with the room.
I assume you tested this with one speaker?
I did a fresh experiment with 2 speakers (studio monitors, not an audiophile system).
Playing back a mono tone, there is no depth information in it and it sounds as if it comes from the midpoint between the speakers. If I change the phase with a plugin, I *think* that I hear the source moving in depth, while staying centered between the speakers. As there is some pitch change during the phase change, I am not sure whether a "blind" test is even possible, but I do not see any reason to question this effect.
I am not convinced that my system is good enough to evaluate the following, but theoretically - playing back a sweep tone, the source should still appear static in depth, as there is no depth info in the signal. However, if there is a phase shift at different frequencies, the depth could change in the same way as for the single tone with changing phase.
For a multi-frequency signal like an instrument, this could mean uncertain depth info. Or not? Sure, it could be an effect of reflections in the room, in which case I am not sure how to test it properly.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,458
Likes
2,446
Location
Sweden
I did a fresh experiment with 2 speakers (studio monitors, not an audiophile system).
Playing back a mono tone, there is no depth information in it and it sounds as if it comes from the midpoint between the speakers. If I change the phase with a plugin, I *think* that I hear the source moving in depth, while staying centered between the speakers. As there is some pitch change during the phase change, I am not sure whether a "blind" test is even possible, but I do not see any reason to question this effect.
I am not convinced that my system is good enough to evaluate the following, but theoretically - playing back a sweep tone, the source should still appear static in depth, as there is no depth info in the signal. However, if there is a phase shift at different frequencies, the depth could change in the same way as for the single tone with changing phase.
For a multi-frequency signal like an instrument, this could mean uncertain depth info. Or not? Sure, it could be an effect of reflections in the room, in which case I am not sure how to test it properly.

Are you changing the absolute polarity? As mentioned already in the thread, with examples, some types of asymmetric signals are audible when the polarity is switched. This is due to the half-wave rectifier mechanism of the hair cells in the ear, giving a change in timbre, i.e. the switch of the harmonic overtones. This works in the bass region using signals/tones with a lot of harmonics. As for the thread topic, there is absolutely no way to hear differences in phase due to a slight early roll-off in the highest octave of our hearing range, e.g. 0.5 dB at 20 kHz.
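For anyone who wants to generate such a test signal, a minimal sketch (the fundamental frequency and number of harmonics are arbitrary choices, not taken from the examples earlier in the thread):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs                        # 1 second of samples
f0 = 80.0                                     # low fundamental, purely illustrative
# Fundamental plus cosine-phase harmonics -> a strongly asymmetric waveform
x = sum(np.cos(2 * np.pi * f0 * k * t) / k for k in range(1, 8))
x /= np.max(np.abs(x))
inverted = -x                                 # polarity flip = 180° shift at every frequency
print("positive peak:", x.max(), " negative peak:", x.min())   # clearly asymmetric
# The magnitude spectra are identical; only the waveform shape (and hence the
# half-wave-rectified excitation in the ear) differs between the two polarities.
assert np.allclose(np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(inverted)))
```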
 

Francis Vaughan

Addicted to Fun and Learning
Forum Donor
Joined
Dec 6, 2018
Messages
933
Likes
4,697
Location
Adelaide Australia
If I change the phase with a plugin, I *think* that I hear the source moving in depth, while staying centered between the speakers. As there is some pitch change during the phase change, I am not sure whether a "blind" test is even possible, but I do not see any reason to question this effect.

By definition the pitch should not change if you only change phase. What is the plugin? Effects plugins can do all manner of things (a phaser doesn't just mess with phase, but that is a bit extreme). Changes in the relative amplitude of different frequency components of a complex sound are part of what gives us depth perception.

For a pure tone changing phase is essentially meaningless when listening. Phase must be relative to something.
 

Compact_D

Member
Joined
Jun 10, 2021
Messages
47
Likes
17
By definition the pitch should not change if you only change phase. What is the plugin?
I was probably not clear.
I used the same plugins as are used for "phase correction", namely Logic Pro's Sample Delay and Waves InPhase. Technically those are delays, but for a single frequency a delay changes the phase. The perceived change of depth happens while I manually adjust the phase. It also causes a pitch change similar to the Doppler effect, and that is normal.
If I change the phase and then listen, of course there is no perception of depth change.
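For reference, the phase shift a pure delay produces at a single frequency is just 360° × f × delay; a minimal sketch with made-up numbers:

```python
# Phase shift produced by a pure delay at a single frequency:
#   phase (degrees) = 360 * f * delay, taken modulo 360
f = 1000.0              # tone frequency in Hz (illustrative)
delay_ms = 0.25         # plugin delay in milliseconds (illustrative)
phase_deg = (360.0 * f * delay_ms / 1000.0) % 360.0
print(f"{delay_ms} ms delay at {f:.0f} Hz = {phase_deg:.0f} deg phase shift")  # 90 deg
# The same 0.25 ms delay at 2 kHz would give 180 deg, at 500 Hz only 45 deg:
# the phase shift from a fixed delay is proportional to frequency.
```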
 

Francis Vaughan

Addicted to Fun and Learning
Forum Donor
Joined
Dec 6, 2018
Messages
933
Likes
4,697
Location
Adelaide Australia
The perceived change of depth happens while I manually adjust the phase. It also causes a pitch change similar to the Doppler effect, and that is normal.

Well, it is actually changing the pitch. During the time you are changing the setting you don't actually have a changed phase, rather an intermediate period where the signal is being re-sampled at different rates, which isn't really amenable to any sort of simple analysis - certainly not just a phase change. Only when the settings stabilise do you have a definable phase again. What you are likely hearing is really the effort the plugin is making to maintain a signal whilst the settings are changing. In this transition all manner of things will be happening to the signal, with pitch change being one. No doubt all sorts of spatial perceptions might arise. But it isn't phase of itself that is doing it.
 

ernestcarl

Major Contributor
Joined
Sep 4, 2019
Messages
3,110
Likes
2,327
Location
Canada
You do that test the other way round. Take a system with proper "full-blown DRC" and resulting smooth FR and flat phase response from each speaker. Then introduce an analytically calculated phase distortion which mimics for example the phase rotations from a typical 3-way passive speaker. This can then be A/B tested (or ABXed), and most people will find that phase distortion is audible and can be detected (once you know what to listen for and have the right music tracks that highlight the effects) but the difference is very subtle, the proverbial icing on the cake. Channel matching of FR and phase is much more important, exactly as you say.
Phase contribution of electronics... forget about it, except for very pathological cases.

Bit different way...

I have done several simple AB tests in my near-field and mid-field listening setups (semi-treated space) and I can hear a difference. The difference, with better, much clearer bass, is more evident in my couch setup. I did a more formal ABX today using foobar2000 and I can even notice a subtle improvement in the midrange vocals of some pop songs -- a little bit cleaner & less harsh -- but, I mean, it's also very subtle... so, to be honest, I'm fine without the post-correction if zero latency is needed.

[attached image]



The magnitude level is unchanged with and without convolution -- that is, unless I apply frequency-dependent windowing


[attached image]



[attached image]



The flattening of the group delay in the mid-range is more evident when comparing the zoomed-in before and after spectrograms:

[attached image]

40 dB scale, 1/6 resolution, Normalized

For this test the volume was set to a medium level of ~65 dB(C) SPL, using band-limited pink noise sent to the L+R channels.
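If the correction behaves like an all-pass (which would explain the untouched magnitude trace), the effect can be illustrated with a minimal sketch; the second-order all-pass below is a made-up stand-in, not the actual convolution filter used here:

```python
import numpy as np
from scipy import signal

fs = 48000
# A second-order all-pass section: the numerator is the reversed denominator,
# so |H| = 1 at every frequency while the phase (and group delay) rotates.
r, theta = 0.9, 2 * np.pi * 500 / fs          # made-up pole near 500 Hz
a = np.array([1.0, -2 * r * np.cos(theta), r**2])
b = a[::-1]                                   # reversed coefficients -> all-pass
w, h = signal.freqz(b, a, worN=4096, fs=fs)
print("max magnitude deviation (dB):", np.max(np.abs(20 * np.log10(np.abs(h)))))
# ~0 dB: magnitude untouched, only phase / group delay is altered.
```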
 

Attachments

  • TEST SET 1.txt
    639 bytes · Views: 84
  • TEST SET 2.txt
    637 bytes · Views: 85
  • TEST SET 3.txt
    637 bytes · Views: 87
Last edited:

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,246
Likes
17,161
Location
Riverview FL
Something that I tried and made me wonder what was going on:

Play 400Hz in left speaker, 405Hz in right. You'll hear a 5 Hz "beat" as the combined amplitude of the two signals adds and cancels in the air.

(Or, play 400 and 405Hz in one speaker, and the audible result is similar though the combining occurs at the speaker)

No surprise there, expected result.

Now, play the same source through headphones - no crosstalk or crossfeed - so one ear gets 400Hz and the other gets 405Hz.

To my (somewhat) surprise, the "beating" at 5 Hz was still "audible", though there was no 5 Hz beating, except that generated in my brain.
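The speaker case can be checked numerically; a minimal sketch showing that the acoustic sum of 400 Hz and 405 Hz really is a tone whose envelope beats at 5 Hz (over headphones there is no such sum, so the perceived beat is generated in the auditory system):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs                        # 1 second
left = np.sin(2 * np.pi * 400 * t)            # one speaker / one ear
right = np.sin(2 * np.pi * 405 * t)           # other speaker / other ear
mixed = left + right                          # acoustic sum in the air (speaker case)
# Trig identity: sin(a) + sin(b) = 2*sin((a+b)/2)*cos((a-b)/2),
# i.e. a 402.5 Hz tone amplitude-modulated by cos(2*pi*2.5*t); the envelope
# |cos| peaks twice per cycle, which is heard as a 5 Hz beat.
identity = 2 * np.sin(2 * np.pi * 402.5 * t) * np.cos(2 * np.pi * 2.5 * t)
print("max difference:", np.max(np.abs(mixed - identity)))   # ~1e-12, numerically identical
# Over headphones no acoustic sum exists; the residual "beat" is created by
# the auditory system (a binaural beat), not by the signals themselves.
```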
 

MrPeabody

Addicted to Fun and Learning
Joined
Dec 19, 2020
Messages
657
Likes
944
Location
USA
Arrgh. Lordy, that video from PS Audio was just appalling. That was an exercise in picking how many things were wrong, misunderstood, or just plain false. That is three minutes of my life someone owes me.

My usual complaint about discussions of phase is that almost invariably the conversation starts to confuse time and phase. Paul does this constantly, and even Amir is guilty of one tiny slip in his video. The two are linked, but they cannot be used interchangeably.

At any angular frequency w and time t, the signal = A * sin(wt + ø), where ø is the phase and A is the amplitude. This is the definition of phase.
Phase is not a delay. A delay creates a phase difference, but the phase change depends upon the frequency.
Phase is measured as an angle, not as a time. If anyone is discussing phase and they mention a time delay without a specific frequency, there is a problem.

Amplifiers with negative feedback must manage phase as part of their design. Here we do work with time and phase, because we are looking for the frequency where the inherent delay (due to things like slew rate limiting) results in the phase of the output swinging by 180 degrees. The amplifier's gain must be less than one at this frequency, otherwise it will oscillate. This is simply a way of stating the Nyquist stability criterion. This is what (nearly) every audio amplifier is bound by, and what determines its bandwidth.

A lot is made of the ear/brain's ability to locate with time information. This is really a very remarkable thing, as it requires the brain and not the ear to manage the offset. The signal from each ear to the brain must allow this time offset to be detected when the brain integrates the sound. The ear can send useful timing information up to about one millisecond resolution. That is 1 kHz. But it isn't steady-state phase information within the signal. The idea that phase information in any part of the signal above 1 kHz adds to time-based localisation is just plain wrong. This resolution is not far off the offset of the ears in space when the speed of sound is taken into account. Which is hardly a surprise. As Amir notes, this is a relative offset between channels, so any offset in the reproduction chain is cancelled out anyway.

Inevitably someone will talk about speaker phase, and then absolute phase. It would be so much more helpful if such discussions used the term "polarity". Yes a polarity inversion is a 180º phase shift. One can usefully regard this as a neat coincidence rather than anything profound.

sin(wt + 180º) = -sin(wt). That is all.

But it is really unhelpful to confuse this into discussions of phase.

Absolute phase gets many people riled up. There is almost no evidence it matters, unless something is being driven into some interesting non-linearity. At high enough levels even your ears become non-linear enough that absolute phase can change the way they distort. So you can hear a difference. But unless you do something silly, like connect one channel with reverse polarity, it doesn't matter.

Very excellent post there, Mr. Vaughan.
 

René - Acculution.com

Senior Member
Technical Expert
Joined
May 1, 2021
Messages
427
Likes
1,302
It is always unfortunate when people get attention and a following with content which simply does not hold up to scrutiny. Paul seemingly does not have the necessary knowledge to talk about things like phase (the video discussed here demonstrates this clearly), and another YouTube channel with the word "Research" in it(!) is clearly highly unscientific ("flat earthers, flat earthers"). Loudspeakers fall onto a weird edge of engineering, where 'professional hobbyists' can actually make DIY products that match the quality of those from well-renowned loudspeaker companies, and these same people often have the same, if not a better, understanding of things like linear phase, minimum phase, poles and zeros, and so on, than the typical loudspeaker engineer; I have worked with a lot of loudspeaker engineers, and they are often confused about physics and signal processing, even with MSc degrees. With more complex products such as hearing aids, there is a large drop-off of hobbyists with knowledge of the topic, so there aren't any videos muddying the waters with bad information (but the engineers are typically almost as confused as those in the loudspeaker industry).

For sure, there are great people in either industry who publish valuable content in peer-reviewed journals and the like, and they are a joy to work and discuss with, because all of the things that are perpetually discussed and misunderstood on forums (phase is one of the big ones) are already in place, so we don't have to go over them and can go straight to the meat of the problem.

Regarding phase specifically, there is often a mix-up between phase coming from 1) the phasor, 2) the time dependency, and 3) possibly wave propagation. To have a fruitful discussion, one needs to make very clear whether one is talking about a signal, a system, or a wave. By and large, the phase that we in engineering are concerned with is the phasor phase: that is, the phase of the complex amplitude of e.g. an input signal, an output signal, or a transfer function. This is the phase that we plot in a frequency response, together with the magnitude of said phasor. For harmonic signals, we don't carry around the phase related to the time variation, since information like group delay, phase delay, linear vs non-linear phase, and minimum phase vs non-minimum phase resides within the phasor phase. Waves are signals combined with spatial information. The phase related to the spatial aspect of waves is often 'rolled back' in measurements by inputting the distance from the microphone to the loudspeaker, so that we again only have the phasor phase of the transfer function (output sound pressure/input voltage). In other cases, such as phase decomposition (https://www.comsol.dk/blogs/phase-decomposition-analysis-of-loudspeaker-vibrations/), it is very important to take into account the individual phases that arise from looking at different points in space, but there too the spatial part is really 'rolled back' to get at the phasor phase and do the analysis, and then 'rolled forward' to get the actual result in space.
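As a deliberately trivial illustration of the phasor phase of a transfer function, a minimal sketch using a first-order low-pass (the corner frequency is arbitrary):

```python
import numpy as np

fc = 1000.0                                   # arbitrary corner frequency, Hz
f = np.array([100.0, 1000.0, 10000.0])
H = 1 / (1 + 1j * f / fc)                     # H(f) = 1 / (1 + j f/fc): one phasor per frequency
mag_db = 20 * np.log10(np.abs(H))
phase_deg = np.degrees(np.angle(H))           # the "phasor phase" plotted in a frequency response
for fi, m, p in zip(f, mag_db, phase_deg):
    print(f"{fi:7.0f} Hz: {m:6.2f} dB, {p:7.2f} deg")
# 100 Hz: ~-0.04 dB, ~-5.7 deg; 1 kHz: -3.01 dB, -45 deg; 10 kHz: ~-20 dB, ~-84.3 deg
```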

When dealing with multiple sources, such as multiple drivers in a single speaker or multiple loudspeakers, phase must of course be taken into account, and here it is both the phasor phase and the 'wave' phase we need to consider: for example, the distances from the ear to the individual drivers in a loudspeaker differ, while those drivers also have their individual phasor phases coming from their electromagneto-mechanical setup plus their crossover. Here you need to know how distance and delays relate to linear phase, but it is really not that complicated.

While there are really many aspects to consider in this field, such as how causality plays into some of the discussions, real signals and negative frequencies, and so on, it should absolutely be possible to do a short video where things are not mixed up the way Paul does it. Looking at videos on the topic of transfer functions and such, there are actually some good ones, but they reach far fewer people. I don't know if it is a big concern or not, but I think we would be better off not having these videos floating around.
 

MrPeabody

Addicted to Fun and Learning
Joined
Dec 19, 2020
Messages
657
Likes
944
Location
USA
2 videos and 8 pages discussing the audibility of phase later and nobody's posted a single blind ABX test result? I'll get things started:

View attachment 134830

Pretty obvious to me. Much more than any difference in SINAD between two non-broken DACs/amps (i.e. inaudible).

Eight pages yes, but this thread wasn't mainly about the question of "absolute phase" prior to your having posted this. Various interpretations of "phase" were juggled around in this thread right from the start, but only after this post of yours did the thread turn sharply to a debate over the audibility of "absolute phase".

Also, based on the discussion that ensued, the term "absolute phase" no longer seems a fitting alternative to the phrase "absolute polarity".

There have now been fifteen pages, and it does not seem to me that much of it has dealt with the original question. But this is largely because in Paul McGowan's video it was not altogether apparent what he was specifically referring to (and he wouldn't have been able to clarify it if his life depended on it), yet Amir responded to it anyway.
 

Wes

Major Contributor
Forum Donor
Joined
Dec 5, 2019
Messages
3,843
Likes
3,790
It is easier to BS people if you are not too specific. I bet Paul knows that.
 

MrPeabody

Addicted to Fun and Learning
Joined
Dec 19, 2020
Messages
657
Likes
944
Location
USA
It seemed to me that there should have been discussion of the extent to which an ultrasonic filter that has essentially no significant effect on amplitude at 20 kHz would nevertheless affect phase at frequencies at and below 20 kHz.

It seemed to me that the discussion here should have started with this question, to first establish how much phase shift we're talking about. Perhaps this question can be answered from the standpoint of what is typical. Or perhaps not, but if there is a typical case, it is useful to consider.

Once that question has been answered, it would be meaningful to take up the question of the audibility of phase distortion at the high audible frequencies where it evidently occurs. Some of what I've read suggests the possibility that even if phase distortion can be detected through much of the midrange (in particular circumstances which may or may not be realistic), at frequencies upwards of just one or two kHz it might nevertheless be inaudible. This would be a meaningful question to consider here, after first reaching some sort of consensus as to how much phase shift typically occurs in the audible band for a typical amp of good quality that has a flat response only out as far as 25 kHz or thereabouts.

To be clear, when I write "phase shift" what I mean is simply that the phase shift introduced by a piece of equipment at one frequency isn't the same as the phase shift introduced by that piece of equipment at another frequency. People sometimes talk about "linear phase", which I assume means that the relationship between phase and frequency is linear. Even with linear phase, there is phase distortion. And the "minimum phase" behavior of a single loudspeaker driver (and possibly electronic components) is phase distortion even though it is minimum phase. If a component introduces phase shift that is different at different frequencies, this to me implies phase distortion. I'm not saying that phase distortion is audible in these cases, only that it is phase distortion regardless of whether you can hear it.

To summarize, it seems that there are these primary questions here:
1. How much phase distortion are we talking about?
2. What frequencies are we talking about?
3. Is this specific amount of phase distortion at these specific frequencies audible?

It does not seem to me that the bulk of the discussion was concerned with these questions.
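On question 1, a back-of-the-envelope sketch: assuming, purely for illustration, a single-pole roll-off whose corner is chosen so the response is about -0.5 dB down at 20 kHz (the figure mentioned earlier in the thread); real amplifiers may behave quite differently, so treat this only as an order-of-magnitude estimate:

```python
import numpy as np

# Hypothetical single-pole roll-off; the corner is chosen so the response is
# roughly -0.5 dB at 20 kHz. Real amps may be higher order, so this is only
# an order-of-magnitude sketch.
fc = 57000.0
f = np.array([1000.0, 5000.0, 10000.0, 20000.0])
H = 1 / (1 + 1j * f / fc)
for fi, h in zip(f, H):
    print(f"{fi:6.0f} Hz: {20*np.log10(abs(h)):7.3f} dB, {np.degrees(np.angle(h)):7.2f} deg")
# Roughly -0.001 dB / -1 deg at 1 kHz ... -0.5 dB / -19 deg at 20 kHz.
# Low-frequency group delay ~ 1/(2*pi*fc):
print("group delay ~", 1 / (2 * np.pi * fc) * 1e6, "us")   # ~2.8 microseconds
```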
 

Le Concombre

Active Member
Joined
Jul 8, 2020
Messages
120
Likes
34
There's been quite an interesting debate (notwithstanding the politics involved) on AS's PGGB thread pertaining to this topic.
The rich and readily disdainful guys are fond of this way of turning a CD rip into a 20 GB file overnight on computers with 128 GB of RAM (can't say it sounds bad though, and the creator seems nice and savvy).
Turns out that convolution has been implemented with linear-phase filters in PGGB: no EQ-induced phase shift, but a guarantee of pre-ringing (and claims by the rich guys that -- at last -- convolution sounds good; the stage at which it's implemented and the fact that the filters are insanely long are to be considered, though).
The usual suspects that made me lose time with nonsense target curves and nonsense full-range correction chimed in and attacked with their time-domain approach.
Maybe it's because I have an active servo-controlled ("asservi": probe-fed electronics making sure the membranes move as intended, no more) system that was factory recalibrated/checked quite recently, but I'm better off not messing with phase (and believe me, I lost hours fiddling) and keeping EQ points -- few, the fewer the better -- below 500 Hz.

Bob Katz points (favourably) to https://www.hometheatershack.com/threads/waterfalls.7135/ and, if I read it correctly, modal corrections' time-domain modifications are to be preserved.
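For what pre-ringing means in practice, a minimal sketch with a toy linear-phase low-pass FIR (the length and cutoff are arbitrary and have nothing to do with PGGB's actual filters):

```python
import numpy as np
from scipy import signal

fs = 48000
taps = 1001
# A linear-phase low-pass FIR has a symmetric impulse response, so it "rings"
# before its main peak exactly as much as after it (pre-ringing).
h = signal.firwin(taps, cutoff=2000, fs=fs)
peak = int(np.argmax(np.abs(h)))
print("peak at tap", peak, "of", taps)                    # centred at (taps - 1) / 2 = 500
pre = np.sum(np.abs(h[:peak]))
post = np.sum(np.abs(h[peak + 1:]))
print("sum |h| before peak / after peak:", pre / post)    # ~1.0: perfectly symmetric
# A minimum-phase filter with the same magnitude response would push all of that
# ringing after the peak, at the price of a frequency-dependent phase shift.
```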
 
Last edited: