
Does Phase Distortion/Shift Matter in Audio? (no*)

Has anyone here tried to implement the various techniques that Obsessive Compulsive Audiophile has explained (or attempted to explain) on YouTube? He claims that correcting phase below the Schroeder frequency is important and that his technique results in a marked increase in clarity, and he now proposes that any correction above that is doomed to create more problems than it solves.

I tried to implement some of his previous techniques and couldn't get them to work. He's a gifted vector-math expert, but he tended to make mistakes and/or go off on tangents in his previous videos, and he hasn't put together a PDF that clearly explains every step in sequence. At the very least his latest iteration is much simpler than his previous efforts, but I have yet to give it a shot.


I haven't seen all his videos, but my guess is that what he means above Schroeder is that "auto-magic" inversion to correct additional room-reflection effects can make things worse through excessive "correction". But, IMO, the same can still happen even below that.

The inversion techniques in REW, I find, tend to work too aggressively, and I personally prefer manual EQ.
 

Have you found his videos to be as hard to follow as I have? If I were him, I'd do one video on theory for those interested in delving into that, then another strictly on the cookbook procedure for implementation. His latest video is indeed more straightforward, but if you haven't tried all his previous methods and used REW to phase correct your mic cal file, he merely says it's a vital step previously covered. Doh.

I was initially interested in his technique because Audyssey in my ca. 2010 Onkyo AV receiver never sounded right, and I was not at all satisfied with the results using a miniDSP DDRC-24 and Dirac. Both resulted in a smeared phantom center that was highly annoying.

I would think that non-heavy handed manual EQ to bring down broad peaks in anechoic response would be preferable to what I have heard from the automated systems.
 
I do not use Audyssey, nor do I own an AVR. I make use of JRiver's internal convolution engine and the FIR section of a miniDSP 2x4HD. I really do not see a huge difference from having the estimated phase of the microphone embedded in the calibration file; nevertheless, I have added the phase information, as it seems to give a better result when "est. IR delay" is applied.
 
Old school here, Foobar2000 and its convolver plug-in.
 
Have you found his videos to be as hard to follow as I have?

Another problem with inversion in REW is that it really isn't as easy or straightforward as it sounds. Even with semi-automation of the inversion of just the magnitude response, there's a whole lot of manual tinkering involved to get to a certain desired result:

1/48 smoothing, MP, FDW 15 cycles under 500 Hz (10% regularisation); psychoacoustic smoothing, MP, FDW 7 cycles above 500 Hz (16% regularisation)
[attached REW screenshot of the resulting filter]

*I should have manually truncated that unwanted rise/boost in the filter above 16kHz. There are too many steps along the way that need double-checking for this to feel seamlessly "automatic" and simple -- and this is just EQ for a single mic-position measurement.

I had to split the filtering into two sections. And even then, manually edit the IR data exported into text files to truncate information below a certain frequency point, convolve both sets of filters, and then apply filtering at the lower tail end to prevent pre-echo... my bad, you've got to convert the magnitude-only filter to its minimum-phase version before it's usable. The whole thing is very involved, and it's super easy to make mistakes and produce not-so-good results, unfortunately.
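In case it helps anyone picture that workflow, here's a rough numpy sketch of the last couple of steps as I understand them: a regularised magnitude-only inversion, conversion to a minimum-phase FIR via the real cepstrum, then cascading ("convolving") the two band sections into one filter. The regularisation constant, the toy "measurement" and all the names are mine for illustration only, not anything exported from REW or taken from OCA's procedure.

```python
import numpy as np

def regularised_inverse(measured_mag, target_mag, beta=0.05):
    """Magnitude-only inversion with a simple regularisation term so deep
    nulls don't become huge boosts (beta is an illustrative constant)."""
    return target_mag * measured_mag / (measured_mag ** 2 + beta)

def min_phase_fir_from_magnitude(mag, n_taps):
    """Turn a desired magnitude response (sampled on a full, symmetric
    FFT grid) into a minimum-phase FIR via the real cepstrum."""
    n_fft = len(mag)
    log_mag = np.log(np.maximum(mag, 1e-8))
    cep = np.fft.ifft(log_mag).real
    folded = np.zeros_like(cep)
    folded[0] = cep[0]
    folded[1:n_fft // 2] = 2.0 * cep[1:n_fft // 2]
    folded[n_fft // 2] = cep[n_fft // 2]
    h = np.fft.ifft(np.exp(np.fft.fft(folded))).real
    return h[:n_taps]   # minimum-phase energy is packed at the start

# --- illustrative use ----------------------------------------------------
n_fft = 8192
half = np.ones(n_fft // 2 + 1)        # stand-in for a smoothed measurement (linear magnitude)
half[100:140] *= 2.5                  # pretend there's a broad room peak around those bins
measured = np.concatenate([half, half[-2:0:-1]])   # mirror to a full symmetric FFT grid

correction = regularised_inverse(measured, target_mag=np.ones(n_fft))
fir_low = min_phase_fir_from_magnitude(correction, n_taps=4096)

# the upper-band section would be built the same way from its own correction
# curve; cascading ("convolving") the two gives one combined filter:
fir_high = np.zeros(512)
fir_high[0] = 1.0                     # placeholder: a unity (do-nothing) filter
combined = np.convolve(fir_low, fir_high)
```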

I still prefer a gentler, fully manual filter creation method with something like rePhase.

MDAT example (*re-edited target curve slightly -- I hope this makes sense):
 
I still prefer a gentler, fully manual filter creation method with something like rePhase.
Hi,
I like an automated FIR filter generator that allows strong manual control.
I prefer to be able to make measurements with whatever software I like... REW, Smaart, ARTA, OSM, etc.
And then define the passband targets, and let the auto-generator do its work.
The manual control I need is how much fractional octave smoothing to apply both to the measurement, and also to the auto-magnitude and phase matching.

I've only found two products like that.
First, and the one I have tons of experience with, is the FirDesigner family from Eclipse Audio. There are several versions, with the lower two at $250 and $300.
Relatively easy to learn and use.

The second is a new one to me. Crosslite+ https://www.fmscience.com.br/en/crosslite.php
I think it's meant for the large-scale pro-audio system engineer. Considerable dual-channel FFT experience is a prerequisite, IMO.
Incredibly capable... it's both a dual-channel measurement system (for $500) and an auto FIR generator (for an additional $500).
Works much like FirDesigner.

I've looked at Acourate and Audiolense many times, but frankly they seem overly complicated and don't really appear to give me the straightforward, easy control I want when building FIR files. I've also played with REW's impulse inversion capability... but like you, I find it too strong/too unpredictable.

FirDesigner is really the DIY way to go I think.
 
Another problem with inversion in REW is that it really isn't as easy or straightforward as it sounds. Even with semi-automation of the inversion of just the magnitude response, there's a whole lot of manual tinkering involved to get to a certain desired result:
....

In re-reading your reply above, I believe that's why OCA now states his simple inversion method should only be applied below the Schroeder frequency. He said results are too variable at higher frequencies, generally causing worse results than leaving things alone, and that most of the improvement to be had is up to ca. 200 Hz in any case. His reasoning for that is how the brain learns to process direct vs. reflected sound, and the vector rather than scalar nature of the sound field at higher frequencies, which cannot be captured with omni mics anyway.
 
Do you have any reason to believe that he is doing controlled listening tests, and not simply fooling himself?
 

No. I'm just curious as to what others may have tried based on his technique, what the results were, and a bit of what he said as to his reasoning for limiting the inversion filter to lower frequencies. As I noted, I attempted to work through one of his previous procedures and it failed miserably. Vocals that are at least fairly well centered in headphones or in the uncorrected speaker system sounded bizarrely unfocused/"phasey" so I assumed I screwed up.
 
The trouble is, unless these 'others' you seek out are doing controlled listening evaluations (and guess what, they never are), it's all worthless chatter.

Sighted listening evaluations of room correction (including Dirac ART) are rife, and they lead to a lot of mythmaking.
 
You are missing the point. I'm not even asking if it's "better". I'm just asking if it freaking worked at all to get to the point of making any sort of a judgement, or to the point of being able to compare a measured response to a predicted response.

If vocals that were in phase are now out of phase between left and right, something went wrong in the process.
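For what it's worth, a crude way to check for exactly that kind of botched result is to look at the correlation between the left and right channels on material that should be centered. This little numpy sketch is purely my own sanity check, not part of anyone's procedure; the toy signal and threshold interpretation are assumptions:

```python
import numpy as np

# toy stereo "vocal" that is supposed to be centered (identical in L and R)
fs = 48_000
t = np.arange(fs) / fs
left = np.sin(2 * np.pi * 220 * t)
right = left.copy()

# a botched correction filter might, for example, flip the polarity of one channel:
right_bad = -right

def channel_correlation(l, r):
    """Normalised correlation at zero lag: +1 = in phase, -1 = inverted."""
    return float(np.dot(l, r) / np.sqrt(np.dot(l, l) * np.dot(r, r)))

print(channel_correlation(left, right))       # ~ +1.0 : coherent phantom center
print(channel_correlation(left, right_bad))   # ~ -1.0 : something went wrong
```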
 
No. I'm just curious as to what others may have tried based on his technique, what the results were, and a bit of what he said as to his reasoning for limiting the inversion filter to lower frequencies. As I noted, I attempted to work through one of his previous procedures and it failed miserably. Vocals that are at least fairly well centered in headphones or in the uncorrected speaker system sounded bizarrely unfocused/"phasey" so I assumed I screwed up.

@OCA would you mind jumping in, they are talking about you :)

I would assume that he limits the inversion to lower frequencies because that is where you will find minimum-phase behaviour. Once you go above the transition zone, the response is no longer minimum phase and cannot be corrected with inversion. This does not mean that the software won't try to correct it; software is stupid and does whatever you ask it to do. My procedure is as follows:

1. Calculate the Schroeder frequency from Fs = 2000 * sqrt(T30 / room volume) in metric units (seconds and m³), or Fs = 11,885 * sqrt(T30 / room volume) in imperial (seconds and ft³). A small sketch of this calculation follows the list.
2. The transition frequency is 4 * Fs.
3. Look at your FR sweep. You will find that the peaks and dips become increasingly closely spaced as you progress up the frequency range. Above the transition zone (4Fs) you will see "grass". This is a useful way to estimate the end of the transition zone in your room.
4. I apply tight windowing and only correct up to 4Fs. Above 4Fs, I apply loose windowing and correct "broad tone bands". I might give it a downward tilt if I see the result is too flat.
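To make steps 1 and 2 concrete, here's a tiny Python sketch of the calculation; the example T30 and room volume are made-up figures, not from any real room:

```python
import math

def schroeder_frequency(t30_s: float, volume_m3: float) -> float:
    """Fs = 2000 * sqrt(T30 / V) with metric units (use 11,885 with ft^3)."""
    return 2000.0 * math.sqrt(t30_s / volume_m3)

t30, volume = 0.45, 60.0            # illustrative figures for a smallish living room
fs = schroeder_frequency(t30, volume)
transition = 4.0 * fs               # upper end of the transition zone
print(f"Schroeder ~{fs:.0f} Hz, transition zone ends ~{transition:.0f} Hz")
```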

I have wondered if critical bands (i.e. the frequency range within which our brain masks nearby sounds, which becomes broader at high frequencies) might have something to do with it.
 
Thanks, Keith... I had no idea (or totally forgot, maybe?) that @OCA participated here. I originally just found his videos via a sidebar suggestion on YouTube, which I assume showed up when I was viewing manufacturer videos on room correction gear.

I'm an engineer, but a chemical engineer, so I have a grasp of the basics of wave phenomena from physics, but not much beyond that. It's closing in on 50 years since I took controls theory, which was taught from a process point of view, spoke in terms of lags, and used Laplace transforms rather than phase or Fourier transforms. When we asked the prof to explain what the hell the S domain was supposed to be, he essentially gave up and said, "Trust me, it works; you get to do complex calculations using simple algebra." Then we learned in the lab that all our efforts to model processes and lags in the real world didn't give us better results than simply following cookbook procedures to tune real-world analog controllers, which deviated from ideal action themselves, to do their best on real-world processes and variables. Hence, I never again opened that controls theory book in 35+ years in industry.

Unlike electrical engineers, we didn't have any previous coursework in complex number math. All I know is my EE friends made jokes about Poles (with a capital P) and planes.

The last time I attempted to follow OCA's procedure was before I had to totally rearrange this room to accommodate boxes full of a family member's belongings while they're back here on a temporary basis, so my earlier measurements are no longer representative. That's why I'm curious about any results from other members who've tried the latest suggestions before I dive in again.
 
Tell me, how far do you think a correction for one point in a soundfield at 1 kHz applies? How far can you move from that point and still have meaningful correction?

That is, assuming that you correct pressure. But you have to pick something out of those 4 variables at one point, and most choose pressure.

As for the critical bands mentioned above, I would prefer you use a more modern framework, i.e. ERBs instead of critical bands, and, well, I'd add some discussion to the wiki if somebody stuck in 1955 wouldn't just remove it again. But yes, the fact that ERBs are reasonably estimated as about 40 Hz wide until a 1/4 octave becomes wider than 40 Hz, at which point they're about 1/4 octave, does imply that very narrow correction at high frequencies is pointless, even if it were possible in the first place (breathe out and your correction is wrong, for instance).
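If it helps to see that rule of thumb as numbers, here's a small Python sketch of the bandwidth estimate exactly as described above (about 40 Hz wide until a quarter octave exceeds 40 Hz, then roughly a quarter octave). This is just that rule of thumb, not the full Glasberg-Moore ERB formula, and the function name is my own:

```python
import math

QUARTER_OCTAVE = 2 ** 0.125 - 2 ** -0.125   # fractional width of a 1/4-octave band (~0.173)

def rough_erb_width_hz(f_hz: float) -> float:
    """Rule of thumb as I read it: ~40 Hz wide at low frequencies,
    about a quarter octave once that becomes the wider of the two."""
    return max(40.0, f_hz * QUARTER_OCTAVE)

for f in (100, 250, 500, 1000, 4000):
    print(f"{f:>5} Hz -> ~{rough_erb_width_hz(f):.0f} Hz wide")
```

The crossover where the quarter octave takes over lands around 230 Hz, which lines up with the "above about 250 Hz" figure below.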

This is not a short discussion, but yes, above about 250 Hz or so (this has nothing to do with the so-called Schroeder frequency, btw; it has to do with head size vs. wavelength) it's quite difficult to even do an "exact correction" correctly. For which ear, maybe, pick ONE? This does mean that you should smooth the spectrum you measure BEFORE you calculate the correction filter. How much you smooth is not a short discussion, either.
 
In re-reading your reply above, I believe that's why OCA now states his simple inversion method should only be applied below the Schroeder frequency. He said results are too variable at higher frequencies, generally causing worse results than leaving things alone, and that most of the improvement to be had is up to ca. 200 Hz in any case. His reasoning for that is how the brain learns to process direct vs. reflected sound, and the vector rather than scalar nature of the sound field at higher frequencies, which cannot be captured with omni mics anyway.

"Room correction" programs, of course, can window and smooth out some of the messiness when reading and analyzing measurements taken inside a room at a distance, but it's never perfect -- there are other things that can't be fully automatically accounted for such as exact listener(s) positioning (obviously, people move -- not to mention the junk inside our rooms does get re-arranged over time) and then there's the specific speaker position and angle relative to the room and listener(s), significant room boundary induced reflections, apriori knowledge of the directivity behavior of the speaker etc. Taking every single variable into account is time consuming and requires additional expertise. But, for "finished" speakers that are presumably already carefully "speaker corrected" or calibrated/"voiced" by the manufacturer and positioned optimally in-room, I think it's fair to say the need for significant additional post-installation room EQ above the bass region should be minimal, if at all -- on the other hand, there's also nothing really grievously wrong when general tonal EQ adjustments are applied esp. when it sounds better to you -- no one's going to die if you break the default "voicing" created by the speaker maker/manufacturer or didn't conduct a formal double blind test.
 

I think FIR designer and Crosslite+ are really good commercial software packages... beggars (like moi) can't be choosers, however!
 
Tell me, how far do you think a correction for one point in a soundfield at 1 kHz applies?

Interesting question, and I don't know of any psychoacoustic studies that specifically looked at this in detail. Celestinos determined in his Ph.D. thesis a "cell size" of 1/5 to 1/10 of a wavelength. At 1 kHz the wavelength is 34.4 cm (about 1 ft), so about 7 cm to 3.4 cm. But (how) does this even correlate to human hearing (= perception)? Does the knowledge of cochlear filters help us here? Do you know more?
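Just to show the arithmetic behind those numbers (assuming c = 344 m/s):

```python
c = 344.0                       # speed of sound in air, m/s (assumed)
f = 1000.0                      # Hz
wavelength_cm = c / f * 100     # ~34.4 cm
print(wavelength_cm / 5, wavelength_cm / 10)   # ~6.9 cm and ~3.4 cm "cell size"
```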
 
Interesting question, and I don't know of any psychoacoustic studies that specifically looked at this in detail. Celestinos determined in his Ph.D. thesis a "cell size" of 1/5 to 1/10 of a wavelength. At 1 kHz the wavelength is 34.4 cm (about 1 ft), so about 7 cm to 3.4 cm. But (how) does this even correlate to human hearing (= perception)? Do you know more?
Well, that's 1/2 or 1/4 of the distance from the center of your head to your ear, yes?
 
Well, that's 1/2 or 1/4 of the distance from the center of your head to your ear, yes?
Yes, but the question is: by how much is the correction for neighbouring points wrong? Is it still an improvement, or did it make things worse? By how much? How does it relate to what is actually corrected (minimum phase, mixed phase, etc.)? And how is the correction affected when you replace the mic with a human head and torso? Does the correction for even the (small) center-point area get invalidated?
 
Yes, but the question is: by how much is the correction for neighbouring points wrong? Is it still an improvement, or did it make things worse? By how much? How does it relate to what is actually corrected (minimum phase, mixed phase, etc.)? And how is the correction affected when you replace the mic with a human head and torso? Does the correction for even the (small) center-point area get invalidated?

You could just measure the relevant neighbouring points and discover what the variation is…

MMM or spatial averaging could help as well.
Normally, the phase (thinking more of absolute phase here) shouldn't really change much with the influence of the room, with the exception of a few areas where there are particularly sparse, strong modal and boundary influences, and when measuring at the far edges of the speakers' beamwidth or pattern-control range.

As for the head and torso, well, I think the brain compensates for the effect of your own body, so I'm not sure why it would matter all that much.
 