Understanding the State of the Art of Digital Room Correction

Status
Not open for further replies.
uh, asked and answered? You did see the comment about "minimum phase"? right? Hello?
I'm not sure where this was asked and answered; it was not my intention to ask a redundant question. Even if it was, I'm not sure your response above was warranted.
Further adding that if you have a minimum phase system, another minimum phase system can correct it. Now, acoustic systems do not have to be anything at all like minimum phase, but there is a large part that is, and you're not going to correct the rest (for phase) without getting into linearity issues in hearing.
Given that Mitch is promoting the use of time correction and non minimum phase correction filters in this thread and you said that they were absolutely, positively not right for room correction I was hoping you might be able to expand on that thought.
 
Given that Mitch is promoting the use of time correction and non minimum phase correction filters in this thread and you said that they were absolutely, positively not right for room correction I was hoping you might be able to expand on that thought.

Instead of playing straw man and 'let's you and him fight', consider what I actually said, which is this:
"Symmetric FIR's are absolutely, positively NOT what you want for room correction. Nope."

There are a host of options between constant delay and minimum phase. In a symmetric FIR, every matching quad of complex roots, and every matching pair of real roots can be massaged in 3 ways. They can be left as inside/outside the unit circle, they can both be inside (minimum phase for those roots only), or they can both be outside (maximum phase).

That means for every set of roots, complex or real, that are not on the unit circle, you have 3 choices. If you have 50 such pairs of roots, well, there you go, lots of choices there, all of which have absolutely identical magnitude response (only the phase differs).

I didn't say "non-minimum phase"; I said "constant delay". Constant delay, for high precision inversion, can start to invade the pre-echo domain.
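
The root-flipping argument above can be checked numerically. What follows is a minimal sketch, not anything posted in the thread: the toy 5-tap filter, the helper name `flip_zeros_inside`, and the FFT size are all assumptions chosen for illustration. It builds a symmetric FIR from one quad of complex roots, reflects the outside roots to inside the unit circle, and compares the two magnitude responses.

```python
import numpy as np

def flip_zeros_inside(h):
    """Hypothetical helper: reflect every zero outside the unit circle to its
    conjugate reciprocal inside, rescaling so the magnitude response is unchanged."""
    zeros = np.roots(h)
    outside = np.abs(zeros) > 1.0
    # Moving a zero z to 1/conj(z) scales |H| by 1/|z|, so compensate in the gain.
    gain = h[0] * np.prod(np.abs(zeros[outside]))
    zeros[outside] = 1.0 / np.conj(zeros[outside])
    return np.real(gain * np.poly(zeros))

# One quad of complex roots: z0, conj(z0), and their reciprocals (toy values).
z0 = 0.8 * np.exp(1j * 1.0)
quad = [z0, np.conj(z0), 1.0 / z0, 1.0 / np.conj(z0)]
h_sym = np.real(np.poly(quad))      # symmetric (constant-delay) FIR
h_min = flip_zeros_inside(h_sym)    # all four roots now inside the unit circle

mag_sym = np.abs(np.fft.rfft(h_sym, 512))
mag_min = np.abs(np.fft.rfft(h_min, 512))
same_magnitude = np.allclose(mag_sym, mag_min)

# Three placements per root set, so 50 sets give 3**50 distinct filters.
n_variants = 3 ** 50
print(same_magnitude, n_variants)
```

The same reflection applied the other way (inside to outside) would give the maximum-phase variant; only the phase, and therefore the time-domain shape, changes between the variants.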
 
I have to say I am not really a fan of room correction software, because I don't really like having the sound itself altered so it sounds "right". I would rather accept the way it is due to the environment, or alter the environment to improve it.

Haha, you can call it old-fashioned thinking.
 
Thanks so much for the wonderful video!
I am certain the video will be referenced for years to come.

I am likely going to play around with DRC-FIR in the next few days and hopefully post my process and results.

A few questions / clarifications @mitchco before I start though.
1) My understanding from the talk is that you want a longer FDW window at lower frequencies, since there we typically hear the steady state of the room and the room is modal, while at higher frequencies we don't want too long a window, as late reflections add to envelopment etc. Is that a reasonable understanding?
How do you determine the FDW, and what would be a reasonable window for the low and high frequencies? In your presentation you mentioned 6 cycles at low frequencies and 1 cycle at higher frequencies. Does that seem reasonable for a typical listening room size? In the DRC-FIR scripts by gmad I think he recommended playing around with 3-6 cycles, but I don't seem to recall any mention of using a larger number of cycles at low frequencies and fewer at higher frequencies.

2) For whatever reason I previously (and wrongly) thought of taps as the amount of delay the filter could allow at various frequencies. But it appears that the number of taps is just the resolution of the filter? If one wants 'tight bass', then having more taps allows for better resolution of the correction, yet the number of taps does not indicate the amount of potential correction in terms of delay (phase correction)? Sorry, clearly I am confused. I guess I previously thought that more taps = more ability to correct the phase of low frequencies (which have longer periods) = longer filter generated... yet that is not the case, and the number of taps is the resolution of the filter? Then what controls or sets the time length of the filter... the FDW?
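
The arithmetic behind both questions can be sketched in a few lines. This is only an illustration under stated assumptions: the 48 kHz sample rate, the reading of "65k" as 65536 taps, and the function names are not from the thread. The first helper converts a number of cycles at a given frequency into a window length; the second shows that a tap count fixes both the time span of the filter and its frequency spacing at once.

```python
def fdw_window_ms(freq_hz, cycles):
    """Window length in milliseconds that spans `cycles` periods at freq_hz."""
    return 1000.0 * cycles / freq_hz

def fir_span(taps, sample_rate_hz):
    """Return (duration in seconds, frequency spacing in Hz) for a FIR of `taps` taps."""
    return taps / sample_rate_hz, sample_rate_hz / taps

low_ms = fdw_window_ms(20.0, 6)        # 6 cycles at 20 Hz
high_ms = fdw_window_ms(20000.0, 1)    # 1 cycle at 20 kHz

# Assumed values: 65536 taps at a 48 kHz sample rate.
duration_s, spacing_hz = fir_span(65536, 48000)
print(low_ms, high_ms, duration_s, spacing_hz)
```

Under these assumptions the filter spans a little over 1.3 seconds with sub-1 Hz spacing, which is why more taps and longer time span are two views of the same number rather than independent settings.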

Sorry, I am clearly a noob when it comes to FIR filters, but I am keen to learn and certainly your video has been much appreciated.
 
Instead of playing straw man and 'let's you and him fight', consider what I actually said, which is this:
"Symmetric FIR's are absolutely, positively NOT what you want for room correction. Nope."

That was not at all my intention and there you put words in my mouth that I did not say or imply.

I asked a simple question as I wasn't sure what you meant or exactly what you thought was best.

You responded as you did and I tried to clarify. I can see I took two of your statements and put them together in a way that was not clear.
That was not a deliberate act to cause trouble either.

Apologies to Mitch for the side show in an otherwise interesting thread.
 
Speech intelligibility is a completely different critter. EOF.
Just noting that is where the spectral envelope got its legs in the first place. It was right in the description, suggesting that speech could be recognized across different circumstances (e.g. whisper vs normal speech) by focusing on the envelope. And maybe that's a strategy which, if not peculiar to speech, is at least an essential strategy. It may or may not have more general utility. That's all--just the scientist in me wondering aloud about assumptions. Apparently it works well. Just not sure if all eggs in the basket should be those.
 
Multiple measurements can actually reduce the resolution of the correction at the listening position. This was mentioned in Sean Olive's study, “The Objective and Subjective Evaluation of Room Correction Products.” See: https://www.audiosciencereview.com/...review-room-eq-setup.26397/page-9#post-906241 You can hear it in AB tests, which I have done, and I encourage others to compare and hear with their own ears.

I have also measured the reduction in resolution:

View attachment 161854

I took sweeps of two correction filters through the convolution engine so we can see each filter that is being applied to the same loudspeaker system in the same room. The top correction filter is Dirac's using their recommended multiple measurement approach and the bottom is one of the DRC packages discussed using a single measurement. The red line can be considered 0 dBFS for the bottom correction as it is cut only with no boosting.

As one can see, below 600 Hz, the Dirac correction has significantly less frequency correction resolution than the other correction filter. And the Dirac filter is overcorrecting at 95 Hz. As one should be able to glean, the audible differences between the two filters are significant in AB testing.

On a slightly different note, Dirac uses a mix of IIR and FIR filters. IIR filters are used at the low frequencies and therefore offer no excess phase correction at low frequencies to correct for the room's non-minimum phase response.

Don't get me wrong, I have reviewed Dirac extensively here. But because of these audible/measurable shortcomings, it did not make the SOTA list, which is what this post and video are all about.

If you disagree, fine. But show some data/measurements/listening tests to support your position. So far I see a lot of words, but no real data.

One crucial mistake in what you have just shown is that you cannot check the effects of a room correction based on multipoint measurements with a single-point sweep. Of course a single-point check will be pretty much identical to a correction based on a single-point measurement, and I assumed it goes without saying that a correction based on multipoint spatial measurements should be checked with a multipoint spatial control measurement.
 
Just noting that is where the spectral envelope got its legs in the first place. It was right in the description, suggesting that speech could be recognized across different circumstances (e.g. whisper vs normal speech) by focusing on the envelope. And maybe that's a strategy which, if not peculiar to speech, is at least an essential strategy. It may or may not have more general utility. That's all--just the scientist in me wondering aloud about assumptions. Apparently it works well. Just not sure if all eggs in the basket should be those.

Oh yes, spectral envelope is a big thing for speech intelligibility. But enhancing speech for best intelligibility (which I have some experience with, no I can't say quite what, grrr) does NOT SOUND NATURAL, nosirree, but you will (*&(&* well understand it as well as possible.
 
Also mentioned in the video: in my tests, David’s Focus Fidelity Designer is the only DSP FIR designer that gets multiple measurements correct. Focus Fidelity uses multiple measurements to build a transfer function model (this is difficult to do well) and from there applies less correction to features which change with position. Focus Fidelity avoids the resolution reduction (heard as overcorrection) by not just averaging the multiple measurements like so many other DSP/DRC packages do. Talking about the state of the art here.

Neither @Sean Olive nor I ever stated that spatial/multipoint measurements should be simply averaged. The point was that room correction shouldn't be based on a single-point measurement.
 
That was not at all my intention and there you put words in my mouth that I did not say or imply.

I asked a simple question as I wasn't sure what you meant or exactly what you thought was best.

You responded as you did and I tried to clarify. I can see I took two of your statements and put them together in a way that was not clear.
That was not a deliberate act to cause trouble either.

Apologies to Mitch for the side show in an otherwise interesting thread.

Well, I apologize for being grouchy, but you did generalize quite inaccurately. "what's best" is rather annoyingly complicated, in my view, but is neither minimum phase (although it's closer to that) nor anything remotely like constant delay. "what's best" is room, data, speaker, and pretty much everything-dependent.
 
Olive and Toole will be the demiurges?

For the moment, no mathematics can predict the type of correction to apply. Only empiricism, like 17th-century alchemy.

The impulse response cannot be averaged.


I never stated that spatial multipoint measurements should be averaged, but, for your information, impulse responses can be averaged. For an example of what you get with it, try reading about "vector average" in REW.

Olive and Toole are the guys you should be learning from. Of course, there is still a lot to be discovered, but as you don't seem to know that IRs can be averaged, there are obviously a lot of things you can learn from Olive and Toole.
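
As a rough illustration of what averaging impulse responses means, here is a toy sketch. It assumes that "vector average" refers to averaging the complex data (equivalently, averaging time-aligned IRs sample by sample); it is not REW's implementation, and the two synthetic IRs and the function names are made up for the example. It contrasts that with averaging magnitude spectra only, where phase is discarded.

```python
import numpy as np

def vector_average(irs):
    """Average time-aligned impulse responses sample by sample (phase is kept)."""
    return np.mean(np.asarray(irs, dtype=float), axis=0)

def magnitude_average(irs, n_fft):
    """Average only the magnitude spectra of the IRs (phase is discarded)."""
    spectra = np.fft.rfft(np.asarray(irs, dtype=float), n=n_fft, axis=1)
    return np.mean(np.abs(spectra), axis=0)

# Two toy IRs: same direct sound, one reflection arriving at different delays.
ir_a = np.zeros(64)
ir_a[0], ir_a[10] = 1.0, 0.5
ir_b = np.zeros(64)
ir_b[0], ir_b[14] = 1.0, 0.5

avg_ir = vector_average([ir_a, ir_b])
vec_mag = np.abs(np.fft.rfft(avg_ir, n=64))
mag_avg = magnitude_average([ir_a, ir_b], 64)

# Position-dependent features partly cancel in the vector average, so its
# magnitude never exceeds the magnitude-only average at any frequency.
print(np.all(vec_mag <= mag_avg + 1e-12))
```

In this toy case the common direct sound survives at full level while each reflection is halved, which is one way an average can de-emphasize features that change with position.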
 
Multiple measurements can actually reduce the resolution of the correction at the listening position. This was mentioned in Sean Olive's study, “The Objective and Subjective Evaluation of Room Correction Products.” See: https://www.audiosciencereview.com/...review-room-eq-setup.26397/page-9#post-906241 You can hear it in AB tests, which I have done, and I encourage others to compare and hear with their own ears.

I have also measured the reduction in resolution:

View attachment 161854

I took sweeps of two correction filters through the convolution engine so we can see each filter that is being applied to the same loudspeaker system in the same room. The top correction filter is Dirac's using their recommended multiple measurement approach and the bottom is one of the DRC packages discussed using a single measurement. The red line can be considered 0 dBFS for the bottom correction as it is cut only with no boosting.

As one can see, below 600 Hz, the Dirac correction has significantly less frequency correction resolution than the other correction filter. And the Dirac filter is overcorrecting at 95 Hz. As one should be able to glean, the audible differences between the two filters are significant in AB testing.

The "resolution" of both filter responses looks very similar. It's not obvious to me that the filters have different resolution, let alone reduced resolution in the upper graph.
Whether the Dirac filter "overcorrects" or not can't be assessed by looking at the filter's magnitude response (unless you believe boosting is equivalent to "overcorrection"). You would need to show the effects on the actual room response. Dirac does apply up to 10 dB of boost.
What can be seen in your plots, though, is that the magnitude responses of the two filters are quite different (and quite different over a broad range), so I'd expect them to sound quite different. Which one is "correct" would be speculation at this point.

P.S. The Dirac filter doesn't seem to do anything below 30Hz – maybe a "curtain" was set which makes the filter taper off at that frequency?
 
Uhhh. NO.

It is possible, indeed probable, that one might build an FIR in a minimum-phase configuration. For such, preloading with zeros can eliminate all but processing latency.

Symmetric FIR's are absolutely, positively NOT what you want for room correction. Nope.

I was referring to the FIR filters mitchco promotes. They are 65k taps. Any signal arriving at the filter needs to go through each and every tap before it can be sent to the DAC, i.e. the output signal is shifted in time with respect to the input. So I'm not sure how preloading with zeros would help?
 
I was referring to the FIR filters mitchco promotes. They are 65k taps. Any signal arriving at the filter needs to go through each and every tap before it can be sent to the DAC, i.e. the output signal is shifted in time with respect to the input. So I'm not sure how preloading with zeros would help?

Is it pure symmetric? Or is it minimum phase? Or something else? No, the first piece of data does not have to go through the whole filter to start, if it's minimum phase. Consider: there's nothing coming out of the system yet, so the problem is identical to having stuffed zeros.

Now, a constant-delay filter WILL have a 32k delay IN THE FILTERING. That latency could never be removed.

Do you understand the difference between filter DELAY and filter LENGTH now?
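
The delay-versus-length distinction can be made concrete with a toy pair of filters. This is a sketch under assumptions, not the filters discussed in the thread: the decaying sequence `g`, the 31-tap length, and the helper names are illustrative only. Both filters have the same length and the same magnitude response; one is symmetric (constant delay) and the other is minimum phase.

```python
import numpy as np

def peak_index(h):
    """Sample index of the largest-magnitude tap (a simple stand-in for delay)."""
    return int(np.argmax(np.abs(h)))

def energy_before_peak(h):
    """Energy arriving before the main peak, i.e. potential pre-ringing."""
    return float(np.sum(h[:peak_index(h)] ** 2))

def constant_delay_samples(taps):
    """Delay of a symmetric (linear-phase) FIR: half its length."""
    return (taps - 1) / 2

# Toy minimum-phase building block: a truncated decaying exponential.
g = 0.4 ** np.arange(16)
h_sym = np.convolve(g, g[::-1])   # symmetric, 31 taps, |H| = |G|^2
h_min = np.convolve(g, g)         # minimum phase, 31 taps, same |H|

same_magnitude = np.allclose(np.abs(np.fft.rfft(h_sym, 256)),
                             np.abs(np.fft.rfft(h_min, 256)))
print(len(h_sym), peak_index(h_sym), energy_before_peak(h_sym))
print(len(h_min), peak_index(h_min), energy_before_peak(h_min))
print(constant_delay_samples(65536))  # assumed reading of "65k" taps
```

The symmetric filter peaks at its midpoint with energy ahead of the peak, while the minimum-phase one peaks at its first tap with none, even though both are the same length.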
 
Really? So you deny the bandwidths of the hearing apparatus (i.e. ERB, cochlear filters, etc) that define where and what is audible?

You're dismissing all psychoacoustic research out of hand?

Not at all, but what I've read so far didn't really answer the questions I had and still have. Maybe you have better applicable references? For what it's worth, I went through a ton of papers (including yours) and Jan Schnupp's book.
 
Is it pure symmetric? Or is it minimum phase? Or something else? No, the first piece of data does not have to go through the whole filter to start, if it's minimum phase. Consider: there's nothing coming out of the system yet, so the problem is identical to having stuffed zeros.

Now, a constant-delay filter WILL have a 32k delay IN THE FILTERING. That latency could never be removed.

Do you understand the difference between filter DELAY and filter LENGTH now?

Not sure why you keep asking me about the filter. Again, it's the filters mitchco is promoting. See post 1. On the playback side the signal needs to go through a convolution engine. How computationally costly can that get to achieve zero latency in a multichannel AVR for example?
 
What do you think is the right type of filter and why is a symmetric FIR wrong?

It's wrong because it introduces pre-ringing and that is audible. With minimum phase FIR you don't have pre-ringing.

Hopefully this picture will help explain:
 

[Attachment: Capture.JPG]
Well, then, you'd be wrong, for both acoustic and psychoacoustic reasons.

Corrections with a tiny bandwidth are incredibly position dependent. You'll never get it right for both ears at the same time. Remember, correction bandwidth times wavelength gives you a rough estimate of how far you can move before the correction goes completely wrong (note, there is another factor there under 1/2, but that depends on the tolerance you prefer).

Furthermore, that means longer filters, and much more chance of engaging nonlinearities in the auditory system, which can make "better via LMS" sound worse in practice.

All of this is testable and verifiable. Those who reject the actual behavior of the ear are simply scoffing at the science.

I have a hard time understanding the use of critical bands for frequency response of a reproducing system.

Let's take an FR that is flat on the ERB scale:


a.jpg


in "reality" though 55Hz (A1) is 3.6dB louder than 52Hz (G♯1):

b.jpg


If I play a "melody" alternating G♯1 and A1, won't I hear a difference before and after using this filter?
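
For scale, the numbers in this question can be put next to an ERB estimate. The sketch below uses the widely cited Glasberg and Moore approximation for the equivalent rectangular bandwidth; choosing that particular formula is an assumption, and the snippet says nothing about whether the 3.6 dB difference is audible. It only shows how far apart the two notes are compared with the ERB width at that frequency.

```python
def erb_hz(center_hz):
    """ERB width in Hz (Glasberg & Moore approximation, assumed here)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)

a1_hz = 55.0
g_sharp1_hz = a1_hz / 2 ** (1 / 12)   # one equal-tempered semitone below A1
spacing_hz = a1_hz - g_sharp1_hz
bandwidth_hz = erb_hz(a1_hz)

print(round(g_sharp1_hz, 1), round(spacing_hz, 1), round(bandwidth_hz, 1))
```

With this formula the two notes sit roughly 3 Hz apart inside a band about 30 Hz wide, which is why a response can look flat at ERB resolution while still differing between adjacent notes.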
 
Well, I apologize for being grouchy, but you did generalize quite inaccurately. "what's best" is rather annoyingly complicated, in my view, but is neither minimum phase (although it's closer to that) nor anything remotely like constant delay. "what's best" is room, data, speaker, and pretty much everything-dependent.
Apology accepted, this would have been a great response to my initial question :)
I'm not sure where I generalized inaccurately (certainly not my intention). If you care to point out where and how, here or via PM that would be appreciated.
 
It's wrong because it introduces pre-ringing and that is audible. With minimum phase FIR you don't have pre-ringing.

Hopefully this picture will help explain:

The problem is that you are assuming that "post ringing" is inaudible. It is audible.
So you want to correct for the best phase response while avoiding AUDIBLE pre-ringing, not avoiding pre-ringing altogether.
 