
Understanding Audio Frequency Response & Psychoacoustics (Video)

For room correction, the REW recommendation to use var smoothing is very reasonable for two reasons.
In the modal region, when you want to compensate room modes, you must hit them precisely, as they are minimum-phase phenomena.
At higher frequencies, where you mainly correct the loudspeakers, you don't want narrow (high-Q) filters as they can cause time-domain problems, plus of course our hearing is logarithmic, as correctly written by @jaakkopasanen.
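To put a rough number on the time-domain point, here is a minimal sketch of how long a second-order resonance rings as a function of its Q. The function names are made up for illustration, and the formula is the textbook envelope decay of a resonant pole pair, not anything REW itself reports; how long a real EQ filter rings also depends on its gain and implementation.

```python
import math

def ring_cycles(q, decay_db=60.0):
    """Cycles needed for a second-order resonance to decay by decay_db.

    The envelope of a resonance at f0 with quality factor Q decays as
    exp(-pi * f0 * t / Q), so the decay measured in cycles depends on Q only.
    """
    return (decay_db / 20.0) * math.log(10.0) * q / math.pi

def ring_time_s(f0_hz, q, decay_db=60.0):
    """Same decay expressed in seconds for a resonance centred at f0_hz."""
    return ring_cycles(q, decay_db) / f0_hz

if __name__ == "__main__":
    for q in (1.0, 5.0, 10.0):
        print(f"Q={q:4.1f}: {ring_cycles(q):5.1f} cycles, "
              f"{ring_time_s(50.0, q) * 1000:7.1f} ms at 50 Hz, "
              f"{ring_time_s(5000.0, q) * 1000:6.2f} ms at 5 kHz")
```

Under these assumptions the ringing scales linearly with Q, which is one way to see why broad filters are the safer choice when correcting the loudspeaker itself.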
 
I do see your point about hitting them precisely, but if Var Smoothing is not how we actually perceive, then you could be correcting stuff that you don't need to in the bass, or correcting it more than you need to. In that case (something I thought of just now), would the best method be to first use Var Smoothing to identify the precise/exact Hz location of the peaks, make a note of the exact positions on which to precisely apply the peak filters, then switch to ERB smoothing and put in negative gain on all those previously identified peaks until the ERB plot lines up on the Target? That way the exact location of the peaks would be targeted, but instead they would only be reduced enough to satisfy the perceptual ERB Target.
 
Thanks very much jaakkopasanen, I can definitely visualise all the effects you're mentioning there, and how it applies to reading the significance of the peaks & dips within the frequency response graph, I can also see why ERB smoothing works like it does there. As an extension on this topic, do you get involved much in Room EQ.....you'd think then this would make room EQ very simple - apply ERB smoothing and then it's super easy to iron out the dips & peaks because it smooths the bass so much.....is that the best approach to room EQ, because general advice seems contrary to that in being that you should have less smoothing on the bass for EQ purposes? In a few minutes I'll edit this post & attach a couple of room EQ measurement graphs of mine to illustrate the ease of room EQ using ERB smoothing vs no smoothing in bass (Var Smoothing):

Var Smoothing Measurement (no bass smoothing) & EQ Graphs:
View attachment 125773
View attachment 125774

ERB Smoothing Measurement & EQ Graphs:
View attachment 125775
View attachment 125776

Yes, so you can see it's a lot easier, in terms of using fewer filters and less sharp filters, when using ERB Smoothing for room EQ. But which is better? You'd think ERB is more applicable due to its perceptual nature, so why is there so much content out there saying that you shouldn't smooth the bass much (or at all) when doing room EQ?


The x-axis definitely has an implication here for reading the significance of a dip or peak: a wide peak or dip will visually look more significant than a narrower one (talking pixels on the screen here), so that would influence your intuitive decision on whether or not to EQ it. @jaakkopasanen cleared up the points about the logarithmic x-axis and the combined ERB influence, and related that to using ERB smoothing, which for me is a different conclusion from what your video implies. Maybe I had misinterpreted your video, but it seemed that one conclusion from it is that when reading frequency response graphs (logarithmic x-axis) you should be less concerned by peaks and dips the further up the frequency range you go, for ERB reasons. However, that contrasts with what jaakkopasanen has explained above. It could just be that your video is easily open to misinterpretation, but I definitely understand it based on jaakkopasanen's explanation above.
Room correction is a bit different from headphone EQ because with rooms and speakers you typically only apply EQ to low frequencies (~300 Hz and below). The frequency response above ~700 Hz is dominated by the speaker itself, and that shouldn't be corrected based on in-room measurements because the measured in-room frequency response doesn't tell the whole story of how we perceive it. This is because the brain can filter out many artifacts in the frequency response caused by reflections and such. The corrections in the upper mids and treble should be limited to very broad general tonality changes. Low frequencies can be corrected with fairly narrow filters even if it isn't strictly necessary. Or at least this is how I understand it, but then I'm not as familiar with room correction as with headphone EQ.
 
I do see your point about hitting them precisely, but if Var Smoothing is not how we actually perceive, then you could be correcting stuff that you don't need to in the bass, or correcting it more than you need to. In that case (something I thought of just now), would the best method be to first use Var Smoothing to identify the precise/exact Hz location of the peaks, make a note of the exact positions on which to precisely apply the peak filters, then switch to ERB smoothing and put in negative gain on all those previously identified peaks until the ERB plot lines up on the Target? That way the exact location of the peaks would be targeted, but instead they would only be reduced enough to satisfy the perceptual ERB Target.
This way you wouldn't completely compensate the minimum-phase behaviour of the mode, and you would get some phase problems. Also, ERB is just one very simplified and limited model of how we hear; in the end it's always best to try what works best in each individual case. For example, my current main setup room has some strong bass-absorbing regions, and if I correct to linear bass there the perceived bass is too much, so nice-looking frequency responses can sound worse than less linear ones.
 
I am sorry, but I have to stand in defense of @amirm.

Even though he may have had the best intentions in mind, in my humble opinion @jaakkopasanen is the one who has created confusion here. I must say that he did a brilliant job in explaining how we should interpret what we see on the graphs when we want to apply EQ, and we should all thank him for this, but I think he actually did no favor to Amir's point of view in the video.

@amirm did a great job in explaining what we actually hear vs. what we see on the graph. He used ERB as a point of reference with regard to psychoacoustics to make a point and teach us why we hear something the way we do. But for @jaakkopasanen to turn this the other way around in the same thread on the forum, making the polar opposite point by explaining what we see vs. what we hear, IMHO must not be done without prior explanation, just for the sake of avoiding confusion.

In a similar fashion, one could make a video explaining why we see colors the way we do, and use the very presence of light as a point of reference. Someone else could try to make his own point in the same thread about why we don't see colors in the absence of light and create similar confusion. And the discussion goes in the opposite direction, and now everyone is trying to prove whether or not colors even exist in the absence of light.

Although the knowledge provided is doubled for some, for others it may be just confusion. IMHO, threads in which we discuss what we hear (vs. what we see on the graph) and threads in which we discuss what we see on the graph (vs. whether we hear it or not) should be miles apart, simply because of possible misinterpretation.

I hope I did not create even more confusion here. But if you think I did, please just follow your intuition and learn both aspects of the same thing, but please don't further undermine Amir's work here, because he did his best not to create confusion in the first place. We hear with our ears and we see with our eyes; both are interpreted by the brain simultaneously. Audio is about what we hear.
 
Understanding Audio Frequency Response & Psychoacoustics
"...variations here (around 240Hz) is far more audible because hearing bandwidth, ERB is very narrow"
"...whereas, when you get up here (around 15kHz), you can see these little wiggles, your ERB is big chunk over here...these little peaks and valleys are never heard"
"the same peaks and valleys here (around 240Hz) are (can be) audible."

First of all, Amirm is clearly talking about the "same peak" as drawn on the log axis; otherwise the two would not look the same at all.
And the ERB's 51 Hz bandwidth at 240 Hz corresponds to about 0.306 oct, while its 1640 Hz bandwidth at 15 kHz corresponds to about 0.158 oct, so the former is obviously wider on the log axis.
So, if you see the "same peak" at 240 Hz and at 15 kHz on the log axis, the one at 240 Hz is less likely to be audible.
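The two figures above can be checked with a few lines of Python. This sketch assumes the common Glasberg and Moore ERB approximation, ERB = 24.7 * (4.37 * f_kHz + 1), and a band centred linearly on the frequency; the post does not say which formula was used, so small rounding differences are expected, and the function names are illustrative only.

```python
import math

def erb_hz(f_hz):
    """ERB bandwidth in Hz (Glasberg & Moore approximation, assumed here)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_octaves(f_hz):
    """Width of the ERB band in octaves, with the band centred on f_hz."""
    half = erb_hz(f_hz) / 2.0
    return math.log2((f_hz + half) / (f_hz - half))

if __name__ == "__main__":
    for f in (240.0, 1000.0, 15000.0):
        print(f"{f:8.0f} Hz: ERB = {erb_hz(f):7.1f} Hz = {erb_octaves(f):.3f} oct")
```

With these assumptions the band is about 51 Hz (roughly 0.31 oct) at 240 Hz and about 1640 Hz (roughly 0.16 oct) at 15 kHz, so the band grows in Hz but shrinks on a log axis as frequency rises.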

I'm very sorry to say this, because I respect his efforts, but this time I have to say Amirm is completely wrong.
 
Room correction is a bit different from headphone EQ because with rooms and speakers you typically only apply EQ to low frequencies (~300 Hz and below). The frequency response above ~700 Hz is dominated by the speaker itself, and that shouldn't be corrected based on in-room measurements because the measured in-room frequency response doesn't tell the whole story of how we perceive it. This is because the brain can filter out many artifacts in the frequency response caused by reflections and such. The corrections in the upper mids and treble should be limited to very broad general tonality changes. Low frequencies can be corrected with fairly narrow filters even if it isn't strictly necessary. Or at least this is how I understand it, but then I'm not as familiar with room correction as with headphone EQ.
Thanks, I'm aware of the transition frequency, etc. (I'll be using anechoic EQ above the transition and in-room measurements below it).
This way you wouldn't completely compensate the minimum-phase behaviour of the mode, and you would get some phase problems. Also, ERB is just one very simplified and limited model of how we hear; in the end it's always best to try what works best in each individual case. For example, my current main setup room has some strong bass-absorbing regions, and if I correct to linear bass there the perceived bass is too much, so nice-looking frequency responses can sound worse than less linear ones.
Ok, I'll be trying MMM method next time I do roomEQ, so I'll see how it goes then.
 
then switch to ERB smoothing and put in negative gain on all those previously identified peaks until the ERB plot lines up on the Target? That way the exact location of the peaks would be targeted, but instead they would only be reduced enough to satisfy the perceptual ERB Target.

I agree with @thewas that ERB smoothing in REW is just a simplified approximation. John Mulcahy himself has said that he finds it of little practical use for room measurements -- the resolution in the bass is too low to be useful. You might as well cycle between var, 1/12, 1/6, and psychoacoustic instead as a visual check. ERB can be useful in the HF, but since it and psychoacoustic give you very similar or identical results there... why not use the latter instead?

Personally, I prefer the "brute-force" technique, as Mitch would call it, which is to take a gazillion measurements and base corrections mainly on an average. Stuff that's less controllable and more chaotic around the transition zone, I tend to apply only small changes or leave alone. At least with my own measurements, I've seen a pattern where psychoacoustic smoothing gives somewhat similar results to a FDW of 7 cycles (where time domain information filtering is used).

Below is the left channel of my Sceptre S8, using the average of 30 sweeps (1/48 smoothing) taken around a squarish area of 15x15x5 inches at the MLP (a dedicated listening room), with no further smoothing applied. Stuff in the middle gets smoothed out a little more with MMM, and when the entire length of the couch is measured and averaged.

1619176044178.png


For the space around my open-plan living room precise correction at one area is not a realistic goal, and some "fuzzy" EQ may be all that's needed.

1618882705869.gif


Var smoothing in the lower frequencies would be too noisy for me to visually read and interpret for the above space.
 
I agree that ERB smoothing is not appropriate in every case. I love an unsmoothed response for some things, such as designing a speaker, and progressively more smoothing for others. ERB smoothing is a reasonable way into our perception of the speaker/room response. For removing a resonance, one likes to know exactly where it is and what its bandwidth is, which demands higher resolution. Knowing the decay time is also useful (as gleaned from waterfall plots), as I personally would prefer to notch out resonances with long decays.
 
Amir does a good job of explaining how to read frequency response graphs and how the graph corresponds to our hearing perception. Unfortunately, Amir made one big mistake in claiming that peaks and dips matter more in the bass range than in the treble range, when the opposite is true. This wouldn't be a big deal if he weren't such an authoritative figure in audio for many. In this video Amir is spreading misinformation, most people don't question what he says, and now there are who knows how many people who think that peaks and dips matter more at low frequencies when the truth is that they matter more at high frequencies. Pointing this out was what my first response was about. I wrote my second response because some people in this thread got confused about the logarithmic x-axis issue, which is essential to understanding why peaks and dips in frequency response graphs matter more at high frequencies. My intention was not to divert the discussion away from the topic of Amir's video but to point out the mistake and then clarify some points, which honestly I should have done in my first response.

Here is another visualization of the filter widths in case the problem is still unclear to some readers. These represent rectangular filters at 9 arbitrary frequencies. See how the filters get narrower towards the higher frequencies. That is because the x-axis is logarithmic in this plot, just as it typically is in frequency response graphs. The first filter has more room than the last one for dips and peaks that will be averaged by our hearing, meaning that small peaks and dips are more audible at high frequencies than at low frequencies.
ERB Filter Widths.png
 
I found this video very interesting and useful just like all the others. I'm just a music addict and a tech quasi-moron so anything you are willing to explain is like gold to me.

Thanks for your precious work Amir and keep 'em coming.

Regards
 
Thanks to Amir for succinctly explaining a phenomenon that has (perhaps) been deliberately complicated for the purpose of marketing.

This illuminates why I leave well enough alone although DIRAC is available to me. Sometimes you get lucky in a given room, with gear selection.

SWAG - as we get older, our HF hearing is inconsistent. Mine is most sensitive to the sound of crinkled Cheetos bags.
 
Unfortunately Amir made one big mistake in claiming that peaks and dips matter more in the bass range than in treble range when the opposite is true.

I did not get this from his video.
I can see where it can easily be misinterpreted though when he talks about ERB and about room resonances for lower frequencies which are very room dependent. Too condensed and not elaborate enough.. perhaps.. but videos can't become too long.

Looking at Amir's reviews and the corrections he has made for somewhat more audible peaks in the treble, it is evident that Amir does not consider peaks/dips in the treble to be less audible than those in the bass.
 
What you say is fine when we talk about the effects of ERB on headphones, which excludes the entire field of in-room reproduction where we use loudspeakers. There we cannot even talk about it without including the many troubles we get into below the transition frequency, as well as the psychoacoustic effects of reflections above the transition frequency. For the former I can highly recommend chapter 13 of Dr. Toole's book, and for the latter chapter 6 of the same book.

If Amir was to include all of this in a single video it would simply be too long. But nothing he said was wrong in this short video.

At our listening position we get in-room waveform summations at low frequencies which must be correct not only in the frequency domain but also in the time domain, simply because their wavelength cannot fit the room. If we follow the equal-loudness curve, this is where we get into trouble and where equalization sometimes may not suffice. The longer the wavelengths, the more room modes are excited and the more has to be corrected. And yes, it is certainly more audible if we get it wrong, which is what Amir says in the video. Again, chapter 13 of Dr. Toole's book contains information that is beyond the scope of this video.

On the other hand, for higher frequencies we hear sum of direct sound and the reflections. In Toole's book, chapter 6 we can get familiar with the terms of the precedence effect and time interval of the fusion zone where we cannot discriminate direct from the reflected sound. Here is where ERB comes in handy because in this range of frequencies we cannot exclude time domain either. We can go on applying more and more filters but in the end we may come to conclusion that mild equalization in a wider bandwidth may sound very similar, depending on the room reflections and loudspeaker properties such as directivity.

So, it all depends how the speaker interacts with the room. But it certainly needs more help in the lower region as fundamental tones resolution is crucial here. Upper harmonics below the fusion interval are to our ears blended with reflections and for time differences which are greater we perceive them as spatially separated auditory images. Again we cannot equalize and exclude the psychoacoustic effects of the room reflections.
 
Nothing he said was wrong in this short video?
Well, Amirm's believers cannot see it, because to them what Amirm says is simply dogma, but this is a very basic and obvious misunderstanding.
He never finds any room for argument, and Amirm himself knows that.

No one is flawless, but whether or not we keep a rational, scientific attitude matters; that is what divides us.
Who would respect a ruffian who electrocuted animals just to run a negative campaign against alternating current?
I hope Amirm reminds himself once again why people love him, and what will maintain his reputation and dignity and keep him regarded as a respectable person.
 
Posted this on YouTube but will also post here to see if I can get an answer. My question is we all know with age that there is high frequency hearing loss. But with age, does the width of the ERB change as well? In other words, as you age, do you start to average larger bands of frequency together? Thanks in advance!
 
but why so much content out there saying that you shouldn't smooth the bass much (or at all) when doing roomEQ?

Because it's very easy to fall into the trap of boosting sharp dips in the bass when they're smoothed. Phase cancellations happen frequently at these frequencies; if you boost the direct sound in this case, the reflected sound is boosted equally and the dip remains.

This might appear to be a "no harm done" situation since nothing actually changes on the frequency response curve. However, you have significantly lowered your speaker headroom in the boosted region, and you are very likely to hear distortion on transients.
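A toy model makes the cancellation argument concrete. The sketch below assumes a single reflection arriving exactly out of phase with the direct sound at the dip frequency; `reflection_gain` is a hypothetical parameter (1.0 means the reflection is as strong as the direct sound), and real rooms are of course messier than this.

```python
import math

def dip_level_db(reflection_gain, eq_boost_db=0.0):
    """Level at the cancellation frequency, relative to the direct sound alone.

    EQ acts before the speaker, so it scales the direct sound and the
    reflection by the same factor: level = boost + 20*log10(|1 - r|).
    """
    residual = abs(1.0 - reflection_gain)
    if residual == 0.0:
        return -math.inf  # perfect null: no amount of boost fills it
    return 20.0 * math.log10(residual) + eq_boost_db

def amplifier_power_ratio(eq_boost_db):
    """Extra amplifier power demanded by a given boost, as a linear ratio."""
    return 10.0 ** (eq_boost_db / 10.0)

if __name__ == "__main__":
    for r in (0.5, 0.9, 1.0):
        depth = dip_level_db(r)
        print(f"r={r:.1f}: dip {depth:6.1f} dB, with +12 dB EQ "
              f"{dip_level_db(r, 12.0):6.1f} dB")
    print(f"+12 dB of boost costs {amplifier_power_ratio(12.0):.1f}x power")
```

In this model a perfect null stays a null whatever the boost, and even a partial null only fills in at the cost of a large multiple of amplifier power at that frequency.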
 
So if I understood correctly, in layman's terms, humans are more likely to tell that the tonality is wrong if the peak/dip is at high frequencies, but they won't be able to tell (as easily) in which exact frequency range the dip/peak is happening.

At lower frequencies, however, the change in tonality wouldn't be as "annoying" as at higher frequencies, but trained listeners will have an easier time telling in which exact frequency range the problem lies.
 