
Understanding Audio Frequency Response & Psychoacoustics (Video)

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
7,020
Likes
6,882
Location
UK
There seems to be some confusion here nobody has addressed yet. The key point in the video is how the equivalent rectangular bandwidth (ERB) grows as frequency grows, which means that two sounds in high frequencies need to be further apart (in Hertz) than in lower frequencies for humans to be able to discriminate the sounds. However, when we look at bandwidth in octaves, we'll see that the bandwidth decreases as frequency goes up. Combine this with the fact that frequency response graphs typically have a logarithmic x-axis and the conclusion is that a peak or a dip in the frequency response graph of the same width will be more audible in high frequencies than in low frequencies. So the opposite of @amirm 's statement.

Here's the plot of ERB in octaves vs frequency
View attachment 125657
Moreover, Room Eq Wizard has an ERB smoothing function where the high frequencies are smoothed with narrower smoothing window than low frequencies: https://www.roomeqwizard.com/help/help_en-GB/html/graph.html
I think what you're saying & showing here has finally sunk in. In that case I don't think I'll be cutting down the number of EQ filters in the 1000Hz+ zone, as I EQ "visually" in terms of how the proportions of the peaks & dips look on the graph (Anechoic EQ), and the new ERB knowledge from the vid doesn't change that given the perspective you're now showing in your post, where you say (as well as show) that a "peak or a dip in the frequency response graph of the same width will be more audible in high frequencies than in low frequencies".
 

jaakkopasanen

Member
Joined
Jul 12, 2020
Messages
87
Likes
344
Thanks for the tip about ERB smoothing function in REW, I didn't know about that. How come Var Smoothing is recommended at that link for EQ purposes and not the ERB smoothing function......is there much difference?
Hopefully someone else can answer this one because I can't. I hardly ever use REW but just had
Regarding your very last statement....shouldn't the smoothing be done on a -wider- frequency range at higher frequencies, and lower frequencies smoothed using -narrower- frequency widths? Or did I just completely misunderstand Amir's discussion on ERB?
You understood Amir's discussion correctly, however Amir got the concept mixed up. He claims people can discriminate frequencies better at low frequencies when the opposite is true (when looking at the bandwidth in octaves). Because people are in truth better at discriminating higher frequencies, the frequency response graph should be smoothed with smaller window (or frequency width as you put it) in the high frequencies.
 

ck42

Active Member
Forum Donor
Joined
Feb 11, 2020
Messages
121
Likes
95
Location
N. Atlanta
Hopefully someone else can answer this one because I can't. I hardly ever use REW but just had

You understood Amir's discussion correctly, however Amir got the concept mixed up. He claims people can discriminate frequencies better at low frequencies when the opposite is true (when looking at the bandwidth in octaves). Because people are in truth better at discriminating higher frequencies, the frequency response graph should be smoothed with smaller window (or frequency width as you put it) in the high frequencies.

So Amir got it backwards then regarding the ERB box width being wider as frequency increases and narrower as it goes lower?
 

Andrej

Member
Joined
Mar 21, 2021
Messages
94
Likes
130
Regarding your very last statement....shouldn't the smoothing be done on a -wider- frequency range at higher frequencies, and lower frequencies smoothed using -narrower- frequency widths? Or did I just completely misunderstand Amir's discussion on ERB?
Strictly speaking you are correct, but also you are wrong. If you are talking in terms of actual Hz, as you go up in frequency, the range of Hertz you use in smoothing is higher. However, in terms of "bandwidth", which is always relative to the center frequency (think an octave or a fraction/multiple thereof), the bandwidth narrows as you go from 20Hz to 1kHz, and then narrows so little as you go up in frequency that you can think of it as being constant. Amir got it wrong, as in backwards.
 

ck42

Active Member
Forum Donor
Joined
Feb 11, 2020
Messages
121
Likes
95
Location
N. Atlanta
Strictly speaking you are correct, but also you are wrong. If you are talking in terms of actual Hz, as you go up in frequency, the range of Hertz you use in smoothing is higher. However, in terms of "bandwidth", which is always relative to the center frequency (think an octave or a fraction/multiple thereof), the bandwidth narrows as you go from 20Hz to 1kHz, and then narrows so little as you go up in frequency that you can think of it as being constant. Amir got it wrong, as in backwards.

Okay. So the frequency 'sample set' that is used for ERB smoothing is larger (sample set being the start/end range of frequencies) as you move to the right on the frequency plot, but then there's also the concept that a frequency sample is not a singular frequency but also consists of a start/end frequency range, which instead grows smaller in bandwidth as you move to the right.
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,502
Likes
4,137
Location
Pacific Northwest
Perceptually, even when we express pitch differences as frequency ratios, acuity of pitch perception is not equal across octaves. We can detect finer pitch gradations in the mids to treble, around 2 kHz, than we can in the bass, like 100 Hz. By "finer gradations" I mean smaller ratios. But in absolute arithmetic terms, those smaller ratios in high frequencies are still numerically larger. That's why in REW, "perceptual" frequency smoothing smooths the bass more than it does the treble.

Another perceptual quirk is that pitch perception of some sounds is affected by loudness. In the bass, the same frequency played louder sounds lower pitched; in the midrange, loudness has little or no effect on pitch perception; and in the treble, the same frequency played louder sounds higher pitched.
 
OP
amirm

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,746
Likes
242,021
Location
Seattle Area
Some thoughts:
- Frequency charts are logarithmic, so on the left side the steps are 10 Hz and on the right 1000 Hz; shouldn't our different hearing abilities already be represented correctly?
No. Log scaling changes how the graph is presented/stretched. The concept of ERB impacts the average level heard, so it deals with the vertical axis (total energy for a band) rather than the X axis. If you take that graph I showed that had hundreds of spikes, it would have those spikes no matter how you displayed it, log or linear. Yes, the appearance will change but not the smoothness of the response. So you have to apply filtering, and hence the reason this option exists in measurement programs even though they almost always default to log display for frequency.
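As a rough illustration of the "total energy for a band" idea, here is a minimal Python sketch. It assumes the averaging is done on power rather than on the dB values themselves; the function name and the spiky test values are made up for the example and do not come from the video.

```python
import math

def band_level_db(levels_db):
    """Average level of one band in dB, averaging power rather than dB values."""
    powers = [10.0 ** (db / 10.0) for db in levels_db]
    return 10.0 * math.log10(sum(powers) / len(powers))

# Hypothetical comb-like response inside a single band:
# alternating +3 dB and -10 dB spikes.
spiky = [3.0, -10.0] * 20

# The band average depends only on the levels inside the band,
# not on where the points are drawn along the x-axis.
print(f"band level: {band_level_db(spiky):.2f} dB")
```

The result sits between the extremes of the spikes, and drawing the same points on a log or linear axis does not change it.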
 
OP
amirm

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,746
Likes
242,021
Location
Seattle Area
I have a question about the way the frequency graph is laid out. Why do the box sizes vary like they do? They start wide, go narrow, then wide, then narrow, etc. It doesn't seem to correlate with any aspect of psychoacoustics that I've heard discussed.
As I think someone explained, log display is done for a different reason than the concept in this video. Here, as I explained just above, we want to average the power of a range of frequencies when showing the Y axis. Not X. Log vs linear X axis doesn't deal with this. Often I use a linear axis when I want to show all artifacts, such as in a jitter graph.
 

KiyPhi

Active Member
Joined
Mar 24, 2021
Messages
148
Likes
271
Good video. I think it covered a lot of topics beginners misunderstand in a very digestible way. I wish this resource had been available when I first started. I spent a lot of time fiddling with filters and was confused about why some weren't noticeable at all. It took a lot of digging to find out why when you don't quite know the words to find the info. Now it is available in a succinct video.
 

bigjacko

Addicted to Fun and Learning
Joined
Sep 18, 2019
Messages
723
Likes
362
The key point in the video is how the equivalent rectangular bandwidth (ERB) grows as frequency grows, which means that two sounds in high frequencies need to be further apart (in Hertz) than in lower frequencies for humans to be able to discriminate the sounds. However, when we look at bandwidth in octaves, we'll see that the bandwidth decreases as frequency goes up. Combine this with the fact that frequency response graphs typically have a logarithmic x-axis and the conclusion is that a peak or a dip in the frequency response graph of the same width will be more audible in high frequencies than in low frequencies.
Can I understand it like this? The log scale is a purely mathematical scale with no human perception involved. ERB is human perception, so it does not necessarily have to follow the log scale. So in the end, when we compare ERB to the log scale, we find that they differ because that is how human perception is.
 

Tks

Major Contributor
Joined
Apr 1, 2019
Messages
3,221
Likes
5,498
There seems to be some confusion here nobody has addressed yet. The key point in the video is how the equivalent rectangular bandwidth (ERB) grows as frequency grows, which means that two sounds in high frequencies need to be further apart (in Hertz) than in lower frequencies for humans to be able to discriminate the sounds. However, when we look at bandwidth in octaves, we'll see that the bandwidth decreases as frequency goes up. Combine this with the fact that frequency response graphs typically have a logarithmic x-axis and the conclusion is that a peak or a dip in the frequency response graph of the same width will be more audible in high frequencies than in low frequencies. So the opposite of @amirm 's statement.

Here's the plot of ERB in octaves vs frequency
View attachment 125657
Moreover, Room Eq Wizard has an ERB smoothing function where the high frequencies are smoothed with narrower smoothing window than low frequencies: https://www.roomeqwizard.com/help/help_en-GB/html/graph.html

You're the guy we owe thanks for AutoEQ correct? Sorry for the off topic but thank you and Amir as well for the video.
 

Dzhaughn

Active Member
Joined
Mar 12, 2020
Messages
140
Likes
392
Hopefully someone else can answer this one because I can't. I hardly ever use REW but just had

You understood Amir's discussion correctly, however Amir got the concept mixed up. He claims people can discriminate frequencies better at low frequencies when the opposite is true (when looking at the bandwidth in octaves). Because people are in truth better at discriminating higher frequencies, the frequency response graph should be smoothed with smaller window (or frequency width as you put it) in the high frequencies.

That seems like a big mixup if so. Can you provide a source for your claim about discrimination at high frequencies?
 

Andrej

Member
Joined
Mar 21, 2021
Messages
94
Likes
130
That seems like a big mixup if so. Can you provide a source for what your claim about discrimination at high frequencies?
I think both of us did, as did Amir in his article. It is just that Amir's interpretation was backwards. Amir claims (and I assume his facts are indeed facts) that at 300 Hz we smooth over a bandwidth of 60Hz, and 60/300 is more than 1.1kHz/10kHz, thus we are (human hearing based on these assertions) more sensitive at high frequencies. @jaakkopasanen also provided a plot which shows how this smoothing/sensitivity changes as you go from low to high frequencies.
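The arithmetic in this post can be checked in a few lines of Python. This is only a sketch using the two figures quoted above (60 Hz at 300 Hz, 1.1 kHz at 10 kHz); the function name is made up for illustration.

```python
def relative_bandwidth(bandwidth_hz: float, center_hz: float) -> float:
    """Bandwidth expressed as a fraction of the center frequency."""
    return bandwidth_hz / center_hz

low = relative_bandwidth(60.0, 300.0)        # figures quoted for 300 Hz
high = relative_bandwidth(1100.0, 10000.0)   # figures quoted for 10 kHz

print(f"300 Hz: {low:.2f}   10 kHz: {high:.2f}")
```

The absolute bandwidth is far larger at 10 kHz, but as a fraction of the center frequency it is roughly half of what it is at 300 Hz.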
 

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
7,020
Likes
6,882
Location
UK
To be honest, I'm a bit disappointed by this video now. Earlier I had commented on its usefulness & explanations, but it's come to light through comments/evidence from @jaakkopasanen & @Andrej that Amir's video either contains wrong information/advice or easily creates misinterpretations (I'm not sure which). I'm a bit confused about this topic now, as I think a great many people will be after watching the video and coming to the conclusion that peaks & dips in frequency response graphs are less consequential the further up the frequency range they are, due to ERB......however now it seems that's rubbish due to the log scale of the x-axis not being taken into account and the "octave" discussion that jaakkopasanen pointed out, but I don't know if that's the same as the "log scale of the x-axis" effect. Not to mention that REW recommends using Var Smoothing rather than ERB Smoothing for equalisation, with Var Smoothing showing zero smoothing in the bass area where ERB Smoothing provides the most smoothing in the bass.......so I'm having trouble with this all adding up......it seems like there are many conflicts. I have to say I don't think this video helps at all now.

EDIT: I hope we can get to the bottom of this and provide some clear explanations/demonstrations of reading frequency response graphs & their applicability re EQ decisions.
 

Arc Acoustics

Member
Joined
Mar 17, 2021
Messages
75
Likes
53
Location
Japan
Yep, so if there are peaks at 100 Hz and 10 kHz with the same Q factor, the peak at 10 kHz is GREATER BECAUSE IT HAS GREATER BANDWIDTH ON THE LINEAR AXIS.
That sounds just stupid, at least to me.
I strongly suggest retracting the video.
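For reference, the relation between Q and bandwidth behind this point can be sketched in Python. It assumes the usual definition Q = fc / bandwidth and a band centred geometrically on fc; the Q value is a hypothetical example, not something from the video.

```python
import math

def bandwidth_hz(center_hz: float, q: float) -> float:
    """Bandwidth in Hz under the definition Q = fc / bandwidth."""
    return center_hz / q

def bandwidth_octaves(q: float) -> float:
    """Bandwidth in octaves N for a given Q, from Q = 1 / (2*sinh(ln(2)/2 * N))."""
    return 2.0 / math.log(2.0) * math.asinh(1.0 / (2.0 * q))

q = 4.318  # hypothetical Q, roughly a 1/3-octave wide peak
for fc in (100.0, 10000.0):
    print(f"fc = {fc:.0f} Hz: {bandwidth_hz(fc, q):.1f} Hz wide, "
          f"{bandwidth_octaves(q):.3f} octaves wide")
```

With the same Q, the 10 kHz peak is 100 times wider in Hz, but both peaks span the same number of octaves and so look equally wide on a logarithmic x-axis.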
 
OP
amirm

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,746
Likes
242,021
Location
Seattle Area
however now it seems that's rubbish due to the log scale of the x-axis not being taken into account and the "octave" discussion that jaakkopasanen pointed out,
What? The x scale has no bearing on this topic. It simply stretches the axis differently. It does not at all change the perceived amplitude (Y axis) which is dependent on ERB with respect to human perception.
 

jaakkopasanen

Member
Joined
Jul 12, 2020
Messages
87
Likes
344
The video skipped the logarithmic x-axis issue entirely; Amir probably assumed the audience to be familiar with it. Let's see if I can shed some light on this topic.

Human hearing is logarithmic in nature. This means that the ratio of frequencies is important for us, not the absolute change. Going from 100 Hz to 200 Hz is one octave (factor of 2) as is the jump from 1000 Hz to 2000 Hz. We perceive both jumps to be equally large even though one was +100 Hz and the other +1000 Hz (ten times as many Hertz). The ratio stays the same so the perceived distance stays the same.

We wish to display frequency response in a way that the graph matches how we perceive it, so we plot the graph on a logarithmic x-axis. A logarithmic axis is based on ratios instead of absolute values like a linear axis. So the horizontal distance in the graph is the same for 100 Hz -> 200 Hz as it is for 1000 Hz -> 2000 Hz because the ratios are the same. Presenting the frequency response in this way is very convenient because now the visuals match the hearing perception. If we have a peak at 100 Hz which is 20 pixels wide, it's going to have the same perceptual width as a peak at 1000 Hz that is 20 pixels wide. Of course the peak at 1000 Hz is much wider in terms of absolute frequencies but that doesn't matter because we don't hear things like that.
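To make the ratio idea concrete, here is a small Python sketch. The axis limits (20 Hz to 20 kHz) and the plot width of 900 pixels are arbitrary assumptions for illustration only.

```python
import math

def octaves_between(f1_hz: float, f2_hz: float) -> float:
    """Number of octaves from f1 to f2 (one octave is a factor of 2)."""
    return math.log2(f2_hz / f1_hz)

def log_axis_x(f_hz: float, f_min: float = 20.0, f_max: float = 20000.0,
               width_px: float = 900.0) -> float:
    """Horizontal pixel position of f_hz on a logarithmic frequency axis."""
    return width_px * math.log(f_hz / f_min) / math.log(f_max / f_min)

for f1, f2 in ((100.0, 200.0), (1000.0, 2000.0)):
    dx = log_axis_x(f2) - log_axis_x(f1)
    print(f"{f1:.0f} -> {f2:.0f} Hz: {octaves_between(f1, f2):.1f} octave, {dx:.1f} px")
```

Both jumps come out as one octave and as the same horizontal distance, even though one covers 100 Hz and the other 1000 Hz.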

Equally wide peaks in a frequency response graph with a logarithmic x-axis are still not equal when they are at different frequencies. This is because our ability to discriminate frequencies depends on the frequency range. At 100 Hz our hearing averages the loudness over a ~35 Hz range and at 1000 Hz the range is ~133 Hz. Clearly the range is larger at 1000 Hz, but since our hearing is logarithmic, we should be looking at the ratios, not the absolute values. The ratio is 0.35 at 100 Hz and 0.133 at 1000 Hz, which is a lot smaller.
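These numbers can be reproduced with a short Python sketch. It assumes the linear ERB approximation that the REW help text quoted later in this thread gives, (107.77f + 24.673) Hz with f in kHz; other ERB formulas give slightly different values.

```python
def erb_hz(f_hz: float) -> float:
    """ERB in Hz, using the approximation (107.77*f + 24.673) Hz with f in kHz."""
    return 107.77 * (f_hz / 1000.0) + 24.673

for f in (100.0, 1000.0):
    erb = erb_hz(f)
    # Absolute width grows with frequency, width relative to f shrinks.
    print(f"{f:.0f} Hz: ERB = {erb:.1f} Hz, ERB / f = {erb / f:.3f}")
```

The absolute ERB is larger at 1000 Hz, while the ERB relative to the center frequency is much smaller there than at 100 Hz.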

Let's put these two things together. A 20 pixels wide peak in our frequency response at 100 Hz is less audible than a 20 pixels wide peak at 1000 Hz. The 20 pixel peaks have the same perceptual width since our graph has logarithmic x-axis but because human hearing is better at discriminating in higher frequencies, the 1000 Hz peak will be more audible.

Finally, to understand visually how the frequency response will be perceived, we should smoothen the frequency response. The smoothing window size (the range in which the values are averaged) should be greater in low frequencies in octaves. We could start with 1 octave window size at the lowest frequencies and decrease the window size to 1/6 octaves at high frequencies. With a smoothing like this, the frequency response would visually roughly match how we hear it. Broader peaks and dips in the bass region and finer peaks and dips in the treble.
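The smoothing described above could be sketched roughly like this in Python. It is not how REW or any other tool implements it: the 20 Hz and 1 kHz breakpoints, the interpolation of the window width in log-frequency, the power averaging and the test signal are all assumptions made for illustration.

```python
import math

def window_octaves(f_hz, f_lo=20.0, f_hi=1000.0, w_lo=1.0, w_hi=1.0 / 6.0):
    """Window width in octaves: w_lo at or below f_lo, w_hi at or above f_hi,
    interpolated linearly in log-frequency in between (assumed breakpoints)."""
    if f_hz <= f_lo:
        return w_lo
    if f_hz >= f_hi:
        return w_hi
    t = math.log(f_hz / f_lo) / math.log(f_hi / f_lo)
    return w_lo + t * (w_hi - w_lo)

def smooth_db(freqs_hz, levels_db):
    """Power-average levels_db over a window centred (in octaves) on each point."""
    out = []
    for fc in freqs_hz:
        half = window_octaves(fc) / 2.0
        lo, hi = fc * 2.0 ** -half, fc * 2.0 ** half
        powers = [10.0 ** (db / 10.0)
                  for f, db in zip(freqs_hz, levels_db) if lo <= f <= hi]
        out.append(10.0 * math.log10(sum(powers) / len(powers)))
    return out

# Test signal: flat 0 dB with two +6 dB peaks, each 1/6 octave wide,
# centred on 100 Hz and 1000 Hz (equally wide on a logarithmic axis).
freqs = [20.0 * 2.0 ** (i / 48.0) for i in range(479)]  # 48 points per octave
raw = [6.0 if min(abs(math.log2(f / 100.0)),
                  abs(math.log2(f / 1000.0))) <= 1.0 / 12.0 else 0.0
       for f in freqs]
smoothed = smooth_db(freqs, raw)

peak_low = max(s for f, s in zip(freqs, smoothed) if f < 300.0)
peak_high = max(s for f, s in zip(freqs, smoothed) if f >= 300.0)
print(f"after smoothing: {peak_low:.1f} dB near 100 Hz, "
      f"{peak_high:.1f} dB near 1000 Hz")
```

With these assumptions the 100 Hz peak is flattened far more than the 1000 Hz peak, even though both were equally tall and equally wide on the log axis before smoothing.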
 

DerRoland

Member
Joined
Aug 5, 2020
Messages
71
Likes
100
Location
Germany
Perfectly explained, thanks!

Here as a supplement the REW smoothing options:
- Variable smoothing applies 1/48 octave below 100 Hz, 1/3 octave above 10 kHz and varies between 1/48 and 1/3 octave from 100 Hz to 10 kHz, reaching 1/6 octave at 1 kHz. Variable smoothing is recommended for responses that are to be equalised.

- Psychoacoustic smoothing uses 1/3 octave below 100Hz, 1/6 octave above 1 kHz and varies from 1/3 octave to 1/6 octave between 100 Hz and 1 kHz. It also applies more weighting to peaks by using a cubic mean (cube root of the average of the cubed values) to produce a plot that more closely corresponds to the perceived frequency response.

- ERB smoothing uses a variable smoothing bandwidth that corresponds to the ear's Equivalent Rectangular Bandwidth, which is (107.77f + 24.673) Hz, where f is in kHz. At low frequencies this gives heavy smoothing, about 1 octave at 50Hz, 1/2 octave at 100 Hz, 1/3 octave at 200 Hz then levelling out to approximately 1/6 octave above 1 kHz.
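The octave figures in the ERB entry can be cross-checked with a short Python sketch. It takes the formula exactly as quoted above and assumes the band is centred geometrically on f, so that its edges are f·2^(-n/2) and f·2^(n/2); that centring is my assumption, not something the REW help states.

```python
import math

def erb_hz(f_hz: float) -> float:
    """ERB in Hz as quoted above: (107.77*f + 24.673) Hz with f in kHz."""
    return 107.77 * (f_hz / 1000.0) + 24.673

def erb_octaves(f_hz: float) -> float:
    """Width in octaves n of a band erb_hz(f) wide centred geometrically on f.
    Edges f*2**(-n/2) and f*2**(n/2) give a width of 2*f*sinh(n*ln(2)/2)."""
    return 2.0 / math.log(2.0) * math.asinh(erb_hz(f_hz) / (2.0 * f_hz))

for f in (50.0, 100.0, 200.0, 1000.0, 10000.0):
    print(f"{f:>7.0f} Hz: {erb_octaves(f):.2f} octaves")
```

Under that assumption the widths come out at roughly 0.85, 0.51, 0.33, 0.19 and 0.16 octaves, which is in line with the "about 1 octave at 50Hz, 1/2 octave at 100 Hz, 1/3 octave at 200 Hz" and "approximately 1/6 octave above 1 kHz" figures.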
 

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
7,020
Likes
6,882
Location
UK
Thanks very much jaakkopasanen, I can definitely visualise all the effects you're mentioning there, and how it applies to reading the significance of the peaks & dips within the frequency response graph, I can also see why ERB smoothing works like it does there. As an extension on this topic, do you get involved much in Room EQ.....you'd think then this would make room EQ very simple - apply ERB smoothing and then it's super easy to iron out the dips & peaks because it smooths the bass so much.....is that the best approach to room EQ, because general advice seems contrary to that in being that you should have less smoothing on the bass for EQ purposes? In a few minutes I'll edit this post & attach a couple of room EQ measurement graphs of mine to illustrate the ease of room EQ using ERB smoothing vs no smoothing in bass (Var Smoothing):

Var Smoothing Measurement (no bass smoothing) & EQ Graphs:
Var Smooth Measurement.jpg

Var Smooth room EQ.jpg


ERB Smoothing Measurement & EQ Graphs:
ERB Smoothing measurement.jpg

ERB Smoothing Room EQ on top of Anechoic EQ.jpg


Yes, so you can see it's a lot easier in terms of using fewer filters and less sharp filters when using ERB Smoothing for room EQ, but which is better.....you'd kinda think ERB is more applicable due to its perceptual nature, but why is there so much content out there saying that you shouldn't smooth the bass much (or at all) when doing room EQ?

What? The x scale has no bearing on this topic. It simply stretches the axis differently. It does not at all change the perceived amplitude (Y axis) which is dependent on ERB with respect to human perception.
The x-axis definitely has an implication here for reading the significance of a dip or peak: a wide peak or dip will visually look more significant than a less wide peak or dip (talking pixels on the screen here).....so that would influence your intuitive EQ decision on whether or not to EQ that peak or dip. jaakkopasanen cleared up the points re the logarithmic x-axis & combined ERB influence, concluding with how that relates to using ERB smoothing......which for me is a different conclusion than what your video implies.....maybe I had misinterpreted your video, but it seemed that one conclusion from your video is that when reading frequency response graphs (logarithmic x-axis) you should be less concerned by peaks & dips the further up the frequency range you go, due to ERB reasons.....however that's in contrast to what jaakkopasanen has explained above......it could just be that your video is easily open to misinterpretation, but I definitely understand it based on jaakkopasanen's explanation above.
 