
Couldn't we technically standardize "perceived" frequency response of a loudspeaker system?

There are a ton of people on this forum smarter and more knowledgeable than me, so I'm hoping someone could explain why I'm wrong, why this is a useless idea, or on the flip side, inform me that this isn't a novel idea and that people are already working on this.

I have recently learned that there is no standard for the measured frequency response of a loudspeaker system from a listening position. While there are commonly cited "room curves" (Harman, B&W, etc.), equalizing your system to match one of these does not guarantee that your system's frequency response will sound anything like another system's frequency response using the same target curve.

As far as I understand, this is because measured frequency response doesn't tell you anything about the relationship between the direct sound and the reflected sound and this relationship greatly impacts how we perceive the frequency response. This relationship will be different from system to system due to things like difference in speaker directivity and room acoustics.

So then, couldn't we create a new way of measuring frequency response that takes into account how the reflections would alter our sense of the frequency response? Already it seems like REW measures useful data about the reflections themselves. All we'd need is an algorithm that takes the measured frequency response and the data about the reflections as input, and then creates a new "perceived" frequency response curve that takes into account how the reflections would affect our perception of the frequency response.
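To make the idea concrete, here is a purely illustrative sketch of what the output stage of such an algorithm could look like. The blending rule and the reflection weight value are made-up placeholders for this post, not an actual perceptual model; finding the real relationship is the hard part.

```python
import numpy as np

def toy_pfr(direct_db, reflected_db, reflection_weight=0.3):
    """Purely illustrative 'perceived frequency response': blend the
    direct-sound curve with the reflected-sound curve using a weight
    that discounts reflections, mimicking the way listeners partially
    'listen through' the room. The 0.3 default is an arbitrary
    placeholder, not a researched value."""
    direct = np.asarray(direct_db, dtype=float)
    reflected = np.asarray(reflected_db, dtype=float)
    return (1 - reflection_weight) * direct + reflection_weight * reflected
```

For example, a band where the direct sound measures 80 dB but strong reflections push the in-room curve to 90 dB would come out around 83 dB with this (made-up) 0.3 weight, reflecting that the brain discounts, but does not fully ignore, the reflected energy.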

And then we could create a standard curve for that.

I will call this hypothetical way of measuring frequency response "perceived frequency response," or PFR for short, because I can't think of a better name. This is somewhat of a misnomer: the intention of PFR is NOT to literally take into account every single thing that could affect perception of frequency response, like equal loudness contours or individual ear canal shape, but simply to take into account how reflections affect how frequency response is perceived.

I am well aware that standardizing PFR wouldn't standardize everything about a sound system like the reflections themselves or the soundstage/stereo width, but this feels like it would be 100x more useful than having no standard whatsoever, whether you're a mastering engineer, an audio equipment manufacturer or a consumer. It would help prevent the "circle of confusion." Even if you're a consumer who doesn't think the standard sounds good and wants to do their own thing, wouldn't it be useful to know exactly how your tastes differ from the standard so that setting up a system in the future takes less trial and error?

The most practical use I can think of for this is that if PFR is implemented in receivers, it would make auto EQ functions work properly, no?

Just as the standardization of video calibration was a good thing for video, I feel like this kind of standardization would be good for audio even if it doesn't encompass every factor. Am I wrong?
 
One simple reason is that (so far) we listen with ears, not microphones or neural implants, and since ears vary in size and shape, perceived response can thus also vary greatly. The best we can hope to achieve with ears is subjective perception. Unless we clone humans. Then, we may be able to do it for a particular group of clones.
 
But our eyes all see differently and yet we've created a standard for video calibration. How is that different?
 

It isn't, it's exactly the same. Unfortunately, I don't know enough about the way video calibration was standardized. It is certainly an interesting topic worth looking into.
 
Frequency response is only one of many parts of what we hear. For example, we also hear reflected decay time. And room modes. And so on. And this points to the elephant in the room: the room.

When we calibrate a display we put the sensor at the screen and calibrate it “anechoically”.

When we measure speakers now we do the same thing: we tune the speaker to be flat in an anechoic environment. But when you put it into a room, things change, and not just frequency response. And we cannot electronically remove the room from the sound.

We do have standards emerging for how a reference room should sound. CEDIA RP22 is one such document; Dolby's studio guidelines are another. But those goals are achieved not by DSP but by the physical characteristics of the room, in many cases.

Will DSP one day be able to make any room sound just the way those reference rooms sound? Maybe. People are certainly working on it. So far, all these room correction systems solve a few things, mostly in the bass (modal) region, and are a long way from solving the rest of the room's influences.
 
I have recently learned that there is no standard for the measured frequency response of a loudspeaker system from a listening position. While there are commonly cited "room curves" (Harman, B&W, etc.), equalizing your system to match one of these does not guarantee that your system's frequency response will sound anything like another system's frequency response using the same target curve.
It WILL guarantee that the frequency response (pretty much) will be that at the measuring position. Anywhere else, and all bets are off.
 
It’s a good idea, and I bet some of the odd-looking speakers that sell well hit a desired target for actual perceived sound.

The reason it probably doesn’t happen is that this is treated as a trade secret and competitive analysis, and the in-room effect is very dependent on room size.

In contrast, video color fidelity isn’t dependent on room size, since diffraction and reflection don’t “really” matter.

 
So then, couldn't we create a new way of measuring frequency response that takes into account how the reflections would alter our sense of the frequency response?
If I understand your question/proposal correctly, this is what the Klippel system does when it generates the Estimated In-Room Response, which is part of every review done by Amir and Erin (and probably others). Of course, it is only completely valid for rooms with the same dimensions, absorption, etc. as the hypothetical room used in Klippel's calculations. Even then, people have different preferences about how much slope the resulting line should have.
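For reference, the Estimated In-Room Response in those reviews is derived from the CTA-2034 spinorama curves. The commonly cited weighting combines the listening window, early reflections, and sound power curves as a weighted power sum; a minimal sketch (treating the commonly cited 12/44/44 weights as given):

```python
import numpy as np

# Commonly cited CTA-2034 weighting for the Estimated In-Room Response
# (PIR): 12% listening window, 44% early reflections, 44% sound power,
# combined as a power (squared-pressure) sum rather than a dB average.
PIR_WEIGHTS = (0.12, 0.44, 0.44)

def estimated_in_room_response(lw_db, er_db, sp_db):
    """Combine three spinorama curves (in dB SPL) into the PIR curve."""
    w_lw, w_er, w_sp = PIR_WEIGHTS
    power = (w_lw * 10 ** (np.asarray(lw_db) / 10)
             + w_er * 10 ** (np.asarray(er_db) / 10)
             + w_sp * 10 ** (np.asarray(sp_db) / 10))
    return 10 * np.log10(power)
```

Because the weights sum to 1, a speaker whose three curves all sit at the same level at some frequency gets a PIR at exactly that level there; where the off-axis curves dip relative to the listening window, the PIR tilts downward.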
 
Frequency response is only one of many parts of what we hear. For example, we also hear reflected decay time. And room modes. And so on. And this points to the elephant in the room: the room.

When we calibrate a display we put the sensor at the screen and calibrate it “anechoically”.

When we measure speakers now we do the same thing: we tune the speaker to be flat in an anechoic environment. But when you put it into a room, things change, and not just frequency response. And we cannot electronically remove the room from the sound.

We do have standards emerging for how a reference room should sound. CEDIA RP22 is one such document; Dolby's studio guidelines are another. But those goals are achieved not by DSP but by the physical characteristics of the room, in many cases.

Will DSP one day be able to make any room sound just the way those reference rooms sound? Maybe. People are certainly working on it. So far, all these room correction systems solve a few things, mostly in the bass (modal) region, and are a long way from solving the rest of the room's influences.
I want to reiterate that I understand that how a system sounds is more than just PFR, but I would think that standardizing one aspect of the sound is much better than standardizing zero aspects of the sound, and PFR would end up being a much more meaningful thing to standardize than regular old frequency response.
 
It WILL guarantee that the frequency response (pretty much) will be that at the measuring position. Anywhere else, and all bets are off.
True. Sorry if this was unclear, but what I meant was that even if the measurement looks the same across both systems, they won't necessarily sound to a human being like they have the same frequency response even when you're sitting right in the measurement position.
 
A couple of quotes from Floyd Toole's book:
(Automatic room correction) can yield improvements at low frequencies for a single listener, but above the transition frequency to claim that a smoothed steady-state room curve derived from an omnidirectional microphone is an adequate substitute for the timbral and spatial perceptions of two ears and a brain is absurd.

It can be said that humans have an ability to "listen through" rooms (to hear the sound of a loudspeaker while ignoring room acoustics).
 
A couple of quotes from Floyd Toole's book:
(Automatic room correction) can yield improvements at low frequencies for a single listener, but above the transition frequency to claim that a smoothed steady-state room curve derived from an omnidirectional microphone is an adequate substitute for the timbral and spatial perceptions of two ears and a brain is absurd.
I feel like multiple people don't quite understand what I am proposing. We can't standardize frequency response. I know this.

Sure, a microphone measuring only frequency response does not match up with how our brain perceives frequency response. But if I may, as an amateur audio enthusiast, disagree with Toole on a purely semantic level without anyone busting out the pitchforks: I don't see why a microphone plus a really good algorithm couldn't match an ear plus a brain pretty much spot on. The algorithm, which takes into account the relationship between the reflections and the raw frequency response, would be playing the role of the brain.
It can be said that humans have an ability to "listen through" rooms (to hear the sound of a loudspeaker while ignoring room acoustics).
Yes, and I think we should figure out exactly how this changes the perceived frequency response so that we can use this for the algorithm for PFR.
 
There are a ton of people on this forum smarter and more knowledgeable than me, so I'm hoping someone could explain why I'm wrong, why this is a useless idea, or on the flip side, inform me that this isn't a novel idea and that people are already working on this.

I have recently learned that there is no standard for the measured frequency response of a loudspeaker system from a listening position. While there are commonly cited "room curves" (Harman, B&W, etc.), equalizing your system to match one of these does not guarantee that your system's frequency response will sound anything like another system's frequency response using the same target curve.

As far as I understand, this is because measured frequency response doesn't tell you anything about the relationship between the direct sound and the reflected sound and this relationship greatly impacts how we perceive the frequency response. This relationship will be different from system to system due to things like difference in speaker directivity and room acoustics.

So then, couldn't we create a new way of measuring frequency response that takes into account how the reflections would alter our sense of the frequency response? Already it seems like REW measures useful data about the reflections themselves. All we'd need is an algorithm that takes the measured frequency response and the data about the reflections as input, and then creates a new "perceived" frequency response curve that takes into account how the reflections would affect our perception of the frequency response.
You mean estimated in-room response.

It's calculated from the measured 3D sound field. Integrated over a typical set of room boundaries.

Alternately, you can use moving mic method to get the actual response in an actual room at the MLP, reflections and all. But that is now a measurement that is specific to the room.

And, if you are worried about which house curve to like, that is personal preference. In your post, it isn't clear to me if you are talking preference or measurement. But for sure the measurements exist.

I think you created a bit of a circular argument. Perhaps more useful to measure your room, know where the peaks and dips in the bass exist, and where you have resonances and poor control of reflections. Most of your in-room interactions are not really generalizable to a speaker review. They are unique to your room.
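A minimal sketch of the moving-mic / multi-point idea mentioned above, assuming you already have impulse responses captured at several positions around the listening area (equal length, same sample rate); averaging the power spectra smooths out interference patterns that are specific to any single mic position:

```python
import numpy as np

def spatially_averaged_response(impulse_responses, sample_rate):
    """Average the power spectra of measurements taken at several mic
    positions (multi-point / moving-mic style average). Position-specific
    comb filtering partially cancels out in the average, leaving a curve
    more representative of the listening area as a whole."""
    spectra = [np.abs(np.fft.rfft(ir)) ** 2 for ir in impulse_responses]
    avg_power = np.mean(spectra, axis=0)
    freqs = np.fft.rfftfreq(len(impulse_responses[0]), d=1.0 / sample_rate)
    return freqs, 10 * np.log10(avg_power)
```

As the post says, though, the result is specific to the room it was measured in; it describes your seat and your walls, not the speaker in general.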
 
You mean estimated in-room response.
If estimated in-room response means the estimated frequency response that a microphone would pick up in a room, then no, that is not what I mean.

I mean a curve that shows frequency response from the listening position but modified to include how the characteristics of the reflections would affect your perception of the frequency response in a way that the raw frequency response from the mic would not show.

EDIT: I may have worded this badly. So when a mic measures frequency response, it does not distinguish between direct and reflected sound. It is all the same to the mic. But our brains do, and are actively trying to hear what the direct sound sounds like, but they can't do it perfectly. PFR would be akin to the frequency response after your brain is done interpreting it.
 
If estimated in-room response means the estimated frequency response that a microphone would pick up in a room, then no, that is not what I mean.

I mean a curve that shows frequency response from the listening position but modified to include how the characteristics of the reflections would affect your perception of the frequency response in a way that the raw frequency response from the mic would not show.

EDIT: I may have worded this badly. So when a mic measures frequency response, it does not distinguish between direct and reflected sound. It is all the same to the mic. But our brains do, and are actively trying to hear what the direct sound sounds like, but they can't do it perfectly. PFR would be akin to the frequency response after your brain is done interpreting it.
Estimated in-room response is how it is likely to sound to our ears in a standardized room. This takes into account the impact of reflections on our perception of frequency response.

But you cannot reliably generate that response in reverse, ie, take a random speaker, run your audio through a DSP, and have that speaker sound like a "perfect" estimated in room response. Or, at least, no one has been able to do that yet, and the hurdles are pretty immense when many (or most) speakers on the market have inconsistent off axis response (which is where a lot of the secondary information for our brains comes from).
 
I think MAB has given the best answer. We do have a generalized idea. Also, we do to a large extent hear through the room. Just as in a live concert: no two concert halls would measure the same at the audience position, but we can hear through the hall somewhat.

Now, in principle, you could take the results from a Klippel measurement and combine them with precise measurements or modeling of a room, given its dimensions, absorption coefficients and speaker position, to accurately predict the speaker's response, which could then be EQ'd for optimum response at the listener's position. Mostly by fixing the room below 500 Hz, and given a good speaker design, you should get similar results.

I've messed around with this some using different speakers and doing in room measurements. While I wouldn't say they were so identical you could tell no difference, they are not wildly different unless the speakers are wildly different. So maybe we are halfway to your ideal.
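A toy version of that prediction step, assuming a bare rectangular room, an omnidirectional source, and only first-order reflections with a single broadband reflection coefficient; all gross simplifications compared to real room modeling, which needs higher-order images, frequency-dependent absorption, and the speaker's measured directivity:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def first_order_room_response(freqs, src, lis, room_dims, refl_coef=0.7):
    """Rough in-room transfer function at the listener: the direct path
    plus the six first-order image sources of a rectangular room, each
    summed as a complex pressure with 1/r spreading and propagation
    delay. Positions and dimensions are in meters."""
    src = np.asarray(src, dtype=float)
    lis = np.asarray(lis, dtype=float)
    # Direct source plus one mirror image per room surface.
    images = [(src, 1.0)]
    for axis in range(3):
        for wall in (0.0, room_dims[axis]):
            mirrored = src.copy()
            mirrored[axis] = 2 * wall - src[axis]
            images.append((mirrored, refl_coef))
    k = 2 * np.pi * np.asarray(freqs) / SPEED_OF_SOUND  # wavenumbers
    response = np.zeros(len(k), dtype=complex)
    for pos, gain in images:
        r = np.linalg.norm(pos - lis)
        response += gain * np.exp(-1j * k * r) / r
    return 20 * np.log10(np.abs(response))
```

Even this crude sketch shows the characteristic comb filtering that reflections impose, and why the predicted curve changes whenever the speaker or listener moves.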
 
If estimated in-room response means the estimated frequency response that a microphone would pick up in a room, then no, that is not what I mean.

I mean a curve that shows frequency response from the listening position but modified to include how the characteristics of the reflections would affect your perception of the frequency response in a way that the raw frequency response from the mic would not show.

EDIT: I may have worded this badly. So when a mic measures frequency response, it does not distinguish between direct and reflected sound. It is all the same to the mic. But our brains do, and are actively trying to hear what the direct sound sounds like, but they can't do it perfectly. PFR would be akin to the frequency response after your brain is done interpreting it.
Measuring in-room with REW does somewhat distinguish between direct and reflected sound. It is a gated response: there is a moving filter on the sweep, so it ignores reflections not near the sweep frequency. You may have to adjust the windowing of that filter for different room sizes. Our hearing also ignores early reflections, which is why above 500 Hz we mostly hear the direct frequency response, like you would measure in an anechoic chamber.

In case that is not clear, imagine I send out a 1 kHz tone for a short period of time. If I synchronize and only measure that tone for 1 millisecond and then ignore everything after, then as long as the speakers are more than 1 ft from any walls, I will capture the direct sound only; by the time any reflections arrive, I will have cut off the microphone and ignored them. REW and similar software does this dynamically as it sweeps from 20 Hz to 20 kHz. There are trade-offs between resolution and frequency, but it does at least somewhat ignore reflections. As does your hearing.
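The same gating idea in code: a sketch that windows an impulse response to keep only the first millisecond and returns the spectrum of that gated portion (the impulse response and its delays here are synthetic, purely for illustration). Note the trade-off the post mentions: a 1 ms gate only resolves the response above roughly 1 kHz.

```python
import numpy as np

def gated_response(impulse_response, sample_rate, gate_ms):
    """Keep only the first `gate_ms` milliseconds of an impulse
    response (the direct sound), zero out the rest, and return the
    magnitude spectrum of the gated signal. Later reflections are
    excluded from the result, as in a gated/anechoic-style measurement."""
    gate_samples = int(sample_rate * gate_ms / 1000)
    gated = np.zeros_like(impulse_response)
    gated[:gate_samples] = impulse_response[:gate_samples]
    spectrum = np.abs(np.fft.rfft(gated))
    freqs = np.fft.rfftfreq(len(gated), d=1.0 / sample_rate)
    return freqs, spectrum
```

With a synthetic direct impulse at t = 0 and a reflection arriving at 3 ms, the 1 ms gate yields a flat spectrum (direct sound only), while the ungated FFT of the same signal shows the comb-filter ripple the reflection causes.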
 
If estimated in-room response means the estimated frequency response that a microphone would pick up in a room, then no, that is not what I mean.

I mean a curve that shows frequency response from the listening position but modified to include how the characteristics of the reflections would affect your perception of the frequency response in a way that the raw frequency response from the mic would not show.

EDIT: I may have worded this badly. So when a mic measures frequency response, it does not distinguish between direct and reflected sound. It is all the same to the mic. But our brains do, and are actively trying to hear what the direct sound sounds like, but they can't do it perfectly. PFR would be akin to the frequency response after your brain is done interpreting it.

My understanding is that there can theoretically be DIFFERENT paths to what is essentially the SAME Estimated In-room Response. For instance, Speaker A could have more top end energy on-axis and less top-end energy off-axis than Speaker B, even if the Estimated In-room Responses are essentially identical. And I think the two would indeed sound different.

Is this the sort of thing you're talking about? That is to say, a way to compare Speaker A to Speaker B which would account for perceptual differences that are not revealed by examining the Estimated In-room Response curves?

Seems to me the direct-to-reflected sound ratio would play a role, and that's going to change with listening distance, and from one room to the next, and would be somewhat dependent on the acoustic properties of the room surfaces, so I think that standardization might need to include specifications covering such variables.
 
There are a ton of people on this forum smarter and more knowledgeable than me, so I'm hoping someone could explain why I'm wrong, why this is a useless idea, or on the flip side, inform me that this isn't a novel idea and that people are already working on this.

I have recently learned that there is no standard for the measured frequency response of a loudspeaker system from a listening position. While there are commonly cited "room curves" (Harman, B&W, etc.), equalizing your system to match one of these does not guarantee that your system's frequency response will sound anything like another system's frequency response using the same target curve.

As far as I understand, this is because measured frequency response doesn't tell you anything about the relationship between the direct sound and the reflected sound and this relationship greatly impacts how we perceive the frequency response. This relationship will be different from system to system due to things like difference in speaker directivity and room acoustics.

So then, couldn't we create a new way of measuring frequency response that takes into account how the reflections would alter our sense of the frequency response? Already it seems like REW measures useful data about the reflections themselves. All we'd need is an algorithm that takes the measured frequency response and the data about the reflections as input, and then creates a new "perceived" frequency response curve that takes into account how the reflections would affect our perception of the frequency response.

And then we could create a standard curve for that.

I will call this hypothetical way of measuring frequency response "perceived frequency response," or PFR for short, because I can't think of a better name. This is somewhat of a misnomer: the intention of PFR is NOT to literally take into account every single thing that could affect perception of frequency response, like equal loudness contours or individual ear canal shape, but simply to take into account how reflections affect how frequency response is perceived.

I am well aware that standardizing PFR wouldn't standardize everything about a sound system like the reflections themselves or the soundstage/stereo width, but this feels like it would be 100x more useful than having no standard whatsoever, whether you're a mastering engineer, an audio equipment manufacturer or a consumer. It would help prevent the "circle of confusion." Even if you're a consumer who doesn't think the standard sounds good and wants to do their own thing, wouldn't it be useful to know exactly how your tastes differ from the standard so that setting up a system in the future takes less trial and error?

The most practical use I can think of for this is that if PFR is implemented in receivers, it would make auto EQ functions work properly, no?

Just as the standardization of video calibration was a good thing for video, I feel like this kind of standardization would be good for audio even if it doesn't encompass every factor. Am I wrong?

This sort of gets onto the edge of things.
Toole and others say that the frequency response is the most important attribute.
If some sets of speakers have identical frequency response then to first order they will sound the same.
But there is also phase and impulse response.
And people have mentioned dispersion and radiation pattern.

Even moving the same speaker in a room, say out from a wall, will change the sound. Some of that may be frequency response, but some is also near echoes being moved further out.

Basically, once the first-order factors are handled, the lesser-order effects become the most important ones left… the longest remaining pole in the tent.

In any case, back to impulse response… if impulse response did not matter, then we should not be able to hear a difference between simple EQ and, say, Dirac Live… but many people do hear a difference.
 
My understanding is that there can theoretically be DIFFERENT paths to what is essentially the SAME Estimated In-room Response. For instance, Speaker A could have more top end energy on-axis and less top-end energy off-axis than Speaker B, even if the Estimated In-room Responses are essentially identical. And I think the two would indeed sound different.

Is this the sort of thing you're talking about? That is to say, a way to compare Speaker A to Speaker B which would account for perceptual differences that are not revealed by examining the Estimated In-room Response curves?
Exactly!
 