Target Room Response and Cinema X-curve

amirm · Feb 14, 2016

There Is No "There" in Audio
By Amir Majidimehr

[Note: this article is a slightly revised version of one originally published in Widescreen Review Magazine.]

Readers of Widescreen Review Magazine no doubt are familiar with video calibration and its importance in getting proper images from their display. Using specific test patterns which were used to calibrate the production equipment, we are able to match the same standard and hence, achieve a picture that is close if not perceptually the same as what the talent and production crew saw. The only limit is the capabilities of our display, not the existence of a method to get there.

While many think otherwise, no such standard exists for audio. Other than setting levels with a simple SPL meter (which in itself gives only approximate results), there is no method to match the sound that was heard in the production of movie soundtracks or music. We can guess, make assumptions, etc., but at the end of the day there is nothing remotely like video calibration when it comes to audio. Audio, for lack of a better word, is "broken" as an architecture. It always has been and continues to be so, both for music and for video/movies. Yes, these are strong statements, but all are true, as you will see in this article. Fortunately there is some help here from an unexpected direction, but before we get there, let's examine the problem.

Frequency Response
The frequency response of a system is akin to the color and grayscale of video. Change the frequency response enough and the timbre, or the tonal quality of the audio, changes immediately and readily. Nowhere is this more obvious than in the performance of our speakers and the impact of the room on them. While an amplifier can have a ruler-flat frequency response, even the best rooms can't remotely approach the same.

Even if the speaker has a decent frequency response, once you put it in a room, all bets are off. The low frequencies specifically are heavily influenced by the room, which radically changes the response of the speaker to anything but neutral. Figure 1 is an example of this, showing the "raw" (unequalized) measurement of my Revel speaker in my personal theater. This is an excellently designed speaker, yet it has no choice but to have its response varied by more than 20 dB once placed in my room.

Figure 1: Measured response of the left channel speaker showing large variation in frequency response (1/12 octave smoothing).

As a good friend is fond of saying, "this is just physics." You could put almost any speaker in there and the low frequency variations will persist in a similar manner.

Perceptually, a 10 dB increase in level is heard as roughly twice as loud, so a differential of more than 20 dB represents more than a 4X variation in loudness. The variation is frequency specific and hence manifests itself as a tonal change from flat response. If a note hits a peak it will sound much louder than if it lands in a valley. The impact is coloration of the sound. Worse yet, if you move in the room or put the speaker in a different room, you get different colorations yet again.
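The arithmetic behind that statement can be sketched in a few lines. This is only the rule of thumb quoted above (each 10 dB doubles perceived loudness), not a full loudness model, and the function name is made up for illustration.

```python
def loudness_ratio(delta_db: float) -> float:
    """Approximate perceived loudness ratio for a level difference in dB.

    Assumes the rule of thumb that every 10 dB is heard as about twice as loud.
    """
    return 2.0 ** (delta_db / 10.0)


for delta in (10, 20, 25):
    print(f"{delta:>3} dB difference -> about {loudness_ratio(delta):.1f}x as loud")
```

A 20 dB swing therefore comes out at 4X, and anything beyond 20 dB exceeds that.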

Now let's go back to our goal of replicating the sound that was heard in the production of the movies or music we bought. For starters, let's agree that they are likely to be using a different set of speakers than we are. With thousands upon thousands of consumer speaker brands and models, the odds of a listener using the same speaker that was used to produce said content are next to zero. Even if you did try to match the brand and model of speaker, you still can't get there since there is no one speaker that is used in the production of content.

So right away we start with large variations in speaker frequency response that must be dealt with or we get a different sound. Add to it the impact of the room and there is no way on earth, sans some controlling mechanism, that you can hear the same thing that was heard during the production stage.

As I hinted at the beginning of this article, we do have some tools to deal with this problem. For example, we could resort to acoustic room treatments to lower the impact on low frequencies. In this article I will focus on another tool, which is electronic equalization. I have covered its use in previous articles with respect to optimization of low frequency response. Here, I am expanding that concept to the full frequency response of the speaker and room in what we call a target curve.

Target Response Curve
The target response curve is the measurement that we would like to achieve after equalizing the system. In real life I can never fully achieve it. But it is a "target" and, as with any target, the closer I get to it, the better I have achieved the goal. And the goal again is to hear what was heard in the production of content.

In theory then, if we all agreed on one curve and managed to get our systems to deliver that, then we would be hearing the same timbre (other aspects such as spaciousness and such are beyond this means and outside of the scope of this article).

To take care of the easy part first: no such target curve exists at all for music. That unfortunately is the end of the journey for that type of content. Any frequency response could have existed in the production of music, and each production differs yet again from one room/facility to another. Tonally, then, we can never know whether what we are hearing in our room is what was heard when the music was produced. Don't even dream about approximating the live scenario prior to the recording, as that adds a huge number of variables of its own. We are so far away from achieving that reality that any notion that we can falls in the realm of fantasy. But let me come back to that a bit later and talk about movie sound for now, where there are some standards.

Cinema X-Curve Target Response
The standard practice for movie sound production is the "X Curve" (from the letter "X" in Experimental), which was determined and codified in the ISO 2969 and SMPTE ST 202 specifications in the early 1970s. Figure 2 shows what its target response looks like.

Figure 2: Target curve used in production of movie sound. Dotted lines show allowed deviations.

Measurement is performed using pink noise as the source signal and the frequency response is gathered at 1/3 octave resolution. This has the effect of heavily smoothing the response variations. While there are problems with such coarse measurements, it does allow one to see the overall tonal response of the system.

The X Curve looks kind of odd, no? It is hard to fathom why you would want to start rolling off your high frequencies at 2 kHz. The curve originated in an experiment that showed such a differential between near field production monitors and a large theater. I will be covering this in much more detail later, but for now, we have years of research since the creation of the X curve telling us what good sound is in a room, and that graph is not it. Whatever you do, do not, I repeat, do not follow the X-curve for your home system.

Realizing the problem in practice, the industry has massaged the X-curve in various ways, both by allowing for a margin of error as shown in Figure 2 and by allowing different drooping responses at 2 kHz. Such variations mean that even if the original concept was right, you still have no prayer of achieving the same thing that was heard, because so much leeway is allowed.

Let's remember that the X curve, like all such responses, is a target curve. The reality is different, as I mentioned. The Audio Engineering Society paper, A Survey Study Of In-Situ Stereo And Multi-Channel Monitoring Conditions, shows how far out of compliance production systems can be. The paper has measurements of 372 Genelec speakers used in professional control rooms. Their composite statistics are shown in Figure 3. The poor X curve, shown as a solid black line, is lost in a sea of wide variations no matter which statistical measure you use.

Figure 3: Distribution of in-room responses of 372 factory calibrated 3-way Genelec loudspeakers in 164 professional control rooms.

In reality the problem is much worse than it seems. The standard calls for measurements using 1/3 octave smoothing, as you see in the Genelec data. Variations in room response at low frequencies require resolution down to a single hertz. No way can you average this to 1/3 octave and hope to get a view that perceptually matches what we hear. Optimization of low frequencies needs to occur at no more than 1/12 octave smoothing (hence the use of that smoothing in my measurement in Figure 1). A measurement with such light filtering would have shown much more variation than seen in Figure 3.
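To see why coarse smoothing hides low frequency problems, here is a minimal sketch that applies a sliding fractional-octave average to a made-up response. Everything in it is an assumption for illustration: the response is synthetic (flat with one narrow 20 dB dip near 60 Hz), the averaging is done on power within a rectangular window, and real measurement software may use different windows and weighting.

```python
import math


def octave_smooth(freqs, levels_db, fraction):
    """Smooth a magnitude response with a sliding 1/fraction-octave window.

    Power (not dB) is averaged over all points within half a window
    on either side of each frequency.
    """
    half_window = 2.0 ** (1.0 / (2.0 * fraction))
    smoothed = []
    for f in freqs:
        lo, hi = f / half_window, f * half_window
        powers = [10.0 ** (lv / 10.0)
                  for g, lv in zip(freqs, levels_db) if lo <= g <= hi]
        smoothed.append(10.0 * math.log10(sum(powers) / len(powers)))
    return smoothed


# Hypothetical response: 0 dB everywhere except a narrow 20 dB dip near 60 Hz.
POINTS_PER_OCTAVE = 96
freqs = [20.0 * 2.0 ** (i / POINTS_PER_OCTAVE) for i in range(320)]
raw = [-20.0 * math.exp(-0.5 * (math.log2(f / 60.0) / 0.02) ** 2) for f in freqs]

twelfth = octave_smooth(freqs, raw, 12)
third = octave_smooth(freqs, raw, 3)

for name, curve in (("raw", raw), ("1/12 octave", twelfth), ("1/3 octave", third)):
    print(f"{name:>12}: deepest point {min(curve):6.1f} dB")
```

With this synthetic data the dip gets shallower at 1/12 octave and nearly disappears at 1/3 octave, even though nothing about the underlying response has changed.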

Don't think that this is a problem isolated to Genelec speakers. Figure 4 shows the measurements of a number of Dolby certified dubbing theaters from another study.

Figure 4: Survey of 20 Dolby certified dubbing theaters showing wide variation in response.

These variations are certainly audible, so even in the context of professional application of the X curve the mission has failed. It is very unlikely that any two theaters would sound the same, or even sound good, due to the forced use of 1/3 octave filtering.

Fortunately there is considerable activity in the SMPTE and AES organizations to revise the production standard and hopefully bring it up to the state of the art in the understanding of room acoustics. Until then, we really have no target "standard" to aim at in calibrating our home systems, and certainly not one when it comes to the X curve.

Preferred Target Response
Let's inject some common sense here. The ultimate goal is to have an enjoyable system. If we can achieve that, then perhaps it is not the end of the world that it does not match what the talent heard when they produced the content. It is a remarkable thing, but we can actually tell when something sounds good even lacking a reference. You do that all the time when you audition speakers. You have no reference to compare those speakers to, yet you are able to evaluate whether what you hear is "right" or not. Dr. Toole says it best in his must-have book, Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms:

"Descriptors like pleasantness and preference must therefore be considered as ranking in importance with accuracy and fidelity. This may seem like a dangerous path to take, risking the corruption of all that is revered in the purity of an original live performance. Fortunately, it turns out that when given the opportunity to judge without bias, human listeners are excellent detectors of artifacts and distortions; they are remarkably trustworthy guardians of what is good. Having only a vague concept of what might be correct, listeners recognize what is wrong. An absence of problems becomes a measure of excellence."

There is a great manifestation of this in the AES paper, The Subjective and Objective Evaluation of Room Correction Products, by Dr. Olive et al. This is a double blind preference study of room equalization systems, as shown in Slide 1. The paper is a few years old so some of the equipment is dated. But the data remains valid with respect to listener preference and the target curve we may want to use for our playback systems.

Slide 1: Room equalization equipment tested for listener preference.

The listening room is shown in slide 2 although the speaker used for this test was the B&W 802N. Levels were matched across all equipment to 0.1 dB. Eight (8) trained listeners participated in the testing. "No Equalization" was a control and reference.

Slide 2: Listening room used by Olive et al. to evaluate listener preference for room equalization.

How did each system do? Slide 3 shows their overall rankings. We see three systems that outperformed doing nothing (#4 "No EQ") which is good news as far as our ability to better the system sound using electronic equalization.

Slide 3: Preference rating of the room equalization systems. We see two (#5 and #6) actually performing worse than no equalization!

Surprisingly, two systems, #5 and #6, actually degraded the fidelity, garnering scores lower than no EQ! This tells us that there is a right way to perform such equalization and a wrong way. Objective measurements tell us the potential reason for the varying preference scores. The averaged frequency response of each of the 6 systems is shown in Slide 4. Notice that the two systems which underperformed no equalization both had a rather flat frequency response. They also had responses with a substantial amount of dips and variations. The better performing systems, on the other hand, had smooth, downward-sloping responses. This runs counter to the common notion that a flat frequency response must be right.

Slide 4: Listener preference rating strongly points to a frequency response that slopes down and is smooth.

Contributing to the poor performance of the worst system (in teal color and #6 in the previous graph) was a design weakness of the specific speaker used in the test. The B&W has a response dip around 2 kHz due to the large disparity in the size of its drivers (the larger driver becomes too directional before the next smaller driver takes over, impacting the tonal quality of the reflected sounds). That EQ system made things worse by applying a further dip at those frequencies, causing a nearly 5 dB trough around 2 kHz. There is no science showing that such a 5 dB dip is beneficial. Indeed, research shows the complete opposite: the response needs to be smooth. The poor subjective scores of this EQ system with the B&W speaker confirm this understanding.

Going back to our X curve, the two systems at the bottom match it best by having a rather flat target response from low to mid frequencies. The perceptual effect, however, was a negative one. Clearly, then, sticking with the notion of matching the X curve is a bad idea.

The lesson here is that we need to trust our ears and use a target response that is smooth but sloping down as we go from low to high frequencies.

Unfortunately the automated equalization in many mass market AVRs and processors does not allow any customization of the target response. In the case of the popular Audyssey, for example, you need to opt for its Pro version to be able to customize its target response curve. The implementation in consumer AVRs without it is all or nothing. If the Pro version is available for your AVR, I highly suggest that you use it to modify the default response to be sloping down.

Figure 5 shows the user interface of the Audyssey Pro software and the default target response choices available. Note the checkbox "mid-range compensation." This is the reason there is a dip in the target graphs in red. Despite Audyssey's recommendation to the contrary, my strong recommendation is to leave this checkbox off. Or, at a minimum, perform the calibration twice, listen with and without it, and see which one you like better.

Figure 5: Audyssey MultEQ Pro target response graph selection and design. Note the flat response in low to mid frequencies and the dip in mid-range.

Fortunately there are systems with the right response curve, as in the JBL Synthesis equalization in Figure 6 (not surprisingly, from Harman, which performed the EQ blind tests above). The bright green is the measured response. It almost completely hides the target/desired response (a good thing, since it shows tight compliance with the target response). If you look carefully, you can see that it slopes down about 5 dB from 100 Hz to 20 kHz. That should be the starting point for your equalization efforts.
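As a starting point, a slope like that can be written down as a simple function of frequency. The sketch below assumes a straight line on a logarithmic frequency axis and a flat response below 100 Hz; both are simplifications for illustration and are not read off the JBL Synthesis curve itself. The function and parameter names are hypothetical.

```python
import math


def target_db(freq_hz, f_start=100.0, f_end=20000.0, total_drop_db=5.0):
    """Target level in dB relative to the level at f_start.

    Assumes a straight line on a log-frequency axis that falls by
    total_drop_db between f_start and f_end, and is flat below f_start.
    """
    if freq_hz <= f_start:
        return 0.0
    span_decades = math.log10(f_end / f_start)
    return -total_drop_db * math.log10(freq_hz / f_start) / span_decades


slope_per_decade = 5.0 / math.log10(20000.0 / 100.0)
slope_per_octave = 5.0 / math.log2(20000.0 / 100.0)
print(f"slope: {slope_per_decade:.2f} dB/decade, {slope_per_octave:.2f} dB/octave")

for f in (50, 100, 1000, 10000, 20000):
    print(f"{f:>6} Hz: {target_db(f):+.2f} dB")
```

Changing total_drop_db is the easy way to experiment: a smaller number gives a brighter balance, a larger one a warmer balance.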

Figure 6: JBL Synthesis "ARCOS" room equalization target curve shown as a gray dashed line. The darker green is the response prior to application of the response curve/EQ and the bright green is the post-equalization measurement.

Summary
So as you see, we are driving blind to some extent in audio. We have no yardstick as we have in video to calibrate our systems. Instead, we need to start with the best practices in the industry and a target curve which slopes down. From there let your ears be the guide. Play some content and if the bass is too strong, it likely is. Reduce the slope some. Don't let anyone shame you into sticking with some reference as is commonly done in online forums. What sounds good is the right answer.

References
A Survey Study Of In-Situ Stereo And Multi-Channel Monitoring Conditions, Aki V. Mäkivirta and Christophe Anet, Genelec OY, AES Convention 111, November 2001

The Subjective and Objective Evaluation of Room Correction Products, Sean E. Olive, John Jackson, Allan Devantier, David Hunt, and Sean M. Hess, Harman International R&D Presentation

The Subjective and Objective Evaluation of Room Correction Products, Sean Olive, John Jackson, Allan Devantier, and David Hunt, AES Convention 127, October 2009

A New Draught [draft] Proposal for the Calibration of Sound in Cinema Rooms, Philip Newell, AES Technical Committee paper, January 2012

Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms, Dr. Floyd Toole, 2008 [book]

Blumlein 88 · Mar 28, 2016

Purely anecdotal, but I read an interview with Peter Walker many years ago. He preferred a gentle slope of 3 dB/decade. That actually works out not so bad, and is much better than the old X-curve in my opinion. It also isn't too far off from some more sophisticated suggested target curves. Some of those stay flat to around 200 Hz and then more or less follow something near Walker's curve.

I do prefer EQ that lets you massage your preferred curve some. Maybe I just feel better than I would sticking with a stock one, but it works psychologically at least.

AJ Soundfield · Mar 28, 2016

amirm said:
Figure 1 is an example of this, showing the "raw" (unequalized) measurement of my Revel speaker in my personal theater. This is an excellently designed speaker yet has no choice but to have its response varied by more than 20 dB once placed in my room.

Figure 1: Measured response of the left channel speaker showing large variation in frequency response (1/12 octave smoothing).

Amir, what's going on there at 480 Hz???

amirm · Mar 28, 2016

AJ Soundfield said:
Amir, what's going on there at 480 Hz???

Don't lose the forest for the trees....

Chuck Gerlach · Jun 5, 2016

In most home theater systems, it is not uncommon for the user to crank up the LFE channel to get more bass impact. And I have been doing that with my Audyssey based systems since I was never able to get the impact I wanted (without some other downside) using a modified target curve. Unfortunately, when you crank up the LFE channel, vocals can (and usually do) get way too chesty.

On some other forum, you posted this curve. I have played with it a bit and had some interesting results. Audyssey Pro does not have enough granularity in the curve editor to get good results. But Dirac (and I assume others) does. The issue, however, is that it is very easy to overdo this curve. In fact, doing an exact replica of this curve is a bit too much to my ears and to a few others who have tried it.

Since I now use a Datasat SSP which supports Dirac and a much superior target editor, I have done some more experimentation. To demonstrate my OCDness, I printed this on a piece of paper, drew more granular lines and used those point intersections to generate my target. Change the shape just a bit and it really is a good compromise - and works reasonably well for music also. Since my subs are only a few dB down at 5 Hz, I have to have a slightly different version but I really like this curve - for now!! By the way, what does the red line represent??

Fitzcaraldo215 · Jun 5, 2016

I used Audyssey Pro for a while, as well. And, yes, its stock target was deficient, preferring flat response below the transition frequency among other major deficiencies at higher frequencies. They seemed to assume listening at cinema reference level, much too high for me, usually. Or, if at a lower level, you would be using their Dynamic EQ feature, which introduced a smooth low and high frequency boost to the curve, similar to Fletcher-Munson, and proportionate in degree to how far your volume control was set below reference level.

Since music listening is vastly more important to me than cinema and video, and since there are no standard reference levels for music recordings, I did not find Dynamic EQ useful at all. I did not like it.

But, I do think part of the answer about the "right" degree of bass boost in the target is related to one's typically preferred listening levels and diminished sensitivity of our ears at lower than "standard" levels. That, and the degree to which one is merely a "bass freak". But, we are talking about psychoacoustics and arbitrary listener preferences.

I also use Dirac now myself. But, I have been delighted just using its unmodified, default target throughout the range, containing slightly rising bass and smoothly falling HF, with the greater downward rolloff above about 15 kHz. I have not fiddled with deep bass response below 50 Hz, as the curves you provided show. There just is not a lot down there in music or on dialog with video. As to cinema effects, like cannon fire, earthquakes, helicopters, etc., I have no internalized idea what the specific examples of those on the cinema sound track are supposed to sound like. I have no reference in my head according to which I can tweak low bass levels to sound more like the "real thing". I like to think, perhaps falsely, I have a better reference for how classical music is supposed to sound from my frequent live concert attendance.

I am really quite happy with music. Organ music shakes the room with excellent dynamics and so called "quickness", plausibly, and without boominess or hangover, as necessary. Dialog articulation, even with TV broadcasts, is superb, a great pleasure to listen to. And, a cinematic earthquake sounds like a startling, shaking replica of an earthquake to me, even if it is not the same exact version of earthquake sound they put on the soundtrack. How would I know the difference?

NorthSky · Jun 5, 2016

Some of my best quality music recordings are from Blu-ray movies...in the closing credits...and they happen to be in multichannel.
And of course in some scenes of the best Blu-ray movies, be it from orchestral music composers or tunes from bands that were very well recorded and transferred in lossless Dolby TrueHD or DTS-HD Master Audio or multichannel PCM hi-res audio.

Music in movies (quality music, quality movies) is no different to me than straight stereo music from vinyl, CDs, SACDs, hi-res music downloads and streaming.

I hate dynamic range application, and I hate volume level compression; it kills the main music essence...it has no life no value no redemption.
_____

I am no expert in room's acoustics, in target music and movie curves, in equalizers, in filters.
I like what I'm reading, what I'm seeing, what I'm hearing. Above all else I love well reproduced music with a high emotional level, presence.
_____

Bonus: http://www.soundandvision.com/conte...-av-preamplifierprocessor#HU848OaZHfHir5KU.97
• http://www.whatsbestforum.com/showthread.php?18812-JBL-Synthesis-SDP-75

* It was Ron's birthday yesterday, Happy belated Birthday Ron!
And one of our cats is dead, most likely robbed of his life by an eagle...tears in our hearts dropping across the fields of destiny. He was a lovely cat.
And RIP also Ali.

RayDunzl · Nov 20, 2016

It seems to me that the X-Curve

or some variation of it, is commonly seen in audio recordings...

Source: Hank Williams Jr Those Tear Jerking Songs

Not always, but "most of the time".

amirm · Nov 20, 2016

Music in general has naturally decaying high frequencies so that slope on the high-end is typical. And low-end likewise represents lack of energy at those bass frequencies. That said, I think it is a coincidence that the track you have is so close to it. I just took a random track in my library and got this:

So not close at all to X-curve.

fas42 · Nov 20, 2016

A recording of someone hammering the cymbals, alone, of a drum kit is a recording - which I'm sure will have an unusual spectrum. Which needs to be reproduced accurately to sound authentic - why does a "curve" have to be involved?

amirm · Nov 20, 2016

fas42 said:
A recording of someone hammering the cymbals, alone, of a drum kit is a recording - which I'm sure will have an unusual spectrum. Which needs to be reproduced accurately to sound authentic - why does a "curve" have to be involved?

Because without it, we find the high frequencies too bright.

fas42 · Nov 20, 2016

Too bright - or the distortion is too much ... ?

amirm · Nov 20, 2016

fas42 said:
Too bright - or the distortion is too much ... ?

With the same system, the dropping high-end sounds more natural to listeners.

Sal1950 · Nov 20, 2016

amirm said:
With the same system, the dropping high-end sounds more natural to listeners.

The "tizzyness" in the reproduction of the top end by most if not all of today's high frequency tweeters?

fas42 · Nov 20, 2016

The "tizzyness" is distortion, system generated - it's one of the measures of a 'competent' system that this quality ceases to be audible - real life instruments don't produce it, and a decent rig shouldn't either. And it has nothing to do with the construction of the tweeters, or FR - our ears hear something is wrong with the sound, and the usual band aid of dulling the treble is just avoiding a proper solution.

amirm · Nov 20, 2016

Sal1950 said:
The "tizzyness" in the reproduction of the top end by most if not all of today's high frequency tweeters?

No, I think that is an even worse phenomenon than what we are talking about. The testing was done by Harman using B&W 801 speakers. So not the planar tweeters that are super bright.

amirm · Nov 20, 2016

fas42 said:
The "tizzyness" is distortion, system generated - it's one of the measures of a 'competent' system that this quality ceases to be audible - real life instruments don't produce it, and a decent rig shouldn't either. And it has nothing to do with the construction of the tweeters, or FR - our ears hear something is wrong with the sound, and the usual band aid of dulling the treble is just avoiding a proper solution.

So create your proper solution, then test flat (equalized) response to one drooping down. You will find the latter preferred.

Part of the explanation is that in real life sound will hit objects and high frequencies get absorbed the most. As a result, the sum total of what you hear, i.e. direct+indirect, will have its high frequency components reduced. If you then build a system that attempts to produce a flat response, it will be amplifying the highs so as to compensate for that drop and hence, will not sound natural.

As I mentioned, distortion is orthogonal to this topic and is kept constant.

Sal1950 · Nov 20, 2016

amirm said:
Part of the explanation is that in real life sound will hit objects and high frequencies get absorbed the most. As a result, the sum total of what you hear, i.e. direct+indirect, will have its high frequency components reduced. If you then build a system that attempts to produce a flat response, it will be amplifying the highs so as to compensate for that drop and hence, will not sound natural.

OK, that adds some logic to the situation. Otherwise I question the "preferred" slope as the listener finding it more accurate to live, that didn't compute.

fas42 · Nov 20, 2016

amirm said:
So create your proper solution, then test flat (equalized) response to one drooping down. You will find the latter preferred.

Part of the explanation is that in real life sound will hit objects and high frequencies get absorbed the most. As a result, the sum total of what you hear, i.e. direct+indirect, will have its high frequency components reduced. If you then build a system that attempts to produce a flat response, it will be amplifying the highs so as to compensate for that drop and hence, will not sound natural.

Yes, I'm sure people will have their own preferences. Personally, "bright" sound doesn't disturb me, but distorted versions of such irritate me intensely - I look for the reproduced sound to project the correct perspective when I'm well out of the direct path, around corners and such, when obviously the treble content is down enormously; and then still sound natural when also directly in the path of, and close to the treble driver. If correctly rendered this produces the same subjective intensity as being right next to live, acoustic sound happening - if one is not used to this sensation then I guess it would strike them as being "unnatural".

amirm · Nov 20, 2016

fas42 said:
Yes, I'm sure people will have their own preferences.

With respect to this topic, the majority of people tend to have similar tastes. Multiple companies have done this research and arrived at the same conclusion.

Regardless, this is something that I insist should be done to taste. This is what is great about DSP/EQ. There is no specific speaker "sound" that you must live with. You can change the curve and listen. Or have multiple ones to select from depending on the music.
