
Studio Monitors & The Circle of Confusion- What We Know/Don’t Know

Correct. It's the room, not the speakers. At some level once the speakers are "good enough" (and the standard there is definitely lower than you think, +/-2dB on axis is plenty good) they stop mattering so much (and it comes down to preference at that point anyway, I personally have yet to hear a Genelec 83x1 series speaker that I actually liked in spite of their damn near perfect objective behavior), and all the stuff bouncing around the room (or isn't bouncing around) is what starts to matter.

On that note, as fun and massively more ergonomic than a DAW as they are, large-format consoles are an acoustic nightmare. You have this huge aircraft carrier of a mixer between you and the speakers, and the desk bounce is just horrific. They resonate, too.


This, so much this. It's an art form that happens to use technology. Recordings that are not documentary-style classical recordings (where you basically set up 3 mics in a Decca Tree behind the conductor and that's it) are not real, they are "hyper-real". If you want to hear what pop recordings sound like when the producers and engineers don't try to make things sound "hyper-real", go listen to 50s rock n roll. It's decidedly wimpy sounding compared to what it would have sounded like in person.

Nobody listens to an instrument from 3" away, and yet it is entirely normal to place microphones that close to a source. Why? Because it sounds cool. Rule #1 of audio engineering: "if it sounds good, it is good".
Seems to me that this is part of the problem here. Some people are talking about acoustic/classical record production and others about modern pop music production.

From what I can gather, though possibly done in the same studios, the production of the two is very different.
 
Not really, because plenty of people do love them.
Another reason why such choices should be made on an objective basis. In my own home listening environment I tend to add a little presence dip to loudspeakers that don't already have one on- and off-axis, to make many recordings more enjoyable. It's all an old and still unsolved mess.
 
Seems to me that this is part of the problem here. Some people are talking about acoustic/classical record production and others about modern pop music production.

From what I can gather, though possibly done in the same studios, the production of the two is very different.
Extremely different. But I specified documentary style for a reason, because you can absolutely record an orchestra in a hyper-real manner (see: most film scores).
This is instrumentation-wise a typical orchestra and choir, but pay attention to how the sounds move around and where they sit. There were, I guarantee you, dozens of mics used for this.


Actually, there's an interview from the engineer who recorded this, and he describes using close to 20 microphones. That isn't uncommon.

 
Sean Olive’s predicted in-room response model is based on a weighted sum of three components

+12% sound power
+44% early reflections
+44% on-axis response.

This makes the estimated in-room response a representation of a farfield listening environment. Not extremely farfield, but certainly beyond the nearfield. For nearfield listening, we can assume that mostly the early reflection response and direct sound response matter the most. So, for such setups, the sound power response may be irrelevant. There's no big research about this YET, so mine is just an assumption.
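The weighted sum described above can be sketched in a few lines. This is only an illustration: the function name, the sample curves, and the choice to average directly on dB values are all assumptions, and the weights are simply the ones quoted in this post, passed as a parameter so they can be swapped for whichever definition you follow. The second call shows the nearfield guess from the paragraph above, with sound power dropped and the other two weights renormalized.

```python
def weighted_in_room_estimate(on_axis_db, early_reflections_db, sound_power_db,
                              weights=(0.44, 0.44, 0.12)):
    """Blend three response curves (dB at shared frequency points) into one.

    weights = (on-axis, early reflections, sound power). The defaults are the
    figures quoted in the post, not values taken from a standard.
    """
    w_on, w_er, w_sp = weights
    if abs(w_on + w_er + w_sp - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    if not (len(on_axis_db) == len(early_reflections_db) == len(sound_power_db)):
        raise ValueError("curves must share the same frequency points")
    return [w_on * a + w_er * e + w_sp * s
            for a, e, s in zip(on_axis_db, early_reflections_db, sound_power_db)]


# Hypothetical curves at 1 kHz, 4 kHz and 10 kHz, in dB (made-up numbers).
on_axis = [0.0, 0.0, -0.5]
early_reflections = [-1.0, -2.0, -3.5]
sound_power = [-3.0, -5.0, -8.0]

farfield_estimate = weighted_in_room_estimate(on_axis, early_reflections, sound_power)

# Nearfield assumption from the post: sound power contributes nothing.
nearfield_estimate = weighted_in_room_estimate(
    on_axis, early_reflections, sound_power, weights=(0.5, 0.5, 0.0))
```

Averaging in dB is the simplest choice; an energy-domain average would give slightly different numbers where the three curves diverge.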
I am currently reading the page proofs of the 4th edition of my (now our, since Sean Olive and Todd Welti are contributors) book. I took time out to skim this thread, as it is on a topic I address in the new edition - translation. There have been some positively thoughtful comments in this thread, including from Blockader, and some that need additional perspective. This is one that caught my eye.

First, the Predicted In-Room Response (PIR) was created by Allan Devantier, who was a contributor to the final format of the spinorama while he was working at Harman. It was a simplistic way to combine some spinorama curves to approximate the visual appearance of measured room curves above the transition/Schroeder frequency. Although it is a good portrayal of the total sound arriving at the listening position in a "typical" domestic room as measured by an omnidirectional microphone, it does not accurately portray what is perceived by the listener with two ears and a brain. In fact, the original basis of the spinorama concept is a crude modelling exercise I performed in 1985, where I think I was the first to approximate a steady-state room curve from anechoic data. The following figure is from any of the editions of my book, and from the JAES paper in 1985.

An aside: Ilpo Martikainen, the founder of Genelec, and I became friends at this time as he was pursuing the same goal. We would meet at AES conventions and he would proudly demonstrate his latest offerings, playing Mozart on small, medium and large loudspeakers that sounded better and more alike than most others on display at the time. That Genelecs are now highly respected is no surprise, and now, of course, there are others to choose among. Basically neutral loudspeakers are more numerous as time passes.

1751854670611.png

With modern measurements and processing we can do better, but the basic facts have been in evidence for 40 years. The bottom block (d) illustrates the relative contributions of various sounds in small rooms. This was done using the listening room I created at the National Research Council of Canada, which became the prototype IEC 268-13 recommended listening room.

To extrapolate from this to other listening environments one can imagine that as reflected sounds are attenuated, the contribution of the direct sound moves down in frequency, and early reflections contribute less. Because most loudspeakers have constant directivity at low frequencies, sound power is a dominant factor, but the room curves tell us that room resonances/standing waves dominate what is heard. The transition/Schroeder frequency is easy to see. Every room setup needs to be addressed differently. We learn later that this part of the frequency range accounts for about 30% of overall subjective sound quality evaluations, so it cannot be ignored.

Later, more data showed that the early reflections curve alone is a good predictor of room curves above the transition frequency.

1751855849818.png


This is all very interesting in an academic sense, but not very relevant, because other evidence has shown that listeners gravitate to subjectively preferring loudspeakers with flattish and smooth direct sound, starting with my own published research from 40 years ago. Much later, Sean Olive, in a very clever test using our trained listeners confirmed that what they were paying attention to was the direct sound, not early reflections or sound power - above the transition frequency. This should not be surprising, as all electronics and the best microphones all have flat and smooth frequency responses - why not loudspeakers? What was missing was the knowledge of what measurements to pay attention to.

Listening at a distance in acoustically dead rooms, or listening in the near field (close listening) in any "reasonable" room, both put listeners in a strong direct sound field. When these separate pieces of evidence are combined, it is clear that off-axis performance of loudspeakers in well-designed professional or home listening spaces cannot be ignored, but it is very much a secondary factor. Later evidence discussed in the 4th edition indicates that the "large space" reflections in stereo and multichannel recordings themselves can perceptually diminish the importance of those in the listening space, as well as - horrors - make certain loudspeaker flaws more difficult to hear.

So, if one is in either of those direct-sound dominant circumstances, and one wants to modify the spectral shape to emphasize something in a mix that is being worked on - for example, the much debated Yamaha NS-10M - it is not necessary to substitute loudspeakers. With modern equalization capabilities the direct sound of any loudspeaker can be imposed on that of a timbrally neutral loudspeaker. Buy one very, very good loudspeaker and turn it into any number of "coloured" versions at the push of an icon, returning to neutral at the push of another.

But this requires two things: (1) a neutral loudspeaker and (2) anechoic measurements on whatever loudspeaker is to be imitated.

Spinorama data of hundreds of consumer loudspeakers and a few professional ones are available on the internet, many of them from this forum, thanks to Amir. Intelligent inspection of these data is very informative. I wish the professional side of our industry exhibited as much enthusiasm for hard data as seems to be happening on the consumer side. The ANSI CTA 2034 loudspeaker measurement standard is leagues ahead of anything I have seen from the pro side of the industry - a few individual manufacturers excepted and praised!

It is possible from comprehensive and accurate anechoic measurements alone to identify loudspeakers which, if put into a double-blind multiple comparison test will likely yield a statistical tie. There are subtle differences - the interaction with individual programs causes small variations in sound quality ratings. Is this "good enough"? Perfection may never get to the level of electronics or the best microphones, but we are definitely getting close.

If everybody used such loudspeakers on the pro and consumer side would this eliminate the circle of confusion? I think it just might be a step in the right direction.

The remaining variable is the interaction with small room acoustics at low frequencies. Until these are controlled, and we know a lot about doing that, opinions are up for grabs.

Are loudspeakers still the weakest link? :)
 
Are loudspeakers still the weakest link? :)
No, and to be honest I don't believe they have been in a long time. Like, probably since the mid 90s to early 2000s we were getting to the point of "the speakers are getting out of the way". There's a reason Genelec 1031s became such a commonplace speaker - they're smallish, damn near indestructible, and neutral enough.

Most studio designs are, to be blunt about it, shambolic. Almost never enough LF absorption, haphazardly placed diffusion that doesn't actually do anything useful, and almost never enough attenuation of ceiling or side wall reflections. Let's not talk about the bad soffits that aren't built rigidly enough (for god's sake, don't build them out of plywood! concrete block only!) and resonate or aren't sized right for the speakers...
 
I continue to use a small pair of bass limited speakers to judge mix translation, to check if the bass is overwhelming for smaller systems or especially if it is disappearing due to a small system's limited bass response. Simple frequency response tailoring of a quality flat monitor system will not properly emulate a small speaker's overload characteristics. Nor will it emulate how small bass limited speakers (and cell phones) may distort in the low end. Both of these factors however often need to be taken into account when making a mix that will translate to bass limited systems. Often, for non-acoustic projects, a mix engineer (me included) will actually add harmonics to lower frequency instrumentation so small bass limited systems will not cause low frequency instrumentation to simply disappear in the mix. An average approximation of a bass limited system's distortion is often needed to better judge how much harmonic content will need to be added.
 
These are good points, and are additional considerations to those I mentioned in my post. For the 3rd edition I created a simple graphic addressing part of the concern.

1751900159259.png


There was a certain "tongue in cheek" quality to the graphic, but also some basic truths. You comment on distortion that can arise in small speakers. These days most of them are active, with integrated EQ and overload protection. As a result they often sound relatively neutral within their bandwidth, until the volume is turned up and then a protection algorithm kicks in and the loudness of the low bass is limited to prevent distortion and damage. So, for these speakers there is a "non-linear" linear (frequency response) distortion. It is impossible to predict what will be heard.

The addition of harmonics to bass frequencies is not a bad idea, and commercial algorithms exist to do this. It is based on the "missing fundamental" perceptual phenomenon, wherein the pitch is maintained by the harmonics even if the fundamental is missing, along with any physical accompaniments.
 
Barefoot has something like this; shame their speakers are hot garbage otherwise.
 
I remember listening to Barefoot's prototype speakers and having a delightful conversation with him at an AES just before his products entered the market. Barefoot's switch to emulate smaller less capable speakers has merit but lacks exactly what my above post discusses. The frequency reducing contour does not also emulate the artifacts created by a less capable speaker system including when it is pushed to its limitations in bass response and volume capacity. I use small unpowered speakers (sometimes more than one type) which are not "active with integrated EQ and overload protection" to approximate a limited reproduction system commonly used in the community as a realistic check for translation issues.

I do sometimes use plugins with algorithms based on the "missing fundamental" perceptual phenomenon, which, by providing certain higher frequency harmonics, fool our auditory system into clearly perceiving a fundamental frequency that may actually be quite faint or not there at all. Plugins of this sort can make us perceive that a bass limited system is actually reproducing bass significantly below its cutoff point. I also find that adding small amounts of lower order distortion can, when appropriate to the mix, often achieve similar results, since the process (creating harmonically related frequencies above the fundamental) is actually somewhat similar.
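For anyone who wants to see the "missing fundamental" idea in numbers, here is a rough sketch. It builds a tone from the 2nd, 3rd and 4th harmonics of 50 Hz with no 50 Hz component at all, then shows that the waveform still repeats 50 times a second. The sample rate, the harmonic choice and the function names are assumptions made for the demo, and autocorrelation is only a crude stand-in for how pitch is actually perceived; it is not how any particular plugin works.

```python
import math

SAMPLE_RATE = 8000   # Hz, assumed for the demo
FUNDAMENTAL = 50.0   # Hz, a note below a small speaker's cutoff


def harmonics_only(n_samples, harmonics=(2, 3, 4)):
    """Sum of equal-level harmonics of FUNDAMENTAL, with the fundamental left out."""
    return [sum(math.sin(2 * math.pi * FUNDAMENTAL * h * n / SAMPLE_RATE)
                for h in harmonics)
            for n in range(n_samples)]


def component_level(signal, freq):
    """Level of one frequency component (single-bin DFT); a unit sine gives 1.0."""
    re = sum(x * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


def repetition_rate(signal, window=480, min_hz=40.0, max_hz=400.0):
    """Rate at which the waveform repeats, from the strongest autocorrelation lag."""
    min_lag = int(SAMPLE_RATE / max_hz)
    max_lag = int(SAMPLE_RATE / min_hz)
    if len(signal) < window + max_lag:
        raise ValueError("signal too short for the requested lag range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(signal[n] * signal[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


tone = harmonics_only(800)                       # 0.1 s of signal
level_at_50 = component_level(tone, 50.0)        # essentially zero
level_at_100 = component_level(tone, 100.0)      # about 1.0
implied_pitch = repetition_rate(tone)            # about 50 Hz
```

The point of the sketch is only that the harmonics carry the 50 Hz periodicity by themselves, which is why a speaker that cannot reproduce 50 Hz can still convey the note.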
 
Barefoot has something like this; shame their speakers are hot garbage otherwise.
The recent upper series Focal 3-way monitors also have a switch that turns them into a smaller 2-way.
 
The recent upper series Focal 3-way monitors also have a switch that turns them into a smaller 2-way.
Again, rather a shame that they aren't very good.
 
Again, rather a shame that they aren't very good.
I could live with them for home music listening, much as with other monitors that also don't measure perfectly, like ATC.
 
I could live with them for home music listening, much as with other monitors that also don't measure perfectly, like ATC.
A lot more hash and resonances present than on any ATC I've ever seen, tbh, but yeah - it's not horrible, I just have not had good experiences using them where I have with ATCs.
 
I continue to use a small pair of bass limited speakers to judge mix translation, to check if the bass is overwhelming for smaller systems or especially if it is disappearing due to a small system's limited bass response.
Would it not be a viable, possibly better option to simply perform a spectral analysis of the track to check the amount of and distribution of low-frequency energy? The bass response of "smaller" loudspeakers can be quite variable, as can be the size of these loudspeakers within this category.
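As a rough idea of what such a spectral check could look like, the sketch below measures what share of a signal's energy sits at or below a chosen cutoff. Everything here is an assumption for illustration: the function name, the 80 Hz cutoff, and the two-tone test signal standing in for a track. A plain DFT is used to keep it self-contained; a real tool would use an FFT on actual audio.

```python
import cmath
import math


def band_energy_share(samples, sample_rate, cutoff_hz):
    """Fraction of signal energy in DFT bins at or below cutoff_hz (DC excluded)."""
    n = len(samples)
    low = total = 0.0
    for k in range(1, n // 2 + 1):
        bin_value = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(samples))
        energy = abs(bin_value) ** 2
        total += energy
        if k * sample_rate / n <= cutoff_hz:
            low += energy
    return low / total if total else 0.0


# Hypothetical "track": a strong 40 Hz tone plus a quieter 1 kHz tone.
SAMPLE_RATE = 4096
N = 512  # gives 8 Hz bins, so both tones land exactly on a bin
track = [math.sin(2 * math.pi * 40 * i / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
         for i in range(N)]

share_below_80 = band_energy_share(track, SAMPLE_RATE, 80.0)
```

Here about 80% of the energy is below 80 Hz, which says how much is at risk on a bass limited system but not how the result will sound on one.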
Simple frequency response tailoring of a quality flat monitor system will not properly emulate a small speaker's overload characteristics.
In your experience, after you have crafted a mix on "larger" speakers that don't overload so easily at an 85dB or so listening SPL at a 2–3m listening distance, does the mix overload the smaller speakers? If the artistic intent is to have stronger low-frequency content in the music, how does one go about changing the artist's intent without getting into trouble?
Nor will it emulate how small bass limited speakers (and cell phones) may distort in the low end.
In your experience, how common is it for a mix to cause small bass-limited speakers to distort? What is the lower frequency limit of bass content that is allowed through in order not to cause such distortion in the low end?
Both of these factors however often need to be taken into account when making a mix that will translate to bass limited systems.
The modification of a piece of music in order to "tailor" it to the perceived "needs" of less than stellar bass-limited systems seems to be an artistic nightmare for music creators.

Out of all the artists you have worked with, what percentage have actually asked to have this type of distortion applied to their work?

I fully comprehend that removing unwanted/undesirable low-frequency energy/artefacts from a track can be a necessity. However, adjusting a track to suit indeterminate so-called bass-limited loudspeakers appears to be a solution in search of a problem. Prima facie, the approach as outlined would likely serve only to destroy the intent of a composition like Disc Wars (TRON Legacy) by Daft Punk.

1751937166976.png
 
In your experience, after you have crafted a mix on "larger" speakers that don't overload so easily at an 85dB or so listening SPL at a 2–3m listening distance, does the mix overload the smaller speakers? If the artistic intent is to have stronger low-frequency content in the music, how does one go about changing the artist's intent without getting into trouble?
Generally speaking this is pretty easy and it just involves filtering out the absolute lowest content on individual elements - however one has to be careful doing this, some genres really do need that bottom octave (see: hip hop, electronic, etc).
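A minimal sketch of that kind of filtering, assuming a gentle first-order (6 dB/octave) high-pass on a single element; the 30 Hz cutoff, the sample rate and the test tones are made up for the example, and real mixes typically use steeper filters chosen by ear.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz):
    """First-order (6 dB/octave) high-pass filter, discrete RC form."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def settled_peak(samples):
    """Peak level over the second half, after the filter has settled."""
    return max(abs(x) for x in samples[len(samples) // 2:])


SAMPLE_RATE = 8000
rumble = [math.sin(2 * math.pi * 10 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
bass_note = [math.sin(2 * math.pi * 60 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

rumble_after = settled_peak(high_pass(rumble, SAMPLE_RATE, 30.0))      # heavily reduced
bass_note_after = settled_peak(high_pass(bass_note, SAMPLE_RATE, 30.0))  # mostly intact
```

With these numbers the 10 Hz rumble drops to roughly a third of its level while the 60 Hz note keeps most of its level, which is the trade-off being described: set the cutoff too high and the bottom octave those genres rely on goes with it.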
 
Thanks Witwald for your interest in my recent contributions here. The music mixing process is not something that can be described in just a few paragraphs. Mixing a music track for release is kind of like making a good soup for a large party of people. You are usually trying to make something that will be pleasing to everyone. It should not be too spicy for some yet not too bland for others. People have different tastes just like they have different music reproduction systems and musical expectations.

In making a recording and mix, the overriding goal for nonacoustic music on commercial labels is usually for the mix to translate well (and sound great) on as many different systems as possible, from top audiophile systems to simple clock radios.

Engineers use a variety of tricks to make this more likely to happen. I use small speaker systems, as do several others, as a reference check on how a mix translates to less capable systems. Yes, some others may use some sort of spectral analysis; however, many experienced engineers already do something of that sort by ear during the mixing process and find "a real world test" with a small speaker to be a faster, more direct last check for compatibility. It is also a much easier process for anyone else in the studio to hear and understand.

A mix engineer will of course adjust his / her mixing goals to the talent's or production team's intentions. However within that framework one of the most important goals again remains that the mix needs to translate well. The bass processing technique described in my previous post is mainly used when a low frequency instrument is hard to hear in a mix over bass limited reproduction systems because it lacks overtones above their cutoff frequency. This bass processing technique (to avoid misunderstanding) is not usually added to a whole mix but only added to a low frequency instrument so it can be perceived clearly on as large a variety of systems as possible.

As we have seen here on ASR, all speakers have a certain level of distortion over their full range, which generally increases in the lower frequencies. In sound recording and reproduction equipment the goal is usually to lower distortion as much as is possible and practical. However, controlled, tastefully added distortion can also have positive effects in nonacoustic music by helping some electronic instruments sound better and/or be heard more clearly in a mix, since their artificially generated sounds often lack the overtone harmonic complexity of acoustic instrumentation. The world famous Minimoog synthesizer produces especially rich sound because its analog filter is overdriven by design (although apparently this was an accident initially).

The real challenge for mix compatibility of course is that there is such a broad range of reproduction systems a mix will likely be played on that it is difficult to decide what kind of system to focus on most. The above referenced Disc Wars (TRON Legacy) by Daft Punk was conceived as a soundtrack and originally mixed to sound great on a large movie sound system with dedicated subwoofers, which is likely why it has such wonderfully extended low frequencies. On small basic sound reproduction systems the scale is significantly reduced because the lowest frequencies cannot be heard well, however there is still enough musical interest to make this track sound good even on these smaller bass compromised systems.
 
Sorry, I live in Central Texas and we have a family ranch on a river near Kerr County, we have been spared of nearly all of the calamity my neighbors have suffered, everyone including animals are safe and sound. We have some heavy equipment and our neighbors needed some help so I have been coordinating and prioritizing where to get who and what where the last few days.

Haven’t read much over the last few days, but from the couple of posts I have seen, I think we need to take a short step back and see where we really are on the CoC. As the thread title says: CoC, What Do We Know, and What DON’T WE KNOW.

First, the Circle of Confusion on the production part of the circle (call it 12 and 3 o’clock for convenience) is a hypothesis in terms of science/engineering. It is clearly described as such in all 3 current editions of Dr. Toole’s book and in Dr. Olive’s writings/blog.

There are no direct studies, and therefore no references, one way or the other on the impact on the final product of (to be as precise as possible) using natural sounding “monitors” vs. unnatural monitors in recording or mixing. If I am incorrect on this, please let me know and cite the direct study. I will re-reference the indirect studies I think bear on the subject in the next post.

That’s the part we don’t know.

Contrast that with the end user/consumer listening part of the CoC (6 and 9 o’clock) which has, as Dr. Toole says, over 5 decades of blind testing, and dozens of peer reviewed research papers behind it.

That’s what we do know about the circle of confusion, one half of it is well known.

Dr. Toole, and Dr. Olive, both very brilliant scientists, have hypothesized (read the language they use when discussing recording/production portion of the circle) that the same benefit of having neutral studio monitors would be realized as it does for end users.

As we all remember, a hypothesis (educated guess) is the 3rd step (after making observations and asking questions) in the scientific method. From there you design a valid, controlled experiment (the most difficult part of the entire process), conduct the experiment, record results, and sometimes you win a prize.

Dr. Olive, who as usual has been very generous with his time, has given me a couple of the indirect studies, which I have posted, but will post again. I will also repeat how I think we can get closer to knowing more from Dr. Corey’s work.

Part 2, on indirect studies in the CoC and practical considerations of how recording and mixing are done, will follow in the next post.
 
Not directly connected to the monitors as such, but here's a pretty interesting video where we hear how 8 different engineers mixed and mastered the same track (independently of each other).

 
Not directly connected to the monitors as such, but here's a pretty interesting video where we hear how 8 different engineers mixed the same track (independently of each other).

Love your contributions, and have one of your speakers on my list to buy in the future - but I have to say, can't watch it. Ouch.
 
Love your contributions, and have one of your speakers on my list to buy in the future - but I have to say, can't watch it. Ouch.

Thanks, but why can't you watch it?
 