
Stereo vs mono - what is the ideal frequency response?

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,690
Likes
6,013
Location
Berlin, Germany
The basic trinaural equations are:
L' = L - R/2
R' = R - L/2
C = (L+R)/2
Setup is +-45° for the sides, not +-30° as with normal 2-speaker projection.
Therefore, except for content exactly panned 50% to the side, all three speakers get signal at any given time. The center image (mono) content is 6dB louder on the center than on the sides. L- or R-only content gets virtualized, with the center rendering one part of the signal and the sides working in push-pull fashion generating the other part, creating a larger stable wave-field around the head that is less prone to collapse under lateral and rotational movement of the head.
There are comb filter effects, too (beginning right in all three speaker signals with time-difference based -- "AB", vs "XY" -- stereo content), but they are perceptually more benign, actually increasing spaciousness by creating HRTF-triggered pseudo-3D phantom source localizations, very noticeable on reverb tails etc.
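For anyone who wants to check the numbers, here is a minimal sketch of the static matrix above applied to single sample values. The function names are made up for illustration, and this is plain arithmetic on the three equations, not anyone's actual processor code.

```python
import math

def trinaural(left, right):
    """Apply the static 2-to-3 matrix to one stereo sample pair.

    Returns (L', C, R') following the equations quoted in the post.
    """
    left_out = left - right / 2
    right_out = right - left / 2
    center = (left + right) / 2
    return left_out, center, right_out

def level_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(abs(ratio))

# Mono (center-panned) content: center gets twice the amplitude of each side.
mono = trinaural(1.0, 1.0)
# Hard-left content: center carries half, right side runs in opposite polarity.
hard_left = trinaural(1.0, 0.0)
# Content panned "50% to the side" (R at half the amplitude of L):
# the right speaker output cancels to zero.
half_left = trinaural(1.0, 0.5)

print(mono, hard_left, half_left)
print(round(level_db(mono[1] / mono[0]), 2), "dB center vs side for mono content")
```

The hard-left case shows the push-pull behaviour: the right speaker is fed an inverted, half-amplitude copy of the left signal.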
 

napilopez

Major Contributor
Forum Donor
Joined
Oct 17, 2018
Messages
2,109
Likes
8,420
Location
NYC
Specific EQ to compensate for the stereo-crosstalk dip will perhaps be a double-edged sword unless you assume that the head and/or the listeners remain in the same exact position at all times. That crosstalk-dip will change when you move your head around and be almost gone when you move off to the sides - where the mono-sound timbre will start to dominate.
The solution to this is sound from reflections filling in the gap, or a center speaker.

@napilopez was pleasantly surprised by the JBL Classic L100 that has this peak around 2k and theorized about that being the cause for his enjoyment of the speaker. Perhaps he can join in here and comment on how those speakers sound when moving out of the main LP.

I have been summoned!

The L100 Classics always sounded a little forward to me (which I liked), and definitely seemed more colored when outside the sweet spot. However, I always sit in the same spot of my couch when I care about the sound, so it didn't matter to me. And to @March Audio's point about hard-panned music not being subject to the crosstalk dip, while true, I definitely focus on the sound of the center image more than anything else.

For me, and I assume a lot of audiophiles, the quality of the center image is part of the fundamental 'magic' of stereo. I've tried on and off to use center speakers, but it just becomes boring to me. If I can see the center speaker creating center sounds, the magic is gone. It's too hard for my little brain to not associate the center sound with a box placed right in front of me.

Anyway, it's definitely a complicated issue to perfect such a fundamentally flawed playback method as stereo, and I too wonder about how well such a dip gets compensated in the mix. Ultimately I think it's pretty obvious that speakers should be flattish and any compensation that needs to happen can happen on the user's end, as the location of the dip will vary based on positioning and its amplitude based on the intensity of sidewall reflections.

But personally, I know I prefer the area around 2kHz to be a little elevated. Anecdotally speaking, I prefer adding a medium-low Q rise of 1 or 2 dB around 2kHz to almost every stereo pair. The D&D 8C and Neumann KH80, the two flattest speakers I've measured, clearly sound better to me this way. In the Neumann's case, this applied to both a living room setup and my desk setup. However, this same boost sounds a bit worse when listening in mono, though it's a subtle enough tweak that I don't find it too offensive.
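As a rough illustration of what such a tweak looks like as a digital filter, here is a sketch of a single peaking biquad using the widely circulated Audio EQ Cookbook (RBJ) formulas. The specific values are assumptions picked for the example: +1.5 dB as the midpoint of "1 or 2 dB", Q = 0.7 as a stand-in for "medium-low Q", and a 48 kHz sample rate. The function names are hypothetical.

```python
import cmath
import math

def peaking_biquad(f0, gain_db, q, fs=48000.0):
    """Return normalized (b, a) coefficients of an RBJ-style peaking EQ."""
    amp = 10 ** (gain_db / 40)          # square root of the linear peak gain
    w0 = 2 * math.pi * f0 / fs          # center frequency in rad/sample
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    # Normalize so that a[0] == 1, as most filter routines expect.
    return [x / a[0] for x in b], [x / a[0] for x in a]

def gain_db_at(b, a, freq, fs=48000.0):
    """Magnitude response of the biquad at one frequency, in dB."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq / fs)   # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20 * math.log10(abs(h))

# Assumed example values: 2 kHz center, +1.5 dB, Q = 0.7.
b, a = peaking_biquad(2000.0, 1.5, 0.7)
for freq in (200.0, 1000.0, 2000.0, 4000.0, 10000.0):
    print(f"{freq:7.0f} Hz: {gain_db_at(b, a, freq):+.2f} dB")
```

With a Q this low the boost spreads over a couple of octaves, so it acts more like a gentle tilt of the upper mids than a narrow fill for one notch.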

The way I see it, as almost no speaker is perfectly flat anyway (at least not in both the direct and off-axis), I'd rather the deviations consist of a little boost in the crosstalk dip region. Equally important, I find speakers with built-in FR dips around this region to universally sound a little dull. There are very few speakers I feel have the right presence in the upper mids, and the L100 was one of them, even though I knew their FR was flawed (but also not terrible! Preference score ~ 7.3 w/sub).

But your mileage may vary. My individual preferences are of course not science, and I never recommend anyone try to boost this region unless they feel the sound is recessed themselves.
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,523
Likes
3,745
Location
Princeton, Texas
Occasionally they made tests in stereo and there was no difference in which speakers were preferred.

True, but in at least one such test (Rega vs KEF vs Quad), the relative score of the lowest-ranking speaker (the narrow-pattern, dipolar Quad) moved up considerably in stereo. This may be related to its narrow pattern, or to its dipole pattern, or both, or something else. The change in the Quad's score wasn't enough to change its ranking (though it apparently could have against different competitors), but it narrowed the gap enough to raise this question: Was the Quad unduly penalized by mono listening?

As an extreme example, suppose a Polk SDA (Stereo Dimensional Array) speaker was auditioned in mono. Would mono listening adequately predict its performance when used the way it was designed to be used - in stereo?

Toole reports that a study by Klippel found spatial quality to account for 50% of perceived accuracy, and 70% of listener preference. Can spatial quality be accurately evaluated in mono? I think it can if we're comparing apples to apples, but I don't think it can if we're comparing Polks to KEFs.
 

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,317
Location
Albany Western Australia
We are going in circles in the discussion without serious experiments being made. We are also talking about on-axis deviations in the range of +/-1,5 dB or +/-1 dB. There are rather few speakers that match those numbers; a few of them vary with dips in the 2-4 kHz range, others with peaks in that range. The question, then, is whether there is a preference for one or the other, all other things being equal. It is a simple experiment in theory: compare the same speaker with the Shirley and inverted Shirley EQ curves, both conditions within +/- 1,5 dB, in stereo. Which one is preferred?

So it's all rather moot then.

You are ignoring all the reasons explained why your proposal simply doesn't work. This has nothing to do with preference. If you try to fix this specific issue in the way you propose, you just screw up other things.
 

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,317
Location
Albany Western Australia
Oh, great. The included paper in the second post suggests that it may not be a simple matter of imposing the inverted Shirley curve:

"2.3. Equalization. Another way to fix the non-flat magnitude response caused by acoustic crosstalk is to apply inverse filters to the left and right signals [22]. However, the frequencies of the comb filter notches vary greatly depending on the relative positions of the speakers and listener. For example, the cancellation frequencies increase as the angle subtended by the speakers becomes narrower (such as when the listener moves further back) [9]. In addition, as the listener moves to the side and is no longer equidistant from the speakers, the notches move closer together and become different for each ear [12]. Without a good estimate of the relative positions, it would be impossible to accurately equalize the effects of the crosstalk."

Here's the reference for 22: https://www.pearl-hifi.com/06_Lit_Archive/02_PEARL_Arch/Vol_16/Sec_53/AES 109/00042.pdf

This is one of the points I have been trying but failing to get across. You can't just invert any central image anomaly. It's simply not possible to fix the issue with a single EQ. It should not and cannot be addressed by changing the fundamental frequency response of the speaker.
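To put rough numbers on how the notch moves with geometry, here is a back-of-envelope sketch. It is a deliberately crude model and not taken from the quoted paper: two point "ears" in free field, plane waves, no head shadowing, a centered listener, and an assumed ear spacing of 0.175 m. For a center-panned signal each ear gets the near speaker plus a delayed copy from the far speaker, and the first cancellation sits where that delay equals half a period.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, room temperature
EAR_SPACING = 0.175      # m, assumed value for illustration only

def first_notch_hz(half_angle_deg, ear_spacing=EAR_SPACING):
    """First crosstalk cancellation frequency for a centered listener.

    half_angle_deg is the angle of each speaker from straight ahead
    (30 for a conventional stereo triangle).
    """
    # Extra path to the far ear, plane-wave approximation without a head.
    delay = ear_spacing * math.sin(math.radians(half_angle_deg)) / SPEED_OF_SOUND
    # Signal plus a copy delayed by `delay` cancels at odd multiples of 1/(2*delay).
    return 1.0 / (2.0 * delay)

for angle in (45, 30, 20, 10):
    print(f"+-{angle:2d} deg: first notch near {first_notch_hz(angle):6.0f} Hz")
```

Under these assumptions the +-30° case lands near 2 kHz, and narrowing the angle pushes the notch upward, which is the trend the quoted passage describes. A real head adds shadowing and diffraction, so actual notches are shallower and shifted, and the off-center case (different notches at each ear) is not modeled here at all.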
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,690
Likes
6,013
Location
Berlin, Germany
For me, and I assume a lot of audiophiles, the quality of the center image is part of the fundamental 'magic' of stereo. I've tried on and off to use center speakers, but it just becomes boring to me. If I can see the center speaker creating center sounds, the magic is gone. It's too hard for my little brain to not associate the center sound with a box placed right in front of me.
Try harder ;-)
I can fully acknowledge that using a discrete center image speaker is somewhat distracting at first; it takes some time to overcome the accustomed way we have listened for years or even decades. But after a while, you will wonder why we ever found phantom center at all convincing when switching back to normal 2-speaker stereo with its rather diffuse and ill-defined center image, in direct comparison.
It is true that the center (and close-to-center) images will be rendered a bit smaller and more pin-point, which makes the stereo impression a bit flatter, and it also makes you more aware of an annoying flaw in many recordings, which is panning all center content smack-on dead center. Some recording engineers seem to use the panpot as a switch, L, C, R with nothing in between, cluttering up the center once it is rendered faithfully.

As for the cancellation notch of 2-speaker stereo with its timbre and associated image height position problems (center is rendered at a higher apparent source position than the same signal coming from a single side speaker), you simply can't fix it, neither in the recording (with individual EQ of the center content) nor in the playback with a global EQ. But I agree that it might be a better trade-off to have a little bit too much energy in the 2kHz region rather than less than flat... then again, this depends so much on radiation pattern, reflections and crossover design that it should hardly be seen as a universal fix.
 
OP
Thomas_A

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
So it's all rather moot then.

I think it is an interesting matter not only from a scientific perspective but also because frequency response deviations within +/- 1,5 dB are audible.

You are ignoring all the reasons explained why your proposal simply doesn't work. This has nothing to do with preference. If you try to fix this specific issue in the way you propose, you just screw up other things.

Stereo is screwed up from the beginning, implying that a stereo speaker within +/- 0,2 dB is a compromise, and so is a stereo speaker within +/-1,5 dB. As pointed out several times, there is no experiment yet published (that I know of) that has dealt with preference rating for a linear vs "stereo optimised" speaker. The point that the curve changes when you alter the speaker-listener angle is obvious, but it does not change the question per se.
 
OP
Thomas_A

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
Try harder ;-)
I can fully acknowledge that using a discrete center image speaker is somewhat distracting at first; it takes some time to overcome the accustomed way we have listened for years or even decades. But after a while, you will wonder why we ever found phantom center at all convincing when switching back to normal 2-speaker stereo with its rather diffuse and ill-defined center image, in direct comparison.
It is true that the center (and close-to-center) images will be rendered a bit smaller and more pin-point, which makes the stereo impression a bit flatter, and it also makes you more aware of an annoying flaw in many recordings, which is panning all center content smack-on dead center. Some recording engineers seem to use the panpot as a switch, L, C, R with nothing in between, cluttering up the center once it is rendered faithfully.

As for the cancellation notch of 2-speaker stereo with its timbre and associated image height position problems (center is rendered at a higher apparent source position than the same signal coming from a single side speaker), you simply can't fix it, neither in the recording (with individual EQ of the center content) nor in the playback with a global EQ. But I agree that it might be a better trade-off to have a little bit too much energy in the 2kHz region rather than less than flat... then again, this depends so much on radiation pattern, reflections and crossover design that it should hardly be seen as a universal fix.

If we go to the height/rainbow effect of the center image, it can be preferred in certain instances. I certainly find it a pleasant error, especially when listening/looking at my TV set which is at a higher position than the speaker centers.

And for the 1-2 kHz region; my preference is that this region should not be lower in level than the 2-5 kHz region. I prefer about 2 dB lower level in the 2-5 kHz region compared to 1-2 kHz. The opposite, i.e. a 2 dB rise in the 2-5 kHz region, sounds harsh, bright and forward for some music content, IMO.
 

Pluto

Addicted to Fun and Learning
Forum Donor
Joined
Sep 2, 2018
Messages
990
Likes
1,631
Location
Harrow, UK
you will wonder why we ever found phantom center at all convincing when switching back to normal 2-speaker stereo with its rather diffuse and ill-defined center image
I cannot agree with this one iota. Obviously I cannot possibly know of the speakers you use or the rooms in which you listen, but I have always used speakers that excel in providing a totally solid central image, solid to the degree that there might well be a speaker present in the centre, and indeed at any of the infinity of points that exist between the actual speakers.

I suspect the ability of a speaker/room combination to image well is one of those things that has fallen off the audiophile agenda to a large extent as there is no quick’n’easy spending solution such as “buy these new cables at $1k and all your problems will disappear”. I am not an habitué of audio shows so I don't get to hear lots of different brands in similarly poor hotel-room environments but from such events that I have attended, it is obvious that some speakers are quite clearly vastly better than others at the business of forming a good solid ‘phantom’ image. This is actually one of the reasons that I so value simple male speech as a speaker evaluation method.
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,690
Likes
6,013
Location
Berlin, Germany
I cannot agree with this one iota. Obviously I cannot possibly know of the speakers you use or the rooms in which you listen, but I have always used speakers that excel in providing a totally solid central image, solid to the degree that there might well be a speaker present in the centre, and indeed at any of the infinity of points that exist between the actual speakers.
While I understand one can certainly have a feeling of a rock-solid phantom center (or complete mono playback) with good speakers in a well treated and symmetrical room, this instantly pales in a direct comparison with a real center source as opposed to a synthesized one. Not easy to try in real-time for most people unless you have three identical speakers and also enough amp channels and a source switch and all that, but when you do it is extremely convincing. The increased overall stability of the soundstage, with a way more relaxed sweet-spot requirement (not the normal "head-in-a-vise" type) is hard to beat and outweighs the few drawbacks of trinaural/trifield for me at least. Of course true multichannel is better when properly produced, and true wave-field synthesis is even better than multichannel....
 

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,272
Likes
9,786
Location
NYC
If we go to the height/rainbow effect of the center image, it can be a preferred in certain instances. I certainly find it a pleasant error, especially when listening/looking at my TV set which is at a higher position than the speaker centers.
Do you not think that the presence of the TV image is as biasing about center image audio as a center speaker?
I cannot agree with this one iota. Obviously I cannot possibly know of the speakers you use or the rooms in which you listen, but I have always used speakers that excel in providing a totally solid central image, solid to the degree that there might well be a speaker present in the centre, and indeed at any of the infinity of points that exist between the actual speakers.
I have a center speaker in place all the time and, yes, I do sometimes check to see whether it is reproducing a signal. On the other hand, this..........................
This is actually one of the reasons that I so value simple male speech as a speaker evaluation method.
Agreed. Tests with this source consistently confirm that having a discrete center channel (or just mono) is superior to even the best stereo center fill.
 
OP
Thomas_A

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
Do you not think that the presence of the TV image is as biasing about center image audio as a center speaker?

Yes of course. But even without TV, the image of the phantom center is elevated above speaker height.
 

restorer-john

Grand Contributor
Joined
Mar 1, 2018
Messages
12,579
Likes
38,278
Location
Gold Coast, Queensland, Australia
Agreed. Tests with this source consistently confirm that having a discrete center channel (or just mono) is superior to even the best stereo center fill.

What about two male speakers, side by side, in the centre, speaking alternately? How can a single mono speaker reproduce that with any accuracy? They will be on top of each other. ;)

(I do understand what you are saying, but the classic male voice single speaker realism test only works for one voice in one place wouldn't you say?)
 

Senior NEET Engineer

Addicted to Fun and Learning
Joined
Jan 6, 2020
Messages
538
Likes
591
Location
San Diego
Simple (static) re-matrixing 2-ch content to 3 channels with a "trinaural" or "trifield" matrix solves the timbre problem very elegantly and fixes some other issues as well. It is not fully compatible with all recording techniques, notably HRTF-based phantom-source projection doesn't work anymore as the wave-field situation has changed.
Image sources are created in a better or more benign way, the center content is more discrete and the (usually less important) side content gets more virtualized, sort of.

The timbre problem is already solved when the album is mixed.
 
OP
Thomas_A

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
The timbre problem is already solved when the album is mixed.

In such a case, why did the Harman blind tests of stereo recordings using a single mono speaker not notice this?
 

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,272
Likes
9,786
Location
NYC
What about two male speakers, side by side, in the centre, speaking alternately? How can a single mono speaker reproduce that with any accuracy? They will be on top of each other.
Granted. With two male speakers, side-by-side, each would be co-positioned with a mono speaker but each would be a phantom image between the center channel and its respective main channel in an L/C/R system. One could conceive of filling the entire space between L and R with an array of discrete channels to minimize any phantom imaging.

The value of the mono test with a single voice is the 1:1 mapping. It is a test and not a recommendation for mono systems. More practical is an L/C/R arrangement with a discrete center channel.
 

Pluto

Addicted to Fun and Learning
Forum Donor
Joined
Sep 2, 2018
Messages
990
Likes
1,631
Location
Harrow, UK
I have to ask – when positional mapping of, say, a single voice does not correspond exactly to full L, C or R, what subjective benefit does the listener gain from the presence of the C speaker? As I remember from ancient history, the purpose behind the introduction of the centre speaker was to provide a solid placement of main dialogue for listeners not in the hot seat. That would imply that, the moment a voice (or any point source) relies upon two or three speakers to localize it within the available field, any benefit of the centre speaker is lost.

Personally, I believe the very idea of tying the main dialogue to the centre has done film sound a huge creative disservice.
 
OP
Thomas_A

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,422
Likes
2,407
Location
Sweden
I have to ask – when positional mapping of, say, a single voice does not correspond exactly to full L, C or R, what subjective benefit does the listener gain from the presence of the C speaker? As I remember from ancient history, the purpose behind the introduction of the centre speaker was to provide a solid placement of main dialogue for listeners not in the hot seat. That would imply that, the moment a voice (or any point source) relies upon two or three speakers to localize it within the available field, any benefit of the centre speaker is lost.

Personally, I believe the very idea of tying the main dialogue to the centre has done film sound a huge creative disservice.

With more speakers the phantom images become split up between them, hence more precise localisation.
 

Pluto

Addicted to Fun and Learning
Forum Donor
Joined
Sep 2, 2018
Messages
990
Likes
1,631
Location
Harrow, UK
With more speakers the phantom images become split up between them, hence more precise localisation.
OK, let's think reductio ad absurdum. If you had an infinite number of speakers between L & R, you could localize an infinitely small image (i.e. a perfect point source) with perfect precision. But the moment that image moves, or is anything wider than a pure point source, you are reliant upon a perfect acoustic match between the speakers (and their interface with the room) or the quality will change as the source moves.

In other words, for consistent and convincing quality, fewer speakers are better. I think a variation of the Uncertainty Principle might be at work here. The more accurately you know the position within the field, the less you can rely on its quality... or something like that o_O

But as I've said before, I really don't believe there is a problem localizing a point with two decent speakers in a decent room!
 

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,272
Likes
9,786
Location
NYC
I have to ask – when positional mapping of, say, a single voice does not correspond exactly to full L, C or R, what subjective benefit does the listener gain from the presence of the C speaker?
Except for pure 2-microphone stereo recordings, having a center channel feed/speaker removes from the L/R speakers most of the center information that would otherwise be mixed into them. That mixing is never perfect and the most obvious improvement on going from 2-channel to 3-channel is usually an increase in soundstage width and clarity due to the deletion of that material. Improvement in center voicing is more subtle, imho. This is easily demonstrated with almost any of the RCA Living Stereo SACDs by comparing the stereo tracks with the 3-channel tracks, all derived from the original master tapes.
With more speakers the phantom images become split up between them, hence more precise localisation.
Yeah, that too.
But as I've said before, I really don't believe there is a problem localizing a point with two decent speakers in a decent room!
Sure, fine for localizing it. Voicing is something else.
 