
Sonic impact of downmixing stereo recordings to mono

I don't perceive any definite evidence favoring your hypothesis over Toole's "relatively constant, or at least smoothly changing directivity."
Directivity has become a cornerstone for all speakers, but without blind-tested evidence of its relevance. Laughing out loud.
Just kidding, the issue is resolved by just thinking.

The stereo enthusiast logically will take a seat, head straight up at the right position, and will stay there in this posture for the full hour of a CD playback, say it's some Beethoven, but Wagner will take longer.

What is the relevance of the directivity then? Early, direct reflections are to be avoided (don't match phantom sources in the stereo panorama). The diffuse field does not depend on the details, that's why it is called diffuse. We have direct and diffuse, the in betweens, well, those are not at all defined.

What counts in regard to directivity is the grand total of sound power output in all directions. In which direction there is more or less is logically irrelevant! It may have nasty wiggles over all frequencies, beaming here and there, and what more.

That's the output side. What of the off-axis sound reaches the ears is filtered by the room a lot due to the necessarily many reflections to form a diffuse field. The nitpicking with directivity is only as effective as the room reverberation is controlled likewise.

Whoever analysed their room down to the very last dB in ever tighter frequency bands? If, presumably, nobody has, it is an utter waste of time to do that with directivity.

Edit: now I wonder how that relates to 'mono'?
 
Why not do it yourself? What is your own, personal impression, once you chose a proper setup? We should not overly emphasize big science. It always starts with humble little experiments after a thorough thinking process, of course. So, what are you after, actually, the quest.
While I still publish stuff scientifically now and then, audio is not my subject. So it would not be feasible at all. Audio is for me a fun hobby which includes building speakers, measuring and listening. And I am quite satisfied with what I have today, after fine-adjusting crossovers and frequency response over 20-25 years, playing with active and passive solutions. If I do the experiment as suggested blind, who will care? It’s just me listening anyway so statistics will mean nothing. As we know from current published research a basically linear on-axis response, even dispersion has shown good preference. Achieved by using the most sensitive method - listening music tracks played over a single speaker!
 
If I do the experiment as suggested blind, who will care?
You, I assume. O/k, I understand that a public discussion on the subject of undesired side effects of downmixing to mono, or just taking one of the two channels, well, it should be based on common wisdom. It is not, you're right.

As we know from current published research a basically linear on-axis response, even dispersion has shown good preference. Achieved by using the most sensitive method - listening music tracks played over a single speaker!
I agree fully, except for the latter. How do you say in English, the cat chases its own tail? The evaluation rounds at Harman resulted in preferring the objectively "better" speakers, as an engineer sees it: flat, all under control. Checked in mono, the more so. The argumentation resembles that from (traditional?) pharmacy, or medicine in general. The root cause is second, first the cure is to be proven. Only after the systematic analysis is undertaken.

What I'm missing here is the differential analysis of the underlying phenomena. For instance: By how far would we need to de-harmonize the directivity pattern to get an effect on preference? Or, on the very topic, that @Arindal raised lately: is the head's orientation towards the single speaker of some relevance?

On the latter: no, it is not. The timbre at the two eardrums differs vastly with different orientations. But the auditory system corrects for that. Wonders every day!
See (reiterated): section 2/2 of https://hauptmikrofon.de/theile/1980-2_Diss._Theile_englisch.pdf
That doesn't work with stereo as easily (listening in stereo is an acquired skill) ... and maybe that's the root cause for having a more discriminating judgement with a single speaker.
 
Yes :) My father was a sound engineer at SR (Swedish public service radio) and later at SVT (public service TV), and then had his own company. He later became a video editor.
I've worked with several engineers at SR and SVT. They were all very clever and engaging people. Also really straightforward communicators - I always learned something. Sweden has a fantastic public service broadcasting approach.

He told me how they always checked for mono compatibility, as most radio was consumed through small mono radios. And for TV, the old TV sets only had a mono speaker.
Yes it's standard practice. For TVs it's essential. In many ways, the mono output is far more important than the stereo feed for most consumers.
 
I forgot to include the link to the post for the Salon 2 and Array 1400 measurements: https://www.audiosciencereview.com/...easurements-interpretation.27853/#post-966141

I cannot predict what a specific group of listeners would express preference for, but, yes, that was pretty accurately the result of almost every listening test I have done myself or taken part in. For most rooms, a constant directivity in that region (1-5K, if not overly deviating from the band one octave lower) is sufficient. If the indirect sound in the room might become dominant, some people who share the same idea even advocate a higher directivity from 0.5-2K compared to a lower one from 2-5K roughly.
I see. I took your statement "My hypothesis is that mono listening (both single speakers and quasi-mono signals in stereo) favors speakers with imbalanced directivity index, particularly those pumping too much of energy in the 1-2K band into the room while attenuating the 2-5K and higher bands" at its face, but rather, it seems to me that it should have been "My personal experience is..." since I tend to think of a hypothesis as a proposed explanation that allows for testing/prediction, i.e. falsifiability, not necessarily 100% but rather "My hypothesis is that a majority of listeners would rate speakers with imbalanced directivity more highly, etc." Would it be possible for you to demonstrate the results for other participants in the tests, along with the speakers in question and their measurements?
Blauert´s preferred bands were one starting point of this theory, the other being the equivalent loudness for direct vs. diffuse sound as described by Zwicker/Fastl in one of their standard books on psychoacoustics.

The main idea is: If a sound event like a reverb pattern shows an emphasis in the 0.8-2k band, our brain tends to localize it as coming from the rear/sides. If the 2-5K bands dominate over the 0.8-2k, we get the perception ´coming from frontal angles´. For a mono listening setup, the former is presumably less annoying as it softens the directness of a real mono source and spreads the angles from which the reverb subjectively arrives, making the mono speaker ´disappear behind the screen´, as Dr. Toole has put it. The latter is in favor of a stereo setup in an overly lively room as it directs the additional reverb to the frontal listening window, ´hiding it´ behind the direct sound and making it more likely to blend with the uncolored reverb from the recording.
That's interesting. Presumably the relative timing of these spectrally colored reflections would matter with respect to the so-called precedence effect and fusion interval, since truly diffuse sound and reverberant field would seem to be very challenging, if not virtually impossible, to achieve in typical listening rooms.
The latter is also the reason why some loudspeaker manufacturers do this intentionally.
Could you please provide some examples?
The question would be which constant directivity or decreasing d.i. in the 1-5K bands loudspeaker without major flaws they have tested as a competitor, and if this has been done with judging tonality and imaging in stereo as well.
At this point, I rather doubt that anyone is really doing research in stereo speakers, so unfortunately, we're stuck with what we already have, and personal anecdotes don't necessarily constitute evidence, especially when described vaguely without reference to specifics.
I am not aware of such a test, as loudspeakers specifically advertised as achieving CD over a broad band only appeared in any number after 2010. Around this time I had some eye-opening listening events and tests, some controlled and some not, which made me really sensitive to loudspeakers with increasing directivity index.

As mentioned, this is a relatively young concept, as most of the technology to achieve it without major tradeoffs, did not exist before the era of DSP-controlled active speakers.

I encourage everyone to do a comparison between a true constant directivity speaker and one with ´smoothly increasing directivity´ in the 0.5-8K band. I do not see a hard threshold between the two, and there are some speakers with a slight increase in d.i. which one could EQ to satisfaction. I would say a constant plateau between 0.8-5K is the single most important thing, with the neighboring bands not making a step up in d.i. But the moment the index is increasing over several octave-broad bands, particularly if the 3-5K band is already of higher d.i., I become pretty cautious. I have had too many moments of disappointment, as the dull, rear-heavy reverb is also deteriorating other aspects of sound quality in my understanding.
I have also wondered along somewhat similar lines about what I call stair step vs diagonal directivity (https://www.audiosciencereview.com/...-on-readings-of-lokki-bech-toole-et-al.27540/). My own speculation is that perhaps there might be a relatively critical range for frequency (and perhaps phase) in a somewhat lower frequency range, perhaps starting in the low hundreds and going up to the low-to-mid thousands in terms of Hz, but also that there may be different classes of listeners with relative sensitivity and preference to specific attributes (it's interesting to compare concert hall frequency range measurements with inverted DI curve, see https://www.audiosciencereview.com/...cert-hall-acoustics-links-and-excerpts.51487/, also the relative difference in frequency range at fortissimo and the so-called smiley curve). There is an audio critic named Robert E Greene who seems to dislike the smoothly increasing directivity speakers like Kef (and probably Genelec) and calls them "dying swans."

What do you consider to be examples of "true constant directivity speaker"? I tend to think of Genelec and Kef as being diagonal/smoothly changing. D&D 8C, Kii Three, Salon 2 as stair step.
 
I'm a music and audio enthusiast. I listen to music all the time and I have some good gear to listen to it on

Despite this (and despite my time in studios), I've probably spent only about 5% of my entire listening life in the stereo sweet spot. I've heard lots of music in the car, but I'm always a long way to the left or right. Most live bands are not set up to "recreate a 3D image". I've heard lots of music on mono portable devices. My guess is the majority of humanity experience the majority of music in mono or near mono.

This is why every music producer and broadcaster thoroughly checks to confirm the mono mix sounds good - they know that true stereo sweet spot listening is extremely rare.
 
I'm a music and audio enthusiast. I listen to music all the time and I have some good gear to listen to it on

Despite this (and despite my time in studios), I've probably spent only about 5% of my entire listening life in the stereo sweet spot. I've heard lots of music in the car, but I'm always a long way to the left or right. Most live bands are not set up to "recreate a 3D image". I've heard lots of music on mono portable devices. My guess is the majority of humanity experience the majority of music in mono or near mono.

This is why every music producer and broadcaster thoroughly checks to confirm the mono mix sounds good - they know that true stereo sweet spot listening is extremely rare.
I wrote this in the Do you miss Knobs and Buttons thread - sadly a lot of pre-amps do not have L+R, amongst other things, anymore. In the studio it is (was) the button with a lot of wear. For a good reason. Though I have a feeling these days it is not as much a consideration as it was, back then.
 
You, I assume. O/k, I understand that a public discussion on the subject of undesired side effects of downmixing to mono, or just taking one of the two channels, well, it should be based on common wisdom. It is not, you're right.


I agree fully, except for the latter. How do you say in English, the cat chases its own tail? The evaluation rounds at Harman resulted in preferring the objectively "better" speakers, as an engineer sees it: flat, all under control. Checked in mono, the more so. The argumentation resembles that from (traditional?) pharmacy, or medicine in general. The root cause is second, first the cure is to be proven. Only after the systematic analysis is undertaken.

What I'm missing here is the differential analysis of the underlying phenomena. For instance: By how far would we need to de-harmonize the directivity pattern to get an effect on preference? Or, on the very topic, that @Arindal raised lately: is the head's orientation towards the single speaker of some relevance?

On the latter: no, it is not. The timbre at the two eardrums differs vastly with different orientations. But the auditory system corrects for that. Wonders every day!
See (reiterated): section 2/2 of https://hauptmikrofon.de/theile/1980-2_Diss._Theile_englisch.pdf
That doesn't work with stereo as easily (listening in stereo is an acquired skill) ... and maybe that's the root cause for having a more discriminating judgement with a single speaker.
Anechoically or in the direct field the signals at the eardrums have quite large deviations though. The changes in timbre are not that large, so any "correction" may well stay within +/- 1 dB in the 1-8 kHz bands. As mentioned, I would like to see such studies even if they would give null results.
 
If such a signal were used in a mono listening test of speakers in order to judge their tonal balance, it would lead to misjudgments. Speakers which compensate for this phenomenon inherent to stereo recordings in one way or another would be more likely to win this test, while speakers which deliver the best tonal balance results in stereo tests would be more likely to be perceived as colored or imbalanced.
If this was actually an issue, it would have shown up in the research when they were looking at the differences in listener preference between mono and stereo listening. The only difference turned out to be that listeners were less discriminating when listening in stereo, but which speakers they preferred did not change. So unless you can provide some evidence of this phenomenon and not just evidence-free thought experimentation, I'm afraid you are kind of just pissing in the wind here.
 
So unless you can provide some evidence of this phenomenon and not just evidence-free thought experimentation, I'm afraid you are kind of ...
That's a bit rough. And to all the others: did you read the scientific paper I linked to above? Section 2/2. Real science, not engineering, though.
:facepalm:
 
That's a bit rough. And to all the others: did you read the scientific paper I linked to above? Section 2/2. Real science, not engineering, though.
:facepalm:
I’ve read it. However, it does not answer the question. (Note: there are lots of claims. On one side, phantom source coloration is well known among studio engineers and fixed in the mix; on the other, there is no phantom source coloration due to brain averaging. I just have my personal 25 years of DIY experience, but it does not really count here other than for creating hypotheses for further experiments made blind.)
 
If this was actually an issue, it would have shown up in the research when they were looking at the differences in listener preference between mono and stereo listening.

It would, if anyone did such research focusing on tonal balance preference with modern loudspeakers today. But that was never the case. Floyd Toole explained that the underlying research leading to his mono theory was completed in 1985 (presumably with more or less imperfect loudspeakers), and the most recent verification in 2008 was also comparing loudspeakers with significant flaws, focusing on the discrimination effect, i.e. under which conditions the flaws were most likely to be detected (if I understood Dr. Toole correctly).

This is why every music producer and broadcaster thoroughly checks to confirm the mono mix sounds good - they know that true stereo sweet spot listening is extremely rare.

Cannot confirm this from decades of experience in pro audio, broadcast tech research and related fields. Mono compatibility checks are surely executed (for FM radio and vinyl mastering this is mandatory), but they are mostly meant to make sure a mono summation downmix would not lead to cancellation effects, phasing or lack of bass/lower midrange. It would come as a total surprise if mono monitoring were taken as a reference for tonal manipulations, particularly above 1K, as this would almost certainly lead to degrading stereo tonality if any issues were found. You cannot have both perfect.
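
For readers who have not seen what such a compatibility check looks for, here is a minimal sketch in plain Python. It assumes a simple L+R sum with 0.5 gain per channel (one common convention, not necessarily what any given broadcaster or mastering chain uses), and the test signals and helper names are made up for illustration. It shows the cancellation case only, not the tonal questions discussed in this thread.

```python
import math

def downmix_to_mono(left, right):
    """Sum L and R with 0.5 gain each (assumed convention for this sketch)."""
    return [0.5 * (l + r) for l, r in zip(left, right)]

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def correlation(left, right):
    """Normalized L/R correlation: +1 in phase, -1 polarity inverted."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den > 0 else 0.0

def sine(freq_hz, phase_rad, n=4800, fs=48000):
    """Hypothetical test tone: n samples of a unit-amplitude sine."""
    return [math.sin(2 * math.pi * freq_hz * i / fs + phase_rad) for i in range(n)]

left = sine(1000, 0.0)
in_phase = sine(1000, 0.0)        # right channel identical to left
inverted = sine(1000, math.pi)    # right channel with inverted polarity

mono_ok = downmix_to_mono(left, in_phase)
mono_cancelled = downmix_to_mono(left, inverted)

print("correlation, in phase:", correlation(left, in_phase))
print("correlation, inverted:", correlation(left, inverted))
print("mono level, in phase (dB):", rms_db(mono_ok))
print("mono level, inverted (dB):", rms_db(mono_cancelled))
```

A strongly negative correlation is the warning sign: the content is there in stereo but largely vanishes in the mono sum. Real material sits somewhere between the two extremes and varies with frequency, which this toy example does not capture.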

While it is true that music is vastly consumed in mono or under quasi-monaural conditions (such as portable bluetooth speakers), I don't think this represents critical listening or is to be taken as a reference. Most people would not be bothered by slightly equalized vocals in mono, and maybe would not even judge tonal accuracy or notice the difference.

I forgot to include the link to the post for the Salon 2 and Array 1400 measurements:

The Array is definitely off the tolerance band that would define more or less constant directivity, expected to show exactly the outcome I have described (and experienced in exemplary manner with an Array 800 model back then).

Revel is difficult to judge, and I have never listened to it in a room that would allow any judgement or a definitive answer to the question if the outcome of uneven directivity could be equalized to satisfaction or not.

Would it be possible for you to demonstrate the results for other participants in the tests, along with the speakers in question and their measurements?

Such a test should be doable, and I would love to contribute details to the test design. I would suggest comparing two speakers which are very similar in on-axis response, size, recommended listening distance and geometry while showing significant differences in off-axis behavior.

Presumably the relative timing of these spectrally colored reflections would matter with respect to the so-called precedence effect and fusion interval,

Certainly accurate. We should take into account that the precedence effect is already at play with the first sound events forming the direct sound, so our brain will compare the tonality of the events that follow with the tonal signature of that first wavefront, if that makes sense. Depending on delay and meaningful tonal differences (meaningful in the sense that they match either the outcome of differently reflective surfaces or the HRTF for sound arriving at different angles), our brain would try to create a full picture of where the reverb came from and what the room looks like. This is best judged with transient sound events in a certain time window after the initial event has decayed (so it would not continuously mask the reverb events, making ´the reverb disappear´, as member @gnarly put it when trying the test tracks I had suggested).

since truly diffuse sound and reverberant field would seem to be very challenging, if not virtually impossible, to achieve in typical listening rooms.

In studio control rooms, this turned out to be achievable. Under home listening conditions it is a bit more difficult and might require further attention to speaker directivity, listening distance, and diffusive elements on walls near the loudspeakers. The idea is not to attenuate the reverb in the listening room completely (as it was done in studios some decades ago), but to bring it to a level and congruence with the direct sound such that it would deteriorate neither localization stability nor perception of the reverb in the recording.

Could you please provide some examples?

MEG and PMC were actively promoting this concept, but it is not hard to see that other studio monitor manufacturers are seemingly following a similar idea, for example Adam Audio, Kii Audio, PSI, Eve and others.

What do you consider to be examples of "true constant directivity speaker"?

The most obvious examples are those actively promoting the concept or being founded on that idea: Kii Audio, GGNTKT, Dutch&Dutch, Linkwitz, the smaller MEGs and all those being into curved line sources (more or less following Keele´s ideas). Interestingly, apart from Linkwitz there are quite a few manufacturers of dipoles coming to a similar result solely in terms of directivity (I do not mean to recommend a specific brand or model): Martin Logan, Magnepan, Quad ESL, Spatial Audio, Ecouton.

Interestingly, some hi-fi or high end audio manufacturers seemingly follow this idea, at least in the most important frequency bands, without making much noise about it. As mentioned, for me the TAD Labs models with bigger coaxial driver were the initial eye-opener what this concept can really achieve and why I found it so superior compared to continuously increasing directivity models. There are actually lots of different speakers offering constant directivity in a limited band, e.g. 0.8-7K, leaving it to the listener to decide if the bands below or above can be equalized accordingly without overly compromising the direct sound tonality. Examples by Genelec, TAD Labs, Elac, Magico turned out to fall into this category.
 
There are actually lots of different speakers offering constant directivity in a limited band, e.g. 0.8-7K, leaving it to the listener to decide if the bands below or above can be equalized accordingly without overly compromising the direct sound tonality. Examples by Genelec, TAD Labs, Elac, Magico turned out to fall into this category.
Can you please post few spinoramas of such examples?
 
Can you please post few spinoramas of such examples?

That is actually a good idea. Did a quick check which models of the listed brands are available as spinoramas with a proper d.i. calculation and found some which I had the chance to EQ myself (except for the TAD which is the compact variant but coaxial unit is identical) so I can confirm that it works giving excellent results:

[Attached spinorama images: TADCR1.jpg, ElacAS61.jpg, MagicoA5.jpg, Gen8341A.jpg]


Tried to highlight the frequency bands which resemble constant directivity from 1-5K and neighboring bands, if the CD stretches beyond that range. Note that the rather slim models by Elac and Genelec show a decrease in d.i. towards lower frequencies which sets in a bit too high, around 1K. Under more or less nearfield conditions that is expected to be correctable by EQ, but the more the indirect soundfield dominates, the more you might end up with a lower-midrange-heavy tonality in the room, giving the impression of ´unclear muddled mids´.
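
For anyone wanting to put a number on "resembling constant directivity" in a band, a rough sketch follows. It assumes the d.i. is taken as the difference in dB between a reference curve and the sound power curve (spinorama plots typically use the listening window as the reference), and it simply reports the max-minus-min spread of the d.i. inside a chosen band. The function names and all the numbers are hypothetical, not read off the plots above, and no threshold for "constant enough" is implied.

```python
def directivity_index(reference_db, sound_power_db):
    """d.i. per frequency point: reference curve minus sound power curve, in dB."""
    return [ref - power for ref, power in zip(reference_db, sound_power_db)]

def di_spread_in_band(freqs_hz, di_db, f_lo, f_hi):
    """Max-minus-min d.i. (dB) over the points with f_lo <= f <= f_hi."""
    band = [d for f, d in zip(freqs_hz, di_db) if f_lo <= f <= f_hi]
    if not band:
        raise ValueError("no frequency points inside the requested band")
    return max(band) - min(band)

# Hypothetical, coarse data for two imaginary speakers (NOT measurements).
freqs = [500, 1000, 2000, 4000, 8000]
reference = [85.0, 85.0, 85.0, 85.0, 85.0]      # flat reference curve
power_plateau = [81.0, 80.0, 79.5, 80.0, 78.0]  # d.i. roughly flat from 1-4K
power_rising = [82.0, 81.0, 79.0, 77.0, 75.0]   # d.i. rising steadily

di_plateau = directivity_index(reference, power_plateau)
di_rising = directivity_index(reference, power_rising)

print("plateau speaker, d.i. spread 1-5K (dB):",
      di_spread_in_band(freqs, di_plateau, 1000, 5000))
print("rising speaker, d.i. spread 1-5K (dB):",
      di_spread_in_band(freqs, di_rising, 1000, 5000))
```

A single spread figure hides where in the band the change happens, so it is only a crude companion to looking at the curve itself, and it says nothing about vertical versus horizontal behavior.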

The new Ascilab speakers seemingly follow the CD concept as well, and on paper the results look very well executed. I am reserving my verdict until I have extensively listened to one, though, as both the x-over point/lobing/localization issue and the lower d.i. in the midrange do not allow a judgment from afar.
 
On details regarding directivity, either constant or increasing with frequency or anyhow irregular

Can you please post few spinoramas of such examples?

If it is not connected to downmixing to mono, what do I miss?

As I said before, if one gets picky in regard to directivity, the unavoidable impact of room reverberation has to be acknowledged. The two, directivity and reverberation are logically connected, they work hand in hand by definition, but that is *always* ignored. With directivity every dB counts, the shape is discussed ad nauseam, but the room is described, if at all, by its reverberation time "T60" or so in octave bands only, wide deviations from some ideal are swept away with shrugging shoulders. Guys, dunno if it's funny or sad.

I’ve read it. However, it does not answer the question. (Note: there are lots of claims. On one side, phantom source coloration is well known among studio engineers and fixed in the mix; on the other, there is no phantom source coloration due to brain averaging. I just have my personal 25 years of DIY experience, but it does not really count here other than for creating hypotheses for further experiments made blind.)

It does: "the brain" would not average, as you put it, it is about different signal processing chains. As soon as a phantom source is identified, the coloration due to direction dependent ear signals is - spontaneously ignored. If the phantom source collapses, the coloration is perceived. So far the direct observation. The modelling of the effect in the second part of Theile's piece is speculative, sure, but not without reason.

I would go even further in saying, the identification of phantom sources hinders the detection of coloration, may it originate in the HRTF or in the speakers.

Anyway, as you say, phantom is colorful, does it turn to grey once downmixed?

We should foremost acknowledge, that evaluating a speaker is a very unique mode of listening. I'm quite rarely into that, don't know about you :rolleyes:
 
The new Ascilab speakers seemingly follow the CD concept as well, and on paper the results look very well executed. I am reserving my verdict until I have extensively listened to one, though, as both the x-over point/lobing/localization issue and the lower d.i. in the midrange do not allow a judgment from afar.
The Ascilab speakers, thanks to their low crossover frequency and large waveguide, actually take the CD concept lower down in frequency compared to the similar-sized loudspeakers you mentioned above, and have a higher DI there:

1749206397496.png

 
thanks to their deep crossover and large waveguide take the CD concept lower down in frequency compared to similar sized loudspeakers you mentioned above and have a higher DI there

Absolutely agree, on paper this is a very promising concept and well executed, very much likely to deliver superb results in terms of off-axis tonality.

It comes with implications, though: Partly the constant directivity in its lowest octave relies on vertical interference and lobing. This usually results in a slightly different tonality for vertical and horizontal dispersion influencing discrete reflection tonality as well. The second side-effect is the geometry of localization-relevant sources which are crossed over at around 1K being pretty far from each other. The impact on localization stability is unknown.

Grimm and Kii Audio models do basically the same thing at a slightly higher x-over point. In both cases I cannot deny the impression that the vertical tonality is seemingly dominating the reverb, so I feel the lowest octave of solely the tweeter dominating slightly. Maybe I am alone with that impression.
 
Absolutely agree, on paper this is a very promising concept and well executed, very much likely to deliver superb results in terms of off-axis tonality.

It comes with implications, though: Partly the constant directivity in its lowest octave relies on vertical interference and lobing. This usually results in a slightly different tonality for vertical and horizontal dispersion influencing discrete reflection tonality as well.
It should be said though that this vertical lobing is surprisingly low for a non-coincident design, thanks to the very low and steep (even more so for a passive design) crossover, so I wouldn't worry there as much as with most other similar designs:

1749207740307.png

The second side-effect is the geometry of localization-relevant sources which are crossed over at around 1K being pretty far from each other. The impact on localization stability is unknown.

Grimm and Kii Audio models do basically the same thing at a slightly higher x-over point. In both cases I cannot deny the impression that the vertical tonality is seemingly dominating the reverb, so I feel the lowest octave of solely the tweeter dominating slightly. Maybe I am alone with that impression.
I think that also depends on the listening distance and closeness to vertical reflections, for example on a desktop, where I would still prefer a coaxial design, but my guess is it is less of a problem compared to the typical crossing around 2-3 kHz.
 