
Sonic impact of downmixing stereo recordings to mono

Seems like the plot is lost and you are speaking from lack of experience. I suggest you remedy the latter before arguing more. Turn off one speaker and listen.

Until then, my collection of test tracks is drawn from what sounds excellent on both stereo headphone and speaker listening. Out of that large set, I have narrowed down the list to a handful of critical tracks for testing various aspects of speaker performance. Here is the key thing: on any well designed speaker, all 200 to 300 tracks sound wonderful in mono! I can't think of a single scenario where it is the aim of a mix/mastering engineer to produce poor tonality in one speaker so as to then compensate with the other. I repeat: these tracks are hugely enjoyable listened to with one speaker. You need to try it and then come back and complain.

Then you have to understand what makes a good test track:

1. It has to test for tonality. For this, I mostly use female vocals that bring this to the forefront and per above, sound very nice on stereo system as well as single speakers. If the track you picked doesn't do this, then go find another. There are tens of millions of pieces of music out there in subscription services.

2. The spectrum needs to stay constant more or less so as you switch between speakers or in my case, switch between EQ setting, the content doesn't change. This is why some use pink noise. I don't because I don't like to listen to noise.

3. I have test tracks for specific things such as sub-bass performance. My chosen track has this spectrum in both channels so I don't need at all to play in stereo.

These are the critical things to test for in a speaker: what is the tonality and can it have full range and play loud enough. They are trivially tested in mono, far easier than stereo.

The stuff you talk about is positioning and room dependent. Conveying that to others is useless as no one is going to be able to replicate your setting. Further, you don't know those elements for when the music was produced. So you have no idea if what you are determining is right anyway. A dipole speaker surely generates spatial effects that are not real and were never heard in the studio. Combine two such speakers and you are just dealing with fantasies as opposed to mimicking what someone may have set up.

Bottom line is simple: you need to go and perform single speaker testing. Get some experience under your belt like those of us who have done it. The few who have done it on ASR have become instant believers. Don't just throw lay intuition at us and arguments we already know. Physics says as you approach the speed of light, time essentially stops for you and distances become incredibly small. There is nothing intuitive about this but it is a proven fact from Einstein. Please stop arguing from what you think happens and spend time learning the "new" science. We do mono testing because it works far better and is far cheaper. It is insanity to say that, just because you think this and that with nary any experience under your belt, we should not do this.
Great post Amir.
I never really cut it as an engineer at an academic level, but my practical experience when I did the product design and development at Amina for nearly 15 years (a brand you know well) was that I never considered subjective listening tests on stereo pairs. Intuitively it just made sense that to design a speaker that had some kind of cat’s chance in hell of sounding tonally correct when using a bastardised version of NXT tech, I had to hone in purely on that attribute. You may remember the AIWX series from the early 10s as they were the breakthrough invisible product range for the company and established them in the market. Developed using CLIO and mono listening tests.
 
Phase cancellations will cause changes. Note also that such cancellations are different in electronic domain (what you did) vs acoustic (in room with speaker playing).

My suggestion is to NOT convert anything to mono. Create a single-channel pink noise or just play one channel of the stereo version.
I should have used one channel as presented at one ear. But it would still be different sounding. Edit: I have to look up if I actually did that. It was some time ago now.
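A minimal sketch of the two options suggested in the quote above: synthesising a single-channel pink noise signal, or taking one channel of an interleaved stereo signal without any summing. The function names, the Voss-McCartney approximation, and the default parameters are illustrative assumptions and are not taken from anything posted in this thread.

```python
import random


def pink_noise(n_samples, n_rows=16, seed=0):
    """Approximate pink (1/f) noise with the Voss-McCartney method (an assumed choice).

    Returns a list of floats, peak-normalised to the range -1.0 .. 1.0.
    """
    rng = random.Random(seed)
    rows = [rng.uniform(-1.0, 1.0) for _ in range(n_rows)]
    running = sum(rows)
    out = []
    for i in range(1, n_samples + 1):
        # Count the trailing zero bits of i: row k is refreshed every 2**(k+1) samples,
        # so slower rows contribute the low-frequency energy.
        k = (i & -i).bit_length() - 1
        if k < n_rows:
            running -= rows[k]
            rows[k] = rng.uniform(-1.0, 1.0)
            running += rows[k]
        # Add a white component that changes on every sample.
        out.append(running + rng.uniform(-1.0, 1.0))
    peak = max(abs(x) for x in out)
    return [x / peak for x in out]


def one_channel(interleaved, channel=0, n_channels=2):
    """Pick a single channel from interleaved samples (L, R, L, R, ...) without summing."""
    return interleaved[channel::n_channels]


if __name__ == "__main__":
    noise = pink_noise(48000)  # one second at an assumed 48 kHz sample rate
    left_only = one_channel([0.1, 0.9, 0.2, 0.8, 0.3, 0.7])
    print(len(noise), left_only)
```

Picking one channel this way avoids the electrical L+R sum entirely, so no inter-channel cancellation can occur in the signal sent to the speaker.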
 
Is it possible that when comparing two different speakers in mono, you decided which was the better sounding one, but then when playing in stereo you prefer the other you thought sounds worse?
 
Turn off one speaker and listen.

One speaker of the stereo pair, the 5.1, the 7.1, or the Atmos setup? ;-) I find it amusing that you speak of lack of experience with listening tests.

my collection of test tracks is from what sounds excellent on both stereo headphone and speaker listening...Here is the key thing: on any well designed speaker, all 200 to 300 tracks sound wonderful in mono!

That is hinting to a selection of tracks which are not capable of revealing flaws other than massive ones, and which do not allow a proper judgement of tonality, transparency and all imaging-related aspects.

I can't think of a single scenario where it is the aim of a mix/mastering engineer to produce poor tonality in one speaker so as to then compensate with the other.

I can. One is called runtime stereophony and vastly popular among classical recordings of the last 25 years. For example recordings which are based on A/B main mic arrangements, should not sound wonderful in mono, particularly not downmixed to mono. Their whole stereo effect relies on masking by first wavefront preference. If you make that audible in mono, it usually sounds like a garage, not to speak of comb-filtering and cancellation effects.
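To put rough numbers on the comb-filtering mentioned above, here is a small sketch of what an electrical L+R downmix does to a single source captured by a spaced (A/B) pair. It assumes a distant source, equal level in both channels, and a pure arrival-time offset; the 0.5 m spacing, the 30 degree angle, and the function names are hypothetical values chosen for illustration. Real recordings contain many sources plus reverberation, so this is only the simplest case.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def ab_delay(spacing_m, angle_deg):
    """Arrival-time difference between two spaced mics for a distant source."""
    return spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND


def mono_sum_gain_db(freq_hz, delay_s):
    """Level of (L + R) / 2 relative to one channel, for equal-level channels offset by delay_s."""
    # |(1 + exp(-j*2*pi*f*tau)) / 2| simplifies to |cos(pi * f * tau)|.
    mag = abs(math.cos(math.pi * freq_hz * delay_s))
    return 20.0 * math.log10(max(mag, 1e-12))


def notch_frequencies(delay_s, f_max=20000.0):
    """Frequencies where the summed channels cancel: odd multiples of 1 / (2 * delay)."""
    if delay_s <= 0:
        return []
    first = 1.0 / (2.0 * delay_s)
    return [first * k for k in range(1, int(f_max / first) + 1, 2)]


if __name__ == "__main__":
    tau = ab_delay(spacing_m=0.5, angle_deg=30.0)  # hypothetical A/B geometry
    print(f"delay: {tau * 1e3:.3f} ms")
    print("first notches (Hz):", [round(f) for f in notch_frequencies(tau)[:4]])
    for f in (100, 500, 686, 1000, 1372):
        print(f, "Hz:", round(mono_sum_gain_db(f, tau), 1), "dB")
```

Under these assumptions the notches are evenly spaced in frequency, which is the characteristic comb pattern. Playing only one channel of the same recording avoids this particular cancellation, since nothing is summed electrically.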

Another main phenomenon of poor tonality in ´stereo>mono´ manipulated tracks is rooted in the different angles from which the sound is coming in, how it is affected by our own HRTF and how HRTF interaural crosstalk components are masked in stereo (but not in mono). You can easily do an experiment yourself by switching from true center to phantom center in a surround setup.

As mentioned, I am not against mono listening tests and for some very specific purpose it might be useful, as described by Floyd Toole, but then please use material being mixed and mastered in mono.

Then you have to understand what makes a good test track:

Thanks, I have been auditioning and selecting test tracks for various purposes of listening tests on a professional basis. It is impossible to calculate exactly, but if I count the usual number of tracks auditioned during the process of making a selection, and how often that happened in my previous jobs, I have auditioned something like several hundred thousand tracks.

I mostly use female vocals that bring this to the forefront and per above, sound very nice on stereo system as well as single speakers.

In my understanding, test tracks are not meant to ´sound nice on every setup´, but to either reveal flaws, allow distinction of differences or represent a vast variety of mixing styles, tonality differences, difficulties for speakers, or a different mixture of recording techniques.

If you use close-mic´ed solo female vocals like audiophile jazz and folk mostly, you are most probably going to miss aspects which are revealed only for example by a male ´chorus latens´ or a 16-voice mixed choir, a contralto, some problematic 1950s opera live recordings, some Autotune-distorted R´n´B singing, heavy-metal-style falsetto or some guttural growl, and vice versa.

I have test tracks for specific things such as sub-bass performance. My chosen track has this spectrum in both channels so I don't need at all to play in stereo.

Testing sub-bass is definitely possible in mono, and I would not overly worry whether the material is made for that or not. It is more important to have a vast selection of tracks representing different dominating frequency bands and bass transient behavior. So I do not think a number of 300 tracks for bass and bass impulse judgement alone, is sufficient.

The stuff you talk about is positioning and room dependent. Conveying that to others is useless as no one is going to be able to replicate your setting.

It is not about replicating a setting but about defining properties of the speakers and matching them with the room so everyone would be able to replicate the test results. That is doable in most of cases. If problems occur, for example with judging localization and ambience, go for a nearfield setup, get some constant directivity speakers as a reference or optimize the room.

Further, you don't know those elements for when the music was produced. So you have no idea if what you are determining is right anyway.

Yes, I do. At least for all the recordings where I was involved in the recording or production process and know the concert hall really well. I have a selection from those tracks for listening tests as well, and I know from many listening tests with recording engineers that everyone has his or her favorites from their own oeuvre. As mentioned, I prefer to use as a reference those recordings which I had the chance to witness both from the auditorium and the local broadcast control room.

A dipole speaker surely generates spatial effects that are not real and were never heard in the studio.

With my method and the right track to test depth-of-field, it would take me 10 seconds to identify this. I doubt that it is possible to test that in mono, and as member @gnarly has mentioned, there are situations in which reverb gets fully masked. Mono testing is one of them.

I am not a fan of dipole speakers, but having read a lot of papers by the late Siegfried Linkwitz, sharing a lot of common goals with him and his successors, I fail to understand why dipole speakers are said to sound ´not real´, but other speakers creating much weirder reverb and tonality effects due to kinked directivity, are not labelled as such. But maybe that is worth a separated thread.

Bottom line is simple: you need to go and perform single speaker testing.

I have done this numerous times, starting with experiments early in my career when reviewing codecs as a successor to MP3 was a big thing. The interesting aspect is, this idea of mono testing was never implemented, as the whole idea of testing and optimizing lossy codecs was found to work much better with headphones.

I do not see any point in testing speakers in mono, except for very specific questions (resonances, distortion, bass quality and alike). So far you did not bring up a valid point why it should be superior in a preference test. I have been explaining numerous reasons why it might lead to misjudgments and does not allow one to judge important aspects such as imaging and ambience.

Please stop arguing from what you think happens and spend time learning the "new" science.

´Learning the new science´? Which ´old science´ are you planning to replace? And did I understand it correctly that the idea of mono speaker tests was promoted by Dr. Toole as early as 1985, never being properly verified by any other scientific institution nor adopted by anyone in the pro audio and recording community or by researchers?

What is new about it? And what is scientific about a general technology that was replaced by a superior one as early as 1955, and not without good reason?

Despite its regrettable limitations (I personally prefer surround when it comes to ambience), I am still surprised how good stereo can sound and how well some pioneering recording engineers were using it from the very beginning. Occasionally, I include the first stereo recording I regard as well executed in a selection of listening test tracks, and people are usually amazed by its high standard. Give it a try in stereo, as it creates a good mood as a side effect:

Offenbach.jpg


We do mono testing because it works far better and is far cheaper.

Can tell you from quite some experience with Dolby Atmos and Auro3D setups and listening tests, that the difference in cost and effort between mono and stereo is negligible.

And I still have not read a single convincing reason why mono is better in your eyes. I have laid out numerous explanations of what it cannot achieve and why it is prone to producing misjudgments.
 
One speaker of the stereo pair, the 5.1, the 7.1, or the Atmos setup? ;-) I find it amusing that you speak of lack of experience with listening tests.
One speaker always. Whether the application is one or 20 channels. Indeed, I have done 1 channel height listening, comparing the original track to Dolby encoded which clearly showed the artifacts Dolby codec had added. With all the channels wailing, you could not hear it.

Heck, I have done subwoofer tests with just the sub playing as well. So much easier to hear its artifacts.

Research indeed shows that the more speakers you have, the less discrimination you end up with:

index.php


So if you are listening to all the channels, you are in worse shape, producing even more unreliable observations than doing the same in stereo. How much reading have you actually done to ask me these questions?
 
In my understanding, test tracks are not meant to ´sound nice on every setup´, but to either reveal flaws, allow distinction of differences or represent a vast variety of mixing styles, tonality differences, difficulties for speakers, or a different mixture of recording techniques.
Nope. You are confusing many factors, some of which do not matter here (recording techniques???). Read this thread I created on tracks which Dr. Toole/Olive have found to be most revealing of speaker tonality using statistical analysis in their research: https://www.audiosciencereview.com/...sic-tracks-for-speaker-and-room-eq-testing.6/

You will see graphs like this:

Program+Influence+on+Listener+Performance.png


We don't do things randomly or based on gut feeling. The work you are criticizing has come from decades of research, with secondary results of what makes good test tracks.
 
I can. One is called runtime stereophony and vastly popular among classical recordings of the last 25 years. For example recordings which are based on A/B main mic arrangements, should not sound wonderful in mono, particularly not downmixed to mono. Their whole stereo effect relies on masking by first wavefront preference. If you make that audible in mono, it usually sounds like a garage, not to speak of comb-filtering and cancellation effects.
Track please.
 
I do not see any point in testing speakers in mono, except for very specific questions (resonances, distortion, bass quality and alike). So far you did not bring up a valid point why it should be superior in a preference test.
Here is a reason for you:

ListenerPerformance.jpg


Folks like you, with beliefs you have, have been formally tested in how reliable they are in assessing speaker fidelity. All failed miserably compared to Harman trained listeners.

You fail because a) you don't know how to train yourself and b) how to listen. As a result, your assessments of fidelity are close to random noise.
 
Whatever happened to mutual respect and genuine dialogue? Instead of dismissing @Arindal, how about engaging with actual reasoning and explaining why they’re wrong? -Maybe with demo tracks or clips for readers to test out themselves why mono is preferred for this testing.

Because they are a long winded bore who seems to think that dismissing the opinions of e.g. Floyd Toole makes them look smart. If they have a real point they should make it with data. Dismissing Tracy Chapman as a good choice of listening track when “fast car” is literally proven to be a good discriminator is just farcical.
 
´Learning the new science´? Which ´old science´ are you planning to replace?
It is more akin to folklore that we are trying to replace, as calling it science would be a huge insult to the word!

I remember the old days when we had the wild west: the idea that any speaker is likely to be as good as any other since someone would like it. Dr. Toole literally created science in this area, taking opinion out of the equation. This has benefited a wide swath of the industry. Countless companies follow his research even if they never say it, or publish an article.

Your view on the other hand is what? Trust me because I know?
 
Can tell you from quite some experience with Dolby Atmos and Auro3D setups and listening tests, that the difference in cost and effort between mono and stereo is negligible.
What are you talking about? It is much cheaper to send me one speaker to test than two. It is far less work to optimally listen to a single speaker than two. Performing controlled testing using moving platforms is far more complicated and proportionally more expensive than mono. Harman has a multi-channel shuffler but they don't use it because they don't need to:

index.php


This setup can swap entire 5.1 set of speakers for another. It was far more expensive to build than this one channel one I sat in:

index.php


Really, you have provided no evidence, no research, nothing other than lay audiophile intuition that stereo testing is better. Science and my own experience testing hundreds of speakers proves otherwise. Please don't waste our time with writing essays over and over again. We know what you are saying. It is just wrong.
 
In my understanding, test tracks are not meant to ´sound nice on every setup´, but to either reveal flaws, allow distinction of differences or represent a vast variety of mixing styles, tonality differences, difficulties for speakers, or a different mixture of recording techniques.
Not to step into the mono vs. stereo testing debate, but I have advocated this approach also. When I have needed to demo speakers, it was usually with some time pressure, so I found tracks that would immediately sound wrong in specific ways if there was something wrong.

In truth the test tracks I was using were all basically musical test tones, but just listening to sweeps and PN does not give a sense of "right" / "wrong" as easily, plus people tend to look at you weird if you play tones in crowded areas.
 
Appealing to authority has never been a valid way to shut down discussions when arguments start to run thin.
There is no appeal to authority. The authorities are here themselves, explaining their research! It doesn't get better than this.

When I talk about such research, I provide references, quotes, data. In contrast, our poster is just posting opinion after opinion, thinking persistence makes them convincing. You should be complaining about that, as I am pretty sure you learn nothing hearing him repeat those things.
 
Let’s try to keep the tone respectful -for the sake of learning and exchanging ideas.
Your own tone is not respectful. Start there if you care. You are creating noise and increasing tensions by taking sides. The member has had plenty of opportunity to speak his piece, yet he keeps repeating basically that he is right. This is not how we discuss things in this forum. If that is what you like, you two should go elsewhere.
 
I think some valid critical points have been raised, but I don’t feel they’ve been convincingly addressed. I also recall @SIY asking a question that never got answered, though I can’t seem to find it anymore.

To be honest, the thread has become difficult to follow -possibly due to edits or restructuring by @RickS. The link to the original thread is also missing, which just adds to the confusion.

Edit: Here’s the original thread:

We all know the original research has its limitations, and it can and should be interesting to discuss those. If you can dig out the constructive bits and reboot the thread that would be awesome!
 
This 5-minute video explains the tonality issue with phantom-centered sounds quite well, listen especially how much the pink noise changes when leaning to the side where the direct sound from a single speaker dominates, compared to how it sounds right in the middle where the phantom center is heard.

Most mixing engineers will compensate for the dull sound the comb-filtering is causing, especially for a center-panned vocal track as our hearing is extra sensitive to errors in human voices. This EQ compensation will and should make the voice sound exaggerated when listening outside the phantom center position, and the same goes for listening to a single loudspeaker, and especially so with the speaker right in front of the listener.
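As a back-of-the-envelope illustration of the phantom-center comb-filtering described above, the sketch below estimates where the first cancellation dip falls for speakers at +-30 degrees. It uses the Woodworth spherical-head approximation for the interaural delay; the 8.75 cm head radius, the crosstalk gain of 0.5, and the function names are assumptions for illustration only, and real heads and rooms will deviate from this.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
HEAD_RADIUS_M = 0.0875  # assumed spherical-head radius


def woodworth_itd(azimuth_deg, head_radius_m=HEAD_RADIUS_M):
    """Interaural time difference for a distant source (Woodworth spherical-head model)."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / SPEED_OF_SOUND * (theta + math.sin(theta))


def first_crosstalk_notch_hz(speaker_angle_deg):
    """First frequency where the far speaker's crosstalk arrives half a cycle late at one ear."""
    return 1.0 / (2.0 * woodworth_itd(speaker_angle_deg))


def dip_depth_db(crosstalk_gain):
    """Depth of the dip relative to the in-phase sum, for a given linear crosstalk gain (0..1)."""
    return 20.0 * math.log10((1.0 - crosstalk_gain) / (1.0 + crosstalk_gain))


if __name__ == "__main__":
    itd = woodworth_itd(30.0)
    print(f"ITD at 30 deg: {itd * 1e6:.0f} microseconds")
    print(f"first dip near: {first_crosstalk_notch_hz(30.0):.0f} Hz")
    print(f"dip depth with assumed crosstalk gain 0.5: {dip_depth_db(0.5):.1f} dB")
```

With these assumptions the first dip lands just below 2 kHz. A single speaker at 0 degrees has no second speaker contributing crosstalk, so this dip does not occur there, which is the mismatch the paragraph above is pointing at.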

It is well-known, yes. But it is much less known how engineers deal with this issue at the mixing desk. They are not fixed mannequin heads but are free to move and turn. Checks and final adjustments on both near-field and mid-field monitors, as well as on headphones, are another factor. And finally, published work has shown a preference for linear on-axis frequency response using single speakers, despite the Shirley et al. results. Double-blind experiments on this specific issue with a variety of music mixes have not been done and published, and Floyd Toole has also said it might be done by someone else IF there is research interest and financing. Which will not happen. Most listening today is on headphones, which is where most of the research is going. It ends there, most probably. Whether some prefer +/- 0.1 dB frequency response and others +/- 1.5 dB with certain specific corrections at 1-5 kHz will still be a preference and individual choice.
 
One speaker always. Whether the application is one or 20 channels. Indeed, I have done 1 channel height listening, comparing the original track to Dolby encoded which clearly showed the artifacts Dolby codec had added. With all the channels wailing, you could not hear it.

Heck, I have done subwoofer tests with just the sub playing as well. So much easier to hear its artifacts.

Research indeed shows that the more speakers you have, the less discrimination you wind up with:

index.php


So if you are listening to all the channels, you are in worse shape, producing even more unreliable observations than doing the same in stereo. How much reading have you actually done to ask me these questions?
What happened with EQ B, where multichannel was most preferred and mono least preferred?
 
What are you talking about? It is much cheaper to send me one speaker to test than two. It is far less work to optimally listen to a single speaker than two. Performing controlled testing using moving platforms is far more complicated and proportionally more expensive than mono. Harman has a multi-channel shuffler but they don't use it because they don't need to:

index.php


This setup can swap entire 5.1 set of speakers for another. It was far more expensive to build than this one channel one I sat in:

index.php


Really, you have provided no evidence, no research, nothing other than lay audiophile intuition that stereo testing is better. Science and my own experience testing hundreds of speakers proves otherwise. Please don't waste our time with writing essays over and over again. We know what you are saying. It is just wrong.
With the shuffler, how does one make a fair comparison between a Klipschorn, a Quad ESL and @SIY's favourite NHT loudspeaker? Of course all of them would be rated worse than e.g. a Revel.
 
I have done 1 channel height listening, comparing the original track to Dolby encoded which clearly showed the artifacts Dolby codec had added. With all the channels wailing, you could not hear it.

That is kind of trivial and to be expected. Immersive information with Dolby Atmos is not channel-discrete, but encoded as a vector-panned difference signal which is subsequently undergoing a lossy encoding. The Dolby decoder is meant to calculate channel-discrete signals for the playback based on a Dolby-exclusive method of creating phantom localization in two dimensions typically involving 3 speakers (2 channel discrete on the horizontal plane plus the nearest immersive channel available). If you switch off two of them, well, you hear compression and cancellation artifacts combined which were estimated to be masked by the ´clean´ base plane channels.

It is the same with almost all lossy encoders: If you switch off or take away the fraction of the signal providing masking, you are most likely to hear compression artifacts. As mentioned, one of the first research projects I was involved in was about improving lossy codecs. It was found that not only listeners with hearing loss were more likely to identify the artifacts correctly, but also listeners who switched off the midrange driver´s amplifier of their studio monitors. Kind of makes sense. So if you want to identify compression artifacts, I recommend applying a very steep highpass filter at 12K, removing all frequencies below that.

This will increase sensitivity for a lossy codec discrimination test to astonishing precision. But it is to a degree of ridiculousness making it impossible to perform a preference test or say which codec would deliver natural tonality and imaging.
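For readers who want to try the 12 kHz high-pass idea described above, here is a minimal pure-Python sketch of one way to build a steep filter: a windowed-sinc FIR obtained by spectral inversion of a Blackman-windowed low-pass. The 48 kHz sample rate, the 255-tap length, and the design method itself are assumptions chosen for illustration; the post does not say how the filter was implemented in the original research.

```python
import math


def highpass_fir(cutoff_hz, sample_rate_hz, n_taps=255):
    """Linear-phase high-pass: delta minus a Blackman-windowed sinc low-pass (n_taps must be odd)."""
    if n_taps % 2 == 0:
        raise ValueError("n_taps must be odd")
    mid = (n_taps - 1) // 2
    fc = cutoff_hz / sample_rate_hz  # normalised cutoff in cycles per sample
    lowpass = []
    for n in range(n_taps):
        m = n - mid
        sinc = 2.0 * fc if m == 0 else math.sin(2.0 * math.pi * fc * m) / (math.pi * m)
        phase = 2.0 * math.pi * n / (n_taps - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2.0 * phase)
        lowpass.append(sinc * window)
    dc = sum(lowpass)  # normalise the low-pass to unity gain at 0 Hz
    taps = [-h / dc for h in lowpass]
    taps[mid] += 1.0  # spectral inversion: high-pass = delayed impulse - low-pass
    return taps


def gain_at(taps, freq_hz, sample_rate_hz):
    """Magnitude of the filter's frequency response at freq_hz (linear, 1.0 = unity)."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(h * math.cos(w * n) for n, h in enumerate(taps))
    im = -sum(h * math.sin(w * n) for n, h in enumerate(taps))
    return math.hypot(re, im)


def apply_fir(signal, taps):
    """Direct-form convolution; slow but dependency-free, output has the input's length."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if i - k >= 0:
                acc += h * signal[i - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    fs = 48000.0  # assumed sample rate
    taps = highpass_fir(12000.0, fs)
    for f in (1000.0, 10000.0, 12000.0, 14000.0, 18000.0):
        print(f"{f:7.0f} Hz: {20.0 * math.log10(max(gain_at(taps, f, fs), 1e-12)):7.1f} dB")
```

A longer filter gives a narrower transition band at the cost of more delay. For real audio files an optimised DSP library would be the practical choice; the loop-based convolution here is only meant to keep the sketch self-contained.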

Apologies for the detours to all speaker-testing guys and gals! But I guess this extreme example explains why perfecting a discrimination test makes a preference test more unreliable.

Research indeed shows that the more speakers you have, the less discrimination you wind up with:

Absolutely agree. The key word here is ´discrimination´. If this is the sole aim, you are right. My points were more referring to preference tests which allow verdicts to be applied on a vast number of different listeners, recording techniques, rooms and setups.

many factors, some of which do not matter here (recording techniques???)

Taking different recording techniques into account when reviewing speakers, and how they are compatible with the listening test method, does matter in my understanding, as chances are high that someone using the reviewed speaker on a professional basis will apply this or that technique. Or a consumer will listen to the result expecting enjoyment or at least an understanding of what the mixing engineer was aiming at.

Your view on the other hand is what? Trust me because I know?

Did I write anything hinting to that direction? I am not even aware of any verdict on products which people should trust.

I encourage everyone to do their own comparison tests and understand their own preference. I can recommend recordings to verify my point and explain the contradictions I see when it comes to mono testing. For example, it speaks louder than any other bold claim here that you did not respond to my statement about phantom source tonality in stereo due to the +-30deg speaker angle compared to 0deg with mono testing. It is such an obvious contradiction, implying that the mono method is prone to misjudgments in terms of tonality.
 
But it is much less known how the engineers do at the mixing table with this issue.

This has been widely discussed in the pro audio community. It is more or less common agreement that with recordings based on two or more microphones capturing the events which will become exactly central phantom sources (such as main stereo mic arrangements in a concert hall), it is a more or less negligible problem (will not go into details of center elevation with such recordings). Narrow-banded cancellation effects caused by HRTF and interaurally identical crosstalk is, as mentioned in David's video, the main root of the problem, but by far not the only one related to HRTF and angles for central phantom sources.

If you look at the average 30deg az. eardrum FR relative to 0deg ahead, you notice an additional peak around 5K. This is another band which mixing engineers will address when panning lead vocals as central phantom source originating from a mono microphone signal which is rather unrelated to keeping the exact central position in a nearfield environment.

In any case, a dry mono signal from a microphone very close to a singer will call for applying EQ to sound more or less natural. No engineer will mix down this signal unchanged, and most will use stereo speakers positioned at +-30deg in a more or less untreated room to judge tonality. There is still a great variety of EQ curves to be expected, but reference point is the +-30deg of the stereo triangle.

If you now take one channel of any EQ-corrected recording and listen to it under 0deg mono, well, you will end up in applying a correction curve which is not meant for this angle and you will hear significant colorations. If a speaker sounds ´more natural´ under these conditions, you can be sure it is not neutral in terms of tonal balance, but doing something to reverse the ´wrongly applied´ EQ correction.

One might come to the idea that placing a mono speaker at -30deg left would solve this problem. Unfortunately it does not. On one hand you have the crosstalk to the right ear which in particular attenuates the aforementioned 5K band. On the other hand at -30deg the loudspeaker itself can be localized as a real source by our brain pretty precisely as coming in from 30deg left, so it would expect a different tonality compared to the central phantom source. And tonal balance is to a certain degree judged based on pattern recognition.

The only conclusion in my understanding: Judging tonal balance of speakers in mono can only be executed with designated material mixed in mono. Using stereo material of any kind, is prone to misjudgments particularly if these speakers will be used in a stereo arrangement later.

Whether some prefer +/- 0.1 dB frequency response and others +/- 1.5 dB with certain specific corrections 1-5 kHz will still be a preference and individual choice.

If on-axis tonality were solely a matter of preference and individual choice, wouldn't it mean the end of any measurement-based verdicts on loudspeakers, research on preference, controlled listening tests and optimizing speakers as well as rooms? Your statement sounds as if you are promoting what Amir has called ´the Wild West´, where any reproduction curve, as absurd as we can imagine, was to be accepted if only someone declares it to meet their taste.

In practice, the HRTF-related tonality issues with phantom source vs. real source, testing in mono vs. stereo, listening at 0deg or 30deg, are much much more pronounced than +-1.5dB. Intraaural difference within this listening window can be as high as +-11dB relative to the other variant in narrow frequency bands.

Interestingly, it is seemingly not a specific ´correction curve´ which introduces errors and misjudgments related to phantom vs. real source tonality. If I recall it correctly, Dr. Toole has confirmed in the parallel thread that such thing as a ´disappearing mono localization´ exists as a quality of certain speakers, preferred in a mono test. I can confirm that, and my hypothesis would be that it has to do with the indirect soundfield and the directivity of the speaker.
 