
Why evaluating the sound of a single speaker is essential

Not some, but all speakers which are great in mono are great in stereo also

Disagree with that, particularly when it comes to imaging and ambience. I experienced such with speakers which sound pleasant in monaural by the fact that they do not present an overly precise and stable mono localization. Happens regularly but not exclusively with models having midrange and tweeter separated, employing rather big midrange drivers/horns/waveguides or tendency towards edge diffraction influencing the localization. Same with speakers showing deviations between several units in terms of acoustic phase or amplitude over frequency.

Such speakers regularly sound more pleasing, a bit more distant and ´natural´ in mono, but this leads to an unstable, at times diffuse or blurred image when they are used in stereo, particularly with two-channel recordings containing a meaningful reverb pattern (such as classical recordings). Conversely, a speaker delivering excellent stereo performance in terms of stable imaging, precise, minimum-width localization and ambience will in mono most probably produce an annoying ´minimum-sized phantom image´.

In my experience it is impossible to judge image stability, perceived distance of the image, depth-of-field and precision of localization in mono. These are all phenomena based on phantom localization which is not judgeable in mono.
 
Disagree with that, particularly when it comes to imaging and ambience. I experienced such with speakers which sound pleasant in monaural by the fact that they do not present an overly precise and stable mono localization.
I wouldn't call that mono speaker "pleasant sounding", but "broken sounding". A good single/mono speaker must have precise and stable mono localization, not a "pleasing" effect.

Happens regularly but not exclusively with models having midrange and tweeter separated, employing rather big midrange drivers/horns/waveguides or tendency towards edge diffraction influencing the localization.
Large diffraction effects are a known enemy of a precise and stable image, both in mono and stereo.

Same with speakers showing deviations between several units in terms of acoustic phase or amplitude over frequency.
The deviation/difference between the left and right speakers must be very small (ideally zero) to have a stable and precise stereo image; that has been known for a long time.


In my experience it is impossible to judge image stability, perceived distance of the image, depth-of-field and precision of localization in mono. These are all phenomena based on phantom localization which is not judgeable in mono.
In my experience it is the opposite.
 
A good single/mono speaker must have precise and stable mono localization, not a "pleasing" effect.

That is a reasonable ideal, but as most of us are not used to listening to dedicated monaural mixes on a daily basis, I do not think it is shared by many. Precise and stable mono localization, which makes stable stereo localization likely, sounds unnatural to most listeners and will most likely not be preferred by a lot of untrained listeners. You surely could do such listening tests with educated people like recording engineers, telling them that solely localization stability and minimum width of the perceived source are ideal, but this basically rules out reasonable listening tests with other people who seemingly prefer the ´pleasing´ effect.

My former professor for recording called these monaural sources ´birds on the wire´, in opposition to natural instruments/voices recorded in their dedicated room. Most people perceive these as ´unnatural´ even in a stereo environment with a lower amount of discrete reflections.

Large diffraction effects are a known enemy of a precise and stable image, both in mono and stereo.

Certainly true, but while in mono a certain degree of diffraction and ´localization blur´ will in many cases cause a perception of slightly more distant, ´natural´ and pleasing imaging hence presumably preferred by untrained and unbriefed listeners, in stereo recordings containing a natural reverb pattern this effect is already part of the mix. So as a result the blur and perception of depth will double, causing annoying instability and overly distant imaging in stereo while pleasing minimum blur in mono.

A certain amount of diffraction or blur (due to broad speaker drivers, diffuse dispersion or alike), in mono is causing a similar perception compared to a stereo phantom source. It is not sounding ´broken´ as you suggested. The moment you transfer this to stereo meant to create a phantom image rooted in two signals, it will sound broken.

The additional problem with mono is that it cannot contain a meaningful reverb pattern giving the listener an impression of distance, depth-of-field and ´air´ around the phantom sources, if that makes sense. With completely dry recordings and solely the image stability, as you suggested, this might work.

The moment recordings are involved containing reverb pattern, it fails, as dominating reverb in mono sounds always weird and overly distant, lacking width for obvious reasons.

The deviation/difference between the left and right speakers must be very small (ideally zero) to have a stable and precise stereo image; that has been known for a long time.

The phenomenon is known, but could you give any meaningful tolerance band for acceptable deviations between two speakers of a stereo pair? And if you say it should be ideally zero and it is that important, don't you think it should be tested with every single speaker review containing measurements?

And how do you imagine this to be quantified if we take phase differences, amplitude differences over tiny angle deviations as well as direct reflections in the room into account? We would need precise data for all of this, with the latter largely depending on the individual listening room, to have a rudimentary understanding about phantom source localization quality in stereo. Isn't it easier to just do a listening test in stereo?

In my experience it is the opposite.

Could you please elaborate on the measures with which you are reliably judging distance perception, depth-of-field, phantom source localization as well as stability in mono?
 
As may have been mentioned, early side wall reflections will be delayed quite a bit with mono vs stereo, meaning the acoustic and psychoacoustic result will differ from stereo unless side wall reflections are well treated.
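
As a rough illustration of the delay difference described above, the first side-wall reflection can be estimated with the image-source method. This is only a sketch: the room geometry (a 4 m wide room, listener 3 m back on the centre line, stereo speaker 1 m from the side wall), the speed of sound, and the function name `side_wall_delay_ms` are all assumptions chosen for the example, not values from this thread.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def side_wall_delay_ms(speaker_x, listener_x, distance):
    """Arrival delay (ms) of the first reflection off the wall at x = 0,
    relative to the direct sound. All positions are in metres."""
    direct = math.hypot(listener_x - speaker_x, distance)
    # Image-source method: mirror the speaker across the wall at x = 0.
    reflected = math.hypot(listener_x + speaker_x, distance)
    return (reflected - direct) / SPEED_OF_SOUND * 1000.0


# Hypothetical layout: room 4 m wide, listener at x = 2 m, 3 m from the speaker plane.
mono_delay = side_wall_delay_ms(speaker_x=2.0, listener_x=2.0, distance=3.0)
stereo_delay = side_wall_delay_ms(speaker_x=1.0, listener_x=2.0, distance=3.0)

print(f"centre (mono) speaker: {mono_delay:.2f} ms")
print(f"left stereo speaker:   {stereo_delay:.2f} ms")
```

With these assumed dimensions the side-wall reflection of the centre (mono) speaker arrives later after the direct sound than that of a stereo speaker placed nearer the wall. Other room layouts will give other numbers.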

I personally use mono most of the time when I AB test speakers or drivers in speaker under development. But that's mainly because I can switch faster which is important in an AB test due to the audible memory.
 
I personally use mono most of the time when I AB test speakers or drivers in speaker under development. But that's mainly because I can switch faster

I guess we all agree on mono AB tests being advantageous for a lot of cases such as speaker development or single driver comparison. My understanding is that the more a listening comparison could be described as a discrimination test rather than a preference test, monaural is superior in detecting slight differences, whatever the imperfections might be. Even for whole systems which have undergone the final development stages already, the outcome of distortion, narrow-banded FR imperfections, dynamic compression or alike, is more likely to be identified in mono.
 
Precise and stable mono localization, which makes stable stereo localization likely, sounds unnatural to most listeners and will most likely not be preferred by a lot of untrained listeners. You surely could do such listening tests with educated people like recording engineers, telling them that solely localization stability and minimum width of the perceived source are ideal, but this basically rules out reasonable listening tests with other people who seemingly prefer the ´pleasing´ effect.
I think you (and other untrained listeners) misunderstand the whole thing. Listening to a single loudspeaker (at a time) is used in evaluating/comparison/testing/reviewing of different loudspeakers. It is not intended for everyday listening (for pleasure), because everyone is using not one but two loudspeakers, in stereo.
So, untrained listeners should be educated to look for stable and precise sound localization while testing a single speaker, and certainly not to praise a "pleasing" effect. Actually, it is not "pleasing" and "natural" - it is wrong!

My former professor for recording called these monaural sources ´birds on the wire´, in opposition to natural instruments/voices recorded in their dedicated room. Most people perceive these as ´unnatural´ even in a stereo environment with a lower amount of discrete reflections.
Did your professor refer to casual listening to a monaural source (single loudspeaker) for pleasure, or to testing/reviewing a single loudspeaker? Very different!

The additional problem with mono is that it cannot contain a meaningful reverb pattern giving the listener an impression of distance, depth-of-field and ´air´ around the phantom sources, if that makes sense. With completely dry recordings and solely the image stability, as you suggested, this might work.
I didn't suggest that. Recordings which contain reverb (natural or artificial) will give depth-of-field even when listening to a single speaker.

The moment recordings are involved containing reverb pattern, it fails, as dominating reverb in mono sounds always weird and overly distant, lacking width for obvious reasons.
But an expert listener should seek and praise "overly distant" sound while testing/reviewing a single loudspeaker! I don't care (and neither should you) about the wrong opinion ("weird") of untrained listeners!

The phenomenon is known, but could you give any meaningful tolerance band for acceptable deviations between two speakers of a stereo pair?
Yes, I could - it should be less than 0.2 dB for an excellent pair of loudspeakers. Good stereo loudspeakers routinely have less than 0.5 dB difference between the two speakers.

And if you say it should be ideally zero and it is that important, don't you think it should be tested with every single speaker review containing measurements?
I really don't know how you missed the plethora of tests/reviews which contain frequency response measurements of both loudspeakers from the stereo pair?! Here is the test of medium-priced KEF Q11 Meta loudspeakers from Hi-Fi News magazine, with measured frequency responses of left (black curve) and right (red curve) loudspeaker:
KEF Q11 Meta_Response.jpg


As you can see, black (L) and red (R) curves are practically overlapping from 20 Hz to 60 kHz!

And here is the frequency response of a pair of the much more expensive Piega Coax 411 G2:
Piega Coax 411_Response.jpg

Here the difference between the two speakers is obvious (in spite of the high price).

And how do you imagine this to be quantified if we take phase differences, amplitude differences over tiny angle deviations as well as direct reflections in the room into account? We would need precise data for all of this, with the latter largely depending on the individual listening room, to have a rudimentary understanding about phantom source localization quality in stereo. Isn't it easier to just do a listening test in stereo?
I don't have to imagine anything - measuring the anechoic frequency response over all angles (it is called SPINORAMA) is the accepted gold standard in the science community. We don't "need precise data" with included "direct reflections in the room" at all. We need just the anechoic SPINORAMA.
And no, it is not "easier to just do a listening test in stereo" - it is just plain wrong. Or not discerning enough, at best.
 
Listening to a single loudspeaker (at a time) is used in evaluating/comparison/testing/reviewing of different loudspeakers.

FYI, I am a more than trained listener with decades of experience in comparing, evaluating and adjusting recordings, loudspeakers and rooms alike. And I understood exactly what you meant by using single loudspeaker setups for comparison. For evaluating certain criteria of a loudspeaker, I agree that monaural tests are very useful and preferable. But I do not see any valid argument that this is the case for judging a speaker's stereo imaging, localization and ambience.

Actually, it is not "pleasing" and "natural" - it is wrong!

As mentioned, even trained listeners are not familiar with the sound of monaural, rather dry material, so they have to be told in advance what you define as ´wrong´. We certainly agree on the fact that blurred, more distant or unstable localization is a flaw and to be avoided. My point is that all these would be easier to identify and evaluate with phantom source localisation in a stereo setup using existing recordings containing a lot of reverb, rather than monaural ones. I have done such tests with numerous recording professionals, who usually prefer to use their own material as a reference point. And none of them showed up with a mono recording, yet.

Did your professor refer to casual listening to a monaural source (single loudspeaker) for pleasure, or to testing/reviewing a single loudspeaker?

None of those cases, it was about intensity-stereophony panning from closely mic´ed tracks and how to avoid this ´mono speaker´-like impression in a stereo mix.

it should be less than 0.2 dB for an excellent pair of loudspeakers.

You mean 0.2dB max deviation peak-to-peak in any narrow band, like +-0.1dB tolerance band? Could you name an example of a commercially available loudspeaker meeting this criterion please? The KEF you mentioned as ´practically overlapping´ seems to exceed this tolerance band by far, although the limitations of the graph make it impossible to read the deviation precisely.
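
To make the two readings of the tolerance explicit, here is a minimal sketch that compares a left and a right response curve both ways: as a maximum absolute difference (a "+/- x dB" band) and as the peak-to-peak spread of the difference. The function name and the four response values are made up for illustration; they are not measurements of any real speaker.

```python
def pair_deviation_db(left_db, right_db):
    """Return (max absolute difference, peak-to-peak spread) in dB between
    two responses sampled at the same frequencies."""
    diffs = [l - r for l, r in zip(left_db, right_db)]
    max_abs = max(abs(d) for d in diffs)
    peak_to_peak = max(diffs) - min(diffs)
    return max_abs, peak_to_peak


# Hypothetical SPL values (dB) at four frequencies, not real measurements.
left = [85.00, 85.15, 84.95, 85.10]
right = [85.05, 85.00, 85.10, 85.00]

max_abs, peak_to_peak = pair_deviation_db(left, right)

within_plus_minus_02 = max_abs <= 0.2        # reading 1: +/- 0.2 dB band
within_02_peak_to_peak = peak_to_peak <= 0.2  # reading 2: 0.2 dB peak-to-peak

print(f"max |L-R| = {max_abs:.2f} dB, peak-to-peak = {peak_to_peak:.2f} dB")
print(within_plus_minus_02, within_02_peak_to_peak)
```

The same hypothetical pair passes the "+/- 0.2 dB" reading but fails the "0.2 dB peak-to-peak" reading, which is why the distinction matters when a tolerance is quoted.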

And how about attenuating discrete early reflections, should they also stay well under 0.2dB max interaural deviation from each other? I guess it would be difficult to find an existing room meeting these standards.

I really don't know how you missed the plethora of tests/reviews which contain frequency response measurements of both loudspeakers from the stereo pair?

I do not read Hi-Fi News, sorry. In the pro world some reviewers with excellent lab facilities did it in the past, but I do not really see it on a regular basis.

measuring the anechoic frequency response over all angles (it is called SPINORAMA) is the accepted gold standard in the science community.

Could you elaborate on the method with which you are evaluating solely from a Spinorama set of data how localization stability, imaging, perceived distance and depth-of-field of a given loudspeaker will be rated? I have never read about a model allowing such prediction, but maybe you know one.

We don't "need precise data" with included "direct reflections in the room".

I would be really interested how you evaluate the aforementioned criteria of localization and imaging of a speaker with neither performing a stereo listening test, nor taking reflections into account. So, if I name you two different speakers with a published set of measurements which are at least as good as the KEF in your example regarding inter channel tolerance band, could you tell which one will be superior in terms of localization stability, which one will provide a more distant or near imaging, a wider ambience or a flat staging? I would be very interested in the model behind such calculation.
 
FYI, I am a more than trained listener with decades of experience in comparing, evaluating and adjusting recordings, loudspeakers and rooms alike. And I understood exactly what you meant by using single loudspeaker setups for comparison. For evaluating certain criteria of a loudspeaker, I agree that monaural tests are very useful and preferable. But I do not see any valid argument that this is the case for judging a speaker's stereo imaging, localization and ambience.
I am wondering why you, as a trained listener, continue to beat that dead horse? Of course you can't see any valid argument, if you intentionally choose to be blind to all the audio science evidence.

My point is that all these would be easier to identify and evaluate with phantom source localisation in a stereo setup using existing recordings containing a lot of reverb, rather than monaural ones.
Your point is wrong.

None of those cases,..
So, irrelevant to this discussion...

You mean 0.2dB max deviation peak-to-peak in any narrow band, like +-0.1dB tolerance band?
+/- 0.2 dB.

Could you name an example of a commercially available loudspeaker meeting this criterion please?
Yes - KEF Q11 Meta. And many active studio monitors.

The KEF you were mentioning as ´practically overlapping´ seems to exceed this tolerance band by far,...
It seems to me you are wrong.

I would be really interested how you evaluate the aforementioned criteria of localization and imaging of a speaker with neither performing a stereo listening test, nor taking reflections into account.
Simple - by listening to a single loudspeaker in a room, taking reflections into account (subjectively).

So, if I name you two different speakers with a published set of measurements which are at least as good as the KEF in your example regarding inter channel tolerance band, could you tell which one will be superior in terms of localization stability, which one will provide a more distant or near imaging, a wider ambience or a flat staging? I would be very interested in the model behind such calculation.
If the measurements are similar, then both speaker models are equal in localization stability. To evaluate whether the image is more distant or near (or flat), you have to listen to/evaluate/review a single loudspeaker. You can't calculate that, you have to listen to a single loudspeaker (not a stereo pair!).
 
As they say- theory and practice are the same only in theory. Just ask the scientists in Harman. ;)
Harman showroom story
As they say - in theory a reasonable man will embrace evidence against his previous wrong beliefs, accepting the reality. But in practice...
Quote from the link above:
" The space was designed to showcase their full range of consumer and audiophile equipment, the street level would be devoted to Harman’s AKG, Harman Kardon, JBL, and other respected brands. The lower level would feature a screening room and, behind a discrete sliding door, a private listening room outfitted with Harman’s premium Mark Levinson amplifiers, and award-winning Revel Salon2 Speakers, a CD/SACD Disc player, and a turntable for vinyl aficionados. "
So, that Harman showroom is just that - showroom/showcase where visitors can listen to Harman equipment.
That showroom is not intended to test/review different loudspeakers (the audiophile hobby). For that purpose you should listen to a single loudspeaker at a time, comparing different loudspeakers - just ask a scientist in Harman, like Floyd Toole, who said exactly that numerous times, in this thread also.
 
As they say - in theory a reasonable man will embrace evidence against his previous wrong beliefs, accepting the reality. But in practice...
Quote from the link above:
" The space was designed to showcase their full range of consumer and audiophile equipment, the street level would be devoted to Harman’s AKG, Harman Kardon, JBL, and other respected brands. The lower level would feature a screening room and, behind a discrete sliding door, a private listening room outfitted with Harman’s premium Mark Levinson amplifiers, and award-winning Revel Salon2 Speakers, a CD/SACD Disc player, and a turntable for vinyl aficionados. "
So, that Harman showroom is just that - showroom/showcase where visitors can listen to Harman equipment.
That showroom is not intended to test/review different loudspeakers (the audiophile hobby). For that purpose you should listen to a single loudspeaker at a time, comparing different loudspeakers - just ask a scientist in Harman, like Floyd Toole, who said exactly that numerous times, in this thread also.
Sure, that's why they finally outsourced acoustician.
 
Sure, that's why they finally outsourced acoustician.
Harman is all in the field of designing audio equipment (loudspeakers, amps, etc.). Designing room acoustics is a very different field of science.
You really didn't know they are different fields of science (and business)?! :facepalm:
 
Your point is wrong.

As you prefer to post apodictic claims instead of explanations, I see very little evidence for that.

So, irrelevant to this discussion

Actually very relevant, as it describes the subjective impression people get when doing such experiments.

KEF Q11 Meta. And many active studio monitors.

Could you name any of the active ones please? I am not familiar with the KEF, but I guess there is no point in naming a speaker as an example for imaging which does not meet basic standards of on-axis linearity and off-axis linearity alike.

Simple - by listening to a single loudspeaker in a room, taking reflections into account (subjectively).

That is not the same, and you probably know it. Reflections (particularly the early ones affecting localization) in a room originating from a single speaker reproducing monaural content, from the perspective of our ears always support the localization of the true existing source, the mono loudspeaker. Even if they are dominating.

That is not comparable to their influence in a stereo setup, where they would also help the ears reveal the true localizable sources - the two loudspeakers - while the intended localization is the phantom source somewhere in between the two real sources. The result in terms of localization is not predictable or transferable from the monaural experiment.

If the measurements are similar, then both speaker models are equal in localization stability.

I do not disagree, but that is a theoretical scenario. If you measure interchannel differences in a real room taking the reflections into account, you will always end up with differences way greater than your proclaimed threshold. It might vary from speaker to speaker as completely identical driver layout and directivity alike are quite rare to find (even in a series like Genelec 83xx which were designed to provide such).

Knowing that there are differences exceeding your threshold would not help. I do not see a way how you would be able to predict which speaker offers which imaging, localization and ambience characteristics from these measurements. If you have a model to predict exactly that, it would be helpful to know how it works.

To evaluate whether the image is more distant or near (or flat), you have to listen to/evaluate/review a single loudspeaker.

No, you cannot. In a monaural scenario, the reverb pattern contained in the recordings is reproduced in a completely different way, as it is also monaural (unless we are talking about a completely dry recording, which would be useless for judging distance or depth-of-field). In a two-channel scenario, a good portion of the reverb pattern relies on phantom source principles, just like the direct sound affecting the localization. How these play together with the reflections in the listening room is key to how we perceive depth-of-field and distance. You cannot judge that in mono at all.

The other aspect you seemingly ignore is localization stability. In a mono setup, you deal solely with real sound source localization. Stereophony relies on phantom source localization, with small interaural tolerance being one of the factors, but by far not the only one. Results of an experiment done according to one of these principles are not transferable to the other.

To double-check this, I recommend listening to speakers with an MTM arrangement, known as D'Appolito or virtual point source. Preferably with bigger midrange drivers, a greater distance between them and a higher crossover frequency, preferably in a near-field environment. Something looking like this:

Burmester_D.jpg


In mono, these will present, if well-designed, an almost perfectly stable localization, as all localizable frequencies seemingly originate from one point, the tweeter axis. The way we perceive localizable content from the midranges can be compared to phantom source localization (although it is not exactly that). In contrast, when listened to in a stereo environment, we have to deal with six real sources supposed to form one stable phantom localization, which is prone to fail, particularly under nearfield conditions or when direct reflections are involved that reveal the direction of the real sound sources (sidewalls, mixing console, desk or the like).

One might say this is an extreme example, but something like that, to a lesser degree, is at play with most non-point-source speakers.
 
In conventional stereo systems, multidirectional loudspeakers seem to benefit this kind of music, but also even pop recordings with hard-panned L and R images - monophonic sounds. It is often considered more pleasing if the instrument or group of instruments does not localize to a single point in space. Local early reflections generate acoustical interference - the "dreaded" comb filtering - but there are many such filters even in small rooms, so they are rarely audible problems. Concert halls are just enormously complicated comb filters. Fortunately, humans have binaural hearing and instead of hearing coloration from the measured combs, we hear spaciousness and it is desirable.
This seems to be the experience that I have every time I go to one of my friends with a set of Focal Alto Utopia BE. These speakers have much wider dispersion than my DIY of a KEF R900.
The wider dispersion definitely adds a much more spacious sound, whereas my narrower-dispersion KEF speakers have a much more focused image.
We both have multiple subwoofers, global EQ below Schroeder and above Schroeder, we use quasi anechoic in-room measurements, to find linear deviation in the horizontal plane to slightly EQ them to be smoother than original.
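
The comb filtering mentioned in the quoted passage can be sketched for the simplest case: the direct sound plus one reflection. This is an illustration only, with an assumed 1 ms reflection delay and an assumed reflection gain of 0.5; the function names are hypothetical, and a real room has many overlapping reflections rather than a single one.

```python
import cmath
import math


def comb_magnitude_db(freq_hz, delay_s, gain):
    """Magnitude (dB) of direct sound plus one delayed, attenuated copy:
    H(f) = 1 + gain * exp(-j * 2 * pi * f * delay)."""
    h = 1 + gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20 * math.log10(abs(h))


def first_notch_hz(delay_s):
    """Frequency of the first cancellation notch, f = 1 / (2 * delay)."""
    return 1.0 / (2.0 * delay_s)


# Assumed single reflection: 1 ms late, half the amplitude of the direct sound.
delay, gain = 0.001, 0.5
for f in (250, 500, 1000, 1500, 2000):
    print(f"{f:5d} Hz: {comb_magnitude_db(f, delay, gain):+.2f} dB")
print("first notch at", first_notch_hz(delay), "Hz")
```

A single reflection like this gives regularly spaced peaks and dips. With many reflections at different delays, as in the rooms and halls the quote describes, the individual patterns overlap.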

A question though. I just listened to "I Heard It Through the Grapevine". Did they mic/mix music differently back then? It's as though the panning between left and right is much more pronounced. Older music tends to have some sounds/instruments coming purely from one speaker, and it sounds as though the speakers they mixed on back then had a different design overall?
 
Focal Alto Utopia BE. These speakers have much wider dispersion than my DIY of a KEF R900.
The wider dispersion definitely adds a much more spacious sound, whereas my narrower-dispersion KEF speakers have a much more focused image.

It is not only about wide or narrow dispersion, but also the actual angles count at which an SPL comparable to the on-axis response is retained, as well as the directivity over frequency.

Of particular importance is the frequency band 2-4K and the neighboring bands. It is not only the band signaling more or less frontal angle of attack of the direct sound when being boosted slightly in level (see HRTF and Blauert´s directional band theory), it is also the band underrepresented in perception of the diffuse soundfield.

You have chosen two extreme examples of loudspeakers concerning these frequency bands: The Focals seemingly have a very broad-dispersing tweeter and a convex baffle, hence a very low directivity index (d.i.) in this frequency band and lots of SPL at the angles that cause the most direct reflections. The KEF, on the other hand, shows a step up in directivity index somewhere in this region due to the tweeter placement and the huge midrange cone acting as a waveguide. So you are comparing one speaker with a significantly overrepresented 2-4K band in the indirect sound field (Focal) and one with a significantly underrepresented one (KEF).
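
For readers unfamiliar with the term, a directivity index can be approximated as the on-axis level minus the power average of the levels radiated in all measured directions. The sketch below is a deliberately simplified, unweighted version of that idea (a proper sound power calculation weights each measurement angle); the function names and the off-axis values are invented for illustration and do not describe the Focal or the KEF.

```python
import math


def power_average_db(spl_db):
    """Energy (power) average of several SPL values given in dB."""
    mean_power = sum(10 ** (s / 10) for s in spl_db) / len(spl_db)
    return 10 * math.log10(mean_power)


def directivity_index_db(on_axis_db, off_axis_db):
    """Simplified DI: on-axis level minus the unweighted power average
    over all supplied angles (on-axis included). An assumption, not a
    standardised sound power calculation."""
    all_angles = [on_axis_db] + list(off_axis_db)
    return on_axis_db - power_average_db(all_angles)


# Hypothetical levels (dB re on-axis) at three off-axis angles in one band.
wide = directivity_index_db(0.0, [-1.0, -2.0, -3.0])     # broad dispersion
narrow = directivity_index_db(0.0, [-3.0, -6.0, -10.0])  # narrow dispersion

print(f"wide-dispersion DI:   {wide:.2f} dB")
print(f"narrow-dispersion DI: {narrow:.2f} dB")
```

In this toy example the speaker that holds its level off-axis ends up with the lower index, which corresponds to more energy in that band reaching the room's reflecting surfaces.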

I personally perceive both as flawed and would prefer a more neutral one. The overall level of reverberation added in the listening room and d.i. is a separate question.

I just listened to "I Heard It Through the Grapevine".

The original Marvin Gaye recording?
 
Please remind me what is the name of Dr. Toole's book?
I am happy to do it - the name of the book is "Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms".
Got it? Sound reproduction! Rooms and room acoustics do not reproduce sound. The loudspeakers do. The sound we hear is coming from the loudspeakers, shaped by the room acoustics.
The field of research into how we hear loudspeakers through the room acoustics (i.e. psychoacoustics) is not the same as the field of designing room acoustics. You really don't know they are different fields of science (and business)?! :facepalm: You really don't know that loudspeakers are Harman's business - not designing and building room acoustics?! :facepalm: I am sure Floyd Toole has more important things to do in his life than designing the acoustics of that particular showroom.
 
Harman is all in the field of designing audio equipment (loudspeakers, amps, etc.). Designing room acoustics is a very different field of science.
You really didn't know they are different fields of science (and business)?! :facepalm:
Dunno if it is so, but do you, in American English, take the word 'science' for engineering and vice versa?

On topic, I've got a few questions.

The mono, how is it made? Is it just one channel, or the two of stereo combined? Any problems with special artifacts from either method?
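
The two methods asked about here can be sketched on toy sample lists to show the kind of artifact each one risks. This is a minimal illustration, not a description of how any particular test was run; the function names and sample values are assumptions.

```python
def downmix_sum(left, right):
    """Mono by summing both channels and halving: (L + R) / 2."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def downmix_left_only(left, right):
    """Mono by taking the left channel alone and discarding the right."""
    return list(left)


# Toy case 1: content that is out of phase between the channels.
out_of_phase_l = [0.5, -0.5]
out_of_phase_r = [-0.5, 0.5]
print(downmix_sum(out_of_phase_l, out_of_phase_r))  # cancels completely

# Toy case 2: an instrument hard-panned to the right channel.
hard_right_l = [0.0, 0.0]
hard_right_r = [1.0, 1.0]
print(downmix_left_only(hard_right_l, hard_right_r))  # instrument is missing
print(downmix_sum(hard_right_l, hard_right_r))        # present, at half amplitude
```

Summing loses anything that is out of phase between the channels and changes the balance between centred and panned sources, while taking one channel alone drops whatever was panned to the other side.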

A mono speaker is going to energize the room with a lot of standing waves (aka resonances), and no second speaker would help out. How is the crucial bass quality maintained?

As far as I understood, people are more consistently able to detect the presence of resonances in comparison when listening to a mono speaker. What does detection mean in this context, and how does that translate to a preference score?

If stereo equalizes the preferences for (in part vastly) different speakers, why bother with optimising for mono? The use case is stereo, so what for?

By what criteria are speakers equivalent under the test methods used?

In science there's a certain paradigm that if one finds an experimental contradiction to a theory, the latter has to be adapted, or gets dumped altogether. I understand that speaker assessment isn't strict science in this philosophical sense, for good reasons. But for the fun of it, what does a hypothesis look like that could be falsified with an experiment? I think I read it once, namely "People in general are more discriminative, and across the board more consistent, in assessing speakers for quality when listening in mono, while the verdict is based on subjective preference." Can that be confirmed, is this the underlying hypothesis?
 