What exactly was the methodology used for assessing sound quality (which I presume to be tonal balance and absence of distortions)?
Was this single-speaker/mono testing only or also stereo, and were the speakers positioned in a way which respected the requirements of their topology? In other words, were the listeners sitting on the correct axis for that particular speaker and was the speaker located in the room for best balance as per design/manufacturer requirements?
You seem to be assuming that "neutrality/absence of resonances" is the prime concern for most audiophiles. Can you claim with certainty that it isn't "soundstage and imaging" or even some other nebulous perceptual characteristic that is hard to pin down (let's call it "engagement")?
Some types of distortion produce perceptual effects which are pleasing to the ear, such as the BBC dip ("distant" perspective), floor-bounce cancellation ("fast/tight" bass), an exaggerated top octave ("air") or even the limited low end extension of small standmounts. How the speakers interact with the room in the midrange and treble is a matter of preference and some people seem willing to trade "envelopment" for a flat frequency response or "sharper" imaging. (Even though I don't have scientifically gathered statistical evidence to back me up, I am nonetheless relying on over a decade of observing audiophile behaviour in several forums in which I have participated or currently participate, in the US, UK, France, Spain and Portugal.)
Tuga asked several important questions, all of which have been answered in great detail in my JAES papers starting 39 years ago, and my two “Sound Reproduction” books, the 3rd edition in 2017. I won’t attempt fully verified/referenced answers, but I will add a few comments.
“What exactly was the methodology used for assessing sound quality (which I presume to be tonal balance and absence of distortions)?”
The motivation for me to get involved in scientific research into sound reproduction was a blind listening test conducted in 1966 on a few loudspeakers that measured (and consequently sounded) very different from each other. All were highly regarded “HiFi” products and my question was simply “how can loudspeakers that are so very different all be considered examples of the state-of-the-art?” The result of that rudimentary exercise was that there was a clear consensus among the listeners, some of whom owned the loudspeakers under test, as to which ones were most preferred. But there was a second factor at play, because I had done on- and off-axis anechoic measurements on the loudspeakers (see Figure 18.1 in the 3rd edition of my book). The products that were highest rated exhibited the most appealing frequency response curves if one places value in flatness and smoothness. Reliable (blind or double-blind, equal loudness) subjective and (anechoic on- and off-axis) objective data evidently were useful information. Both sets of data were, and largely remain, absent from casual, and even “professional” assessments of loudspeakers. Decent measurements are becoming more common, but definitive subjective evaluations remain largely elusive. A lot of "hand waving" substitutes. The listening methodology was described in my JAES paper:
Toole, F. E. (1982). “Listening tests – turning opinion into fact”, J. Audio Eng. Soc., 30, pp. 431-445.
It has, of course, evolved since then, but the fundamental tenets are intact.
Olive, S. E., Castro, B. and Toole, F. E. (1998). “A New Laboratory for Evaluating Multichannel Audio Components and Systems”, 105th Convention, Audio Eng. Soc., Preprint 4842.
“Was this single-speaker/mono testing only or also stereo, and were the speakers positioned in a way which respected the requirements of their topology? In other words, were the listeners sitting on the correct axis for that particular speaker and was the speaker located in the room for best balance as per design/manufacturer requirements?”
Both stereo and mono tests were done, but in terms of evaluating sound quality mono tests were by far the most reproducible and consistent across populations of listeners. Loudspeakers that rated highly in mono tests always rated highly in stereo tests, even when there was an attempt to separate the tonal and spatial aspects of what was heard. See Section 7.4.2 for a detailed discussion of a serious stereo vs. mono comparison test (published in JAES in 1985-86). As for listening configurations, the world still awaits such guidance from most manufacturers. Trial and error seems to be the near universal advice to new owners.
“You seem to be assuming that "neutrality/absence of resonances" is the prime concern for most audiophiles”.
I don’t think I ever attempted to assume anything on behalf of “most audiophiles”, a population of limitless variability. All that has been presented are the results of tests conducted on, by now, hundreds of loudspeakers by hundreds of listeners. The first published subjective/objective evaluations showed clearly that listeners responded most favorably to loudspeakers that exhibited the least evidence of resonances.
Toole, F. E. (1985). “Subjective measurements of loudspeaker sound quality and listener preferences”, J. Audio Eng. Soc., 33, pp. 2-31.
Toole, F. E. (1986). “Loudspeaker measurements and their relationship to listener preferences”, J. Audio Eng. Soc., 34, pt. 1, pp. 227-235, pt. 2, pp. 323-348.
“Can you claim with certainty that it isn't "soundstage and imaging" or even some other nebulous perceptual characteristic that is hard to pin down (let's call it "engagement")?”
Pages 174-186 in the 3rd edition discuss the importance of spatial dimensions of perception, showing evidence that they are comparable in importance with sound quality. My personal attitude is that if something doesn’t sound timbrally correct I don’t much care about space – especially now that relatively neutral reproduced sound quality can be achieved. Failure to radiate relatively neutral sound is evidence of engineering incompetence or of not caring. As for nebulous factors, the fact that two-channel stereo, with its inherent timbral and spatial flaws, is found to be a satisfactory end goal by so many is one, to be sure.
“Some types of distortion produce perceptual effects which are pleasing to the ear, such as the BBC dip ("distant" perspective), floor-bounce cancellation ("fast/tight" bass), an exaggerated top octave ("air") or even the limited low end extension of small standmounts.”
These are factors that include the program material, human hearing, the specific loudspeakers and rooms, none of which are consistent – in addition to personal preferences. When the issue can be addressed by tone controls or easily-accessed equalization, one can argue that there is virtue in starting with broadband neutral loudspeakers, or adding the missing bass to small neutral bookshelf loudspeakers. Why would one deliberately not reproduce the bass drum or organ pedals? Oh yes, “tight” bass. Each to his/her own. BTW low bass performance accounts for approximately 30% of one’s overall assessment of sound quality (Section 5.7 in the 3rd edition).
“How the speakers interact with the room in the midrange and treble is a matter of preference and some people seem willing to trade "envelopment" for a flat frequency response or "sharper" imaging. (Even though I don't have scientifically gathered statistical evidence to back me up, I am nonetheless relying on over a decade of observing audiophile behaviour in several forums in which I have participated or currently participate, in the US, UK, France, Spain and Portugal.)”
We each work from our own sources of information. I put my trust in data that are reproducible from time to time, place to place, over a statistically interesting proportion of the population. Accurate measurements qualify, as do controlled double-blind listening tests. I have been fortunate over my career to have had access to both for 26 years at the National Research Council of Canada, and 16 years at Harman International.