Toole's findings are generally about mass market gear in normal reflective rooms. And so on. Situations beyond the ordinary haven't received much attention. Are we ordinary listeners, or more than that?
What people often don't know - because they haven't read the papers or my books - is that at the very outset of my research the key set of experiments was done in collaboration with the Canadian Broadcasting Corporation, a massive nationwide organization that was seeking to upgrade its monitors: large, medium and small, for different circumstances. The goal was to find similar-sounding, timbrally neutral loudspeakers for their needs. The listeners were a mixture of their professional recording engineers from across the nation and local audiophiles that I had sought out. I had earlier given up on professional musicians as critical listeners of sound quality - most of them pay more attention to the music, seeking "valid interpretations". Others have found the same thing. There were no "people off the street".
Toole, F. E. (1985). “Subjective measurements of loudspeaker sound quality and listener preferences”, J. Audio Eng. Soc., 33, pp. 2-31.
Toole, F. E. (1986). “Loudspeaker measurements and their relationship to listener preferences”, J. Audio Eng. Soc., 34, pt. 1, pp. 227-235, pt. 2, pp. 323-348.
There were two important findings. First, pro engineers and audiophiles liked and disliked the same loudspeakers in double-blind, equal-loudness comparison tests. Second, a sub-group of the engineers had trouble delivering consistent ratings of loudspeakers when they heard them repeatedly in the randomized presentations. It turns out that these individuals had suffered significant hearing loss, which is an occupational hazard in pro audio. They were not hearing all of the sounds, good or bad, and therefore could not be as reliably analytical. Yet, they were creating recordings! How many like them are out there in the music and movie industries? A lot.
All of this was described in JAES publications in 1985-86, and is in my books.
The next phase was to see if the impressive agreement among listeners extended to "ordinary folks" - and, surprise, surprise, it did. The obvious question is: How could they possibly know what good sound sounded like? It turns out that everyone seems to be able to recognize aspects of reproduced sound that are not "natural". In particular, all listeners objected to persistent audible resonances - booms, honks, nasality, shrillness, etc. The absence of resonances in transducers and enclosures turns out to be the fundamental requirement for a "neutral/accurate" loudspeaker. General spectral humps, dips and tilts are there of course, but they tend to be correctable with tone controls or simple equalizers. These are the principal variations in recordings and are at the basis of "personal preference", but, that said, most listeners prefer smooth, flat direct sound (on-axis performance). Trained listeners are simply those practiced in the detection of resonances. They deliver the same sound quality ratings as others, but they do it more quickly and more reliably.
Olive, S. E. (2003). “Difference in Performance and Preference of Trained versus Untrained Listeners in Loudspeaker Tests: A Case Study”, J. Audio Eng. Soc., 51, pp. 806-825.
Sean has found the same thing in headphone evaluations done with many, many listeners around the world, eliminating yet another suspected factor: national or cultural bias.
So, from this perspective the ideal loudspeaker is one that is fundamentally neutral, and this is something that we can determine with impressive precision from measurements alone (the spinorama being one example). With this as a starting point simple tone controls or equalizers can create whatever "sound" the listener personally prefers, with any recording (they vary). It is always possible to return to "neutral" to hear what was created for us to enjoy.
However, bear in mind that the quality and quantity (primarily extension) of low frequencies account for about 30% of one's overall assessment of sound quality. The room itself (with loudspeaker and listener locations) is the dominant factor below about 200 Hz. This is something that cannot be generalized - it is totally dependent on individual circumstances. There are ways to address this problem, as discussed in my books.
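One reason the low-frequency result is so room-specific is that the standing-wave (modal) frequencies are set by the room dimensions. As a rough illustration only, the sketch below lists the axial modes along a single room dimension using the textbook relation f = n·c/(2L). The function name, the 343 m/s speed of sound, and the example room lengths are assumptions made for this example; they are not taken from the post or the cited papers, and a real room also has tangential and oblique modes plus the effects of loudspeaker and listener position.

```python
def axial_modes_hz(length_m, max_hz=200.0, speed_of_sound=343.0):
    """Axial standing-wave frequencies along one room dimension, up to max_hz.

    Uses f = n * c / (2 * L). The default speed of sound (343 m/s) is an
    assumed round value for air at room temperature.
    """
    modes = []
    n = 1
    while True:
        f = n * speed_of_sound / (2.0 * length_m)
        if f > max_hz:
            break
        modes.append(f)
        n += 1
    return modes


if __name__ == "__main__":
    # Two hypothetical room lengths: the modes land at different frequencies.
    for length in (4.0, 5.0):
        rounded = [round(f, 1) for f in axial_modes_hz(length)]
        print(f"{length} m: {rounded}")
```

Changing the length by one metre moves every mode, which is why no single low-frequency correction can be generalized across rooms.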
In reality, stereo itself is the most serious impediment to getting the level of sound and spatial quality we seek. Two channels are not enough, and ALL phantom images between the loudspeakers are corrupted by acoustical interference - each ear hears two sounds, one delayed. But humans are remarkably adaptable, and forgiving.
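The interference described here can be sketched as a simple two-path sum: each ear receives the sound from the nearer loudspeaker plus a delayed copy from the farther one. The code below is a minimal sketch under stated assumptions, not a model of any measured system: the 0.25 ms delay is an illustrative round figure for a conventional stereo setup, `far_gain` is a hypothetical parameter standing in for head shadowing, and the function names are made up for this example.

```python
import math


def comb_magnitude_db(freq_hz, delay_s, far_gain=1.0):
    """Level in dB of a direct sound plus one delayed copy, relative to the direct sound alone.

    far_gain scales the delayed copy (1.0 = equal level; lower values are a
    crude, assumed stand-in for head shadowing).
    """
    phase = 2.0 * math.pi * freq_hz * delay_s
    real = 1.0 + far_gain * math.cos(phase)
    imag = -far_gain * math.sin(phase)
    magnitude = math.hypot(real, imag)
    # Clamp to avoid log10(0) at an exact cancellation.
    return 20.0 * math.log10(max(magnitude, 1e-12))


def first_null_hz(delay_s):
    """Frequency of the first cancellation, where the delay equals half a period."""
    return 1.0 / (2.0 * delay_s)


if __name__ == "__main__":
    delay = 0.00025  # assumed 0.25 ms extra path delay to the far-side ear
    print("first null (Hz):", first_null_hz(delay))
    for f in (500, 1000, 2000, 4000):
        print(f, "Hz:", round(comb_magnitude_db(f, delay, far_gain=0.5), 1), "dB")
```

With the assumed 0.25 ms delay the first cancellation falls at 2 kHz. Setting `far_gain` below 1.0 makes the dip shallower, but it does not remove it.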