Thank you for your replies.
I understand the need for blind testing as a means to remove bias.
I have found that familiarity provides the highest level of discrimination in a listening assessment, and that is achieved over long term listening with familiar music and in a familiar system/room, and in that way assessing only a single variable/change to the system/sound.
Is this something that you considered when developing the methodology for your tests and testing facilities?
I have stressed the importance of familiarity but another extremely important factor in my view is the adequate setup of speakers in the room for best bass frequency response (and your research indicates that "low bass performance accounts for approximately 30% of one’s overall assessment of sound quality") and room "activation" (how differences in directivity interact with the room for "spatiality" effects, such as dipoles and omnis), and axial positioning or toe-in (e.g. Dali speakers are designed for flat response 30° off-axis).
Would you confirm that neither aspect was/is addressed in the shuffler rooms?
If neither was addressed, then some speakers were listened to in sub-optimal conditions and thus unfairly handicapped.
I understand, though, that providing optimal conditions for testing would have been impractical.
In regard to the mono vs stereo performance, your research shows that "spatial quality" ratings when listening in stereo improve for narrowing directivity monopoles (Kef) and for dipoles (Quad).
This seems to indicate that wider directivity is less important in stereo pairs, which is how speakers have been used for 50 years, and thus my interpretation of the data would have led me to dismiss mono as a means to assess "spatial quality".
Also, I would expect that anomalies in both the axial response and also the quality and amount of bass would still have influenced the listener preference when assessing "spatial quality" (as you've mentioned, "if something doesn’t sound timbrally correct I don’t much care about space"). Were these aspects corrected through optimal positioning and/or high-passing and the use of EQ?
I understand that again the question of practicality arises. It would be difficult to find even a pair of prototypes of the same speaker, one with wide and the other with narrowing directivity, but that in my view would have prevented the introduction of other variables which undermine the effectiveness of the tests. Once we introduce more than one variable we no longer know for certain what the listeners are reacting to (is it the different bass response, a dip in the presence region, or the way the speakers interact with the boundaries to generate a more pleasing level of "envelopment" and "spaciousness"?).
For the reasons highlighted I feel reluctant to agree with your interpretation of both the adequacy of mono testing for "spatial quality" assessment and, consequently, of the results of that testing.
Harman's "target curve" research also seems to indicate that untrained listeners prefer a lot more bass and more treble (sloping upwards with frequency). This is also my impression from observing people's reports on different speakers and show systems.
Isn't this indicative that people have different preferences when it comes to tonal balance?
I think most would agree that the ultimate goal of a playback system is to provide listening enjoyment to the end user. And because we have different tastes in music and in "presentation", and different rooms, perhaps creating a standard one-size-fits-all kind of speaker is a disservice to the community.
There's no doubt that the research you conducted was pioneering and produced a significant amount of valuable data. As mentioned earlier, I have some reservations about both some of the methodology used in the tests and the interpretation of some of the data, but I am just a curious and inexperienced amateur with a lot of questions.
(I wish I could have expressed myself more clearly and eloquently but my means of expression is the drawing not the word, and English is not my first language)
Tuga said: “I understand the need for blind testing as a means to remove bias.
I have found that familiarity provides the highest level of discrimination in a listening assessment, and that is achieved over long term listening with familiar music and in a familiar system/room, using only a single variable. Is this something that you considered?”
Of course it was considered. It is called the “single stimulus” method of evaluation, which should also be done blind to avoid bias. The “take it home and listen to it” style of evaluation is a default, done in the absence of anything better. It is the norm among subjective reviewers. Comparison tests are generally more revealing of differences, but even then, if there are only two products and they share a defect, it may go unnoticed. We found that comparisons among 3 or 4 randomly presented sounds (loudspeakers) were extremely revealing of differences that went unnoticed in prolonged exposure to single sounds, where adaptation (a profound capability of humans) is a major factor. We learn to “listen through” many kinds of technical peculiarities and flaws to be able to enjoy the music. What you are describing as your preferred method is not blind (and therefore subject to bias) and generously allows for adaptation. Familiarity with the music is not necessary, only the ability to discern aspects of its reproduction that are not natural or pleasing. An important aspect of listening blind to several versions of the same program is that one quickly distinguishes the timbral features associated with the individual loudspeakers from those that are “constant”, namely the program itself and the room. Not all programs are equally revealing of differences: Olive, S.E. (1994). “A Method for Training Listeners and Selecting Program Material for Listening Tests”, Audio Eng. Soc. 97th Convention, preprint 3893.
Over the many years of doing these evaluations we have encountered numerous people who shared your view – including virtually all money-earning subjective reviewers. When they experienced the double-blind multiple-comparison test they performed no better than “ordinary” people: Olive, S.E. (2003). “Difference in Performance and Preference of Trained versus Untrained Listeners in Loudspeaker Tests: A Case Study”, J. Audio Eng. Soc., 51, pp. 806-825.
The reviewers often commented that they wished they had the facilities we had, but lacking them, did what they were able to do and felt comfortable with. You seem to be in that camp when you say: “familiarity provides the highest level of discrimination”. How do you know?
BTW, people who think they have “tin ears” usually turn out to perform normally. Those listeners who are distinctive or indecisive in their opinions usually have hearing loss. Musicians do not appear to have advantages, in fact many rationalize – I recall one saying about a mediocre loudspeaker that: “it is a valid interpretation of a cello”. He was listening to the music.
Tuga said: “In regard to the mono vs stereo performance, your research shows that "spatial quality" ratings when listening in stereo improve for narrowing directivity monopoles (Kef) and for dipoles (Quad).”
Here is Figure 7.14, summarizing that research:
It shows that in terms of sound and spatial quality ratings, mono evaluations were much more revealing of differences. The additional binaural effects in stereo recordings were perceptually rewarding to be sure, but in ways that disguised timbral differences between loudspeakers. The recordings themselves were the dominant factors. Spatial Effects in mono reproduction? Yes, that surprised us, but when listening it was clear that the best, most neutral, loudspeakers came closest to “disappearing” behind the visually opaque screen, revealing depth information in recordings. Further analysis revealed that the mono pattern of ratings was closely replicated in multi-mike pan-potted popular recordings in which hard-panned sounds are reproduced by single L or R loudspeakers. Classical recordings were quite inconsistent.
What were you referring to?
“Were these aspects corrected through optimal positioning and/or high-passing and the use of EQ?”
Of course we did not play with filters or EQ – we were evaluating speakers as they were manufactured. I would not know what “optimal positioning” is for a loudspeaker of unusual or aberrant design unless the manufacturer specified it. If it was specified, it was obeyed – e.g. Allison boundary-friendly designs. At Harman we were most interested in evaluating competing products. We did not have a budget to purchase off-beat, small-distribution products, whatever possible virtue they may have had. Still the list of products evaluated in detail is long.
“Harman's "target curve" research also seems to indicate that untrained listeners prefer a lot more bass and more treble (sloping upwards with frequency). This is also my impression from observing people's reports on different speakers and show systems.
Isn't this indicative that people have different preferences when it comes to tonal balance?”
Of course people can have preferences in spectral balance. This is why I poke fun at “High End” products that don’t have tone controls. Why not let customers buy broadband, neutral loudspeakers and let them compensate for program deficiencies or indulge personal preferences, always with the ability to return to neutral? In some of the tests you refer to, loudness was not compensated for, and it was sometimes thought that the bass and treble boosts used by younger, inexperienced listeners could just have been a way to turn the volume up. That said, it is obvious that some categories of contemporary music thrive on exaggerated bass.
“I am just a curious and inexperienced amateur with a lot of questions.”
Curiosity is a good thing – it is what drives research. I too was once an amateur, but rose above that station by applying the scientific method to my activities. The world and internet forums are filled with amateurs with opinions formed under circumstances where bias and adaptation are possibly as important contributing factors as the physical realities. If you are unable to apply the scientific method, the next best approach is to study that which has been done - which is the process you are in right now. So, I will terminate this personal tutorial and let you get on with reading more of the science. Keep an open mind, and enjoy.