A nice attempt to add clarification, but there is more. The reduced ability to hear resonant colorations in loudspeakers when more than one channel is operating applies only when multiple channels are radiating simultaneously. It is the large-room spatial information in recordings (envelopment - the sense of being in a large space) that is the distraction, not just the fact that there are multiple sources of sound. Listening room reflections are highly correlated with the loudspeakers located in the room. Recorded "reflections" of real or synthesized spaces are abstract quantities having no relationship to the space the eyes see. But these perceptions are what make stereo and multichannel so pleasurable.
There is no "envelopment" in mono signals. But mono signals exist in multichannel recordings whenever there is a hard-panned image. The most blatant example is the center channel in movies, which does most of the important work, delivering most of the dialog and much of the on-screen action sound. Solo instruments also appear in single loudspeakers. These are monophonic listening opportunities, and this is why, even though there is an overall degradation in one's ability to hear resonant colorations in loudspeakers in stereo and multichannel recordings, listeners in the end still prefer the most neutral loudspeakers. Hence the recommendation to identify neutral loudspeakers in mono listening - or by interpreting useful anechoic data - and then enjoy them in stereo and multichannel. The more channels there are, up to a point, the more persuasive the sense of envelopment. The added channels can deliver the appropriate long-delayed large-room reflections from appropriate directions. This is why loudspeaker directivity matters less in multichannel systems than it does in stereo or mono.
Does this help?