It seems to me that stereo illusions would get smeared if all the spectral cues were preserved in the reflections.
The ear/brain system primarily derives its directional cues from the first 0.68 milliseconds of a sound, but derives its "sense of space" cues from all subsequent reflections as well. Strong early reflections can influence the directional cues, as when strong early same-side-wall reflections widen the soundstage. But imo these strong early reflections DO smear the image precision, so it's a tradeoff.
I'm not sure whether spectrally equivalent early reflections have a greater or lesser effect on the apparent source width. Too much dissimilarity would disable the ear's ability to correctly identify reflections as such. It is not clear to me that too much similarity would be detrimental, because I think spectral similarity (or equality) would maximize the ear's ability to correctly identify reflections as such. (The ear/brain system looks at the spectra of incoming sounds in order to classify them as reflections or as new sounds, and if the latter, the ear computes their arrival direction largely from the arrival time differential between the two ears.)
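To give a feel for the size of that arrival time differential, here is a minimal sketch using Woodworth's spherical-head approximation. This is a swapped-in textbook model, not something from the discussion above, and the head radius (0.0875 m) and speed of sound (343 m/s) are assumed values chosen for illustration.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference (seconds) for a distant source.

    Woodworth's spherical-head model: ITD = (r / c) * (theta + sin(theta)),
    with theta the source azimuth in radians (0 = straight ahead, 90 = fully
    to one side). Head radius and speed of sound are assumed defaults.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Print the time differential in milliseconds for a few source angles.
for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> {woodworth_itd(az) * 1000:.2f} ms")
```

With these assumed numbers, a source fully to one side comes out at roughly two-thirds of a millisecond, which is in the same neighborhood as the 0.68 millisecond figure mentioned above.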
Performance spaces are specially treated to prevent that, especially because the delay would usually be larger. The downward slope maybe gives a different enough signature that our brains interpret it as “reflection” and handle it correctly. But when the slope is smooth, it keeps large sections of the spectrum from calling attention to themselves in the reflection.
I understand that the conventional wisdom holds the downward-sloping spectrum of typical off-axis energy to be a feature rather than a bug because it approximates the downward-sloping spectrum of the reflection field in a large concert hall.
Imo there is a factor which this approach doesn't take into account, and that factor is... wait for it... wait for it... just a bit longer...
time.
Imo the in-room reflections arrive far too early for a significant downward spectral tilt to be natural-sounding. My understanding is that the perceived spectral balance is a weighted average of the spectra of the direct sound and the reflections, and my experience (in a home audio setting) has been that a significant discrepancy between the two can result in listening fatigue. But this is just an opinion.
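To put rough numbers on the timing argument, here is a small sketch that estimates how long after the direct sound a single side-wall reflection arrives, using the mirror-image construction (reflect the source across the wall and measure the straight-line distance). The two geometries are made-up illustrations, not measurements of any real room or hall, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def first_reflection_delay_ms(source, listener, wall_x=0.0):
    """Delay (ms) of one side-wall reflection relative to the direct sound.

    source and listener are (x, y) positions in meters; the wall is the
    plane x = wall_x. The reflected path length equals the distance from
    the mirror image of the source to the listener.
    """
    sx, sy = source
    lx, ly = listener
    direct = math.hypot(lx - sx, ly - sy)
    mirrored_x = 2 * wall_x - sx  # mirror the source across the wall
    reflected = math.hypot(lx - mirrored_x, ly - sy)
    return (reflected - direct) / SPEED_OF_SOUND * 1000.0

# Hypothetical home setup: speaker 1 m from the side wall, listener 3 m back.
home = first_reflection_delay_ms(source=(1.0, 0.0), listener=(2.0, 3.0))
# Hypothetical hall: performer 10 m from the side wall, listener 15 m back.
hall = first_reflection_delay_ms(source=(10.0, 0.0), listener=(12.0, 15.0))

print(f"home side-wall reflection: {home:.1f} ms after the direct sound")
print(f"hall side-wall reflection: {hall:.1f} ms after the direct sound")
```

With these assumed positions the home reflection lags by a few milliseconds while the hall reflection lags by tens of milliseconds, which is the order-of-magnitude gap the timing argument rests on.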
I would think that if first reflections are such a key part of the image, the symmetry of the room (physically and in the frequency domain) would become critical.
With conventional wide-dispersion speakers, and without suitable room treatment, yes. Symmetry in the early reflections matters.
There are ways to mitigate the influence of the first reflections, both from a spatial and from a sound-quality standpoint, in situations where the room is not physically symmetrical. These include what might be called compensating room treatment, and avoiding illuminating the typical first-reflection zones in the first place (the latter being the approach I use).
I recall that Toole was interested in spectral similarity but not spectral equality.
That is my recollection as well. I'm not aware of any studies which advocated aspiring to spectral equality. But I think that's because spectral equality is quite rare and relatively expensive and/or impractical to achieve, and not because it's an inherently bad idea.
If you are open to anecdotes I'll put on my asbestos suit and spout a few.
Rick “back to sound board duties” Denney
Duke "experiencing acute middle-name envy" LeJeune