The in-room reflections of the direct sound are not detrimental as long as they don't arrive too early, but in my experience it is the in-room reflections of the recording's reverberation tails that can really make the difference, if they are presented effectively.
For instance, I've heard the spatial presentation transition from "I'm in a normal room and the soundstage extends a few feet behind the speakers" to "I'm in a huge space and the soundstage extends to about as far away as the musicians would actually have been" when a change in speaker placement let the recording's package of spatial cues take over from the playback room's package of spatial cues. There is no way this perceptual difference could have arisen from a slight modification of the playback room's signature. The only information available that could have conveyed enormous depth and spaciousness was the reverberation tails on the recording, and the change in speaker placement made the venue signature on the recording perceptually dominant.
Yes, but most of that spatial information comes with the direct sound from accurately positioned loudspeakers in an acoustically treated and well-behaved listening room. The late reflections that build up the diffuse sound in the listening room, which in turn add to the sensation of envelopment, are just a "soup" of everything in the recording: the recorded direct sounds and the recorded reverb tails, without any distinction. That doesn't matter much, as it all adds to the sensation of envelopment anyway (as long as it is delayed enough relative to the direct sound to be perceived as a separate sound by our hearing).
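To put a rough number on "delayed enough" versus "too early", here is a minimal sketch that estimates how long after the direct sound a single side-wall reflection arrives, using a mirror-image speaker position. The room layout, the coordinates, and the function name `reflection_delay_ms` are made up for illustration, and the speed of sound is taken as roughly 343 m/s. It says nothing about where the perceptual threshold lies, only how speaker placement changes the delay.

```python
import math

# Approximate speed of sound in air at about 20 °C (assumed value).
SPEED_OF_SOUND_M_S = 343.0


def reflection_delay_ms(speaker, listener, wall_x):
    """Delay in ms of one side-wall reflection relative to the direct sound.

    speaker, listener: (x, y) positions in metres (hypothetical layout).
    wall_x: x coordinate of a flat side wall parallel to the y axis.
    """
    direct = math.dist(speaker, listener)
    # Mirror the speaker across the wall; the reflected path length equals
    # the straight-line distance from this image source to the listener.
    image = (2 * wall_x - speaker[0], speaker[1])
    reflected = math.dist(image, listener)
    return (reflected - direct) / SPEED_OF_SOUND_M_S * 1000.0


# Same listener, speaker 1.0 m vs 1.5 m from the side wall at x = 0.
near = reflection_delay_ms((1.0, 0.0), (2.0, 3.0), wall_x=0.0)
far = reflection_delay_ms((1.5, 0.0), (2.0, 3.0), wall_x=0.0)
print(f"speaker 1.0 m from wall: reflection arrives {near:.2f} ms after direct sound")
print(f"speaker 1.5 m from wall: reflection arrives {far:.2f} ms after direct sound")
```

In this made-up layout, moving the speaker further from the side wall lengthens the reflected path more than the direct path, so the reflection arrives later. This is one simple way a placement change can shift the timing of the room's reflections relative to the direct sound.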