Hi, how was your listening position set up in this experiment? Did you keep it stationary relative to the room or relative to the speakers?
I have a hypothesis that the front wall distance doesn't matter much at all for this effect, and that all that matters is how far the listener is from the speakers. The basis for the hypothesis is that the front wall is only one source of early reflections, and there are many in a room. Reducing the listening distance changes all early reflections in relation to the direct sound: not just the one from the front wall, but those from the floor, the ceiling and all the other surfaces. So if the listening spot is kept stationary, all reflections change when the speakers are brought further from the front wall, i.e. closer to the listener.
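To put rough numbers on that reasoning, here is a minimal image-source sketch. Everything in it is an assumption made for illustration: the room size, the speaker and listener positions, a point-source speaker (not a dipole or line source), and perfectly flat reflecting surfaces. It only shows how the delay of each first reflection behind the direct sound shifts when the speaker is pulled out from the front wall while the listener stays put; the back wall is left out.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 degrees C

# Hypothetical room in metres: x = width, y = distance from front wall, z = height.
WIDTH, HEIGHT = 5.0, 2.5


def first_reflection_delays_ms(speaker, listener):
    """Delay of each first-order reflection behind the direct sound, in ms."""
    sx, sy, sz = speaker
    # Mirror the speaker in each surface to get its image source.
    images = {
        "front wall": (sx, -sy, sz),
        "floor": (sx, sy, -sz),
        "ceiling": (sx, sy, 2 * HEIGHT - sz),
        "left wall": (-sx, sy, sz),
        "right wall": (2 * WIDTH - sx, sy, sz),
    }
    direct = math.dist(speaker, listener)
    return {
        name: (math.dist(image, listener) - direct) / SPEED_OF_SOUND * 1000
        for name, image in images.items()
    }


# Listener stays put; speaker moves from 1 m to 2 m out from the front wall.
listener = (2.5, 4.5, 1.0)
near_wall = first_reflection_delays_ms((1.5, 1.0, 1.0), listener)
pulled_out = first_reflection_delays_ms((1.5, 2.0, 1.0), listener)
for surface in near_wall:
    print(f"{surface:10s} {near_wall[surface]:5.1f} ms -> {pulled_out[surface]:5.1f} ms")
```

In this made-up geometry every listed delay gets longer, not just the front wall one, which is consistent with the idea that listening distance affects all early reflections at once. It does not settle how much each of them matters perceptually.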
The front wall will have some effect for sure, so just trying to get some extra info from your experiment.
Ime the front wall distance matters a LOT with dipole speakers, which direct just as much energy towards the front wall as towards the listening area.
I would have kept the same listening triangle geometry, but I do not actually recall moving my chair. The listening triangle that I preferred put my chair about eight feet (about 244 cm) from the plane of the speakers. These were seven-foot-tall (line-source-approximating) electrostats, so their direct field extended further than it would have with conventional speakers.
You didn't mention Griesinger's "proximity" in so many words, but imo "proximity" and "envelopment" are not necessarily mutually exclusive in a playback setting, the key arguably being a significant time gap between the direct sound and the strong onset of reflections... which happens to be something big electrostats can do if given sufficient distance from the wall. I noticed the transition to "you are there" at 5 feet (about 152 cm) from the wall, and 7 feet (about 213 cm) seemed to be "optimum" to my ears, but unfortunately was not practical.
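As a rough illustration of that time gap: if we assume the listener sits on the speaker's axis and the dipole's back wave bounces straight off the front wall, the reflection travels about twice the speaker-to-wall distance further than the direct sound. The sketch below is only that back-of-envelope estimate (the function name and the on-axis simplification are mine, not measurements from the experiment).

```python
SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 degrees C
FEET_TO_METRES = 0.3048


def back_wave_delay_ms(distance_from_wall_ft):
    """Approximate delay of the front-wall bounce behind the direct sound.

    Assumes an on-axis listener, so the extra path is simply out to the
    wall and back: twice the speaker-to-wall distance.
    """
    extra_path_m = 2 * distance_from_wall_ft * FEET_TO_METRES
    return extra_path_m / SPEED_OF_SOUND * 1000


for feet in (5, 7):
    print(f"{feet} ft from wall: about {back_wave_delay_ms(feet):.1f} ms")
```

Under those assumptions the gap comes out at about 8.9 ms for 5 feet and about 12.4 ms for 7 feet.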
I still find it unlikely that we will be able to hear the recorded reverb tails in the late-arrival diffuse field generated by the late reflections in the listening room.
I have read - but unfortunately do not remember where - that the ear/brain system can detect reflections down into the noise floor, and it does so by following the overtone patterns, which would be another argument for their preserving their spectra.
... So with the above levels in mind
I cannot comment on the relative levels of direct and reflected sound assumed in your post; I simply do not know what would be "normal". If I understand correctly, you're saying that the reverberation tails on the recording are effectively inaudible in the playback room. Is this correct?
I think most (if not all?) of what we hear from those late reflections and the diffuse field (which adds the envelopment) is highly dominated by the direct sounds in the recording, with not much (or any) coming from the reverberation tails in the recording.
The direct sound in the recording contains no venue spatial cues, therefore reflections of the recording's direct sound in the playback room would ONLY have the playback room's spatial cues. So, the in-room reflections of the recording's direct sound cannot convey a "you are there" spatial quality.
The reflections (including the reverberation tails) in the recording are what contain the venue spatial cues, so any plausible "you are there" spatial quality must come from the reflections in the recording (for unprocessed two-channel).
Obviously the venue spatial cues are in the direct sound from the speakers. Does "you are there" arise solely from the direct sound from the speakers? Not in my experience, nor do I recall anyone describing a "you are there" experience listening to a normal speaker setup in an anechoic chamber. And my recollection from Toole's book is that the WORST direction for reflections to arrive from would be the exact same direction as the first-arrival sound.
So by process of elimination, the only thing left that could effectively enable a "you are there" spatial presentation is the venue spatial cues being carried by the playback room's reflections, and arriving from many different directions.
Now I could be wrong about it being the reverberation tails which enable "you are there"; maybe it's the earliest venue reflections arriving from many directions which enable "you are there". Either way, the preferred course of action, imo, remains the same: disrupt the "small room signature" of the playback room by minimizing the earliest reflections, while simultaneously promoting the effective presentation of the venue spatial cues via the later-arriving reflections.