I'm not going to engage in what seems to be a matter of personal preference - that always wins, for the best of reasons. However, there is no mystery about how the binaural hearing system constructs a soundstage from two channels. It is not explained by geometric triangulation. Sound images perceived across the front soundstage are the result of mono left, mono right, and phantom images created by double-mono signals that are amplitude panned (by attenuators - a pan pot - or by directional coincident stereo mics). All phantom images perceived between the loudspeakers are the result of two sounds arriving at each ear, not one, and as a result there is an inherent comb-filtering issue, most obvious for a center image. The image shifting is illustrated in the first phase of the precedence effect shown in my book in Figure 7.23(a), and the comb filter is shown in Figure 7.2. Ambiance and "air" come from less correlated content in the two channels - either reflected sounds picked up by mics and mixed in, or sounds added by an electronic simulator. For stability of localization and for accurate timbre of the featured artist a center channel is advantageous. This was known in the 1930s, but the limitations of the LP (only two modulated groove sides) ruled out more than two channels from the outset, and the rest is history. The choice of two channels has absolutely nothing to do with any two-ears/two-channels relationship - except in binaural/headphone recordings.
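As a rough illustration of the comb filtering described above, here is a minimal sketch that treats the sound at one ear as the sum of a direct arrival from the near loudspeaker and a delayed copy from the far one. The function names, the 0.25 ms delay, and the far-ear gain are all assumptions chosen for illustration; they are not values taken from the book, and a real head adds frequency-dependent shadowing that this ignores.

```python
import math

def phantom_center_response_db(freq_hz, delay_s=0.00025, far_ear_gain=1.0):
    """Level in dB (relative to a single arrival) of a signal summed with
    a delayed, scaled copy of itself: |1 + g * exp(-j*2*pi*f*tau)|.

    delay_s and far_ear_gain are illustrative assumptions, not measured data.
    """
    phase = 2.0 * math.pi * freq_hz * delay_s
    real = 1.0 + far_ear_gain * math.cos(phase)
    imag = -far_ear_gain * math.sin(phase)
    magnitude = math.hypot(real, imag)
    # Guard against log10(0) at a perfect cancellation.
    return 20.0 * math.log10(max(magnitude, 1e-12))

def first_notch_hz(delay_s=0.00025):
    """Frequency of the first cancellation: the delay equals half a period."""
    return 1.0 / (2.0 * delay_s)

if __name__ == "__main__":
    print("first notch (Hz):", first_notch_hz())
    for f in (250, 500, 1000, 2000, 4000):
        # far_ear_gain < 1 is a crude stand-in for head shadow.
        print(f, "Hz:", round(phantom_center_response_db(f, far_ear_gain=0.7), 1), "dB")
```

With the assumed 0.25 ms delay the first notch falls at about 2 kHz. A far-ear gain below 1 makes the notch shallower, which is the direction one would expect head shadow to push it.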
The dominant missing ingredient in stereo is envelopment - the feeling of being in the venue with the performers. Delivering this does not require many extra channels. In fact, when examined in detail, it turns out that, for a single listener, adding only two more channels in the right locations can get one remarkably close to being surrounded by up to 24 channels/speakers. See Section 15.7.1. The present infatuation with "immersive" sound, employing many more channels, is driven mainly by movies, and there the desired illusion is one of auditory objects that can be localized in many different well-defined places. These are gee-whiz sound effects, not at all the uncorrelated spaciousness of envelopment, although that too is possible - with, as I said, many fewer channels. If one wishes to entertain more than one listener, more channels are advantageous, and this is the case in movies and in most home theaters.
The remaining choice is which sounds are delivered to the additional channels to create envelopment: one can either concoct a simulated sound field using a synthesizer, or extract the uncorrelated information from the recording itself and deliver it with an upmixer.
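To make the "extract from the recording itself" option a little more concrete, here is a deliberately crude sketch that uses a plain sum/difference split. This is an assumption made for illustration only: it is not how any particular upmixer works, and real products typically operate per frequency band and adapt over time. The function names are hypothetical.

```python
import math

def split_common_difference(left, right):
    """Split two equal-length channels into the part they share and the
    part in which they differ (a simple sum/difference decomposition)."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    common = [(l + r) / 2.0 for l, r in zip(left, right)]
    difference = [(l - r) / 2.0 for l, r in zip(left, right)]
    return common, difference

def channel_correlation(left, right):
    """Normalized correlation: 1.0 for identical channels, near 0 for
    unrelated ones, -1.0 for polarity-inverted copies."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

if __name__ == "__main__":
    # A center-panned (double-mono) source: identical in both channels.
    vocal = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(480)]
    common, difference = split_common_difference(vocal, vocal)
    print("correlation:", channel_correlation(vocal, vocal))
    print("peak of difference signal:", max(abs(d) for d in difference))
```

A center-panned source cancels completely in the difference signal, so what survives there is content that differs between the channels. That leftover is only a rough proxy for the less correlated ambience one would want to send to the additional loudspeakers.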
All that said, there is much pleasure to be gained from simple stereo - it is massively superior to mono, especially when it is well mixed. In my perusal of the recorded repertoire on TIDAL it is evident that stereo recordings are often amateurish, even for some well-recognized performers.