If you've ever recorded live music from a normal seating position, you'll find there is too much "room noise" (reverb, etc.) when it's played back on a pair of speakers (or headphones) at home. It just doesn't "sound right" in a small room, with all of the reverb (and any audience noise) coming from the same direction (the speakers).
Well, that's certainly part of it, but only a part. If you've ever recorded live music from a normal seating position, there is virtually always too much "room noise"; however, that's typically on the recording itself. So the question is: if that's the real/actual sound that existed at that seating position, why did we not hear it as having "too much room noise" at the time? The answer is "perception". As we listen to the live performance of the orchestra/ensemble, that's what our brains are focused on, and in order to do that, the brain reduces the perceived level of everything else that's getting in the way of what we want to hear, such as the reflections/reverb and the constant audience background noise; a process similar to the "cocktail party effect".

We typically perceive this additional reverb when listening to a recording at home, even with headphones (and therefore no home room noise), because now we are sitting in a living room focused on a recording, rather than sitting in a concert hall focused on an orchestra.

This brings me back to what I stated before and what we're after as engineers: is it the actual sound which existed at a seating position in the concert venue, or the sound we would probably have perceived in that seating position? As it's the latter, that means a recording with more clarity and less reverb and noise than actually existed.
The only way we get close to stage 1 or 2 is by using some sort of minimalist recording technique.
But the reality is that this just mostly doesn't work in the real world.
Again, I'm assuming what everyone is referring to is the reality of the actual sound that existed, as opposed to the "reality" of what one would most likely have perceived in that location? If that's the case, then we do not want to get close to stage/classification #1 or #2! If we're talking about getting close to the "reality" of what someone would most likely perceive, then multi-mic'ing techniques are the only way of achieving that (with the potential exception of binaural recordings).
One thing I'd like to add here: whatever we're discussing, 2ch or multich recordings, High Fidelity is still job #1.
It makes no difference if the engineer has chosen to put the sound of the piano at front center stage, or up on the ceiling in the rear left corner (LOL); it still needs to sound, as far as possible, exactly like a piano.
But that brings up another issue directly related to the above: the fact that we don't want "High Fidelity". For example, in the case of the piano (or pretty much any acoustic instrument), we mic the instrument quite/very closely because, as mentioned above, we're going to need more clarity and less noise (and, in an ensemble, less "spill") than if we mic'ed from a position more representative of where the audience will be seated. However, that means we will capture a much higher peak level and dynamic range, as well as pick up a lot more mechanical noise and high frequency content. Indeed, most instruments sound significantly different from a few inches or feet away than from a more typical audience distance of several tens of feet. Therefore, your assertion is effectively a contradiction! Do you want high fidelity, OR do you want the instrument to "sound exactly like" that instrument? Typically we would choose the latter and forsake some of the fidelity, by applying some compression and/or a gentle LPF (or high shelf) to reduce some of the high freqs which wouldn't make it to the audience anyway and which would otherwise cause it to not sound exactly like that instrument.
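To make that last point a bit more concrete, below is a minimal sketch (in Python, with numpy/scipy) of the kind of gentle taming I'm describing: a standard RBJ "Audio EQ Cookbook" high-shelf cut plus a bare-bones feed-forward compressor. To be clear, the corner frequency, shelf gain, threshold and ratio values here are purely illustrative assumptions, not a recipe; in a real session this is done with plugins/hardware and the settings are judged by ear.

```python
# Minimal sketch: gentle high-shelf cut plus mild compression on a close-mic'ed
# track. All parameter values are illustrative assumptions, not a recipe.
import numpy as np
from scipy.signal import lfilter

def high_shelf(x, fs, f0=8000.0, gain_db=-3.0, S=1.0):
    """RBJ 'Audio EQ Cookbook' high-shelf biquad (here a gentle cut above ~f0)."""
    A = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / 2 * np.sqrt((A + 1 / A) * (1 / S - 1) + 2)
    c = np.cos(w0)
    b = np.array([A * ((A + 1) + (A - 1) * c + 2 * np.sqrt(A) * alpha),
                  -2 * A * ((A - 1) + (A + 1) * c),
                  A * ((A + 1) + (A - 1) * c - 2 * np.sqrt(A) * alpha)])
    a = np.array([(A + 1) - (A - 1) * c + 2 * np.sqrt(A) * alpha,
                  2 * ((A - 1) - (A + 1) * c),
                  (A + 1) - (A - 1) * c - 2 * np.sqrt(A) * alpha])
    return lfilter(b / a[0], a / a[0], x)

def compress(x, fs, thresh_db=-18.0, ratio=3.0, attack=0.005, release=0.100):
    """Bare-bones feed-forward compressor with a one-pole envelope follower."""
    a_att = np.exp(-1.0 / (fs * attack))    # fast smoothing while the level rises
    a_rel = np.exp(-1.0 / (fs * release))   # slower smoothing while it falls
    env, out = 1e-9, np.empty_like(x)
    for i, s in enumerate(x):
        level = abs(s)
        coeff = a_att if level > env else a_rel
        env = coeff * env + (1 - coeff) * level
        over = max(0.0, 20 * np.log10(env + 1e-12) - thresh_db)  # dB above threshold
        out[i] = s * 10 ** (-over * (1 - 1 / ratio) / 20.0)      # static gain reduction
    return out

# Usage on a hypothetical mono float recording x at sample rate fs:
# y = compress(high_shelf(x, fs), fs)
```

Note the order: the shelf comes first, so the compressor reacts to the already-tamed HF content rather than to mechanical noise and transients that will never reach the audience anyway.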
Obviously we want the highest fidelity we can get when reproducing a recording, but that is not the primary concern when creating the recording; what subjectively sounds right/good from an audience perspective takes precedence.
I'm not sure exactly why stage 4 simulacra have taken over musical "soundtracks" for movies and TV series.
Because the point of music for movies and TV is to aid the storytelling; everything else, including reality and fidelity, is at best secondary, and often even lower than secondary!
Mercury managed to get a good capture of the sound of large orchestras with three spread omnis back in the late '50s, early '60s.
There were experiments with multi-mic'ing in the late 1920s. EMI, Decca and others were advancing the practical use of multi-mic'ing in the early '50s, before stereo was even released to the public. By the late '50s they had already progressed further, beyond the famous "Decca Tree" (three carefully spaced/arranged Neumann M50s), by adding a couple of outrigger mics. By the 1960s, using a Decca tree (or a specific two-mic main array) plus some outriggers was pretty much standard for large ensembles; by the 1980s various spot and room mics were standard, and more recently, more complex arrays for surround. Typically we would use around 40 or more mics for a symphony orchestra these days.
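As a side note on why those M50s were "carefully spaced/arranged": a spaced-omni main array encodes direction largely as time-of-arrival differences between the channels. A quick back-of-envelope sketch (again in Python; the tree dimensions and source position are illustrative assumptions, as real setups vary by engineer and hall) shows the order of magnitude of the delays involved:

```python
import numpy as np

C_SOUND = 343.0  # approximate speed of sound in m/s at room temperature

# Illustrative Decca-tree geometry in metres (x across the stage, y towards it);
# real trees vary, but ~2 m between L and R with the centre mic forward is typical.
MICS = {"L": np.array([-1.0, 0.0]),
        "R": np.array([+1.0, 0.0]),
        "C": np.array([0.0, 1.5])}

def arrival_delays_ms(source_xy):
    """Time of arrival at each mic, in ms, relative to the earliest arrival."""
    t = {name: np.linalg.norm(np.asarray(source_xy) - pos) / C_SOUND * 1000.0
         for name, pos in MICS.items()}
    t0 = min(t.values())
    return {name: ti - t0 for name, ti in t.items()}

# A source 5 m from the tree, 30 degrees to the left of centre:
src = 5.0 * np.array([np.sin(np.radians(-30.0)), np.cos(np.radians(-30.0))])
print(arrival_delays_ms(src))
# -> roughly {'C': 0.0, 'L': 2.4, 'R': 5.2}: the right channel lags the left
#    by ~3 ms, which is a strong stereo localisation cue on playback.
```

Those few milliseconds of inter-channel delay (plus the level differences) are what give spaced-omni recordings their sense of width and depth.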
G