Since English is not my native language, I hope you can understand.
First question:
The sound will be more natural, just like the sound in reality. The tone of higher-resolution music will be more real; for example, the tone of the singer will be more "flat", I mean the tone of one singer in the same piece of music will vary less.
Upsampling makes the sound neither 'flatter' nor more natural. It only produces new sample values in between the original samples.
It adds no data that was not already in the original file, just calculated values.
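To make that concrete, here is a minimal Python sketch of 2x upsampling. Note the assumptions: it uses plain linear interpolation, which is a simplification (real resamplers use proper interpolation filters), and the function name and sample values are made up for illustration.

```python
def upsample_2x_linear(samples):
    """Insert one linearly interpolated value between each pair of samples.

    Simplified illustration only: real upsamplers use interpolation
    filters rather than straight-line averaging.
    """
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)             # original sample, kept unchanged
        out.append((a + b) / 2)   # new value, calculated from its neighbours
    out.append(samples[-1])       # last original sample
    return out


# Hypothetical sample values, just for the demonstration.
original = [0.0, 0.5, 1.0, 0.5, 0.0]
upsampled = upsample_2x_linear(original)

# Dropping every second value gives back exactly the original samples.
recovered = upsampled[::2]
print(recovered == original)
```

Every new value is derived purely from the samples that were already there, so the upsampled file contains more numbers but no more information.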
2:
Most important: DAC, audio source (the music must be high resolution), and a balanced headphone or speaker.
Important: amp
Less important: cables, power and other factors
You mentioned recording devices. Those use an ADC, not a DAC. The ADC is usually of much higher resolution than the final 'product' you buy, which can be a CD or some other type of file, usually at a lower resolution than the recording.
Most important: recording technique, engineer, microphones and placement, mixing and mastering engineer. After that comes the headphone/speaker, and after that the ears and brain of the listener.
I think almost all ADCs and DACs these days perform (much) better than our ears.
The choice of a DAC should be based on the formats you want it to play, which input connections it has (USB/SPDIF/Optical), its other features, its looks and how it interfaces with the gear that follows it.
Whether the connection is balanced or not is of no importance for home playback.
It only matters in studios and at live performances, where long audio lines run alongside mains cables for lighting etc.
Amps should be able to properly drive the headphone/speaker with some extra headroom; that is where their importance lies.
For playback 96/24 (96 kHz, 24 bits) is more than enough. Why would 192 kHz be needed?
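To put numbers on the sample rate: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit). A small Python sketch follows; the names are my own for illustration, and the roughly 20 kHz upper limit of human hearing is a common ballpark that I am assuming here.

```python
# Rough upper limit of human hearing in Hz (a common ballpark, assumed here).
HEARING_LIMIT_HZ = 20_000


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


for rate in (44_100, 96_000, 192_000):
    limit = nyquist_hz(rate)
    ratio = limit / HEARING_LIMIT_HZ
    print(f"{rate} Hz -> up to {limit:.0f} Hz ({ratio:.1f}x the assumed hearing limit)")
```

At 96 kHz the representable bandwidth already reaches 48 kHz, more than twice the assumed hearing limit. Going to 192 kHz only extends it further into the range nobody hears.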
3:
Lower than 192 kHz. I am not sure how much a different bit depth will affect the sound.
Bit depth is more important than sample rate when it comes to resolution.
Why would the sound be lo-fi below 192 kHz?
Can you hear up to 60-80 kHz?
Would 192 kHz and 12 bits be enough? 16 bits? 24 bits? 32 bits?
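For the bit-depth side of those questions, the theoretical dynamic range of ideal N-bit PCM (full-scale sine versus quantization noise, no dither) is roughly 6.02 x N + 1.76 dB. A quick sketch in Python; the function name is made up for illustration, and real converters land below these ideal figures.

```python
import math


def ideal_dynamic_range_db(bits):
    """Theoretical SNR in dB of an ideal N-bit quantizer for a full-scale sine.

    Equivalent to the well-known approximation 6.02 * bits + 1.76 dB.
    Idealised figure only; actual hardware performs worse.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (12, 16, 24, 32):
    print(f"{bits} bits: about {ideal_dynamic_range_db(bits):.0f} dB")
```

So 12 bits gives about 74 dB, 16 bits about 98 dB and 24 bits about 146 dB in theory. The sample rate does not appear in the formula at all: it sets the bandwidth, while the bit depth sets the noise floor.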