You are probably aware of a rather long thread discussing, among other things, the technical aspects of hi-res:
https://www.audiosciencereview.com/...qa-creator-bob-stuart-answers-questions.7623/.
That long thread was narrowed down to a more focused one, and I then narrowed that newer thread to a single post, which I believe captures the core issue:
https://www.audiosciencereview.com/...higher-sampling-rates.7939/page-2#post-194272
IMHO, the core issue is that, formally speaking, a piece of music of finite duration (and not perfectly periodic) cannot have a band-limited, finite-support spectrum, and thus the
Nyquist–Shannon sampling theorem, which assumes a band-limited signal, strictly speaking is not applicable to music.
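For reference, the textbook statement being leaned on looks like this (standard form, added here for context, not quoted from either thread):

```latex
% Nyquist–Shannon reconstruction: if the spectrum X(f) vanishes for
% |f| >= f_s / 2, then, with T = 1/f_s,
x(t) \;=\; \sum_{n=-\infty}^{\infty} x(nT)\,\operatorname{sinc}\!\left(\frac{t - nT}{T}\right).
% No nonzero signal can be both time-limited and band-limited, which is the
% formal gap pointed out above: a real recording only approximates the
% premise of the theorem.
```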
Thus, we must assume that a sampled digital representation of real-life music contains some distortion compared with the original analog sound. Mathematically, increasing the sample rate and the number of bits per sample reduces that distortion.
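To put a number on the bit-depth half of that statement, here is a minimal sketch (my own illustration, not taken from the threads; the 1 kHz tone, 48 kHz rate, and plain uniform quantizer are arbitrary choices):

```python
# Toy illustration: a higher bit depth gives a lower quantization-error floor.
import numpy as np

def quantize(x, bits):
    """Uniformly quantize samples in [-1, 1) to the given bit depth."""
    levels = 2 ** (bits - 1)
    return np.round(x * levels) / levels

fs = 48_000                                 # assumed sample rate, Hz
t = np.arange(fs) / fs                      # one second of samples
x = 0.5 * np.sin(2 * np.pi * 1_000 * t)     # arbitrary 1 kHz test tone

for bits in (16, 24):
    err = x - quantize(x, bits)
    rms_db = 20 * np.log10(np.sqrt(np.mean(err ** 2)))
    print(f"{bits}-bit: RMS quantization error ~ {rms_db:.1f} dBFS")
# Expect roughly 6 dB of error floor per bit, i.e. about 48 dB less at 24 bits.
```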
Most of the time listeners can't perceive the difference between 44.1/16 and, say, 192/24. Sometimes they can:
A Meta-Analysis of High Resolution Audio Perceptual Evaluation.
The 5% figure keeps coming up, along the lines of: the difference can only be perceived by about 5% of listeners, and only on about 5% of music. A naive approach is to multiply these probabilities: the thinking goes that on average about 0.05 × 0.05 ≈ 0.25% of listening sessions would be affected by the difference between standard and hi-res versions.
But, and this is a big but! For a particular listener within the 5%, whose favorite music genres also happen to fall in the 5%, the share of affected listening sessions can be much higher, in fact close to 100%.
And vice versa, for a listener in the 95%, or for a listener whose favorite music genres happen to be in the 95%, the advantages of hi-res are immaterial. For them, the promise of hi-res is 100% snake oil.
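Just to spell out the arithmetic behind those last two paragraphs (illustrative numbers only, reusing the rough 5% figures quoted above):

```python
# Back-of-the-envelope numbers for the "5% of listeners, 5% of music" point.
p_listener = 0.05   # share of listeners who can hear the difference
p_music = 0.05      # share of music on which the difference is audible

# Random listener, random track: the naive product quoted above.
print(f"average listening session: {p_listener * p_music:.2%}")   # ~0.25%

# A listener inside the 5% whose whole library also sits in the audible 5%:
# both conditions already hold, so nearly every session is affected.
print("sensitive listener with a sensitive library: ~100%")

# A listener in the other 95% (or music in the other 95%): never affected.
print("everyone else: 0%")
```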