Simple proof: Listen to the attached files.
Suppose we are downsampling a file to a 12kHz sample rate. The file should then be filtered so that there is no content >= 6kHz before decimation. Because of the low 12kHz sample rate, everyone should be able to hear a faint tone in the attached "impulse long linear.wav" and "impulse long minimum.wav", but not in "impulse short linear legal.wav" and "impulse short linear illegal.wav".
"impulse short linear illegal.wav" is named as such because the stopband is >= 6kHz before decimation. After decimation, aliasing will occur down to about 5kHz. I am aware that some people tend to use imaging and aliasing interchangeably, which can cause confusion, so here is an explanation:
https://www.audiosciencereview.com/...u-m4-audio-interface-review.15757/post-504008
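To make the "down to about 5kHz" figure concrete, here is a minimal sketch of how content above 6kHz folds back after decimation to 12kHz. The 48kHz source rate and the 7kHz test tone are my own assumptions for illustration (a 7kHz leftover is what would fold to 5kHz); they are not taken from the attached files.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz shows up after sampling at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

SRC_RATE = 48000  # assumed source sample rate (hypothetical)
FACTOR = 4        # 48kHz / 4 = 12kHz target rate
TONE = 7000       # assumed leftover content above 6kHz (hypothetical)

# A 7kHz cosine at the source rate, decimated by keeping every 4th sample
# with no filtering at all.
source = [math.cos(2 * math.pi * TONE * n / SRC_RATE) for n in range(4800)]
decimated = source[::FACTOR]

new_rate = SRC_RATE // FACTOR
folded = alias_frequency(TONE, new_rate)  # 5000

# A cosine generated directly at the folded frequency and the new rate.
reference = [math.cos(2 * math.pi * folded * n / new_rate)
             for n in range(len(decimated))]

# The two sequences agree to within floating-point rounding, so after
# decimation the 7kHz tone cannot be told apart from a 5kHz one.
max_diff = max(abs(a - b) for a, b in zip(decimated, reference))
print(folded, max_diff)
```

Anything left between 6kHz and 7kHz lands between 5kHz and 6kHz in the same way, which is why the filter has to remove it before decimation rather than after.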
However, a 44.1kHz sample rate is the lowest norm for music distribution, so people would need to be able to hear the same effect at around 22kHz, and the audio content itself would also need to contain strong signal in this frequency region. Also, the "long" files can only happen on something like a Chord DAC; for typical DAC chips, the "short" files are the norm.
Therefore, it is all about the frequency domain.
In case anyone is interested in trying these things out with real music rather than an impulse,
@KSTR provided an interesting listening test here:
https://www.audiosciencereview.com/...ike-if-we-were-bats-listening-examples.23776/