So, basically, what you are saying is that the clipping ("clicking" or "crackle", whatever we call it) occurs not because the H mode (or power-down or power-saving mode) is activated, but because of the gain change scheme of the CS chip. Right?
And this gain change would be part of the dynamic range enhancement (DRE), so essentially it is a side effect of their DRE implementation. That would mean the Russian article is not accurate in attributing the clipping behavior to the H mode activation. Right?
By the way, I was able to reproduce your wideband (96 kHz bandwidth) noise vs. output level pattern. That pattern is mainly due to the changing level of the ultrasonic noise shaping. The noise vs. output level that I observed covered only 20 Hz to 20 kHz, which is a separate phenomenon showing the DRE effect. These two phenomena may be related in some way, though.
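For what it's worth, here is a minimal Python sketch of why the two measurement bandwidths can tell different stories. It is a toy model, not a measurement of the actual chip: the 192 kHz sample rate, the noise levels, and the helper names `band_rms` and `synthetic_noise` are all my own assumptions for illustration. It scales only the ultrasonic part of a synthetic noise floor and compares the RMS in 20 Hz to 20 kHz against the RMS in 20 Hz to 96 kHz.

```python
import numpy as np

FS = 192_000  # assumed sample rate; gives a 96 kHz measurement bandwidth


def band_rms(x, fs, f_lo, f_hi):
    """RMS of x restricted to [f_lo, f_hi] Hz, via the one-sided FFT (Parseval)."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # One-sided bins count twice, except DC and (for even n) the Nyquist bin.
    weights = np.full(freqs.shape, 2.0)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[-1] = 1.0
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    power = np.sum(weights[mask] * np.abs(spectrum[mask]) ** 2) / n**2
    return np.sqrt(power)


def synthetic_noise(ultrasonic_gain, n=FS, seed=0):
    """Toy noise floor: white noise with everything above 20 kHz scaled by a gain.

    This only stands in for "the level of ultrasonic shaped noise changes";
    it is not a model of any real noise shaper.
    """
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n) * 1e-6
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)
    spectrum[freqs > 20_000] *= ultrasonic_gain
    return np.fft.irfft(spectrum, n=n)


for gain in (1.0, 10.0):
    x = synthetic_noise(gain)
    audio = band_rms(x, FS, 20, 20_000)
    wide = band_rms(x, FS, 20, 96_000)
    print(f"ultrasonic gain {gain:>4}: audio-band {audio:.3e}, wideband {wide:.3e}")
```

In this toy model the audio-band figure stays put while the wideband figure follows the ultrasonic level, which is why a 96 kHz bandwidth plot and a 20 Hz to 20 kHz plot can show different patterns against output level.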
Lastly, the multitone signal used in the Russian article to show the clipping behavior does not seem to include a tone below 20 Hz. I guess the key is how soon a signal arrives after the gain is adjusted. Or are we talking about two different kinds of clipping with separate causes?