A quick “alternative” look at CD (16/44ksps [22kHz audio]) vs the MQA 1st unfold (to 88ksps [44kHz audio]). It maps each codec’s “information capacity” onto the amount of data needed to represent the music, and is based on Meridian’s own data, kindly pointed out by
@amirm:
The blue line is the measured music spectrum “envelope” (throughout the entire song, the energy at each frequency sits somewhere between the noise floor and this maximum). The brown line is the measured noise floor (constant throughout the song, though it would vary between recording setups). The black line is the minimum level at which a human ear can hear each frequency (MAF, minimum audible field). See how it peaks at 20kHz!
On top of Meridian’s measurements, I’ve superimposed: (1) the solid green area depicting the amount/type of information stored in a Redbook (“CD”) encoded file – its 16 bits offer 96dB of dynamic range, and 44ksps translates to 22kHz of audio bandwidth; (2) the solid red area, MQA’s 44kHz audio band – now reduced in dynamic range, as a few bits are “borrowed” to carry the 1st-unfold (22-44kHz band) information. [Here, let’s just assume that such a computational transformation exists and can be implemented.] With the help of dither, the quantization noise floor can be lowered further – for both CD and MQA – which is represented by the bottom dashed areas.
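For the skeptical, the numbers behind those areas are just the standard quantization and Nyquist rules. A back-of-envelope sketch (assuming the common ~6.02 dB-per-bit rule; the function names are mine, for illustration):

```python
# Back-of-envelope numbers behind the green (CD) and "normal hi-res" areas.
# Assumes the usual ~6.02 dB-per-bit quantization rule and the Nyquist limit.

DB_PER_BIT = 6.02  # theoretical SNR gain per quantization bit

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an ideal linear PCM quantizer, in dB."""
    return DB_PER_BIT * bits

def audio_bandwidth_khz(sample_rate_ksps: float) -> float:
    """Nyquist: the representable audio bandwidth is half the sample rate."""
    return sample_rate_ksps / 2

print(f"CD 16/44.1:   ~{dynamic_range_db(16):.0f} dB over {audio_bandwidth_khz(44.1):.2f} kHz")
print(f"Hi-res 24/96: ~{dynamic_range_db(24):.0f} dB over {audio_bandwidth_khz(96):.1f} kHz")
```

This reproduces the 96dB / 22kHz figures for CD used above.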
Now, a couple of observations. If the three curves measured by Meridian (the MQA folks) – the music spectrum envelope, the music noise floor, and our hearing threshold –
do represent the real world, then (1) CD’s 16/44 (the green area) is fully adequate and sufficient to deliver everything – both the dynamic and frequency ranges – that a human can hear. Well, duh. (2) While there
is [inaudible, “ultrasonic”] energy above 22kHz in the spectrum of real music – as many sound sources and their harmonics do not share our ears’ limitations – (a) it is inaudible to humans, (b) a 16-bit MQA hardly has the dynamic range to accurately represent it (3 bits provide only ~18dB, so you need at least a 24-bit MQA), and (c) if, after all this, we still want this (to us inaudible!) 22+kHz band played back by our audio systems, a “normal hi-res” format (24/96+) provides both unrestricted dynamic and frequency ranges to deliver all the musical information up there.
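The bit-borrowing arithmetic behind point (b) can be sketched like this (a hypothetical split for illustration, not MQA’s actual encoding; the 60dB target is an arbitrary example, not a measured requirement):

```python
import math

# Sketch of the "borrowed bits" arithmetic, NOT MQA's actual scheme.
DB_PER_BIT = 6.02  # theoretical SNR gain per quantization bit

def borrowed_range_db(borrowed_bits: int) -> float:
    """Dynamic range covered by the bits 'borrowed' to carry the fold."""
    return DB_PER_BIT * borrowed_bits

def bits_needed(target_db: float) -> int:
    """Smallest bit depth covering a target dynamic range."""
    return math.ceil(target_db / DB_PER_BIT)

print(f"3 borrowed bits cover only ~{borrowed_range_db(3):.0f} dB")
print(f"Covering, say, 60 dB of ultrasonic dynamics needs {bits_needed(60)} bits")
```

Hence the point: a handful of borrowed bits cannot faithfully carry a band with any real dynamic range.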
And a couple of questions I still have: (1) How accurate is the above Meridian data? (E.g., are there recordings requiring substantially higher dynamic range, both above and below 22kHz? Is the MAF hearing-threshold curve valid?) (2) Can MQA actually [dynamically, per recording] “split” the 24 bits of each base-band (<22kHz) sample [in the time domain] into a “non-MQA backward compatible” base-band portion and the “fold(s)” info? (Can not only the dynamic ranges but also the absolute levels be represented? And is there proof it actually works?) (3) Finally, again: why do we care about an inaudible ultrasonic band?