
MQA Deep Dive - I published music on tidal to test MQA

Status
Not open for further replies.

RichB

Major Contributor
Forum Donor
Joined
May 24, 2019
Messages
1,946
Likes
2,611
Location
Massachusetts
I'm really sorry about the term "luminaries"; I used it in a hurry, and English is my second language. You may call them "brilliant engineers", "world-class engineers", or just "smart guys"; please choose whichever is least disturbing to you. In the same fashion, what I called a "hunch" you may prefer to call a "hypothesis".

The Audio Engineering Society, probably the world's most important organization in audio engineering, has in almost 75 years of history awarded its Gold Medal Award to 35 people. You may know some names among them: Georg Neumann, Willi Studer, Claude Shannon, Ray Dolby, Floyd Toole, Rudy Van Gelder. Among them is also the name of Michael Gerzon, the "father" of most of the foundational patents regarding MQA, and I believe he is also named posthumously in the MQA patent. He was a recognized genius in his time who sadly died at the age of 50 because of a health condition.
Gerzon's close collaborator was Peter Craven (the one whom people here seem to believe has lost every remaining neuron in his head, with what surely must be some kind of contagious disease), with whom he co-authored most of those patents. Both were the core of the audio engineering research at Oxford University, one of the world's most renowned research institutions in audio.

The man in the center of this early-1970s photo is Ray Dolby (*), flanked by Craven and Gerzon, then in their twenties and already famous for their achievements in audio research. Dolby went to Oxford to discuss his surround-sound patents with them, in which Gerzon and Craven would have been participants had the British government not cut research funds at the time. By that time they had already done key research on noise shaping and digital systems analysis, developed the Ambisonics field-recording technology, and invented the first Ambisonics microphone.
(*): another apology: I called him "Thomas" instead of "Ray" in a previous post.

View attachment 126468

I think most concede that these are smart people and that MQA is in some ways brilliant.
But it also implements cripple-ware, it can degrade audio, it makes proprietary what was open, and its makers refuse to provide the access necessary to validate its claims surrounding temporal blurring.

So, let's say I am the best shot in the world; no one can match me. Does that make me good? I suppose it depends on what I am shooting at.
N'est-ce pas?

Many well respected people question what MQA is shooting at ;)

- Rich
 

TurbulentCalm

Member
Joined
Mar 18, 2021
Messages
82
Likes
196
Location
Australia
I am offering my personal opinions. However, I feel that I should identify my affiliation with the magazine.

John Atkinson
Technical Editor, Stereophile

John, thanks.

Do you know of any information on how the MQA algorithm actually folds the high frequency data into those 3 least significant bits?

I just can’t get my head around how 3 bits at the lowest sampling rates of 44.1 or 48 thousand samples per second can come anywhere near to storing the data that has been lost in the folding process.

And considering MQA claims to be able to fold a number of times (such as reducing a 384/24 file to a 44.1/16 FLAC file), my confusion grows with each fold. Let alone how MQA compresses 24 bits into 3 bits along with the above folded high-frequency data.

I confirmed that MQA can reduce a 24-bit file to 16 bits by checking the 2L downloadable files.

[Attached screenshot: 2L downloads page showing original and MQA file sizes]


You can see the first track has been recorded at no less than 24/352.8 (330 MB), and somehow MQA squashed this down to 16/44.1 (19 MB).
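
For a sense of scale, those sizes can be sanity-checked with plain bit-rate arithmetic. The sketch below uses a made-up 5-minute track length purely for illustration, and real FLAC files are of course smaller than raw PCM; the point is that the raw-size ratio between the two formats is fixed regardless of length:

```python
# Compare raw (uncompressed) PCM sizes of a 24/352.8 file and a
# 16/44.1 file. Track length is a hypothetical figure for illustration.

def pcm_megabytes(bits, rate_hz, channels, seconds):
    """Raw PCM size in megabytes (no container or compression)."""
    return bits * rate_hz * channels * seconds / 8 / 1e6

seconds = 5 * 60  # hypothetical 5-minute stereo track

hires = pcm_megabytes(24, 352_800, 2, seconds)  # 24-bit / 352.8 kHz
cd    = pcm_megabytes(16, 44_100, 2, seconds)   # 16-bit / 44.1 kHz

print(f"24/352.8 raw: {hires:.0f} MB")
print(f"16/44.1  raw: {cd:.0f} MB")
# The ratio is (24/16) * (352800/44100) = 12x, independent of length.
print(f"ratio: {hires / cd:.0f}x")
```

The observed 330 MB to 19 MB shrink (about 17x) is larger still because both files are also FLAC-compressed.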

The first thing that comes to mind on first seeing this is just a plain ‘How?!’, and then, ‘That’s not possible!’. Then again, I thought the same when JPG files first appeared on the bulletin boards of old, and again when highly compressed video became available.

Thanks in advance for your help.
 

KeenObserver

Member
Joined
Feb 19, 2020
Messages
81
Likes
140
Among them is also the name of Michael Gerzon, the "father" of most of the foundational patents regarding MQA […]

That is an interesting look back at history, but it says nothing of the matter at hand.
 

samsa

Addicted to Fun and Learning
Joined
Mar 31, 2020
Messages
506
Likes
589
Do you know of any information on how the MQA algorithm actually folds the high frequency data into those 3 least significant bits?

I just can’t get my head around how 3 bits at the lowest sampling rates of 44.1 or 48 thousand samples per second can come anywhere near to storing the data that has been lost in the folding process.

I highly doubt that there is any ultrasonic information preserved in the 3 LSBs of a 16-bit MQA file.

Your same question, however, holds for the 8 LSBs of a 24-bit MQA file. There, I have read two (mildly incompatible) answers.

Answer 1:
  1. After splitting the data into low and high frequency parts, reduce the bit-depth of the former to 16 bits and then pad with zeroes (to increase the sample-size "back" to 24 bits).
  2. Reduce the bit-depth of the high frequency part even more.
  3. Huffman-code (losslessly compress) the latter.
  4. Insert the compressed data in the 8 LSBs of the "24-bit" file.
Answer 2: Step 3 above is replaced by some psycho-acoustic lossy compression scheme.

If answer 1 were true, I don't see why MQA would screw up on the test tones @GoldenOne threw at it. All of those will compress just fine. This tends to make me think that Answer 2 is correct.

All of which is to say that seeing how MQA screwed up @GoldenOne's files is definitely of value. It can tell us a lot about what MQA is doing.
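
If Answer 1 were right, the crux is that step 2 makes the high band low-entropy enough for step 3 to fit it in under 8 bits per sample. Here is a minimal self-contained sketch with a real Huffman coder over synthetic 5-bit data; the signal statistics are invented for illustration, and none of this is MQA's actual code:

```python
# Toy version of "Answer 1" steps 2-3: quantize the high band to a few
# bits, then losslessly entropy-code it with a Huffman code.
import heapq
import random
from collections import Counter

def huffman_lengths(samples):
    """Return the Huffman code length (in bits) for each symbol."""
    freq = Counter(samples)
    if len(freq) == 1:                  # degenerate: a single symbol
        return {next(iter(freq)): 1}
    # Heap entries: (count, tiebreak, {symbol: depth-so-far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**c1, **c2}.items()}
        heapq.heappush(heap, (n1 + n2, tick, merged))
        tick += 1
    return heap[0][2]

# Synthetic "high band": small values dominate, mimicking low-level
# ultrasonic content quantized to 5 bits (an assumed distribution).
random.seed(0)
band = [min(31, int(abs(random.gauss(0, 3)))) for _ in range(10_000)]

lengths = huffman_lengths(band)
total_bits = sum(lengths[s] for s in band)
print(f"average bits/sample: {total_bits / len(band):.2f}")  # well under 8
```

Whether real music's high band is always compressible enough this way is exactly the open question; test signals with unusual statistics, like @GoldenOne's, would defeat it.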
 

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,700
Location
Hampshire
Your same question, however, holds for the 8 LSBs of a 24-bit MQA file. There, I have read two (mildly incompatible) answers.

Answer 1:
  1. After splitting the data into low and high frequency parts, reduce the bit-depth of the former to 16 bits and then pad with zeroes (to increase the sample-size "back" to 24 bits).
  2. Reduce the bit-depth of the high frequency part even more.
  3. Huffman-code (losslessly compress) the latter.
  4. Insert the compressed data in the 8 LSBs of the "24-bit" file.
Answer 2: Step 3 above is replaced by some psycho-acoustic lossy compression scheme.
The encoding must by mathematical necessity be lossy. Since the output (8 bits per sample) is always smaller than the input (16 bits per sample or more), there are fewer possible outputs than there are inputs. It trivially follows that some distinct inputs must be encoded identically. This is known as the pigeon-hole principle.
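
The counting argument can be made concrete in a few lines. The truncating encoder below is a deliberately dumb stand-in, but the collision count holds for any deterministic 16-bit-to-8-bit mapping:

```python
# Pigeon-hole principle, concretely: any fixed mapping from 16-bit
# samples to 8-bit codes must send distinct inputs to the same output,
# so no decoder can recover every input.

def toy_encoder(x: int) -> int:
    """Any deterministic 16-bit -> 8-bit map will do; here, truncation."""
    return x >> 8

codes = {}
collisions = 0
for x in range(2**16):
    c = toy_encoder(x)
    if c in codes:
        collisions += 1      # a distinct input mapped to a used code
    else:
        codes[c] = x

print(f"{2**16} inputs -> {len(codes)} distinct codes, "
      f"{collisions} inputs colliding with an earlier one")
# 65536 inputs -> 256 distinct codes, 65280 collisions: lossy by counting.
```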
 

samsa

Addicted to Fun and Learning
Joined
Mar 31, 2020
Messages
506
Likes
589
The encoding must by mathematical necessity be lossy. Since the output (8 bits per sample) is always smaller than the input (16 bits per sample or more), there are fewer possible outputs than there are inputs.

Hence Step 2 above: reduce the bit depth to something (much) less than 16 bits. The resulting (reduced bit-depth) data can then be Huffman-coded using (less than) 8 bits/sample.
 

RichB

Major Contributor
Forum Donor
Joined
May 24, 2019
Messages
1,946
Likes
2,611
Location
Massachusetts
The encoding must by mathematical necessity be lossy. Since the output (8 bits per sample) is always smaller than the input (16 bits per sample or more), there are fewer possible outputs than there are inputs. It trivially follows that some distinct inputs must be encoded identically. This is known as the pigeon-hole principle.

MQA has moved beyond many such outdated concepts:
  • Nyquist
  • Lossy and lossless
  • the pigeon-hole principle (that's a new feature)
- Rich
 

John Atkinson

Active Member
Industry Insider
Reviewer
Joined
Mar 20, 2020
Messages
165
Likes
1,022
Do you know of any information on how the MQA algorithm actually folds the high frequency data into those 3 least significant bits?

I just can’t get my head around how 3 bits at the lowest sampling rates of 44.1 or 48 thousand samples per second can come anywhere near to storing the data that has been lost in the folding process.

Forget the “3 bits” that people are mentioning. It’s a red herring (or maybe the sound of axes grinding). I already described in message https://www.audiosciencereview.com/...-music-on-tidal-to-test-mqa.22549/post-759747 how, with a digital audio recording of actual music, it is possible to create a hidden data channel in the least significant bits without losing resolution or “bits.” So forget about MQA for now and consider the following thought experiment (which has nothing to do with “deblurring,” “leaky” reconstruction filters, B-splines, etc):

Imagine that I have a 24-bit audio file of the music from which I extracted that room tone recording mentioned earlier, sampled at 2Fs (88.2kHz). I would like to create a version of that file that will play with a baseband sample rate (44.1kHz) in systems with antique D/A converters but also play at the original 88.2kHz sample rate in my big rig.

I take that 24-bit file and using a complementary pair of low- and high-pass digital filters, I split it into two 24-bit files: one containing content below 22.05kHz so that it can now be considered as having an effective sample rate of 44.1kHz; the other containing content from 22.05kHz to 44.1kHz. As long as the filters used are of a specific type, the band splitting will be transparent.

I examine the spectrum of the background analog noise in the baseband file and calculate that I can create a hidden data channel in the 5 LSBs (bits 20-24), which are 2 bits (12dB) below the lowest amplitude of the audio data. I then examine the spectrum of the 2Fs file. I find that, as expected, the ultrasonic content both has a self-similar spectrum that declines in amplitude with increasing frequency and is at a low level. The level is so low, in fact, that the actual quantization is close to 5 bits.

So, if I encrypt the 5-bit/2Fs data as pseudorandom noise with a spectrum identical to the background noise in the baseband file, I can bury those data in the hidden 5-bit data channel. I now have a single 24-bit file sampled at 44.1kHz that will playback with the same audio quality as the original file (other than the low-pass filtering at 22.05kHz).

For playback in the big rig, a flag that I have embedded in the file’s metadata tells the D/A processor that it has to extract and de-encrypt the 5-bit audio data in the hidden channel. It then upsamples those data to 2Fs, attenuates the data to the level in the original file – this pushes the 5-bit quantization noise below the original background noise floor – and adds the result to the baseband file that has also been upsampled to 2Fs.

In theory, I am playing back the 24-bit baseband file as if it were a 2Fs file with no loss of bandwidth or information or resolution or “bits.” (That would be the “thought experiment” equivalent of the “MQA Stereo original resolution” files in the 2L screenshot you included in your post.)

The devil, of course, lies in the details. How do I encrypt the low-bit-rate content between 22.05kHz and 44.1kHz so it resembles pseudorandom noise? I have no idea, even though I had discussions with the late Michael Gerzon about this back in the day. What if the starting point is a 16-bit file, where there is much less information space in which to embed a hidden data channel beneath the analog noise floor? (That is the “thought experiment” equivalent of the “MQA-CD” files in your 2L screenshot.) Again, I don’t know. What if the statistics of the original audio don’t conform with the self-similar spectrum that I am expecting? That, of course, is how GoldenOne “broke” the encoder.

But again, to talk about “losing 3 bits” or “truncating” the audio data is incorrect.
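
The LSB burial at the heart of this thought experiment round-trips exactly, as a short sketch can show. This only does the bit-packing step, skipping the spectral shaping and encryption that make the buried data look like noise; all names are illustrative, not MQA's:

```python
# Bury a 5-bit payload in the 5 LSBs of 24-bit samples, then recover it
# bit-perfectly on the "decoder" side.
import random

HIDE_BITS = 5
MASK = (1 << HIDE_BITS) - 1

def embed(baseband_24bit, payload_5bit):
    """Replace the 5 LSBs of each 24-bit sample with payload data."""
    return [(s & ~MASK) | (p & MASK)
            for s, p in zip(baseband_24bit, payload_5bit)]

def extract(stream_24bit):
    """Decoder side: read the 5 LSBs back out."""
    return [s & MASK for s in stream_24bit]

random.seed(1)
baseband = [random.randrange(1 << 24) for _ in range(1000)]
payload  = [random.randrange(1 << HIDE_BITS) for _ in range(1000)]

stream = embed(baseband, payload)
assert extract(stream) == payload                       # payload intact
assert all(a >> HIDE_BITS == b >> HIDE_BITS             # top 19 bits intact
           for a, b in zip(stream, baseband))
print("hidden 5-bit channel round-trips losslessly")
```

The cost, as the thought experiment notes, is that the baseband signal's own resolution below those 5 LSBs is given up to the hidden channel.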

John Atkinson
Technical Editor, Stereophile
 

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,700
Location
Hampshire
@mansr Where do the “16 bits per sample or more“ come from? (Be patient with me :) )
The input to the encoder core is 24-bit PCM at 96 kHz (or 88.2 kHz, but let's keep it simple). Such a signal can be split into two parts, one containing the frequencies below 24 kHz, the other those above. Since the bandwidth of each part is 24 kHz, the sample rate can be reduced to 48 kHz, making the total bit rate unchanged. If done right, the high and low bands can be recombined, recreating the original signal without loss. Many codecs include such band splitting. This is nothing new.

What MQA does is (try to) compress the, still 24-bit, high band signal into 8 bits per sample so it can be placed in the low bits of the 24-bit baseband signal. As I said above, this step cannot be lossless for arbitrary inputs, though it could be for a restricted subset. If this is the case, MQA has failed to communicate the requirements.

Repurposing the low bits obviously takes dynamic range away from the baseband signal. To partially compensate for this, a peak extension mechanism is used. Compressed data for this is stored in the 9th bit from the bottom along with some basic parameters used by the decoder.

To avoid the coded data creating unwanted spectral patterns, it is scrambled, and the baseband PCM data is dithered at 15 bits, leaving about 13.5 bits of audio data accessible without a decoder.
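
The lossless split-and-recombine step described above can be demonstrated with an ideal FFT-based filter pair. This is a toy with brick-wall filters, not MQA's actual short filters, which trade this mathematical perfection for time-domain behavior:

```python
# Split a signal at rate 2Fs into two critically sampled bands and
# recombine them exactly, using ideal (brick-wall) FFT filters.
import numpy as np

def band_split(x):
    """Split x (sampled at 2Fs) into low and high bands at rate Fs."""
    N = len(x)                                    # N divisible by 4
    X = np.fft.fft(x)
    q = N // 4
    low_spec  = np.concatenate([X[:q], X[-q:]])   # |f| <  Fs/2
    high_spec = X[q:-q]                           # Fs/2 <= |f| < Fs
    return np.fft.ifft(low_spec) / 2, np.fft.ifft(high_spec) / 2

def band_join(low, high):
    """Exactly invert band_split by reassembling the full spectrum."""
    q = len(low) // 2
    L = 2 * np.fft.fft(low)
    H = 2 * np.fft.fft(high)
    X = np.concatenate([L[:q], H, L[q:]])
    return np.fft.ifft(X)

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
low, high = band_split(x)
y = band_join(low, high).real
print(f"max reconstruction error: {np.max(np.abs(x - y)):.2e}")  # ~1e-15
```

Note that the split itself is lossless here; it is the subsequent squeeze of the 24-bit high band into 8 bits that cannot be.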
 

Raindog123

Major Contributor
Joined
Oct 23, 2020
Messages
1,599
Likes
3,554
Location
Melbourne, FL, USA
The input to the encoder core is 24-bit PCM at 96 kHz...

Thanks. Yes, at some point I realized that I was thinking reconstruction/decoding while you guys were describing encoding. (So I withdrew my question, but thanks again!)

What you describe is consistent with what John just posted... Wish he had answers to his own questions at the bottom...
 

John Atkinson

Active Member
Industry Insider
Reviewer
Joined
Mar 20, 2020
Messages
165
Likes
1,022
John, “3 bits” come from 16/44.1 MQA files. Do you have a description for how those are handled?

As I wrote, if the starting point is a 16-bit file, there is much less information space in which to embed a hidden data channel beneath the analog noise floor. I don’t know how that is handled. But, of course, if the starting point is sampled at 44.1kHz or 48kHz, there are no ultrasonic data to embed in a hidden data channel.

John Atkinson
Technical Editor, Stereophile
 

Sir Sanders Zingmore

Addicted to Fun and Learning
Forum Donor
Joined
May 20, 2018
Messages
947
Likes
1,899
Location
Melbourne, Australia
For me, the main issue with MQA is probably unrelated to this thread: it's their ability to effectively impose DRM if they so choose. You can argue technicalities but if/when MQA controls authentication at all levels of the chain, then they control the quality of what you hear and what they can charge for it.

In terms of this thread, one of the things I love about ASR is how it holds manufacturers to account against their own claims. Again, we can argue technicalities, but this thread clearly shows that MQA's claims don't stand up to testing. They should be called out for that in exactly the same way any other manufacturer would be on ASR.
 

Raindog123

Major Contributor
Joined
Oct 23, 2020
Messages
1,599
Likes
3,554
Location
Melbourne, FL, USA
As I wrote, if the starting point is a 16-bit file, there is much less information space in which to embed a hidden data channel beneath the analog noise floor. I don’t know how that is handled. But, of course, if the starting point is sampled at 44.1kHz or 48kHz, there are no ultrasonic data to embed in a hidden data channel.

John Atkinson
Technical Editor, Stereophile

Thanks. That raises the question of whether there is any benefit to using the MQA encoding scheme in this case...
 

mkawa

Addicted to Fun and Learning
Joined
Sep 17, 2019
Messages
788
Likes
695
I think the lesson we can all take away from this thread is that none of us can wait until Spotify goes lossless. This will all be a dark memory and that will be that.
 

levimax

Major Contributor
Joined
Dec 28, 2018
Messages
2,348
Likes
3,462
Location
San Diego
For me, the main issue with MQA is probably unrelated to this thread: it's their ability to effectively impose DRM if they so choose. You can argue technicalities but if/when MQA controls authentication at all levels of the chain, then they control the quality of what you hear and what they can charge for it.

If you read their patents, they describe some of the DRM capabilities, but the company has since claimed they will never use them. Personally, I don't trust them and am continuing to build my personal library with cheap used CDs. My guess is streaming is going this route (DRM, higher prices, less choice, etc.) whether it is MQA or something else.
 

TurbulentCalm

Member
Joined
Mar 18, 2021
Messages
82
Likes
196
Location
Australia
Forget the “3 bits” that people are mentioning. It’s a red herring (or maybe the sound of axes grinding). […]

John Atkinson
Technical Editor, Stereophile

John.

Thank you for providing such a detailed reply. I’m not sure that I’m going to understand it all but I’ll be trying really hard to do so.
 

restorer-john

Grand Contributor
Joined
Mar 1, 2018
Messages
12,579
Likes
38,280
Location
Gold Coast, Queensland, Australia
That raises the question of whether there is any benefit to using the MQA encoding in this case

Only to Bob Stuart, by butchering those original recordings and distributing them to hapless consumers who'll lap up those mutilated and contaminated files believing they are being sold something special.

It borders on criminal because the premise of "master quality" of those original "early CDs" is a blatant lie. Actually a complete bare-faced lie. In this country it fits neatly into a breach of Australian Consumer Law as the goods are "not as described".

 

restorer-john

Grand Contributor
Joined
Mar 1, 2018
Messages
12,579
Likes
38,280
Location
Gold Coast, Queensland, Australia
John.

Thank you for providing such a detailed reply. I’m not sure that I’m going to understand it all but I’ll be trying really hard to do so.

Note @John Atkinson makes no claims as to the efficacy of the entire convoluted MQA process in bringing about any measurable parameter improvement in the delivery of digital audio to consumers.
 