
MQA creator Bob Stuart answers questions.

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
This appears to be the case, for me, in the context of the audio systems I had access to, and genres of music I ever cared about. 192/24 is 100% for me, to the best of my knowledge, as of today.
You seem to have shifted from 192 kHz PCM being theoretically and objectively 100% capable of capturing audio (page 11 or thereabouts), to some sort of subjective judgement - that I'm not even sure you can make without comparing it to the original event in some sort of controlled trial.

I think we came too close to letting the cat out of the bag that MQA was not necessary, and you have for some reason had to change tack in order to maintain the mystery, fear, uncertainty, doubt...
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Again, you are imagining infinitely sharp transients. These don't exist in music. What exist are sudden jumps in amplitude, which are called transients. Think of something like hitting a piano key. That is a transient. And of course very much audible.

I agree that nothing infinite exists in music. There is a twist though. Think about the perfect reconstruction procedure, per the Sampling Theorem: the value of the analog output signal at a given analog point is calculated as a sum of values of Sinc functions. Sinc function has conventionally continuous derivatives of any order for non-zero values. Thus, a sum of a finite sequence of values of such functions at such analog point shall have continuous derivatives of any order as well.

But if the signal exhibits the sudden jump you mentioned, even its first derivative isn't continuous anymore. There could also be jumps in the first derivative itself, less visible to the naked eye, which would make the second derivative discontinuous, and so on. Basically, the reconstruction math doesn't apply to such functions unless we take a sum over an infinite sequence of sinc values, which we can't do, because a music track has a finite duration.
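As a rough illustration of the finite sum discussed above, here is a minimal sketch of the Whittaker-Shannon reconstruction truncated to the samples actually available. The helper names (`sinc`, `reconstruct`), the 1 kHz test tone, and the 2001-sample length are assumptions chosen for the demo, not anything from the thread. It only shows how far a finite sum lands from the true value between two samples of a band-limited tone; it does not settle the argument about transients.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, fs, t):
    """Whittaker-Shannon sum, truncated to the samples we actually have."""
    return sum(s * sinc(fs * t - n) for n, s in enumerate(samples))

fs = 44100.0       # sample rate in Hz
f0 = 1000.0        # test tone in Hz, well below Nyquist (assumed for the demo)
n_samples = 2001   # finite "track" length (assumed for the demo)
samples = [math.sin(2 * math.pi * f0 * n / fs) for n in range(n_samples)]

# Evaluate halfway between two samples near the middle of the window.
t_mid = (n_samples // 2 + 0.5) / fs
estimate = reconstruct(samples, fs, t_mid)
exact = math.sin(2 * math.pi * f0 * t_mid)
error = abs(estimate - exact)
print(f"estimate={estimate:.6f} exact={exact:.6f} error={error:.2e}")
```

The residual error comes purely from cutting the sum off at the ends of the sample window; it shrinks as more samples are included on each side of the evaluation point.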

Lossy compression codecs have trouble with these transients and create pre-echo which can be quite audible if the bit rate is low. I spent a lifetime with my team minimizing loss of fidelity in handling them. Uncompressed PCM music has none of that pre-echo. So you are really chasing an invisible flaw.

I agree with you if what you say implicitly includes "Uncompressed PCM with sufficient sampling rate and bit depth". The more I think about 44/16, the more I realize that the most common of the historically ubiquitous media formats that the CD beats 100% on all parameters is the cassette tape.

192/24 provides about 6.5x more bits per second than CD. You know what other comparison of "delivery media formats" that affect human perception could be interesting to ponder - one not as irrelevant to music sampling rates and bit depths as it may appear at first sight? :)

~6.5% alcohol:

[Image: achouffe-chouffe-bok-6666-beers-photo-1]


6.5 x 6.5 = ~42% alcohol

[Image: Proof-2-523x330.jpg]
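For reference, the 6.5x figure quoted in this post can be checked with a few lines of arithmetic. This is only a sketch: the function name `pcm_bits_per_second` is made up for the example, and the rates are per channel for uncompressed PCM.

```python
def pcm_bits_per_second(sample_rate_hz, bit_depth, channels=1):
    """Raw bit rate of uncompressed PCM."""
    return sample_rate_hz * bit_depth * channels

cd = pcm_bits_per_second(44_100, 16)       # CD: 44.1 kHz, 16 bit
hires = pcm_bits_per_second(192_000, 24)   # 192 kHz, 24 bit
ratio = hires / cd

print(f"CD:     {cd} bit/s per channel")
print(f"192/24: {hires} bit/s per channel")
print(f"ratio:  {ratio:.2f}x")
```

The ratio is the same for stereo, since the channel count cancels out.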
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,811
Likes
242,872
Location
Seattle Area
I agree with you if what you say implicitly includes "Uncompressed PCM with sufficient sampling rate and bit depth".
No it doesn't. My hearing is trained to be especially tuned to hear pre-echo, and there is none of it at 16/44.1. Untrained listeners usually can't hear pre-echo in 4:1 lossy compression, let alone uncompressed. As I said, you continue to be chasing ghosts.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,811
Likes
242,872
Location
Seattle Area
I agree that nothing infinite exists in music. There is a twist though. Think about the perfect reconstruction procedure, per the Sampling Theorem: the value of the analog output signal at a given analog point is calculated as a sum of values of Sinc functions. Sinc function has conventionally continuous derivatives of any order for non-zero values. Thus, a sum of a finite sequence of values of such functions at such analog point shall have continuous derivatives of any order as well.
They are infinite because the filter is deemed to have 100% brick wall response. We don't need a filter that sharp.
 

flipflop

Addicted to Fun and Learning
Joined
Feb 22, 2018
Messages
927
Likes
1,243
This appears to be the case, for me, in the context of the audio systems I had access to, and genres of music I ever cared about. 192/24 is 100% for me, to the best of my knowledge, as of today.
I'd think so too, if all the tracks I cared about, including Gamelan and Prog Rock (I think I could survive without Mariachi for a while :)), were available in 192/24. But they aren't :( Perhaps that's why I'm so passionate about this whole Hi-Res thing: I want more music in 192/24!
I'm willing to bet serious money that 48/16 is "100% for [you]".
 

Jim777

Active Member
Forum Donor
Joined
May 28, 2019
Messages
124
Likes
204
Location
Greater Boston
No it doesn't. My hearing is trained to be especially tuned to hear pre-echo, and there is none of it at 16/44.1. Untrained listeners usually can't hear pre-echo in 4:1 lossy compression, let alone uncompressed. As I said, you continue to be chasing ghosts.
So true. I also worked on pre-echo (in AAC), and 4:1 probably requires something like applause or castanets to maybe hear a degradation (lossless is approximately 2:1).
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
Could I suggest an analogy?

If I want to display a beautiful, clean rendition of a typeface on a high resolution screen or laser printer, I presume it is best to use splines or similar to define the characters. I can scale them to any size I like, and they can incorporate 'infinitely sharp' corners and edges.

If I decide to use a frequency domain-based system to represent the image (and I'm thinking that some variant of JPEG coding might represent this), my beautiful typeface is spoiled: the sharp corners and edges are softened and have visible ringing. By throwing 'bandwidth' at it I can probably keep improving the result to any arbitrary degree, but it will never be as good as the original spline-based representation.

Does the PCM + sinc filter system suffer from this same inability to represent sharp edges cleanly, especially when the sinc filter is not infinite?

@Sergei, any good?
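The ringing described in the analogy can be reproduced numerically. The sketch below is an idealised case only: it sums the Fourier series of a square wave up to a chosen harmonic, which corresponds to a perfect brick-wall band limit and not to any particular DAC or codec. The function names and the harmonic counts (19 and 199) are assumptions picked for the demo.

```python
import math

def bandlimited_square(t, max_harmonic):
    """Fourier partial sum of a +/-1 square wave with period 2*pi."""
    return (4 / math.pi) * sum(
        math.sin(k * t) / k for k in range(1, max_harmonic + 1, 2)
    )

def first_peak(max_harmonic, steps=2000):
    """Scan the quarter period (0, pi/2] and return (peak value, time of peak)."""
    best_value, best_time = -1.0, 0.0
    for i in range(1, steps + 1):
        t = (math.pi / 2) * i / steps
        value = bandlimited_square(t, max_harmonic)
        if value > best_value:
            best_value, best_time = value, t
    return best_value, best_time

peak_narrow, t_narrow = first_peak(19)    # 10 odd harmonics kept
peak_wide, t_wide = first_peak(199)       # 100 odd harmonics kept

print(f"up to harmonic 19:  peak {peak_narrow:.4f} at t = {t_narrow:.4f}")
print(f"up to harmonic 199: peak {peak_wide:.4f} at t = {t_wide:.4f}")
```

With more bandwidth the overshoot moves closer to the edge and gets narrower, but its height stays roughly the same (the Gibbs phenomenon). Whether that matters for audio is a separate question, since it only appears when the input itself contains an ideal edge.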
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
You seem to have shifted from 192 kHz PCM being theoretically and objectively 100% capable of capturing audio (page 11 or thereabouts), to some sort of subjective judgement - that I'm not even sure you can make without comparing it to the original event in some sort of controlled trial.

What interests me is finding out why individuals have such different preferences for the parameters of the music delivery chain. As I mentioned before, I don't doubt that for some humans 44/16 feels sufficient, for others 48/24, for yet others 96/24, and for others like me 192/24. Not long ago I learned that some have already moved on to 384/24 - but I personally don't sense an urge to do so.

It is sad really that so many people believe that major hearing system parameters are shared among individuals. I believe it stems from the fact that the cochlea is hidden inside the skull bone, and thus not easy to observe in vivo. For other variations in perceptual sensitivity, there is no such controversy, as their basis is much easier to measure objectively: e.g. https://www.scientificamerican.com/article/super-tasting-science-find-out-if-youre-a-supertaster/.
I think we came too close to letting the cat out of the bag that MQA was not necessary, and you have for some reason had to change tack in order to maintain the mystery, fear, uncertainty, doubt...

Fear? Not so much. Yet there is still mystery, uncertainty, and doubt. Not about MQA specifically, but about the facts and theories of Hearing Science. As you can read even on this thread, quite a few widely cited studies have been either completely disproved by later examination, or rendered inconclusive. We have to maintain vigilance - and therefore mystery, uncertainty, and doubt are going to be healthy companions in this intellectual quest.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Could I suggest an analogy?

If I want to display a beautiful, clean rendition of a typeface on a high resolution screen or laser printer, I presume it is best to use splines or similar to define the characters. I can scale them to any size I like, and they can incorporate 'infinitely sharp' corners and edges.

If I decide to use a frequency domain-based system to represent the image (and I'm thinking that some variant of JPEG coding might represent this), my beautiful typeface is spoiled: the sharp corners and edges are softened and have visible ringing. By throwing 'bandwidth' at it I can probably keep improving the result to any arbitrary degree, but it will never be as good as the original spline-based representation.

I like it. Perhaps a counterpart of such splines in the music world is the original MIDI, and its more evolved extensions and versions (e.g. https://en.wikipedia.org/wiki/MIDI#MIDI_2.0).
Does the PCM + sinc filter system suffer from this same inability to represent sharp edges cleanly, especially when the sinc filter is not infinite?

@Sergei, any good?

Some of the math is common to audio, image, and video processing - just parameterized by the number of dimensions. I expect the image processing experts here to answer your question more authoritatively.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
Some of the math is common to audio, image, and video processing - just parameterized by the number of dimensions. I expect the image processing experts here to answer your question more authoritatively.
Well I know that the standard PCM system can't capture or produce an infinitely sharp edge because of bandwidth limiting - it's an essential requirement of sampling at a finite rate, otherwise we can't know what's supposed to be between the samples.

But once we have captured a waveform using the standard PCM system, perhaps at 768 kHz, there is nothing to stop us encoding and reproducing it using a spline-based system. Is this what MQA is about?
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,916
Likes
37,977
What interests me is finding out why individuals have such different preferences for the parameters of the music delivery chain. As I mentioned before, I don't doubt that for some humans 44/16 feels sufficient, for others 48/24, for yet others 96/24, and for others like me 192/24. Not long ago I learned that some have already moved on to 384/24 - but I personally don't sense an urge to do so.

It is sad really that so many people believe that major hearing system parameters are shared among individuals. I believe it stems from the fact that the cochlea is hidden inside the skull bone, and thus not easy to observe in vivo. For other variations in perceptual sensitivity, there is no such controversy, as their basis is much easier to measure objectively: e.g. https://www.scientificamerican.com/article/super-tasting-science-find-out-if-youre-a-supertaster/.


Fear? Not so much. Yet there is still mystery, uncertainty, and doubt. Not about MQA specifically, but about the facts and theories of Hearing Science. As you can read even on this thread, quite a few widely cited studies have been either completely disproved by later examination, or rendered inconclusive. We have to maintain vigilance - and therefore mystery, uncertainty, and doubt are going to be healthy companions in this intellectual quest.
My incredulity is having made recordings, even with wide bandwidth microphones. Even at 192 rates. And I don't hear an advantage. Combine that with so many people who hear big improvements with hirez as long as they are sighted, but fail to do so when not sighted, and I simply don't believe them to have heard anything except the idea they are hearing more.

I've said it before, if the difference in hirez and redbook were like HDTV and SDTV there would be no question at all. That it is debatable this far on tells me the difference if there is one, to anyone, it is very, very small. There are no superstars in this one. A difference that is perhaps non-existent to almost everyone, and at best very small to anyone.

If you had coherent directly pertinent material to make your case, you would have done so by now. Even you must admit any difference is a small one.
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,070
Location
Zg, Cro
My incredulity is having made recordings, even with wide bandwidth microphones. Even at 192 rates. And I don't hear an advantage. Combine that with so many people who hear big improvements with hirez as long as they are sighted, but fail to do so when not sighted, and I simply don't believe them to have heard anything except the idea they are hearing more.

I've said it before, if the difference in hirez and redbook were like HDTV and SDTV there would be no question at all. That it is debatable this far on tells me the difference if there is one, to anyone, it is very, very small. There are no superstars in this one. A difference that is perhaps non-existent to almost everyone, and at best very small to anyone.

If you had coherent directly pertinent material to make your case, you would have done so by now. Even you must admit any difference is a small one.

I fully agree. 44.1/16 PCM would fail to encode correctly some odd-shaped signals in the 11-22kHz range but luckily for us there are no such signals in the music we're listening to. Not to mention that our ability to hear the differences between them is small and is declining with age. :)
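One way to read the "odd-shaped signals in the 11-22kHz range" remark, and this is an interpretation rather than something stated in the post, is as periodic waveforms whose fundamental lies in that range. Any shape other than a sine needs harmonics at 2x, 3x, ... the fundamental, and for a fundamental above 11.025 kHz all of those fall above the 22.05 kHz Nyquist limit of 44.1 kHz sampling. The hypothetical helper below just counts which harmonics fit.

```python
def harmonics_below_nyquist(fundamental_hz, sample_rate_hz):
    """List the harmonics (fundamental included) strictly below Nyquist."""
    nyquist = sample_rate_hz / 2
    kept = []
    k = 1
    while k * fundamental_hz < nyquist:
        kept.append(k * fundamental_hz)
        k += 1
    return kept

for f0 in (1_000, 5_000, 12_000):
    kept = harmonics_below_nyquist(f0, 44_100)
    print(f"{f0} Hz fundamental: {len(kept)} harmonic(s) fit at 44.1 kHz")

kept_96k = harmonics_below_nyquist(12_000, 96_000)
print(f"12000 Hz fundamental: {len(kept_96k)} harmonic(s) fit at 96 kHz")
```

So a 12 kHz waveform of any shape is stored at 44.1 kHz as a plain 12 kHz sine; the lost components start at 24 kHz, which is where the audibility argument in this thread picks up.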
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,181
Location
UK
My incredulity is having made recordings, even with wide bandwidth microphones. Even at 192 rates. And I don't hear an advantage. Combine that with so many people who hear big improvements with hirez as long as they are sighted, but fail to do so when not sighted, and I simply don't believe them to have heard anything except the idea they are hearing more.

I've said it before, if the difference in hirez and redbook were like HDTV and SDTV there would be no question at all. That it is debatable this far on tells me the difference if there is one, to anyone, it is very, very small. There are no superstars in this one. A difference that is perhaps non-existent to almost everyone, and at best very small to anyone.

If you had coherent directly pertinent material to make your case, you would have done so by now. Even you must admit any difference is a small one.
I'm not thinking for a moment that I'm going to be hearing any difference whatsoever between 44/16 and 192/24, or MQA, but I am interested in the theoretical motivation behind it.

I think I am beginning to dimly conceive of a system that is able to start with a high bandwidth PCM signal and transform it to a lower bandwidth while maintaining the ability to reproduce sharp edges cleanly, just as fonts can be rescaled and stored with varying 'bandwidth' while keeping them perceptually as high quality as possible.

If that's what it is, it sounds to me like a solution looking for a problem, but I might just be persuaded that it's an interesting idea. What I've been missing so far is any explanation that resembles what I just said; everyone is so fixated on the frequency domain that they define everything in terms of leakage and aliasing etc.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
My incredulity is having made recordings, even with wide bandwidth microphones. Even at 192 rates. And I don't hear an advantage. Combine that with so many people who hear big improvements with hirez as long as they are sighted, but fail to do so when not sighted, and I simply don't believe them to have heard anything except the idea they are hearing more.

I've said it before, if the difference in hirez and redbook were like HDTV and SDTV there would be no question at all. That it is debatable this far on tells me the difference if there is one, to anyone, it is very, very small. There are no superstars in this one. A difference that is perhaps non-existent to almost everyone, and at best very small to anyone.

If you had coherent directly pertinent material to make your case, you would have done so by now. Even you must admit any difference is a small one.

You have no idea :) We went as far as recording live performances with three independent sets of ADC boxes: at 48/24, 96/24, and 192/24; and giving the tracks (numbering between 12 and 14) to three different mixing engineers to produce mixes they liked. Also doing downsampling of the resulting mixes from 192/24 to lower resolutions. Also converting them to MP3 and AAC.

The three mixing engineers had quite different preferences for the mixes. One of them would de-emphasize high-frequencies, and in general didn't like transients. Another one liked "crisp" and "punchy" mixes. The third one was somewhere in between. Mixes of the first engineer tended to translate well all the way down to 48/16, as they were "mellow" to start with. MP3 and AAC of these mixes sounded OK too.

Mixes of the second engineer not so much: that's where I could hear the differences between the 192/24 and downsampled 48/24. Interestingly enough, I personally couldn't hear differences of 192/24 vs 96/24 and 96/24 vs 48/24, so it was "two steps down" that would reveal the differences to me. You are right, the differences were small.

And as I mentioned before, for simple music, such as a solo vocalist accompanied by a solo acoustic guitar, there were no differences. The mixes of the three engineers sounded virtually identical, between each other, and at all sampling rates.
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,070
Location
Zg, Cro
You have no idea :) We went as far as recording live performances with three independent sets of ADC boxes: at 48/24, 96/24, and 192/24; and giving the tracks (numbering between 12 and 14) to three different mixing engineers to produce mixes they liked. Also doing downsampling of the resulting mixes from 192/24 to lower resolutions. Also converting them to MP3 and AAC.

The three mixing engineers had quite different preferences for the mixes. One of them would de-emphasize high-frequencies, and in general didn't like transients. Another one liked "crisp" and "punchy" mixes. The third one was somewhere in between. Mixes of the first engineer tended to translate well all the way down to 48/16, as they were "mellow" to start with. MP3 and AAC of these mixes sounded OK too.

Mixes of the second engineer not so much: that's where I could hear the differences between the 192/24 and downsampled 48/24. Interestingly enough, I personally couldn't hear differences of 192/24 vs 96/24 and 96/24 vs 48/24, so it was "two steps down" that would reveal the differences to me. You are right, the differences were small.

Is this supposed to prove there is an audible difference between the recordings?
 

JJB70

Major Contributor
Forum Donor
Joined
Aug 17, 2018
Messages
2,905
Likes
6,161
Location
Singapore
My honest opinion is that if differences are so marginal that there is a debate over whether they are audible and even if real can only be discerned by a trained listener under controlled conditions (even there with less than a 100% success rate in DBT) then that tells its own story. That story being that it is all meaningless in the real world of using hifi and carrier media to just enjoy music.
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,070
Location
Zg, Cro
I fully agree. 44.1/16 PCM would fail to encode correctly some odd-shaped signals in the 11-22kHz range but luckily for us there are no such signals in the music we're listening to. Not to mention that our ability to hear the differences between them is small and is declining with age. :)

I do think that @RayDunzl 's and @Blumlein 88 's album "Things we find in the kitchen" should be recorded at 96/24 but the rest of the "normal" stuff - sorry, simply no need.. :D
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,070
Location
Zg, Cro
I think I am beginning to dimly conceive of a system that is able to start with a high bandwidth PCM signal and transform it to a lower bandwidth while maintaining the ability to reproduce sharp edges cleanly, just as fonts can be rescaled and stored with varying 'bandwidth' while keeping them perceptually as high quality as possible.

If that's what it is, it sounds to me like a solution looking for a problem..

LOL Very good way of putting it! :D
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
But once we have captured a waveform using the standard PCM system, perhaps at 768 kHz, there is nothing to stop us encoding and reproducing it using a spline-based system. Is this what MQA is about?

There are two major uses of splines in audio time domain processing I know of:
(A) Interpolating between samples during upsampling.
(B) Using them as basis functions for sampling and reconstruction of analog signal.

(A) is not actually the best interpolation approach. It is fast computationally, and produces nice smooth upsampled graphs pleasing the eye, yet it introduces distortions that the sinc-based interpolation and Fourier-based upsampling don't.

(B) is theoretically attractive because, by changing the spline order parameter, it lets one trade off between the most accurate representation of the resulting analog output in the frequency domain and its most accurate representation in the time domain, while getting rid of the infinite character of sinc-based reconstruction.

So, I could imagine using different spline orders for recording an opera singer solo vs a death metal concert. However, (B) is computationally challenging, and requires either a proprietary end-to-end system (hence the interest of the MQA creators, I guess) or a new set of open standards. IMHO, the latter is not going to happen any time soon - PCM is very convenient and too entrenched.
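Point (A) above can be checked on a toy case. The sketch below compares the simplest spline, a first-order one (straight-line interpolation), against a truncated sinc sum when estimating the value midway between samples of a 10 kHz tone sampled at 44.1 kHz. The tone frequency, window sizes, and function names are assumptions for the demo; higher-order splines would do better than the linear one shown here.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

fs = 44100.0   # sample rate in Hz
f0 = 10000.0   # test tone in Hz, below the 22.05 kHz Nyquist limit
samples = [math.sin(2 * math.pi * f0 * n / fs) for n in range(4001)]

def linear_midpoint(n):
    """First-order spline: average of the two neighbouring samples."""
    return 0.5 * (samples[n] + samples[n + 1])

def sinc_midpoint(n, half_width=1000):
    """Truncated sinc interpolation using half_width samples on each side."""
    t = n + 0.5
    return sum(samples[k] * sinc(t - k)
               for k in range(n + 1 - half_width, n + 1 + half_width))

def rms_error(interpolate, indices):
    """RMS difference between the interpolated and the true midpoint values."""
    errors = [interpolate(n) - math.sin(2 * math.pi * f0 * (n + 0.5) / fs)
              for n in indices]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

indices = range(2000, 2050)
rms_linear = rms_error(linear_midpoint, indices)
rms_sinc = rms_error(sinc_midpoint, indices)
print(f"linear spline RMS error:  {rms_linear:.4f}")
print(f"truncated sinc RMS error: {rms_sinc:.6f}")
```

Averaging two neighbouring samples of a tone at f0 scales the midpoint value by cos(pi * f0 / fs), which is why the linear result sags noticeably at 10 kHz while the sinc sum stays close to the true value.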
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Is this supposed to prove there is an audible difference between the recordings?

My intent is not to prove anything. Rather, to illustrate why I believe that the hi-res vs CD perception depends a lot on the music genre, on how a particular music piece was captured and mixed, on the listener, and on many other parameters.

To illustrate the illustration of the point, here is a puzzle by Lewis Carroll (paraphrasing):

- You need to walk to a train station to catch a train departing in 2 hours.
- The distance to the train station is 5 miles.
- You walk at a speed of 3 miles per hour.
- Will you catch the train?

The straightforward answer is: "Yes."
Lewis Carroll retorts: "But what if on the way to the station, a mad bull starts chasing you?"
Moral of the story: the correct answer is "I don't know".

So, the correct answer to the "Is there an audible difference between ... ?" question, so hotly debated on forums like this one, is, in most cases, "I don't know", as it leaves many important parameters under-specified.

Once you concretize the question with the music, equipment, person, and person's physiological condition, then you may get to a more concrete answer. Averaging these answers over many variations of music, equipment, person, and trials will get you a general answer. If that's what you are after, great. If you are interested in the answer regarding concretely your music and your equipment - you'd need to experiment on yourself.
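For anyone who does experiment on themselves, one common way to score the result (an assumption here, the post does not prescribe a method) is a blind ABX run evaluated against pure guessing. The sketch below computes the chance of getting at least a given number of trials right with a coin flip; the trial counts are hypothetical.

```python
import math

def abx_p_value(correct, trials):
    """Probability of at least `correct` hits in `trials` by pure guessing."""
    favourable = sum(math.comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Hypothetical outcomes of a 16-trial blind ABX session.
p_strong = abx_p_value(12, 16)
p_weak = abx_p_value(9, 16)
print(f"12/16 correct: p = {p_strong:.4f}")
print(f" 9/16 correct: p = {p_weak:.4f}")
```

A small value only says the score is unlikely to be luck for that listener, that music, and that equipment; it says nothing about other listeners or other material.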
 