
MQA creator Bob Stuart answers questions.

PierreV

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
1,449
Likes
4,818
Well, the MQA shill has successfully derailed the thread and has completely shifted focus from Bob and MQA. No one has noticed yet?

Shifted is the correct term: his MQA arguments have progressively been right-shifted to the least significant bit (or, some will say, to zero). Or maybe they were signed, in which case the result should be undefined, but by convention the board doesn't throw an error.
 

March Audio

Master Contributor
Audio Company
Joined
Mar 1, 2016
Messages
6,378
Likes
9,321
Location
Albany Western Australia
OK. Let me explain what I'm saying from a different angle. Nothing about Fourier Transform anymore :) Imagine that you are in a boxing gym in front of a heavy punching bag. You have an equally strong sparring partner, standing on the opposite side of the bag.

First experiment: you push on the bag, slowly, the pressure of your hand approximating a sinusoid. The bag starts moving. Once it passes the zero velocity point on the partner's side, the partner does what you just did. Continue the exercise, both of you pushing on the bag while it is moving from you, until the bag reaches very high amplitude. That's an approximation of a normal basilar membrane resonance behavior.

Second experiment: you and your partner are punching the bag, each of you landing a strong straight every second, yet out of phase, so that the bag is hit at your side at 0 milliseconds, then at your partner's side at 500 milliseconds, then at your side at 1,000 milliseconds, and so on. You can do this for a very long time, assuming you are very fit :) Yet the bag will not reach the amplitude that it reached in the first experiment. That's an approximation of a basilar membrane response to an ultrasonic wave.

Third experiment: you punch the bag, once. The bag won't reach the amplitude of the first experiment. Yet it won't stay in place like it effectively did during the second experiment either. It will move, because you transferred mechanical momentum (mass multiplied by velocity) to it, which was not quickly counteracted by your partner.

Eventually, the bag will settle in an oscillation with its own characteristic frequency, no matter how softly and slowly, or how hard and quickly, you punched it. The maximum amplitude it reaches will depend on the mechanical momentum transferred by your punch. You can try a "noodle slap" vs "all my body weight in" punches and observe the difference. That's an approximation of basilar membrane response to a short pulse.

In the first and second experiments, we can talk about the frequency, because the applications of force are repetitive. In the third experiment, we can't, because with just one punch, how do we determine the time period until the second punch? The second punch simply doesn't come during the third experiment.

Analogously, the basilar membrane effectively ignores a vigorous application of ultrasonic force analogous to the second experiment. Yet it has to react, somehow, upon getting half of that ultrasonic wave in the third experiment, because the half wave is asymmetrical, and thus transfers mechanical momentum.
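
A minimal sketch of the three experiments, assuming the bag (or membrane) can be treated as a single linear damped resonator with unit mass. That is a toy model, not a cochlear model. The natural frequency (1 Hz), damping ratio (0.05), drive at 20x the natural frequency, and unit force amplitude are all illustrative assumptions that do not come from the posts above.

```python
import math

F0 = 1.0                      # assumed natural frequency of the resonator, Hz
ZETA = 0.05                   # assumed damping ratio
W0 = 2.0 * math.pi * F0       # natural angular frequency, rad/s
W_FAST = 20.0 * W0            # assumed "ultrasonic" drive, 20x the natural frequency


def simulate(force, duration=40.0, dt=0.0005):
    """Integrate x'' + 2*ZETA*W0*x' + W0^2*x = force(t) for unit mass,
    starting at rest, and return the peak |x| seen during the run."""
    x = 0.0
    v = 0.0
    peak = 0.0
    for i in range(int(duration / dt)):
        t = i * dt
        a = force(t) - 2.0 * ZETA * W0 * v - W0 * W0 * x
        v += a * dt           # semi-implicit Euler: velocity first,
        x += v * dt           # then position using the updated velocity
        peak = max(peak, abs(x))
    return peak


def resonant_push(t):
    # Experiment 1: sinusoidal force at the natural frequency.
    return math.sin(W0 * t)


def fast_drive(t):
    # Experiment 2: continuous sinusoidal force far above the natural frequency.
    return math.sin(W_FAST * t)


def half_wave(t):
    # Experiment 3: only the first half cycle of the fast drive, then nothing.
    return math.sin(W_FAST * t) if t < math.pi / W_FAST else 0.0


peak_resonant = simulate(resonant_push)
peak_fast = simulate(fast_drive)
peak_pulse = simulate(half_wave)

print("resonant drive peak:", peak_resonant)
print("fast drive peak:    ", peak_fast)
print("single half-wave:   ", peak_pulse)
```

In this linear model the single half-wave leaves the resonator ringing at its own frequency, with a peak well below the resonant case. How that maps onto real basilar membrane mechanics is outside what the sketch can show.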
You need to be neuralyzed about all things audio and re-educated.

 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA

Tape cartridge rules. Look: https://en.wikipedia.org/wiki/Linear_Tape-Open#Tape_specifications.

LTO-8: linear density is 20,668 bits/mm. Translated to a tape speed of 30 inches per second: 20,668 x 25.4 x 30 = 15,749,016 bits per second = 1,968,627 bytes per second, which is about 2 MB/second = 656,209 24-bit PCM samples per second.

LTO-8 is much faster than that of course. Max uncompressed speed is 360 MB/second, ~180x higher. My point: magnetic tape's inherent information density is high. It's been widely utilized through analog circuitry up until mid-1980s. Now through digital.
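
The arithmetic above can be checked with a short sketch. The constant names are made up for illustration; the figures are the ones quoted in this post (20,668 bits/mm, 30 inches per second, 360 MB/second), with 1 MB taken as 10^6 bytes.

```python
BITS_PER_MM = 20_668          # LTO-8 linear density quoted above
MM_PER_INCH = 25.4
TAPE_SPEED_IPS = 30           # assumed tape speed, inches per second
BYTES_PER_SAMPLE = 3          # one 24-bit PCM sample
LTO8_MAX_BYTES_PER_S = 360e6  # 360 MB/s, taking 1 MB = 10**6 bytes

bits_per_second = BITS_PER_MM * MM_PER_INCH * TAPE_SPEED_IPS
bytes_per_second = bits_per_second / 8
samples_per_second = bytes_per_second / BYTES_PER_SAMPLE
speedup = LTO8_MAX_BYTES_PER_S / bytes_per_second

print(round(bits_per_second), "bits/s")
print(round(bytes_per_second), "bytes/s")
print(round(samples_per_second), "24-bit samples/s")
print(round(speedup, 1), "x faster at the quoted maximum rate")
```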

500px-Supertape_data_storage_capacities.svg.png
 

Wombat

Master Contributor
Joined
Nov 5, 2017
Messages
6,722
Likes
6,465
Location
Australia
Tie it all together in a rational, verifiable, finding/conclusion. :cool:
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Okay let us just spitball this explanation in fairly unrelated terms. Seems a favorite method. I mean literally spitball this.

A punching bag weighs 100 pounds (45 kilos). I create a paper spitball and shoot it from a regulation straw from 10 feet (3 meters) away. The spitball hits the bag and bounces off. We know it must have imparted some momentum. We can even calculate it if necessary. We might need to also include how much of the energy was absorbed by deformation of the paper spitball vs the relatively stiff punching bag. And then whatever energy did get to the interior of the punching bag (coupled through some impedance we'll say) will get significantly absorbed by the grains of sand in the bag. In fact good chance that most or all of this energy gets turned into heat. And no momentum that can be ascertained is imparted to the bag. Which likely means the bag doesn't start to move and repeated high frequency spit balls won't make it move.

Maybe I should offer some links to acoustic impedance of the ear. But I feel that would be too directly related to the topic. So I'll forego that.

If two "spitting partners" keep shooting spitballs from two sides, at the bag's characteristic oscillation frequency, in anti-phase to each other, how much time will it take them to make the bag oscillate with an amplitude of, let's say, one foot of displacement from the vertical?

Is that how the basilar membrane reacts to a sinusoidal acoustic wave in the hearing range, with SPL above the hearing threshold? Does it take thousands, or maybe millions, of cycles of the wave to make the membrane move?

I won't bother you with references anymore. As I recall, it takes between 4 and 20 wave cycles for the IHC stereocilia motion to reach maximum amplitude. The experiments I described are in this ballpark. Yours uses way too low force.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
So where did the 95% come from? I'm not trying to pin you down on the exact figure, but several pages back you said it was 100% without any caveat.

About 5% of energy comes from transient components in average music. 95% comes from slowly evolving sinusoids. Some passages, played on xylophone, if memory serves me, delivered ~50% of energy from transients.

The transient components - by definition actually - are not band-limited by the human hearing range. This is one of the two types of music components that don't Fourier-transform well (the other being noise). The two papers I referred to go into depth discussing the reasons why.

So, if we ignore the transients, we can still say that we reproduce the music with 95% fidelity. But, is the 5% "transients omission or mis-timing distortion" acceptable nowadays?

The topic of transients was settled in the research literature by the mid-2000s. By 2007, studio-quality 192/24 ADC/DAC units, which effectively take care of the transients, became widely available.

Since ~2010, the research literature mostly discusses how to identify and compactly represent them. Here, we are still discussing whether they even exist.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,708
Likes
241,456
Location
Seattle Area
About 5% of energy comes from transient components in average music. 95% comes from slowly evolving sinusoids.
Music is full of transients. They are not what you think of as transients, but music is full of them. So I'm not sure where you got those numbers.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,708
Likes
241,456
Location
Seattle Area
The transient components - by definition actually - are not band-limited by the human hearing range.
Of course there are band-limited transients. Here is a random track:

1561094206840.png


What do you call those spikes if they are not transients? And oh, they are quite audible because, well, they are in the audible band!

Here is their spectrum:

1561094286788.png


Honestly, it seems you have never put your thoughts into practice.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
About 5% of energy comes from transient components in average music. 95% comes from slowly evolving sinusoids. Some passages, played on xylophone, if memory serves me, delivered ~50% of energy from transients.

The transient components - by definition actually - are not band-limited by the human hearing range. This is one of the two types of music components that don't Fourier-transform well (the other being noise). The two papers I referred to go into depth discussing the reasons why.

So, if we ignore the transients, we can still say that we reproduce the music with 95% fidelity. But, is the 5% "transients omission or mis-timing distortion" acceptable nowadays?

The topic of transients was settled in the research literature by the mid-2000s. By 2007, studio-quality 192/24 ADC/DAC units, which effectively take care of the transients, became widely available.

Since ~2010, the research literature mostly discusses how to identify and compactly represent them. Here, we are still discussing whether they even exist.
But are you saying that 192 kHz PCM faithfully captures and reproduces the whole 5% + 95%? Or only the 95%? On page 11 or wherever it was you seemed to say that 192 kHz captures 100%.

If it's 100% we can all go home, because 192 kHz audio is a reality and available for recording and playback for a few dollars. Phew!
 

svart-hvitt

Major Contributor
Joined
Aug 31, 2017
Messages
2,375
Likes
1,253
But are you saying that 192 kHz PCM faithfully captures and reproduces the whole 5% + 95%? Or only the 95%? On page 11 or wherever it was you seemed to say that 192 kHz captures 100%.

If it's 100% we can all go home, because 192 kHz audio is a reality and available for recording and playback for a few dollars. Phew!

Mathematically, MQA is way below 95%.

Perceptually, subjectively, it is somewhere within +/-5 of the reference 100%, i.e. somewhere between 95% and 105%, depending on preferences, situation etc. So it’s like a lottery ticket, or a box of chocolates where all the chocolate pieces taste about the same.

Of course, this lottery ticket - box of chocolates - comes at a fee, you know.

Enjoy.

PS: I didn’t feel this comment of mine made much sense, but I won’t take the responsibility for that.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Getting answers about MQA...

 

Saffuria

Member
Joined
May 27, 2019
Messages
20
Likes
11
Location
Aethalia - Etruria
Listening to MQA and non-MQA versions of many tracks on Tidal, through Audirvana, which unfolds to 88/96, and... I'm not sure I hear much of a difference, if any.
Maybe, I say maybe, some "air".

I mean, I'm not going to pay Tidal extra money for some "air", should they ask. :)
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Of course there are band-limited transients. Here is a random track:

View attachment 28006

What do you call those spikes if they are not transients? And oh, they are quite audible because, well, they are in the audible band!

Here is their spectrum:

View attachment 28007

Honestly, it seems you have never put your thoughts into practice.

Amir, I honestly don't understand what you are trying to illustrate. First, you say "they are in audible band". Then you show a spectrum graph that indicates energy beyond the audible band.

Very roughly, the area under the curve in the audible band on this graph is ~50 grid cells. The area under the curve in the ultrasonic band is ~3 cells. Let's assume Parseval's theorem applies here, at least roughly. Then the ultrasonic part of the energy is 3/(50+3)=0.0566, in the ballpark of the 5%.
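
A small sketch of the reasoning above: it checks Parseval's theorem on a synthetic signal and computes the share of energy above a cutoff, next to the grid-cell estimate. The test signal, its amplitudes, and the cutoff bin are illustrative assumptions, not data from the attached track. The grid-cell estimate also assumes the cell counts stand for linear energy, which would not hold directly if the plot's vertical axis is in dB.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(N^2), standard library only."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def energy_fraction_above(x, cutoff_bin):
    """Share of total signal energy in DFT bins at or above cutoff_bin,
    counting the mirrored negative-frequency bins as well."""
    n = len(x)
    power = [abs(c) ** 2 for c in dft(x)]
    high = sum(p for k, p in enumerate(power) if min(k, n - k) >= cutoff_bin)
    return high / sum(power)


N = 256
# Hypothetical signal: a strong low-frequency tone plus a weak high-frequency one.
signal = [
    math.sin(2 * math.pi * 5 * t / N) + 0.25 * math.sin(2 * math.pi * 100 * t / N)
    for t in range(N)
]

# Parseval for the DFT: sum |x[n]|^2 == (1/N) * sum |X[k]|^2
time_energy = sum(v * v for v in signal)
freq_energy = sum(abs(c) ** 2 for c in dft(signal)) / N

fraction = energy_fraction_above(signal, cutoff_bin=64)
grid_estimate = 3 / (50 + 3)  # the grid-cell estimate from the post

print("time-domain energy:     ", time_energy)
print("frequency-domain energy:", freq_energy)
print("fraction above cutoff:  ", fraction)
print("grid-cell estimate:     ", grid_estimate)
```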

Without further transient analysis, different flavors of which are described in the two papers on transients I referred to, we can't tell which parts of these "ultrasonics" are true ultrasonic sinusoids, which are noise, and which are transients.

Yes, we don't care much about the true ultrasonic sinusoids, unless they have high, potentially damaging, intensity. The hearing system will ignore the low-intensity ones.

We care a bit more about noise, yet if it has a Gaussian distribution of amplitudes and phases, which it does in most practical cases, then the Fourier Transform still works. This is proven mathematically: as I recall, the proof uses the fact that the Fourier Transform of a Gaussian is a Gaussian.

The transients (pulses) are different though. Their Fourier Spectrum can have infinite support. They have to be extracted by special DSP techniques, development of which has been an active research topic for the last decade. MQA claims to do something proprietary in this regard.

In any case, wouldn't it be logical to figure out what those 5% are, and deal with them intelligently, instead of just low-passing the signal at 20 kHz? Distortions lower than 0.001% in other parts of the audio delivery chain are nice, yet with the 5% swept under the rug, would it matter?
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,708
Likes
241,456
Location
Seattle Area
Amir, I honestly don't understand what you are trying to illustrate. First, you say "they are in audible band". Then you show a spectrum graph that indicates energy beyond the audible band.
There is always stuff out there since this is high sample rate music, but levels are down to -130 dB at 20 kHz. What you hear are the peaks in the low frequencies to the left.

The transients (pulses) are different though. Their Fourier Spectrum can have infinite support.
Again, you are imagining infinitely sharp transients. These don't exist in music. What exist are sudden jumps in amplitude, which are called transients. Think of something like hitting a piano key. That is a transient. And of course very much audible.

Lossy compression codecs have trouble with these transients and create pre-echo which can be quite audible if the bit rate is low. I spent a lifetime with my team minimizing loss of fidelity in handling them. Uncompressed PCM music has none of that pre-echo. So you are really chasing an invisible flaw.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
But are you saying that 192 kHz PCM faithfully captures and reproduces the whole 5% + 95%? Or only the 95%? On page 11 or wherever it was you seemed to say that 192 kHz captures 100%.

This appears to be the case, for me, in the context of the audio systems I had access to, and genres of music I ever cared about. 192/24 is 100% for me, to the best of my knowledge, as of today.
If it's 100% we can all go home, because 192 kHz audio is a reality and available for recording and playback for a few dollars. Phew!

I'd think so too, if all the tracks I cared about, including Gamelan and Prog Rock (I think I could survive without Mariachi for a while :)), were available in 192/24. But they aren't :( Perhaps that's why I'm so passionate about this whole Hi-Res thing: I want more music in 192/24!
 