
MQA creator Bob Stuart answers questions.

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I wouldn't mind if this were a 'holistic' approach, where the person putting forward these ideas also specified that the transducers had to meet certain criteria for time alignment, phase error, etc., and that unless this was done, the supposed improvements would be non-existent.

As it is, it almost looks as though they're spreading FUD in the hope of selling a system whose supposed benefits most people's systems can't even make use of.
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
...
As it is, it almost looks as though they're spreading FUD in the hope of selling a system [MQA] whose supposed benefits most people's systems can't even make use of.
Indeed. High-end audio has had enough B.S. over the years that it's understandable to be suspicious of technical explanations that sound like techno-babble.
 

somebodyelse

Major Contributor
Joined
Dec 5, 2018
Messages
3,629
Likes
2,905
This was an illustration of the statement that widening an analog "too sharp to represent" pulse by replacing it with a wider "sampling rate friendly" pulse having the same LTI energy, or even the same perceptually accurate energy, may not be faithful, because the perceptual timing may be off.
Can we try to take this from the hypothetical and into the at-least-somewhat verifiable? We need at least one hypothesis, then some way to test it or them. I'm quite sure I've missed or misunderstood bits of @Sergei's arguments and the counterarguments, so please correct me where I get it wrong and we can have something agreed to test. The core points seem to be:
  1. Human auditory timing and volume perception work differently to measurement instruments, so any analysis we do needs to take account of this.
  2. Some transients we may want to recreate are too sharp to be sufficiently characterised by an uncompressed recording at 44.1kHz to preserve the timing as perceived by humans, but timing can be preserved with a higher sample rate (192kHz? more? less?)
  3. Lossy codecs like mp3, aac, opus can't preserve timing information as needed by humans as they don't keep the shape of the waveform
  4. MQA is a lossy codec that can preserve the timing information as needed by humans
I think point 1 is more or less agreed - the contention seems to be whether it makes any difference, which is part of point 2.
To test 2 we need to:
  1. Find something that may be 'too sharp to represent'. @RayDunzl has shown coffee spoons giving a much sharper transient than @Blumlein 88 found with cymbals.
  2. Record it repeatedly with some system agreed to be good enough to capture the timing information, so we know how much variation there is between repeats of the same action. The recording should be repeated at a variety of lower sample rates, and perhaps with different microphones. The 'good enough' recordings should also be downsampled to the lower sample rates using some agreed good method.
  3. All recordings are then upsampled to the 'good enough' rate for analysis, as per @RayDunzl's earlier posts.
  4. All recordings are analysed using a model of the human auditory timing process to determine what timing differences there may be, and whether they exceed a threshold of perception.
For point 3, take the recordings above, encode and decode them via the lossy codec, then proceed with steps 3 and 4.
For point 4 we have no way to test, as we don't have access to the MQA encoding/decoding process.

It's arguable that we should play the recordings through sufficiently good speakers, record the speaker output and analyse that for timing. I'd anticipate arguments about whether the playback devices were 'good enough' if there were no significant timing difference, although that would in itself be informative. Upsampling represents a notionally perfect playback and recording chain.
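A rough sketch of steps 2-4 in Python (numpy/scipy/soundfile assumed; the file name is hypothetical, and plain cross-correlation stands in for a proper auditory timing model):

```python
import numpy as np
import soundfile as sf
from scipy.signal import resample_poly, correlate

# Hypothetical 'good enough' capture of a transient, e.g. a 192 kHz mono recording.
ref, fs_hi = sf.read("spoon_click_192k.wav")

fs_lo = 44100
lo = resample_poly(ref, fs_lo, fs_hi)    # step 2: downsample to the lower rate
back = resample_poly(lo, fs_hi, fs_lo)   # step 3: upsample back to the analysis rate

# Step 4, crudely: estimate the timing shift of the transient by cross-correlation,
# refined to sub-sample precision with parabolic interpolation around the peak.
n = min(len(ref), len(back))
xc = correlate(back[:n], ref[:n], mode="full")
k = int(np.argmax(xc))
y0, y1, y2 = xc[k - 1], xc[k], xc[k + 1]
frac = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
shift_us = (k + frac - (n - 1)) / fs_hi * 1e6
print(f"timing shift after 44.1 kHz round trip: {shift_us:.3f} µs")
```

A real test would swap the last step for the auditory model in step 4; this only shows the plumbing.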

Please improve my proposal. I'd try it myself but I don't have the mics for the job.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,359
Likes
24,661
Location
Alfred, NY
On the core points:

1. That analysis is irrelevant to lossless systems.
2. That assertion, repeated endlessly by Sergei despite this being pointed out and demonstrated repeatedly, is out-and-out false. Timing accuracy for bandwidth-limited signals is absolutely unrelated to sample rate and is 3-4 orders of magnitude better than the best-case human thresholds (a quick numerical check is sketched at the end of this post).
3. Also untrue. But also not relevant for lossless.
4. Yes, in the sense that it is no different than any other scheme. The idea that it is somehow unique is rubbish.

The dancing and lying by the system's promoters and shills are effective distractions.

edit: One of the distractions treasured by the shills is trying to relate things to biological phenomena and use that as a FUD technique to cast doubt on well-established acoustic and electronic measurement methods. You must remember that we are NOT injecting signals into brains; we are converting them to acoustic waves.
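Since the timing claim keeps coming back, here is a minimal numerical sketch of it (my own illustration in Python/numpy, nothing from MQA): delay a 20 kHz band-limited click by 5 µs, a small fraction of one 44.1 kHz sample period, and recover that delay from the 44.1 kHz data via the cross-spectrum phase.

```python
import numpy as np

fs = 44100
true_delay = 5e-6                      # 5 µs, about a quarter of one sample period
n = 4096
f = np.fft.rfftfreq(n, 1 / fs)

# A band-limited click: flat spectrum up to 20 kHz, nothing above.
spec = (f <= 20000).astype(float)
click = np.fft.irfft(spec, n)
# The same click delayed by 5 µs, applied as a pure phase shift
# (which is what a 5 µs delay of the analog waveform looks like after sampling).
delayed = np.fft.irfft(spec * np.exp(-2j * np.pi * f * true_delay), n)

# Recover the delay from the slope of the cross-spectrum phase.
X, Y = np.fft.rfft(click), np.fft.rfft(delayed)
band = (f > 100) & (f < 20000)
phase = np.unwrap(np.angle(Y[band] * np.conj(X[band])))
slope = np.polyfit(f[band], phase, 1)[0]          # radians per Hz
print(f"recovered delay: {-slope / (2 * np.pi) * 1e6:.4f} µs")
```

The recovered delay comes back essentially exact; the limit on timing resolution is noise, not sample rate.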
 
Last edited:

somebodyelse

Major Contributor
Joined
Dec 5, 2018
Messages
3,629
Likes
2,905
That's my understanding too, and I expect the results to show it. I had a similar experience once with a colleague regarding a digital telemetry system and signals close to Nyquist. A few seconds looking at the input and output on a scope in the lab did more than any amount of explanation.
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
Room EQ Wizard signal generator.
Tried it with Audacity: a 0 dB white-noise FLAC is slightly larger than the WAV.


At -6 dB, the FLAC is slightly smaller.


For reference, the uncompressed size: 30 s × 2 bytes (16 bit) × 48,000 samples/s = 2,880,000 bytes (mono).
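For anyone who wants to reproduce this without REW/Audacity, a small sketch (assuming python-soundfile with FLAC support; exact sizes will vary with the noise realization and encoder settings):

```python
import os
import numpy as np
import soundfile as sf

fs, seconds = 48000, 30
rng = np.random.default_rng(0)
noise = rng.uniform(-1.0, 1.0, fs * seconds)     # full-scale (0 dBFS) white noise

for gain_db in (0, -6):
    x = noise * 10 ** (gain_db / 20)
    sf.write("noise.wav", x, fs, subtype="PCM_16")
    sf.write("noise.flac", x, fs, subtype="PCM_16")
    print(f"{gain_db:>3} dB: wav {os.path.getsize('noise.wav')} bytes, "
          f"flac {os.path.getsize('noise.flac')} bytes")
```

Full-scale white noise is essentially incompressible, so FLAC's overhead can make it larger than the WAV; at -6 dB the unused top bit gives FLAC a little room to save.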
 
Last edited:

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
Strange that your results are so different from mine. My version of flac is 1.3.1-4 on Ubuntu Linux.
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
One thing I will say about this thread is that it's helped me to think through the concepts.

Let's put aside audibility for a moment. Why would we do that? For one, there seems to be much readier acceptance amongst members of this site that the pursuit of engineering excellence is worthwhile, even for its own sake. Look at the fervour for ever-better SINAD in DACs.

Also let's put aside our reservations about DRM and commercial motivation.

Would we not therefore give some credence to a system which, if proven, is measurably closer to perfect analog reconstruction?

From a pure theory perspective, there is an issue with Shannon/Nyquist sampling in that the sin(x)/x function requires an infinite extent in order to perfectly reconstruct the analog waveform. In practice this means truncating the time extent of the filter's impulse response. I'm quoting here from "Modern Sampling: A Tutorial" by Jamie Angus (which was linked to by a member earlier), which I've attached again along with the Stuart paper for easy reference. The consequences are:

1. The filter no longer has an infinite rate of cut-off and thus needs a guard band between the upper frequency of the continuous signal and the lowest frequency of the first alias.
2. More subtly, because the impulse response is now finite in extent, it is impossible to realize a stop-band response of zero (-∞ dB of attenuation) over the whole frequency range of the stop band.
3. In fact, unless the stop band achieves infinite attenuation at infinite frequency, there will always be some alias components present even if the sampling frequency is increased to infinity.
4. The truncated sin(x)/x functions no longer add up to a constant value when the sampled continuous signal is constant. This means that there is some difference between the reconstructed signal and the original signal.
5. The truncated sin(x)/x functions are no longer orthogonal for a time shift equal to multiples of a sample period. This means that the samples are no longer projected properly onto a sampleable space and therefore the samples will have leakage into each other (alias distortion), even if the continuous signal was white noise.
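A quick numerical illustration of point 4 (my own sketch in Python/numpy, not from the paper): sum shifted sinc kernels truncated to ±N samples, as you would when reconstructing a constant (DC) signal, and see how far the result departs from 1.0.

```python
import numpy as np

def truncation_error_db(n_taps, oversample=16):
    """Peak deviation from a constant when DC is rebuilt from sinc kernels
    truncated to +/- n_taps samples."""
    t = np.arange(0, 1, 1 / oversample)           # evaluation points between samples
    k = np.arange(-n_taps, n_taps + 1)            # the shifted kernels we keep
    total = np.sinc(t[:, None] - k[None, :]).sum(axis=1)
    return 20 * np.log10(np.max(np.abs(total - 1.0)))

for n in (8, 64, 512, 4096):
    print(f"+/-{n} taps: deviation {truncation_error_db(n):6.1f} dB")
```

The deviation shrinks only slowly (roughly 6 dB per doubling of the kept length), which is the 'never quite perfect' behaviour points 2-5 describe.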

One possible way to overcome this issue is a bi-orthogonal approach based on β-splines, which can allow an exponential fall-off in the impulse response rather than a linear one. By 'bi-orthogonal' the authors mean that the effects of the ADC filter are reversed at the output by the DAC filter.

This seems to be part of what MQA attempts to do. Now, it seems to me that, putting aside the considerations relating to DRM, commercial motive, and audibility, there is nothing inherently wrong with pursuing a greater degree of engineering excellence in sampling, especially if we have the means to do so.
 

Attachments

  • 20458.pdf (518.7 KB)
  • 17501.pdf (887 KB)

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,359
Likes
24,661
Location
Alfred, NY
What part of that AES paper on Modern Sampling do you think is incorrect?

Something does not have to be incorrect to be irrelevant. If the effects are orders of magnitude below analog errors and don't improve actual hardware, then it's pointless as a practical matter.
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
What part of that AES paper on Modern Sampling do you think is incorrect?
They haven't shown the impulse response of the full system (pre-filter + reconstruction filter).
They also haven't shown how the full system affects a signal from input to output in comparison with the regular methods, or any metrics like SNR compared to the regular methods.
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
Something does not have to be incorrect to be irrelevant. If the effects are orders of magnitude below analog errors and don't improve actual hardware, then it's pointless as a practical matter.

This sentiment is a nice corollary to what I was questioning earlier - these new (to audio) methods are interesting, but when presented in relative isolation (i.e. not balanced with a comparison to "traditional" methods), the authors begin to sound more like marketeers than the scientists they are.

All things in balance.
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
Something does not have to be incorrect to be irrelevant. If the effects are orders of magnitude below analog errors and don't improve actual hardware, then it's pointless as a practical matter.

It may not be inaudible, though. That remains to be proven or disproven. We've seen in this thread some preliminary evidence of the audibility of some higher-sample-rate formats over standard CD quality, in some situations, for some listeners. We've identified the effects of the anti-aliasing filter as a possible candidate for the reason, and postulated that, say, 20-bit/66 kHz would be sufficient to overcome it, if in fact the preliminary evidence regarding audibility holds up. Would we not also want to consider whether a bi-orthogonal encoding-decoding scheme addresses this as well?
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
They haven't shown the impulse response of the full system (pre-filter + reconstruction filter).
They also haven't shown how the full system affects a signal from input to output in comparison with the regular methods, or any metrics like SNR compared to the regular methods.

That paper is in the domain of theory. It's looking at the maths. That's not a valid criticism. That would be the next stage of inquiry.
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
...
From a pure theory perspective, there is an issue with Shannon/Nyquist sampling in that the sin(x)/x function requires an infinite extent in order to perfectly reconstruct the analog waveform. In practice this means truncating the time extent of the filter's impulse response....
True. However, from the Whittaker-Shannon formula you can work out how many samples away the ripple effect drops below the noise floor. For 16-bit it's about 22,000 samples in each direction. Put differently: the impact of this sample on the one 22,000 samples earlier (or later) is about -96 dB. So at CD quality, you only need to read ahead 1/2 second, keeping a rolling window of 1 second of samples in memory, for your finite truncation to be close enough to perfection that the difference is below the noise floor.
Note: the fact that 44,000 happens to be close to the sampling rate of CD is pure coincidence. Here, the number 44,000 comes from punching the W-S formula into a spreadsheet to numerically construct various waveforms.
EDIT: I looked at my spreadsheet and confirmed that for a single sample full scale impulse, the sample 22,055 away measures -96.81 dB.
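For anyone who wants to check this without a spreadsheet, the same figure falls out of a couple of lines of Python, evaluating the sinc reconstruction kernel half a sample beyond a given offset (roughly where it locally peaks):

```python
import numpy as np

def sinc_level_db(n_samples):
    # Level of the sinc reconstruction kernel evaluated half a sample beyond
    # an offset of n_samples from the impulse.
    return 20 * np.log10(abs(np.sinc(n_samples + 0.5)))

for n in (1_000, 10_000, 22_055, 44_000):
    print(f"{n:>6} samples away: {sinc_level_db(n):8.2f} dB")
```

At 22,055 samples this comes out around -96.8 dB, in line with the spreadsheet figure above.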
 
Last edited: