
miniDSP Tide16 - Holy Grail with 16 Channel Atmos/DTS:X, high SINAD

Mics (and apps) care about capturing the signal the moment it emerges.
Smart mics (like arrays) also know from where this signal emerges.

Yes but for the specific context under discussion, direction is irrelevant. It is additional information but not necessary or even helpful information specifically for time alignment.
 
Yes but for the specific context under discussion, direction is irrelevant. It is additional information but not necessary or even helpful information specifically for time alignment.
The key words here, along with direction, are dimensions, precise ones.
That's trigonometry's magic.
 
The key words here, along with direction, are dimensions, precise ones.
That's trigonometry's magic.
And my point is that knowing all of this with even 100% accuracy is irrelevant for time-alignment since it is not the only thing affecting time-delays.
 
Yes but for the specific context under discussion, direction is irrelevant. It is additional information but not necessary or even helpful information specifically for time alignment.
Another application, from the same article:

[Attachment: Gate.PNG (two-mic endfire array diagram)]


I quote:
"As the diagram illustrates, the wanted sound arrives at the first microphone and then travels a known distance to the second. Signal processing compensates for the resulting known delay and adds the two signals, producing a much larger result. Summing sound signals coming from behind the array or from off-axis produce a much smaller effect.


Typical applications for endfire arrays include handheld microphones for television or radio that are intended to be pointed towards the source – such as a presenter or speaker – to capture the speaker’s voice clearly and eliminate background noises."

Gating anyone???

And that's with only two.
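
If it helps, here is a minimal Python sketch of that two-mic delay-and-sum trick. The sample rate, spacing and test tone are hypothetical numbers picked for illustration, not taken from the article:

Code:
import numpy as np

fs, c = 48_000, 343.0      # sample rate (Hz) and speed of sound (m/s)
d, f = 0.05, 1_500         # mic spacing (m) and test tone (Hz), both hypothetical
delay = round(fs * d / c)  # the "known delay" between the two mics (~7 samples)

t = np.arange(4_800) / fs        # 100 ms of signal
src = np.sin(2 * np.pi * f * t)

# Front (on-axis) source: hits mic 1 first, then mic 2 a known d/c later.
mic1, mic2 = src, np.roll(src, delay)
front = np.roll(mic1, delay) + mic2   # compensate mic 1's head start, then sum

# Rear source: hits mic 2 first, so the same compensation doubles the misalignment.
mic1, mic2 = np.roll(src, delay), src
rear = np.roll(mic1, delay) + mic2

print(f"front sum amplitude: {np.max(np.abs(front)):.2f}")  # ~2.0, coherent
print(f"rear  sum amplitude: {np.max(np.abs(rear)):.2f}")   # ~0.4 at this tone

The on-axis arrival sums to double amplitude, while the rear arrival ends up two delays out of alignment and partially cancels, which is exactly the rejection described in the quote.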

And we can do even better:



[Attachment: Gate1.PNG]
 
And my point is that knowing all of this with even 100% accuracy is irrelevant for time-alignment since it is not the only thing affecting time-delays.
After the speakers/subs/whatever? What else can introduce delays to the direct signal after that?
Cause that's what the mics see.
 
they COULD tell the difference between full-range audio, and audio that had been low-pass filtered at 22kHz.
That's a big exaggeration. They couldn't tell the difference. They had brain activity, that's all.
 
There was some research at Kyoto University and published in the Journal of Neurophysiology Volume 83 Issue 6 that concluded that although listeners could NOT detect any sounds that were high-pass-filtered at 22kHz, they COULD tell the difference between full-range audio, and audio that had been low-pass filtered at 22kHz.
All of the work produced by Oohashi et al. is self-referential (the same group of people cite each other), has poor controls in all the papers I've seen, and is not accepted in hearing science. You will not see those authors cited in serious psychoacoustic literature.

I'm a member of the AES. This recent paper shows that when listeners were presented with signals that were band-limited to 24kHz and those with harmonics extending to 34kHz, they rated them identical. There was 100% consistency: https://aes.org/publications/elibrary-page/?id=17938 This paper even cites some of those Japanese publications and has this to say as its final word on the matter: "In summary, findings here contrasted inconclusive evidence from previous studies probing the perceptual influence of high f harmonics and were consistent with studies that have shown the upper limit of frequency detection to vanish above 24 kHz."
 
If I remember right, you are also an EE, so you must have a good understanding of Fourier. 48 kHz can cover high frequencies up to 24 kHz (yes, I know there are other things like ringing and other stuff to consider), which gives a margin of 4 kHz above the generally accepted human limit of 20 kHz. Transparency in an absolute sense is one thing; in practice, even if "others" can hear higher frequencies, it doesn't mean (in fact it's highly unlikely) that they could discern a difference due to including those higher frequencies. That's not even for single pure tones, let alone in music/movie tracks where the waveforms are complex. The research you cited is just research; I know many PhDs, so I know how those things are, but I wouldn't say any more about that. Anyway, I am all for devices capable of 96 kHz sampling rate regardless, so that there would be a huge margin for even superhumans, so no argument there. As to 192 kHz, good to have too, as long as the price wouldn't be jacked up so high lol..
Yes indeed.

Well, 24kHz is the Nyquist frequency for 48kHz, and I guess that DVD & BD have a slightly higher bandwidth than CD, though not quite as far as 24kHz. Perhaps it's 22kHz, I'm not sure. Perhaps they relax the transition zone instead, in order to give the reconstruction filter a bit more elbow room. I don't know whether it makes much difference. I've always considered 44.1k and 48k to be virtually the same. On the other hand, maybe if Sony & Philips had just been a tiny bit more patient and waited until laser technology allowed 48k on a CD, that might have been completely transparent, and we might not be discussing this at all.
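
To put a rough number on that elbow room, here's an illustrative sketch (my own parameters, not a claim about any real player): keep the passband at 20kHz, put the stopband at Nyquist, and let scipy estimate how long a 100dB Kaiser-windowed FIR gets at each rate:

Code:
from scipy.signal import kaiserord

for fs in (44_100, 48_000):
    nyq = fs / 2
    width = (nyq - 20_000) / nyq   # transition width as a fraction of Nyquist
    ntaps, beta = kaiserord(ripple=100, width=width)
    print(f"{fs} Hz: {nyq - 20_000:.0f} Hz transition -> ~{ntaps} taps for 100 dB")

The 48k filter gets a ~4kHz transition band instead of ~2kHz and comes out roughly half the length, which is the extra room I had in mind.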

I think the research DID show that the subjects "could discern the difference due to including those higher frequencies". The point is that they couldn't discern the high frequencies in isolation. Therefore simply testing human hearing to find the highest frequency sine wave that can be detected in isolation is not the right way to establish the requirements for transparency.

I agree about 96k and 192k. There seem to be a lot of people who look for DACs that can process 192k or 384k or higher, so they can use better off-board reconstruction filters like HQP or PGGB. However, this test by John Atkinson using a Chord M Scaler and a Mark Levinson No 30.6 DAC (from 1999!) showed it was only necessary to re-sample up to 88.2k in order to get this reconstruction filter response, which is to all intents and purposes perfect:

[Attachment: measured reconstruction filter frequency response]
 
All of the work produced by Oohashi et al. is self-referential (the same group of people cite each other), has poor controls in all the papers I've seen, and is not accepted in hearing science. You will not see those authors cited in serious psychoacoustic literature.

I'm wary of self-referential research, having seen a few bad examples that people have gleefully latched on to, in order to prove their point.
However in this case, I could only find a single instance of a self-reference out of 51 references.

I'm a member of the AES. This recent paper shows that when listeners were presented with signals that were band-limited to 24kHz and those with harmonics extending to 34kHz, they rated them identical. There was 100% consistency: https://aes.org/publications/elibrary-page/?id=17938 This paper even cites some of those Japanese publications and has this to say as its final word on the matter: "In summary, findings here contrasted inconclusive evidence from previous studies probing the perceptual influence of high f harmonics and were consistent with studies that have shown the upper limit of frequency detection to vanish above 24 kHz."

That's an example of a negative outcome, where subjects were unable to discern a small difference. There are lots of those.
My point is that the limit of audibility should be set at the smallest discernible difference that CAN be detected, rather than the largest difference that can't be detected.
 
My point is that the limit of audibility should be set at the smallest discernible difference that CAN be detected, rather than the largest difference that can't be detected.
Sure but you are still relying on a tiny set of questionable research to make this statement.

Ok. It's slightly interesting but you have not proven your case.
 
Sure but you are still relying on a tiny set of questionable research to make this statement.
Exactly.

Take infrasonics, the other end of the spectrum. In more than 35 years of testing, the thresholds agree within decent tolerances. The same can't be said of high-frequency hearing, where the curves I've seen in established research basically brickwall at 15-17kHz, and presbycusis in a large chunk of the population makes the question somewhat pointless. In fringe ultrasonic research, they don't bother to produce minimum audible pressure curves like this, instead using what I think are ridiculous methods. In the paper linked by @welwynnick, the signals are explicitly acknowledged as inaudible by the subjects and researchers, but there is brain activity. There needs to be an exceptionally well-planned level of control to ensure that the measured response correlates to the audio signal, not some other environmental variable. So the results are at best questionable.

[Attachment: minimum audible pressure curves]

I'm wary of self-referential research, having seen a few bad examples that people have gleefully latched on to, in order to prove their point.
However in this case, I could only find a single instance of a self-reference out of 51 references.
Not just the same author citing themselves, the same group citing each other in a circular fashion across many papers.
 
Yes indeed.

Well, 24kHz is the Nyquist frequency for 48kHz, and I guess that DVD & BD have a slightly higher bandwidth than CD, though not quite as far as 24kHz. Perhaps it's 22kHz, I'm not sure. Perhaps they relax the transition zone instead, in order to give the reconstruction filter a bit more elbow room. I don't know whether it makes much difference. I've always considered 44.1k and 48k to be virtually the same. On the other hand, maybe if Sony & Philips had just been a tiny bit more patient and waited until laser technology allowed 48k on a CD, that might have been completely transparent, and we might not be discussing this at all.

I think the research DID show that the subjects "could discern the difference due to including those higher frequencies". The point is that they couldn't discern the high frequencies in isolation. Therefore simply testing human hearing to find the highest frequency sine wave that can be detected in isolation is not the right way to establish the requirements for transparency.

I agree about 96k and 192k. There seem to be a lot of people who look for DACs that can process 192k or 384k or higher, so they can use better off-board reconstruction filters like HQP or PGGB. However, this test by John Atkinson using a Chord M Scaler and a Mark Levinson No 30.6 DAC (from 1999!) showed it was only necessary to re-sample up to 88.2k in order to get this reconstruction filter response, which is to all intents and purposes perfect:

[Attachment: measured reconstruction filter frequency response]
I meant to say Nyquist, but it is very much related to the Fourier theorem/analysis anyway, to the point that if an engineer or mathematician knows one, he/she is bound to know the other. I forgot the details, but I remember reading that 88.2 kHz is actually better than 96 kHz in more cases; it seemed to have to do with it being an integer multiple of 44.1 kHz, all in theory obviously.
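
For what it's worth, the ratio argument is easy to check; a trivial sketch (sample rates only):

Code:
from fractions import Fraction

for target in (88_200, 96_000, 192_000):
    r = Fraction(target, 44_100)
    print(f"44.1 kHz -> {target / 1000:g} kHz: up by {r.numerator}, down by {r.denominator}")

88.2k is a clean 2:1 against 44.1k, while 96k needs a 320/147 rational resampler, which is presumably the theoretical advantage I half-remember.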
 
Not just the same author citing themselves, the same group citing each other in a circular fashion across many papers.
Agreed. I'd rather have validation from complete strangers who are peers than from coworkers and friends.
 
I'll post one last reply and then leave this as I'm well aware this isn't getting anyone anywhere!

After the speakers/subs/whatever? What else can introduce delays to the direct signal after that?
Cause that's what the mics see.

DSP in the sub introduces a time delay that is in addition to the delay associated with the physical distance between the sub and the microphone/MLP. Yes, the microphone sees (hears) this, which is why microphone measurements work for time-delay corrections. As I've said previously, the original post I responded to was about accurately knowing just the spatial distribution, which is not sufficient information for time-alignment; therefore making a super-accurate measurement of just the spatial part is not beneficial purely for time-alignment.

Signal delay per channel for any reason is equivalent to physical delay due to distance.

Yes you can hypothetically represent any delay as if it is one due to distance, but not all delays are due to propagation delays through the air. Accurate spatial information misses other causes of delays and so is not sufficient for optimal time-alignment.
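
To make the equivalence (and its limits) concrete, here's a toy calculation; the 343 m/s and the latency figure are assumptions, not measurements of any particular sub:

Code:
C = 343.0  # speed of sound in m/s, a room-temperature assumption

def delay_to_distance(delay_ms: float) -> float:
    """Extra 'distance' (m) equivalent to a given delay (ms)."""
    return C * delay_ms / 1000.0

physical_m = 3.0       # tape-measure distance to the sub (hypothetical)
dsp_latency_ms = 3.0   # processing delay inside the sub (hypothetical)

effective_m = physical_m + delay_to_distance(dsp_latency_ms)
print(f"mic hears the sub at ~{effective_m:.2f} m, not {physical_m:.1f} m")

An acoustic measurement captures the combined ~4m figure directly; a purely spatial measurement, however precise, only ever sees the 3m.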
 
Sure but you are still relying on a tiny set of questionable research to make this statement.
Some years ago I read this on Hydrogen Audio, where you need evidence of a Foobar2000 ABX DBT or similar in order to even discuss subjective differences. Like AVS, but draconian.
Googlebot took a 24/96 recording, down-sampled it to 16/44.1, and up-sampled it again to 24/96.
He compared the original and re-sampled recordings, and correctly identified them 16/16.

My hearing tops out at about 17 kHz. Still, I can hear a pretty clear difference over a pair of Elac FS 607 X-Jet speakers, which are rated for 28-50,000 Hz (IEC 268-5):

Code:
foo_abx 1.3.4 report
foobar2000 v1.0.3
2010/06/07 18:25:51

File A: C:\Dokumente und Einstellungen\Christian\Desktop\16-441to24-96.wav
File B: C:\Dokumente und Einstellungen\Christian\Desktop\24-96.wav

18:25:51 : Test started.
18:26:44 : 01/01 50.0%
18:27:09 : 02/02 25.0%
18:27:26 : 03/03 12.5%
18:27:44 : 04/04 6.3%
18:28:00 : 05/05 3.1%
18:28:21 : 06/06 1.6%
18:28:38 : 07/07 0.8%
18:28:52 : 08/08 0.4%
18:29:06 : 09/09 0.2%
18:29:17 : 10/10 0.1%
18:29:33 : 11/11 0.0%
18:29:46 : 12/12 0.0%
18:30:03 : 13/13 0.0%
18:30:26 : 14/14 0.0%
18:30:42 : 15/15 0.0%
18:31:04 : 16/16 0.0%
18:31:08 : Test finished.

----------
Total: 16/16 (0.0%)

I do not think that I actually hear the presence or absence of stationary HF content. But what I am hearing is louder and somewhat smeared-sounding transients in the Redbook version. That is exactly the type of pass-band artifact to be expected from low-pass filtering, but until lately I hadn't considered that to be audible when done right. And since that step is necessary, it also cannot be avoided for Redbook delivery.

I do not hear a difference over my regular Canton speakers, which top out at about 20kHz! Nor over my Grado headphones. Since both act as a mechanical low-pass, I think they just cause the same artifacts in the pass band as the digital low-passing.

Until today I have mostly laughed at high resolution apologists. But that ABX result somewhat changed my perspective.

IgorC and Wombat also got 5/5 ABX DBT results.
Googlebot said he couldn't hear anything above 17 kHz, and he couldn't hear the difference in the HF content of the 2 samples.
He was only able to hear the difference with the Elac speakers, and not the Canton speakers, which extend beyond his own hearing.
This was some time ago, and I'm sure there are lots more DBTs on HydrogenAudio.
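
For reference, the probability column in that foobar2000 log is just the one-sided binomial p-value, i.e. the chance of scoring at least that well by coin-flipping. A few lines reproduce it:

Code:
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of at least `correct` right out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

for n in (5, 8, 16):
    print(f"{n}/{n}: p = {abx_p_value(n, n):.4%}")

That gives 3.125% for 5/5, 0.39% for 8/8 and about 0.0015% for 16/16, matching the rounded figures in the log.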

There are two points.
Firstly, there is evidence out there.
Secondly, the audibility of high frequency tones isn't enough to determine the requirements for transparency.
 
Over the years I've built up a favourites folder full of evidence about the audibility of high-resolution audio. Here's another one, from AES Convention 128 Paper 172 by Pras & Guastavino at the Centre for Interdisciplinary Research in Music Media and Technology, Multimodal Interaction Laboratory, McGill University, Montréal:


It is currently common practice for sound engineers to record digital music using high-resolution formats, and then down sample the files to 44.1kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1kHz and 88.2kHz with the same analog chain and type of AD-converter. Sixteen expert listeners were asked to compare 3 versions (44.1kHz, 88.2kHz and the 88.2kHz version down-sampled to 44.1kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2kHz and their 44.1kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2kHz and files recorded at 44.1kHz.
 
Googlebot took a 24/96 recording, down-sampled it to 16/44.1, and up-sampled it again to 24/96.
That's not possible. The first step removes information. The second step adds information that's not there?!!!
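
Mechanically the round-trip itself is straightforward, for what it's worth; a sketch with scipy (sample rates only, the 16-bit dither step omitted) shows what the second step does and does not do:

Code:
import numpy as np
from scipy.signal import resample_poly

fs = 96_000
t = np.arange(fs) / fs              # 1 second at 96 kHz
x = np.sin(2 * np.pi * 30_000 * t)  # a 30 kHz tone, above 22.05 kHz

down = resample_poly(x, up=147, down=320)     # 96k -> 44.1k: tone is filtered out
back = resample_poly(down, up=320, down=147)  # 44.1k -> 96k: interpolation only

print(f"original RMS:   {np.sqrt(np.mean(x ** 2)):.3f}")     # ~0.707
print(f"round-trip RMS: {np.sqrt(np.mean(back ** 2)):.1e}")  # ~0: nothing restored

The up-sample step interpolates back to the original rate but restores nothing the down-sample removed: it changes the container, not the information.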
 