
MQA creator Bob Stuart answers questions.

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,424
Likes
4,030
Location
Pacific Northwest
Being a Tidal Hifi subscriber, I like the concept and high res music that stems from it. But I couldn't really conclude anything from what the guy was saying.
I'd like to think that there's some set of standards and quality control that allows music to be "MQA" like the Tidal Masters, that's then streamed at 96 kHz / 24 bit, or am I missing the point here?
I get the impression that MQA is a marketing ploy. Uncompressed CD audio is about 172 kB/sec. If you stream the FLAC of this, it will compress to about 57% of the original size, which is 98 kB/sec. That's well within what you can stream over mobile, let alone a home network. And FLAC is an open-source algorithm anyone can use for free. If I listened to streaming music, I would consider a FLAC of a CD to be known and reliable, and trust its quality more than some proprietary algorithm that stuffs 96/24 into the same bandwidth. Most recordings don't even benefit from 96/24 anyway, because the recording's other limitations -- mics, mastering, compression, etc. -- make them lower than CD resolution. That is, most recordings don't have > 20 kHz bandwidth or > 96 dB dynamic range to begin with, coming off the live mic feed, before they're even encoded and processed.
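The bandwidth arithmetic above can be checked in a few lines of Python. The 57% FLAC ratio is the example figure from this post, not a universal constant; real-world ratios vary by material:

```python
# Back-of-the-envelope check of the CD vs. FLAC bandwidth figures.
SAMPLE_RATE = 44_100   # Hz
BIT_DEPTH = 16         # bits per sample
CHANNELS = 2           # stereo

# Uncompressed CD-audio data rate in kB/s (1 kB = 1024 bytes)
cd_rate_kb = SAMPLE_RATE * CHANNELS * (BIT_DEPTH // 8) / 1024
print(f"Uncompressed: {cd_rate_kb:.0f} kB/s")   # ~172 kB/s

flac_ratio = 0.57                    # illustrative compression ratio
flac_rate_kb = cd_rate_kb * flac_ratio
print(f"FLAC at 57%:  {flac_rate_kb:.0f} kB/s") # ~98 kB/s
```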

I really hope I'm never exposed to an audio playback chain that could destroy my hearing, or even kill me, on the spot if something goes wrong, like the volume accidentally being set to 100%.
Just about any set of headphones with a decent powered amp can cause hearing damage if you accidentally turn it on at full blast. Be careful!
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
The paper you linked in the post I replied to starts on page 6151 and ends on page 6156.

Oops, I messed up. Apologies. In the paper I referred you to originally this effect is shown on Fig 3 (a). Too many papers were opened on my screen. And it was way too late, after a long day.

The "Figure 8 on page 7412" I mentioned refers to this paper: https://pdfs.semanticscholar.org/3b82/b370794b92b168addbe9b5f5657a6b128eb3.pdf.

As far as I can tell, there are no mentions of microsecond-long integration times in that paper.
And even in the part you quoted now, what does a 1-5 ms (ms as in milli, not micro) transmission delay have to do with the microsecond integration you claimed before?

From the original paper I discussed (https://www.pnas.org/content/100/10/6151): "Here we demonstrate that cortical and perceptual responses are based on integration of the pressure envelope of the sound, as we have previously shown for AN fibers, rather than on intensity."

They go on: "The first spike following the onset of a stimulus is triggered when the stimulus reaches the neuron's threshold and occurs with some fixed “transmission delay” thereafter."

The second paper characterizes what these transmission delays are. In combination, the papers indicate that the supra-millisecond "reaction times" of the hearing system are not caused by slow integration machinery in the IHC. This machinery operates on a micro-, rather than millisecond, time scale.

So now you are saying that paper was irrelevant and linking to yet another paper that studies something else?
Am I supposed to play an endless game of cat and mouse with you where you keep switching papers?
Am I supposed to now ignore your previous claim of microsecond integration and chase the new set of claims?

I realize that this may feel to you like https://en.wikipedia.org/wiki/Blind_men_and_an_elephant. Please realize that to me it doesn't. After reading dozens of textbooks relevant to the subject of this discussion, hundreds of papers, thousands of internet postings, and giving it all some quality time to cross-correlate, I came to a rather coherent, albeit not perfectly resolved, picture of what's going on with the high resolution audio.

When I try to summarize the ideas on a higher level, some people on this discussion board demand proofs. When I try to give proofs, some people say they are irrelevant: and in their frame of thinking, may well be - to them. What keeps me here are yet other people, who are getting what I'm doing: building a common frame of thinking aimed at understanding this subject, more accurate than any one of us could have built alone.

The papers you are quoting now seem to say that when short pulses are heard, there is a delay and ringing.
Which is exactly what I showed you that happens in a graph when you capture a very short pulse at a low sampling rate.
It doesn't disappear.

Here it is again in case you forgot:
[attached image: 54o3riZ.png]

As you can see, the short pulse is captured at 48 kHz, and there is ringing in the 48 kHz signal, and it doesn't follow the pulse's outline.
Just like your new set of papers says.
In reality there will also be a delay, since the response to the pulse can't start before the pulse has even started.
So it will be captured, it will be reproduced, and according to the papers you quoted now, the 48 kHz signal looks like what the ear was going to hear anyway.
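The band-limiting effect described here can be sketched numerically. The sample rates and pulse width below are illustrative choices, not the ones in the graph: an ideal brick-wall low-pass at the 48 kHz Nyquist limit (24 kHz) spreads a very short pulse into a ringing waveform, but the pulse is still captured, not lost:

```python
import numpy as np

FS_HI = 768_000                  # high "analog-like" simulation rate
N = 4096
x = np.zeros(N)
x[N // 2] = 1.0                  # a one-sample (~1.3 µs) pulse

# Ideal brick-wall low-pass at 24 kHz via the FFT
X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(N, d=1 / FS_HI)
X[freqs > 24_000] = 0.0
y = np.fft.irfft(X, n=N)

# The pulse survives as a lower, wider, ringing lobe
print(f"peak of band-limited pulse: {y.max():.4f}")
print(f"energy kept in 0-24 kHz band: {np.sum(y**2) / np.sum(x**2):.3f}")
```

The ringing (negative sidelobes) and the reduced peak are exactly the "doesn't follow the pulse's outline" behavior visible in the graph.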

Whatever happens inside the hearing system after the pulse on the upper graph arrives, the neural networks will more readily adjust the neural signals in such a way that the onset time, direction, and intensity of that sound will be perceptually placed where it is supposed to be, based on prior experiences of the brain involving the correlations between audio, visual, tactile etc. stimuli.

While I'm typing this very sentence on keyboard, my neural networks are constantly adjusting their weights, so that the perceptual origins of the clicks, estimated on the basis of the sounds coming to my ears, remain consistent with what proprioceptors in my hands are sending, and what fingertips are feeling, with all the neural transmission delays taken into account. This all adds up to a consistent experience, which feels like real life, because it is.

With the signal depicted on the lower graph, the perceptual timing of the signal onset may be dependent on the signal amplitude in a way different compared to how it happens in real life. To put it another way: the neural networks are trained to compensate for the ringing of the person's own cochlea, and the delays of his or her own nervous system, yet they are not necessarily trained to compensate for "artificial" signal transformations.

If anything, the papers you quoted seem to suggest there is no point in capturing a perfect pulse shape, since the ear is going to hear a distorted version of it, in the millisecond, rather than microsecond, range.
The lowest number you quoted so far (edit: in regards to the ear's response, rather than the stimuli) was "a few hundred microseconds", which corresponds to the ~3 kHz range, which 48 kHz sampling can capture perfectly.
And I will also remind you that the time resolution of 16/44 is in the nanosecond range.
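For reference, the arithmetic behind that last remark. This is the coding resolution of 16/44, i.e. the smallest time shift of a full-scale signal that changes the digital code, not the bandwidth-limited ability to separate two events:

```python
SAMPLE_RATE = 44_100
BITS = 16

# Time between samples, in microseconds
sample_period_us = 1e6 / SAMPLE_RATE            # ~22.7 µs

# Smallest encodable time shift: sample period divided by 2^16 levels
shift_ps = (1e12 / SAMPLE_RATE) / 2**BITS       # in picoseconds

print(f"sample period: {sample_period_us:.2f} µs")
print(f"coding time resolution: {shift_ps:.0f} ps")  # sub-nanosecond
```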

You got this right: according to everything I know so far, capturing the shape of a 5-microseconds pulse perfectly is not important. It does appear important, however, to reasonably accurately capture its onset time, duration, and the mechanical momentum transferred (which is the integral of air pressure over time).

Going back to your graphs. The pulse on the upper graph will cause the cochlea to ring in a way it pleases, and if the pulse amplitude is high enough, there will be a sensation of a short pitch-less sound. The lower graph looks more like a 25 kHz sinusoid that changes its phase midway. It is a good question whether it would be perceived in a similar way, or perceptually significantly filtered out, like 25 kHz sinusoids usually are.

It seems you think that quoting random scientific papers at people is going to convince them of unrelated claims.
I am done with this, its pointless.

I respect that. I realize it can feel pointless to you. Sorry I couldn't be of better help.
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
Oops, I messed up. Apologies. In the paper I referred you to originally this effect is shown on Fig 3 (a). Too many papers were opened on my screen. And it was way too late, after a long day.

When I try to summarize the ideas on a higher level, some people on this discussion board demand proofs. When I try to give proofs, some people say they are irrelevant: and in their frame of thinking, may well be - to them. What keeps me here are yet other people, who are getting what I'm doing: building a common frame of thinking aimed at understanding this subject, more accurate than any one of us could have built alone.

I think that part of the problem is that we have an extended discussion coming from many angles, as well as several slightly off-topic things being discussed in the same thread.

If you want people to follow what you are arguing, then I think that it would be helpful at the beginning of each post to say in high level what particular hypothesis you are trying to support, and how the information relates back to the central thesis.

For example I'm still unsure why you seem to have such a heavy focus on the mechanics of the human ear. I can see that you could possibly build a case for a plausible hypothesis as to why higher than standard CD quality resolution might be able to be discriminated by some listeners. The trouble is the evidence for the ability to discriminate even being real (and significant) is still quite shaky, so it seems quite premature to go galloping off with extensive explanations relating back to the human hearing system.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
15,891
Likes
35,912
Location
The Neitherlands
It is one thing to explore hearing boundaries with unrealistic, artificially generated test tones and pulses (which would also have to be reproduced accurately) and talk about their timing; it's another thing for a brain to do this while listening to (complex) music signals, which often aren't reproduced time-coherently (when using speakers) anyway and are rather band-limited due to mechanical properties.

Maybe there are some folks that benefit from higher than 44/16 formats.
I suspect there are far more people that merely think (sorry... know) they can clearly hear the difference.
Even on equipment and recordings that cannot do it justice.

I would say there are plenty of test files of hi-res music that have been (properly) downsampled, and those who really want to find out can listen for themselves.

Then there are those that want the best of the best.. just to be sure... buy 192/24 or DSDx(pick a number) and NOS filterless DACs that can do 384kHz and enjoy the music instead of worrying or trying to prove hearing can detect things that aren't in any recording to begin with.
Give them their joy and happiness.

I am willing to bet there will be people selling their RME because there are better-measuring DACs (that lack the features that make the RME interesting), just because the numbers (and price) are higher and they want to be sure they own the best. These folks will claim it sounds 'better' after the upgrade while listening to streamed and watermarked recordings. I am happy for them.

There is really only one way to find out IF you really benefit from hires or not and that is to do blind listening tests with actual music yourself.
When you can't hear it, you either have a bottleneck or ... you just don't hear it, and can live your audio life happily ever after.
This is far more educational than reading all the research out there.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
I think that part of the problem is that we have an extended discussion coming from many angles, as well as several slightly off-topic things being discussed in the same thread.

If you want people to follow what you are arguing, then I think that it would be helpful at the beginning of each post to say in high level what particular hypothesis you are trying to support, and how the information relates back to the central thesis.

Good point.
For example I'm still unsure why you seem to have such a heavy focus on the mechanics of the human ear. I can see that you could possibly build a case for a plausible hypothesis as to why higher than standard CD quality resolution might be able to be discriminated by some listeners.

Not only on that. We can have an interesting discussion about fundamentals of music theory. Or about speakers design. All of the above, and much more, is relevant to the subject of high resolution audio.

The detailed mechanics and neurophysiology of the hearing system, however, were either unknown or misunderstood for so long that this state of fundamental science has kept alive, to this day, many audiophile and audio-expert misconceptions that are no longer compatible with what has been scientifically proven by now.
The trouble is the evidence for the ability to discriminate even being real (and significant) is still quite shaky, so it seems quite premature to go galloping off with extensive explanations relating back to the human hearing system.

I don't see it that way. I agree with you in the context of evidence averaged over the population, delivery systems, music genres, and other factors. When a particular person is considered, though, or a minority social group like audiophiles, the evidence appears to be much more compelling. Why do these people spend so much time and money on the quest for that perfect sound? And why can't other people understand what they are after?
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
15,891
Likes
35,912
Location
The Neitherlands
Why do these people spend so much time and money on the quest for that perfect sound? And why can't other people understand what they are after?

The majority of these people may well do that because:
A: They have the money and like to show it off
B: They have the money and like audio jewelry themselves
C: It does sound better to them
D: The equipment bought actually is higher quality and the owner appreciates this
E: The ones buying certain audio jewelry have a preferred tonality they are after
F: Some people like to buy the maximum quality they can afford
G: Some of them are hoarders and can't help themselves
H: Some of them just want to have the newest and the latest
I: Some of them may want rare (vintage or new) items with a cult status

I bet I can think of a dozen other reasons why people buy what they buy. More often than not, purchases are based on incorrect assumptions or information read somewhere.
 

LuckyLuke575

Senior Member
Forum Donor
Joined
May 19, 2019
Messages
357
Likes
315
Location
Germany
The majority of these people may well do that because:
A: They have the money and like to show it off
B: They have the money and like audio jewelry themselves
C: They genuinely believe it sounds better to them
D: The equipment bought actually is higher quality and the owner appreciates this
E: The ones buying certain audio jewelry have a preferred tonality they are after
F: Some people like to buy the maximum quality they can afford
G: Some of them are hoarders and can't help themselves
H: Some of them just want to have the newest and the latest
I: Some of them may want rare (vintage or new) items with a cult status

I bet I can think of a dozen other reasons why people buy what they buy. More often than not, purchases are based on incorrect assumptions or information read somewhere.

I believe in buying good-quality equipment, and ensuring that my whole audio setup is on par; that is, the headphones, DAC, amp, cables, and source in terms of file quality.

In terms of the "perfect sound", I don't think that's something real that can be achieved. I perceive that there's a curve of audio reproduction quality achievable for a certain amount of technology and money spent. You get a lot when starting from nothing and first investing in decent equipment and music sources, but after a certain point you get less and less benefit the more you spend. And at some point, spending more could actually reduce audio quality (think "signal cleaners" and the host of obscure cables and dongles out there). The graph below roughly pictures what I have in mind.

[attached image: graph-diminishing-returns.gif]


There probably is a lot of the other things you've referred to. I especially see the bragging-rights aspect being part of people paying ridiculous amounts of money for equipment, plus the placebo effect that comes from spending mega dollars: "I spent $600 on a USB cable that's thick and braided, the sound is so much better!"
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
That looks like a good analogy.
You are showing a rope, a wall and a hard cylinder, which are probably just that.
And then claim they are a part of an invisible elephant.

So far you have only shown that it is possible that a sharp pulse might be recorded differently than the ear's response to it.
You didn't show that this is actually the case.
Even if the physical response is different, you didn't show that this difference can be consciously perceived in an ABX test.
You also didn't show 16/44 is not enough to capture the perceived sound.
This means that even if the perception is different, it could be possible to compensate for the difference during the recording, for example by using binaural recording.

I would also suggest that this difference, even if all of the above is proven true sometime in the future, is not very relevant to music.
While there are instruments with a sharp onset, I would suspect most of the audible vibrations/energy would be in the subsequent "ringing" of that instrument (the instrument itself, not the ear's response).
Generating a sharp pulse is hard, even the air itself would attenuate it while it travels from the instrument to the ear.
Additionally, many of the studies used pulses at very high SPL to be audible at all.
I don't think most people listen to music at such levels.
 

PierreV

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
1,437
Likes
4,686
When I try to summarize the ideas on a higher level, some people on this discussion board demand proofs. When I try to give proofs, some people say they are irrelevant: and in their frame of thinking, may well be - to them. What keeps me here are yet other people, who are getting what I'm doing: building a common frame of thinking aimed at understanding this subject, more accurate than any one of us could have built alone.

I think one should be extremely careful when one builds a scaffold of disparate papers, some of them showing effects in small to very small cohorts, others based on reasoning by exclusion, many of them in non-human species. Cherry-picking stuff that "deserves further investigation", "seems to imply that", or "might indicate that" to try to explain the possible benefits of a process that is not even fully documented is unlikely to be very convincing.

weak experimental scaffold -> possible benefits (a weak effect itself) of hires audio -> benefits of an undocumented process (proven not to do all what it claimed to do, known and disclosed conflict of interest)

doesn't look too good to me on the whole...

Ideally, I'd prefer

1) strongly prove people can tell and prefer hires from CD quality
2) strongly prove that people can tell and prefer a certain type of codec
3) investigate why 1 and 2 occur, formulate hypotheses and test them.

Now, I will grant you that gerbils at least have an advantage over humans in eventual DAC algorithm testing: immunity to ownership and pricing biases.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
The majority of these people may well do that because:
A: They have the money and like to show it off
B: They have the money and like audio jewelry themselves
C: It does sound better to them
D: The equipment bought actually is higher quality and the owner appreciates this
E: The ones buying certain audio jewelry have a preferred tonality they are after
F: Some people like to buy the maximum quality they can afford
G: Some of them are hoarders and can't help themselves
H: Some of them just want to have the newest and the latest
I: Some of them may want rare (vintage or new) items with a cult status

I bet I can think of a dozen other reasons why people buy what they buy. More often than not, purchases are based on incorrect assumptions or information read somewhere.

Nice list! I personally prefer to think about this in three dimensions:

(X) Sound quality
(Y) Affordability
(Z) Desirability due to other factors

Investigating topics related to sound quality dimension interests me the most. Then affordability. It's not like I'm discounting the influence of other factors, yet historically marketers, designers, builders, and salesmen took good enough care of those.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
That looks like a good analogy.
You are showing a rope, a wall and a hard cylinder, which are probably just that.
And then claim they are a part of an invisible elephant.

:) I wouldn't care about an invisible elephant if it didn't "step on my ear" every once in a while :)

This usually manifests itself in my aversion to a certain genre of music recorded on CD, until I hear it played live by competent musicians. I already mentioned gamelan. Over the years, I had to add to this list mariachi, and all kinds of rock styles involving heavily distorted electric guitars.

So far you have only shown that it is possible that a sharp pulse might be recorded differently that the ear's response to it.
You didn't show that this is actually the case.
Even if the physical response is different, you didn't show that this difference can be consciously perceived in an ABX test.
You also didn't show 16/44 is not enough to capture the perceived sound.
This means that even if the perception is different, it could be possible to compensate for the difference during the recording, for example by using binaural recording.

Well, there were attempts to prove just what you said needs to be proven. For instance:
http://boson.physics.sc.edu/~kunchur//Acoustics-papers.htm
http://boson.physics.sc.edu/~kunchur//papers/FAQs.pdf

It's been "debunked", for instance here:
https://www.audiosciencereview.com/forum/index.php?threads/milan-kunchur.522/
https://hydrogenaud.io/index.php/topic,73598.100.html

I carefully went through all of the above. Roughly, the "debunking" could be split into three categories:

(1) Arguing that Mr. Kunchur doesn't understand Sampling Theory. Well, this didn't sound right to me because, based on my first-hand experience, Sampling Theory is an order of magnitude easier to understand than Quantum Mechanics, which is in turn an order of magnitude easier to understand than Quantum Electrodynamics - the primary area of Mr. Kunchur's expertise, which his peers never doubted.

After carefully looking at all the "debunking" presented, I realized that the would-be debunkers' understanding of what Mr. Kunchur was trying to prove wasn't satisfactory. For instance, https://hydrogenaud.io/index.php/topic,73598.msg834710.html#msg834710 alleges that Monty Montgomery "debunks Kunchur soundly using a very clear example with no math: https://xiph.org/video/vid2.shtml. That part starts near the end at around 21:55".

Well, what does Mr. Montgomery say at 21:24 in this video? Verbatim: "Again, our input signals are band-limited". And then he goes on to demonstrate that a band-limited signal that kinda sorta looks like the original in-between-samples pulse can indeed be perfectly captured. What can this prove about a signal with the spectrum depicted in Fig. 4 (page 597) of http://boson.physics.sc.edu/~kunchur//temporal.pdf, or about a signal with the spectrum depicted in Fig. 3 (page 5) of https://pdfs.semanticscholar.org/d9bf/d506271a8c38cf0f77e6edfbffebf5e368b6.pdf?

(2) Arguing that Mr. Kunchur didn't control some significant experimental input variations, and thus purely physical measurement biases crept into the experiment. Once again, this didn't sound right to me, as Mr. Kunchur went to great lengths to eliminate such biases, using his expertise in designing and conducting experiments with much more refined experimental machinery, and producing results of much higher precision, than what was required for the audio experiment.

Those were earnest and serious inquiries. One by one, they were answered: as far as I could tell, to the complete satisfaction of the inquiring parties. My best guess is that those online questions reflected what Mr. Kunchur encountered during the peer reviews, not only of the finished paper, but also during the experiment design and discussions of its preliminary results.

(3) Debunking based on potential audiological experimental biases. Note that in this case I used debunking instead of "debunking". Mr. Kunchur answered some of those inquiries to my satisfaction, yet not all. Specifically, this one was never answered, to the best of my knowledge: https://hydrogenaud.io/index.php/topic,73598.msg701379.html#msg701379. Indeed, there is no reference in Mr. Kunchur's writings to experimental controls proving that the participants couldn't differentiate the primary harmonic of the 69 dB signal at 7 kHz with precision better than 0.25 dB.

I happen to share perspective on these experiments with Mr. Johnston, the one who is so prominent on this forum as well (https://hydrogenaud.io/index.php/topic,73598.msg676753.html#msg676753): "If you want to support this premise, repeat the experiment and see if you can confirm the results. You might even try to improve the experimental process, and try several different kinds of ultrasonic stimulii to see what's going on."

Knowing what we know now about the cochlear machinery, it is possible to imagine that the differences Mr. Kunchur measured could in fact change the perceived timbre. An experiment with even finer physical resolution, and more rigorous audiological controls, could either prove or disprove that, with an acceptable level of precision. As it stands, I can give a rebuttal to the only serious original rebuttal still standing that I'm aware of, but what would it prove? Essential audiological controls were absent in the original rebuttal too. My rebuttal could go like this:

So, you are saying that you:
"... generated two 7 kHz sines using Soundforge. Since the software is all 16 bits, I decreased the volume of the first one by 1 dB, and of the second one by 1.25 dB, in order to get the same quantization noise on both ... I was wearing headphones and the playback volume was moderate. Inferior to 80 dB, but I couldn't say how much .. Like the listeners in Kunchur's experiments, I found the louder sine "brighter"".

Please tell me:
What is the mechanism behind you finding the louder sine "brighter"? That is, why does the perceived timbre of what is ostensibly a pure sine signal appear to change noticeably with such a small change in volume? Could it be that this effect was caused by unaccounted-for harmonic distortions in the software and/or hardware you were using? Or by distortions particular to your hearing system at that unspecified sound level?

I would also suggest that this difference, even if all of the above is proven true sometime in the future, is not very relevant to music.
While there are instruments with a sharp onset, I would suspect most of the audible vibrations/energy would be in the subsequent "ringing" of that instrument (the instrument itself, not the ear's response).

Could be, or could not be. If the pulse is strong enough to excite essentially all auditory nerve fibers in the course of, say, 200 microseconds of the cochlear ringing, then for the next ~2 milliseconds the IHCs would be recovering, and wouldn't register what the instrument was sending during those 2 milliseconds, potentially making that instrument sound subjectively softer.

If, on the other hand, the pulse is so weak that it wouldn't be heard all by itself, it could still pump up the IHCs enough to trigger much faster during the integration of the subsequent sound the instrument was sending. If the instrument's signal is also not very strong, then instead of being heard, let's say, only after being integrated over 200 ms, it could be heard just 50 ms after the non-perceived transient, which could be quite noticeable musically.
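The mechanism described here can be illustrated with a toy leaky-integrator model. All constants below are made-up illustrative values, not measured IHC parameters; the point is only that a sub-threshold "pre-charge" shortens the time a subsequent steady input needs to cross threshold:

```python
import math

def time_to_threshold(v0, tau=0.05, v_inf=1.5, thresh=1.0):
    """Closed-form threshold-crossing time for dv/dt = -(v - v_inf)/tau.

    v0     -- initial charge left by a preceding transient
    tau    -- integration time constant (s), illustrative
    v_inf  -- asymptotic level driven by the steady input
    thresh -- firing threshold
    """
    return tau * math.log((v_inf - v0) / (v_inf - thresh))

t_cold = time_to_threshold(0.0)  # no preceding transient
t_warm = time_to_threshold(0.5)  # integrator pre-charged by an inaudible pulse
print(f"without pre-pulse: {1000 * t_cold:.1f} ms")
print(f"with pre-pulse:    {1000 * t_warm:.1f} ms")
```

The pre-charged case crosses threshold noticeably earlier, which is the shape of the effect being claimed, whatever its physiological reality.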

Qualitatively, this corresponds well to the differences audiophiles ascribe sometimes to the qualities of an "under-resolved" record: gross music components - loud and long - sound right, but levels and timing of decay tails and reverberations are off, taking away from the immersive experience.

The effects described above can potentially affect perception of any music where transients and sinusoids are placed close enough together: gamelan, mariachi, and heavily distorting guitars are perhaps extreme examples of that, yet I personally also noticed that sometimes cymbals produce the "blackouts" of consecutive music tones, and decay over time, perceptually differently on 192/24 vs 44/16 - for me.

Generating a sharp pulse is hard, even the air itself would attenuate it while it travels from the instrument to the ear.
Additionally, many of the studies used pulses at very high SPL to be audible at all.
I don't think most people listen to music at such levels.

Yes, a sharp pulse attenuates while traveling through the air. The hearing system probably takes this into account, I would expect most readily for natural sounds. I believe this could be one of the reasons why we can feel the "stage depth" of a good acoustic recording. But then again, such a pulse could be perceived as "veiled" instead of as more distant.

Other strong depth clues are given by the patterns of reverberations, and spectral changes of sounds emitted by sound objects with known characteristics. Perhaps the integration of intensity of the direct transient, the clues coming from the reverberations, and spectral analysis gives us the robust estimate of the depth of the scene?

From evolutionary considerations, it must work well both in the open air and in the forest, where transients can actually be veiled by the foliage. I'm not aware of detailed peer-reviewed research on that (I haven't yet looked deeply into the stereo stage-depth effect), yet somehow the stage depth ought to be recorded and then perceived. Maybe other members have more insight into this?
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
I thought this was an interesting section in one of the above papers:

Temporal resolution and digital signals In most fields of science, "to resolve" means to "substantially preserve the essence of the original signal" and in particular to be preserve enough information in the signal so that it can "become separated or reduced to constituents" (e.g., please see v.tr. [11] and v.intr. [2] under http://www.thefreedictionary.com/resolve). If the constituents cannot be separated and have merged together, the signal's essence has been killed. However, a certain other definition exists which pertains to the smallest time shift that produces a difference in the final digital code; this resolution allows noticing differences in the "degrees of death" of the killed signal rather than the system’s ability to preserve sonic details and convey them to the ear. In psychoacoustics and auditory neurophysiology, the former definition applies. Below I give optical and audio examples to explain this further. Optical example: A binary star system is imaged through a telescope with a CCD. First, there is the analog optical resolution that is available, which depends on the objective diameter, the figure (optical correctness) of the optics, and seeing (atmospheric steadiness). This optical resolution is analogous to the "analog bandwidth". Because this resolution is limited, a point source becomes spread out into a fuzzy spot with an intensity profile governed by the point spread function or (PSF). Next we are concerned with the density of pixels in the CCD. To avoid aliasing, the pixel spacing L must be finer than the optical resolution so that the optics provides "low pass filtering". If the pixels and their separation are larger than the separation of the centers of the two star images, the two stars will not be resolved separately and will appear as a single larger merged spot. In this case the essential feature (the fact that there are two separate stars and not an oblong nebula) has been destroyed. 
This is usually what is meant by "resolution" or the lack of it. The number of bits N that can differentiate shades of intensity ("vertical resolution") has little to do with this: no number of vertical bits can undo the damage. However, the details of the fuzzy merger do indeed depend on N: if the star images are moved closer together, the digital data of the sampled image will be different as long as the image shift exceeds L/N. This L/N definition of resolution applies to the coding itself and not to the system's ability to resolve essential features in the signal as described above (otherwise, the average 6" backyard telescope with a 12-bit CCD would have a resolution of < 0.001 arc seconds, better than the ~0.1 arc second resolution of research-grade telescopes!).

Digital audio recording: In my papers, statements related to "consumer audio" refer to CD quality, i.e., 16 bits of vertical resolution and a 44.1 kHz sampling rate (when the work for these papers began around 2003, 24-bit/96 kHz and other fancier formats were not in common use in people's homes for music reproduction). For CD, the sampling period is 1/44100 ~ 23 microseconds, and the corresponding Nyquist frequency fN is 22.05 kHz. Frequencies above fN must be removed by anti-alias/low-pass filtering to avoid aliasing. While oversampling and other techniques may be used at one stage or another, the final 44.1 kHz sampled digital data should have no content above fN. If there are two sharp peaks in sound pressure separated by 5 microseconds (which was the threshold upper bound determined in our experiments), they will merge together and the essential feature (the presence of two distinct peaks rather than one blurry blob) is destroyed. There is no ambiguity about this, and no number of vertical bits or DSP can fix it. Hence the temporal resolution of the CD is inadequate for delivering the essence of the acoustic signal (two distinct peaks).
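[Editor's note: the two-peak merging described above can be sketched numerically. This is an illustrative simulation, not from the paper: two ideal impulses 5 µs apart are band-limited with an idealized brick-wall low-pass filter. At the CD's 22.05 kHz cutoff they fuse into a single broad peak, while a much wider (here, an assumed 500 kHz) band keeps them distinct. The 10 MHz "analog" simulation rate and the brick-wall filter are simplifying assumptions.]

```python
import numpy as np

# Quasi-analog signal: two unit impulses 5 microseconds apart.
fs = 10_000_000                  # 10 MHz "analog" simulation rate (assumed)
n = 1 << 16
x = np.zeros(n)
i0 = n // 2
sep = int(round(5e-6 * fs))      # 5 us separation -> 50 samples
x[i0] = 1.0
x[i0 + sep] = 1.0

def brickwall_lpf(sig, rate, cutoff):
    """Idealized brick-wall low-pass: zero all FFT bins above `cutoff` Hz."""
    spec = np.fft.rfft(sig)
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / rate)
    spec[freqs > cutoff] = 0.0
    return np.fft.irfft(spec, n=len(sig))

def count_major_peaks(sig, region, thresh=0.5):
    """Count local maxima in `region` exceeding `thresh` of the peak value."""
    s = sig[region]
    is_max = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:])
    return int(np.sum(is_max & (s[1:-1] > thresh * s.max())))

win = slice(i0 - 2000, i0 + sep + 2000)
y_cd = brickwall_lpf(x, fs, 22_050)      # CD-band anti-alias limit
y_wide = brickwall_lpf(x, fs, 500_000)   # generous 500 kHz bandwidth

print(count_major_peaks(y_cd, win))      # 1: the two peaks have merged
print(count_major_peaks(y_wide, win))    # 2: still two distinct peaks
```

As the passage argues, no amount of bit depth can separate the merged pair again; only a wider analog bandwidth (hence a higher sampling rate) keeps the two events distinct.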
However, this lack of temporal resolution in transmitting the acoustic signal should not be confused with the coding resolution of the digitizer, which is given by 23 microseconds/2^16 = 346 picoseconds. This latter quantity has no direct bearing on the system's ability to separate and keep distinct two nearby peaks, and hence to preserve the details of musical sounds.

Now, the CD's lack of temporal resolution for complete fidelity is not intrinsic to the digital format in general: the problem is relaxed as one goes to higher sampling rates, and by the time one gets to 192 kHz, the bandwidth and the ability to reproduce fine temporal details are likely to be adequate. I use the word "likely" rather than state it definitively for two reasons. First, in our research we found human temporal resolution to be ~5 microseconds. This is an upper bound: i.e., with even better equipment, younger subjects, more sensitive psychophysical testing protocols, etc., one might find a lower value. The second reason not to give an unambiguous green light to a particular sampling rate is that the effective bandwidth that can be recorded is less than the Nyquist frequency because of the properties of the anti-aliasing filter, which is never perfect in real life.

One more thing I want to add: one forum poster inquired whether the blurring is an analog effect and not a digital one ("… this isn't a sampling rate issue, it's a simple question of linear filtering…"). But the two are not separate. While it is true that the smearing may take place in the analog low-pass filter circuitry before the signal reaches the ADC, the low-pass filter cutoff is dictated directly by the sampling rate. The exact amount of smearing and other errors will depend on the slope and other details of the filter, but the big-picture conclusion is still the same.
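[Editor's note: the quoted figures can be checked directly. This small sketch merely reproduces the arithmetic of the passage (CD sampling period, Nyquist frequency, and the L/N-style coding resolution); it introduces no new claims.]

```python
fs = 44_100                      # CD sampling rate, Hz
bits = 16                        # CD vertical resolution

period = 1.0 / fs                # sampling period: ~22.7 microseconds ("~23 us")
nyquist = fs / 2.0               # Nyquist frequency: 22.05 kHz
coding_res = period / 2**bits    # L/N-style coding resolution: ~346 picoseconds

print(f"period  = {period * 1e6:.1f} us")
print(f"Nyquist = {nyquist / 1e3:.2f} kHz")
print(f"coding  = {coding_res * 1e12:.0f} ps")
```

The 346 ps figure is the coding granularity of the format; as the passage stresses, it is a different quantity from the ~23 µs sampling period that governs whether two nearby acoustic peaks remain separate.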
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,200
Likes
16,982
Location
Riverview FL
Whatever happened to paragraphs?
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
15,891
Likes
35,912
Location
The Neitherlands
An observation:

....Hearing system probably takes this into account, I would expect most r.... I believe this could be one of the reasons why we can feel the "stage depth" of a good acoustic recording. But then again, such pulse could be perceived as "veiled" instead of as more distant.

...... Perhaps the integration of intensity..... ?

yet somehow the stage depth ought to be recorded and then perceived.

There seem to be a lot of uncertainties in your line of thinking, or do you use those words so as not to offend?
The word 'but', for instance, nullifies the words that came before it.
If a researcher used these words instead of presenting evidence, I would probably scratch my head.
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
:) I wouldn't care about an invisible elephant if it didn't "step on my ear" every once in a while :)
This usually manifests itself in my aversion to a certain genre of music recorded on CD, until I hear it played live by competent musicians. I already mentioned gamelan. Over the years, I had to add mariachi to this list, and all kinds of rock styles involving heavily distorted electric guitars.
I highly doubt it's due to the sampling rate. I would suspect it's due to the recording and reproduction itself.
You mentioned reverberations yourself.
Why focus on the reverberation details of a specific high-SPL impulse signal, when I am quite sure we are still unable to reproduce the reverberation patterns of an opera house at home?
If you even slightly move your head, you will instantly get a difference between a live performance and a speaker's reproduction.
I have been at several opera houses in Europe, and the difference between a live performance and a recording is obvious.
But I think it has a lot more to do with the acoustics of the hall and the positioning of sound sources than with slight differences in ultrasonic impulse responses.

So, you are saying that you:
"... generated two 7 kHz sines using Soundforge. Since the software is all 16 bits, I decreased the volume of the first one by 1 dB, and of the second one by 1.25 dB, in order to ge
I don't remember saying that. Or even anything similar.

From one of Kunchur's papers, the one that shows the 5us discrimination:
https://asa.scitation.org/doi/pdf/10.1121/1.2998548?class=pdf
Thus the source of the discrimination may be related to the change in the waveform shape rather than just its power.
So even he isn't sure why the discrimination happened.
It could be the 5 µs delay itself, but it could also be that, in addition to the delay, the audio waveform was changed by the equipment.
Which is why he also points out:
On the other hand it also points to the need for higher bandwidths in apparatus used in psychoacoustic research for certain types of experiments, so that the thresholds measured are not affected by the limitations of the equipment.

Edit:
Somewhat unrelated to the current theoretical discussion, but I just want to mention this anyway.
Kunchur is having trouble figuring out what's going on with state-of-the-art lab equipment that took him two years to set up.
An enormous time (of the order of two years) and effort were spent to develop the instrumentation and the methods for checking for artifacts
So personally, I would conclude that claims by people who say they can tell the difference using consumer-grade electronics are BS.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
An observation:



There seem to be a lot of uncertainties in your line of thinking, or do you use those words so as not to offend?
The word 'but', for instance, nullifies the words that came before it.
If a researcher used these words instead of presenting evidence, I would probably scratch my head.

It is not evidence, but rather a hypothesis to be tested.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
15,891
Likes
35,912
Location
The Neitherlands
Testing a hypothesis is always good. :)

It's in human nature to look only for evidence that concurs with one's line of thinking, and rarely for evidence that disagrees with it.
(Also an observation, not a personal remark, by the way, but a general one that applies to everyone, including me.)

Do you plan to test this yourself one day in order to gather evidence, or to look for it in the literature?
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
I highly doubt it's due to the sampling rate. I would suspect it's due to the recording and reproduction itself.
You mentioned reverberations yourself.
Why focus on the reverberation details of a specific high-SPL impulse signal, when I am quite sure we are still unable to reproduce the reverberation patterns of an opera house at home?
If you even slightly move your head, you will instantly get a difference between a live performance and a speaker's reproduction.
I have been at several opera houses in Europe, and the difference between a live performance and a recording is obvious.
But I think it has a lot more to do with the acoustics of the hall and the positioning of sound sources than with slight differences in ultrasonic impulse responses.
The effects I described can also be observed with high-quality headphones, which takes room acoustics out of consideration.
I don't remember saying that. Or even anything similar.
Wasn't quoting you. That's what the debunker said.
From one of Kunchur's papers, the one that shows the 5us discrimination:
https://asa.scitation.org/doi/pdf/10.1121/1.2998548?class=pdf

So even he isn't sure why the discrimination happened.
It could be the 5 µs delay itself, but it could also be that, in addition to the delay, the audio waveform was changed by the equipment.
Which is why he also points out:
Exactly! After spending 5 years on this, Kunchur demonstrated the effect, yet he wasn't sure about the underlying neurophysiological basis for it, because the science hadn't discovered it yet back then! Now we understand the mechanisms in play better and can formulate better hypotheses for future experiments to test. This is the normal cycle of refinement of scientific knowledge.
Edit:
Somewhat unrelated to the current theoretical discussion, but I just want to mention this anyway.
Kunchur is having trouble figuring out what's going on with state-of-the-art lab equipment that took him two years to set up.

So personally, I would conclude that claims by people who say they can tell the difference using consumer-grade electronics are BS.
You may want to read this: https://en.wikipedia.org/wiki/Ignaz_Semmelweis. See the parallels? That's how the inertia of an established paradigm works. We, individually, have a choice: either be like those guards who beat the "savior of mothers" to death, or be like Louis Pasteur, who refined the scientific knowledge, and proved Ignaz Semmelweis right.
 