
How well does high-fidelity audio align with human perception?

WTF. That's probably the strangest definition of science I've ever heard.
 
Just Googled it. Wow. A very humorous description. Who am I to argue. Splendid!
 
Find a piano or a keyboard. Play your favorite chord 10 times in a row as identically as you can. Same attack, same timing, same sustain. It should sound remarkably similar.
Now realize that every time you played that chord you heard a "coherent" result, but the relative phases between your notes were COMPLETELY random because you don't have enough control over the timing to get the same result twice. If our brains were sensitive to the type of phase distortion you're concerned with, then the relative phase of the notes in a chord would make the chord sound different every time we heard it, and songs would be unpredictable because musicians would be incapable of consistently generating the sound they want.
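If you want to see this on a screen rather than at the piano, here is a rough Python sketch (the note frequencies, rounded to whole hertz so each tone lands exactly on an FFT bin, and the random phases are my own toy choices). The magnitude spectrum, which is roughly what an amplitude-per-frequency detector sees, comes out the same on every "performance", while the waveforms differ.

```python
# Toy sketch: the same three-note chord synthesized with random relative
# phases between the notes. Waveforms differ; magnitude spectra do not.
import numpy as np

fs = 44100                       # sample rate in Hz
t = np.arange(fs) / fs           # 1 second of time
notes = [262, 330, 392]          # C major triad, rounded to whole Hz (exact FFT bins)

def chord(phases):
    """Sum of three sine tones, one per note, with the given starting phases."""
    return sum(np.sin(2 * np.pi * f * t + p) for f, p in zip(notes, phases))

rng = np.random.default_rng(0)
a = chord(rng.uniform(0, 2 * np.pi, 3))   # "performance" 1
b = chord(rng.uniform(0, 2 * np.pi, 3))   # "performance" 2

# The waveforms differ sample by sample...
print("max waveform difference:", np.max(np.abs(a - b)))
# ...but the magnitude spectra agree to floating-point precision.
print("max |spectrum| difference:",
      np.max(np.abs(np.abs(np.fft.rfft(a)) - np.abs(np.fft.rfft(b)))))
```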
you are confusing live sound with reproduced sound.
 
you are confusing live sound with reproduced sound.
I don't see how that's relevant in the context of what they said.
 
I don't see how that's relevant in the context of what they said.
Simply that the example of a played musical note does not illustrate the importance of phase coherence in reproduced sound.
 
Simply that the example of a played musical note does not illustrate the importance of phase coherence in reproduced sound.
My understanding of the example is that the phase relationship between different sounds — like notes in a piano chord — is basically random. It's impossible to strike keys with perfect timing, so the frequency components from different notes don’t have any consistent phase relationship. And since we don’t hear a noticeable difference when the same chord is played again, it seems like faithfully reproducing that randomness isn’t critical.

But it’s a different story for each individual piano key. The harmonics within a single note start out with a well-defined phase relationship. That can drift over time — due to slight detuning or string dispersion — but the initial alignment is always there. If a playback system disrupts that, it might change how the note sounds.

Of course, practical playback systems can’t tell whether frequencies belong to the same sound or not. If they alter phase in a frequency-dependent way, they affect everything. But that doesn’t change the basic idea.
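For anyone curious, here is a toy Python sketch of that idea (the fundamental, the harmonic amplitudes and the decay envelope are assumptions of mine, not measurements of a real piano). Scrambling only the phases of the partials within one note leaves the construction of the magnitude spectrum alone but changes the waveform shape, which shows up as a different crest factor.

```python
# Toy sketch: one "piano-like" note built from a fundamental plus harmonics,
# once with aligned partial phases and once with the phases scrambled.
import numpy as np

fs = 44100
t = np.arange(fs) / fs
f0 = 220.0                                   # assumed fundamental (A3)
amps = [1.0, 0.5, 0.33, 0.25, 0.2, 0.17]     # toy 1/n harmonic amplitudes

def note(phases):
    env = np.exp(-3 * t)                     # simple exponential decay envelope
    return env * sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t + p)
                     for k, (a, p) in enumerate(zip(amps, phases)))

aligned = note(np.zeros(len(amps)))                      # harmonics start in phase
rng = np.random.default_rng(1)
scrambled = note(rng.uniform(0, 2 * np.pi, len(amps)))   # only the phases change

# Same partial frequencies and amplitudes, typically a different crest factor:
for name, x in [("aligned", aligned), ("scrambled", scrambled)]:
    print(name, "peak/RMS:", np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)))
```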
 
My understanding of the example is that the phase relationship between different sounds — like notes in a piano chord — is basically random. It's impossible to strike keys with perfect timing, so the frequency components from different notes don’t have any consistent phase relationship. And since we don’t hear a noticeable difference when the same chord is played again, it seems like faithfully reproducing that randomness isn’t critical.

But it’s a different story for each individual piano key. The harmonics within a single note start out with a well-defined phase relationship. That can drift over time — due to slight detuning or string dispersion — but the initial alignment is always there. If a playback system disrupts that, it might change how the note sounds.

Of course, practical playback systems can’t tell whether frequencies belong to the same sound or not. If they alter phase in a frequency-dependent way, they affect everything. But that doesn’t change the basic idea.
That distinction between science and engineering is a funny, self-ironic piece. Regarding the observation you made above: true. The boss chimes in: mathematics. What is the concept of a tone, of frequency? It is not 'natural' to begin with. Frequency as such doesn't exist; it is a descriptive simplification that holds only under strict limits on its use. Fourier's theorem was proven; even human perception has to obey it, and so does the recording industry. Keywords: 'integration time', 'information content'. And when you talk about a tone 'beginning', it is no longer a tone; the very word 'beginning' means it is a continuous spectrum rather than a tone.

Your speculations fall short in several areas. First, there are (normally) two ears, connected by the Head-Related Transfer Function (see Dr Gunther Theile's PhD thesis). Your central premise is presumptive, and the inference that follows shows wide gaps. You could fill those with experimental data, but none is provided, nor do you say how such data could be taken, not even on a purely physical level.

You say the human auditory system identifies the origin of overtones/harmonics by their phase relation to the fundamental.
 
That can drift over time — due to slight detuning or string dispersion — but the initial alignment is always there. If a playback system disrupts …
Exactly. The sound of a note in live music is always in phase; reproduced music should be in phase as well. Some loudspeaker designers have always studied this phenomenon, and their first goal is to limit the damage.
 
Exactly. The sound of a note in live music is always in phase; reproduced music should be in phase as well. Some loudspeaker designers have always studied this phenomenon, and their first goal is to limit the damage.
Look, those are the two points: a wild speculation, and the claim that if it isn't followed some damage is done, of course to some perfectly unknown original. The proof, apparently, is that some third person follows your science.

I no longer wonder why you do not offer any systematic investigation of your own. Yes, of your own. It is easy to investigate such topics at home today. Mostly what is missing is a concept, and if there were one, I am told that do-it-yourself would not meet scientific standards. So somebody else is asked to do the due homework. With all respect, by what means could your suspicion be proven right or wrong?

(By the way, this weekend I found a small panel of test subjects to prove true a point I made here. For entertainment, and people liked it. It was ignored here, and will be ignored forever by the experts. Nothing wrong with that; not my fault or concern. Only that I could, without any hassle, explain the origin of the effect perfectly to the naive audience, with critical questions proving their understanding. Isn't that embarrassing?)
 
Keep in mind that the ear is an amplitude sensor per frequency band; it is not a phase sensor. It is the same with light sensors, biological or physical. The only way to detect phase with an amplitude sensor is through interference with another signal. I have spent a lot of time in photonics and radio, and the principle is the same.

There is a lot of work today and there are products that separate instruments in a mix with software. They might be based on harmonics. I would look at patents and scientific papers on that. It is plausible the auditory cortex can do something like that, like it can to an extent follow more than one voice. There may be some prediction at work in the brain trained on music as well.

It is possible to recover phase by sampling at or above the Nyquist rate, but the cochlea does not do that. So the auditory cortex and higher brain functions cannot do it either. The brain can do things with two ears and the head-related transfer function, but it cannot detect phase.
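As a toy illustration of the interference point (the tone frequency and the 60-degree shift are arbitrary choices of mine): an amplitude-only detector, here the magnitude of a DFT bin, cannot tell two phase-shifted copies of a tone apart, but interfering the signal with a reference copy can.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
f = 1000.0                                    # tone frequency (exact DFT bin)

x0 = np.sin(2 * np.pi * f * t)                # reference tone
x1 = np.sin(2 * np.pi * f * t + np.pi / 3)    # same tone, shifted by 60 degrees

bin_k = int(f)                                # 1 Hz bin spacing over 1 second
amp0 = np.abs(np.fft.rfft(x0)[bin_k])
amp1 = np.abs(np.fft.rfft(x1)[bin_k])
print("amplitude detector:", amp0, "vs", amp1)   # essentially identical

# Interference with the reference: the summed power now depends on the phase.
p0 = np.mean((x0 + x0) ** 2)
p1 = np.mean((x0 + x1) ** 2)
print("interfered power:", p0, "vs", p1)         # clearly different
```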
 
The boss chimes in: mathematics. What is the concept of a tone, of frequency? It is not 'natural' to begin with. Frequency as such doesn't exist; it is a descriptive simplification that holds only under strict limits on its use. Fourier's theorem was proven; even human perception has to obey it, and so does the recording industry. Keywords: 'integration time', 'information content'. And when you talk about a tone 'beginning', it is no longer a tone; the very word 'beginning' means it is a continuous spectrum rather than a tone.

If you want to be pedantic, then yes — “frequency,” used casually, is a simplification. And yes, if a signal has a beginning, its spectrum is technically a continuum and infinite — depending, of course, on how you choose to analyze it.

However — despite lacking expertise in psychoacoustics — even with my rudimentary understanding of ear anatomy, I can see that the cochlea is, among other things, an array of mechanical filters tuned to different frequencies. So it makes sense that our perception of sound is shaped by a model that assumes sounds are composed of different frequency components — because that’s how the sensory input is presented to the brain.

True, the tuning of this array is neither linear nor logarithmic on a frequency scale, and the plain discrete Fourier transform doesn’t perfectly map to it — but it’s still a useful and practical approximation.
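To make that concrete, here is a crude toy in Python, and I stress that it is only a toy, not a cochlear model: a handful of octave-wide band-pass filters on a roughly logarithmic grid, each reporting only its energy, i.e. magnitude without phase. The band centres, filter order and test signal are my own arbitrary choices.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)  # test input

centers = [125, 250, 500, 1000, 2000, 4000]      # assumed band centres in Hz
for fc in centers:
    lo, hi = fc / 2 ** 0.5, fc * 2 ** 0.5        # one-octave-wide band
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    band = sosfiltfilt(sos, x)
    print(f"{fc:5d} Hz band energy: {np.mean(band ** 2):.4f}")
```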

You say the human auditory system identifies the origin of overtones/harmonics by their phase relation to the fundamental.

My observation was mainly about the physical difference between striking a single string — where the tight phase relationship between the overtones and the fundamental can be observed — and striking multiple strings tuned to different frequencies, even when those frequencies fall within the same harmonic sequence. The rest is, of course, just speculation on my part.
 
Keep in mind that the ear is an amplitude sensor per frequency band; it is not a phase sensor. It is the same with light sensors, biological or physical. The only way to detect phase with an amplitude sensor is through interference with another signal. I have spent a lot of time in photonics, and the principle is the same.

There is a lot of work today and there are products that separate instruments in a mix with software. They might be based on harmonics. I would look at patents and scientific papers on that. It is plausible the auditory cortex can do something like that, like it can to an extent follow more than one voice. There may be some prediction at work in the brain trained on music as well.
But there must be something in the auditory system that is somehow sensitive to phase? Otherwise what is the explanation for the "binaural beats" illusion?

***
Edit:
I found a relatively recent paper on the subject: In vivo coincidence detection in mammalian sound localization generates phase delays.

The claim is the brain doesn't detect sound timing differences by simply comparing simultaneous signals from each ear. Instead, sound timing is interpreted more like a phase shift than a time lag.
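For anyone who wants to hear the illusion, here is a small Python sketch that writes the classic stimulus to a WAV file (the 440/444 Hz pair, the resulting 4 Hz beat, the level and the file name are just my choices). The beat exists only as a slowly rotating interaural phase difference; neither ear's signal contains a 4 Hz component on its own.

```python
import numpy as np
from scipy.io import wavfile

fs = 44100
t = np.arange(10 * fs) / fs                   # 10 seconds
left = np.sin(2 * np.pi * 440.0 * t)          # 440 Hz to the left ear
right = np.sin(2 * np.pi * 444.0 * t)         # 444 Hz to the right ear

stereo = np.stack([left, right], axis=1)
wavfile.write("binaural_beat.wav", fs, (0.3 * stereo * 32767).astype(np.int16))

# Instantaneous interaural phase difference, wrapping through 2*pi every 1/4 s:
ipd = (2 * np.pi * (444.0 - 440.0) * t) % (2 * np.pi)
print("IPD after 0.125 s (radians):", ipd[int(0.125 * fs)])   # ~ pi
```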
 
That distinction between science and engineering is a funny, self-ironic piece. Regarding the observation you made above: true. The boss chimes in: mathematics. What is the concept of a tone, of frequency? It is not 'natural' to begin with. Frequency as such doesn't exist; it is a descriptive simplification that holds only under strict limits on its use. Fourier's theorem was proven; even human perception has to obey it, and so does the recording industry. Keywords: 'integration time', 'information content'. And when you talk about a tone 'beginning', it is no longer a tone; the very word 'beginning' means it is a continuous spectrum rather than a tone.

Your speculations fall short in several areas. First, there are (normally) two ears, connected by the Head-Related Transfer Function (see Dr Gunther Theile's PhD thesis). Your central premise is presumptive, and the inference that follows shows wide gaps. You could fill those with experimental data, but none is provided, nor do you say how such data could be taken, not even on a purely physical level.

You say the human auditory system identifies the origin of overtones/harmonics by their phase relation to the fundamental.
Sorry, but I don't understand what I should explain.
 
Sorry, but I don't understand what I should explain.
It seems you quoted another post of mine. As for how the claim you made could be proven right or wrong, see post #30.

If you want to be pedantic, …
I’m not pedantic …

However — despite lacking expertise in psychoacoustics — even with my rudimentary understanding of ear anatomy …
The operation of the cochlea is still not understood. Some say it is a waveguide with self-amplification. To see it as a filter bank is a simplification that doesn't hold; at least, so I'm told.

The thing is, who cares, if we see the recording not as a mirror of an original but as an artifact of human decision making, more a pencil sketch than a photograph? More a collection of hints about what happened than virtual reality. It is not 'the brain', automated, that listens, but the mind, some say even the soul. We, as consumers, impose our understanding onto the physical sensory input. We need more education in that regard, methinks. We might better spend the effort of investigating the auditory system on topics that are more life-changing than hi-fi. But thanks, I've learned we have artificial cochleas, great!
 
The thing is, who cares, if we see the recording not as a mirror of an original but as an artifact of human decision making, more a pencil sketch than a photograph? More a collection of hints about what happened than virtual reality. It is not 'the brain', automated, that listens, but the mind, some say even the soul. We, as consumers, impose our understanding onto the physical sensory input. We need more education in that regard, methinks.
No argument here. Still, it's always nice when technology can create an ever more convincing illusion of being there—for our sensory apparatus, and without much expense. (Symphony tickets are expensive, especially if you want to go regularly, not just once in a while. And don’t even get me started on opera ticket prices.)

That’s in addition to us having a soul, of course, to be moved by music.

But thanks, I've learned we have artificial cochleas, great!
I was excited at first, but the more I learned, the more disappointed I became. They do work—but hardly for music. Hopefully, that will improve over time.
 
It seems you quoted another post of mine. As for how the claim you made could be proven right or wrong, see post #30.
And what should I explain to you? That the sound captured by a microphone must be reproduced with its timing and phase intact? Please read post #4; I realized they had already written the same thing!
 
I lean towards thinking there's some truth to it.
That was post #4, in the part most relevant to me.

No argument here. Still, it's always nice when technology can create an ever more convincing illusion …

I personally think that taking stereo for an illusion is delusional. Which brings me to this: once you have phase X right, what about the next step, stereo? The head-related transfer function is different for real listening versus phantom sources, and that difference is frequency dependent. I wonder, as a passer-by, what the next assumption will be to save your conclusions from being dismissed.
 
That was post #4, in the part most relevant to me.
And do you agree with him, or have you done scientific studies in neurobiology and discarded everything?
If I remember correctly, there is a loudspeaker designer who is a neurobiologist. Have you ever listened to their speakers? Have you ever made comparisons? Or do you just write that we have to demonstrate our ignorant intuitions to you?
 
Coming from a background in physical acoustics and engineering - though not specifically in audio - I’ve been wondering why high-fidelity audio still seems so focused on signal accuracy - "waveform fidelity," for lack of a better term. By signal, I mean what microphones capture and what gets delivered near the ears.

To be fair, modern high-fidelity systems already account for a lot of the basics: that we hear with two ears, the limits of human hearing in frequency and dynamic range, how we perceive loudness, acceptable levels of various distortions, and so on. Perceptual insights have clearly shaped things like lossy compression, spatial audio, and room acoustics. But beyond that, I haven’t seen much that really engages with the more subtle ways we actually perceive sound.

Take harmonics, for example. Real-world sounds aren’t just single tones - harmonics are a universal property of oscillations, and any hearing system that evolved to make sense of sound likely developed around this structure. From very early on, brains - ours and those of many other animals - are wired to hear harmonically related frequencies as one sound, or as belonging together. It plays a huge role in how we recognize voices, distinguish instruments, and perceive musical harmony. So I can’t help but wonder - could that be used more directly in how we design audio systems?
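To make the idea slightly more concrete, here is a toy Python sketch of the kind of harmonic grouping I mean; it is entirely my own illustration, not a description of any existing product. It scores candidate fundamentals by how much spectral energy sits at their harmonics, so partials belonging to the same harmonic series get grouped as one "sound".

```python
import numpy as np

fs = 16000
t = np.arange(fs) / fs
# Two overlapping "instruments": harmonics of 200 Hz and of 330 Hz.
x = (sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
     + sum(np.sin(2 * np.pi * 330 * k * t) / k for k in range(1, 6)))

mag = np.abs(np.fft.rfft(x))                 # 1 Hz bin spacing over 1 second

def harmonic_score(f0, n_harm=6):
    """Sum spectral magnitude at the first n_harm multiples of f0."""
    return sum(mag[int(round(f0 * k))] for k in range(1, n_harm + 1)
               if f0 * k < fs / 2)

candidates = np.arange(100, 500, 5)          # candidate fundamentals in Hz
scores = [harmonic_score(f0) for f0 in candidates]
best = candidates[np.argsort(scores)[-2:]]   # the two strongest groupings
print("strongest harmonic groups near:", sorted(best), "Hz")
```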

I’d be curious to learn if there are any interesting efforts along those lines, or what the main challenges are.

Thanks.
How well does high-fidelity audio align with human perception?

Add, after "perception": expectations, and knowledge, or probably the lack of it.

IMO, most people use (for want of a better word :facepalm:) perception-aware systems out of the box (what you buy is what you get) without knowing what the room acoustics, in combination with the system they bought, are actually doing.

From there, it becomes a matter of what perceptual truth you want the system to serve. Like your source material? LPs, CDs, high-res files, etc. are produced in a professional studio control room, with an investment ratio of, on average, 30% gear and 70% room treatment (cancelling traffic tremors, reflections and reverb, for phase-coherent, time-aligned behaviour, etc.) that can run into the millions.

So realistically, what do we expect? That our home setup will sound like that?
In many cases the dream bursts, and we either accept the odd sound as reality, which is fine, or worse, get used to such sound.
Most of us on ASR probably aren't satisfied with that. So we tweak, measure, and adjust, always chasing that better sound :cool:


 