
How well does high-fidelity audio align with human perception?

Do you play an instrument yourself? (Electric) bass guitarists are, in part, very deep into magical thinking. The pianists I know, not so much. Back to the bassists: they claim that a so-called 'long scale' model will sound richer in overtones, etc., than a 'short scale'. Thing is, due to scaling effects in the mechanics, the short scale is less prone to buzzing the string against the neck, while the long scale does it all the time. That enriches the sound for sure, but it is decidedly not harmonic - the guitarists don't get that.

So much for harmonics: even for musicians the spectral content is not that important, let alone the phase of the chaotic elements.

A wind instrument won't be that exact with regard to phase, especially when the note starts. Again, the mechanics of the instrument prevent that. The strict relation develops within a few cycles, but isn't established at once as it is with strings. For strings, in turn, the overtones depend strongly on where the string is excited, including irregularities that are sustained with a violin but not with a guitar. Consider that for a sustained note the player has to drive the damped oscillator continuously, which may have an impact on how the harmonics are excited, and hence on the phase. Chaos is a crucial part of an instrument's sound, and artists seek out a personal signature originating in that.

All in all, the overtones typically are not static; with my guitar I observe an exchange of energy between the harmonics, one going down while the other comes up, and back again. Asking about the shape of the wave? This effect depends on the model, not only on the type of instrument. Percussion is a different field altogether.
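If one wanted to see that energy exchange rather than just hear it, here is a minimal numpy/scipy sketch of per-harmonic envelope tracking. The file name 'note.wav' and the 110 Hz fundamental are my own placeholder assumptions, not anything from this post:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

fs, x = wavfile.read("note.wav")        # hypothetical mono guitar-note recording
x = x.astype(np.float64)
if x.ndim > 1:
    x = x.mean(axis=1)                  # fold multichannel to mono

f0 = 110.0                              # assumed fundamental (A2)
frame, hop = 4096, 1024
win = np.hanning(frame)
n = np.arange(frame)

starts = np.arange(0, len(x) - frame, hop)
env = np.empty((len(starts), 6))
for i, s in enumerate(starts):
    seg = win * x[s:s + frame]
    for k in range(1, 7):               # track harmonics 1..6
        # heterodyne against each harmonic frequency to get its amplitude
        osc = np.exp(-2j * np.pi * k * f0 * (s + n) / fs)
        env[i, k - 1] = 2.0 * np.abs(np.dot(seg, osc)) / win.sum()

# Anti-correlated traces (one harmonic falling while another rises)
# are the energy exchange described above.
for k in range(6):
    plt.plot(starts / fs, 20 * np.log10(env[:, k] + 1e-12), label=f"H{k + 1}")
plt.xlabel("time [s]"); plt.ylabel("level [dB]"); plt.legend(); plt.show()
```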

All this, taken together, is another counterargument against a need for strict phase fidelity. Would you mind presenting your experiment in a bit more detail, with some numbers?

What do we have? Some quite speculative assumptions, picked as isolated and not-quite-to-the-point results from neurobiology, applied to a very simplified and broadly generalized model of mechanical instruments, with a king-size gap in between.
In my previous life I used to build research sonars. I mean, I needed unique multibeam sonars for my research, not because sonar was the goal, but because I needed them to study sound propagation, scattering, diffusion, etc., in layered media with tricky boundaries - think the ocean. In that context, assuming phase precision doesn’t matter just wouldn’t fly. I know how small phase tweaks can have huge effects if you know what you're doing.

On the other hand, consider the shape of the boundary conditions: think of the irregularities of the ocean bottom, or the opposite boundary, the surface, which is not only irregular but also in constant motion. Imagine what that does to the phase relationships between different frequency components, especially when the sonar is constantly in motion as well.

Yet, with a systematic and clever approach to the problem, you can successfully combine both "attitudes." Some parts of the signal chain need phase accuracy. Other parts, after you normalize things, benefit from eliminating the phase to bring out patterns you’d otherwise miss.
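A toy sketch of those two "attitudes", far removed from real sonar processing; every number in it is invented for illustration:

```python
import numpy as np

fs = 8000
rng = np.random.default_rng(1)

# A broadband source with a spectral pattern (gentle low-pass shape).
src = np.convolve(rng.standard_normal(fs), np.ones(8) / 8.0, mode="same")

# 1) Phase-sensitive part of the chain: estimating a propagation delay.
#    Cross-correlation only works because the phase is preserved.
delay = 37                                        # samples (invented)
echo = np.roll(src, delay) + 0.1 * rng.standard_normal(fs)
lag = np.argmax(np.correlate(echo, src, mode="full")) - (len(src) - 1)
print("estimated delay:", lag)                    # -> 37

# 2) Phase-blind part: averaging magnitude spectra over many frames
#    deliberately discards the (random) phase and brings out the
#    stationary spectral pattern a single noisy frame would hide.
frames = src[: (len(src) // 256) * 256].reshape(-1, 256)
single = np.abs(np.fft.rfft(frames[0]))
average = np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)
print("single-frame vs averaged spectrum roughness:",
      np.std(np.diff(single)), np.std(np.diff(average)))
```

The first step collapses without correct phase; the second works precisely because the phase has been averaged away.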

This is how I built my intuitions, and when I hear that phase fidelity does not matter in audio at all, I am not sure that can be physically true in all cases. On the other hand, I understand that in real life, with real sounds, in real rooms, with real people, the circumstances under which phase fidelity is critical are extremely rare.

And yeah, I play piano. Or used to. Mostly classical.

As for the experiment, it was totally artificial. I took a fundamental, added ten harmonics with a sloping power profile, shaped it with an attack/decay envelope, then started messing with the harmonic phases. The waveform gets totally warped, even though the power spectrum stays the same. And sometimes, you can hear the difference, kind of like how some alarm clocks just sound more annoying than others.
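For anyone who wants to poke at this, here is a rough reconstruction of that experiment in numpy. The exact fundamental, harmonic slope and envelope are my guesses, since the post gives no numbers:

```python
import numpy as np

fs = 48000
f0 = 220.0
t = np.arange(fs) / fs                      # one second of signal

# attack/decay envelope: 10 ms linear attack, exponential decay
env = np.minimum(t / 0.01, 1.0) * np.exp(-3.0 * t)

def tone(phases):
    """Fundamental plus ten harmonics with a sloping power profile."""
    x = np.zeros_like(t)
    for k in range(1, 12):                  # partials 1..11
        amp = 10.0 ** (-6.0 * (k - 1) / 20.0)   # assumed -6 dB per partial
        x += amp * np.sin(2.0 * np.pi * k * f0 * t + phases[k - 1])
    return env * x

rng = np.random.default_rng(0)
a = tone(np.zeros(11))                      # all partials in phase
b = tone(rng.uniform(0.0, 2.0 * np.pi, 11))  # scrambled phases

# The waveforms differ drastically ...
print("waveform difference:", np.max(np.abs(a - b)) / np.max(np.abs(a)))

# ... while the magnitude spectra match almost exactly (the shared
# envelope smears each partial a little, so the match is near, not exact)
A, B = np.abs(np.fft.rfft(a)), np.abs(np.fft.rfft(b))
print("relative spectrum difference:", np.linalg.norm(A - B) / np.linalg.norm(A))

# Listen for yourself (optional):
# from scipy.io import wavfile
# wavfile.write("in_phase.wav", fs, (a / np.max(np.abs(a))).astype(np.float32))
# wavfile.write("scrambled.wav", fs, (b / np.max(np.abs(b))).astype(np.float32))
```

Writing the two signals out and listening is the interesting part; the printed numbers only confirm that the waveforms diverge while the power spectra stay essentially identical.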
 
I think I didn't say it clearly in my initial blurb - I mean the bit about how well the audio aligns with human hearing. What I meant is not so much that hifi audio is lacking something, but more like the other way around. Lossy compression, for example, when applied appropriately, does not really cause any perceived loss of audio fidelity. The ear+brain complex seems much more forgiving than the "strict" waveform fidelity position would assume.

The whole detour into phase talk just kind of happened for me. I didn’t realize it was such a triggering topic in educated circles :). I’d heard about it before and figured it was settled. In real life, where sounds aren’t perfectly stable and you rarely get pure harmonics locked in, natural phase shifts in physical environments might create some weird effects, but usually they don’t matter much.
 
I don't know if this is a helpful comment or not, but the goal of the signal and electrical domains is to have errors either vanishingly small or perceptually irrelevant. Hi-fi debates happen when perceptually irrelevant errors are found relevant, or a new technological breakthrough allows significant improvement in physical construction and signal delivery.

The errors that we are working with are fairly gross. Psychoacoustics doesn't have the right kind of money driving perceptual research, and as a consequence we lack the models to probe and investigate more subtle problems. A lot of progress has been made, but it pales in comparison to progress in loudspeaker design, for example.
 
... I mean the bit about how well the audio aligns with human hearing. ...
The errors that we are working with are fairly gross. Psychoacoustics doesn't have the right kind of money ...
Tried to formulate a reply that was entertaining, yet not too silly. For the fun of it, I gave it to an AI for evaluation and reformulation. Phew ... deleted my first attempt as a consequence.

Artificial playback requires the integration of the listener into a technical system - meaning the stereo triangle, not moving a bit and so on. Personally, I find no delight in that at all. It reminds me of Alex during the aversion therapy session in the famous movie A Clockwork Orange. Wilson Audio builds speakers that look exactly like that - not beautiful. What is the motivation for using audio in the first place? That would be the real question for psychoacoustics. Is perfection necessary to achieve those purposes? Or is it a silly end in itself?

As I said many times before, listening to audio is first and foremost personal involvement, not neurobiology.
 
I personally assume that I have to be subjectively active in order to understand a recording, just as when I read a book. By this I mean that the design of a recording, in all the parameters of its composition, must allow me to extract meaning from it. The playback technology only plays a secondary role; the first role is played by my mind. I'll leave it at that.
Yep.
 
Listening to music is a hobby, and high fidelity is too. They are not in opposition to each other.
 
Listening to music is a hobby, and high fidelity is too. They are not in opposition to each other.
If I were funny, I could say they point not in 180° opposing directions, but are orthogonal, at 90°. They have nothing in common, by definition.

Toole's (Floyd's?) finding that preference can serve as a quality parameter is a big achievement, an actual departure from the 'neurobiology' or 'psychoacoustic' route when evaluating the merits of gear. The latter isn't useful, as far as I follow his argumentation. It is all too vague, too indirect. Not least, I personally feel a little embarrassed when others speak of me as an ear/brain apparatus.

When it comes to preference, though, it may change in different situations. Because of human nature I imagine certain filters involved: one for evaluating a single speaker while listening to a few seconds of Fast Car, the most revealing piece of contemporary music, another for listening for fun to Glenn Branca's Symphony No. 1 in stereo on my couch.

Could you tell where the toggle switch is in my mind, and how I could determine which position it is in?
 
preference as a quality parameter is a big achievement, an actual departure from the 'neurobiology' or 'psychoacoustic' route when evaluating the merits of gear.
Preferences are the product of listening experiences and the goal one sets for oneself. When I sit in front of a system, I try to be transported to where the event took place. Low distortion and transparency create this magic: less listening fatigue, a rich acoustic stage. As with any distortion, once it becomes distinctly audible, the magic is long gone. Some systems are transparent, others less so, down to the point of nothing. If some designers insist on phase coherence for greater transparency because they consider its absence a distortion, why can others claim the opposite? This is where psychoacoustics and neurobiology come into play. I like high fidelity, I'm thrilled to listen to music, and I'm fascinated by the level stereo reproduction has reached.
 
Not least, I personally feel a little embarrassed when others speak of me as an ear/brain apparatus
Understood. I only use it as a shortcut, with no intent to be callous. What would you prefer I use instead of those words in contexts aiming to be 'objective' - like applicable to most humans?
 
Preferences are the product of listening experiences and the goal one sets for oneself. When I sit in front of a system, I try to be transported to where the event took place. ...

Not one of the parameters from those elementary research fields, neuro... and psycho..., is actually observed during the recording process. Do we know the distortion figures of microphones? What about phase distortion in cardioid microphones? Is the equalisation needed to match a natural human ear bearable as a process? And mixing those mikes down into a two-track recording?

The unseen gorilla passing by: stereo fails in its very basics, eventually forcing people into unnatural, even bewildering listening behaviour, and still failing. Surround isn't any better.

With all that said, why not just ask people what they like? There are caveats, see my post above, but still. Recording is an art form, not a task of replication based on unconscious signal processing.

Hope you take this with a bit of humor:

(attached image)
 
Only that not one of the parameters from those elementary research fields, neuro... and psycho..., is actually observed during the recording process. Do you know the distortion figures of microphones? What about phase distortion in cardioid microphones? Is the equalisation needed to match a natural human ear bearable as a process? And mixing those mikes down into a two-track recording?
Sorry, but the amount of distortion—assuming the microphone distorts at all—isn't important, or only relatively so. There are €100 microphones and €10,000 microphones. What's important is that the speaker doesn't distort. If the only way to record a musical event is through a microphone, that's what we should use. The proof is in the post I wrote above. If I sit in front of one system and listen to a track, then listen to the same track on another system, and the first sounds true while the second is all smeared on the wall behind the speakers, what should I think?

Hope you take this with a bit of humor:
Yes
 
Many years ago I was playing with a Uher portable tape recorder, using the supplied microphones. At the time Nagra was state of the art, but Uher was a close second.

I was in an Army barracks at the time, and unintentionally recorded someone opening the door and entering the room.

Months later I was checking the tape to see if it was blank, using some pricy headphones. I heard the door open and a person enter the room. I looked up, and for a very long half second, saw the person, real as life.

Since then I have considered realism in sound reproduction to be something manufactured in the brain. Reality is not distortion-free. There are competing ambient sounds and room acoustics. We can choose to hear through noise and distortion. It’s nice to have products that are the best we can make and afford, but in the end, satisfaction is something we choose.
 
While I agree that audio involves personal engagement and not just neurobiology, it’s still fundamentally rooted in it. One cannot eliminate the neurobiology from the picture. I mean, it always starts – well, almost always (divine intervention or chemicals, artificial or naturally produced, aside) – as a physical, material substrate first presented to your sensory input, whether it’s the real thing or a recording played over speakers.

Moreover, the experience not only correlates in time and space with those funny air pressure pulsations emanating from physical transducers – it is also largely shared with other humans, provided they're in a compatible state of mind and have some prior common experience. In that sense, it carries a significant 'objective' component; it is not uniquely and entirely subjective.

And because that ‘objective’ component exists, the better it is understood (or at least meticulously and systematically observed and noted), the more effectively it can (potentially) be reproduced – to better facilitate the hacking of the personal orienting reflex's focus and attention-grabbing mechanisms.
 
Artificial playback requires the integration of the listener into a technical system - meaning the stereo triangle, not moving a bit and so on. Personally, I find no delight in that at all. It reminds me of Alex during the aversion therapy session in the famous movie A Clockwork Orange. Wilson Audio builds speakers that look exactly like that - not beautiful. What is the motivation for using audio in the first place? That would be the real question for psychoacoustics. Is perfection necessary to achieve those purposes? Or is it a silly end in itself?
I have a complex reply. Allow me to meander.

Comparing a stereo sweetspot to the Clockwork Orange scene is hilarious and very true! Stereophonic sound is inherently unnatural. There is no natural circumstance in which an identical sound comes at you from multiple directions. And yet that's exactly what two-channel and various multichannel formats do (I hope it's well known that stereo is a specific technique and not some particular number of channels). Phantom sources rely on this artificial means of manipulating sound through timing and level differences.
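For what it's worth, a minimal sketch of the level-difference half of that manipulation, a constant-power pan law; the mapping convention and numbers are my own illustration, not something from this post:

```python
import numpy as np

def pan(mono, position):
    """position in [-1, 1]: -1 hard left, 0 phantom centre, +1 hard right."""
    theta = (position + 1.0) * np.pi / 4.0      # map to 0..pi/2
    return np.cos(theta) * mono, np.sin(theta) * mono

fs = 48000
t = np.arange(fs) / fs
mono = np.sin(2.0 * np.pi * 440.0 * t)

# Identical signal in both channels: listeners fuse the two arrivals
# into a single phantom source midway between the speakers.
L, R = pan(mono, 0.0)
print("centre gains:", np.cos(np.pi / 4.0), np.sin(np.pi / 4.0))  # ~0.707 each

# A level difference alone steers the image toward the louder speaker.
L, R = pan(mono, 0.5)
print("half-right gains:", L.max(), R.max())
```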

The other side to this is that stereo's requirements have become fairly flexible as a consequence of loudspeaker improvements. Early loudspeakers, and even some loudspeakers today, characteristically have radiation patterns that force a person to keep their head in a particular spot for optimal direct sound. Current designs focusing on consistent on- and off-axis responses, horizontally and vertically, have as a consequence reduced the need to fiddle with loudspeaker distances, angles and strict listening positions. The physically correct listening position is always centered and symmetrical, but there is less need to be so exact, particularly with measurement and adjustment EQ being so common.

But the other, more pressing thing is that you don't seem to understand what psychoacoustics is, or are being excessively flippant.
What is the motivation for using audio in the first place? That would be the real question for psychoacoustics. Is perfection necessary to achieve those purposes? Or is it a silly end in itself?

As I said many times before, listening to audio is first and foremost personal involvement, not neurobiology.
Psychoacoustics is the study of the subjective experience of sound. It is tightly related to our biology, since the physical structures of our body draw the limits and abilities of our hearing, and to the physical world, the circumstances which cause the formation of sound, studied in acoustics.
Toole's (Floyd's?) finding that preference can serve as a quality parameter is a big achievement, an actual departure from the 'neurobiology' or 'psychoacoustic' route when evaluating the merits of gear. The latter isn't useful, as far as I follow his argumentation. It is all too vague, too indirect. Not least, I personally feel a little embarrassed when others speak of me as an ear/brain apparatus.

When it comes to preference, though, it may change in different situations. Because of human nature I imagine certain filters involved: one for evaluating a single speaker while listening to a few seconds of Fast Car, the most revealing piece of contemporary music, another for listening for fun to Glenn Branca's Symphony No. 1 in stereo on my couch.
Floyd Toole's work is important because loudspeaker design had few, sporadic perceptual references before his highly comprehensive experiments and research into the field. In other words, he definitively introduced psychoacoustics into loudspeaker design, and his references were highly established findings from acoustics, audiology and other related fields.

A word on psychoacoustics. Why is that the main reference point and not, for example, the neurology of the auditory cortex? The main work of psychoacoustics is the development of listening tests and their interpretation. Neurology is the physical examination and testing of the brain and its processing and transformation of sensory signals. The reason has to do with the history and moral side of science. We cannot directly probe the brain, because even the thinnest electrode is destructive and coarse as it cuts through tissue to reach its target. This kind of direct intervention into the human body for experiments is cruel, and so for the most part what we know about hearing has been discovered through careful listening tests. I could go into gruesome detail here about what happens outside of listening tests, but it's not necessary to make the point.

These listening tests are a proxy for studying biology directly. Since that is off-limits, you can study human capabilities and limits instead. What that has led to is a clear set of design criteria for loudspeakers, because the way we judge what we hear is related to how we hear, i.e., the how is found in the underlying mechanisms and biology.
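As a small aside, the interpretation side of such listening tests is often elementary statistics. Here is a sketch of the usual significance computation for an ABX-style test; the 12-of-16 example is my own, not from the thread:

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value for an ABX listening test: the
    probability of getting at least `correct` hits out of `trials`
    by pure guessing (chance = 1/2 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# e.g. 12 correct out of 16 trials: p ~ 0.038, conventionally taken
# as evidence that the listener really heard a difference
print(abx_p_value(12, 16))
```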

However, it is simply ignorant to equate preference in ordinary circumstances with preference in blind circumstances. When hearing is isolated by design, and layers of cognitive context are stripped away through tight experimental controls, people tend to make the same judgments. That doesn't mean that the judgments are exactly the same. It means that people make judgments within limits, that those limits can be rigorously defined, and that those definitions, although clear, are flexible enough to allow a range of outcomes. Trends, correlations, and directions, not absolutes. But all of these are tightly related to biology.

There's a particularly blunt argument that has been suggested even in this thread, that your buying decisions should be dictated by science and nothing else. It's stupid because it misrepresents personal needs and the science. There is no best loudspeaker because science cannot answer that question yet. We don't know enough about the body or our own capabilities. What we have, from the science, is a set of guides that let us examine what a loudspeaker can do and come to reasonable judgments about how it will sound under a variety of circumstances.

In general, the work on electrical and digital systems has produced results which allow for, from a perceptual point of view, perfect signal delivery: vanishingly low distortion and extreme dynamic range, flat, unwavering frequency response and extended bandwidth. This perceptual point is taken even further by the excellent research underlying compression algorithms. A large part of their testing and design is accomplished in listening tests, and an equally important part builds on prior knowledge of the physical mechanisms of hearing. When I wrote above that we don't have perceptual models to probe the subtleties, codecs are a good example. The distortions they generate are highly complex, content-dependent and not meaningfully described by traditional single- or multitone tests. Some may think codecs aren't important to audiophiles because you can go and buy lossless media. That's a very limiting perspective, because new immersive and spatial technology requires a very high channel count to work. Lossless delivery and high channel counts massively increase bandwidth and storage requirements, so perceptual codec studies try to work out the most efficient means of perceptual rather than lossless preservation of the signal. It's simply unavoidable given how media is consumed and delivered these days. For it to be workable, we need better psychoacoustic models which precisely define what kind of error is acceptable before it becomes objectionable.

I can also discuss, if you like, audiology, hearing aids and cochlear implants. The driving force for noncommercial academic work is helping those with impaired hearing and understanding their experience. There are miles yet to go before damaged hearing can actually be restored. Tinnitus is also a major problem and we have no cure in sight. Like I said above, we simply don't know enough.

Where reproduction technology is showing the most important progress is with transducing systems: loudspeakers, headphones and microphones. This is because these systems have the greatest errors associated with them. In general, any time anything involved in reproduction "has a sound", that characteristic sound is error. It is not timbre. Timbre is an acoustic or musical feature of the source. Loudspeakers, electronics, digital and physical media are not sources. You can of course treat them that way, but there are consequences: primarily confusion and inaccessibility, the worry, on the artist's side, that the listener will simply miss certain elements of the music because their system cannot reproduce them adequately.

Using the ears alone, the perfect playback system should sound like it places a completely stable, coherent virtual sound source anywhere. We are nowhere near that at home. Paying top dollar for gear and speakers produces very marginal improvements at best. Even for the best of the best gear supported by measurements.

What we have right now is a number of reasonably competing, excellent loudspeaker designs prioritizing certain features over others. It is known what is always bad (resonances, distortion), and how to assess those problems (perceptual thresholds). This knowledge is fairly precise, but not yet comprehensive or complete.

Again, there is a very significant degree of flexibility in all of this. Artists know that their music sounds different everywhere. They accept that. Listeners accept that. It is even interesting to manipulate music to taste, song by song, to produce new and amazing artistic effects. That's a highly important motivator in sample-based electronic music. Audiophiles do the very, very slight version of that at home by switching out components or by EQ. No one is precluding that possibility. The only question is whether or not you are aware of what choices and tradeoffs you are really making.
 
Coming from a background in physical acoustics and engineering - though not specifically in audio - I’ve been wondering why high-fidelity audio still seems so focused on signal accuracy - "waveform fidelity," for lack of a better term. By signal, I mean what microphones capture and what gets delivered near the ears.

To be fair, modern high-fidelity systems already account for a lot of the basics: that we hear with two ears, the limits of human hearing in frequency and dynamic range, how we perceive loudness, acceptable levels of various distortions, and so on. Perceptual insights have clearly shaped things like lossy compression, spatial audio, and room acoustics. But beyond that, I haven’t seen much that really engages with the more subtle ways we actually perceive sound.

Take harmonics, for example. Real-world sounds aren’t just single tones - harmonics are a universal property of oscillations, and any hearing system that evolved to make sense of sound likely developed around this structure. From very early on, brains - ours and those of many other animals - are wired to hear harmonically related frequencies as one sound, or as belonging together. It plays a huge role in how we recognize voices, distinguish instruments, and perceive musical harmony. So I can’t help but wonder - could that be used more directly in how we design audio systems?

I’d be curious to learn if there are any interesting efforts along those lines, or what the main challenges are.

Thanks.
tldr the thread, original recording/signal is very important, but most crap sold as hifi....meh
 
Again, there is a very significant degree of flexibility in all of this. Artists know that their music sounds different everywhere. They accept that. Listeners accept that.
Between these two posts, you've exhausted the topic of this discussion, at least for me. My hat's off to you. Thank you!
Same from my side, although it got a bit lengthy. Regarding the preference scale, I maintain my criticism that people may adopt a different mindset when listening for assessment versus listening for recreation, the actual use of hifi speakers. The reported difference between mono and stereo assessment underlines it.

Anyway, there's a major gap to bridge between the theoretical ideal and practical use in actuality, starting from the particular recording down to where I take a seat or just laze around. I think I do not deserve Alex's fate (A Clockwork Orange) ;-)

Wilson Audio's take on fun:
 
1. Real engineers don’t need induction or deduction - they need production.
2. Science is the art of satisfying private curiosity on the taxpayer’s dime.
Engineers focus on technology.
The focus of science is new fundamental knowledge (theory).
 