Artificial playback requires the integration of the listener into a technical system - meaning the stereo triangle, not moving a bit and so on. Personally, I find no delight in that at all. It reminds me of Alex during the aversion therapy session in the famous movie A Clockwork Orange. Wilson Audio builds speakers that look exactly like that - not beautiful. What is the motivation for using audio in the first place? That would be the real question for psychoacoustics. Is perfection necessary to achieve those purposes? Or is it a silly end in itself?
I have a complex reply. Allow me to meander.
Comparing the stereo sweet spot to the Clockwork Orange scene is hilarious, and very apt! Stereophonic sound is inherently unnatural. There is no natural circumstance in which an identical sound arrives at you from multiple directions at once. And yet that is exactly what two-channel and the various multichannel formats do (I hope it's well known that stereo is a specific technique, not some particular number of channels). Phantom sources depend on exactly this artificial manipulation of timing and level differences between channels.
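To make the phantom-source point concrete, here is a minimal sketch of a constant-power pan law, the textbook way a pure level difference between two channels places a phantom image between the speakers (the function name and parameters are illustrative, not any product's implementation):

```python
import math

def constant_power_pan(position):
    """Constant-power (sine/cosine) pan law.

    position: -1.0 (hard left) .. 0.0 (center) .. +1.0 (hard right).
    Returns (left_gain, right_gain) with L^2 + R^2 == 1, so perceived
    loudness stays roughly constant as the phantom source moves.
    """
    angle = (position + 1.0) * math.pi / 4.0  # map [-1, 1] -> [0, pi/2]
    return math.cos(angle), math.sin(angle)

# A centered phantom source: identical level in both channels,
# each attenuated by about 3 dB relative to full scale.
left, right = constant_power_pan(0.0)
print(round(20 * math.log10(left), 1))  # -3.0
```

Two identical signals at equal level produce a center image; shifting `position` trades level between the channels and the image follows, which is all "stereo" means at the level of mechanism.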
The other side of this is that stereo's requirements have become fairly flexible as a consequence of loudspeaker improvements. Early loudspeakers, and even some loudspeakers today, have radiation patterns that force a listener to keep their head in a particular spot for optimal direct sound. Current designs, which aim for consistent on- and off-axis response both horizontally and vertically, have reduced the need to fuss over loudspeaker distances, angles and strict listening positions. The physically correct listening position is still centered and symmetrical, but there is less need to be so exact, particularly now that measurement-based EQ correction is so common.
But the other, more pressing thing is that you don't seem to understand what psychoacoustics is, or you are being excessively flippant.
What is the motivation for using audio in the first place? That would be the real question for psychoacoustics. Is perfection necessary to achieve those purposes? Or is it a silly end in itself?
As I said many times before, listening to audio is first and foremost about personal involvement, not neurobiology.
Psychoacoustics is the study of the subjective experience of sound. It is tightly related to our biology, since the physical structures of our body set the limits and abilities of our hearing, and to the physical world, where the circumstances that cause the formation of sound are studied in acoustics.
Toole's (Floyd's?) finding to use preference as a quality parameter is a big achievement in actually departing from the 'neurobiology' or 'psychoacoustic' route when evaluating the merits of gear. The latter isn't useful, as far as I follow his argumentation. It is all too vague, indirect. Not least, I personally feel a little embarrassed when others speak of me as an ear/brain apparatus.
Preference, though, may change in different situations. Given human nature, I imagine certain filters are involved: one for evaluating a single speaker while listening to a few seconds of Fast Car, supposedly the most revealing piece of contemporary music, and another for listening for fun to Glenn Branca's Symphony No. 1 in stereo on my couch.
Floyd Toole's work is important because loudspeaker design had few, sporadic perceptual references before his highly comprehensive experiments and research into the field. In other words, he definitively introduced psychoacoustics into loudspeaker design, and his references were highly established findings from acoustics, audiology and other related fields.
A word on psychoacoustics. Why is that the main reference point and not, for example, the neurology of the auditory cortex? The main work of psychoacoustics is the development of listening tests and their interpretation. Neurology is the physical examination and testing of the brain and its processing and transformation of sensory signals. The reason has to do with the history and the moral side of science. We cannot directly probe the brain, because even the thinnest electrode is destructive and coarse as it cuts through tissue to reach its target. This kind of direct intervention into the human body for the sake of experiments is cruel, and so most of what we know about hearing has been discovered through careful listening tests. I could go into gruesome detail here about what happens outside of listening tests, but it's not necessary to make the point.
These listening tests are a proxy for studying the biology directly. Since that is off-limits, you study human capabilities and limits instead. What that has led to is a clear set of design criteria for loudspeakers, because the way we judge what we hear is tied to how we hear: the how is found in the underlying mechanisms and biology.
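As a sketch of what "the development of listening tests" means in practice, here is a toy one-up/two-down adaptive staircase, a classic psychoacoustic procedure for measuring a perceptual threshold (the simulated listener and all names here are hypothetical, for illustration only):

```python
import random

def staircase_threshold(detects, start=40.0, step=4.0, reversals_needed=8):
    """One-up/two-down adaptive staircase: the level drops after two
    consecutive correct responses and rises after each miss, so the
    track converges on the 70.7%-correct point of the listener.

    `detects(level)` stands in for the listener: True if the stimulus
    at `level` was detected.
    """
    level, correct_run, reversals, direction = start, 0, [], None
    while len(reversals) < reversals_needed:
        if detects(level):
            correct_run += 1
            if correct_run == 2:        # two correct in a row -> make it harder
                correct_run = 0
                if direction == "up":
                    reversals.append(level)
                direction = "down"
                level -= step
        else:                           # one miss -> make it easier
            correct_run = 0
            if direction == "down":
                reversals.append(level)
            direction = "up"
            level += step
    # Threshold estimate: average level at the reversal points.
    return sum(reversals) / len(reversals)

# Simulated listener with a true threshold near 20 (arbitrary units).
random.seed(1)
listener = lambda level: level > 20 + random.gauss(0, 2)
print(staircase_threshold(listener))
```

The point is methodological: no electrode ever touches the listener, yet repeated tracks like this pin down a limit of hearing with well-defined statistical meaning.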
However, it is simply ignorant to equate preference in ordinary circumstances with preference in blind circumstances. When hearing is isolated by design, and layers of cognitive context are stripped away through tight experimental controls, people tend to make the same judgments. That doesn't mean the judgments are exactly identical. It means that people make judgments within limits, that those limits can be rigorously defined, and that those definitions, although clear, are flexible enough to allow a range of outcomes. Trends, correlations and directions, not absolutes. But all of these are tightly related to biology.
There's a particularly blunt argument that has been suggested even in this thread: that your buying decisions should be dictated by science and nothing else. It's stupid because it misrepresents both personal needs and the science. There is no best loudspeaker, because science cannot answer that question yet. We don't know enough about the body or our own capabilities. What we have, from the science, is a set of guides that let us examine what a loudspeaker can do and come to reasonable judgments about how it will sound under a variety of circumstances.
In general, the work on electrical and digital systems has produced results which allow for, from a perceptual point of view, perfect signal delivery. Vanishingly low distortion and extreme dynamic range. Flat, unwavering frequency response and extended bandwidth. This perceptual point is taken even further by the excellent research underlying compression algorithms. A large part of their testing and design is accomplished through listening tests, and an equally important part rests on prior knowledge of the physical mechanisms of hearing. When I wrote above that we don't have perceptual models to probe the subtleties, codecs are a good example. The distortions they generate are highly complex, content-dependent and not meaningfully described by traditional single- or multitone tests. Some may think codecs aren't important to audiophiles because you can go and buy lossless media. That's a very limiting perspective, because new immersive and spatial technology requires a high channel count to work. Lossless delivery at high channel counts massively increases bandwidth and storage requirements. So perceptual codec studies try to work out the most efficient means of perceptual, rather than lossless, preservation of the signal. It's simply unavoidable given how media is consumed and delivered these days. For it to be workable, we need better psychoacoustic models which precisely define what kind of error is acceptable and at what point it becomes objectionable.
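To put rough numbers on the bandwidth argument, here is the raw PCM arithmetic (typical sample rate and bit depth, not a claim about any particular delivery format):

```python
def lossless_bitrate_kbps(channels, sample_rate=48_000, bit_depth=24):
    """Raw uncompressed PCM bitrate in kilobits per second.

    Bitrate scales linearly with channel count, which is why immersive
    formats lean so heavily on perceptual coding.
    """
    return channels * sample_rate * bit_depth / 1000

print(lossless_bitrate_kbps(2))    # stereo: 2304.0 kbps
print(lossless_bitrate_kbps(12))  # e.g. a 7.1.4 immersive bed: 13824.0 kbps
```

Nearly 14 Mbps uncompressed for a twelve-channel bed, versus the few hundred kbps typical of perceptually coded streams, is the gap those psychoacoustic models have to close.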
I can also discuss, if you like, audiology, hearing aids and cochlear implants. The driving force for noncommercial academic work is helping those with impaired hearing and understanding their experience. There are miles yet to go before damaged hearing can actually be restored. Tinnitus is also a major problem and we have no cure in sight. Like I said above, we simply don't know enough.
Where reproduction technology is showing the most important progress is in transducing systems: loudspeakers, headphones and microphones. This is because these systems carry the greatest errors. In general, any time anything involved in reproduction "has a sound", that characteristic sound is error. It is not timbre. Timbre is an acoustic or musical feature of the source. Loudspeakers, electronics, digital and physical media are not sources. You can of course treat them that way, but there are consequences. The primary consequence is confusion and inaccessibility: the worry, on the artist's side, that the listener will simply miss certain elements of the music because their system cannot reproduce them adequately.
Judged by the ears alone, a perfect playback system would place a completely stable, coherent virtual sound source anywhere in space. We are nowhere near that at home. Paying top dollar for gear and speakers produces marginal improvements at best, even for the best of the best supported by measurements.
What we have right now is a number of reasonably competitive, excellent loudspeaker designs that prioritize certain features over others. It is known what is always bad (resonances, distortion) and how to assess those problems (perceptual thresholds). This knowledge is fairly precise, but not yet comprehensive or complete.
Again, there is a very significant degree of flexibility in all of this. Artists know that their music sounds different everywhere. They accept that. Listeners accept that. It is even interesting to manipulate music to taste, song by song, to produce new and amazing artistic effects. That's an important motivator in sample-based electronic music. Audiophiles do a very, very mild version of that at home by swapping components or applying EQ. No one is precluding that possibility. The only question is whether you are aware of the choices and tradeoffs you are really making.