I’m going to assume you’re not intentionally trolling and provide an earnest response. When implementing a playback system, the recorded material is the only reasonable candidate for the reference. It’s the only element that is distributed identically to all systems that will play back that specific version of a song (or movie, whatever), and therefore CAN be referred to. What happened prior to that is virtually unknowable and certainly not reproducible, so it cannot be the “reference”.
In the course of discussion, Serge has rejected any number of conventional uses of terminology, has flat-out contradicted established mathematics, and has then tried to equivocate his way around some of it, while remaining attached to confusing, unusual terminology. So this discussion is a bit hard to sort out, issue by issue. First, there IS language defined here, and communication works better when people actually use the language as intended, rather than playing some ad populum game of "I want this to be distortion". The distinctions between linear, nonlinear, and random processes are well established mathematically, and separating them out WHEN CALCULATING THE ERROR SIGNAL is actually kind of key to getting a meaningful measurement. What's more, his methods have no way of determining whether a signal may rise above the masking level, or above absolute threshold (or the room noise floor), without consideration of spectrum. Yeah, in an average listening room, if all error is down 110dB you're probably very safe, but that's not a very useful number, and it still only applies to signals in the electrical domain, and only if carefully applied.
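To make the "spectrum matters" point concrete, here is a minimal Python sketch. The threshold numbers are invented placeholders, not real hearing data: the point is only that two error spectra whose wideband totals agree to well under 1 dB can get opposite audibility verdicts once you compare them band by band against a threshold curve.

```python
import math

# Hypothetical per-octave-band threshold curve in dB (invented numbers for
# illustration only; NOT real hearing data).
threshold_db = {125: 20, 250: 12, 500: 6, 1000: 2, 2000: -2, 4000: 0, 8000: 8}

def audible_bands(band_error_db, threshold_db):
    """Bands where the error level pokes above the threshold curve."""
    return [f for f, lvl in band_error_db.items() if lvl > threshold_db[f]]

def total_power_db(band_db):
    """Single wideband figure: total power summed across bands, in dB."""
    return 10 * math.log10(sum(10 ** (lvl / 10) for lvl in band_db.values()))

# Two error spectra whose wideband totals agree to well under 1 dB:
flat_error = {f: -10 for f in threshold_db}                          # spread thin
peaky_error = {f: (-1 if f == 2000 else -40) for f in threshold_db}  # one peak

# flat_error stays under the curve in every band; peaky_error is audible at
# 2 kHz, so the single wideband number tells you almost nothing by itself.
```

Same idea scales to real critical-band analysis and masking curves; the toy version just shows why one "error is X dB down" number, without spectrum, is not a useful answer.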
For one of the reasons that linear processing (which does not create new frequencies) and nonlinear processing (which does) are so very, very different TO HUMAN HEARING, please go here:
http://www.aes-media.org/sections/pnw/pnwrecaps/2019/apr2019/ Distinctions between linear and nonlinear are key.
BUT
It's kind of novel for me to get accused of being a subjective troll. Maybe you should check into my position on audiophile BS sometime.
It is generally (obviously not always) possible to provide better envelopment, etc., in most any live recording, or in many produced recordings, and create something more like the original sensation. Now, you can't ever do that with loudspeakers without 5 channels (and you still have to do the production right, which is kind of rare), and you're better off with 7, but you can do quite a bit in 2 channels, either over headphones or speakers, to add some good spatial sensation that was lacking. Now, you're creating a sensation, NOT an exact soundfield duplicate, but we'll go there in a minute. Of course, this presumes there was an original acoustic, which is often not what one has. For some things, "original" can only mean "final mix", of course.
BUT when you argue "virtually unknowable", and appeal to ignorance, you're dead wrong. I can measure a lot more than you appear to imagine.
First, at any given spot in a space, one point is easily measured with all 4 soundfield variables (dx, dy, dz, p). Obviously, if you have more such devices (soundfield microphones), you can monitor more than one spot in space (and you should, even though most people don't). So now we have a STANDARD, an analytic standard, a testable, falsifiable STANDARD for what was going on at, oh, let's say, 2 spots in a room, separated by, say, 6" or so. (Yes, that distance is not pulled out of my hat; measure your head.)
So, when you deliver sound, do you deliver the same 8 variables to the spot in the room where the listener's ear canal openings are? Good question.
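As a sketch of what that analytic standard looks like as data (all names and numbers here are hypothetical, just illustrating the bookkeeping): four soundfield variables at each of two ear-spaced points gives 8 numbers, and the "did you deliver the reference?" question becomes a per-variable comparison.

```python
from dataclasses import dataclass

# One snapshot of the acoustic state at a single point: pressure p plus the
# three directional components (labelled dx, dy, dz here, following the post).
@dataclass
class FieldPoint:
    p: float
    dx: float
    dy: float
    dz: float

def field_error(ref, delivered):
    """Per-variable error across both measurement points (8 numbers total)."""
    errs = []
    for r, d in zip(ref, delivered):
        errs += [d.p - r.p, d.dx - r.dx, d.dy - r.dy, d.dz - r.dz]
    return errs

# Reference measured with two soundfield mics ~6 inches apart (ear spacing);
# the values are made up for illustration.
ref = [FieldPoint(1.0, 0.2, 0.0, 0.1), FieldPoint(0.9, 0.1, 0.05, 0.1)]
delivered = [FieldPoint(0.95, 0.2, 0.0, 0.1), FieldPoint(0.9, 0.1, 0.05, 0.1)]
```

The point is only that this reference is testable and falsifiable: a perfect delivery gives 8 zeros, and anything else gives a measurable, signed deviation per variable.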
Um, no, you don't, generally. Furthermore, you generally don't get anything like that sensation, either. Fortunately, you can create a good sensation without all of that information, when you consider how human hearing actually works. Which is another reason that the error signal alone does not tell you what you need to know: it cannot separate the undetectable (in human terms) from the rather blatantly awful.
You can improve the sensation with processing, and yes, you can MEASURE your degree of success. You can do it in any number of ways, for instance by asking listeners (in blind tests, of course) to localize where things are. You can even measure the 4D soundfield outside their ears (that way they get to use their own HRTFs and such, which is the best way to go), but I will say that's a pain in the butt.
Yeah, it's a pain in the butt. Believe it. Every bit of equipment is fighting you when you do that.
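For the blind localization test, the scoring itself is simple. A minimal sketch with hypothetical trial data, assuming azimuth-only localization with wraparound at 360°:

```python
import math

def angular_error_deg(true_az, reported_az):
    """Smallest absolute difference between two azimuths, in degrees,
    accounting for the 360-degree wraparound (350 vs 10 is 20, not 340)."""
    d = abs(true_az - reported_az) % 360.0
    return min(d, 360.0 - d)

def mean_localization_error(trials):
    """trials: list of (true_azimuth, reported_azimuth) pairs from a blind
    test. Returns mean absolute angular error in degrees."""
    return sum(angular_error_deg(t, r) for t, r in trials) / len(trials)

# Made-up responses from three trials (true position, listener's report):
trials = [(30, 35), (350, 10), (180, 170)]
```

Run two processing conditions through the same blind procedure and compare their mean errors: that is an objective, repeatable number for "degree of success", even though the raw data came from listeners.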
This sounds like something we see in the Super Best Audio Friends forum. They stick with subjective methods, claiming that it is not possible to establish a reference.
You, as well, may note that I am proposing nothing "subjective". You will also note that I have pointed to one analytic reference just above.
I will point out to both of you that "subjective" is a very loose word to use, as well. Perception is subjective, but it is not a completely mysterious process. For example, how to create envelopment, directional sensation, distance sensation, etc., is understood in the perceptual space (and can be shown in terms of analytic signals at the ear, although the audio industry is, as usual, years behind, and even more so in the "high end" that continues to flail about with untestable premises), and yes, it is possible to work with a recording and figure out some of what will be missing in the perceived signal. If you HAVE more of the information, you can do much, much better in terms of sensation as well as analytic measurement. More information is always useful in the modern case, where one can make wavefield measurements, etc.; however, the trick is to learn WHAT PART of all those measurements is the important data. Yeah, that's what I do for a living. No, it's not something I can summarize in a paragraph.
Which is my point to Serge, frankly. His choice of "better" is rather, shall we say, dogmatic. Yeah, if we're talking about an amplifier, or electronics in general, or recording playback systems (not capture!), an error signal is a very reasonable standard. But when we're talking about microphones, speakers, or headphones, sorry, no, even defining the error signal is difficult. So when we take an electronic signal, MEASURE aspects of it, and modify it according to established understanding of human perception, by his measure that's "distortion". For a simple example, consider room EQ. If you use room EQ (or speaker EQ, a much smarter idea in general), you are, according to the difference signal, increasing the error, which Serge persists in calling distortion. (You're also doing this with linear processing, and the fact that linear processing, as opposed to nonlinear, loses no information is key here, too.)
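A toy demonstration of that last parenthetical, using a simple one-pole filter standing in for "EQ" (nobody's actual room correction, just the smallest linear filter that makes the point): the filtered output shows a large raw difference from the input, yet the exact algebraic inverse recovers the input essentially perfectly, because linear processing loses no information.

```python
import math

def lowpass(x, a=0.3):
    """One-pole smoothing filter: a crude but perfectly linear 'EQ'."""
    y, prev = [], 0.0
    for s in x:
        prev = a * s + (1 - a) * prev
        y.append(prev)
    return y

def inverse_lowpass(y, a=0.3):
    """Exact algebraic inverse of lowpass(): linear processing is reversible."""
    x, prev = [], 0.0
    for s in y:
        x.append((s - (1 - a) * prev) / a)
        prev = s
    return x

def rms(v):
    return math.sqrt(sum(s * s for s in v) / len(v))

x = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]  # test tone
y = lowpass(x)

# The raw difference signal is large: by the "difference = distortion"
# argument, this EQ is a big "error"...
err = rms([yi - xi for yi, xi in zip(y, x)])

# ...yet inverting it recovers the input to numerical precision. Nothing
# was destroyed; a clipper or other nonlinearity could never do this.
recovered = inverse_lowpass(y)
```

That is the whole distinction: a large difference signal from a linear process coexists with zero information loss, which is exactly why lumping it in with nonlinear distortion produces a meaningless measurement.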
But more is doable, and it's absolutely not trolling.
Serge is working on something that's fine for simple errors, be they linear processing, distortions, or noise mechanisms. BUT
1) Distortion refers to nonlinear processing
2) Distortion does not refer to linear processing (and linear processing is reversible down to the noise floor, despite his denial above, which he then excused by equivocating that 'nobody does that', even though ADCs, DACs, tape systems, and LP systems all do it automatically as part of their normal function).
3) Noise can take many forms, including signal-modulated or mediated noise. It's an interesting point, but it's NOISE.
All of those contribute to the error signal.
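A minimal sketch of that separation, on a made-up memoryless device (not anyone's actual measurement method): fit the best linear gain to the input/output pair, and what is left over is the nonlinear products plus noise, which here is far smaller than the raw error signal that lumps all three mechanisms together.

```python
import math, random

random.seed(0)

def rms(v):
    return math.sqrt(sum(s * s for s in v) / len(v))

# Test tone through a toy device: linear gain, a touch of cubic
# nonlinearity, and a little additive noise (all values invented).
x = [math.sin(2 * math.pi * 0.01 * n) for n in range(1000)]
y = [0.9 * s + 0.05 * s ** 3 + random.gauss(0, 0.001) for s in x]

# Best memoryless linear fit (least squares): the LINEAR part of the behavior.
g = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Residual after removing the linear part: nonlinear products plus noise.
residual = [b - g * a for a, b in zip(x, y)]

# Raw error vs the input: mixes linear deviation, nonlinearity, and noise
# into one number, which is exactly what makes it uninformative on its own.
total_error = [b - a for a, b in zip(x, y)]
```

The raw difference is dominated by the (benign, reversible) gain deviation; only after separating out the linear part does the residual tell you how much actual nonlinear product and noise the device generates.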
THAT is where this argument started, and yet Serge refuses to use standard terms, and would rather equivocate than communicate clearly.
And now you guys are calling me an audiophile. You probably don't even want to know how deeply, horribly professionally insulting you're being when you say that, but you are.