This is going to seem a little trolly, because it's basically an existential criticism of this site. But I'm actually sympathetic to the goals of audio objectivism, and this post arises from my frustration with how it's so often carried out. In particular, one of the most perplexing things to me about audio objectivism, as embodied on this site especially, is that it doesn't even seem to be trying to achieve its stated goals.
Because the problem at hand is that sighted listening impressions are -- famously, notoriously -- unreliable. It's effectively impossible to remove expectation bias or to prevent non-auditory cues from strongly coloring audible impressions. Combine that with the weakness of long-term auditory memory, and it's clear that most subjective impressions/reviews aren't worth the photons they're displayed with. We need better!
And so we know how to do better: you do blind comparisons, and you find out which product is preferred when people don't know what they're listening to. This is how Floyd Toole and Sean Olive compared speakers at the NRCC and Harman; this is how Wine Spectator compares and scores wines. (As an aside, how embarrassing is it that wine magazines have vastly better methodological rigor than audiophile magazines!)
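For concreteness, here's a minimal sketch of how one such blind comparison gets scored. The trial counts are invented, and `binomial_p_value` is a hypothetical helper name, not anyone's published protocol:

```python
from math import comb

# Hypothetical tally from a blind A/B preference test: across 20 trials,
# listeners preferred product A 15 times. Could that just be coin-flipping?
# A one-sided binomial test against p = 0.5 answers the question.
def binomial_p_value(preferences_for_a: int, trials: int) -> float:
    """Probability of seeing at least this many A-preferences by pure chance."""
    return sum(comb(trials, k) for k in range(preferences_for_a, trials + 1)) / 2 ** trials

print(f"p = {binomial_p_value(15, 20):.4f}")  # ~0.0207: unlikely to be guessing at the usual 0.05 level
```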
If you want to take it a step further, you can try to identify patterns in what the blind comparisons reveal (as Toole and Olive did with the "spinorama" measurements), and then do further experiments to see how strongly those measurements correlate with blind preference -- keeping in mind that the blind preference is the ultimate arbiter, and the measurements are the hypothesis being tested.
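That correlation check is nothing exotic, either. A toy sketch with invented numbers -- the scores and ratings below are placeholders, not real Harman data:

```python
import statistics

# Hypothetical data for five speakers: a score predicted from measurements
# (say, a flatness metric derived from spinorama curves) next to the mean
# preference rating each speaker earned in blind listening tests.
predicted_score  = [8.1, 6.4, 7.2, 5.0, 7.9]
blind_preference = [7.8, 6.0, 6.9, 5.5, 8.2]

# Pearson correlation (statistics.correlation needs Python 3.10+). A strong r
# supports the measurement as a predictor of preference; a weak r falsifies it.
r = statistics.correlation(predicted_score, blind_preference)
print(f"r = {r:.2f}")
```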
But that's not what this site's flavor of objectivism does. This site doesn't do any blind listening comparisons at all. It appears to just take as a given, for no obvious reason, that the set of measurements Amir performs is the set of measurements that would correlate with blind listening preference. This seems like a strange assumption, particularly because those in the subjectivist camp already know about these measurements, and believe both that they are not correlated with audible performance at the levels generally measured here, and that there are other factors in play that matter.
What's worse is that, even if you believe this set of measurements is what's important, it's not clear that any of the measured differences (except for the most broken units) rise to any level of significance whatsoever. Take the "SINAD" tests for DACs, for instance. Amir is savage about the Schiit Modi Multibit because it measures worse than other products: due to a second harmonic at -80dB, and higher-order harmonics and other noise at -100dB, its SINAD measurement gets a "red" rating, marking it as obviously inferior to other, better-measuring products.
But why? The AES has published actual listening tests that experimentally establish limits to the audibility of THD, and those limits are much, much higher than the levels we're dealing with here, particularly for second-order harmonic distortion. If your hypothesis is that measurements like SINAD tell the full story of listening quality, then based on what we know about the perceptibility of distortion, your conclusion has to be that any product with measurements like the Modi Multibit is audibly perfect, and all this green/yellow/red stuff is marketing fluff that conveys no useful information.
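To put those harmonic levels in perspective, converting dB (relative to the fundamental) into percent distortion is one line of standard arithmetic. A quick sketch using the figures quoted above:

```python
# percent = 10^(dB / 20) * 100, the standard amplitude-ratio conversion.
def db_to_percent(db: float) -> float:
    return 10 ** (db / 20) * 100

print(db_to_percent(-80))   # 0.01%  -- the Modi Multibit's second harmonic
print(db_to_percent(-100))  # 0.001% -- its higher-order products and noise
```

Distortion in the hundredths-of-a-percent range is exactly the territory that the listening-test literature places well below audibility.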
So it seems to me that there are two possible scenarios in play:
1. There are audible differences between (non-broken) DACs and amps, such that you can tell the difference between a Modi Multibit and a Topping DX70 by listening. If that's the case, these differences aren't found in the numbers this site is collecting and publishing, so wouldn't it make more sense to go back to blind testing to collect more data about where listener preferences genuinely lie, and then try to identify a new hypothesis about which numbers do matter for that?
2. There are no audible differences between (non-broken) DACs and amps, such that you could buy basically anything above a certain quality level, and it'd be audibly indistinguishable from anything else. In that case, these measurements should just be pass/fail, with no reason to give any more detailed breakdown (as there's no benefit to over-engineering inaudible "improvements"), and recommendations should be based on price, build quality, and ergonomics; a toy version of that grading is sketched after this list. But rather than assuming this, it seems like you'd want to prove it first, by actually doing those blind preference tests, like Toole did with speakers.
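Here's what scenario 2's pass/fail grading might look like in miniature. The threshold values are placeholders I made up; under this scenario, they're exactly what the blind tests would need to establish first:

```python
# Placeholder audibility thresholds -- the point of the post is that these
# numbers should come from blind listening tests, not from assumption.
AUDIBILITY_THRESHOLDS = {
    "thd_percent": 0.1,       # hypothetical figure, not an established limit
    "noise_floor_db": -90.0,  # hypothetical figure
}

def passes(measurements: dict[str, float]) -> bool:
    """Pass/fail: every metric must clear its audibility threshold."""
    return (measurements["thd_percent"] <= AUDIBILITY_THRESHOLDS["thd_percent"]
            and measurements["noise_floor_db"] <= AUDIBILITY_THRESHOLDS["noise_floor_db"])

print(passes({"thd_percent": 0.01, "noise_floor_db": -100.0}))  # True: audibly transparent
```

Past that gate, there's nothing left to rank: everything that passes gets compared on price, build, and ergonomics.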
Either way, it's hard for me to see the value in taking well-understood measurements and then grading equipment on how well it performs on those metrics at solidly inaudible levels. That has the appearance of science, but not the substance. It's not even rewarding good engineering, because good engineers don't gold-plate irrelevant metrics; they focus on what actually matters.
You've probably heard the old saw about the man looking for his keys in the parking lot. A good Samaritan comes over to help him and, after some fruitless minutes, asks whether the man is sure he lost the keys here. "Oh no," the man says, "I lost them in the bushes, but I'm looking here because that's where the light is."
Audio objectivists have spent too long looking where the light is; it's time to start looking in the bushes.