My main reasoning though is to do with the tonality. For whatever reason they just don't sound right to me, despite measuring well. This is one of those "headphones behave differently on different heads" things most likely, and I've verified this with other listeners on multiple units as well.
Edit: so what I'm saying here is that the issue could just be down to how it is on MY head, or others who hear it the way I do, but maybe the majority still hear it closer to the graph.
The thing is, this stuff is all about confidence - confidence that a thing is likely to be right for you before buying it. This stuff is super expensive, so I get wanting to anchor that confidence to the measurement. But those of us who spend the majority of our time scrutinizing these products tend to find that the measurement sometimes doesn't line up with our experience. I'm not saying don't get your confidence from graphs, merely that there may be more to the end experience for individuals than what you see there.
For the subjective stuff, I completely sympathize with some of the comments here - there's no concrete definition of 'good' that we can all agree on, or even agreement on when a subjective quality is merely something awaiting a more complete analysis of FR. And in that sense, subjective reports are far worse for providing that confidence. But with that said, there is a kind of rhetoric that a number of us use in the same way. So for example, when Crinacle or Android describe something, I know exactly what they're talking about, because I've built up a familiarity with how they use those terms. So while it's still worse, and less straightforward than a graph, you can still get something useful from subjective reports as well, in my opinion.