You are conflating two entirely different things.
"How good" (well) a speaker happens to be depends on how well it works as a transducer, converting electrical signals into sound waves. In this sense, the loudspeaker is a machine, and it has a job to do. Tests and measurements can give us a clear, objective picture of how well it does its job, just as tests and measurements tell us how well a car engine, an electrical generator or a heart monitor performs.
I think that one of the overlooked aspects of tests and measurements is that they weed out the under-performing examples, the stuff with inappropriate characteristics or the stuff that's just pure junk. This is my personal first-step use for tests and measurements. Some people use tests and measurements to identify good, better and best, as in a "race to the top", so to speak, but I think that's a second-step aspect ... it comes into play after you've separated the field into the thumbs-up and thumbs-down lists.
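To make the two steps concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the speaker names, the two measurement fields (`thd_percent`, `response_deviation_db`) and the cut-off values are assumptions of mine, not real data and not any published standard.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    name: str
    thd_percent: float            # hypothetical: total harmonic distortion, in percent
    response_deviation_db: float  # hypothetical: worst deviation from flat response, in dB


# Hypothetical cut-offs, chosen only for this example.
MAX_THD_PERCENT = 1.0
MAX_DEVIATION_DB = 3.0


def screen(units):
    """Step one: split the field into thumbs-up and thumbs-down lists."""
    passed, failed = [], []
    for u in units:
        ok = (u.thd_percent <= MAX_THD_PERCENT
              and u.response_deviation_db <= MAX_DEVIATION_DB)
        (passed if ok else failed).append(u)
    return passed, failed


def rank(passed):
    """Step two: order the survivors, smallest deviation first."""
    return sorted(passed, key=lambda u: (u.response_deviation_db, u.thd_percent))


speakers = [
    Measurement("Speaker A", 0.3, 1.5),
    Measurement("Speaker B", 4.0, 2.0),
    Measurement("Speaker C", 0.5, 6.0),
    Measurement("Speaker D", 0.2, 2.5),
]

thumbs_up, thumbs_down = screen(speakers)
ranking = rank(thumbs_up)

print("Thumbs up, best first:", [u.name for u in ranking])
print("Thumbs down:", [u.name for u in thumbs_down])
```

The point of the sketch is only the order of operations: ranking is applied to what survives the screen, never to the whole field.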
However ... this is impersonal, unemotional work using calibrated instruments to define electrical and mechanical characteristics. It has nothing to do with human emotion, and nothing to do with personal preference. The phrase you used ("personal enjoyment") is emotional and subjective, and not at all scientific.
Some people enjoy booming bass, or a smiley-face response curve, or high distortion.
There is no gainsaying personal preference. You like what you like. But that doesn't mean that you are hearing the closest replication of the recording. A personal preference for highly affected playback can mask information on the recording, and that information is then lost to you.
Not only that, but manufacturers cannot predict what your particular personal preferences are. Accuracy is one clear goal. Tests and measurements will give a clear view into the degree to which any audio gear - electronic or transducer - hits or misses the mark.
Designing and manufacturing for affected characteristics is a swamp, a morass of almost infinite deviations from the standard. It increases costs enormously, and it also makes identification of individual characteristics (and the degree of their deviation) a time-consuming and difficult process. IMO that's not good ... not good at all.
This is not to mention the fact that personal preference is subjective, and subjective sensory judgement is not only biased, but changes with time and surroundings. Tests and measurements do not; they are reproducible. Reproducibility is one of the primary aspects of the Scientific Method.
There are two ways to use tests and measurements. One is to accept accurate playback of the recording as your goal. The other is to learn, by dint of hard work and long-time experience, the exact nature of your desired deviations from standard. This is called correlation. It means that you have to become familiar with the concept of accuracy and the particular nature of your preferences. It's doable, but takes a lot of work.
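A rough way to picture the two approaches, sketched in Python. All of the numbers are invented for illustration, the "preferred" curve is a made-up smiley-face target, and RMS deviation is just one simple way I've assumed for scoring how far a measured response sits from a target.

```python
# Hypothetical frequency bands (Hz); levels are in dB relative to 1 kHz.
bands_hz = [50, 200, 1000, 5000, 12000]

flat_target_db = [0.0, 0.0, 0.0, 0.0, 0.0]       # accuracy as the goal
preferred_target_db = [4.0, 1.0, 0.0, 0.0, 2.0]  # made-up personal preference
measured_db = [3.0, 1.0, 0.0, -1.0, 2.0]         # made-up measurement of one speaker


def rms_deviation(measured, target):
    """Root-mean-square difference between a measurement and a target curve, in dB."""
    diffs = [m - t for m, t in zip(measured, target)]
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5


dev_flat = rms_deviation(measured_db, flat_target_db)
dev_pref = rms_deviation(measured_db, preferred_target_db)

print(f"Deviation from flat target:      {dev_flat:.2f} dB")
print(f"Deviation from preferred target: {dev_pref:.2f} dB")
```

The flat target needs no extra work from the listener. The preferred target is the part that takes the hard work: you have to pin down those numbers for yourself before a measurement can tell you anything about your preference.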
Personally, I take accuracy as my goal because I'm lazy. I don't want to work any more than I have to.

