I think the main difference is that studio people, as part of one phase of their job, directly compare live sound and loudspeaker sound dozens or hundreds of times a day. Most - not all - audiophiles listen in a bubble with no real interest in live sound.
Ironically, in my experience this makes studio people less interested in loudspeaker choice, not more. You learn fast that no loudspeaker is remotely capable of reproducing the real thing. All you need is a tool with enough clarity to let you hear the reductive job you're doing. Everyone has half a dozen types they're happy with. Most people have personal favorites and comfort choices, but very few have dealbreakers.
I do doubt this part. As I understand it, the recording/mixing guy in the mixing room can only listen to loudspeaker sound that has come through the mic -> speaker chain, while only the artist in the recording room hears the live sound. I thought their experience is in tuning the sound to be "good" on playback through the systems they use, or keeping in mind what would sound good on earbuds, car speakers, and BT speakers without messing up the mix on higher-res gear. If I were one of the people in the production chain, I wouldn't care too much about, say, off-axis response, or even whether the speaker is "neutral" or "honest", only that it is "good enough", since I'd need a reference that lets my mix still sound OK or good on the average Joe's speakers. So anything not limited in SPL and reasonably neutral could get the job done. After all, audiophiles with systems and rooms costing hundreds of thousands of dollars are the <0.01%.
IME hobbyists usually chase perfection in their gear far more than pros do. Only geeks seek out the latest computers/cameras every generation. Pros choose the best or most adequate setup they can, put big money into it, and keep doing business until the gear breaks or falls too far behind to get the job done right, then go for the next upgrade. Pros usually choose on reliability, instant replacement service within warranty, and familiarity of use over absolute quality. Take photography, which was once my part-time job: I always stuck with the same brand of camera so I didn't need to re-adapt to the controls with every new camera, even if my usual brand was half a generation or even a full generation behind in technical perfection. I think the same goes for studios. If you have adapted to some good-enough but maybe lesser speakers, and you already know how they translate to the layman's system, you won't want their balance to change, even if the change makes them more transparent to the source or even to the live performance, because the 99% of speakers out in the wild are what feed you.