For me, it's only natural to want my speakers to play back music as accurately as possible. I also want my voice to sound as close to talking to me IRL as possible. I definitely don't want my voice to be deeper because it makes me sound 'professional' or something. I don't want to sound better than I actually sound, I want to sound like how I actually sound.
I understand exactly what you mean. For me the primary reason for seeking neutrality is simplicity and I would like to apply the same pragmatics to production. But I've accepted that it doesn't really work, not without investments and inconveniences (iso booths, calibrated mics, etc.) that I'm unwilling to make and that won't necessarily serve my ultimate purposes any better. So I'm resigned to the standard engineering practice of scientifically-informed trial-and-error.
If what you want is a mic to use for your spoken voice, a headset mic is worth considering. It allows you to move around a bit and you can get a much higher ratio of direct signal to reflected signal than with a lavalier. That's a huge win assuming you are in a typically reflective room. Then adjust the position of the mic carefully to get the best trade-off between signal strength, pops, esses, and proximity effect. If you can EQ the signal then you can compensate for the proximity effect, but doing so adds complexity to things like video conference software. I've used a Shure SM35 quite a lot and it does a decent job for the $$s.
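To make the EQ point concrete, here is a minimal sketch of a low-shelf cut, which is one common way to tame the bass boost from proximity effect. It uses the widely published "Audio EQ Cookbook" biquad formulas with shelf slope 1. The 200 Hz corner and -6 dB gain are illustrative assumptions, not measurements of the SM35 or of my own settings, and the function names are made up for this example.

```python
import math


def low_shelf_coeffs(sample_rate, corner_hz, gain_db):
    """Return normalized biquad coefficients (b0, b1, b2, a1, a2) for a low shelf.

    Negative gain_db cuts the lows, positive boosts them.
    Formulas follow the Audio EQ Cookbook low-shelf design with slope S = 1.
    """
    a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * corner_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt(2.0)  # S = 1
    k = 2.0 * math.sqrt(a) * alpha

    b0 = a * ((a + 1) - (a - 1) * cos_w0 + k)
    b1 = 2 * a * ((a - 1) - (a + 1) * cos_w0)
    b2 = a * ((a + 1) - (a - 1) * cos_w0 - k)
    a0 = (a + 1) + (a - 1) * cos_w0 + k
    a1 = -2 * ((a - 1) + (a + 1) * cos_w0)
    a2 = (a + 1) + (a - 1) * cos_w0 - k

    # Normalize so the leading denominator coefficient is 1.
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)


def apply_biquad(samples, coeffs):
    """Filter a sequence of float samples with a direct form I biquad."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


# Assumed example settings: 48 kHz audio, shelf corner at 200 Hz, 6 dB cut.
coeffs = low_shelf_coeffs(48000, 200.0, -6.0)
```

You would pass mono float samples (say, in the range -1.0 to 1.0) through `apply_biquad` and then tune the corner and gain by ear, which is the trial-and-error part. Frequencies well above the corner pass through essentially unchanged.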
Listen to several different specified microphones with male spoken voice in the first couple of minutes of "Phantomspeisung" von Felix Kubin.
Listen to my voice using the SM35 in conversation with Felix and Gavin in A Round of drinks with Felix Kubin, in which Felix is using a quality LDC in a treated studio, and Gavin is using a cheap headset and is sitting too close to the corner of the room in which his computer is located. I used EQ and multi-band compression on my voice and Gav's, and dynamic leveling across all three.