If there are cancellation issues then it will also occur in the stereo image during playback
Not true. Cancellation of out-of-phase content is 100% when the sum is performed on the signal itself, but much less than 100% when L and R interact acoustically, because 1) each ear sits at a different distance from each speaker, and 2) reflected / scattered sound gets added to the direct sound.
This is why simple stereo widener effects that invert and delay part of the signal sound interesting on speakers, but hollow and lifeless when summed to mono.
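As a rough sketch of why that happens, assume a toy widener (hypothetical, not any particular plugin) that adds a delayed copy of the signal to L and subtracts the same copy from R. The function name `widen`, the 10 ms delay and the 48 kHz sample rate are all assumptions for illustration. Summing in DSP removes the added part exactly, so you get the dry signal back with none of the width:

```python
import numpy as np

def widen(x, delay, amount=0.5):
    """Toy widener (assumed design): add a delayed copy to L, subtract it from R."""
    if delay <= 0:
        raise ValueError("delay must be a positive number of samples")
    delayed = np.concatenate([np.zeros(delay), x[:-delay]])
    left = x + amount * delayed
    right = x - amount * delayed
    return left, right

rng = np.random.default_rng(0)
x = rng.standard_normal(48000)        # 1 s of noise at an assumed 48 kHz
left, right = widen(x, delay=480)     # 480 samples = 10 ms at 48 kHz

mono = 0.5 * (left + right)           # sum performed on the signal
side = 0.5 * (left - right)           # the part that differs between channels

print(np.allclose(mono, x))           # the widening cancels in the mono sum
print(float(np.sqrt(np.mean(side ** 2))))  # but it is clearly present in stereo
```

On speakers the two channels never add this cleanly at your ears, which is why the effect survives there.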
In general, content summed to mono in DSP or electrically can produce pretty objectionable comb filtering, or none at all, depending on how it was recorded and mixed. By and large the effect shouldn't be horrible since, as @ernestcarl mentions, it's standard practice (or at least used to be) to check mixes in mono for big problems when summed. I still think everything summed to mono sounds kinda lifeless compared to stereo, but in isolation I never think it sounds awful. On the other hand, I only listen to mono on my JBL Go or countertop radio.
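For a feel of where the comb filtering comes from, here is a minimal sketch assuming the simplest case: the same sound is in both channels, but one channel is late by a fixed delay (spaced mics, for example). The 1 ms delay, the 48 kHz sample rate and the name `mono_sum_gain` are illustrative assumptions, not measurements of any real recording.

```python
import numpy as np

fs = 48000      # assumed sample rate in Hz
delay = 48      # assumed inter-channel delay: 48 samples = 1 ms

def mono_sum_gain(freq_hz, delay_samples, fs):
    """Gain of 0.5 * (x[n] + x[n - delay]) at a given frequency."""
    w = 2 * np.pi * freq_hz / fs
    return float(np.abs(0.5 * (1 + np.exp(-1j * w * delay_samples))))

# Notches fall at odd multiples of fs / (2 * delay); peaks sit in between.
first_notch = fs / (2 * delay)
for f in (first_notch, 2 * first_notch, 3 * first_notch):
    print(f"{f:7.1f} Hz  gain {mono_sum_gain(f, delay, fs):.3f}")
```

With these numbers the first notch lands at 500 Hz and repeats every 1 kHz. A shorter delay pushes the notches higher, which is part of why some material sums to mono without obvious damage.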
IMO there are no perfect solutions here. You can listen to L or R alone, which IMO gives cleaner sound, but you tend to miss entire sounds that are panned the other way. You can sum, with the aforementioned issues. Or you can just put the stereo speakers close together. I don't think there is much point in using a mid-side processor, as the drawbacks are similar to listening to just one channel but it takes more work to get there.
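To make the one-channel versus summed trade-off concrete, here is a small sketch with a made-up mix: one tone panned center and one panned hard right. The frequencies, the sample rate and the helper `level_at` are assumptions chosen so the numbers come out clean.

```python
import numpy as np

fs = 48000                                   # assumed sample rate in Hz
t = np.arange(fs) / fs                       # exactly 1 s, so FFT bins are 1 Hz wide
center = np.sin(2 * np.pi * 220 * t)         # tone panned center (in both channels)
hard_right = np.sin(2 * np.pi * 880 * t)     # tone panned hard right

left = center
right = center + hard_right

left_only = left                             # option 1: listen to L alone
mono_sum = 0.5 * (left + right)              # option 2: sum to mono

def level_at(signal, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (uses the global fs)."""
    spectrum = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)
    return float(spectrum[int(round(freq_hz * len(signal) / fs))])

print("880 Hz in L only:  ", round(level_at(left_only, 880), 3))
print("880 Hz in mono sum:", round(level_at(mono_sum, 880), 3))
print("220 Hz in mono sum:", round(level_at(mono_sum, 220), 3))
```

In this toy mix the hard-right tone is simply gone from L alone, while the sum keeps it at half amplitude (about 6 dB down) relative to the center tone.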
The reason to take a stereo recording down to one channel is for compatibility with a system that only has one speaker. I don't think there are any clear and significant advantages if that's not a restriction you're dealing with.