Another possibility is this: There might be a significant mismatch between the exit angle of the compression driver and the entry angle of the horn. This discontinuity will cause an undesirable reflection, part of which will propagate back down into the compression driver and bounce off the diaphragm. This bounced-back-off-the-diaphragm energy will then be reflected yet again when it encounters the discontinuity on its way out. So some of this reflection sort of ping-pongs between the diaphragm and the discontinuity as it decays.
I fully agree that it is a mechanism. But see further down for my view on its relevance as an explanation.
This kind of problem causes too small of a frequency response aberration to make its presence obvious on a frequency response graph,
As Toole would say, the FR plot needs to be of the appropriate resolution, and all will be revealed.
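To put rough numbers on "appropriate resolution", here is a minimal sketch. It assumes the simplest possible case, the direct sound plus a single delayed copy, and ignores the repeated ping-pong. The reflection strength and delay are made-up illustrative values, not measurements of any driver.

```python
import math

def single_echo_ripple(reflection_coeff, delay_s):
    """Ripple caused by adding one delayed, attenuated copy to the direct sound.

    Assumes H(f) = 1 + r * exp(-j*2*pi*f*delay), so the magnitude swings
    between (1 - r) and (1 + r), and the pattern repeats every 1/delay Hz.
    """
    r = reflection_coeff
    ripple_db = 20 * math.log10((1 + r) / (1 - r))  # peak-to-peak ripple
    spacing_hz = 1 / delay_s                        # distance between ripple peaks
    return ripple_db, spacing_hz

# Hypothetical values: 5% of the amplitude comes back 0.1 ms late.
ripple_db, spacing_hz = single_echo_ripple(0.05, 1e-4)
print(f"ripple: {ripple_db:.2f} dB peak-to-peak, repeating every {spacing_hz:.0f} Hz")
```

On those assumed numbers the ripple is under 1 dB, so heavy smoothing would hide it, but a plot with resolution finer than the ripple spacing would still show it.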
but the ear notices it because it's a distortion that happens later in time than the main signal (and therefore the ear/brain system's masking characteristic cannot mask its presence). The ear tends to interpret this kind of distortion as "harshness", and it becomes increasingly audible and objectionable as the volume level is turned up.
One would expect the exact same mechanism on non-horn speakers, with even longer time components, e.g. at the lip of a cone or the mounting plate of a dome.
Back in the days of long-throat compression drivers, they had a standardised flare rate, IIRC 18 degrees, to which bolt-on horns would match. It was important because the mechanism you described can screw things up. I'm not sure whether that standard has been maintained into the modern era, but I don't think it is relevant with a modern ultra-short-throat horn, because the path is so short that the resonance would fall above 20 kHz. Indeed, that is one of the advantages of an ultra-short throat.
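A rough back-of-envelope check of the throat-length argument, as a sketch only. It treats the path between diaphragm and discontinuity as a simple one-dimensional round trip and takes the first problem frequency as the inverse of the round-trip time. The real figure depends on the boundary conditions at each end, which this ignores, and the two path lengths are hypothetical, not taken from any particular driver.

```python
SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 degrees C

def round_trip_frequency_hz(path_length_m):
    """Frequency whose period equals the round trip over the given one-way path."""
    round_trip_time_s = 2 * path_length_m / SPEED_OF_SOUND
    return 1 / round_trip_time_s

def max_path_for_frequency_m(freq_hz):
    """Longest one-way path that keeps the round-trip frequency at or above freq_hz."""
    return SPEED_OF_SOUND / (2 * freq_hz)

# Hypothetical one-way path lengths, for illustration only.
for label, length_m in [("long throat", 0.050), ("ultra-short throat", 0.008)]:
    print(f"{label}: {length_m * 1000:.0f} mm -> {round_trip_frequency_hz(length_m):.0f} Hz")

print(f"path needed to clear 20 kHz: {max_path_for_frequency_m(20_000) * 1000:.1f} mm")
```

Under these assumptions a path of a few centimetres lands squarely in the audible band, while anything shorter than about 9 mm pushes the effect above 20 kHz.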
I remember attending lectures about 20 years ago at my local chapter of the AES, given by the Chief of Design at a commercial pro audio horn loudspeaker manufacturer, where he would throw onto the screen cutaway diagrams of commercial compression drivers, and show all the little pathways that the sound waves can take into various nooks and crannies in the driver housing. He would then show his modelling of the acoustics of the driver, and the effect of these pathways on the frequency response. He would then show measurements confirming this. It was always FR plots that showed it up, one just had to know where to look (which is where the modelling comes into its own and adds value).
When I combine the fact that non-horn speakers have the same mechanism with the fact that a horn has to have design issues for it to occur, I concur with you that any one horn might have it, but I don't think that this particular mechanism would explain a 'universal harsh horn sound'. (Which I don't think is true today, BTW.)
cheers