This is actually a complicated subject, so get ready for a long response. A lot is said about the angles... but the angles are really about the imaging between adjacent channels, NOT about where the speakers should go in the room. It's important to remember that Atmos as a format is allocentric (referenced off the room itself) rather than egocentric (referenced off the listener). Where it gets complicated is that small room acoustics inherently force a more egocentric approach, whereas a large theater has enough room for sound to propagate (and inherently more speakers, minimizing the angular problems).

Contrary to what Matthew says here, we actually DO hear phantom imaging pretty well on the vertical - just not by way of the same mechanism by which we hear stereo imaging. (See "Virtual Sound Source Positioning Using Vector Based Amplitude Panning" by Ville Pulkki, "Comparison of Amplitude Panning Approaches on ITU BS.2051 Loudspeaker Layouts with Height" by Michael Romanov, NHK's research on 22.2 layouts, Wilfried Van Baelen's Auro3D research, etc.) The question is how we prioritize lower speaker counts based on where we hear better. The research generally agrees that we hear imaging very well with a 30 degree vertical separation, slightly less well at 45 but still good, and at anything over 45, people generally start losing all specificity. And that's mostly in the FRONT region of our hearing. Directly overhead we have less spatial resolution, and behind us (the region between ear level and the rearmost heights) less still.

Dolby is well aware of this, and you can tell if you understand how their actual Atmos renderer works under the hood (their guides take this into account, which I will explain later). I'll outline several scenarios by the numbers to try to make it all clear.
1. On the lateral (side to side in your room), objects in the renderer at full height do ZERO steering from 0.0-0.25 and 0.75-1.0 coordinates. This is part of why you don't have height rows placed too widely in the room. If we were taking a strictly allocentric approach to placement, this would mean that you would always put the height rows themselves at 25% and 75% of the room's width, since that would perfectly align with the expectation of the renderer itself. HOWEVER, remember when I said that small room acoustics are inherently more egocentric? Because of this, if your seating extends outside of that middle 50% of the room laterally, you don't necessarily want the speakers to be within the seating. This is why Trinnov's guide and RP22 both advocate more for about a 40 degree separation from the side surrounds, give or take depending on the seating. But for my money, if coverage won't be compromised by putting the height rows strictly at 25%/75% the way they are in the renderer's coordinate expectation, you should generally stick with that on the lateral. (This is not supposition, by the way. This has all been tested via pre-outs using different layouts and the tests on the Spatial Audio Calibration Toolkit.) If you want the underlying logic of why Grimani shoots for the closer height row placement, this is part of it. The renderer literally does no actual steering laterally until the middle 50% of the "room" in the panner interface. This is important to keep in mind if you're shooting for an Auro-centric placement where the surround heights would be on the sidewalls above the surrounds. The height panning will inherently sound "off" laterally if you prioritize that layout over an Atmos-style layout. Trinnov/RP22's 40 degree placement strikes a balance that generally works for both formats.
2. Now that we have the lateral placement of heights out of the way, let's consider the longitudinal placement. We'll start with 4 height channels. First, we have to talk about how Dolby's speaker designations affect the rendering. Remember how, laterally, no steering is done in the left and right quarters of the room for objects at max Z? If you set your AVR/processor to Top Front/Rear, the same buffer applies longitudinally: objects do not get steered off of the top front/rear speakers until they are outside of that 25% buffer zone. HOWEVER, if you set the designations to Front/Rear Height, the renderer no longer applies that 25% limitation, and cross-channel pans immediately start to move into the adjacent speaker. So if we were taking a strictly allocentric approach, we would actually want front/rear heights at the room boundaries so that the full longitudinal range of the room could be represented. You'll notice that Dolby's theaters do actually extend the height channels near those points. The problem in the home space is that in most rooms, doing this leaves a pretty large angular gap between the front and rear heights. So while you gain spatial resolution between the two layers at the very front and back of the room, you lose it overhead. This is why Dolby tends to recommend top front/rear if you can only do 4 heights. It's an egocentric compromise to an allocentric paradigm. Accordingly, they use the 45/135 angles, which the research shows strike a good balance: solid vertical imaging, with the overhead speakers still close enough together for generalized imaging above (helped further by the standard mixing practice of moving pans through that space rather than relying on static objects in that region, with the exception of the height beds).
3. Now let's consider higher speaker counts than 4 heights. Once you go to 6 heights, you'll notice that Dolby's own guides instead show them placed angularly at 30/90/150 or, in their diagrams, as front/rear height + top mid. Remember how the 25% buffer zone where no panning occurs at the front and back of the room is gone when using the front/rear height designation? That is why the x.x.6 recommendation uses front/rear height + top mid: you no longer need the 45/135 compromise to maintain overhead resolution, because the top mids are in play to fill the gap. This is the "best of both worlds" layout, giving you cohesive vertical imaging (especially in the front) and smoother panning across the array longitudinally. Having the top mids in place also gives you a better point source for sidewall imaging of objects placed between the two layers on the side wall of Dolby's virtual panner interface. Again, none of this is supposition - people have actually tested each of these scenarios using the SACT and direct measurement of the pre-outs to see when pans start and stop during the object moves. And if you go look at actual content in an object viewer, whether Trinnov's or the YouTube channel Object Viewer, you'll be surprised at where objects are actually getting placed. Anyone who thinks there isn't content meant to image between the two layers is just dead wrong.
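If it helps to see the buffer behavior from points 1 and 2 as numbers, here is a rough Python sketch. To be clear, this is NOT Dolby's renderer code. The function names, the linear mapping inside the steering region, and the constant-power pan law are all my own assumptions for illustration; the only things taken from the discussion above are the 25% lateral buffer and the fact that the longitudinal buffer depends on the speaker designation. It also only models a single front/rear pair, so it describes the 4-height case, not a layout with top mids.

```python
import math

# Dead-zone widths as described above (fractions of the room dimension).
LATERAL_BUFFER = 0.25
# Assumed labels: "top" = Top Front/Rear, "height" = Front/Rear Height.
LONGITUDINAL_BUFFER = {"top": 0.25, "height": 0.0}


def pan_fraction(coord, buffer):
    """Map a room coordinate (0.0-1.0) to a 0.0-1.0 pan position.

    Coordinates inside the buffer zone at either end are clamped, so no
    steering happens there. The linear mapping in between is an assumption.
    """
    if not 0.0 <= coord <= 1.0:
        raise ValueError("coord must be within 0.0-1.0")
    lo, hi = buffer, 1.0 - buffer
    clamped = min(max(coord, lo), hi)
    return (clamped - lo) / (hi - lo)


def pair_gains(fraction):
    """Constant-power gains for the two speakers of a pair (assumed pan law)."""
    angle = fraction * math.pi / 2
    return math.cos(angle), math.sin(angle)


def height_pan(x, y, designation="top"):
    """Return (lateral, longitudinal) pan fractions for an object at max height."""
    lateral = pan_fraction(x, LATERAL_BUFFER)
    longitudinal = pan_fraction(y, LONGITUDINAL_BUFFER[designation])
    return lateral, longitudinal
```

For example, `height_pan(0.1, 0.1, "top")` gives `(0.0, 0.0)`: the object is still pinned to one speaker on both axes. With `"height"` the same call gives `(0.0, 0.1)`, meaning the longitudinal pan has already started moving toward the other pair while the lateral position has not changed.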
Now what does this all mean in practice? Here's where things get fun. The answer is: IT DEPENDS ON YOUR HEARING. Some of us hear steering between the two layers better than steering overhead. Some of us hear overhead steering better than on the vertical. To a certain extent, since pans are usually moving fairly quickly, our brains tend to fill in the gap between channels perceptually anyway... so you may not hear a difference in practice regardless of which region of the room your personal hearing tends to prioritize. If you only have 4 heights, the answer is easy - top front/rear gives you the best compromise between the two. BUT if you personally don't hear much resolution overhead anyway, a front/rear height placement may actually work better in practice for you. This is why my ultimate advice is: Go hear it both ways if you can before you do your own x.x.4 installation.
If it's x.x.6 or more that we're talking about, you actually DO want speakers to be closer to the wall/ceiling intersection from a strictly allocentric standpoint... if that placement maintains Dolby's general angular range for a front/rear height + top mid placement. But in EITHER CASE, laterally, that 25% panning buffer happens no matter what.
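To check whether a given placement still lands in that angular range, the geometry is simple enough to sketch in Python. The assumptions here are mine: a flat ceiling, speakers on the room's longitudinal axis, and elevation measured at the listener from the forward horizontal (so 90 is straight overhead and anything above 90 is behind you). The function names and the 1.5 m figure below are hypothetical, just for illustration.

```python
import math


def longitudinal_offset(elevation_deg, height_above_ears):
    """Distance in front of (+) or behind (-) the listener for a ceiling
    speaker seen at the given elevation angle.

    height_above_ears is the vertical distance from ear level to the
    speaker, in whatever unit you want the result in.
    """
    if not 0.0 < elevation_deg < 180.0:
        raise ValueError("elevation must be between 0 and 180 degrees")
    angle = math.radians(elevation_deg)
    # cos/sin rather than 1/tan, so 90 degrees comes out as ~0 cleanly.
    return height_above_ears * math.cos(angle) / math.sin(angle)


def layout_offsets(angles_deg, height_above_ears):
    """Offsets for a whole layout, rounded to 2 decimals for readability."""
    return {
        angle: round(longitudinal_offset(angle, height_above_ears), 2)
        for angle in angles_deg
    }
```

With the speakers 1.5 m above ear level, `layout_offsets((45, 135), 1.5)` puts the top front/rear pair 1.5 m in front of and behind the seat, while `layout_offsets((30, 90, 150), 1.5)` pushes the front/rear heights out to about 2.6 m with the top mid directly overhead. That extra distance is what moves the 30/150 positions toward the wall/ceiling intersection in a room of modest depth.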
You talk about hearing what the mixer hears in the mix room in your own space. That gets us into what's known as the "circle of confusion"... because what layout did that mixer have? You have no way of knowing. If they were running top front/rear, they had that 25% longitudinal panning buffer in place, so they made their pans based on how that sounds in practice in their mix room (though they at least have a visual reference for the object in the panning interface, which makes it work with other layouts anyway). But now, more mix rooms are using the higher speaker counts. For example, Netflix advocates a minimum layout of 7.1.6, and they tend to be mixing with a front/rear height + top mid configuration, meaning no panning buffer longitudinally. So if you set your room up based on a mix room paradigm of x.x.4, are you then hearing what the mixer heard in an x.x.6 layout? Eh... mostly. Don't get too hung up on it. But now that you know the actual logic of the renderer itself, it should help you place speakers so that the actual point sources in the room match what the renderer expects... which is all the videos you're talking about were trying to explain. If you have any further questions, please let me know. I'll be glad to give yet another lengthy answer with more information than you ever asked for.