Depends on the recording, of course... Do you have data? I don't, just what I remember from old AES papers and more recent reports on various audio fora, plus a vague memory of my own previous measurements. 10 dB is not a lot of dynamic range, but then again that's the world we live in these days. I listen to a lot of jazz and some classical and I'm pretty certain soft-to-loud swings are more than 10 dB (roughly a halving or doubling of perceived loudness).
This does not make sense to me: "Do dynamic signals really need more power than steady state to be driven to a certain level?" To reach a certain level you need the power to get there, whether the signal is steady-state or a "dynamic" peak. Peaks may not last long, so many amplifiers can handle short-term peaks above their rated power without clipping. This used to be specified as "dynamic headroom" and there was a standard test for it (a 20 ms burst IIRC). I do not see that much anymore, not sure why. For a while dynamic headroom was a selling point, with Bryston and NAD coming to mind as having 3-6 dB headroom per spec.
Power-wise, there's not much argument for having a lot more than you need. It is worth noting that 3 dB is a relatively small amount of headroom SPL-wise but requires twice the power, so there is a sound rationale for decent power reserves. Higher-power amplifiers also tend to have lower output impedance, so they do better driving real-world speaker loads (the amps are closer to an ideal voltage source), and higher standing bias current, so they stay in class A longer (lower crossover distortion, though that is pretty much a solved problem). Too much power has drawbacks, natch, including higher energy costs (electric bill), more heat, potentially more hiss, etc.
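For anyone who wants to check the "3 dB is twice the power" arithmetic, here is a minimal Python sketch. The function names are my own and purely illustrative; the only thing it relies on is the standard definition of a power ratio in decibels, dB = 10 * log10(P2 / P1).

```python
import math

def db_to_power_ratio(db):
    """Convert a level change in dB to the corresponding power ratio."""
    return 10 ** (db / 10)

def power_ratio_to_db(ratio):
    """Convert a power ratio to a level change in dB."""
    return 10 * math.log10(ratio)

if __name__ == "__main__":
    # How much more amplifier power a given amount of headroom costs.
    for headroom_db in (3, 6, 10):
        ratio = db_to_power_ratio(headroom_db)
        print(f"{headroom_db} dB of headroom -> {ratio:.2f}x the power")
```

So the 3-6 dB of dynamic headroom mentioned above corresponds to roughly two to four times the rated power, delivered only for short bursts.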
I calculated what I expected to need to hit 105 dB SPL since that was THX reference level with a little margin and called it good. I rarely listen that loudly but the kids visit now and then...
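A rough sketch of that kind of calculation, in Python. This is not necessarily how the number above was worked out; it assumes a single speaker, free-field inverse-square falloff with distance (6 dB per doubling), and no room gain or power compression. The 88 dB sensitivity and 3 m listening distance in the example are hypothetical values picked for illustration only.

```python
import math

def required_power_watts(target_spl_db, sensitivity_db, distance_m=1.0):
    """Estimate amplifier power needed to reach target_spl_db at distance_m.

    sensitivity_db is the speaker's SPL at 1 W / 1 m.
    Assumes free-field inverse-square loss from the 1 m reference point.
    """
    distance_loss_db = 20 * math.log10(distance_m)
    gain_needed_db = target_spl_db - sensitivity_db + distance_loss_db
    return 10 ** (gain_needed_db / 10)

if __name__ == "__main__":
    # Hypothetical example: 105 dB peaks, 88 dB/W/m speaker, 3 m away.
    watts = required_power_watts(105, 88, distance_m=3.0)
    print(f"Estimated power needed: {watts:.0f} W")
```

In a real room the reverberant field and the second speaker usually win back several dB, so this tends to be a pessimistic estimate.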
Actually, I think you might be right. I've been checking on my cheap SPL meter. At what I consider a loudish listening level, the readings average between 80 and 84 dB, rarely getting into the 90s. The problem is that this is A-weighted (so the unit should be dB(A)).
After looking into it, I've learned A weighting rolls off low frequencies.
I found an app for my iPad which will give measurements for instantaneous/peak measurements in both A and C weighting. C weighting is much closer to a flat frequency response, which I think is what is relevant for thinking about needed power for an amp.
The app's A-weighted measurement is spot on with my SPL meter... but when I have an average listening level of about 82 dB(A), the A-weighted peaks won't get much above 90 dB(A) while the C-weighted peaks go well above 100 dB(C)!
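To see why A- and C-weighted readings can differ that much on bass-heavy material, here is a small Python sketch of the two weighting curves. It uses the commonly published analog approximations of the IEC 61672 curves with rounded corner frequencies (20.6, 107.7, 737.9 and 12194 Hz) and rounded normalization offsets, so treat the results as approximate rather than meter-grade.

```python
import math

def a_weighting_db(f):
    """Approximate A-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20 * math.log10(num / den) + 2.00  # normalized to ~0 dB at 1 kHz

def c_weighting_db(f):
    """Approximate C-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    num = (12194.0 ** 2) * f2
    den = (f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2)
    return 20 * math.log10(num / den) + 0.06  # normalized to ~0 dB at 1 kHz

if __name__ == "__main__":
    for freq in (31.5, 50, 100, 1000, 10000):
        print(f"{freq:>7} Hz  A: {a_weighting_db(freq):6.1f} dB"
              f"  C: {c_weighting_db(freq):6.1f} dB")
```

Around 50 Hz the A curve is down by roughly 30 dB while the C curve is down by only a dB or so, which is consistent with bass-driven peaks showing up on the C reading and barely registering on the A reading.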
What's weird is that my amp (Kenwood KA-5500) has VU power level meters and at such a level they barely peak above 1 watt. VU meters are averaging meters, but I find it hard to believe the amp is being driven into clipping when these meters are down in the low part of their range. The meters have a low-level setting; when put on the high-level setting, which shows full scale up to 55 watts, the meters are barely moving.
Is it possible the amp is being "asked" to put out 200 watts on the peaks? (It's rated at 55 wpc). I'm not hearing any clipping. Perhaps the amp is doing some kind of "soft clipping?"
I'm listening on ADS L710, which have a given sensitivity of "92 db SPL at 1 watt RMS input at 1 meter." This seems roughly close to what I'm "ballpark" measuring.
It's a complicated subject. There is no exact specification for this concept, as it's dependent on the duration and shape of the measuring "window", the duration of the measurement, and the characteristics of the signal.
A simple approximation of "effective dynamic range" is the "crest factor" of a signal, which is the ratio of the peak level to the RMS average level.
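As a quick illustration of that definition, here is a minimal Python sketch that computes crest factor in dB from a list of samples. The function name and the test signals are my own; real music would of course need to be read from a file and measured over some chosen window, which is exactly where the ambiguity mentioned above comes in.

```python
import math

def crest_factor_db(samples):
    """Crest factor in dB: peak absolute level over RMS level."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

if __name__ == "__main__":
    n = 1000
    # One full cycle of a sine wave and of a square wave.
    sine = [math.sin(2 * math.pi * k / n) for k in range(n)]
    square = [1.0 if k < n // 2 else -1.0 for k in range(n)]
    print(f"sine:   {crest_factor_db(sine):.2f} dB")
    print(f"square: {crest_factor_db(square):.2f} dB")
```

A pure sine wave comes out at about 3 dB and a square wave at 0 dB, which gives some scale for judging the figures quoted for music.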
This excellent article has an in-depth discussion, and finds that the crest factor of most recordings is between 12 and 18 dB.
https://www.soundonsound.com/sound-advice/dynamic-range-loudness-war
The more relevant issue to this discussion is how we conceive of the "loudness" of a signal, which includes both average and peak levels, and how much demand our desired average listening level puts upon our amplifier.
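To put rough numbers on that demand, here is a Python sketch that combines an average power figure with a crest factor. It is only a sketch under stated assumptions: the average power is treated as true RMS power (a VU meter only approximates this), the load is treated as a plain resistor, and the helper names are mine. One subtlety is that amplifier ratings are sine-wave ratings, and a sine already has about 3 dB of crest factor, so the rated power needed to pass a peak unclipped is about 3 dB less than the instantaneous peak power.

```python
import math

# Crest factor of a sine wave, about 3.01 dB.
SINE_CREST_DB = 20 * math.log10(math.sqrt(2))

def peak_instantaneous_watts(average_watts, crest_db):
    """Instantaneous power at the signal peak."""
    return average_watts * 10 ** (crest_db / 10)

def sine_rated_watts_needed(average_watts, crest_db):
    """Sine-rated amplifier power whose peak voltage matches the signal peak."""
    return average_watts * 10 ** ((crest_db - SINE_CREST_DB) / 10)

if __name__ == "__main__":
    average = 1.0  # watts, e.g. roughly what the VU meters showed above
    for crest in (12, 18):
        print(f"crest {crest} dB: peak {peak_instantaneous_watts(average, crest):.1f} W,"
              f" needs about {sine_rated_watts_needed(average, crest):.1f} W rated")
```

With about 1 W average and a 12-18 dB crest factor, that works out to roughly 8 to 32 W of rated power, which would sit comfortably inside a 55 wpc rating and fits with not hearing any clipping.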