@TheWalkman touched on this, but I'll extend the thoughts a little further.
A loudspeaker is designed to operate as a transducer, not like a spring. Let's ignore the interaction between the driver and the cabinet for a moment and just focus on the output: it is meant to produce air movement that corresponds to the electrical input, and it is specifically designed to reduce resonance. Music is a continuous waveform, and the drivers move continuously with the signal. A driver doesn't move at 400 Hz, stop, then start moving at 800 Hz to play different notes. Rather, real sounds have contributions at many frequencies all the time. Even a single violin note, say an A, has harmonics at multiple frequencies.
Looking in the time domain (how the amplitude varies with time, from left to right), the waveform might look something like this. It doesn't really look like a sine wave, and that's what the forward and back driver movement looks like too. This is a G played on a violin, and you can see a speaker doesn't exactly go from playing one frequency to the next.
How does one obtain the spectrum of a sound? By using the Fourier theorem, which states that any periodic waveform can be broken down into sine waves.
www.tremblingsandwarblings.com
You can analyze this in the frequency domain for a totally different perspective. You can see that same G has a fundamental, but then there are harmonics extending way out. All of those frequencies added together are what give you the waveform you see above.
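To make the time-domain/frequency-domain link concrete, here is a minimal sketch in Python with NumPy. It builds a waveform from a fundamental plus a few harmonics, then runs an FFT to get the individual frequencies back. The 196 Hz fundamental (the violin's open G string), the sample rate, and the harmonic amplitudes are all assumptions chosen for illustration, not measurements of the recording above.

```python
import numpy as np

fs = 48_000                            # sample rate in Hz (assumed)
f0 = 196.0                             # assumed fundamental: violin open G string
amps = [1.0, 0.6, 0.4, 0.25, 0.15]     # made-up harmonic amplitudes

t = np.arange(fs) / fs                 # exactly one second of samples

# Time domain: add the fundamental and its harmonics into one waveform.
wave = sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t) for k, a in enumerate(amps))

# Frequency domain: the FFT separates the waveform back into its sine components.
spectrum = np.abs(np.fft.rfft(wave)) / (len(wave) / 2)   # scale to sine amplitude
freqs = np.fft.rfftfreq(len(wave), d=1 / fs)

peaks = freqs[spectrum > 0.05]         # frequencies carrying noticeable energy
print(peaks)
```

The summed waveform does not look like a sine wave, yet the spectrum shows nothing but the fundamental and its integer multiples.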
Hopefully the above makes it clearer that a speaker's job is to transduce, or match, the signal going into it regardless of complexity. To check how faithfully it does that job, we measure the frequency response (for any frequency input, how much of that frequency do you get as output) and the distortion (for any frequency in, how much of other frequencies do you get out).
Speakers aren't perfect though, and one of your intuitions is right: a speaker can't instantaneously jump from one state to the next. Take this example of feeding a speaker a square wave, which is basically hell for how a driver naturally wants to move. You can see the response is far from perfect.
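As a rough illustration of the distortion measurement, here is a sketch in Python with NumPy. The `toy_speaker` function is a hypothetical stand-in for a driver: a made-up polynomial nonlinearity, not a model of any real speaker. A pure 100 Hz tone goes in, and the FFT shows how much energy comes out at other frequencies.

```python
import numpy as np

fs = 48_000                      # sample rate in Hz (assumed)
f_in = 100                       # test tone frequency in Hz
t = np.arange(fs) / fs           # one second, so FFT bin k corresponds to k Hz
x = np.sin(2 * np.pi * f_in * t)

def toy_speaker(signal):
    """Hypothetical nonlinear 'driver'; the coefficients are invented."""
    return signal + 0.1 * signal**2 + 0.05 * signal**3

y = toy_speaker(x)
mag = np.abs(np.fft.rfft(y)) / (len(y) / 2)   # scale to sine amplitude

fundamental = mag[f_in]                           # what we asked for
harmonics = mag[[2 * f_in, 3 * f_in, 4 * f_in]]   # what we did not ask for

# Total harmonic distortion: harmonic energy relative to the fundamental.
thd = np.sqrt(np.sum(harmonics**2)) / fundamental
print(f"THD: {100 * thd:.2f}%")
```

With these invented coefficients the result is roughly 5% THD. Repeating the same measurement across many input frequencies, and recording the level of the fundamental each time, would give the frequency response.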
Most of us know that trying to acoustically reproduce a square wave is an exercise in futility. I've tried it with decent headphones, by mounting a measurement mic in the earpiece and sealing it off, but the result didn't even resemble anything like a square wave. Doing it with multiway speakers...
www.whatsbestforum.com
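One reason a square wave is so hard can be sketched in Python with NumPy: a perfect square wave needs harmonics extending indefinitely, and anything with limited bandwidth has to drop some of them. The brick-wall cutoff below is an assumption for illustration only. A real driver also adds phase shift, low-frequency roll-off, and resonances, which this sketch ignores.

```python
import numpy as np

fs = 48_000                      # sample rate in Hz (assumed)
f0 = 100.0                       # square wave frequency in Hz
cutoff = 2_000.0                 # assumed bandwidth limit in Hz

t = np.arange(fs) / fs           # one second of samples
square = np.sign(np.sin(2 * np.pi * f0 * t))   # ideal +/-1 square wave

# Remove everything above the cutoff, as a crude stand-in for limited bandwidth.
spectrum = np.fft.rfft(square)
freqs = np.fft.rfftfreq(len(square), d=1 / fs)
spectrum[freqs > cutoff] = 0.0
band_limited = np.fft.irfft(spectrum, n=len(square))

peak = band_limited.max()
print(f"input peak: {square.max():.2f}, band-limited peak: {peak:.2f}")
```

The band-limited version overshoots and rings around each edge instead of jumping cleanly between the two levels, even though every frequency that remains is reproduced perfectly.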
To your original question about how a speaker behaves when moving from one frequency to the next: you can describe any musical signal as the sum of its frequency components, like the purple chart above. Going from 40 Hz to 33 Hz in your example looks, in the time domain, like a sine wave at 40 Hz that turns into a sine wave at 33 Hz. In the frequency domain, though, if you ran a Fourier transform on the signal around the transition, you would see many different frequencies that add together to recreate that complex waveform. The speaker's ability to reproduce each of those frequencies linearly and with low distortion determines how accurately it plays that sound.
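The 40 Hz to 33 Hz case can be sketched in Python with NumPy. The code compares the spectrum of a steady 40 Hz tone with the spectrum of a window that straddles the switch to 33 Hz. The sample rate, the one-second window, and the 1% threshold are arbitrary choices made for illustration.

```python
import numpy as np

fs = 1_000                                   # sample rate in Hz (assumed)
t = np.arange(fs) / fs                       # one second of samples
tone40 = np.sin(2 * np.pi * 40 * t)
tone33 = np.sin(2 * np.pi * 33 * t)
signal = np.concatenate([tone40, tone33])    # 40 Hz for 1 s, then 33 Hz for 1 s

def significant_bins(x, threshold=0.01):
    """Count FFT bins whose magnitude exceeds `threshold` times the largest bin."""
    mag = np.abs(np.fft.rfft(x))
    return int(np.sum(mag > threshold * mag.max()))

steady = significant_bins(signal[:fs])                       # pure 40 Hz
transition = significant_bins(signal[fs // 2 : fs + fs // 2])  # straddles the switch

print(f"steady tone: {steady} bin(s), across the transition: {transition} bins")
```

The steady tone occupies a single frequency bin, while the window containing the switch spreads energy over many bins around 33 Hz and 40 Hz. Those are the components the speaker has to reproduce to follow the transition.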