1.
IMO it is misleading to generate such squarewaves, as they clearly violate the Nyquist-Shannon sampling theorem.
A squarewave consists of its fundamental sinewave plus the odd harmonics of 3rd, 5th, 7th, 9th ... order, see also
https://en.wikipedia.org/wiki/Square_wave
Now at e.g. 48 kHz sample rate the highest representable frequency is 24 kHz. So there is no squarewave with a fundamental above 8 kHz, only a sinewave, because the 3rd harmonic would already reach or exceed 24 kHz and gets cut off.
So the Audacity squarewave is not a true bandlimited squarewave; instead it contains many aliasing frequencies.
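To make the harmonic counting above concrete, here is a minimal sketch using NumPy. The function names are my own, and the "naive" generator only stands in for any generator that simply switches between +1 and -1 per sample; I am not claiming this is how Audacity implements it.

```python
import numpy as np

def odd_harmonics_below_nyquist(f0, fs):
    """Odd harmonic orders k of f0 whose frequency k*f0 stays below fs/2."""
    return [k for k in range(1, int(fs / (2 * f0)) + 1, 2) if k * f0 < fs / 2]

def bandlimited_square(f0, fs, n_samples):
    """Fourier-series squarewave built only from harmonics below Nyquist."""
    t = np.arange(n_samples) / fs
    y = np.zeros(n_samples)
    for k in odd_harmonics_below_nyquist(f0, fs):
        y += (4 / np.pi) * np.sin(2 * np.pi * k * f0 * t) / k
    return y

def naive_square(f0, fs, n_samples):
    """Sample-wise +1/-1 switching, with no band limiting (assumed generator)."""
    t = np.arange(n_samples) / fs
    return np.sign(np.sin(2 * np.pi * f0 * t))

def amplitude_at(x, fs, freq):
    """Amplitude of the spectral line nearest to freq (rfft, scaled to peak amplitude)."""
    spectrum = np.abs(np.fft.rfft(x)) * 2 / len(x)
    return spectrum[int(round(freq * len(x) / fs))]

fs = 48000
print(odd_harmonics_below_nyquist(1000, fs))  # orders 1, 3, ..., 23
print(odd_harmonics_below_nyquist(9000, fs))  # [1]: only the fundamental is left

# 7 kHz squarewave: the 5th harmonic (35 kHz) folds back to 48 - 35 = 13 kHz.
n = 4800
print(amplitude_at(naive_square(7000, fs, n), fs, 13000))
print(amplitude_at(bandlimited_square(7000, fs, n), fs, 13000))
```

13 kHz is not a harmonic of 7 kHz, so any energy found there in the naive version is aliasing; the bandlimited version should show essentially nothing at that frequency.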
2.
A squarewave can be considered as a step in the positive direction followed by a step in the negative direction. The steps repeat, and the step times define the frequency of the squarewave.
So if the step response of a speaker is known, its response to a squarewave excitation can already be estimated quite well.
You can also convolve the measured pulse response of the speaker with the squarewave.
For a typical passive speaker whose step response shows the tweeter first, then the midrange driver and finally the bass driver, you will get exactly the squarewave response that this step response lets you expect.
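Both routes described above (superposing step responses, or convolving the pulse response with the squarewave) can be sketched as follows. The pulse response `h` here is a made-up damped oscillation, purely a placeholder and not a real speaker measurement; the function names are also my own.

```python
import numpy as np

def step_response(h, n_samples):
    """Running sum of the pulse response, held at its final value afterwards."""
    s = np.cumsum(h)
    if n_samples <= len(s):
        return s[:n_samples]
    return np.concatenate([s, np.full(n_samples - len(s), s[-1])])

def square_from_steps(half_period, n_samples):
    """Squarewave built as a +1 step at n=0, then alternating -2 / +2 steps."""
    x = np.ones(n_samples)
    jump = -2.0
    for start in range(half_period, n_samples, half_period):
        x[start:] += jump
        jump = -jump
    return x

def response_from_steps(h, half_period, n_samples):
    """Squarewave response as a sum of shifted, scaled step responses."""
    s = step_response(h, n_samples)
    y = s.copy()
    jump = -2.0
    for start in range(half_period, n_samples, half_period):
        y[start:] += jump * s[:n_samples - start]
        jump = -jump
    return y

# Placeholder pulse response (assumption): a short damped oscillation.
k = np.arange(200)
h = np.exp(-k / 20.0) * np.cos(2 * np.pi * k / 15.0)

n_samples, half_period = 400, 50
x = square_from_steps(half_period, n_samples)
y_conv = np.convolve(x, h)[:n_samples]           # route 1: convolution
y_steps = response_from_steps(h, half_period, n_samples)  # route 2: step responses
print(np.allclose(y_conv, y_steps))
```

Both routes give the same result because the system is assumed linear and time-invariant; with a real measurement you would load the measured pulse response in place of `h`.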
3.
Usually a speaker does not reproduce very low frequencies; it behaves like a high-pass. For this reason the resulting "squarewaves" show a falling top and a rising bottom.
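The tilt can be reproduced with a very simple model. The sketch below assumes a first-order high-pass (a discrete RC filter) with a 5 Hz corner as a crude stand-in for the speaker's low-frequency roll-off; a real speaker is of higher order, so this only shows the principle.

```python
import numpy as np

def first_order_highpass(x, fs, fc):
    """Discrete RC high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * np.pi * fc)
    dt = 1.0 / fs
    a = rc / (rc + dt)
    y = np.zeros(len(x))
    prev_x = 0.0
    prev_y = 0.0
    for n, xn in enumerate(x):
        prev_y = a * (prev_y + xn - prev_x)
        prev_x = xn
        y[n] = prev_y
    return y

def square(f0, fs, n_samples):
    """Ideal +1/-1 squarewave, starting with the positive half period."""
    half = int(round(fs / (2 * f0)))
    return np.where((np.arange(n_samples) // half) % 2 == 0, 1.0, -1.0)

fs = 48000
x = square(20, fs, 4800)                   # 20 Hz squarewave, two full periods
y = first_order_highpass(x, fs, fc=5.0)    # assumed 5 Hz corner frequency

half = fs // (2 * 20)                      # 1200 samples per half period
print(y[0], y[half - 1])                   # top: start vs. end of positive half
print(y[half], y[2 * half - 1])            # bottom: start vs. end of negative half
```

The top starts near +1 and decays towards zero, and after the negative step the bottom starts well below -1 and rises again; this is the falling top and rising bottom described above.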
Summary: it is quite 'dangerous' to stick with the ideal picture of a squarewave taken from the analog world. The result in the sampled domain is heavily influenced by the Nyquist criterion. Furthermore, the high-pass behaviour of the speaker changes the shape of the squarewave, so with low-frequency squarewaves the top and bottom show up tilted. At higher frequencies this tilt is hidden because the picture is zoomed in to a shorter time span; here the bandwidth limitation creates the typical oscillations on top and bottom. Finally, the resulting squarewave is deformed by the characteristics of the speaker's pulse response, which can already be seen in the step response display calculated from the pulse response.