Shannon's Theorem (I call it the Nyquist Theorem, or Nyquist-Shannon Theorem, but it is all the same thing) ...
The most recent term is, imo, the WKS theorem (Whittaker, Kotelnikov, Shannon), although Lüke /1/ in his interesting article mentions that Raabe also independently developed and published the sampling theorem in 1938. For simplicity I tend to stick to the shorter "Shannon theorem".
... relates back to information theory and defines the information bandwidth that can be successfully recovered from sampled data. You could apply simple interpolation, which does generate "new" samples in the sense that you are creating data that was not there in the original, but those samples would not contain any additional information, so perhaps the problem is simply how I define "new" samples. Even if you do nothing but zero-pad the data, the added zeros are "new" data not present in the original bit (sample) stream.
Iirc this is the third or fourth thread on "upsampling vs. oversampling", and I think in each of them we wrote essentially the same things as in this one, which means we never resolved our difference in description; I was wondering why it even emerged.
I think we are now able to address it and explain how the terms are defined in other fields versus how they are used in (consumer) audio.
Taking your example of "zero padding" - which means inserting N-1 additional zero samples between the original samples for an N-times increase in sampling rate - the mere insertion is often called "upsampling" in other fields and is done by an "upsampler" stage, so no interpolation happens at this stage.
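To make that terminology concrete, here is a minimal sketch of the bare "upsampler" stage (assuming Python/NumPy; the function name is mine, not a standard one):

```python
import numpy as np

def zero_stuff(x, n):
    """Bare "upsampler" stage: insert n-1 zeros between the original
    samples, raising the sample rate n-fold with no interpolation."""
    up = np.zeros(len(x) * n)
    up[::n] = x
    return up
```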
In audio - and in our typical discussions about upsampling and oversampling - the term "upsampling" denotes the combination of the "upsampler" stage and the following "interpolator" stage (a low-pass digital filter), and the result of the operation is a data stream ready to serve as input to a DAC unit (the DAC unit then including the analog anti-imaging filtering needed).
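Continuing the sketch, the audio-style "upsampling" chains the two stages together (again a hedged toy example, assuming NumPy/SciPy; the filter length is an arbitrary choice of mine):

```python
import numpy as np
from scipy.signal import firwin, lfilter

def upsample(x, n, taps=127):
    # "Upsampler" stage: insert n-1 zeros between the original samples.
    up = np.zeros(len(x) * n)
    up[::n] = x
    # "Interpolator" stage: low-pass FIR with cutoff at the original
    # Nyquist frequency (1/n of the new Nyquist), removing the images.
    h = firwin(taps, 1.0 / n)
    # Gain of n compensates for the energy lost to the inserted zeros.
    return n * lfilter(h, 1.0, up)
```

In a real oversampling DAC the filter would be implemented in polyphase form so the multiplications by the stuffed zeros are skipped; the sketch keeps the two stages separate only to mirror the terminology.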
When upsampling data, which is what I thought was the discussion, the "new" samples must satisfy the theorem with respect to the new (faster) sampling rate, so the samples are no longer bound by the lower rate. That is, the new values need not fit the constraints of the lower sample rate and can carry an information bandwidth greater than the original.
That is where our descriptions of the overall process differ. In audio, the terms "upsampling/oversampling" typically mean (I remember only two manufacturers doing it differently, see below) that the new data stream looks as if the original analog signal (still with the same bandwidth restriction as before) had been sampled at a higher rate.
This process should not produce any additional content in the audio band above the Nyquist frequency of the original signal.
In other words, the additional ("new") samples created during upsampling can contain information not present in the original samples, due to the new (higher) Nyquist frequency. To use numbers: if I upsample a CD-rate bit stream from 44.1 kS/s to 88.2 kS/s, I can now create data with frequencies up to 44.1 kHz, where before it was limited to 22.05 kHz. Some papers use "extrapolation" to describe data that contains frequency information beyond the original signal. There are predictive algorithms that do try to add higher-frequency data based on the trend of the original samples, at least in the RF world; audio-rate converters are not my day job.
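A quick numeric sanity check of that bandwidth claim (a hedged sketch, assuming NumPy; the tone frequency and block length are arbitrary choices of mine): a 30 kHz tone cannot be represented at 44.1 kS/s - it aliases down into the audio band - but fits comfortably under the new 44.1 kHz Nyquist at 88.2 kS/s.

```python
import numpy as np

# A 30 kHz tone sampled at the old and the new rate.
for fs in (44_100, 88_200):
    t = np.arange(1024) / fs
    x = np.sin(2 * np.pi * 30_000 * t)
    spectrum = np.abs(np.fft.rfft(x))
    f_peak = np.argmax(spectrum) * fs / 1024
    print(f"{fs} S/s: spectral peak near {f_peak:.0f} Hz")
# At 44.1 kS/s the tone aliases to |44100 - 30000| = 14100 Hz;
# at 88.2 kS/s it sits at 30 kHz, below the new Nyquist of 44.1 kHz.
```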
As said above, in the audio field the goal was mainly to preserve the original content (approximately "perfectly") while enabling the use of less compromised, gentler analog reconstruction filters (or, as in the Philips case, to get better results from noise shaping despite using only 14-bit converters in the beginning).
As said above, I remember only two manufacturers (in the old(er) days) that deliberately tried to add content above the Nyquist frequency (or accepted this as a side effect). One was Pioneer with their so-called "Legato Link" technology:
[image: Pioneer Legato Link diagram, copyright by Pioneer]
They obviously wanted to add content above Nyquist, allegedly for better perceived audio quality overall. The other was Wadia with their so-called "French curve" process, an interpolation routine based on Bezier curves, also allegedly done for better perceived audio quality overall.
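For illustration only, here is a minimal sketch of what a Bezier-style interpolator could look like (my own toy reconstruction, assuming NumPy, and certainly not Wadia's proprietary algorithm). The point it demonstrates: a smooth polynomial drawn through the samples is not band-limited, so such a scheme can produce content above the original Nyquist frequency.

```python
import numpy as np

def bezier_interp(y, n):
    """Toy Bezier-style interpolation: one cubic Bezier segment per
    sample interval, with control points derived from the neighboring
    samples (Catmull-Rom tangents). Illustrative only."""
    t = np.linspace(0, 1, n, endpoint=False)
    out = []
    for i in range(1, len(y) - 2):
        p0, p3 = y[i], y[i + 1]
        p1 = p0 + (y[i + 1] - y[i - 1]) / 6   # control point from left tangent
        p2 = p3 - (y[i + 2] - y[i]) / 6       # control point from right tangent
        out.append((1 - t)**3 * p0 + 3 * (1 - t)**2 * t * p1
                   + 3 * (1 - t) * t**2 * p2 + t**3 * p3)
    return np.concatenate(out)
```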
At this point I suspect we are far beyond what most of the readership cares about, and down to differences in the courses and experience through which we each learned the terms...
I hope this helps our members, as the distinctions are quite important for understanding.
/1/ Lüke, Hans Dieter, "The Origins of the Sampling Theorem," IEEE Communications Magazine, April 1999, p. 106.