> Bottom line, we don't have the original picture.

If you would repeat your test, starting with an original image that you down-sample and then up-sample, you could evaluate how well your up-sampling matches the original.
> Bottom line, we don't have the original picture.

100% agree. Lost is lost.
In my very humble opinion, this is one of the major misunderstandings of why people do upsampling.
Let me re-emphasize (my own view): upsampling is not meant to recover missing information.
Again, to me, without understanding how a DAC works internally, people will not understand the benefits of upsampling. I learned / understood more once I knew about noise shaping, dithering, etc.
Before that I was like most people, and considered upsampling an attempt to recover missing information (especially in the higher frequency range).
> it's audioSCIENCE here...

Well, yeah, maybe you could upsample a 44.1 kHz audio signal using AI to magically get some new harmonics above 22.05 kHz, but that is as useless as taking an 8K image or video and using AI to upsample it to 16K: you just won't see any difference, because it's beyond what our eyes can resolve (except maybe on an IMAX screen, but we're talking home consumer usage). And if we're talking about normal upsampling, without any AI magic or anything alike, then it's even more useless, because it won't give us any new information whatsoever. I really don't see why anyone would bother at all.
I thought we used science to make progress, and just upping resolutions because we can won't give us any kind of progress, so I still don't think there's any reason to bother.
Science sometimes likes to do things just because we can!
Who cares if it's useful or not.
> I thought we used science to make progress, and just upping resolutions just because we can won't give us any kind of progress so I still don't think there's any reason to bother.

Can you see the emoticons I used?
That example, though, creates a false impression. Unless I am very much mistaken, he is talking about downsampling from a higher-resolution source or from analog. What you are proposing is taking a 44.1 kHz source file, upsampling it, and then sending it to the DAC. The limiting factor here is the 44.1 kHz source.
Here is a quick example.
View attachment 364626
Original image, 1000 x 800 pixels.
View attachment 364624
Downsampled to 100 x 80 pixels, then upsampled back to 1000 x 800.
In the same way, if you have a 44.1kHz source file, no amount of upsampling will ever recover the information that was lost.
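That "lost is lost" point can be checked numerically. The sketch below is my own illustration, not from the thread: it shows that a tone above Nyquist, once sampled at 44.1 kHz, produces sample values identical (up to sign) to those of a lower-frequency alias, so no upsampler could ever tell which tone was recorded.

```python
import math

# Illustrative sketch: information above Nyquist is unrecoverable.
fs = 44100            # sample rate in Hz
f_hi = 30000.0        # tone above the 22.05 kHz Nyquist limit
f_alias = fs - f_hi   # 14100 Hz alias producing the same samples

for n in range(16):
    original = math.sin(2 * math.pi * f_hi * n / fs)
    alias = -math.sin(2 * math.pi * f_alias * n / fs)
    # The two sample streams agree to floating-point precision,
    # so the recording contains no trace of which tone was present.
    assert abs(original - alias) < 1e-9
print("30 kHz tone and 14.1 kHz alias give identical 44.1 kHz samples")
```

The same folding happens for any frequency above fs/2, which is exactly why the information in the original high-resolution source cannot be reconstructed afterwards.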
> That is also a false narrative. You are effectively (in your left-hand picture) showing stair-stepping. We don't get that in digital audio. When the signal is reconstructed it is perfectly smooth, the same as the original captured signal (but band-limited to 20 kHz or so).

Looks to me like he is talking about the digital-to-analog process (I could be wrong), i.e. how the DAC reconstructs the analog audio signal from a digitized source (say 44.1 kHz / 16-bit PCM).
In your case, you have the original picture (i.e. the analog signal), convert it to a 100 x 80 file (i.e. 44.1 kHz), and attempt to reconstruct the original picture from the 100 x 80 file. That is definitely something we want to avoid at all cost.
However, it is not what we are discussing here (I think; I could be wrong again).
To me, we are given only the 100 x 80 file (i.e. 44.1 kHz) and we are doing our best to reconstruct the original picture (i.e. the original analog signal). The animation posted earlier shows how error gets into the picture during the reconstruction process. Please remember, we don't have the original picture.
Using the picture analogy, it is more like the below:
View attachment 364628
We were given the low-res picture on the left. Upsampling could reconstruct something like the picture on the right. Some may prefer the "bit perfect" reconstruction (i.e. the one on the left); some may prefer a "smoother" upsampled output.
(Of course, it is a bad analogy, as audio and picture upsampling are quite different. Here I just want to show the high-level idea.)
Bottom line, we don't have the original picture.
> can you see the emoticons I used?

I saw them, I just didn't read them the same way you apparently intended.
"When the signal is reconstructed it is perfectly smooth as the original captured signal (but band limited to 20KHz or so." <== my understanding is that this is not 100% accurate. (Please correct me if I am wrong.)
Yes, the reconstructed signal can be perfectly smooth even with 44.1 kHz, but it is not the same signal as the original. The following animation shows this issue clearly. And yes, it happens with 44.1 kHz PCM even for signals with frequency below 20 kHz, i.e. audio-range frequencies.
View attachment 364636
Both the blue wave and the red wave are as smooth as they can be. Do they sound the same?
I prefer blue over red.
What caused the red to not match the blue? It is the sampling rate. Red is reconstructed from a low-resolution digitized source.
The higher the sampling rate, the closer the red wave will be to the blue wave (i.e. the original).
As a video analogy, to me, 44.1 kHz is like 720p. People are doing 4K and 8K these days.
For audio, I totally agree with you that the audio reconstruction artifacts (errors) are not like the stair-stepping in video. As I said earlier, video is very different from audio. The picture just shows, visually, the high-level idea of what upsampling is doing.
However, the red/blue waves above show actual artifacts that may happen in the audio reconstruction process.
> Sorry, you are wrong; in normal conditions (F < Fs/2, sufficient bits per sample, accurate clock) the reconstruction can be as perfect as you want.

LOL... you really know what is happening.
To my understanding, a similar issue occurs even in non-clipping situations. Correct? Please let me know if I am wrong.
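The "as perfect as you want" claim can be tested directly. The sketch below (my own, not from the thread) applies Whittaker–Shannon reconstruction: summing sample values weighted by a sinc kernel recovers a sub-Nyquist tone at a point exactly between two samples, with the error controlled only by how far the sinc sum is truncated.

```python
import math

# Sketch: sinc reconstruction of a band-limited tone from its samples.
fs = 44100.0
f = 1000.0                                        # tone well below Nyquist
x = lambda t: math.sin(2 * math.pi * f * t / fs)  # t measured in samples

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, n_taps=4000):
    # Sum x[n] * sinc(t - n) over samples around t (truncated ideal filter).
    n0 = int(t)
    return sum(x(n) * sinc(t - n)
               for n in range(n0 - n_taps, n0 + n_taps))

t = 1000.5                      # a point exactly between two samples
err = abs(reconstruct(t) - x(t))
assert err < 1e-2               # shrinks further as n_taps grows
print(f"reconstruction error at t={t}: {err:.2e}")
```

Real DACs use finite (windowed) approximations of this ideal filter, which is where the "within the limits of filter accuracy" caveat comes from.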
> What caused the red not close to the blue? It is the sampling rate.

No, it is caused by not enough headroom. The blue wave goes above 1.0 and below -1.0. A higher sampling rate won't help here.
Exactly, the key point here is sufficient bits per sample. Is 16-bit good enough?
The quantization here, I believe, makes a big difference too.
> The science of upsampling technology could make something like below: from the left picture to the right picture.
> Of course, as I said, someone would prefer the "bit perfect" picture on the left. However, there are also a lot of people enjoying the upsampled picture on the right, thanks to upsampling science. It depends on your definition of "making progress". To me, IMHO, it is making progress.
> View attachment 364637

What you call "bit perfect" is how the data is stored, not how it should be presented. It's just nearest-neighbour upscaling, which in audio terms is equivalent to sample-and-hold, which is not what we want out of a DAC. The middle image is just a linear upscale, which is equivalent to "connect the dots" in audio, which is not what we want from a DAC either.
> It will be able to do that, to the accuracy of 16-bit, i.e. -96 dBFS. In other words, the only difference between the original signal and the reconstructed signal will be noise at the -96 dBFS level (assuming flat dither).

Sorry, my bad. It may not be a good example.
However, the bottom line here is that 44.1 kHz / 16-bit won't perfectly reconstruct the original signal (even when the frequency of the original signal is under half the sampling rate).
> What caused the red not close to the blue? It is the sampling rate.

No, it isn’t.
> Exactly, the keypoint here is sufficient bits per sample. 16-bit good enough?
> The quantization here, I believe, make a big difference too.

Quantisation errors only result in quantisation noise. With 16 bits the quantisation noise floor is down around -95 dB; yes, that is good enough.
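That noise-floor figure is easy to verify. The sketch below (mine, not from the thread) quantizes a full-scale sine to 16-bit steps with no dither and measures the resulting signal-to-noise ratio, which lands near the textbook value of 6.02 x 16 + 1.76 ≈ 98 dB.

```python
import math

# Sketch: measure 16-bit quantisation noise on a full-scale sine.
fs = 44100
f = 997.0                       # avoid a frequency that divides fs evenly
n_samples = fs                  # one second of audio
q = 2.0 / 65536                 # 16-bit step size over the [-1, 1) range

sig_power = 0.0
err_power = 0.0
for n in range(n_samples):
    s = math.sin(2 * math.pi * f * n / fs)
    s_q = round(s / q) * q      # round to the nearest 16-bit level
    sig_power += s * s
    err_power += (s - s_q) ** 2

snr_db = 10 * math.log10(sig_power / err_power)
assert 90 < snr_db < 105        # noise roughly 95-98 dB below the signal
print(f"16-bit quantisation SNR: {snr_db:.1f} dB")
```

With flat dither the error becomes uncorrelated noise at a similar level, which is the "-96 dBFS" figure quoted above.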
> Both Algo A and Algo B will give you "fake" values to connect the dots

No, again, not fake values. This is a mathematically exact reconstruction (within the limits of filter accuracy) of the original waveform. There is only one bandwidth-limited signal that can fit between the dots.
> No, it isn’t.

No, it's real clipping.
What you are showing there are inter-sample overs. This is a form of digital clipping resulting from the reconstructed waveform being too large to fit within the full-scale digital signal range.
It is analogous to the input signal to an amp being too high for the input circuit, resulting in analogue clipping.
As long as the captured waveform is correctly scaled so these limits are not exceeded, it is perfectly reconstructed.
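An inter-sample over can be demonstrated concretely. In this sketch (my own illustration, not from the thread), every stored sample of an fs/4 tone sits exactly at digital full scale (+/-1.0), yet sinc-reconstructing the waveform between those samples yields a peak of about sqrt(2) ≈ 1.414, beyond what the format can represent.

```python
import math

# Sketch: a full-scale fs/4 tone whose true peak lies BETWEEN the samples.
def sample(n):
    # sqrt(2) * sin(2*pi*n/4 + pi/4) gives the sample pattern +1, +1, -1, -1
    return math.sqrt(2) * math.sin(math.pi * n / 2 + math.pi / 4)

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, n_taps=4000):
    # Truncated ideal (sinc) interpolation, as in a DAC's reconstruction.
    n0 = int(t)
    return sum(sample(n) * sinc(t - n)
               for n in range(n0 - n_taps, n0 + n_taps))

# The stored samples themselves never exceed full scale...
assert all(abs(sample(n)) < 1.0 + 1e-9 for n in range(8))
# ...but halfway between two +1.0 samples the band-limited waveform is ~1.414.
peak = reconstruct(1000.5)
assert peak > 1.35
print(f"reconstructed inter-sample peak: {peak:.3f}")
```

This is why the advice above is to scale the captured waveform with enough headroom: the samples can all be "legal" while the waveform they describe still clips the reconstruction stage.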
> No it's real clipping

Being digital doesn't mean it isn't real, any more than describing clipping as analogue prevents it from being real.