
Understanding Upsampling/Interpolation

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.

I have a small quibble with it: it does not take into account the bandwidth around the carrier example it references, so start and stop resolution are not necessarily accounted for. When you take that into account, you'll be right back to 1/(2 * pi * bandwidth * quantizer steps) again. :D But even with the somewhat faulty explanation involving only tones, it makes the point clearly enough.
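As a rough worked example of that bound (the numbers here are my own illustrative assumptions, not from the post): with a 20 kHz bandwidth and a 16-bit (65536-step) quantizer, the formula gives a timing resolution on the order of 100 picoseconds.

```python
import math

# Worked example of the bound 1/(2 * pi * bandwidth * quantizer_steps).
# The 20 kHz bandwidth and 16-bit quantizer are illustrative assumptions.
bandwidth_hz = 20_000
quantizer_steps = 2 ** 16  # 16-bit

t_res = 1.0 / (2 * math.pi * bandwidth_hz * quantizer_steps)
print(f"timing resolution ~ {t_res * 1e12:.0f} ps")
```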
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Sound travels (if I mathed it correctly) 0.343mm in a microsecond.

I don't think I could detect any sonic differences attributable to that level of precision.

There are a few 5-ish microsecond reports. This does seem extraordinary, but this number continues to show up. Such resolution must be on very, very specific, perceptually enabled kinds of signals, of course. One person has claimed 2 microseconds, but there are some confounding issues, and I doubt that is relevant.

Interestingly, the 5 microsecond number is almost exactly 30dB down the first attack on the wider cochlear filters. 30dB is the SNR of the inner hair cells. So there is even an arguable (BUT UNPROVEN!!!!) mechanism that could provide this, again, for very specialized signals. VERY specialized. And nothing any human wants to hear, too, seriously. Think short 10kHz modulated pulses for one example. A proper response to that is "ow!".

By the way, sine waves modulated by a gaussian pulse are also a great way to prove that sub-sample time resolution in a PCM system very obviously exists. You just shift the time a tiny bit, and there you are, purely in band (to 120dB or whatever you choose) signals moved a tiny fraction of a sample.

Of course, if very fine sub-sample resolution did not exist in PCM, then modems, disc drives, orthonormal filter banks, etc. would all fail to work. As we all know from using a cell phone and a computer, they do work.
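The Gaussian-pulse demonstration above can be sketched in a few lines. This is my own minimal illustration (the signal parameters are arbitrary), not j_j's code: two copies of a Gaussian-windowed 1 kHz tone, offset by one tenth of a sample period (about 2.1 µs at 48 kHz), are both perfectly representable as sampled data, and the sub-sample offset comes straight back out.

```python
import numpy as np

fs = 48_000.0
t = np.arange(4096) / fs

def pulse(shift_samples):
    """Gaussian-windowed 1 kHz tone, delayed by a fraction of a sample."""
    td = t - shift_samples / fs
    return np.exp(-(((td - 0.04) / 0.005) ** 2)) * np.sin(2 * np.pi * 1000.0 * td)

x0 = pulse(0.0)
x1 = pulse(0.1)  # delayed by 0.1 sample periods (~2.1 microseconds)

# Recover the delay from the cross-spectrum phase: X0 * conj(X1) has phase
# 2*pi*f*tau, so an energy-weighted average of phase/(2*pi*f) estimates tau.
X0, X1 = np.fft.rfft(x0), np.fft.rfft(x1)
cross = X0 * np.conj(X1)
f = np.fft.rfftfreq(len(t), 1 / fs)
w = np.abs(cross)
tau = np.sum(w * np.angle(cross) / np.where(f > 0, 2 * np.pi * f, np.inf)) / np.sum(w)
print(f"recovered shift: {tau * fs:.4f} samples")
```

The two sampled sequences genuinely differ, and the 0.1-sample shift is recovered from purely in-band data: time resolution in PCM is not limited to one sample period.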


If the original signal is not band limited, reconstruction is underdetermined and there are an infinite number of solutions to the reconstruction problem. Now the question is: which one of those infinitely many waveforms do you choose? Hence JJ's question of what the original is. How do you define accuracy, i.e. what is a more "accurate" reconstruction when you have to guess the missing information?

Exactly. As the deck I pointed to in #25 shows (the one somebody apparently vaguely objects to without providing specifics), exactly what you will see as "error" in a time-domain signal that is downsampled without filtering is precisely predictable, and will show up in the passband (0 to fs/2) of the new lower sampling rate. These will be frequencies not in the original passband, and will sound anywhere from kind of bad to intolerable, headphone-throwing bad. And that is easily measured by simply comparing the in-band spectrum (at the lower sampling rate) to the original signal in that bandwidth.

For upsampling, images (frequency images here, not pictures) rather than aliases will occur, adding energy that wasn't in the original signal, and NOT adding any that was present in the original signal at the higher frequency (that information is gone forever). Often these are out of the pass band, and may not be audible, but will give your tweeter and other equipment literal heartburn.

SACD has similar problems, but with noise instead of images, if you don't filter it above 50kHz quite sharply to remove the high frequency noise. The noise arises from completely different sources, however.
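A minimal sketch of the downsampling case described above (my own illustration, with arbitrary numbers): a 30 kHz tone sampled at 96 kHz, decimated 2:1 with no anti-alias filter, lands at the perfectly predictable alias frequency 48 − 30 = 18 kHz inside the new passband.

```python
import numpy as np

fs = 96_000
n = np.arange(8192)
x = np.sin(2 * np.pi * 30_000 * n / fs)   # 30 kHz tone at 96 kHz

y = x[::2]                                # naive 2:1 decimation, no filtering
fs2 = fs // 2                             # new rate: 48 kHz

# The tone cannot exist at 30 kHz in the new passband; it reappears
# as an alias at 48 - 30 = 18 kHz, exactly as predicted.
mag = np.abs(np.fft.rfft(y))
peak_hz = np.fft.rfftfreq(len(y), 1 / fs2)[np.argmax(mag)]
print(f"strongest component after decimation: {peak_hz:.0f} Hz")
```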
 

Music1969

Major Contributor
Joined
Feb 19, 2018
Messages
4,669
Likes
2,845
For upsampling, images (frequency images here, not pictures) rather than aliases will occur, adding energy that wasn't in the original signal, and NOT adding any that was present in the original signal at the higher frequency (that information is gone forever). Often these are out of the pass band, and may not be audible, but will give your tweeter and other equipment literal heartburn.

Hi JJ

I thought upsampling to higher rates removes images from the audible band?

Someone did some measurements here:

https://www.audiosciencereview.com/.../is-dac-ultrasonic-rf-output-important.10600/
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Hi JJ

I thought upsampling to higher rates removes images from the audible band?

Someone did some measurements here:

https://www.audiosciencereview.com/.../is-dac-ultrasonic-rf-output-important.10600/

UPsampling removes images OF the "audible band" from higher frequencies. Well, if it's done right, that is :D

What the image shows is how distortion products can ALIAS back down to in-band components. Different thing, but same mathematics, really.

(I have more time now to explain.) What the illustration and measurement show is not imaging; it is a form of aliasing.

If I take, for instance, a sine wave of frequency 42100/3 Hz (about 14.03 kHz) and clip the daylights out of it symmetrically, there will be a very large third-harmonic component at 42100 Hz. Since, at a 44.1 kHz sampling rate, that aliases down to 2 kHz, you've now added a 2 kHz tone that didn't originally exist.

This is why digital clipping often mega-sucks. Now, if you oversample enough that you have no distortion above fs_upsampled/2, then you downsample by filtering properly, this does not happen. For sharp discontinuities (like clipping) this can be rather a high oversampling rate.
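Here is a minimal sketch of that clipping example (my own code; the FFT length is chosen so the tones land on exact bins): a 42100/3 Hz sine at 44.1 kHz, hard-clipped symmetrically, grows a third harmonic at 42100 Hz, which shows up as a strong component at 2 kHz.

```python
import numpy as np

fs = 44_100
N = 10_584                               # chosen so the tones land on exact FFT bins
n = np.arange(N)
x = np.sin(2 * np.pi * (42_100 / 3) * n / fs)

clipped = np.clip(5.0 * x, -1.0, 1.0)    # symmetric hard clipping

mag = np.abs(np.fft.rfft(clipped))
freqs = np.fft.rfftfreq(N, 1 / fs)
fund = mag[np.argmin(np.abs(freqs - 42_100 / 3))]   # ~14.03 kHz fundamental
alias = mag[np.argmin(np.abs(freqs - 2_000))]       # aliased 3rd harmonic at 2 kHz
print(f"2 kHz alias is {20 * np.log10(alias / fund):.1f} dB below the fundamental")
```

The unclipped signal has no 2 kHz content at all; the clipping nonlinearity plus sampling puts it there.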
 
Last edited:

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,730
Likes
6,100
Location
Berlin, Germany

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Upsampling (stuffing N-1 zeros after each sample for an upsampling factor of N) itself doesn't. It's the job of the post-filter to cut off the image lobes afterwards. Only a true (quasi-infinite) sinc filter fully rejects the images and gives flat bandwidth up to fs/2, though none of this is required in real life.

https://dsp.stackexchange.com/quest...nterpolation-does-it-insert-additional-freque

More to the point, a constant-delay FIR can remove the images to a level below the quantization floor. This is a common approach. Those who worry about time-domain issues and nonlinear interactions with the ear may choose an "apodizing" filter, in which some of the constant-delay terms are replaced by minimum-phase terms. An argument remains in that quarter; I'm of the "if the filter is long enough, it's not a problem" camp. While this seems odd, forcing a shorter filter to have more in-band ripple and higher out-of-band rejection actually makes the filter's time response worse. Yes. Really.
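As a sketch of the zero-stuff-then-filter scheme just described (my own toy example; the 255-tap Blackman-windowed sinc is an arbitrary choice, not a recommendation): a 1 kHz tone at 48 kHz is zero-stuffed to 96 kHz, which creates an image at 48 − 1 = 47 kHz, and a lowpass at the old fs/2 then rejects it.

```python
import numpy as np

fs_in, factor = 48_000, 2
fs_out = fs_in * factor
x = np.sin(2 * np.pi * 1000 * np.arange(4800) / fs_in)

up = np.zeros(len(x) * factor)
up[::factor] = x                              # zero stuffing: image appears at 47 kHz

taps = 255                                    # illustrative length, not a recommendation
k = np.arange(taps) - (taps - 1) / 2
h = np.sinc(k / factor) * np.blackman(taps)   # windowed-sinc lowpass, cutoff 24 kHz
y = np.convolve(up, h, mode="same")

# Trim the filter transients, window, and compare tone vs. image level.
seg = y[taps:-taps] * np.hanning(len(y) - 2 * taps)
mag = np.abs(np.fft.rfft(seg))
freqs = np.fft.rfftfreq(len(seg), 1 / fs_out)
tone = mag[np.argmin(np.abs(freqs - 1_000))]
image = mag[np.argmin(np.abs(freqs - 47_000))]
print(f"47 kHz image sits {20 * np.log10(tone / image):.0f} dB below the 1 kHz tone")
```

Before the filter, the 47 kHz image is as strong as the tone itself; a longer filter (or a better window) pushes it down further still.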
 

Sharpi31

Active Member
Forum Donor
Joined
May 20, 2020
Messages
125
Likes
305
Apologies for drifting off-topic slightly, but I’ve found Nvidia’s AI Upscaling to make a significant subjective improvement to sub-UHD resolution video playback on a UHD display. I’ve always hated traditional sharpness processing on video displays (the obvious artifacts ruin the image for me) so have been very impressed that their AI Upscaling seems to increase perceived detail with much reduced artifacting.

“To predict the upscaled images with high accuracy, a neural network model must be trained on countless images. The deployed AI model can then take low-resolution video and produce incredible sharpness and enhanced details no traditional scaler can recreate. Edges look sharper, hair looks scruffier and landscapes pop with striking clarity.”

https://blogs.nvidia.com/blog/2020/02/03/what-is-ai-upscaling/

Have I been duped by clever marketing, or is there something of value here?
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Apologies for drifting off-topic slightly, but I’ve found Nvidia’s AI Upscaling to make a significant subjective improvement to sub-UHD resolution video playback on a UHD display. I’ve always hated traditional sharpness processing on video displays (the obvious artifacts ruin the image for me) so have been very impressed that their AI Upscaling seems to increase perceived detail with much reduced artifacting.

“To predict the upscaled images with high accuracy, a neural network model must be trained on countless images. The deployed AI model can then take low-resolution video and produce incredible sharpness and enhanced details no traditional scaler can recreate. Edges look sharper, hair looks scruffier and landscapes pop with striking clarity.”

https://blogs.nvidia.com/blog/2020/02/03/what-is-ai-upscaling/

Have I been duped by clever marketing, or is there something of value here?

Very likely something that's real. I know there are less "neural" techniques, like preserving edges in an image during upsampling (anything but linear processes), that do much better jobs than analytic upsampling (yes, that's what started the argument in the first place, I think?). So I am sure there's something to it. This does take a crapload of FLOPS, though, so Nvidia as a provider makes rather a lot of sense.
 

BDWoody

Chief Cat Herder
Moderator
Forum Donor
Joined
Jan 9, 2019
Messages
7,039
Likes
23,178
Location
Mid-Atlantic, USA. (Maryland)
And nothing any human wants to hear, too, seriously. Think short 10kHz modulated pulses for one example.

You never know. There are people who like Diana Krall...
 

bigguyca

Senior Member
Joined
Jul 6, 2019
Messages
483
Likes
620
Apologies for drifting off-topic slightly, but (1) I’ve found Nvidia’s AI Upscaling to make a significant subjective improvement to sub-UHD resolution video playback on a UHD display. I’ve always hated traditional sharpness processing on video displays (the obvious artifacts ruin the image for me) so have been very impressed that their AI Upscaling seems to increase perceived detail with much reduced artifacting.

“To predict the upscaled images with high accuracy, a neural network model must be trained on countless images. The deployed AI model can then take low-resolution video and produce incredible sharpness and enhanced details no traditional scaler can recreate. Edges look sharper, hair looks scruffier and landscapes pop with striking clarity.”

https://blogs.nvidia.com/blog/2020/02/03/what-is-ai-upscaling/

Have I been duped by clever marketing, or (2) is there something of value here?


(1) 100% agree. I've communicated with other people who hold the same opinion. The improvement is striking.

(2) Yes
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
(1) 100% agree. I've communicated with other people who hold the same opinion. The improvement is striking.

(2) Yes


It's sounding like this is actually maturing now. I wonder if I can get some of my 48MP landscapes upsampled. :D
 

bigguyca

Senior Member
Joined
Jul 6, 2019
Messages
483
Likes
620
It's sounding like this is actually maturing now. I wonder if I can get some of my 48MP landscapes upsampled. :D

In a way the NVidia processing seems conceptually somewhat like the bombes used to break Enigma messages before and during WWII. The bombes evidently worked with some knowledge of the "correct" answer, measured a LOT of potential outcomes and picked potential decodes according to preestablished criteria. Here is an excellent YouTube video that goes into many technical details concerning the breaking of Enigma, as well as how Enigma machines worked.

Breaking Enigma - Exploiting a Pole Position - YouTube

In addition to giving a lot of the credit for breaking Enigma to the Polish and French, where credit is due, it also provides insight into the role of the bombes, which a Pole seems to have originally invented. Reading about the Pole who broke the original Enigma-encoded messages, designed the original bombe, and built an Enigma machine from only very limited information is a humbling experience. Many other remarkable individuals were involved as well. As an aside, there appears to be a lot of fiction, and misallocated credit, in the movie The Imitation Game.

This link to an extensive document on the history of breaking the Naval Enigma is from the YouTube video noted above. The document provides more in-depth information. I haven't read all of it yet, so I can't comment on its quality.

Cryptographic History of Work on the German Naval Enigma (ellsbury.com)
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Here is an excellent YouTube video that goes into many technical details concerning the breaking of Enigma, as well as how Enigma machines worked.

Breaking Enigma - Exploiting a Pole Position - YouTube

That was quite an effort, but in fact there is absolutely a 'right answer' to figuring out what rotors are in use, and what starting positions are used.

It's still a deterministic result. You need some sequence of plaintext along with the encoded text to start.
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
For audio AI (or not) upscaling (DSEE HX?), the result only needs to be good enough that people cannot visually differentiate it from a real Hi-Res recording; perhaps even with intentionally added HF idle tones, simulated modulator noise and such. That way, skeptics will start to wonder whether the Hi-Res files are real or upscaled, which would be quite amusing.

The modern version of breaking Enigma is more like breaking MD5 or SHA-1.
https://shattered.io/
 

Lambda

Major Contributor
Joined
Mar 22, 2020
Messages
1,791
Likes
1,525
Sorry, I found this thread after I started this poll:
https://www.audiosciencereview.com/...-sample-rate-are-you-using.20721/#post-687227
Lots of useful information here, and I hope I can add some resources:

This page shows the results of a sample rate converter comparison:
Speex 1.2rc1 with quality 1, 5 and 10 (1 being the lowest, 10 the highest)
Soxr 0.1.1 with quality LQ, MQ, HQ and VHQ (which are low, medium, high and very high quality)
https://lastique.github.io/src_test/

https://thewelltemperedcomputer.com/KB/SRC.htm

https://ccrma.stanford.edu/~jos/resample/

Downsample examples from various software:
https://src.infinitewave.ca/

PulseAudio SRC testing:
http://archimago.blogspot.com/2015/10/measurements-look-at-linux-audio-alsa.html


j_j, it appears to me you're an expert on this topic!
So I have a few dumb theoretical questions.
here is a specific, precise mathematical definition, and it can be executed on a low-end computer in real time these days.

A perfectly band-limited signal can be mathematically perfectly resampled with a specific function, therefore not changing in-band information and not creating out-of-band signals? We/you are talking about "sinc" function windowing?

Please excuse my simple terms...
This ideal sinc response to a pulse is theoretically infinitely long, but at some point the "wiggles" become super small (smaller than one bit?). Since we only have limited input and output precision, and we don't need an ideal filter, a shorter approximation to an ideal sinc response can be used, saving filter length/complexity?

The "longer" the "filter window", the more delay is added and the more DSP "power" is needed?

Thanks in advance
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
A perfectly band-limited signal can be mathematically perfectly resampled with a specific function, therefore not changing in-band information and not creating out-of-band signals? We/you are talking about "sinc" function windowing?

Please excuse my simple terms...
This ideal sinc response to a pulse is theoretically infinitely long, but at some point the "wiggles" become super small (smaller than one bit?). Since we only have limited input and output precision, and we don't need an ideal filter, a shorter approximation to an ideal sinc response can be used, saving filter length/complexity?

The "longer" the "filter window", the more delay is added and the more DSP "power" is needed?

Thanks in advance

A bandlimited signal can be resampled precisely to any desired accuracy, of course depending on how many flops you wish to spend on it.

If this is an offline process, the delay is immaterial, you can always remove it.

There are two things that affect the cost, given a fixed resampling ratio. The first is "how close to the original fs/2 do you need to get" and the second is "how far down do you want any artifacts".

But the point is that you can be accurate to any arbitrary degree, given enough computer.
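Those two cost knobs can be put into numbers with the standard Kaiser-window FIR length estimate, N ≈ (A − 7.95)/(2.285·Δω). This is my own illustration with arbitrary figures, not anything from the thread: tightening the transition band toward fs/2, or pushing artifacts further down, both grow the filter.

```python
import math

def kaiser_length(atten_db, transition_hz, fs):
    """Kaiser-window FIR length estimate: N ~ (A - 7.95) / (2.285 * d_omega)."""
    d_omega = 2 * math.pi * transition_hz / fs   # transition width in rad/sample
    return math.ceil((atten_db - 7.95) / (2.285 * d_omega))

fs = 96_000
for atten_db in (60, 100, 140):                  # how far down the artifacts go
    for transition_hz in (4000, 1000, 250):      # how close to the original fs/2
        n = kaiser_length(atten_db, transition_hz, fs)
        print(f"A = {atten_db:3d} dB, transition = {transition_hz:4d} Hz -> {n:5d} taps")
```

Halving the transition width roughly doubles the tap count, and every extra 20 dB of rejection adds taps linearly, which is exactly the flops-versus-accuracy trade described above.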
 

Lambda

Major Contributor
Joined
Mar 22, 2020
Messages
1,791
Likes
1,525
Thanks for this reply!
of course depending on how many flops you wish to spend on it.

If this is an offline process, the delay is immaterial, you can always remove it.
If it's happening "on the fly" in a DSP/DAC, flops, memory and time are limited, so I assume corners get cut?

"how far down do you want any artifacts".
This would be my next question...
Assuming we don't really care too much about noise above 22 kHz, I assume the filter can be optimized.

Since volume and bit depth also often get changed in digital audio applications, would this be the step to introduce dithering and noise shaping?
In a way, extending in-band SNR by sacrificing out-of-band SNR.

The root of these questions is:
Assuming processing power, storage and bandwidth (in the PC) are basically free and unlimited, it's not inherently stupid or worse to upsample first and then send the data to the DAC, instead of sending "bit perfect" unchanged data to the DAC and relying on its internal upsampler?
 