
Understanding Upsampling/Interpolation

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.

I have a small quibble with it: it does not take into account the bandwidth around the carrier example it references, so start and stop resolution are not necessarily accounted for. When you take that into account, you'll be right back to 1/(2 * pi * bandwidth * quantizer steps) again. :D But even with the somewhat faulty explanation involving only tones, it makes the point clearly enough.
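As a rough worked example of that bound (the numbers here are my own illustrative assumptions, not from the post): with a 20 kHz bandwidth and a 16-bit (65536-step) quantizer, the formula gives a timing resolution on the order of 100 picoseconds.

```python
import math

# Worked example of the bound 1/(2 * pi * bandwidth * quantizer_steps).
# The 20 kHz bandwidth and 16-bit quantizer are illustrative assumptions.
bandwidth_hz = 20_000
quantizer_steps = 2 ** 16  # 16-bit

t_res = 1.0 / (2 * math.pi * bandwidth_hz * quantizer_steps)
print(f"timing resolution ~ {t_res * 1e12:.0f} ps")
```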
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Sound travels (if I mathed it correctly) 0.343mm in a microsecond.

I don't think I could detect any sonic differences attributable to that level of precision.

There are a few 5-ish microsecond reports. This does seem extraordinary, but this number continues to show up. Such resolution must be on very, very specific, perceptually enabled kinds of signals, of course. One person has claimed 2 microseconds, but there are some confounding issues, and I doubt that is relevant.

Interestingly, the 5 microsecond number is almost exactly 30dB down the first attack on the wider cochlear filters. 30dB is the SNR of the inner hair cells. So there is even an arguable (BUT UNPROVEN!!!!) mechanism that could provide this, again, for very specialized signals. VERY specialized. And nothing any human wants to hear, too, seriously. Think short 10kHz modulated pulses for one example. A proper response to that is "ow!".

By the way, sine waves modulated by a gaussian pulse are also a great way to prove that sub-sample time resolution in a PCM system very obviously exists. You just shift the time a tiny bit, and there you are, purely in band (to 120dB or whatever you choose) signals moved a tiny fraction of a sample.

Of course, if very fine sub-sample resolution did not exist in PCM, then modems, disc drives, orthonormal filter banks, etc. would all fail to work. As we all know from using a cell phone and a computer, they do work.
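The Gaussian-pulse demonstration above can be sketched in a few lines. This is my own minimal illustration (the signal parameters are arbitrary), not j_j's code: two copies of a Gaussian-windowed 1 kHz tone, offset by one tenth of a sample period (about 2.1 µs at 48 kHz), are both perfectly representable as sampled data, and the sub-sample offset comes straight back out.

```python
import numpy as np

fs = 48_000.0
t = np.arange(4096) / fs

def pulse(shift_samples):
    """Gaussian-windowed 1 kHz tone, delayed by a fraction of a sample."""
    td = t - shift_samples / fs
    return np.exp(-(((td - 0.04) / 0.005) ** 2)) * np.sin(2 * np.pi * 1000.0 * td)

x0 = pulse(0.0)
x1 = pulse(0.1)  # delayed by 0.1 sample periods (~2.1 microseconds)

# Recover the delay from the cross-spectrum phase: X0 * conj(X1) has phase
# 2*pi*f*tau, so an energy-weighted average of phase/(2*pi*f) estimates tau.
X0, X1 = np.fft.rfft(x0), np.fft.rfft(x1)
cross = X0 * np.conj(X1)
f = np.fft.rfftfreq(len(t), 1 / fs)
w = np.abs(cross)
tau = np.sum(w * np.angle(cross) / np.where(f > 0, 2 * np.pi * f, np.inf)) / np.sum(w)
print(f"recovered shift: {tau * fs:.4f} samples")
```

The two sampled sequences genuinely differ, and the 0.1-sample shift is recovered from purely in-band data: time resolution in PCM is not limited to one sample period.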


If the original signal is not band limited, reconstruction is underdetermined and there are an infinite number of solutions to the reconstruction problem. Now the question is: which one of those infinitely many waveforms do you choose? Hence JJ's question of what the original is. How do you define accuracy, i.e. what is a more "accurate" reconstruction when you have to guess the missing information?

Exactly. As the deck I pointed to in #25 shows (the one somebody apparently vaguely objects to without providing specifics), exactly what you will see as "error" in a time-domain signal that is downsampled without filtering is precisely predictable, and will show up in the passband (0 to fs/2) of the new lower sampling rate. These will be frequencies not in the original passband, and will sound anywhere from kind of bad to intolerable, headphone-throwing bad. And that is easily measured by simply comparing the in-band spectrum (at the lower sampling rate) to the original signal in that bandwidth.

For upsampling, images (frequency images here, not pictures) rather than aliases will occur, adding energy that wasn't in the original signal, and NOT adding any that was present in the original signal at the higher frequency (that information is gone forever). Often these are out of the pass band, and may not be audible, but will give your tweeter and other equipment literal heartburn.

SACD has similar problems, but with noise instead of images, if you don't filter it above 50kHz quite sharply to remove the high frequency noise. The noise arises from completely different sources, however.
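A minimal sketch of the downsampling case described above (my own illustration, with arbitrary numbers): a 30 kHz tone sampled at 96 kHz, decimated 2:1 with no anti-alias filter, lands at the perfectly predictable alias frequency 48 − 30 = 18 kHz inside the new passband.

```python
import numpy as np

fs = 96_000
n = np.arange(8192)
x = np.sin(2 * np.pi * 30_000 * n / fs)   # 30 kHz tone at 96 kHz

y = x[::2]                                # naive 2:1 decimation, no filtering
fs2 = fs // 2                             # new rate: 48 kHz

# The tone cannot exist at 30 kHz in the new passband; it reappears
# as an alias at 48 - 30 = 18 kHz, exactly as predicted.
mag = np.abs(np.fft.rfft(y))
peak_hz = np.fft.rfftfreq(len(y), 1 / fs2)[np.argmax(mag)]
print(f"strongest component after decimation: {peak_hz:.0f} Hz")
```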
 

Music1969

Major Contributor
Joined
Feb 19, 2018
Messages
4,669
Likes
2,845
For upsampling, images (frequency images here, not pictures) rather than aliases will occur, adding energy that wasn't in the original signal, and NOT adding any that was present in the original signal at the higher frequency (that information is gone forever). Often these are out of the pass band, and may not be audible, but will give your tweeter and other equipment literal heartburn.

Hi JJ

I thought upsampling to higher rates removes images from the audible band?

Someone did some measurements here:

https://www.audiosciencereview.com/.../is-dac-ultrasonic-rf-output-important.10600/
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Hi JJ

I thought upsampling to higher rates removes images from the audible band?

Someone did some measurements here:

https://www.audiosciencereview.com/.../is-dac-ultrasonic-rf-output-important.10600/

UPsampling removes images OF the "audible band" from higher frequencies. Well, if it's done right, that is :D

What the image shows is how distortion products can ALIAS back down to in-band components. Different thing, but same mathematics, really.

(I have more time now to explain.) What the illustration and measurement show is not imaging; it is a form of aliasing.

If I take, for instance, a sine wave of frequency 42100/3 Hz (about 14.03 kHz) and clip the daylights out of it symmetrically, there will be a very large third-harmonic component at 42100 Hz. Since, at a 44.1 kHz sampling rate, that aliases down to 2 kHz, you've now added a 2 kHz tone that didn't originally exist.

This is why digital clipping often mega-sucks. Now, if you oversample enough that you have no distortion above fs_upsampled/2, then you downsample by filtering properly, this does not happen. For sharp discontinuities (like clipping) this can be rather a high oversampling rate.
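Here is a minimal sketch of that clipping example (my own code; the FFT length is chosen so the tones land on exact bins): a 42100/3 Hz sine at 44.1 kHz, hard-clipped symmetrically, grows a third harmonic at 42100 Hz, which shows up as a strong component at 2 kHz.

```python
import numpy as np

fs = 44_100
N = 10_584                               # chosen so the tones land on exact FFT bins
n = np.arange(N)
x = np.sin(2 * np.pi * (42_100 / 3) * n / fs)

clipped = np.clip(5.0 * x, -1.0, 1.0)    # symmetric hard clipping

mag = np.abs(np.fft.rfft(clipped))
freqs = np.fft.rfftfreq(N, 1 / fs)
fund = mag[np.argmin(np.abs(freqs - 42_100 / 3))]   # ~14.03 kHz fundamental
alias = mag[np.argmin(np.abs(freqs - 2_000))]       # aliased 3rd harmonic at 2 kHz
print(f"2 kHz alias is {20 * np.log10(alias / fund):.1f} dB below the fundamental")
```

The unclipped signal has no 2 kHz content at all; the clipping nonlinearity plus sampling puts it there.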
 
Last edited:

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,730
Likes
6,100
Location
Berlin, Germany

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Upsampling (stuffing N-1 zeros after each sample for an upsampling factor of N) itself doesn't. It's the job of the post-filter to cut off the image lobes afterwards. Only a true (quasi-infinite) sinc filter fully rejects the images and gives flat bandwidth up to fs/2, though none of this is required in real life.

https://dsp.stackexchange.com/quest...nterpolation-does-it-insert-additional-freque

More to the point, a constant-delay FIR can remove the images to a level below the quantization floor. This is a common approach. Those who worry about time-domain issues and nonlinear interactions with the ear may choose an "apodizing" filter, in which some of the constant-delay terms are replaced by minimum-phase terms. An argument remains in that quarter; I'm of the "if the filter is long enough, it's not a problem" camp. While this seems odd, forcing a shorter filter to have more in-band ripple and higher out-of-band rejection actually makes the filter's time response worse. Yes. Really.
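As a sketch of the zero-stuff-then-filter scheme just described (my own toy example; the 255-tap Blackman-windowed sinc is an arbitrary choice, not a recommendation): a 1 kHz tone at 48 kHz is zero-stuffed to 96 kHz, which creates an image at 48 − 1 = 47 kHz, and a lowpass at the old fs/2 then rejects it.

```python
import numpy as np

fs_in, factor = 48_000, 2
fs_out = fs_in * factor
x = np.sin(2 * np.pi * 1000 * np.arange(4800) / fs_in)

up = np.zeros(len(x) * factor)
up[::factor] = x                              # zero stuffing: image appears at 47 kHz

taps = 255                                    # illustrative length, not a recommendation
k = np.arange(taps) - (taps - 1) / 2
h = np.sinc(k / factor) * np.blackman(taps)   # windowed-sinc lowpass, cutoff 24 kHz
y = np.convolve(up, h, mode="same")

# Trim the filter transients, window, and compare tone vs. image level.
seg = y[taps:-taps] * np.hanning(len(y) - 2 * taps)
mag = np.abs(np.fft.rfft(seg))
freqs = np.fft.rfftfreq(len(seg), 1 / fs_out)
tone = mag[np.argmin(np.abs(freqs - 1_000))]
image = mag[np.argmin(np.abs(freqs - 47_000))]
print(f"47 kHz image sits {20 * np.log10(tone / image):.0f} dB below the 1 kHz tone")
```

Before the filter, the 47 kHz image is as strong as the tone itself; a longer filter (or a better window) pushes it down further still.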
 

Sharpi31

Active Member
Forum Donor
Joined
May 20, 2020
Messages
125
Likes
305
Apologies for drifting off-topic slightly, but I’ve found Nvidia’s AI Upscaling to make a significant subjective improvement to sub-UHD resolution video playback on a UHD display. I’ve always hated traditional sharpness processing on video displays (the obvious artifacts ruin the image for me) so have been very impressed that their AI Upscaling seems to increase perceived detail with much reduced artifacting.

“To predict the upscaled images with high accuracy, a neural network model must be trained on countless images. The deployed AI model can then take low-resolution video and produce incredible sharpness and enhanced details no traditional scaler can recreate. Edges look sharper, hair looks scruffier and landscapes pop with striking clarity.”

https://blogs.nvidia.com/blog/2020/02/03/what-is-ai-upscaling/

Have I been duped by clever marketing, or is there something of value here?
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Apologies for drifting off-topic slightly, but I’ve found Nvidia’s AI Upscaling to make a significant subjective improvement to sub-UHD resolution video playback on a UHD display. I’ve always hated traditional sharpness processing on video displays (the obvious artifacts ruin the image for me) so have been very impressed that their AI Upscaling seems to increase perceived detail with much reduced artifacting.

“To predict the upscaled images with high accuracy, a neural network model must be trained on countless images. The deployed AI model can then take low-resolution video and produce incredible sharpness and enhanced details no traditional scaler can recreate. Edges look sharper, hair looks scruffier and landscapes pop with striking clarity.”

https://blogs.nvidia.com/blog/2020/02/03/what-is-ai-upscaling/

Have I been duped by clever marketing, or is there something of value here?

Very likely something that's real. I know there are less "neural" techniques, like preserving edges in an image during upsampling (anything but linear processes), that do much better jobs than analytic upsampling (yes, that's what started the argument in the first place, I think?). So I am sure there's something to it. This does take a crapload of FLOPS, though, so Nvidia as a provider makes rather a lot of sense.
 

BDWoody

Chief Cat Herder
Moderator
Forum Donor
Joined
Jan 9, 2019
Messages
7,039
Likes
23,178
Location
Mid-Atlantic, USA. (Maryland)
And nothing any human wants to hear, too, seriously. Think short 10kHz modulated pulses for one example.

You never know. There are people who like Diana Krall...
 

bigguyca

Senior Member
Joined
Jul 6, 2019
Messages
483
Likes
620
Apologies for drifting off-topic slightly, but (1) I’ve found Nvidia’s AI Upscaling to make a significant subjective improvement to sub-UHD resolution video playback on a UHD display. I’ve always hated traditional sharpness processing on video displays (the obvious artifacts ruin the image for me) so have been very impressed that their AI Upscaling seems to increase perceived detail with much reduced artifacting.

“To predict the upscaled images with high accuracy, a neural network model must be trained on countless images. The deployed AI model can then take low-resolution video and produce incredible sharpness and enhanced details no traditional scaler can recreate. Edges look sharper, hair looks scruffier and landscapes pop with striking clarity.”

https://blogs.nvidia.com/blog/2020/02/03/what-is-ai-upscaling/

Have I been duped by clever marketing, or (2) is there something of value here?


(1) 100% agree. I've communicated with other people who hold the same opinion. The improvement is striking.

(2) Yes
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
(1) 100% agree. I've communicated with other people who hold the same opinion. The improvement is striking.

(2) Yes


It's sounding like this is actually maturing now. I wonder if I can get some of my 48MP landscapes upsampled. :D
 

bigguyca

Senior Member
Joined
Jul 6, 2019
Messages
483
Likes
620
It's sounding like this is actually maturing now. I wonder if I can get some of my 48MP landscapes upsampled. :D

In a way the NVidia processing seems conceptually somewhat like the bombes used to break Enigma messages before and during WWII. The bombes evidently worked with some knowledge of the "correct" answer, measured a LOT of potential outcomes and picked potential decodes according to preestablished criteria. Here is an excellent YouTube video that goes into many technical details concerning the breaking of Enigma, as well as how Enigma machines worked.

Breaking Enigma - Exploiting a Pole Position - YouTube

In addition to giving a lot of the credit for breaking Enigma to the Polish and French, where credit is due, it also provides insight into the role of the bombes, which a Pole seems to have originally invented. Reading about the Pole who broke the original Enigma-encoded messages, designed the original bombe, and built an Enigma machine from only very limited information is a humbling experience. Many other remarkable individuals were involved as well. As an aside, there appears to be a lot of fiction, and misallocated credit, in the movie The Imitation Game.

This link to an extensive document on the history of breaking the Naval Enigma is from the YouTube video noted above. The document provides more in-depth information. I haven't read all of it yet, so I can't comment on its quality.

Cryptographic History of Work on the German Naval Enigma (ellsbury.com)
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Here is an excellent YouTube video that goes into many technical details concerning the breaking of Enigma, as well as how Enigma machines worked.

Breaking Enigma - Exploiting a Pole Position - YouTube

That was quite an effort, but in fact there is absolutely a 'right answer' to figuring out what rotors are in use, and what starting positions are used.

It's still a deterministic result. You need some sequence of plaintext along with the encoded text to start.
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
For audio AI (or not) upscaling (DSEE HX?), the result only needs to be good enough that people cannot visually differentiate it from a real Hi-Res recording; perhaps even with intentionally added HF idle tones, simulated modulator noise and such. That way, skeptics will start to wonder whether the Hi-Res files are real or upscaled, which would be quite amusing.

The modern version of breaking Enigma is more like breaking MD5 or SHA-1.
https://shattered.io/
 

Lambda

Major Contributor
Joined
Mar 22, 2020
Messages
1,791
Likes
1,525
Sorry, I found this thread after I started this poll:
https://www.audiosciencereview.com/...-sample-rate-are-you-using.20721/#post-687227
Lots of useful information here, and I hope I can add some resources:

This page shows the results of a sample rate converter comparison:
Speex 1.2rc1 with quality 1, 5 and 10 (1 being the lowest, 10 the highest)
Soxr 0.1.1 with quality LQ, MQ, HQ and VHQ (which are low, medium, high and very high quality)
https://lastique.github.io/src_test/

https://thewelltemperedcomputer.com/KB/SRC.htm

https://ccrma.stanford.edu/~jos/resample/

Downsample examples from various software:
https://src.infinitewave.ca/

PulseAudio SRC testing:
http://archimago.blogspot.com/2015/10/measurements-look-at-linux-audio-alsa.html


j_j, it appears to me you're an expert on this topic!
So I have a few dumb theoretical questions.
here is a specific, precise mathematical definition, and it can be executed on a low-end computer in real time these days.

A perfectly band-limited signal can be mathematically perfectly resampled with a specific function, therefore not changing in-band information and not creating out-of-band signals? We/you are talking about "sinc" function windowing?

Please excuse my simple terms...
This ideal sinc response to a pulse is theoretically infinitely long, but at some point the "wiggles" become super small (smaller than one bit?). Since we only have limited input and output precision, and we don't need an ideal filter, a shorter approximation to an ideal sinc response can be used, saving filter length/complexity?

The "longer" the "filter window", the more delay is added and the more DSP "power" is needed?

Thanks in advance
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
A perfectly band-limited signal can be mathematically perfectly resampled with a specific function, therefore not changing in-band information and not creating out-of-band signals? We/you are talking about "sinc" function windowing?

Please excuse my simple terms...
This ideal sinc response to a pulse is theoretically infinitely long, but at some point the "wiggles" become super small (smaller than one bit?). Since we only have limited input and output precision, and we don't need an ideal filter, a shorter approximation to an ideal sinc response can be used, saving filter length/complexity?

The "longer" the "filter window", the more delay is added and the more DSP "power" is needed?

Thanks in advance

A bandlimited signal can be resampled precisely to any desired accuracy, of course depending on how many flops you wish to spend on it.

If this is an offline process, the delay is immaterial, you can always remove it.

There are two things that affect the cost, given a fixed resampling ratio. The first is "how close to the original fs/2 do you need to get" and the second is "how far down do you want any artifacts".

But the point is that you can be accurate to any arbitrary degree, given enough computer.
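Those two cost knobs can be put into numbers with the standard Kaiser-window FIR length estimate, N ≈ (A − 7.95)/(2.285·Δω). This is my own illustration with arbitrary figures, not anything from the thread: tightening the transition band toward fs/2, or pushing artifacts further down, both grow the filter.

```python
import math

def kaiser_length(atten_db, transition_hz, fs):
    """Kaiser-window FIR length estimate: N ~ (A - 7.95) / (2.285 * d_omega)."""
    d_omega = 2 * math.pi * transition_hz / fs   # transition width in rad/sample
    return math.ceil((atten_db - 7.95) / (2.285 * d_omega))

fs = 96_000
for atten_db in (60, 100, 140):                  # how far down the artifacts go
    for transition_hz in (4000, 1000, 250):      # how close to the original fs/2
        n = kaiser_length(atten_db, transition_hz, fs)
        print(f"A = {atten_db:3d} dB, transition = {transition_hz:4d} Hz -> {n:5d} taps")
```

Halving the transition width roughly doubles the tap count, and every extra 20 dB of rejection adds taps linearly, which is exactly the flops-versus-accuracy trade described above.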
 

Lambda

Major Contributor
Joined
Mar 22, 2020
Messages
1,791
Likes
1,525
Thanks for this reply!
of course depending on how many flops you wish to spend on it.

If this is an offline process, the delay is immaterial, you can always remove it.
If it's happening "on the fly" in a DSP/DAC, flops, memory and time are limited, so I assume corners get cut?

"how far down do you want any artifacts".
This would be my next question...
Assuming we don't really care too much about noise above 22 kHz, I assume the filter can be optimized.

Since volume and bit depth also often get changed in digital audio applications, would this be the step to introduce dithering and noise shaping?
In a way, extending in-band SNR by sacrificing out-of-band SNR.

The root of these questions is:
Assuming processing power, storage and bandwidth (in the PC) are basically free and unlimited, it's not inherently stupid or worse to upsample first and then send the data to the DAC, instead of sending "bit perfect" unchanged data to the DAC and relying on its internal upsampler?
 