# Understanding Upsampling/Interpolation

#### amirm

Staff Member
CFO (Chief Fun Officer)
Signal processing methods can be complex to understand, which leads to misconceptions. None falls victim to this more than upsampling, or interpolation. I am sure you have heard people say they play their HD content on a 4K display and it looks "almost 4K." Same with audio. There is a notion that upsampling content to a higher sample rate will result in more resolution. Alas, both of these are completely false.

Nothing in the process of interpolation creates more information or detail. Nothing. The algorithms (methods) used have no intelligence whatsoever; they do not try to guess what is supposed to be there. Instead they rely on methods that simply generate more pixels (image/video) or PCM samples (audio).

The definition of an ideal interpolator or resampler is one that creates these new samples but generates no new distortion. That's right: the only thing an interpolator can do is reduce fidelity, never increase it!

If I asked you what the number is between 1 and 3, what would you say? 2? Nope. Who says that number was halfway in the middle? Yet that is what a simple mathematical method would produce. By using more samples we may be able to better that guess, but ultimately the process is quite "dumb." It makes no attempt to understand the sequence of numbers and figure out, as say a human would, what the missing values should be.
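
As a sketch of how "dumb" the math is, here is linear interpolation in a few lines of Python (the function name and values are mine, purely illustrative):

```python
# A linear interpolator always guesses a point on the straight line between
# the two known samples, regardless of what the signal actually did there.
def lerp(a, b, t=0.5):
    return a + (b - a) * t

# Known samples: 1 and 3. The midpoint guess is 2...
print(lerp(1, 3))  # prints 2.0
# ...but the true signal might have peaked at 7 between those samples.
# The interpolator has no way to know; no information is recovered.
```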

Let's visualize these with an image. Here is a picture I took of the Sake barrels ("Kazaridaru") outside of the Meiji Shrine in Tokyo, Japan:

Its dimensions are 1024 pixels wide by 648 pixels high. Now let's reduce the resolution of that image by half in each dimension and then "upsample" it back to 1024x648. If the process could make up the information that was lost when we reduced its size, the new image should look just like the one above:

I hope you see that we did not get there. The dimensions of the image are the same as before, but clearly the image is softer. In image/video, a softer image means it has less high-frequency information. That is exactly what happens when we chop down an 88.2 kHz high-res audio sample to 44.1 kHz. Half the bandwidth is lost in that conversion. Upsampling back to 88.2 kHz will not get us that lost detail back. It is the same process as what happened to the above image.

This is not the full story though. Let's look at the same transformation to half size and then back up, this time with a different interpolation/upsampling algorithm:

Notice the image is now sharper than the previous enlargement. In some way it seems that we got some of our lost resolution back. But we did not. What happened was that we took the lower-resolution image and amplified its high-frequency content, giving the illusion of more resolution. In audio terms, it is like turning up the treble. You will get a brighter sound, but it recovers no more detail than what is in the music.

Note that there is a trade-off in doing this. If you look around the characters on the barrels, there is a white halo. That is distortion we created by boosting the high frequencies as we enlarged the image. Now you see how true my earlier definition of interpolation/resampling was: we can do harm in the process of conversion.

That harm, though, may seem like a benefit. If you step back from your monitor far enough, you may not notice the artifacts but still see the benefit of a sharper image. Same with audio. A resampling/upsampling algorithm can make the sound different, increasing or reducing its highs, and subjectively make the sound better to us. But in no case does it actually recover information that was lost. We are simply manipulating the information we have.

Another way to understand this topic is that interpolation is actually a low-pass filter! Yes, that is all it is. Let's look at the simplest way we can enlarge an image, which is to simply double up the pixels:

Notice that we now have a very sharp image, but obviously something is wrong with it. It is all "pixelated." Those blocks invent new high-frequency information that is not supposed to be there (the sharp edges of the blocks). A low-pass filter can remedy that by filing off those sharp blocks, turning the result into the first enlargement we had.
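
The pixel-doubling-then-smoothing idea can be sketched in one dimension (a hypothetical 4-sample signal of my own; the same logic applies per image row):

```python
# Hypothetical 1-D signal standing in for one row of image pixels.
signal = [1.0, 4.0, 2.0, 5.0]

# Step 1: enlarge by doubling up every sample - sharp "blocky" edges appear.
doubled = [s for s in signal for _ in (0, 1)]
# doubled == [1.0, 1.0, 4.0, 4.0, 2.0, 2.0, 5.0, 5.0]

# Step 2: a simple 2-point moving-average low-pass filter files off the
# blocks; the new in-between values land at the linear-interpolation midpoints.
smoothed = [(doubled[i] + doubled[i + 1]) / 2 for i in range(len(doubled) - 1)]
# smoothed == [1.0, 2.5, 4.0, 3.0, 2.0, 3.5, 5.0]
```

The low-pass step is what turns the blocky enlargement into the soft one: no new detail appears, the blocks are just smoothed away.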


#### Sal1950

##### Major Contributor
The Chicago Crusher
Forum Donor
The definition of an ideal interpolator or resampler is one that creates these new samples but generates no new distortion. That's right. The only thing an interpolator can do is to reduce fidelity, not increase it!
Thanks Amir, the KISS type guy I've always been told me that was so.
But with audio I've always been told that upsampling to a higher rate allows the use of better DAC filters, resulting in better sound? I'm totally lost in this technology so I have no idea how to separate the truth from the BS.

OP

#### amirm

Staff Member
CFO (Chief Fun Officer)
That is true. Upsampling allows a more gentle filter to be used since there is so much ultrasonic bandwidth to sacrifice. This crude picture shows the concept:

Gentle filters have fewer artifacts in the 20 kHz range that we care about.

Now, whether this is an audible problem is another matter.

#### Sal1950

##### Major Contributor
The Chicago Crusher
Forum Donor
Now, whether this is an audible problem is another matter.
Kool, now that makes it all perfectly clear.

#### Blumlein 88

##### Major Contributor
Forum Donor
If we could use a first-order 6 dB/octave filter there would be no ringing. To manage a -120 dB stopband with such a filter, and flat response to 20 kHz (and 200 kHz at -3 dB), we only need a 420 GHz sampling rate.
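
The tongue-in-cheek arithmetic above can be checked in a few lines of Python (just redoing the octave math; variable names are mine):

```python
# A first-order filter rolls off at 6 dB/octave, so reaching -120 dB takes
# 120 / 6 = 20 octaves above the -3 dB corner (200 kHz in the joke above).
corner_hz = 200e3
octaves = 120 / 6                          # 20 octaves
stopband_hz = corner_hz * 2 ** octaves     # ~210 GHz
sample_rate_hz = 2 * stopband_hz           # Nyquist: twice the stopband edge
print(f"{sample_rate_hz / 1e9:.0f} GHz")   # prints "419 GHz" - call it 420
```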

#### Werner

##### Active Member
That is true. Upsampling allows a more gentle filter to be used since there is so much ultrasonic bandwidth to sacrifice.
NOOOOOOOOOOOOO!!!

This is patently not true. The mere act of upsampling (inserting zero samples) does not change the spectrum of the signal. In the context of audio, upsampling is always in combination with filtering. And this filtering is steep and at the original Nyquist frequency. It is not relaxed at all.

The notion that up/oversampling eases the filtering holds only for the final analogue output filter, and in the presence of steep digital filtering preceding that.

#### DonH56

##### Technical Expert
Technical Expert
Forum Donor
I think Amir and Werner are on the same page, just different paragraphs, due to the way Amir expressed himself. IME one of the main reasons for upsampling (zero padding and interpolation) at the DAC is to relax the image filter requirements. The signal at the output of a DAC "folds" around the Nyquist frequency, i.e. one-half the sampling frequency. The spectrum folds around each multiple of Nyquist (22.05, 44.1, 66.15, etc. kHz at CD rate), so at 44.1 kS/s a signal at 20 kHz has its first image at (44.1 - 20) = 24.1 kHz, as shown in Amir's picture. Increase the sampling rate to 88.2 kS/s and the first image moves out to 68.2 kHz, where it is much easier to suppress (filter). But you must still have an interpolation filter to make it work; just sticking in zeros does not change the signal spectrum, as Werner said.

And of course there will be less in-band phase shift and roll-off with the softer filter roll-off, and it is cheaper and easier to design, and so forth. For a delta-sigma converter, probably the majority in use today, you also have to suppress the quantization noise generated by the DAC's modulator, so you need a sharp filter for that as well. You can do that in the digital domain, where it is easier to implement (assuming the logic/processor has the horsepower) and is more stable with respect to component drift, noise, and so forth.
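
The image math above can be sketched as a small Python helper (hypothetical function of my own, illustrative only):

```python
# A sampled tone at f appears as images at k*fs - f and k*fs + f for every
# integer multiple k of the sampling rate fs.
def image_frequencies(f_hz, fs_hz, k_max=2):
    imgs = []
    for k in range(1, k_max + 1):
        imgs += [k * fs_hz - f_hz, k * fs_hz + f_hz]
    return imgs

# 20 kHz tone at CD rate: first image at 24.1 kHz, right next to the audio band.
print(image_frequencies(20e3, 44.1e3))  # [24100.0, 64100.0, 68200.0, 108200.0]

# At 88.2 kS/s the first image moves out to 68.2 kHz - far easier to filter.
print(image_frequencies(20e3, 88.2e3))  # [68200.0, 108200.0, 156400.0, 196400.0]
```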

There are other ways to perform image suppression, mostly used in the RF world, that use out-of-phase converters or mixers (usually with a carrier/clock suppression scheme) such that the images are canceled. These seem less common in audio, probably due to their higher cost and more stringent system level requirements for matching and tracking between (or among for more than two) the converters.

#### Ken Newton

##### Active Member
The practical benefit of DAC output filtering for human audio applications is to eliminate the possibility of undesired circuit behavior in the post-DAC active electronics, such as producing intermodulation distortion. Many DSP applications require sharp post-DAC suppression of waveform sampling images, but it seems fairly questionable whether consumer audio is one of them. The playback chain contains several band-limiting devices aside from any within the DAC unit. The loudspeakers and the human ear are both band limited near 20 kHz, and thereby serve as waveform reconstruction filters.

However, more interesting to me than the objective benefits of interpolation is the subjective effect of different interpolation modes. I find it very easy to hear differences among various interpolation filter settings, even though the signal is being band limited elsewhere in the playback chain. Such differences appear spread across the audible band, from bass to treble, as well as impacting the perceived stereo effect. So, if the subjective differences are not due to frequency-domain differences then they seem likely due to time-domain differences. Yet that possibility, too, is questionable, because the band limiting inherently provided by speakers and ears also produces ringing. Not to mention the ringing already encoded in the digital music due to recording-chain band limiting.

These observations seem difficult to reconcile. One possibility is that some non-obvious secondary system parameter is also being affected with changes in filter mode.

#### DonH56

##### Technical Expert
Technical Expert
Forum Donor
Filter design is complicated... I have done a fair amount of it, analog and digital, but do not claim to be an expert. Take my thoughts as speculative.

I tend to agree that HF images are not likely to be heard but it depends on whether something (amp, speakers, ears, whatever) mixes them back down to audibility. Or something else happens that causes audible artifacts (see below).

The time-domain response of filters can vary quite a bit, and thus the resulting artifacts. In addition to the filters themselves, applying high-frequency content to the analog buffer at the output can cause all sorts of other strangeness depending upon how the buffer deals with the HF signal. Even worse for non-oversampled designs that potentially generate fairly large glitches as the signal changes, essentially little impulses that must be filtered out as well.

Some amplifier stages (low or high level) might not respond well to ultrasonic signals, especially in the form of little impulse functions (glitches) rather than just broadband noise. They may saturate (clip) and recover slowly, may ring themselves, or even worse break into oscillation. Worst would be ultrasonic oscillation so you don't hear anything until your tweeters get fried. Using the speakers as the filter sounds a little scary to me; might be no problem, but the extra energy has to go somewhere, and if the tweeter can't respond fast enough that means heat.

This is probably getting a bit off the original subject of what interpolation does (and does not) do but is pretty relevant IMO. Note you can interpolate a number of ways, from simple linear interpolation, to more complex sinc or polynomial functions, to fancy predictive interpolation and such.

IME/IMO/whatever - Don

OP

#### amirm

Staff Member
CFO (Chief Fun Officer)
The notion that up/oversampling eases the filtering holds only for the final analogue output filter, and in the presence of steep digital filtering preceding that.

#### Ken Newton

##### Active Member
...This is probably getting a bit off the original subject of what interpolation does (and does not) do but is pretty relevant IMO. Note you can interpolate a number of ways, from simple linear interpolation, to more complex sinc or polynomial functions, to fancy predictive interpolation and such.

IME/IMO/whatever - Don
Don, I concur that this subject is pretty relevant. For example, the time-domain behavior of the anti-alias and anti-image filters is, as far as I can determine, the primary performance focus of MQA. I've long been technically bothered by the easily heard subjective difference in interpolation filter mode. Particularly bothersome are the distinct differences I hear even with the human voice, which does not have the bandwidth to provoke significant filter ringing. I often hear obvious differences with lower-register instruments too, such as the bass violin.

#### DonH56

##### Technical Expert
Technical Expert
Forum Donor
There can be fairly broad spectral content even in things we do not normally associate with it, but also note that ringing and other artifacts are sometimes a function of the filter (and/or modulator) design and have little to do with the input signal except that it is there. For example, tones were a big problem in early delta-sigma designs, but are mostly if not completely a non-issue today due to dither and advances in converter (ADC and DAC) architecture and implementation. There are still other artifacts that arise, however, depending upon modulator order, implementation, analog buffers, phase of the moon, and so forth. But, I am over my head on this, as I have not done significant audio-frequency design recently and my filter theory classes were a while ago...

Look again at Amir's first post to get an idea of what various interpolation and filtering can do. It is visual instead of audio, but much the same concepts apply, and you can see how things change with the implementation and how "better" isn't always better, nor "worse" always worse, depending upon your point of view and what you like to see (hear).

I do not have a DAC with selectable image filters. I'd love to try one out sometime for myself. In my spare time, probably after I retire, about 147.367 years from now...

#### Sal1950

##### Major Contributor
The Chicago Crusher
Forum Donor
I've long been technically bothered by the easily heard subjective difference in interpolation filter mode.
There can be fairly broad spectral content even in things we do not normally associate with it,
Bouncing back to Amir's earlier thoughts "whether this is an audible ---"
Are you then confident that you could identify these "easily heard subjective differences" under bias-controlled blind listening tests?
Or maybe one of you already has?

#### Ken Newton

##### Active Member
Bouncing back to Amir's earlier thoughts "whether this is an audible ---"
Are you then confident that you could identify these "easily heard subjective differences" under bias-controlled blind listening tests?
Or maybe one of you already has?
Yes, at least with my own system, I feel VERY confident, and that fact bothers me. I've no comfortable technical explanation for why I should hear so obvious a difference.

#### DonH56

##### Technical Expert
Technical Expert
Forum Donor
As I said, I have no easy way of comparing the differences with my present equipment. It should be fairly easy to measure, given the right equipment, which I also lack.

#### RayDunzl

##### Major Contributor
Central Scrutinizer
Signal processing methods can be complex to understand leading to misconceptions. None is more victim of that than upsampling or interpolation. I am sure you have heard of people saying they play their HD content on 4K display and it looks "almost 4K." Same with audio. There is this notion that upsampling content to higher sample rate will result in more resolution. Alas, both of these are completely false.
Ok, I'm confused and a bit ignorant...

Werner says upsampling adds "zero" samples in between. I'm not sure I understand that (or haven't the background to read it correctly).

Interpolation adds samples in between the originals and calculates values for them. That I can understand, or at least think I do.

The original examples above were based on Video.

For Audio, maybe there is another complication available.

My DAC2* adds ASRC (Asynchronous Sample Rate Conversion) and, as I read it, turns all the S/PDIF input rates into 211 kHz as part of its jitter reduction/elimination. The reason given is that their tests showed the best (measured?) performance using that rate.

It would seem that it must insert new samples and, due to the non-integer sample rate multiplication, recalculate the original samples as well as calculating the inserted sample values (if they are not inserted as zero).

Oh boy! Now I find "Conceptual Oversampling" (scroll way down)

CONCEPTUAL OVERSAMPLING
The digital filters in the DAC2 operate at a conceptual sample rate of about 250 GHz. Incoming audio is conceptually upsampled to 250 GHz and then down sampled to 211 kHz using a filter that mathematically behaves as if it is operating at a 250 Giga-sample-per-second rate. We use the word "conceptual" because the calculations and internal clocks are not actually running at 250 GHz. Due to the mathematics of upsampling, most of the filter calculations require a multiply by zero operation. These unnecessary zero-product calculations are eliminated while all of the non-zero calculations are executed. The net result is mathematically equal to the results that would have been produced by executing every calculation at a 250 GHz sample rate. Eliminating the unnecessary calculations reduces the DSP and processing rates to a manageable load.

OP

#### amirm

Staff Member
CFO (Chief Fun Officer)
Werner says upsampling adds "zero" samples in between. I'm not sure I understand that (or haven't the background to read it correctly).
It is the same thing I explained at the end of my article. Namely, you insert zeros to increase the sample rate. Now you have the original data plus zeros in between. That by itself creates lots of distortion, of course. If you then low-pass filter that to the original bandwidth of the signal, the problem goes away and the zeros change to the interpolated values.

It is just an implementation detail.
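
A one-dimensional Python sketch of the zero-stuff-then-filter process described above (the 3-tap filter is my own crude choice for illustration, not any real DAC's design):

```python
# Hypothetical 4-sample signal to be upsampled 2x.
signal = [1.0, 3.0, 2.0, 4.0]

# Step 1: insert a zero between every pair of samples (the "upsample" step).
stuffed = [x for s in signal for x in (s, 0.0)]
# stuffed == [1.0, 0.0, 3.0, 0.0, 2.0, 0.0, 4.0, 0.0]

# Step 2: low-pass filter. The taps [0.5, 1.0, 0.5] pass original samples
# through unchanged and replace each zero with the average of its neighbours,
# i.e. the interpolated value - exactly as described in the post above.
taps = [0.5, 1.0, 0.5]
padded = [0.0] + stuffed + [0.0]
upsampled = [sum(t * padded[i + j] for j, t in enumerate(taps))
             for i in range(len(stuffed))]
# upsampled == [1.0, 2.0, 3.0, 2.5, 2.0, 3.0, 4.0, 2.0]
# (the last sample is an edge artifact of the zero padding)
```

A real interpolation filter uses far more taps for a far better approximation, but the structure is the same: zeros in, low-pass filter, interpolated samples out.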

OP

#### amirm

Staff Member
CFO (Chief Fun Officer)
Oh boy! Now I find "Conceptual Oversampling" (scroll way down)

CONCEPTUAL OVERSAMPLING
The digital filters in the DAC2 operate at a conceptual sample rate of about 250 GHz. Incoming audio is conceptually upsampled to 250 GHz and then down sampled to 211 kHz using a filter that mathematically behaves as if it is operating at a 250 Giga-sample-per-second rate. We use the word "conceptual" because the calculations and internal clocks are not actually running at 250 GHz. Due to the mathematics of upsampling, most of the filter calculations require a multiply by zero operation. These unnecessary zero-product calculations are eliminated while all of the non-zero calculations are executed. The net result is mathematically equal to the results that would have been produced by executing every calculation at a 250 GHz sample rate. Eliminating the unnecessary calculations reduces the DSP and processing rates to a manageable load.
It is a common technique to count the zero filter coefficients with respect to filter strength/merit as they are doing. The results are usually stated as a number of "taps," though, rather than a frequency. But the conversion is one and the same.

The ideal is to use as many taps as you can so that the interpolated value has the least distortion. Audio runs pretty slowly compared to video and it is one-dimensional, so it is pretty easy these days to use many taps. There is a point of diminishing returns though, so going to infinity is not necessary.
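
For illustration, here is one common way such tap coefficients can be generated: a Hann-windowed sinc (my own sketch under that assumption, not the DAC2's actual filter):

```python
import math

def windowed_sinc(n_taps, cutoff):
    """Hann-windowed sinc low-pass; cutoff is a fraction of the sample rate
    (0.25 = half of Nyquist, the right cutoff for 2x interpolation)."""
    mid = (n_taps - 1) / 2
    taps = []
    for i in range(n_taps):
        x = i - mid
        # Ideal (infinite) sinc response, truncated to n_taps points...
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # ...tapered by a Hann window to tame the truncation ripple.
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * i / (n_taps - 1))
        taps.append(ideal * hann)
    return taps

# More taps -> a closer approximation of the ideal filter (DC gain nearer 1.0),
# with diminishing returns as the count grows.
for n in (11, 51, 201):
    print(n, sum(windowed_sinc(n, 0.25)))
```

The ideal interpolation filter is an infinite sinc; every practical filter truncates it, and the tap count is how that truncation quality is usually quoted.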

#### March Audio

##### Major Contributor
Manufacturer
There can be fairly broad spectral content even in things we do not normally associate with it, but also note that ringing and other artifacts are sometimes a function of the filter (and/or modulator) design and have little to do with the input signal except that it is there. For example, tones were a big problem in early delta-sigma designs, but are mostly if not completely a non-issue today due to dither and advances in converter (ADC and DAC) architecture and implementation. There are still other artifacts that arise, however, depending upon modulator order, implementation, analog buffers, phase of the moon, and so forth. But, I am over my head on this, as I have not done significant audio-frequency design recently and my filter theory classes were a while ago...

Look again at Amir's first post to get an idea of what various interpolation and filtering can do. Visual instead of audio but much the same concepts and you can see how things can be changed by the implementation and how better isn't always, nor is worse, depnding upon your point of view and what you like to see (hear).

I do not have a DAC with selectable image filters. I'd love to try one out sometime for myself. In my spare time, probably after I retire, about 147.367 years from now...
This is an interesting DAC I have recently discovered. You can even design your own filters.

http://www.diyaudio.com/forums/vend...crete-r-2r-sign-magnitude-24-bit-384-khz.html

http://www.moredamfilters.info

The differences between filters are audible, if subtle.


#### bennetng

##### Addicted to Fun and Learning
There are also some newer image scaling algorithms that make use of neural networks, like waifu2x, NNEDI3 and NGU; here is a screenshot of NGU scaling. The major shortcoming of such methods is that the processing quality is not as consistent and predictable as with traditional methods.