
Up Sampling

Excellent guys, thanks for the links. The video was excellent, I was quite literally on the edge of my seat (the dog jumped up and took up most of it). I was able to understand why I was thinking about a stepped output from a digital input so I don’t think I’ll be worrying about that anymore, let alone trying to explain my upsampling decision to friends. At a music night get together I was asked what DSD stood for and before I could answer, one of my wise-cracking mates came up with "digital sh*t d*ck". Plenty of guffaws and thigh slaps...

From the Mojo Audio article however, I ended up focusing on "Using player software that is capable of converting PCM to DSD and upsampling it to at least Quad-Rate DSD is highly recommended."

Now I think I understand why I was ignored when I asked for Amir’s reviews to include results of feeding DACs their highest rate accepted of DSD.

I think I’ll mow my AstroTurf and ponder some more.
 
Upsampling doesn’t give you more information; so it’s not hifi-er.

But I am open for perceptual aspects that may be pleasing
Not sure I understand where any "perceptual aspects" can come into play that would modify the actual waveform of the decoded file but then I don't even begin to claim any knowledge in digital recording/playback technology.
The bottom line for me is that if you start with a 16/44 recording (or whatever) and upsample it a bazillion times, it will never contain more musical information than the original file contained. As I understand it the only possible modification of retrieved waveform would be distortions from errors.
KISS is my policy, run your bit perfect data stream of the original file into a quality DAC and it's gonna be as good as it can be.
Just a couple of the Canadian Loonie's or Toonie's my northern friend left me.
:p
 
Yep I've given up dicking about and just play CDs and some files just the way they are.
I suppose some DACs deal with different sample rates differently, but if that causes an audible difference it is probably not a very good DAC.
The quality of the original recording makes more difference than anything we can do with our hifi anyway.
I have never owned a recording with a dynamic range of anything like 16-bit, never mind 20, and of course 24 is purely academic since the analogue side, certainly power amps and speakers, are nowhere near capable of that, and probably never will be.
The background sound level in my room is 35 dB. Adding the 93 dB of real 16-bit audio would need a capability of 128 dB, which is not only deafening but completely beyond the capacity of most domestic sound systems, though I have horns which are 109 dB/watt.
The only person needing 24-bit dynamic range is a lazy recording engineer who wants to be able to record every noise, from a mosquito at the end of the room to a Saturn V blast off, without adjusting the level control - ie nobody at all, it is bonkers.
 

When I approached resampling, my process was thus:

1) Many people (on the internet) claim sonic differences. Could they all be wrong?

2) Some highly regarded engineers claim resampling can make a sonic difference, for example Daniel Weiss:

3) Archimago conducted a listening test on linear vs minimum phase upsampling. He concluded that further tests should be done. Interestingly, his blind test supported HQ Player's view on which filter is best for rock vs classical (studio vs open/acoustic):
http://archimago.blogspot.com/2015/07/the-linear-vs-minimum-phase-upsampling.html?m=1

4) Besides, filters and resampling alter the original, so there is a change, though the big question is if this change is audible.

I guess I am curious by nature, so I bought an HQ Player license to experiment with filters and resampling. My curiosity ended when I found that the original file sounded too good to bother with digital tweaking by filters and resampling.

I recommend everyone to try and nurture their curiosity, their childlikeness, even if experiments are negative.

And in the case of filters and resampling, I cannot conclude that there is something to it, as my experiment was a negative.
 


You have nailed it, 100%. :)

However, can we have a forum without navel-gazers?
 

@Frank Dernie and @Wombat ,

many engineers are convinced that the optimal sampling rate is somewhat above 44.1 kHz.

I can’t find the Stuart/Meridian paper, but this Lavry note makes some of the same conclusions:

http://www.lavryengineering.com/pdfs/lavry-white-paper-the_optimal_sample_rate_for_quality_audio.pdf

Lavry wrote:

«At 60 KHz sampling rate, the contribution of AD and DA to any attenuation in the audible range is negligible. Although 60 KHz would be closer to the ideal; given the existing standards, 88.2 KHz and 96 KHz are closest to the optimal sample rate. At 96 KHz sampling rate the theoretical bandwidth is 48 KHz. In designing a real world converter operating at 96 KHz, one ends up with a bandwidth of approximately 40 KHz».

So it makes sense to operate with sampling rates a bit higher than 44 kHz. I guess that also explains the use of 24 bits. Is there anything such as 16/88 or 16/96?
 
If you think about it another way, it can also mean that native DSD support is not really that important. Most DACs support 88.2/96 kHz anyway. Just use a software player capable of real-time DSD-to-PCM conversion. Its quality is not necessarily poorer than hardware-based solutions. At that point you will also be free from the DSD jail, since you can use PCM volume control, EQ and so on.
 

That's the approach I take. Not just for the convenience of using a software volume control, but for other DSP effects like bass EQ.
 
@andreasmaan,
<snip>
With PCM digital, there is a low-pass filter placed just before the Nyquist frequency (e.g. 22,050 Hz for 44.1 kHz PCM). This filter introduces ringing (pre- and/or post-echo, depending on the filter). At 44.1 kHz PCM, this ringing may be in the audible band (beginning around 20 kHz or lower). By using a higher sample rate, the Nyquist frequency, and therefore the transition range, and therefore the ringing, is pushed up to near the new (higher) Nyquist frequency. For e.g. a 96 kHz sample rate, this pushes the ringing up to the 40 kHz range (exactly where depends on the filter type used). Get the sample rate high enough, and use an appropriate filter, and the ringing stays completely out of the audible spectrum.

Others will definitely be able to explain this in more depth, but that is what I understand to be the gist of the theoretical argument in favour of higher sample rate PCM.
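For a concrete picture of that pre-/post-ringing, here is a minimal numpy sketch (the cutoff frequency and tap count are arbitrary illustrative choices, not anything a specific DAC uses): a truncated ideal brickwall lowpass with linear phase has symmetric taps, so the ringing before the main lobe mirrors the ringing after it.

```python
import numpy as np

# Illustrative values: a truncated ideal (brickwall) lowpass with cutoff near
# Nyquist, as might be used for 44.1 kHz material.
fs, fc = 44_100, 21_000
n = np.arange(-200, 201)                     # 401 linear-phase taps
h = (2 * fc / fs) * np.sinc(2 * fc / fs * n)
# Linear phase means symmetric taps: the ringing before the main lobe
# (pre-echo) exactly mirrors the ringing after it.
print(np.allclose(h[:200], h[201:][::-1]))   # True
```

A minimum-phase design trades that symmetry away, pushing all the ringing after the impulse at the cost of phase linearity, which is exactly the choice Archimago's test explored.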

The filtering issues are the reasons why higher sample rates are used during the recording.
But the sampling process leads to so-called image spectra (images of the sampled audio content) centered around integer multiples of the sampling frequency. That means, if the band-limited audio content ranges from 0 Hz to 20 kHz and you sample it at 44.1 kHz, then the first image will start at 44.1 kHz - 20 kHz = 24.1 kHz and end at 44.1 kHz + 20 kHz = 64.1 kHz; the next starts at (2 x 44.1 kHz) - 20 kHz = 68.2 kHz and ends at (2 x 44.1 kHz) + 20 kHz = 108.2 kHz, and so on. So quite strong additional ultrasonic content is incorporated in the digital data stream.

These image spectra must be filtered out when doing over- or upsampling. If you want to get rid of all of this ultrasonics your lowpass filtering must meet exactly the same specification as it would without any resampling.

I'm not sure if/why a computer would do it better than a dedicated DAC.

It depends; in a DAC all the filtering must happen in real time, while a computer could do it offline and (maybe) therefore with greater "care", meaning higher precision and less interference.
 
Running the DAC faster may or may not help in the end. You must somehow create new samples between the existing samples so, while the algorithms can be pretty sophisticated, it is still a guess.<snip>.

No, the new samples in between aren't "guessed", as no new information will be generated; it is just the information delivered by the existing samples since, due to the Shannon theorem, these samples represent the original signal completely.

So within the constraints of the reality behind Shannon, the "new" samples are most likely as precise (w.r.t. the original signal) as the existing samples.
Of course, we have to deal with quantization errors, but if we consider a random error distribution it is a matter of probability.
 
My first digital recorder experience was 16/48, which was transparent as far as I could hear. I can't see the point in a sampling rate higher than 96 kHz, and whilst 24 bits makes recording more idiot-proof in terms of setting levels, it is pointless for replay.
I have been making digital recordings myself for about 25 years now, and although I now have 192/24 I still reckon 96/24 is fine for recording. Storage space is cheap now, but if it wasn't I would do any level shift necessary to make sure all the sound is in the 16 bits (which will be possible IMO) and probably go to 48 kHz, because the 48/16 file is ⅓ the size and contains all the sound I can hear.
That is just me. One thing I am sure of is that with the sort of music I record there is no audible difference. Maybe a 14-year-old listening to cymbals might be able to hear one; I hear rattling a bunch of keys produces audible differences at 96 kHz over 44.1, but that's not something I listen to seriously :)
 

What? If you oversample then an algorithm must create new samples between the original samples. That is true whether you simply zero-pad or use a complex predictive approach that can generate points that go beyond (in time and amplitude) the original samples. Shannon applies to the original samples, assuming they had no content equal to or higher in frequency than Nyquist at the original sampling rate, but the new samples may (or may not) include "new" material beyond what Shannon predicts for the original samples. Extrapolation, interpolation, prediction, and all that DSP stuff...
 
I'd call converting DSD64 to PCM352 a form of upsampling - while there is certainly no perfect DSD to PCM conversion, 96/24 seems to cover the full spectrum of DSD64.

Also, why would you down sample if not needed? This downsampling would inherently lose data. Why not just convert DSD64 directly to 96k (or 176k if you want to be really really conservative) and not have to ever deal with downsampling and save a lot of disk space too?
There are those who would disagree that DSD64 at 2.8224 MHz is upsampling when going to PCM352kHz. But, of course, the DSD samples are only 1-bit, as opposed to PCM's 24-, 32-, etc. bits. I think of it as a format conversion, not as up or downsampling. And, while 96k PCM is roughly the same bandwidth in data terms as DSD64, the big difference in what the bits represent in each format makes that bandwidth comparison irrelevant.

Recording engineers have found PCM352k to be the most transparent format/sampling rate to use in editing of DSD, and they consistently use that, going back and forth between them only in short snippets of a few samples as necessary in mastering of native DSD recordings. It is common practice, since DSD does not lend itself to much editing. But, they generally avoid doing that in DSD mastering as much as possible.

In my scenario as I described it, I said I use JRiver's fixed standard conversion of DSD64 to PCM352K, downsampling the 352k to 176k only for compatibility with Dirac Live's current 192k upper limit. I also said I would eliminate the downconversion to 176k when the latest Dirac version comes out, using 352k directly, which my DAC easily supports. Sample rate downconversions may be transparent, but they also cannot possibly provide any sonic benefit.

I also prefer using even integer multiples of 44.1k, which is also the basis of DSD sample rates. 88k vs. 96k, 176k vs. 192k, etc. probably do not make much difference these days, however. So, I use the even multiple, and I make no claims that it sounds better than non-integer multiples. But, there is also no compelling reason not to use the even multiples.

I did experiment with downsampling to 88k vs. 176k, and I subjectively and anecdotally slightly preferred the higher one. No big deal, though. Most SACD players, AV processors, etc. do use 88k if they convert DSD to PCM in hardware. The math is simpler, but the sonic result from instead using non-integer multiples may be indistinguishable.

Up or down conversions and format conversions may be relatively benign, but I agree with you: why do them at all unless they are absolutely necessary?
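For reference, the rate relationships mentioned in this thread line up as simple integer ratios; a quick Python sketch (names are illustrative):

```python
# The 44.1 kHz rate family and its relation to DSD64 (illustration only).
CD_RATE = 44_100
pcm_family = [CD_RATE * 2**k for k in range(4)]   # 44100, 88200, 176400, 352800
dsd64 = 64 * CD_RATE                              # 2822400 S/s = 2.8224 MHz
print(pcm_family)
print(dsd64, dsd64 // pcm_family[-1])             # DSD64 is 8x the 352.8k rate
```

This is why DSD-to-PCM conversion at 88.2k, 176.4k, or 352.8k keeps the math to integer ratios, while 96k or 192k targets require non-integer resampling.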
 

Yes, new samples must be created between the existing ones - hence it is still mathematically correctly called interpolation - but, in contrast to other interpolation cases, this one is special: Shannon's sampling theorem gives us the formula to reconstruct the original signal from the samples, and therefore we only have to use the reconstruction formula to calculate the samples for the new sample rate. It is called bandlimited interpolation.
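As a sketch of that bandlimited interpolation, here is the reconstruction formula applied directly in numpy (the rates and test tone are arbitrary illustrative values; a real resampler would use a windowed, finite filter rather than this raw truncated sum):

```python
import numpy as np

# Illustrative values: upsample a 1 kHz tone (sampled at 8 kHz) by 4x using the
# reconstruction formula directly: x(t) = sum_n x[n] * sinc(fs*t - n).
fs, L, N = 8_000, 4, 64
n = np.arange(N)
x = np.sin(2 * np.pi * 1_000 * n / fs)
t = np.arange(N * L) / (fs * L)                  # 4x denser time grid
y = np.array([np.sum(x * np.sinc(fs * ti - n)) for ti in t])
ref = np.sin(2 * np.pi * 1_000 * t)              # the true underlying signal
# The finite (truncated) sum is only accurate away from the edges of the record.
err = np.max(np.abs(y[96:160] - ref[96:160]))
print(err < 0.05)  # True: interior interpolated samples track the original signal
```

The error in the interior comes only from truncating the infinite sum, not from any "guessing" about the signal between samples.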
 
That opinion of Lavry comes from j_j who is a member here. You could ask him about it.

Lavry is unusual in that he starts a slower roll-off at 30 kHz for 96k sample rates.
 
Yes, new samples must be created between the existing ones - hence it is still mathematically correct called interpolation - but, in constrast to other interpolation cases, this one is special as Shannon´s Sampling Theorem gives us the formula to reconstruct the original signal from the samples and therefore we only have to use the "reconstruction formula" to calculate the samples for the new sample rate. It is called bandlimited interpolation.

Shannon's Theorem (I call it the Nyquist Theorem, or Nyquist-Shannon Theorem, but it is all the same thing), relates back to information theory and defines the information bandwidth that can be successfully recovered from sampled data. You could apply simple interpolation, which does generate "new" samples in the sense that you are creating data that was not there in the original, but those samples would not contain any additional information so perhaps the problem is simply the way I define "new" samples. Even if you do nothing but zero-pad the data, the added zeros are "new" data not present in the original bit (sample) stream.

When upsampling data, which is what I thought was the discussion, the "new" samples must fit the theorem with respect to the new (faster) sampling rate, thus now the samples are no longer bound by the lower rate. That is, the new values may not fit the constraints of the lower sample rate, but can have information bandwidth greater than the original. In other words, the additional ("new") samples created during upsampling can contain information not in the original samples due to the new (higher) Nyquist rate. To use numbers, if I upsample a CD-rate bit stream from 44.1 kS/s to 88.2 kS/s, I can now create data with frequencies up to 44 kHz where before it was limited to 22 kHz. Some papers use "extrapolation" to describe data that contains frequency information beyond the original signal. There are predictive algorithms that do try to add higher-frequency data based on the trend of the original samples, at least in the RF world; audio-rate converters are not my day job.
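As a concrete reference point for this back-and-forth: whatever one calls the new samples, an ideal upsampler leaves every original sample intact among the new ones. A numpy sketch, under the assumption of a periodic, bandlimited input (the helper name `upsample_fft` is made up for illustration):

```python
import numpy as np

def upsample_fft(x, L):
    """Ideal (bandlimited) upsampling by zero-padding the spectrum."""
    X = np.fft.rfft(x)
    Xup = np.concatenate([X, np.zeros(len(x) * (L - 1) // 2)])
    # irfft zero-pads Xup as needed for the longer output; scale by L to
    # compensate for the 1/n normalization at the new length.
    return np.fft.irfft(Xup, n=len(x) * L) * L

fs, L = 44_100, 2
n = np.arange(441)                          # odd length: no Nyquist-bin ambiguity
x = np.sin(2 * np.pi * 1_000 * n / fs)      # 1 kHz tone, bin-aligned over 441 samples
y = upsample_fft(x, L)
# Every L-th output sample reproduces an original sample exactly.
print(np.allclose(y[::L], x, atol=1e-9))    # True
```

The interpolated samples in between are "new" data points, but for a signal bandlimited below the original Nyquist they are fully determined by the originals, which is the crux of the disagreement above.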

At this point I suspect we are far beyond what most of the readership cares about and down to differences in whatever courses and experience we have in defining the terms...
 
I am not seeing all of the posts regarding higher sample rate and bit depth but as a reminder they are two different things used for different reasons:
  • Higher sampling rates allow greater signal bandwidth and/or simpler filters. Greater bandwidth is hopefully clear; you can capture higher frequencies (not a big deal to me at my age!). Simpler filters, or just filters with higher-frequency rolloff, can impact the desired signal band less than when higher-order filters are required very near the signal band. Moving the filters away from (above) the signal band benefits the signal band itself because real filters have finite rolloff.
  • Higher resolution (more bits) allows more dynamic range. This means you can capture more of the original signal, and perhaps as or more importantly allows more headroom during the mixing and mastering process to combine and balance the level of instruments (voices, explosions, whatever) in the mix before creating the final master.
Going from 16-bit/44.1 kS/s CD rate to 24-bit/192 kS/s hi-res means (theoretically) you increase SNR (signal-to-noise ratio) from ~98 dB to ~146 dB and bandwidth from ~22 kHz to 96 kHz. In practice both dynamic range and bandwidth will be less than theory, natch.
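Those theoretical figures follow from the usual quantization-SNR rule of thumb; a quick Python restatement (illustration only):

```python
# Theoretical quantization SNR of an N-bit full-scale sine: 6.02*N + 1.76 dB,
# and theoretical bandwidth = half the sample rate.
def snr_db(bits: int) -> float:
    return 6.02 * bits + 1.76

for bits, fs in [(16, 44_100), (24, 192_000)]:
    print(f"{bits}-bit/{fs} S/s: ~{snr_db(bits):.0f} dB SNR, {fs // 2} Hz bandwidth")
```

This reproduces the ~98 dB/22 kHz and ~146 dB/96 kHz figures quoted above; real converters fall short of both, as noted.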

HTH - Don
 