
Time resolution of Redbook (16/44) PCM

Status
Not open for further replies.

danadam

Addicted to Fun and Learning
Joined
Jan 20, 2017
Messages
976
Likes
1,519
I would like to point out that the examples shared in this thread are somewhat simplified as compared to real music and sound, as the examples are using:

- Single sine wave, as opposed to multiple sine waves of different frequency and phase mixed up;
- Static sine wave, instead of changing frequency and phase over time;
- Large/unlimited number of samples, instead of potentially short lived sine waves (2-3 samples).
A single sine wave? Have you checked the spectrogram of my file?:
[Attached image: imp.all.44.png]
 

PaulD

Senior Member
Joined
Jul 22, 2018
Messages
453
Likes
1,341
Location
Other
I am reasonably familiar with the case of single impulse, effects of minimum vs linear phase filtering while reconstructing, etc. My question involves a more complex scenario, let me try to formulate it a bit better.

- Let sampling frequency be Fs=44.1KHz and sampling depth be D=16 bits;
- Let there be a signal generator consisting of a large number (say, 2^12=4096) sine wave generators, with a constantly changing frequency (between, say Fs/4 and Fs/2, to make it a bit more difficult), amplitude (0 to (D/12)^2), and phase (between 0 and 360). All the sine waves are added and there is no clipping;
- We sample the output of the signal generator at Fs and apply reconstruction filter, then FFT to extract the original values of Frequency, Amplitude, and Phase, for each of the 4096 sine waves.

In the process, we end up with sequences of (frequency, amplitude, phase) at the generator input and (frequency, amplitude, phase) at the output of FFT.

Clearly there will be a difference between the input values of the signal generator and the output values after reconstruction and FFT, as there are only 16 bits of information emitted per sample, right? Will the difference reduce if we sample at 24 bits? Will the difference further reduce if we sample at 2*Fs, 4*Fs, etc.?
I would suggest watching the Monty video. I will put the link below.

For the initial point, there is no limitation to the timing resolution of a 44.1 kHz sampled waveform from a band-limited input.
The video has the time set in it to play from the part where this is demonstrated in the analog domain - you can watch it on an analog scope.

I would recommend watching it a few times from the beginning. It will show why a dithered signal has resolution below 1 bit. It also demonstrates this in the analog domain.

I think this video would answer all of your questions with a demonstration.
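
The sub-sample timing claim can also be checked numerically. The following is a minimal sketch, not the demonstration from the video: it assumes a 1 kHz tone, a 1 µs delay (far smaller than the 22.7 µs sample interval), 16-bit quantization with TPDF dither, and a simple single-bin DFT phase estimate. The helper names `quantize_16bit_tpdf` and `estimate_delay` are made up for this illustration.

```python
import numpy as np

fs = 44_100.0            # sampling rate in Hz
f0 = 1_000.0             # test tone, an exact number of cycles in one second
true_delay = 1e-6        # 1 microsecond, about 1/23 of a sample interval
n = np.arange(44_100)    # one second of samples
rng = np.random.default_rng(0)

def quantize_16bit_tpdf(x, rng):
    """Round to a 16-bit grid (full scale +/-1) after adding TPDF dither of +/-1 LSB."""
    lsb = 1.0 / 32768.0
    dither = (rng.random(x.shape) - rng.random(x.shape)) * lsb
    return np.round((x + dither) / lsb) * lsb

def estimate_delay(x, f0, fs):
    """Estimate the delay of a sine at f0 from the phase of its DFT bin."""
    k = np.arange(len(x))
    bin_value = np.sum(x * np.exp(-2j * np.pi * f0 * k / fs))
    # An undelayed sine has phase -pi/2 in this bin; extra phase lag is 2*pi*f0*delay.
    return -(np.angle(bin_value) + np.pi / 2) / (2 * np.pi * f0)

delayed = 0.5 * np.sin(2 * np.pi * f0 * (n / fs - true_delay))
est = estimate_delay(quantize_16bit_tpdf(delayed, rng), f0, fs)

print(f"sample interval: {1e6 / fs:.2f} us")
print(f"true delay:      {true_delay * 1e6:.4f} us")
print(f"estimated delay: {est * 1e6:.4f} us")
```

Under these assumptions the estimate lands well within a nanosecond of the true delay, even though the delay itself is a small fraction of one sample interval.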
 

PierreV

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
1,448
Likes
4,812
It's amazing to consider that kids accept that a quadratic equation has one solution (or two identical solutions, if one really wants to nitpick) when the discriminant is zero, but that many audiophiles seem to have fundamental problems with the unique solution that shows up when sampling a band-limited signal...
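
To make the "unique solution" point concrete, here is a small sketch under simplifying assumptions: a hypothetical three-tone test signal that is periodic over the analysis window and entirely below fs/2, reconstructed between the samples by zero-padding its spectrum. For this idealized periodic case the values between the samples are recovered to within floating-point rounding.

```python
import numpy as np

fs = 44_100.0
N = 441          # 10 ms of samples, so DFT bins are fs/N = 100 Hz apart
up = 8           # reconstruct 8 points per original sample interval

# Hypothetical test signal: (amplitude, frequency in Hz, phase in rad), all below fs/2
tones = [(0.5, 1_000.0, 0.3), (0.3, 7_300.0, 1.1), (0.2, 19_900.0, 2.0)]

def signal(t):
    """The continuous-time band-limited signal, evaluated at times t (seconds)."""
    return sum(a * np.sin(2 * np.pi * f * t + p) for a, f, p in tones)

samples = signal(np.arange(N) / fs)

# Band-limited interpolation: keep the spectrum, pad with zeros above fs/2,
# and transform back on a grid that is `up` times finer.
spectrum = np.fft.rfft(samples)
dense = np.fft.irfft(spectrum, n=N * up) * up

truth = signal(np.arange(N * up) / (fs * up))
max_error = np.max(np.abs(dense - truth))
print(f"largest reconstruction error between samples: {max_error:.2e}")
```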
 

charleski

Major Contributor
Joined
Dec 15, 2019
Messages
1,098
Likes
2,240
Location
Manchester UK
I am reasonably familiar with the case of single impulse, effects of minimum vs linear phase filtering while reconstructing, etc. My question involves a more complex scenario, let me try to formulate it a bit better.

- Let sampling frequency be Fs=44.1KHz and sampling depth be D=16 bits;
- Let there be a signal generator consisting of a large number (say, 2^12=4096) sine wave generators, with a constantly changing frequency (between, say Fs/4 and Fs/2, to make it a bit more difficult), amplitude (0 to (D/12)^2), and phase (between 0 and 360). All the sine waves are added and there is no clipping;
- We sample the output of the signal generator at Fs and apply reconstruction filter, then FFT to extract the original values of Frequency, Amplitude, and Phase, for each of the 4096 sine waves.

In the process, we end up with sequences of (frequency, amplitude, phase) at the generator input and (frequency, amplitude, phase) at the output of FFT.

Clearly there will be a difference between the input values of the signal generator and the output values after reconstruction and FFT, as there are only 16 bits of information emitted per sample, right? Will the difference reduce if we sample at 24 bits? Will the difference further reduce if we sample at 2*Fs, 4*Fs, etc.?
The equation for this is given here. It can be simplified as:
tmin = 1/(π * fs * (2^b -1))
Where fs = sampling rate and b = no of bits

Needless to say, this is a lot less than the 1/πfs figure touted by those who claim resolution is simply limited by sampling rate, but it's not 0.
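
For a sense of scale, the formula exactly as posted above can be evaluated for 16 and 24 bits. This is only arithmetic on that expression; the function name `t_min` and the choice of 44.1 kHz for the frequency term are taken from the post or invented for this sketch, and whether that term should be the sampling rate is a separate question.

```python
import math

def t_min(freq_hz, bits):
    """tmin = 1 / (pi * f * (2^b - 1)), as written in the post above."""
    return 1.0 / (math.pi * freq_hz * (2 ** bits - 1))

fs = 44_100.0
sample_interval = 1.0 / fs

t16 = t_min(fs, 16)
t24 = t_min(fs, 24)

print(f"sample interval: {sample_interval * 1e6:.2f} us")
print(f"tmin at 16 bits: {t16 * 1e12:.1f} ps")
print(f"tmin at 24 bits: {t24 * 1e12:.3f} ps")
```

At 16 bits the result is on the order of a tenth of a nanosecond, several orders of magnitude below the sample interval.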
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,880
Likes
16,666
Location
Monument, CO
The equation for this is given here. It can be simplified as:
tmin = 1/(π * fs * (2^b -1))
Where fs = sampling rate and b = no of bits

Needless to say, this is a lot less than the 1/πfs figure touted by those who claim resolution is simply limited by sampling rate, but it's not 0.

That should be signal frequency, not sampling frequency, in the equation... Aperture time does not depend upon sampling rate.
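
One common way to arrive at an expression of this form is a slew-rate argument. The sketch below assumes a full-scale sine of frequency f into a converter with full-scale range V_FS and b bits, and asks how long the signal takes to move by one LSB at its steepest point. The symbols V_FS and q are introduced here for illustration.

```latex
\[
  v(t) = \frac{V_{FS}}{2}\,\sin(2\pi f t)
  \quad\Longrightarrow\quad
  \max_t \left|\frac{dv}{dt}\right| = \pi f\, V_{FS}
\]
\[
  q = \frac{V_{FS}}{2^{b}}
  \quad\Longrightarrow\quad
  t_{\min} = \frac{q}{\max_t |dv/dt|} = \frac{1}{\pi f\, 2^{b}}
\]
```

Taking the step size as V_FS/(2^b - 1) instead gives the posted form with 2^b - 1 in the denominator. Only the signal frequency and the bit depth appear; the sampling rate does not enter the derivation.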
 

audio2design

Major Contributor
Joined
Nov 29, 2020
Messages
1,769
Likes
1,830
with a constantly changing frequency (between, say Fs/4 and Fs/2, to make it a bit more difficult), amplitude (0 to (D/12)^2), and phase (between 0 and 360).

- Define "changing". Whether you are changing the frequency, or the gain, or the phase, you are applying a modulation function, which will generate harmonics, which may exceed FS/2.

Clearly there will be a difference between the input values of the signal generator and the output values after reconstruction and FFT, as there are only 16 bits of information emitted per sample, right? Will the difference reduce if we sample at 24 bits? Will the difference further reduce if we sample at 2*Fs, 4*Fs, etc.?

- Whether 16 or 24, the error will be defined by the SNR

- 2xFS or 4xFS will only help if you broke the system and created harmonics above FS/2 as above.
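
A small sketch of the modulation point, under assumed values: a 21 kHz carrier amplitude-modulated at 2 kHz has sidebands at 19 kHz and 23 kHz, and 23 kHz is above fs/2 = 22.05 kHz. If that signal is sampled at 44.1 kHz without band-limiting first, the upper sideband folds back to 44.1 - 23 = 21.1 kHz. The carrier and modulation frequencies are arbitrary choices for the illustration.

```python
import numpy as np

fs = 44_100
n = np.arange(fs)          # one second of samples, so FFT bins are 1 Hz apart
t = n / fs

fc, fm = 21_000, 2_000     # assumed carrier and modulation frequencies in Hz
x = (1 + 0.5 * np.cos(2 * np.pi * fm * t)) * np.cos(2 * np.pi * fc * t)

magnitude = np.abs(np.fft.rfft(x))
peaks = sorted(int(k) for k in np.argsort(magnitude)[-3:])

print(peaks)  # [19000, 21000, 21100]: the 23 kHz sideband shows up at 21.1 kHz
```

Sampling the same signal at 88.2 kHz would leave the 23 kHz sideband where it belongs, which is the only sense in which the higher rate helps here.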
 

audio2design

Major Contributor
Joined
Nov 29, 2020
Messages
1,769
Likes
1,830
The equation for this is given here. It can be simplified as:
tmin = 1/(π * fs * (2^b -1))
Where fs = sampling rate and b = no of bits

Needless to say, this is a lot less than the 1/πfs figure touted by those who claim resolution is simply limited by sampling rate, but it's not 0.


Don't hold me to this, as it has been a while since I went through the math, but if I am not mistaken, the timing accuracy is a function of SNR, not bit depth (though the two are related), since dither can increase SNR, which increases timing resolution.
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,880
Likes
16,666
Location
Monument, CO
Don't hold me to this, as it has been a while since I went through the math, but if I am not mistaken, the timing accuracy is a function of SNR, not bit depth (though the two are related), since dither can increase SNR, which increases timing resolution.

Timing "resolution", if we use that term ("accuracy" is not what we are measuring here), is a function of the signal frequency and converter resolution (bit depth). Sampling rate does not matter, and SNR is not an explicit part of the equation. @charleski provided the correct equation except it is a function of signal frequency, not sampling frequency (under certain defined conditions blah blah blah but this is "the" equation generally used).

If you have timing errors on the order of tmin, like jitter, then you will reduce the SNR, but that is something different. Dither typically reduces the SNR (you are adding noise, after all) but allows you to resolve signals that are less than 1 lsb. Think of dither as providing a little noise so now the signal, instead of being always below 1 lsb, flirts with the lsb level enough that you can reconstruct (processing using DSP or your ears and brain) the signal from the noise.
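
The sub-LSB behaviour described above can be sketched in a few lines. The assumptions are mine: a 1 kHz sine with a peak of 0.25 LSB, plain rounding as the quantizer, TPDF dither spanning ±1 LSB, and a single-bin DFT as the "processing" that pulls the tone out of the noise.

```python
import numpy as np

fs, f0 = 44_100, 1_000
n = np.arange(fs)                       # one second of samples
rng = np.random.default_rng(1)

# Signal expressed in units of one LSB; its peak is only a quarter of an LSB.
signal_lsb = 0.25 * np.sin(2 * np.pi * f0 * n / fs)

def tone_amplitude(x):
    """Amplitude of the f0 component, from a single DFT bin."""
    return 2 * np.abs(np.sum(x * np.exp(-2j * np.pi * f0 * n / fs))) / len(x)

undithered = np.round(signal_lsb)                 # never reaches +/-0.5, so all zeros
tpdf = rng.random(fs) - rng.random(fs)            # triangular PDF on (-1, +1) LSB
dithered = np.round(signal_lsb + tpdf)

amp_plain = tone_amplitude(undithered)
amp_dithered = tone_amplitude(dithered)

print(f"recovered amplitude without dither: {amp_plain:.3f} LSB")
print(f"recovered amplitude with dither:    {amp_dithered:.3f} LSB")
```

Without dither the tone is simply gone. With dither the output is noisier sample by sample, but the tone comes back at close to its original 0.25 LSB amplitude.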
 
OP
j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Clearly there will be a difference between the input values of the signal generator and the output values after reconstruction and FFT, as there are only 16 bits of information emitted per sample, right? Will the difference reduce if we sample at 24 bits? Will the difference further reduce if we sample at 2*Fs, 4*Fs, etc.?

The difference will be simple. There will be a flat (assuming white TPD dither) noise floor in the FFT.

For every doubling of the sampling rate the noise floor will drop by 3dB, if your device actually does that right. Most don't, but that's a different problem.

The ***ONLY*** difference will be a drop in the noise floor.

For every bit you add to the path, you get a 6.02dB drop in the noise floor, again, if it's done right.

Yes, this does affect the time resolution, marginally. Since the time resolution of 16/44 is well below detectable levels in the ear, this isn't likely to sound any different.
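
The 6.02 dB per bit and 3 dB per doubling figures follow from a simple idealized model, sketched here. The assumptions are a white noise floor with no noise shaping, TPDF dither, a fixed 20 kHz audio band, and noise measured relative to a full-scale sine; `inband_noise_db` is a name invented for this example.

```python
import math

def inband_noise_db(bits, fs_hz, band_hz=20_000.0):
    """In-band noise power relative to a full-scale sine, in dB (idealized model)."""
    lsb = 2.0 / 2 ** bits                     # full scale spans -1..+1
    total_noise = lsb ** 2 / 4.0              # 1/12 (quantizer) + 1/6 (TPDF dither) LSB^2
    inband = total_noise * band_hz / (fs_hz / 2.0)   # white noise spread over 0..fs/2
    full_scale_sine_power = 0.5
    return 10.0 * math.log10(inband / full_scale_sine_power)

per_bit = inband_noise_db(16, 44_100) - inband_noise_db(17, 44_100)
per_doubling = inband_noise_db(16, 44_100) - inband_noise_db(16, 88_200)

print(f"drop per added bit:           {per_bit:.2f} dB")       # 6.02 dB
print(f"drop per sample-rate doubling: {per_doubling:.2f} dB")  # 3.01 dB
```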
 
OP
j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Don't hold me to this, as it has been a while since I went through the math, but if I am not mistaken, the timing accuracy is a function of SNR, not bit depth (though the two are related), since dither can increase SNR, which increases timing resolution.

What Charleski meant, I do believe, was

time_resolution = 1 / (2 * pi * bandwidth * 2^B), where 'B' is the number of bits. The -1 is not necessary. There is an additional sqrt(1/12) and a small factor greater than one present if one wishes to be precise, but the small factor greater than one must presume a quantizer loading.
 
OP
j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
Timing "resolution", if we use that term ("accuracy" is not what we are measuring here), is a function of the signal frequency and converter resolution (bit depth). Sampling rate does not matter, and SNR is not an explicit part of the equation. @charleski provided the correct equation except it is a function of signal frequency, not sampling frequency (under certain defined conditions blah blah blah but this is "the" equation generally used).

If you have timing errors on the order of tmin, like jitter, then you will reduce the SNR, but that is something different. Dither typically reduces the SNR (you are adding noise, after all) but allows you to resolve signals that are less than 1 lsb. Think of dither as providing a little noise so now the signal, instead of being always below 1 lsb, flirts with the lsb level enough that you can reconstruct (processing using DSP or your ears and brain) the signal from the noise.

Actually signal bandwidth. But other than that, yes.
 

voodooless

Grand Contributor
Forum Donor
Joined
Jun 16, 2020
Messages
10,371
Likes
18,281
Location
Netherlands
What Charleski meant, I do believe, was

time_resolution = 1 / (2 * pi * bandwidth * 2^B), where 'B' is the number of bits. The -1 is not necessary. There is an additional sqrt(1/12) and a small factor greater than one present if one wishes to be precise, but the small factor greater than one must presume a quantizer loading.

I think some things are conflated here. You and @charleski put the sample rate into the equation, but that is not correct. The only factors determining the time resolution are bit depth and frequency of the signal that you want to encode. Sample rate is not part of it. That is also exactly what the linked article says.

A frequent claim by detractors of digital audio is that the time resolution is equal to the sampling interval, 22.7 μs for the CD format. This is incorrect. Although there is a limit, it is much smaller, and it does not depend on the sample rate.
 
OP
j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
I think some things are conflated here. You and @charleski put the sample rate into the equation, but that is not correct. The only factors determining the time resolution are bit depth and frequency of the signal that you want to encode. Sample rate is not part of it. That is also exactly what the linked article says.

You notice I said "bandwidth". That is not "sampling frequency". If your signal is a pure sine wave, how do you then change the phase?

The question is the bandwidth of the system. Not the sampling rate.

Hint: You don't, then it's not a pure sine wave.

What you're missing is the impulse response of a system with a given bandwidth. NOT sampling rate (although we all know there is a limit) but BANDWIDTH.
 

voodooless

Grand Contributor
Forum Donor
Joined
Jun 16, 2020
Messages
10,371
Likes
18,281
Location
Netherlands
You notice I said "bandwidth". That is not "sampling frequency". If your signal is a pure sine wave, how do you then change the phase?

The question is the bandwidth of the system. Not the sampling rate.

Hint: You don't, then it's not a pure sine wave.

What you're missing is the impulse response of a system with a given bandwidth. NOT sampling rate (although we all know there is a limit) but BANDWIDTH.

I think the main takeaway should be that the time resolution is frequency dependent. What exactly are you implying with bandwidth? Can you give an example?
 
OP
j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,279
Likes
4,786
Location
My kitchen or my listening room.
I think the main takeaway should be that the time resolution is frequency dependent. What exactly are you implying with bandwidth? Can you give an example?

What is the biggest CHANGE you can put through a system with a given bandwidth. It's that simple.
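
One way to put "biggest change" into symbols is Bernstein's inequality for band-limited signals. The sketch below uses W for the bandwidth in Hz, A for the peak level, b for the number of bits and q for the quantization step; these symbols are introduced here for illustration, and reading the bound as a time resolution is my interpretation rather than something stated in the thread.

```latex
\[
  |x(t)| \le A \ \text{ and spectrum confined to } |f| \le W
  \quad\Longrightarrow\quad
  \left|\frac{dx}{dt}\right| \le 2\pi W A
\]
\[
  \text{time to move by one step } q = \frac{2A}{2^{b}}:
  \qquad
  \Delta t \;\ge\; \frac{q}{2\pi W A} \;=\; \frac{1}{\pi W\, 2^{b}}
\]
```

The bound is reached by a full-scale sine at the band edge, which is why bandwidth rather than any particular signal frequency is the relevant quantity. Using half a step as the threshold gives 1/(2 * pi * W * 2^b), the same shape as the expression with the factor of 2 * pi quoted earlier in the thread.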
 

voodooless

Grand Contributor
Forum Donor
Joined
Jun 16, 2020
Messages
10,371
Likes
18,281
Location
Netherlands
What is the biggest CHANGE you can put through a system with a given bandwidth. It's that simple.

Yes, but that is just a maximum, right?
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,880
Likes
16,666
Location
Monument, CO
Actually signal bandwidth. But other than that, yes.

Yes, good catch, I keep thinking of audio as a DC-starting baseband signal and conflated the two. Should know better since the last time I really wrestled through the derivation it was for a bandpass converter at X-band and "bandwidth" was the important factor for that one! Actually, most of my converter design career involved RF signal bandwidths around a carrier so yeah, no excuse...
 

charleski

Major Contributor
Joined
Dec 15, 2019
Messages
1,098
Likes
2,240
Location
Manchester UK
You're right that this is, strictly speaking, a function of signal rather than sampling frequency. But mansr's model is looking at the best-case resolution, which is clearly a function of sampling rate in a band-limited digital system. I should have specified that, though. It's perhaps more important to note that this is also a function of signal amplitude, but since such discussions focus on full-scale impulse or step functions this falls away.

Even with these caveats the timing accuracy of 16/44.1 digital is at least an order of magnitude greater than even the most accurate analog system.
 