
Stereo and higher sampling rates - "time domain" question

teapea

I was speaking to a friend of mine the other night, who is a proper electrical engineer geek, damn smart and likes theory. He's far from an audiophile but likes well-engineered kit for the geekery, and agrees there is so much bs in the audio industry. He also agrees that 44.1kHz is fine for any music: CD quality is absolutely fine for frequency range, and 16-bit is fine for both noise and dynamic range, etc. He also completely agrees that level matching is paramount when comparing kit, and that the human ear will no doubt pick a louder device as better quality. So far so good.

However, this is the curve-ball he threw out. He claims that 96kHz is far better for stereo in the "time domain": at lower rates there can be incorrect/unexpected phase shifts between the channels, and going higher makes those phase shifts much smaller.

My immediate thought was that there is no time information in the signal as such, apart from the samples themselves, and that timing is handled by the DAC (or I guess the crystal for the clock?). But then I realised I kinda had no real idea how a stereo signal differs from a mono signal.

Is there any truth to his claim? Is he talking bs? Apparently he's done the calculations and he's right.

We stopped the conversation and carried on drinking after this, as it was clear we were at an impasse!
 
Take a look at this thread.
 
OK - maybe I need a primer in stereo then - because aren't all these examples just a simple sine wave which would just be a single mono channel?
 
That's what Stereo audio is. Two single Mono channels put into one container.

The independence of sample rate and timing/phase accuracy does not change just because you put two audio channels into a file for Stereo playback.

They're perfectly accurate in isolation, hence also perfectly accurate to each other.
 
Without knowing what "calculations" your friend performed, it's hard to know where they went wrong. But it seems to boil down to the same old claim about "timing resolution", which, as pointed out in the sources linked above, is nonsense.
 
So, my assumption of what he's trying to say, then, is that the 2 mono channels can somehow be out of phase, and a shorter time interval between samples would make them better aligned. But since there isn't any issue with "time" in a single channel, there simply can't be with 2 either.
 
If I understand correctly, your friend basically said that the timing resolution of a 44.1 kHz sampled signal has audible implications, which is false. The smallest detectable interaural timing difference (ITD) is a few μs. CD can give you sub-1 μs resolution.


[Edit]
PS. Sound travels at ~343 m/s. Which means, in 1 μs, the sound wave moves 0.34 mm. Do you think you can align your head with the left and right speakers with a distance difference less than that?
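
The arithmetic in the PS can be checked in a few lines. This is only a sketch: the 343 m/s figure is the one quoted above, and the function name is made up for illustration. It also converts one 44.1 kHz sample period into a distance for comparison.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # figure quoted in the post

def path_difference_mm(delay_us: float) -> float:
    """Distance sound travels in `delay_us` microseconds, in millimetres."""
    return SPEED_OF_SOUND_M_PER_S * delay_us * 1e-6 * 1e3

one_us_mm = path_difference_mm(1.0)

# One sample period at the CD rate, for comparison (about 22.7 microseconds).
cd_period_us = 1e6 / 44_100
cd_period_mm = path_difference_mm(cd_period_us)

print(f"1 us of delay        = {one_us_mm:.2f} mm of path difference")
print(f"one 44.1 kHz period  = {cd_period_us:.1f} us = {cd_period_mm:.1f} mm")
```

Even a whole sample period corresponds to well under a centimetre of head movement, so listening position is likely to dominate any inter-channel timing budget.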
 
You can shift the phase between Left and Right, but it's a conscious choice or operator error, not a naturally occurring error of digital audio.

If a timing error was introduced while creating the audio file, then increasing the sample rate would preserve the error 1:1.

And starting the creation at a higher sample rate from the get-go does not reduce the chance of timing/phase errors because, again, such errors are a conscious choice or operator error. Sample rate is unrelated.
 
Perfect, thanks all.
I will try and have a reasoned conversation and work out why he feels there is a time domain issue now I have some better understanding of where he might be going wrong with an assumption.

"They're perfectly accurate in isolation, hence also perfectly accurate to each other." I think that line is the one I need!
 
Also consider how these ITD thresholds are tested. From the link:
The stimulus that yielded the lowest threshold ITD was Gaussian noise, band-pass filtered from 20 to 1400 Hz, presented at 70 dB sound pressure level. The best method was a two-interval procedure with an interstimulus interval of 50 ms. The average threshold ITD for this condition at the 75% correct level was 6.9 μs for nine trained listeners and 18.1 μs for 52 un-trained listeners.
These studies pick a test signal that gives the lowest possible detection threshold. I'm not aware of any studies performed with music signals, but I'm confident the detection threshold would be far higher outside of the contrived testing conditions. And still, 44.1kHz is more than sufficient even for the testing conditions. Although, if I remember correctly, "timing resolution" isn't even really dependent on the sampling rate anyway. Sampling rate is entirely about bandwidth (if you want to capture signals up to 20kHz, then you need a sampling rate of at least 20kHz * 2 = 40kHz).
 
This animation clearly shows how digital audio can achieve sub-sample timing accuracy:

Should help drive the point home.
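
For anyone who prefers numbers to an animation, here is a minimal NumPy sketch of the same idea. The 997 Hz tone, the 5 μs delay (about 0.22 of a sample period) and the function names are all assumptions chosen for illustration. One channel is delayed by a fraction of a sample, both are sampled at 44.1 kHz and rounded to 16-bit values, and the delay is then estimated from the samples alone.

```python
import numpy as np

FS = 44_100          # CD sample rate, Hz
F0 = 997.0           # test tone, Hz (a whole number of cycles fits in one second)
TRUE_DELAY_S = 5e-6  # hypothetical inter-channel delay, about 0.22 of a sample

def sample_16bit(delay_s: float) -> np.ndarray:
    """Sample a delayed sine for one second at FS and round to 16-bit integers."""
    t = np.arange(FS) / FS
    x = 0.5 * np.sin(2 * np.pi * F0 * (t - delay_s))
    return np.round(x * 32767).astype(np.int16)

def tone_phase(x: np.ndarray) -> float:
    """Phase of the F0 component, from a single-bin DFT over the whole buffer."""
    n = np.arange(len(x))
    return float(np.angle(np.sum(x * np.exp(-2j * np.pi * F0 * n / FS))))

left = sample_16bit(0.0)
right = sample_16bit(TRUE_DELAY_S)

# A delay d shifts the tone's phase by 2*pi*F0*d, so the phase difference gives d back.
phase_diff = tone_phase(left) - tone_phase(right)
estimated_delay_s = phase_diff / (2 * np.pi * F0)

print(f"sample period   : {1e6 / FS:.2f} us")
print(f"true delay      : {TRUE_DELAY_S * 1e6:.3f} us")
print(f"estimated delay : {estimated_delay_s * 1e6:.3f} us")
```

The estimate comes from the phase of the tone, not from counting samples, which is why it is not limited to multiples of the sample period.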
 
who is a proper electrical engineer geek, damn smart and likes theory
Sometimes a little knowledge and theory can get you into trouble. ;) A lot of 'Audiophools' are scientists, engineers, or doctors, etc. (And a lot of audio "engineers" have the opposite problem of not being educated in science or traditional engineering.)

Besides the fact that the left & right are clocked together, he needs to think about the speed of sound and acoustic wavelengths. At 10kHz the wavelength is about 1.4 inches. So a 1.4 inch difference between left & right is a 360 degree phase shift! Your ears are farther apart than that (although with the sound arriving from an angle the difference is less than the space between your ears).
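
As a rough check of those numbers, the sketch below computes the wavelength and the phase shift for a given path difference. It assumes a speed of sound of 343 m/s; the function names are illustrative only.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at room temperature
METRES_PER_INCH = 0.0254

def wavelength_inches(freq_hz: float) -> float:
    """Acoustic wavelength in inches at the given frequency."""
    return SPEED_OF_SOUND_M_PER_S / freq_hz / METRES_PER_INCH

def phase_shift_degrees(path_diff_inches: float, freq_hz: float) -> float:
    """Phase shift caused by a path-length difference between two arrivals."""
    return 360.0 * path_diff_inches / wavelength_inches(freq_hz)

w10k = wavelength_inches(10_000)
print(f"wavelength at 10 kHz: {w10k:.2f} in")
print(f"phase shift for that distance at 10 kHz: {phase_shift_degrees(w10k, 10_000):.0f} deg")
print(f"phase shift for that distance at  1 kHz: {phase_shift_degrees(w10k, 1_000):.0f} deg")
```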

If you play a 5-10kHz test-tone you'll hear the loudness change drastically with slight movements of your head as the left, right, and reflected soundwaves combine in-and-out of phase. We don't notice it with music, presumably because music is complex and we are used to it. Once, I was doing some high-frequency measurements with an SPL meter on a mic stand. The measurements were changing by several dB when I moved around behind the meter! That surprised me but it was my body affecting the reflected waves and how they combined in-and-out of phase with the direct waves.

And, we often over-estimate our hearing ability. The guys who do blind ABX tests have pretty well demonstrated that we generally can't hear the difference between a high-resolution original and a copy down-sampled to "CD quality". And of course, those listening tests are normally stereo. Even good-quality MP3 (lossy compression) can often sound identical to the original (in a proper blind listening test) or you might have to listen very carefully to hear a difference.

And phase shifts aren't normally an issue except when they are relative to something. For example, if you reverse the connections to one speaker to flip the polarity/phase 180 degrees, the bass sound waves cancel almost completely. At higher frequencies the waves combine in-and-out of phase depending on where you are in the room and how the reflected waves interact, so you get a "spacey-phasey" sound and sometimes a stereo "widening" effect. If you flip the polarity of both speakers everything is back in-phase and it sounds normal again.

Or, since speaker crossovers usually introduce phase shifts, if the design isn't done properly you can end up with the woofer & tweeter out-of-phase at the crossover frequency and get a dip in the response where they are both operating. Sometimes the solution is to invert the tweeter connections, putting them back in-phase at the crossover frequency.

Does Phase Distortion/Shift Matter In Audio?
 
Where high-res is useful is for live recording which will be subject to editing. Stretching a 16-bit sample to double its amplitude in the mix will space the sample values as if it were a 15-bit sample, for example, and every further doubling costs another bit (a 256x boost leaves 8-bit spacing). But I think this has more to do with bit depth than sampling rate, given that we rarely stretch or shrink wavelengths the way we stretch or shrink amplitudes.

Recording at 96/24, however, obviates any issues from making large digital adjustments during mixing, before downsampling it to 44/16 for putting on a CD.
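
A small sketch of the spacing effect, assuming a made-up quiet take stored as integer sample values (the data and helper names are hypothetical). Digital gain multiplies the gap between adjacent sample values, so each doubling costs one bit of effective resolution, and a 24-bit source has 8 bits of headroom for that.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical quiet recording, expressed in 16-bit LSBs.
quiet_16bit = rng.integers(-100, 101, size=10_000)

def smallest_step(samples: np.ndarray) -> int:
    """Smallest gap between two distinct sample values in the buffer."""
    levels = np.unique(samples)
    return int(np.min(np.diff(levels)))

def effective_bits_after_gain(gain: float, source_bits: int = 16) -> float:
    """Effective bit depth left after boosting by `gain` (one bit lost per doubling)."""
    return source_bits - float(np.log2(gain))

print("step before gain :", smallest_step(quiet_16bit))
print("step after x2    :", smallest_step(quiet_16bit * 2))
print("step after x256  :", smallest_step(quiet_16bit * 256))
print("16-bit source, x2   ->", effective_bits_after_gain(2), "bits")
print("24-bit source, x256 ->", effective_bits_after_gain(256, 24), "bits")
```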

Rick "in digital photos the effect is called posterization" Denney
 
And what does this follow from? The sampling period of the CD is 22.7 µsec.
Nyquist sampling enables sub-sample timing accuracy:
 
if I remember correctly, "timing resolution" isn't even really dependent on the sampling rate anyway.
This is true. For a sine wave in the non-dithered case, it depends on the frequency and effective bit depth (see the first link in post #2). With proper dither, I believe it should be identical to a non-quantized, continuous time (i.e. analog) signal with the same SNR.
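
A back-of-envelope version of that dependence, offered as an assumption and not necessarily the formula from the linked post: take a full-scale sine, find its steepest slope in LSBs per second, and ask what time shift moves the waveform by one LSB at the zero crossing. Note that the sampling rate does not appear anywhere.

```python
import math

def sine_time_resolution_s(freq_hz: float, bits: int) -> float:
    """Rough time shift that moves a full-scale sine by one LSB at its zero crossing.

    Assumptions: amplitude A = 2**(bits - 1) LSB, steepest slope = 2*pi*f*A LSB/s,
    no dither. This is an order-of-magnitude estimate only.
    """
    amplitude_lsb = 2 ** (bits - 1)
    return 1.0 / (2 * math.pi * freq_hz * amplitude_lsb)

res_1k_16 = sine_time_resolution_s(1_000, 16)
res_20k_16 = sine_time_resolution_s(20_000, 16)
res_1k_24 = sine_time_resolution_s(1_000, 24)

print(f"1 kHz, 16-bit : {res_1k_16 * 1e9:.2f} ns")
print(f"20 kHz, 16-bit: {res_20k_16 * 1e9:.3f} ns")
print(f"1 kHz, 24-bit : {res_1k_24 * 1e9:.4f} ns")
```

Under these assumptions the figure is in the nanosecond range for 16-bit audio, consistent with the sub-1 μs claim earlier in the thread.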
 