also

These have everything you want to know about upsampling and downsampling. Ignore the "hardware" part in Rabiner and Gold, it's slightly outdated, but look at the date of the book. Between these two references you've got it all.
 
Yes, Rabiner, Gold, Oppenheim and Schafer. I learned DSP from those books at around 40 years old: the transformation from the S to the Z domain...

My post title was "A case against High bit rate digital audio". I am not teaching DSP.
I am pointing at gear on the market that claims better audio from a higher bit rate. Upsampling, a digital computation done in the DA hardware, is the better option. There is no need to carry the extra dead weight: the "missing samples" can be calculated (interpolated and filtered) from the received data. Converters operating at a 768 kHz sample rate for audio seem ridiculous, to put it politely.
 
I know you know them; I put up the references for people who need to understand the time/frequency sharpness tradeoffs.
 
When I saw your response pointing to DSP books, I thought: JJ got bored after the first paragraph, he did not read my post.

I tried to keep things simple but correct so that more people can follow. Most of my post would bore you. The theory is the foundation. Assume a theoretically perfect 96 kHz converter, a perfect 384 kHz converter and a perfect filter. If you upsample 96 kHz to 384 kHz you get the identical result. The higher rate demands more memory, slower transfers and faster digital audio links … The claim that 384 kHz is better is false.
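For anyone who wants to check the "identical result" claim numerically, here is a minimal sketch in plain Python. It assumes an idealized case that is not from the post itself: three hypothetical test tones below 20 kHz, an exactly periodic 1 ms window, and ideal band-limited interpolation done through a DFT instead of a real-world filter.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def upsample_ideal(x, factor):
    """Evaluate the band-limited (trigonometric) interpolant on a finer grid."""
    n = len(x)
    spectrum = dft(x)
    m = n * factor
    half = n // 2
    out = []
    for t in range(m):
        acc = 0j
        # Sum the bins below Nyquist; the test signal has no energy at Nyquist.
        for b in range(-half + 1, half):
            acc += spectrum[b % n] * cmath.exp(2j * math.pi * b * t / m)
        out.append(acc.real / n)
    return out


def sample_tones(sample_rate_hz, count, tones_hz):
    """Sample a sum of unit-amplitude sine tones."""
    return [sum(math.sin(2 * math.pi * f * k / sample_rate_hz) for f in tones_hz)
            for k in range(count)]


TONES_HZ = (1000.0, 7000.0, 19000.0)            # hypothetical test tones
low = sample_tones(96_000, 96, TONES_HZ)        # 1 ms captured at 96 kHz
high = sample_tones(384_000, 384, TONES_HZ)     # the same 1 ms at 384 kHz
rebuilt = upsample_ideal(low, 4)                # 96 kHz data upsampled x4

max_error = max(abs(a - b) for a, b in zip(rebuilt, high))
print(f"largest difference between upsampled and native 384 kHz: {max_error:.2e}")
```

Under these assumptions the difference is down at floating-point rounding level. A real interpolation filter is finite, so in hardware the match is only as good as the filter.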
 
The case against higher bit rates (continued):

The DSP books are for all signals, not audio specific. The fun was to figure out ways to implement the concepts, not always an easy task. Digital folks see ones and zeros. Analog folks talk about rise time, the transition from 0 to 1, ringing and much more.

Funny thing, you can't do anything in zero time. The sampling theorem begins with "points in time" for each sample; the value is zero elsewhere. A sample, in "electronic language", is a zero-width pulse.

Using a very narrow pulse (an approximation) does not work. It carries tiny energy, not enough to be useful. A narrow pulse also means wide bandwidth: a 1 ns pulse with 100 ps rise/fall times requires an electronics bandwidth of 3.5 GHz.

So we end up with a pulse as wide as possible, holding the sampled value all the way until the next sample. This is a deviation from the "zero pulse width". Using math we know what the penalty is: loss of amplitude at higher frequencies.

For a DA, the amplitude loss at 20 kHz is about 3.2 dB at a 44.1 kHz sample rate, 2.6 dB at 48 kHz, 0.63 dB at 96 kHz, 0.16 dB at 192 kHz, 0.04 dB at 384 kHz…
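The droop figures can be reproduced with a few lines, assuming the DA holds each sample for one full sample period (an ideal zero-order hold), so that the response is sin(x)/x with x = pi * f / fs. The function name is just for illustration.

```python
import math


def zoh_droop_db(freq_hz, sample_rate_hz):
    """Gain in dB of an ideal zero-order hold at freq_hz (negative means loss)."""
    x = math.pi * freq_hz / sample_rate_hz
    return 20 * math.log10(math.sin(x) / x)


for fs in (44_100, 48_000, 96_000, 192_000, 384_000):
    print(f"{fs / 1000:6.1f} kHz sample rate: {zoh_droop_db(20_000, fs):6.2f} dB at 20 kHz")
```

The values come out close to the ones quoted above, and the loss shrinks quickly as the hold time gets shorter relative to the 20 kHz period.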

Upsampling has been adopted as the smart solution. A purely digital compensation would require a different "curve" (computation or lookup table) for each sample rate, and analog rules restrict proper compensation. Upsampling is the way to go: 96 kHz upsampled x4 to 384 kHz yields a response flat to within 0.04 dB up to 20 kHz, and leaves plenty of range for a realizable filter.

384 kHz converters for audio offer conversion bandwidth for signals approaching 192 kHz. I already pointed out the drawbacks: file size, internet upload/download time, and 50 MHz hardware (transmitter, cable, receiver).
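To put rough numbers on the file size and the 50 MHz remark, here is a small sketch. It assumes two channels of 24-bit PCM for storage and an AES3-style link (two 32-bit subframes per sample period, biphase-mark coded, so two line cells per bit). Other interfaces frame the data differently, so treat the line-rate figure as an order-of-magnitude check.

```python
def pcm_bytes_per_minute(sample_rate_hz, bits=24, channels=2):
    """Raw (uncompressed) PCM storage needed for one minute of audio."""
    return sample_rate_hz * (bits // 8) * channels * 60


def aes3_cell_rate_hz(sample_rate_hz):
    """Line cell rate, assuming 64 bits per frame and biphase-mark coding."""
    bits_per_frame = 2 * 32          # two subframes of 32 bits each
    return sample_rate_hz * bits_per_frame * 2


for fs in (44_100, 96_000, 384_000):
    megabytes = pcm_bytes_per_minute(fs) / 1e6
    megahertz = aes3_cell_rate_hz(fs) / 1e6
    print(f"{fs / 1000:6.1f} kHz: {megabytes:7.2f} MB per minute, {megahertz:6.2f} MHz on the wire")
```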

Another issue: the extra bandwidth of 384 kHz conversion (say 20 kHz to 192 kHz) may contain energy above human hearing. Any energy in that frequency range is not something we hear, so there is no need to record it. The safe bet is to avoid bandwidth you don't need. Analog circuits tend to lose some linearity at higher frequencies, and any high-frequency nonlinearity in the signal path may spill some of that high-frequency energy into the hearing range (intermodulation).

Putting all hardware considerations aside, I have yet to find a real reason for higher bit rate conversion and a lot of unused bandwidth.
 
(Puts on filter designer hat.) There are other problems with high sampling rates. In particular, look at how many bits of mantissa you need to make a 3rd-order high-pass Butterworth at 15 Hz at 44.1, 48, 96 and 384 kHz.

Nobody's doing that in a normal processor at 384 kHz. It takes 48 bits to get back a good solid 20 at 96 kHz.

It's a real problem.
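A rough way to see where the mantissa goes: for a second-order section with denominator 1 + a1*z^-1 + a2*z^-2, the value at z = 1 equals |1 - p|^2, where p is the pole. For a 15 Hz corner that number is tiny, and every leading zero bit in it is a bit of coefficient precision spent before the filter shape is even encoded. The sketch below assumes the complex Butterworth pole is mapped with z = exp(s / fs), which is a simplification and not necessarily how anyone in this thread designs the filter.

```python
import cmath
import math


def pole_gap_bits(cutoff_hz, sample_rate_hz):
    """Leading bits consumed by the gap between the pole pair and z = 1."""
    wc = 2 * math.pi * cutoff_hz
    # Complex pole of an analog 3rd-order Butterworth sits at 120 degrees.
    s_pole = wc * cmath.exp(1j * 2 * math.pi / 3)
    z_pole = cmath.exp(s_pole / sample_rate_hz)    # assumed s-to-z mapping
    gap = abs(1 - z_pole) ** 2                     # equals 1 + a1 + a2
    return -math.log2(gap)


for fs in (44_100, 48_000, 96_000, 384_000):
    print(f"{fs / 1000:6.1f} kHz: about {pole_gap_bits(15.0, fs):4.1f} bits gone before any precision")
```

Each doubling of the sample rate costs roughly two more bits under this estimate, on top of whatever precision the output actually needs.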

Also, when you double the sampling rate of an FIR, two things happen:

1) you have to calculate twice as many outputs
2) each filter has to be approximately twice as long.

Yes, that's a factor of 4 in FLOPS, and maybe an increase in mantissa, since you start to lose bits to roundoff as you make filters longer and longer. And yes, I know how to do this as well as possible, but it doesn't matter: the more non-'1' coefficients, the worse it gets, without limit.
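The factor of 4 can be sketched with the common rule-of-thumb tap estimate N ≈ (fs / transition width) * (attenuation in dB / 22). The rule itself, the 2 kHz transition band and the 100 dB stopband are assumptions picked for illustration, not numbers from the post.

```python
import math


def fir_taps_estimate(sample_rate_hz, transition_hz, atten_db):
    """Rule-of-thumb FIR length for a given transition width and stopband depth."""
    return math.ceil(sample_rate_hz / transition_hz * atten_db / 22)


def macs_per_second(sample_rate_hz, transition_hz, atten_db):
    """Multiply-accumulates per second: one full-length filter per output sample."""
    return fir_taps_estimate(sample_rate_hz, transition_hz, atten_db) * sample_rate_hz


base = macs_per_second(48_000, 2_000, 100)
doubled = macs_per_second(96_000, 2_000, 100)
print(f"taps at 48 kHz: {fir_taps_estimate(48_000, 2_000, 100)}")
print(f"taps at 96 kHz: {fir_taps_estimate(96_000, 2_000, 100)}")
print(f"cost ratio for doubling the rate: {doubled / base:.2f}")
```

Keeping the same transition width in Hz is what makes the filter twice as long; twice the outputs on top of that gives the factor of 4.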
 
The people listening to 192 kHz, 384 kHz or 768 kHz sampling hear music. So do I. But as a technical person it bothers me greatly, because I see in my mind that amplitude vs. frequency plot with almost all of the information channel capacity being wasted. The consumer ends up with huge files for no good reason. As an added bonus, you need a lot more memory and get slower uploads and downloads.

But it can be worse. 1.536 MHz or 3.072 MHz sample rates, anyone? (I am not talking about one bit.)
 
And then there are the folks who really believe (because their ears tell them so) that an unfiltered sample-and-hold signal at 44.1 kHz sounds 'the best' to them.
They believe (because the illegal square-wave signal and Dirac pulse look so nice) that this sounds much better. They handily forget the huge amount of mirrored, not harmonically related crap that reaches the amp and drivers, or treat it as inconsequential. :D

High sample rates and bit depths are fine for archiving (and even make sense in the recording phase) but are not needed for music enjoyment. Unfortunately, people worry less about storage space and bandwidth today because both are cheap, and they like the idea of having 'the best possible' signal because their ears and gear deserve it. Since such recordings are often available, it makes sense to them to buy and play them, even though they are not really needed.
That, plus all the other nonsense, is the current state of 'high-end audio', and if you tell them 44.1/16 is enough they won't believe you, because their ears and friends tell them you need the highest resolution available.

There appears to be a substantial discrepancy between 'real world engineering' and 'audiophiles and audiophools' which is impossible to bridge.
I prefer to hear/read about the engineering side rather than some person's opinion about what they perceive.
 
With all the blind testing that supports the notion that people can't tell the difference between 16/44 Redbook and higher resolution sound files, I'm sorta thinking that 24/96 provides all the margin for golden ears needed and therefore represents the "best possible".

I do record at 24/96, because I can remix the sound without worrying about a big move in EQ or mix causing an aliasing artifact. For the same reason, I scan film at 48-bit color depth (16 bits in each color channel) to avoid posterization if I make a big move in tonal rendition. But then the playback version is 16/44 played through a good DAC, or even MP3 at a good quality. And when I print a photo, it goes to the printer in 24-bit color.

Rick "wondering at the server farms the users of 100-megapixel cameras must maintain to store all those image files" Denney
 
I agree with you that 96 kHz is a good choice.
24 bits? That is not the case.

First, 24 bits would mean 144 dB of dynamic range, and no real converter delivers that.

Second, almost all manufacturers specify the dynamic range with A-weighting, which adds a couple of dB or more to the specification.

Third, the dynamic range specification is only one thing. The various distortion specs are very important (and not just for a 1 kHz sine wave).

Also, many applications require low latency. It takes a lot of parameters to specify a converter well.

But I agree with you that 96 kHz is the best choice for AD. I avoided making an AD with 192 kHz. The market forced it, so I did it on my last gold AD.

The dynamic range is 126 dB (20 Hz to 20 kHz, no A-weighting), thus true 21-bit performance. The distortion is around 0.0003%, with very low latency.
It was not easy...
 
Let me amplify (quantify) my comment about why 24 bits is not real:

24 bits means 144 dB (logarithmic scale).
144 dB means a noise floor of 63 x 10^-9 (63 nano) relative to full scale.

For a full pro level (24 dBu, 12.28 V rms) the noise voltage is 846 nV, less than 1 uV.

846 nV is the noise level generated internally by a single 372 Ohm resistor (floating in mid air, at a room temperature of 25 C).

24 bits means that the digital format (AES/EBU, SPDIF and others) can accommodate 24 bits of data. The audio does not offer 24 bits. I think 21 real bits is the state of the art.

24 bits is misunderstood by many. Too bad people fall for such hype.
 
I forgot to mention that the bandwidth for my calculations was 20 kHz. So the noise of a 374 Ohm resistor equals the noise of a true 20 Hz to 20 kHz 24-bit converter.
 
I am sorry, I rechecked my calculations: the equivalent noise resistance is around 2 kOhm (812 nV noise, 5.75 nV/√Hz, 20 kHz BW...).
It is impossible to design real-world stuff under such a constraint. Electronic components generate noise, and 144 dB is huge, around one part in 16 million.
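For anyone who wants to redo this arithmetic, here is a rough check using the Johnson noise formula v = sqrt(4 k T R B). The assumptions are mine: 0 dBu taken as 0.7746 V rms, 25 C, a flat 20 kHz bandwidth, and 144 dB below a +24 dBu full scale as the target noise.

```python
import math

BOLTZMANN = 1.380649e-23     # J/K


def dbu_to_vrms(dbu):
    """Convert dBu to volts rms (0 dBu is about 0.7746 V rms)."""
    return math.sqrt(0.6) * 10 ** (dbu / 20)


def thermal_noise_vrms(resistance_ohm, bandwidth_hz, temp_c=25.0):
    """Johnson noise voltage of a resistor over a flat bandwidth."""
    kelvin = temp_c + 273.15
    return math.sqrt(4 * BOLTZMANN * kelvin * resistance_ohm * bandwidth_hz)


def equivalent_resistance_ohm(noise_vrms, bandwidth_hz, temp_c=25.0):
    """Resistance whose Johnson noise equals noise_vrms over the bandwidth."""
    kelvin = temp_c + 273.15
    return noise_vrms ** 2 / (4 * BOLTZMANN * kelvin * bandwidth_hz)


full_scale = dbu_to_vrms(24.0)
noise_floor = full_scale * 10 ** (-144 / 20)
print(f"full scale: {full_scale:.2f} V rms")
print(f"noise 144 dB down: {noise_floor * 1e9:.0f} nV rms")
print(f"equivalent resistor: {equivalent_resistance_ohm(noise_floor, 20_000):.0f} Ohm")
print(f"2 kOhm over 20 kHz: {thermal_noise_vrms(2_000, 20_000) * 1e9:.0f} nV rms")
```

Under these assumptions the floor lands a bit under 800 nV and the matching resistor a little under 2 kOhm, in line with the corrected "around 2 kOhm" figure.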
 
If a relevant portion of a recorded track was at half of full scale, and the mixer wanted to digitally amplify it to full scale, a 16-bit peak sample would have only used 8 bits, which would now be scaled up to full scale. That channel would be limited to 48 dB S/N, it seems to me, and that bit of gain-riding could lead to noise pumping or noticeable gaps in smooth crescendos and diminuendos, depending on how it was done. Yes, that's an extreme example, but I've seen worse moves during mixing in post, sometimes from big changes in EQ. If the sample had 24 bits of depth, that scaled-up sample would start with 12 bits, a big difference.

(Yes, all the surrounding hardware may do no better than 20 bits, but the mixing move is done in the digital domain these days. My Yamaha pro-audio digital PEQ does an AD on the way in and a DA on the way out, and at 16/48. But it’s used in a playback chain—different requirements.)

Are any of these effects audible? Depends on the mix, and whether other content masks it. But my idea for recording in 24 bits is to allow for those subsequent large adjustments. Once those are made, I downsample it for playback. Recording digitally from live analog recordings or directly from live performance seems to me a different use case that leads to different requirements.

For recording straight from analog sources that won’t be remixed just to make a digital version, I would agree 16 bits is fine and probably 20-30 dB better than the source anyway.

And maybe I'm wrong, but sampling at 24 bits, a higher bit depth than the analog performance, adds only about 50% to the file size compared to 16 bits, and 16 bits may fall short of the performance of analog sources, especially when the material is heavily manipulated. Low cost for mitigating a low risk, perhaps.

Rick “has had to gain-ride low tracks from field recordings more than once” Denney
 
Not sure I understand. A change from half of FS to FS is just 1 bit. If you have a signal at half of FS in 16-bit format, that's 15 bits used and about 90 dB S/N.
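The "1 bit" point is just log2 arithmetic. A minimal sketch, assuming an ideal quantizer and the textbook 6.02 * N + 1.76 dB figure for a full-scale sine (helper names are made up for illustration):

```python
import math


def bits_exercised(peak_fraction_of_fs, word_bits):
    """Bits actually used by a signal peaking at a given fraction of full scale."""
    return word_bits + math.log2(peak_fraction_of_fs)


def ideal_snr_db(bits):
    """Textbook S/N of an ideal quantizer for a sine that fills those bits."""
    return 6.02 * bits + 1.76


used = bits_exercised(0.5, 16)
print(f"half of full scale in a 16-bit word: {used:.0f} bits, {ideal_snr_db(used):.1f} dB")

# For only 8 of 16 bits to be in use, the peak would have to sit 8 halvings down.
print(f"peak level for 8 bits used: {20 * math.log10(2 ** -8):.1f} dBFS")
```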
 
Yeah, that's right. My thinking is better in analog. Nevertheless, I'm thinking of signals close to the floor that are being recovered. In the analog world, the noise would come up with it, of course, but my thinking is that a big stretch in the digital domain leads to gaps. More bit depth provides much smaller gaps to begin with.

I don’t think it’s about the minuscule level of a sample at 01h, but about the space between levels after a big stretch.

Like I said, maybe I’m wrong. But the way to find stuff out is to debate it.

I’ve certainly experienced this profoundly in other areas of digitized analog inputs, such as photography and in data collection devices that I’ve had to cobble together over the years.

Rick “prefers to debate with people who know more” Denney
 
The only thing that "hides" in the gaps is quantization error, which with dither is just noise. Stretching the gaps just increases the noise, just like in analog. If you have a low-level signal whose noise floor is at, let's say, -80 dBFS, then after amplifying it to full scale there will be no significant difference whether it was 16 or 24 bits. Of course, if the noise floor in the original was lower than that, then 24 bits would start showing an advantage.

It's usually the other direction where people talk about the difference between analog and digital. When you reduce the signal's amplitude, then the noise goes down with it in analog (well, as far as physics allows it :-)) but in digital it will be capped at quantization level.
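The -80 dBFS example can be put in numbers by adding noise powers. This is a sketch under stated assumptions: TPDF dither (total error power q^2/4), levels referenced to a full-scale sine, and source noise that is uncorrelated with the quantization noise. Digital gain applied afterwards raises signal and both noise terms equally, so the comparison does not change.

```python
import math


def dithered_quant_noise_dbfs(bits):
    """Quantization plus TPDF dither noise, relative to a full-scale sine."""
    # Sine power is 2**(2*bits - 3) * q**2 and noise power is q**2 / 4.
    return 10 * math.log10(2.0 ** (1 - 2 * bits))


def total_noise_dbfs(source_noise_dbfs, bits):
    """Power sum of the source noise and the converter's dithered noise."""
    power = 10 ** (source_noise_dbfs / 10) + 10 ** (dithered_quant_noise_dbfs(bits) / 10)
    return 10 * math.log10(power)


for source in (-80.0, -110.0):
    n16 = total_noise_dbfs(source, 16)
    n24 = total_noise_dbfs(source, 24)
    print(f"source floor {source:6.1f} dBFS: 16-bit {n16:7.2f}, 24-bit {n24:7.2f}, "
          f"difference {n16 - n24:5.2f} dB")
```

With the source floor at -80 dBFS the two word lengths differ by a fraction of a dB; with a much quieter source the 16-bit floor takes over and the gap opens up.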
 
One can, barely, construct a situation in THEORY where 50 kHz sampling with a 20 kHz bandwidth might just be detectable, relying on a fantastically overestimated ability of the listener in a dead quiet anechoic room. Note "theory", not "evidence": there is NO EVIDENCE.

Based on that, I have advocated 64 kHz for years. It's got "margin" in all even potentially fantastical situations, including little kids and such.

So 96 kHz is fine, I do think, even for the rare person over 5 years old with an undamaged first millimeter in the basilar membrane. That would be true at 64 kHz too.
 