
The limits of 44.1k, filters and latency

BeerBear

Active Member
Joined
Mar 9, 2020
Messages
264
Likes
252
Is 44.1kHz fine as a delivery format and we don't need anything more?
Just about the only objection from reasonable people I see is that it's hard to filter content in a DAC with that sample rate...

Now, I only have a very superficial understanding of how a (modern) DAC works. From what I've read so far, the input digital signal gets oversampled, which creates imaging of the signal above nyquist (link). That content should then be low-pass filtered. Ideally the filter would not touch content until 20k (for the human ears to enjoy the "full range"), but remove everything above 22.05.
It's not hard to employ such a filter on a PC; a high quality software SRC can do it easily. But it's harder to do it in a DAC and the main reasons are processing power and latency. Supposedly even the SoX resampler, which is high quality and fast, has around 20ms of latency. That's fine for casual music listening, but not for interactive use (such as making music, for instance).
Does all this sound about right?
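For anyone who wants to see the imaging step in numbers, here is a minimal sketch (not how any particular DAC does it). It assumes plain 2x zero-stuffing of a 1 kHz test tone with no filter applied yet, and uses a naive DFT so it only needs the standard library. The names `dft_magnitudes`, `TONE_HZ` and so on are made up for this illustration.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude of the DFT of x for bins 0..len(x)//2 (naive O(N^2) DFT)."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        mags.append(abs(acc))
    return mags

FS = 44100       # source sample rate in Hz
N = 441          # 10 ms of audio, so DFT bins land on multiples of 100 Hz
TONE_HZ = 1000   # test tone (an assumption for this example)

# A 1 kHz sine sampled at 44.1 kHz.
x = [math.sin(2 * math.pi * TONE_HZ * i / FS) for i in range(N)]

# 2x oversampling, step 1: insert a zero after every sample (no filtering yet).
y = []
for sample in x:
    y.extend([sample, 0.0])

fs_out = 2 * FS
mags = dft_magnitudes(y)
bin_hz = fs_out / len(y)   # 100 Hz per bin

# The two strongest bins: the original tone and its image mirrored around 22.05 kHz.
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peaks_hz = sorted(round(k * bin_hz) for k in strongest)
print(peaks_hz)   # [1000, 43100]
```

The image sits at 44100 - 1000 Hz with the same level as the tone itself, which is the content the reconstruction filter is supposed to remove.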

My question then is, where are we currently with filters in DACs? Are there DACs that can filter really well at low latency? Let's say... flat to 20k and -50dB at 22.05k, with no more than 1ms latency (latency of the DAC itself, not of the additional components or drivers). Does/would such a DAC use a lot of power?

For example, RME's ADI-2 filters are under 1ms, but looking at some graphs, even the sharpest one is still at -13dB at 22.05k. But according to the graph here, it attenuates really well at 24k (dunno if it's the same sharp filter or not).

Mind you, I don't claim that content at 22kHz and above is audible. So in the end this debate could all just be theoretical. I'm imagining some worst case scenarios, like a (young) person with exceptional hearing or -- more realistically -- signals that could somehow end up affecting the equipment and/or leak down into the audible range. For example in that first link, bennetng mentioned DACs with ASRC, which oversample at non-integer ratios and create aliasing in addition to imaging. But I have no idea if DACs like this are common at all...
 

Frgirard

Major Contributor
Joined
Apr 2, 2021
Messages
1,737
Likes
1,043
OMG, why reopen a subject that has been discussed thousands of times?
The answer is the Nyquist frequency.
 

maverickronin

Major Contributor
Forum Donor
Joined
Jul 19, 2018
Messages
2,527
Likes
3,310
Location
Midwest, USA
My question then is, where are we currently with filters in DACs? Are there DACs that can filter really well at low latency? Let's say... flat to 20k and -50dB at 22.05k, with no more than 1ms latency (latency of the DAC itself, not of the additional components or drivers). Does/would such a DAC use a lot of power?

For example, RME's ADI-2 filters are under 1ms, but looking at some graphs, even the sharpest one is still at -13dB at 22.05k. But according to the graph here, it attenuates really well at 24k (dunno if it's the same sharp filter or not).

In case you were considering an ADI-2 DAC, all those figures are out of date now since they had to switch to using an ESS chip due to parts shortages.
 
OP

BeerBear

Active Member
Joined
Mar 9, 2020
Messages
264
Likes
252
@Frgirard: I don't think DAC filter latency has been discussed that much, because, understandably, it's not a great concern to most people here. It's only mentioned in passing here and there. So I wanted to have a more general overview of the situation.

@maverickronin: Thanks, but I'm not looking at any specific product right now, that was just an example. RME tends to prioritize low latency, so I expect most others to be worse, but I'm just guessing.


This is more of a discussion about how current and potential future DACs behave with 44.1k playback and whether a higher sample rate could be beneficial. Double-blind testing is great, but not always easy to do... so I'm looking at more theoretical answers, worst case edge scenarios and so on. IMD was mentioned in another thread, for instance. That could happen with strong content near nyquist and insufficient imaging attenuation. Likely to happen with typical music? Not at all, but it's still interesting to me.

It looks like for people who don't care about latency, the problem is essentially solved, because they could just use high quality software (on their computer, phone, CD player...) to resample 44.1 music to 88.2 before sending it to the DAC. That way, a slow filter of a DAC can do its job with no imaging and with all the audible content preserved faithfully.
But for people who do care about latency (or power usage), that's a significant price to pay.
 

AnalogSteph

Major Contributor
Joined
Nov 6, 2018
Messages
3,386
Likes
3,337
Location
.de
My question then is, where are we currently with filters in DACs? Are there DACs that can filter really well at low latency? Let's say... flat to 20k and -50dB at 22.05k, with no more than 1ms latency (latency of the DAC itself, not of the additional components or drivers). Does/would such a DAC use a lot of power?
Here's what ESS has implemented (e.g. ES9026PRO):
ess-filter-apod.png

That's 0.793 ms at 44.1k, and 0.461fs is 22330 Hz. The downside is that passband ripple is a bit crap (unfortunately there are no graphs provided for this part so we don't know how much of it is periodic and with what kind of frequency).
At the same group delay, using a conventional half-band filter gives these specs instead:
ess-filter-fastlin.png

It does allow some aliasing but not a lot... basically no signal components below 0.45fs = 19845 Hz will alias at all. A more than worthwhile tradeoff for a much flatter passband in my book, which is why this type of filter traditionally is the default.
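As a quick sanity check on the units, datasheet figures given as a fraction of fs or as a delay in ms can be converted like this. A small sketch only; the helper names are invented for the example and 44.1 kHz is assumed as the sample rate.

```python
FS = 44100  # CD sample rate in Hz (assumed for this example)

def normalized_to_hz(fraction_of_fs, fs=FS):
    """Convert a frequency given as a fraction of fs into Hz."""
    return fraction_of_fs * fs

def delay_ms_to_samples(delay_ms, fs=FS):
    """Convert a delay in milliseconds into sample periods (i.e. the x in x/fs)."""
    return delay_ms * 1e-3 * fs

print(round(normalized_to_hz(0.45)))          # 19845
print(round(delay_ms_to_samples(0.793), 2))   # 34.97
```

So the 0.793 ms group delay corresponds to roughly 35 sample periods at 44.1 kHz.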

(And that still isn't super flat compared to some DAC filters of yore. AD1955 and PCM1795 are spec'd at +/-0.0002 dB at 43.38/fs and 38/fs respectively, and the SM5847AF dedicated digital filter from the '90s specifies less than +/-0.00002 dB of passband ripple and >117 dB of stopband attenuation past 0.5465fs at 46/fs if I have my math right.)

You can reduce group delay a whole lot if you say screw linear phase and go minimum phase instead:
ess-filter-fastmin.png

Then your group delay no longer is constant across the whole bandwidth though, and you can expect a steep increase as you are approaching the passband edge.

The CS4398 filter is some sort of hybrid job by the looks of it:
cs4398-filt.png


I think chip real estate has traditionally been a greater concern than power consumption, as each additional filter tap means you need storage for an additional sample (for the SM5847AF that was 169 + 69 + 17 taps), and you need a table of filter coefficients as well (half-band filters are pretty neat in this respect, as every other coefficient is zero, and the non-zero coefficients are symmetrical about the center of the impulse response).
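The half-band property mentioned above can be checked with a short sketch. This assumes a textbook windowed-sinc design with a Hann window, which is only an illustration and not the design method of any of the chips discussed; `halfband_taps` is a made-up helper name.

```python
import math

def halfband_taps(half_length):
    """Windowed-sinc half-band lowpass with 2*half_length + 1 taps (Hann window)."""
    taps = []
    for n in range(-half_length, half_length + 1):
        # Ideal lowpass with cutoff at fs/4: h[n] = sin(pi*n/2) / (pi*n), h[0] = 0.5
        ideal = 0.5 if n == 0 else math.sin(math.pi * n / 2) / (math.pi * n)
        window = 0.5 + 0.5 * math.cos(math.pi * n / (half_length + 1))
        taps.append(ideal * window)
    return taps

taps = halfband_taps(7)
center = len(taps) // 2

# Every other tap away from the center is (numerically) zero.
zero_taps = [i for i, t in enumerate(taps) if i != center and abs(t) < 1e-12]

# Thanks to symmetry, only the center tap and one side need to be stored.
unique_nonzero = 1 + (len(taps) - 1 - len(zero_taps)) // 2

print(len(taps), len(zero_taps), unique_nonzero)   # 15 6 5
```

In this toy case a 15-tap filter needs only 5 stored coefficients, which is the kind of saving that matters when chip area is the constraint.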

If you look at the SM5847 datasheet, it's not exactly a terrible power hog once run at 3.3 V, and that's an over 20-year-old chip - with today's electronics running at 1.8 V, power consumption should be quite modest even for multichannel at 192 kHz.

Delay in a complete audio interface tends to be much more than just the DAC's. There may be other processing going on, plus some inevitable buffering as the computer may have better things to do than just paying attention to audio output all the time. RME is using FPGAs and I bet they're watching their driver overhead like a proverbial hawk.

Incidentally, for pure playback applications group delay virtually is a complete non-issue. As long as things don't literally take seconds to get going, nobody cares. In the olden days we would be setting our ASIO buffer size to as much as 2048 samples (that's ~46.4 ms at 44.1 kHz) if required for stability, as it simply did not matter. It's just a minor delay between software saying it's playing spot X and sound at spot X actually coming out.

Delay only becomes an issue once you are doing live monitoring through a complete A/D-D/A chain. Apparently, a latency of even 3ms can be disconcerting to a vocalist. This is why optional zero latency monitoring (analog routing of audio in to audio out) is a thing in audio interfaces. If you don't have the choice of going that route, then your best chance is increasing sample rate as far as either the converters will manage or the computer can keep up with. At 192 kHz, even the massive 63/fs group delay of an AK5394's filter will shrink to a measly 0.33 ms.
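The figures above are just sample counts divided by the sample rate. A trivial sketch (the helper name is invented here):

```python
def samples_to_ms(samples, fs_hz):
    """Delay in milliseconds for a given number of sample periods at rate fs_hz."""
    return 1000.0 * samples / fs_hz

print(round(samples_to_ms(2048, 44100), 1))   # 46.4  (ASIO buffer of 2048 samples)
print(round(samples_to_ms(63, 44100), 2))     # 1.43  (63/fs filter at 44.1 kHz)
print(round(samples_to_ms(63, 192000), 2))    # 0.33  (same filter at 192 kHz)
```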

This issue also is part of why ADC digital filters are a bit crap these days (*cough* AK55xx *cough*). I wouldn't mind "low-latency" filters at all if we were at least given a "traditional" option to choose from as well. You ideally want to be running AK55xx ADCs at 384 kHz (or failing that, 192), and there are some interfaces using them that only support 96 kHz max, so that's a bit lame. RME had to leverage some DSP prowess to tackle at least a good part of the filter passband ripple.
 

maverickronin

Major Contributor
Forum Donor
Joined
Jul 19, 2018
Messages
2,527
Likes
3,310
Location
Midwest, USA
Incidentally, for pure playback applications group delay virtually is a complete non-issue. As long as things don't literally take seconds to get going, nobody cares. In the olden days we would be setting our ASIO buffer size to as much as 2048 samples (that's ~46.4 ms at 44.1 kHz) if required for stability, as it simply did not matter. It's just a minor delay between software saying it's playing spot X and sound at spot X actually coming out.

Delay only becomes an issue once you are doing live monitoring through a complete A/D-D/A chain. Apparently, a latency of even 3ms can be disconcerting to a vocalist. This is why optional zero latency monitoring (analog routing of audio in to audio out) is a thing in audio interfaces. If you don't have the choice of going that route, then your best chance is increasing sample rate as far as either the converters will manage or the computer can keep up with. At 192 kHz, even the massive 63/fs group delay of an AK5394's filter will shrink to a measly 0.33 ms.

Another application where latency is a big concern is gaming. Unfortunately I don't think there's any good data, or even consistent anecdotes, on this at all. On the one hand, the requirements don't seem to be as strict as for musicians, since computer games almost always use the standard OS mixer of whatever they are running on. On the other hand, you definitely have to stay low enough to prevent lipsync issues.

That's generous enough that the filter in any individual DAC chip probably isn't going to be a problem by itself, but in an age of sophisticated EQ, room correction, and digital monitor speakers with the possibility of multiple AD/DA loops, ASRC, and especially FIR filters, it can really add up.

There's also video editing, at least depending on what kind of video you're making. I used to make anime music videos and would time effects down to a single frame at 29.97 FPS. This was back in the day with a CRT and a vanilla PCI SoundBlaster, so latency was a non-issue. If I were still doing that today and weren't already a gamer, I'd have to get a dedicated gaming monitor instead of a graphics/video production monitor with better color. Bringing it back to audio, I certainly couldn't use any standard FIR-based room correction like Dirac and would have to leave the extended phase linearization on something like the Kii 3 turned off.
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
bennetng mentioned DACs with ASRC, which oversample at non-integer ratios and create aliasing in addition to imaging. But I have no idea if DACs like this are common at all...
That naive illustration is simply an explanation that integer upsampling is not going to introduce aliasing, just imaging. In the real world, as shown on some ESS chips or external ASRC circuitry, high quality ASRC can actually improve things, for example jitter. I don't know how it affects latency though; I am also curious about it.
The downside is that passband ripple is a bit crap (unfortunately there are no graphs provided for this part so we don't know how much of it is periodic and with what kind of frequency).
Someone checked it on an ES9038.
 
OP

BeerBear

Active Member
Joined
Mar 9, 2020
Messages
264
Likes
252
Here's what ESS has implemented (e.g. ES9026PRO):
View attachment 188395
That's 0.793 ms at 44.1k, and 0.461fs is 22330 Hz. The downside is that passband ripple is a bit crap (unfortunately there are no graphs provided for this part so we don't know how much of it is periodic and with what kind of frequency).
At the same group delay, using a conventional half-band filter gives these specs instead:
View attachment 188408
It does allow some aliasing but not a lot... basically no signal components below 0.45fs = 19845 Hz will alias at all. A more than worthwhile tradeoff for a much flatter passband in my book, which is why this type of filter traditionally is the default.
It seems like those should look similar to filters 1 and 5 here:

index.php


index.php


So... either great attenuation at 22k, but with some attenuation/ripple in the "audible range" (0.461fs is 20330Hz, isn't it?) or great reproduction until 20k, but with low attenuation at 22k (still around -10dB).

Again, I don't want to make any claims about practical audible differences, just looking at graphs...


I think chip real estate has traditionally been a greater concern than power consumption, as each additional filter tap means you need storage for an additional sample
I see. I had to look up what the taps do exactly, quoting Amir:
The number of taps is a figure of merit for FIR filters. The more taps, the lesser the ripple, and sharper the response. The drawback is increased memory, computational horsepower and latency through the system.

So what I wonder is: is there no way out of this? Is it a physical/mathematical limitation that says you need more taps for a sharper/cleaner filter and that inevitably increases latency?
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
So... either great attenuation at 22k, but with some attenuation/ripple in the "audible range" (0.461fs is 20330Hz, isn't it?) or great reproduction until 20k, but with low attenuation at 22k (still around -10dB).
Not only passband ripple and attenuation at Nyquist, but also the attenuation depth at all frequencies above Nyquist. For the same filter length, a steeper filter usually means poorer attenuation depth as well, as shown here:

For a practical approach in terms of getting good wideband measurement results: DACs and ADCs always have distortion and noise on the analog side, so the filter just needs to exceed the analog performance by a certain margin to avoid becoming a bottleneck. For example, my soundcard with the TI PCM1794A has a filter with 130dB attenuation, which ensures it does not bottleneck the analog performance.

Interestingly, the current generation of chips often only have like 80-120dB attenuation, but they seem to have greatly improved other things like modulator and such. Things like idle tones and rise of noise at ultrasonic range are pretty common on older converters, but usually greatly improved on newer chips.

Older chips like WM8741 also have a longer filter with low ripple and high attenuation at Nyquist, but only mentioned -100dB THD performance.
 
OP

BeerBear

Active Member
Joined
Mar 9, 2020
Messages
264
Likes
252
That's generous enough that the filter in any individual DAC chip probably isn't going to be a problem by itself, but in an age of sophisticated EQ, room correction, and digital monitor speakers with the possibility of multiple AD/DA loops, ASRC, and especially FIR filters, it can really add up.
Yep, and if you're applying DSP only to some speakers in the set you need to keep this in mind to prevent timing/phase issues. Same in the unlikely case that you're using different DACs at once.
Another major offender is Bluetooth. Its encoding/decoding adds significant latency (with some codecs performing better than others).
 

AnalogSteph

Major Contributor
Joined
Nov 6, 2018
Messages
3,386
Likes
3,337
Location
.de
It seems like those should look similar to filters 1 and 5 here:

index.php


index.php
It takes the juxtaposition of these two graphs to get the big picture, so great job compiling them like this. Let's play a who's who of digital filters:
#5 is the fast roll-off linear phase filter, I agree.
#3 is the fast roll-off minimum phase filter.
#2 is slow roll-off minimum phase, #4 its linear phase cousin.
#1 is the apodizing job, also agree on that one.
#6 is brickwall.
#7 is hybrid.
You will often find passband behavior mirrored in the stopband (equiripple filter). It just looks a bit funny on a dB scale - 1 is 0 dB, 0.99 is -0.08 dB, 0 is -infinity dB, 0.01 is -40 dB.
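The dB figures quoted above follow from 20*log10 of the linear gain. A tiny sketch to play with (`to_db` is just a name picked for the example):

```python
import math

def to_db(linear_gain):
    """Convert a linear amplitude gain to decibels; a gain of 0 maps to -infinity."""
    if linear_gain == 0:
        return float("-inf")
    return 20 * math.log10(linear_gain)

for gain in (1.0, 0.99, 0.01, 0.0):
    print(gain, round(to_db(gain), 3))
```

A deviation of 0.01 in linear terms is nearly invisible around a gain of 1 (passband) but shows up as a -40 dB floor around a gain of 0 (stopband), which is why equal ripple looks so lopsided on a dB plot.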

So... either great attenuation at 22k, but with some attenuation/ripple in the "audible range" (0.461fs is 20330Hz, isn't it?) or great reproduction until 20k, but with low attenuation at 22k (still around -10dB).
Oops, mea culpa - pardon the tpyo.

There's a few different tradeoffs in there, but you've pretty much got the gist.
I had to look up what the taps do exactly, quoting Amir:
Let me borrow something from Wikipedia:
1280px-FIR_Filter.svg.png

A direct form discrete-time FIR filter of order N. The top part is an N-stage delay line with N + 1 taps. Each unit delay is a z^-1 operator in Z-transform notation.
A delay line in turn is the parallel version of a shift register. That's a whole bunch of flip-flops right there. (You can also build delay line filters in the analog domain though, one notable example being SAW filters that are finding use for RF band filtering in the mid-hundreds of MHz to low GHz.)
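In software terms, the direct form structure in that diagram can be sketched as below. This is a minimal illustration, not hardware-accurate: the class name `DirectFormFIR` and the three example coefficients are made up, and a `deque` stands in for the chain of registers.

```python
from collections import deque

class DirectFormFIR:
    """Direct form FIR: a delay line of past samples, each weighted by one tap."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        # The delay line holds the current sample plus the N previous ones.
        self.delay_line = deque([0.0] * len(self.coefficients),
                                maxlen=len(self.coefficients))

    def process(self, sample):
        # Shift the new sample in; the oldest one falls off the other end.
        self.delay_line.appendleft(sample)
        return sum(c * s for c, s in zip(self.coefficients, self.delay_line))

fir = DirectFormFIR([0.25, 0.5, 0.25])
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print([fir.process(s) for s in impulse])   # [0.25, 0.5, 0.25, 0.0, 0.0]
```

Feeding in a single impulse reads the coefficients back out one by one, which is why the tap values are also the filter's impulse response, and why every extra tap costs one more stored sample.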

So what I wonder is: is there no way out of this? Is it a physical/mathematical limitation that says you need more taps for a sharper/cleaner filter and that inevitably increases latency?
As long as you're staying in the world of linear phase (= FIR) filters, that's pretty much the case.
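To make the tap-count/latency link concrete: a symmetric (linear phase) FIR with N taps delays every frequency by (N - 1) / 2 samples. The sketch below checks that numerically from the phase slope; the 9-tap triangular filter and the helper names are arbitrary choices for the illustration.

```python
import cmath

def freq_response(taps, omega):
    """Complex frequency response at omega (radians per sample)."""
    return sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(taps))

def group_delay_samples(taps, omega, step=1e-6):
    """Group delay as the negative slope of the phase, by finite difference."""
    phase_a = cmath.phase(freq_response(taps, omega))
    phase_b = cmath.phase(freq_response(taps, omega + step))
    return -(phase_b - phase_a) / step

taps = [1, 2, 3, 4, 5, 4, 3, 2, 1]   # symmetric, 9 taps
delay = group_delay_samples(taps, omega=0.1)
print(round(delay, 3), (len(taps) - 1) / 2)   # 4.0 4.0
```

Doubling the taps to sharpen the filter therefore doubles the delay, as long as the symmetry (and with it the linear phase) is kept.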

You can "cheat" by going partially or fully minimum phase (= IIR, recursive), of course. IIR filters are most closely related to the kind we are most often seeing in the analog world, think LC and the like. Historically, IIR filters were unpopular in DSP as introducing feedback allows rounding errors to accumulate. That's a major concern when all you have is 24-bit fixed point data like the classic Motorola 56K architecture, so generally FIR it was. Today's computers that will juggle 32-bit or even 64-bit floating point with ease are a much different story.
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
As long as you're staying in the world of linear phase (= FIR) filters, that's pretty much the case.

You can "cheat" by going partially or fully minimum phase (= IIR, recursive), of course. IIR filters are most closely related to the kind we are most often seeing in the analog world, think LC and the like.

I think you know this, but FIR filters can be made minimum phase (or other) too. That property is not exclusive to IIR filters.
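A toy illustration of that point, using the smallest possible case: two 2-tap FIR filters with the same taps in opposite order. The tap values are made up for the example; they share one magnitude response, but the first has its zero inside the unit circle (minimum phase) and puts its energy up front.

```python
import cmath
import math

def magnitude(taps, omega):
    """Magnitude of the frequency response at omega (radians per sample)."""
    return abs(sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(taps)))

min_phase = [1.0, 0.5]   # zero at z = -0.5, inside the unit circle
max_phase = [0.5, 1.0]   # same taps reversed: zero at z = -2, outside

for omega in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(round(magnitude(min_phase, omega), 6),
          round(magnitude(max_phase, omega), 6))

def first_tap_energy(taps):
    """Fraction of the impulse response energy carried by the first tap."""
    return taps[0] ** 2 / sum(h * h for h in taps)

print(first_tap_energy(min_phase), first_tap_energy(max_phase))   # 0.8 0.2
```

Both are plain FIR filters with no feedback; only the arrangement of the taps decides whether the response is front-loaded.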
 

DVDdoug

Major Contributor
Joined
May 27, 2021
Messages
3,024
Likes
3,980
Latency is delay - When I'm listening to a Beatles recording with 60 years of "latency" a few more milliseconds don't bother me! :D :D :D

Latency is ONLY a problem when you are recording and monitoring yourself through the computer. If there is too much latency the delay in your headphones makes it difficult to perform. Some audio interfaces have zero-latency direct-hardware monitoring where the monitoring path doesn't go-through the computer (but you can still monitor the backing-track from the computer) or you can set-up a separate monitoring system, etc.

Latency is related to buffering and a larger buffer (more latency) can help to prevent "glitches" so it can be a good thing!

In the old days of analog tape the space between the record head and playback/monitoring head (on a 3-head machine) created a delay so the performer would NEVER be monitoring through the tape machine. (And, you could loop-back to make an echo effect.)

A funny observation about filtering - I had a soundcard with no filtering! It was on a computer at work that I used for casual listening but I never heard anything wrong! One day I was doing some experiments and I connected an oscilloscope. (I don't remember what the experiments were about.) I saw a stair-stepped waveform! I was shocked! But after I got over the shock I realized that the noise is above the audio range plus it's filtered by the limitation of the speakers.
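A rough sketch of why a stair-stepped output can still sound fine. It assumes the card behaved like an ideal zero-order hold at 44.1 kHz with no oversampling, which is a guess about that hardware; the hold itself has a sinc-shaped response. `zoh_gain_db` is a name invented for the example.

```python
import math

def zoh_gain_db(freq_hz, fs_hz):
    """Gain of an ideal zero-order hold, |sin(x)/x| with x = pi*f/fs, in dB."""
    x = math.pi * freq_hz / fs_hz
    gain = 1.0 if x == 0 else abs(math.sin(x) / x)
    return 20 * math.log10(gain)

FS = 44100  # assumed output rate of the unfiltered card
print(round(zoh_gain_db(20000, FS), 2))        # droop at a 20 kHz tone
print(round(zoh_gain_db(FS - 20000, FS), 2))   # its first image at 24.1 kHz
```

Under that assumption the hold only knocks a few dB off the top octave and barely more off the first image, so nearly all of the image rejection is left to the speakers and the ears.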
 

mieswall

Member
Joined
Nov 10, 2019
Messages
65
Likes
112
OMG, why reopen a subject that has been discussed thousands of times?
The answer is the Nyquist frequency.
Even so, taking that RME example with its gentle roll-off at the expense of going some 2 kHz beyond the Nyquist frequency... wouldn't it be a suitable strategy to move the filter 2 kHz back? The attenuation would be 1 dB at 19 kHz and some 10 dB at 20 kHz, but with no potential aliasing beyond the Nyquist frequency. That would "measure" ugly, but would probably sound better, as no real music reaches full amplitude at those frequencies, so this "invasive" filter wouldn't be doing any harm, right?
 
OP

BeerBear

Active Member
Joined
Mar 9, 2020
Messages
264
Likes
252
Even so, taking that RME example with its gentle roll-off at the expense of going some 2 kHz beyond the Nyquist frequency... wouldn't it be a suitable strategy to move the filter 2 kHz back? The attenuation would be 1 dB at 19 kHz and some 10 dB at 20 kHz, but with no potential aliasing beyond the Nyquist frequency. That would "measure" ugly, but would probably sound better, as no real music reaches full amplitude at those frequencies, so this "invasive" filter wouldn't be doing any harm, right?
Well, the assumption is that humans can hear 20kHz. Most of us adults can't, but that's usually the ideal to strive for. You can of course make your own rules and decide what's good enough for you.
As I said before, I don't even think this is a problem for music playback, where low latency is not necessary. Because there you can apply high quality software digital filters that eliminate imaging without affecting the audible range—at least if you're using a computer (or something like that) as a source.

But for music production, if low latency is required, 44.1kHz just won't cut it, assuming the goal is to make a "technically perfect" product. 48kHz might just be enough, maybe. 88.2kHz would be playing it safe.
From what I see, most music production does not include this obsessive chase for technical perfection, which is why a lot of it is still being produced at 44.1kHz today. I'm talking about the sample rate used during mixing and mastering and so on, not just the delivery format. But technical perfection is probably not required for audible perfection (aka transparency), so maybe this is all theoretical. Nevertheless, technical perfection is cheap enough nowadays that it should be attractive to the audiophile crowd.


And it would be interesting to know just how much latency is needed at 44.1kHz to get a nice sharp filter. 1ms looks to not be enough, but I believe that 20ms is, because I've read that's roughly the latency of the SoX resampler.
 

AnalogSteph

Major Contributor
Joined
Nov 6, 2018
Messages
3,386
Likes
3,337
Location
.de
And it would be interesting to know just how much latency is needed at 44.1kHz to get a nice sharp filter. 1ms looks to not be enough, but I believe that 20ms is, because I've read that's roughly the latency of the SoX resampler.
Well, here's some ADC filters that I would generally consider "44.1 kHz proof" (many of them EOL for years)...

PB = passband edge, SB = stopband edge, PR = passband ripple, SA = stopband attenuation, GD = group delay (ms values at fs = 44.1 kHz)

Type                | PB        | SB        | PR                     | SA      | GD
AK5394              | 20.00 kHz | 24.10 kHz | +/-0.001 dB            | -120 dB | 63/fs = 1.43 ms
AK5383 & AK5393     | 20.00 kHz | 24.10 kHz | +/-0.001 dB            | -110 dB | 38.7/fs = 0.877 ms
AK5385              | 19.75 kHz | 24.35 kHz | +/-0.005 dB            | -100 dB | 43.2/fs = 0.980 ms
AK5397 (Sharp)      | 20.15 kHz | 24.04 kHz | +0.00015 / -0.00010 dB | -100 dB | 41.5/fs = 0.941 ms
PCM4220/2 (Classic) | 20.00 kHz | 24.10 kHz | +/-0.00015 dB          | -100 dB | 39/fs = 0.884 ms
PCM4202/4, PCM1804  | 19.98 kHz | 24.12 kHz | +/-0.005 dB            | -100 dB | 37/fs = 0.839 ms
CS5396              | 20.30 kHz | 24.44 kHz | +/-0.005 dB            | -117 dB | 34/fs = 0.771 ms
CS5397              | 17.46 kHz | 21.96 kHz | +/-0.005 dB            | -117 dB | 34/fs = 0.771 ms

Filter design is a tradeoff between multiple performance parameters for a given number of taps, so you have to specify several of them before the effort required can be estimated. It seems safe to say that a little over a hundred samples' worth of delay should go a long way though.
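For a ballpark of how those parameters trade against tap count, Kaiser's rule-of-thumb estimate for a single-stage equiripple linear phase lowpass can be sketched as follows. This is an approximation only, not a design result, and real converter filters are multi-stage. The target used here (flat to 20 kHz, -50 dB at 22.05 kHz) is the one floated earlier in the thread; the 0.01 dB passband ripple is an assumption added for the example, as is the function name.

```python
import math

def kaiser_tap_estimate(passband_hz, stopband_hz, ripple_db, atten_db, fs_hz):
    """Kaiser's approximate tap count for an equiripple linear phase lowpass."""
    delta_p = 10 ** (ripple_db / 20) - 1   # passband ripple as a linear deviation
    delta_s = 10 ** (-atten_db / 20)       # stopband level as a linear gain
    transition = (stopband_hz - passband_hz) / fs_hz   # normalized transition width
    numerator = -20 * math.log10(math.sqrt(delta_p * delta_s)) - 13
    return math.ceil(numerator / (14.6 * transition) + 1)

FS = 44100
taps = kaiser_tap_estimate(20000, 22050, ripple_db=0.01, atten_db=50, fs_hz=FS)
delay_ms = 1000 * (taps - 1) / 2 / FS   # linear phase delay is (N - 1) / 2 samples
print(taps, round(delay_ms, 2))
```

Tightening any one of the three inputs (narrower transition, less ripple, deeper stopband) pushes the tap count and the delay up, which is the tradeoff described above. Running the same filter at an oversampled rate scales the tap count with the rate, but the delay in milliseconds stays roughly the same.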

The Foobar2000 SoX resampler plugin also permits changing the phase response all the way from linear phase to minimum phase, which should be accompanied by a corresponding difference in in-band group delay if it does what it says on the tin.

As an aside, ADC stopband attenuation may be compromised by clock jitter. Might be part of why values in excess of 100 dB are rarely being pursued these days, alongside the usual group delay concerns.
 
OP

BeerBear

Active Member
Joined
Mar 9, 2020
Messages
264
Likes
252
Those numbers in the chart don't look that good to me, unless the filter slope is straight/linear.
You can get similar (around -100dB at 24kHz) attenuation with typical "sharp" filters used in current DACs, but due to the convex shape of the slope, attenuation at 22kHz is only around -10dB. Example.
 

Sokel

Master Contributor
Joined
Sep 8, 2021
Messages
6,093
Likes
6,134
For the AK5385 I can measure if you like, as it's what E-MU has (as long as you tell me what to do in the Multitone software, total newbie).
That's how I measured Khadas tone1 filters:



Khadas filters.PNG
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
Looks like this one is clipped (should be ESS minimum phase fast rolloff filter).
khadas.png


Also, most analog measurements (including AP) cannot reveal the whole stopband shape without using more advanced methods, and give the false impression that most filters in your screenshot, for example, only have 90dB attenuation. You can see how @pkane used a specialized signal to reveal 120dB attenuation, the real attenuation depth of, for example, the ESS linear phase fast rolloff filter.

 