The limits of 44.1k, filters and latency

BeerBear · Feb 21, 2022

Is 44.1kHz fine as a delivery format and we don't need anything more?
Just about the only objection from reasonable people I see is that it's hard to filter content in a DAC with that sample rate...

Now, I only have a very superficial understanding of how a (modern) DAC works. From what I've read so far, the input digital signal gets oversampled, which creates imaging of the signal above nyquist (link). That content should then be low-pass filtered. Ideally the filter would not touch content until 20k (for the human ears to enjoy the "full range"), but remove everything above 22.05.
It's not hard to employ such a filter on a PC; a high quality software SRC can do it easily. But it's harder to do it in a DAC and the main reasons are processing power and latency. Supposedly even the SoX resampler, which is high quality and fast, has around 20ms of latency. That's fine for casual music listening, but not for interactive use (such as making music, for instance).
Does all this sound about right?
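
To make the imaging step concrete, here is a minimal sketch assuming the simplest possible 2x oversampling (zero-stuffing, with no low-pass filter applied yet). The 10 kHz test tone, the 10 ms block length and the helper names `dft_mag` and `peaks` are illustrative choices only, not taken from any particular DAC.

```python
import cmath
import math

FS = 44_100
N = 441            # 10 ms block, so DFT bins land on exact multiples of 100 Hz
TONE_HZ = 10_000   # test tone, an assumption for this illustration

# Original signal at 44.1 kHz.
x = [math.cos(2 * math.pi * TONE_HZ * n / FS) for n in range(N)]

# 2x oversampling by zero-stuffing: insert one zero after every sample.
y = []
for sample in x:
    y.extend((sample, 0.0))

def dft_mag(signal, k):
    """Normalized magnitude of DFT bin k (plain O(N) sum, fine for short blocks)."""
    total = len(signal)
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / total)
              for n, s in enumerate(signal))
    return abs(acc) / total

def peaks(signal, rate_hz, threshold=0.1):
    """Frequencies (Hz) from 0 to rate/2 whose bin magnitude exceeds the threshold."""
    total = len(signal)
    bin_hz = rate_hz / total
    return [round(k * bin_hz) for k in range(total // 2 + 1)
            if dft_mag(signal, k) > threshold]

print(peaks(x, FS))       # spectrum of the original signal
print(peaks(y, 2 * FS))   # spectrum after zero-stuffing to 88.2 kHz
```

The second list contains the original tone plus its image at 44100 - 10000 = 34100 Hz, which is the content the DAC's low-pass filter then has to remove.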

My question then is, where are we currently with filters in DACs? Are there DACs that can filter really well at low latency? Let's say... flat to 20k and -50dB at 22.05k, with no more than 1ms latency (latency of the DAC itself, not of the additional components or drivers). Does/would such a DAC use a lot of power?

For example, RME's ADI-2 filters are under 1ms, but looking at some graphs, even the sharpest one is still at -13dB at 22.05k. But according to the graph here, it attenuates really well at 24k (dunno if it's the same sharp filter or not).

Mind you, I don't claim that content at 22kHz and above is audible. So in the end this debate could all just be theoretical. I'm imagining some worst case scenarios, like a (young) person with exceptional hearing or -- more realistically -- signals that could somehow end up affecting the equipment and/or leak down into the audible range. For example in that first link, bennetng mentioned DACs with ASRC, which oversample at non-integer ratios and create aliasing in addition to imaging. But I have no idea if DACs like this are common at all...

Frgirard · Feb 21, 2022

OMG, why reopen a subject thousands of times mentioned.
The response is the Nyquist frequency.

maverickronin · Feb 21, 2022

BeerBear said:
My question then is, where are we currently with filters in DACs? Are there DACs that can filter really well at low latency? Let's say... flat to 20k and -50dB at 22.05k, with no more than 1ms latency (latency of the DAC itself, not of the additional components or drivers). Does/would such a DAC use a lot of power?

For example, RME's ADI-2 filters are under 1ms, but looking at some graphs, even the sharpest one is still at -13dB at 22.05k. But according to the graph here, it attenuates really well at 24k (dunno if it's the same sharp filter or not).

In case you were considering an ADI-2 DAC, all those figures are out of date now since they had to switch to using an ESS chip due to parts shortages.

BeerBear · Feb 21, 2022

@Frgirard: I don't think DAC filter latency has been discussed that much, because, understandably, it's not a great concern to most people here. It's only mentioned in passing here and there. So I wanted to have a more general overview of the situation.

@maverickronin: Thanks, but I'm not looking at any specific product right now, that was just an example. RME tends to prioritize low latency, so I expect most others to be worse, but I'm just guessing.

This is more of a discussion about how current and potential future DACs behave with 44.1k playback and whether a higher sample rate could be beneficial. Double-blind testing is great, but not always easy to do... so I'm looking at more theoretical answers, worst case edge scenarios and so on. IMD was mentioned in another thread, for instance. That could happen with strong content near nyquist and insufficient imaging attenuation. Likely to happen with typical music? Not at all, but it's still interesting to me.

It looks like for people who don't care about latency, the problem is essentially solved, because they could just use high quality software (on their computer, phone, CD player...) to resample 44.1 music to 88.2 before sending it to the DAC. That way, a slow filter of a DAC can do its job with no imaging and with all the audible content preserved faithfully.
But for people who do care about latency (or power usage), that's a significant price to pay.

AnalogSteph · Feb 22, 2022

BeerBear said:
My question then is, where are we currently with filters in DACs? Are there DACs that can filter really well at low latency? Let's say... flat to 20k and -50dB at 22.05k, with no more than 1ms latency (latency of the DAC itself, not of the additional components or drivers). Does/would such a DAC use a lot of power?

Here's what ESS has implemented (e.g. ES9026PRO):

That's 0.793 ms at 44.1k, and 0.461fs is 22330 Hz. The downside is that passband ripple is a bit crap (unfortunately there are no graphs provided for this part so we don't know how much of it is periodic and with what kind of frequency).
At the same group delay, using a conventional half-band filter gives these specs instead:

It does allow some aliasing but not a lot... basically no signal components below 0.45fs = 19845 Hz will alias at all. A more than worthwhile tradeoff for a much flatter passband in my book, which is why this type of filter traditionally is the default.

(And that still isn't super flat compared to some DAC filters of yore. AD1955 and PCM1795 are spec'd at +/-0.0002 dB of passband ripple with group delays of 43.38/fs and 38/fs respectively, and the SM5847AF dedicated digital filter from the '90s specifies less than +/-0.00002 dB of passband ripple and >117 dB of stopband attenuation past 0.5465fs with a group delay of 46/fs if I have my math right.)

You can reduce group delay a whole lot if you say screw linear phase and go minimum phase instead:

Then your group delay no longer is constant across the whole bandwidth though, and you can expect a steep increase as you are approaching the passband edge.

The CS4398 filter is some sort of hybrid job by the looks of it:

I think chip real estate has traditionally been a greater concern than power consumption, as each additional filter tap means you need storage for an additional sample (for the SM5847AF that was 169 + 69 + 17 taps), and you need a table of filter coefficients as well (half-band filters are pretty neat in this respect, as every other coefficient is zero, and the non-zero coefficients are symmetrical about the center of the impulse response).
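
As a rough illustration of the half-band property mentioned above, here is a sketch that builds a small Hamming-windowed sinc filter with its cutoff at fs/4 and counts how many coefficients actually need storing. The function name `halfband_taps`, the window choice and the 15-tap length are assumptions for the example; real converter filters are designed with far more care.

```python
import math

def halfband_taps(half_len):
    """Hamming-windowed sinc low-pass with cutoff at fs/4 (a half-band filter).

    Returns 2*half_len + 1 coefficients for n = -half_len .. +half_len.
    """
    taps = []
    for n in range(-half_len, half_len + 1):
        # Ideal half-band response: 0.5 at the center, sin(pi*n/2)/(pi*n) elsewhere.
        ideal = 0.5 if n == 0 else math.sin(math.pi * n / 2) / (math.pi * n)
        # Hamming window centered on n = 0.
        window = 0.54 + 0.46 * math.cos(math.pi * n / half_len)
        taps.append(ideal * window)
    return taps

HALF_LEN = 7
taps = halfband_taps(HALF_LEN)  # 15 taps

# Every even-numbered tap away from the center is zero (up to rounding error).
zero_count = sum(1 for t in taps if abs(t) < 1e-12)
# The impulse response is symmetric about its center.
is_symmetric = all(abs(a - b) < 1e-12 for a, b in zip(taps, reversed(taps)))
# Unique values to store: the center tap plus the non-zero taps on one side.
stored = 1 + sum(1 for t in taps[:HALF_LEN] if abs(t) >= 1e-12)

print(len(taps), zero_count, is_symmetric, stored)
```

For this 15-tap example, 6 coefficients are zero and only 5 distinct values need a place in the coefficient table, which is the saving being described.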

If you look at the SM5847 datasheet, it's not exactly a terrible power hog once run at 3.3 V, and that's an over 20-year-old chip - with today's electronics running at 1.8 V, power consumption should be quite modest even for multichannel at 192 kHz.

Delay in a complete audio interface tends to be much more than just the DAC's. There may be other processing going on, plus some inevitable buffering as the computer may have better things to do than just paying attention to audio output all the time. RME is using FPGAs and I bet they're watching their driver overhead like a proverbial hawk.

Incidentally, for pure playback applications group delay virtually is a complete non-issue. As long as things don't literally take seconds to get going, nobody cares. In the olden days we would be setting our ASIO buffer size to as much as 2048 samples (that's ~46.4 ms at 44.1 kHz) if required for stability, as it simply did not matter. It's just a minor delay between software saying it's playing spot X and sound at spot X actually coming out.

Delay only becomes an issue once you are doing live monitoring through a complete A/D-D/A chain. Apparently, a latency of even 3ms can be disconcerting to a vocalist. This is why optional zero latency monitoring (analog routing of audio in to audio out) is a thing in audio interfaces. If you don't have the choice of going that route, then your best chance is increasing sample rate as far as either the converters will manage or the computer can keep up with. At 192 kHz, even the massive 63/fs group delay of an AK5394's filter will shrink to a measly 0.33 ms.
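
The delay figures above are just sample counts divided by the sample rate. A tiny sketch of that arithmetic, with `samples_to_ms` being a helper name made up for this example:

```python
def samples_to_ms(samples, sample_rate_hz):
    """Convert a delay expressed in sample periods to milliseconds."""
    return 1000.0 * samples / sample_rate_hz

asio_buffer_ms = samples_to_ms(2048, 44_100)     # the 2048-sample ASIO buffer
ak5394_at_44k1_ms = samples_to_ms(63, 44_100)    # 63/fs group delay at 44.1 kHz
ak5394_at_192k_ms = samples_to_ms(63, 192_000)   # same filter at 192 kHz

print(round(asio_buffer_ms, 1), round(ak5394_at_44k1_ms, 2), round(ak5394_at_192k_ms, 2))
```

Because the delay is fixed in samples, raising the sample rate shrinks it proportionally.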

This issue also is part of why ADC digital filters are a bit crap these days (*cough* AK55xx *cough*). I wouldn't mind "low-latency" filters at all if we were at least given a "traditional" option to choose from as well. You ideally want to be running AK55xx ADCs at 384 kHz (or failing that, 192), and there are some interfaces using them that only support 96 kHz max, so that's a bit lame. RME had to leverage some DSP prowess to tackle at least a good part of the filter passband ripple.

maverickronin · Feb 22, 2022

AnalogSteph said:
Incidentally, for pure playback applications group delay virtually is a complete non-issue. As long as things don't literally take seconds to get going, nobody cares. In the olden days we would be setting our ASIO buffer size to as much as 2048 samples (that's ~46.4 ms at 44.1 kHz) if required for stability, as it simply did not matter. It's just a minor delay between software saying it's playing spot X and sound at spot X actually coming out.

Delay only becomes an issue once you are doing live monitoring through a complete A/D-D/A chain. Apparently, a latency of even 3ms can be disconcerting to a vocalist. This is why optional zero latency monitoring (analog routing of audio in to audio out) is a thing in audio interfaces. If you don't have the choice of going that route, then your best chance is increasing sample rate as far as either the converters will manage or the computer can keep up with. At 192 kHz, even the massive 63/fs group delay of an AK5394's filter will shrink to a measly 0.33 ms.

Another application where latency is a big concern is gaming. Unfortunately I don't think there's any good data, or even consistent anecdotes, on this at all. On the one hand, the requirements don't seem to be as strict as for musicians, since computer games almost always use the standard OS mixer of whatever they are running on. On the other hand, you definitely have to stay low enough to prevent lipsync issues.

That's generous enough that the filter in any individual DAC chip probably isn't going to be a problem by itself, but in an age of sophisticated EQ, room correction, and digital monitor speakers with the possibility of multiple AD/DA loops, ASRC, and especially FIR filters it can really add up.

There's also video editing, at least depending on what kind of video you're making. I used to make anime music videos and would time effects down to a single frame at 29.97FPS. This was back in the day with a CRT and vanilla PCI SoundBlaster so latency was a non issue. If I was still doing that today and wasn't already a gamer I'd have to get a dedicated gaming monitor instead of a graphics/video production monitor with better color. Bringing it back to audio, I certainly couldn't use any standard FIR based room correction like Dirac and would have to leave the extended phase linearization on something like the Kii 3 turned off.

bennetng · Feb 22, 2022

BeerBear said:
bennetng mentioned DACs with ASRC, which oversample at non-integer ratios and create aliasing in addition to imaging. But I have no idea if DACs like this are common at all...

That naive illustration is simply an explanation that integer upsampling is not going to introduce aliasing, just imaging. In the real world, as shown on some ESS chips or external ASRC circuitry, high quality ASRC can actually improve things, for example jitter. I don't know how it is going to affect latency though; I am also curious about it.

AnalogSteph said:
The downside is that passband ripple is a bit crap (unfortunately there are no graphs provided for this part so we don't know how much of it is periodic and with what kind of frequency).

Someone checked it on an ES9038.

Topping D90SE Review (Balanced DAC)

Nice @dsnyder0cnn. Suggest changing the Topping D90SE to Filter 5 "Fast-Roll Linear" (best IMO) to remove all that rippling in the frequency response from the apodizing filter (Filter 1). Thanks for the tip. I plan to measure the amplitude and time-domain behavior of all seven filters…perhaps...

www.audiosciencereview.com

BeerBear · Feb 22, 2022

AnalogSteph said:
Here's what ESS has implemented (e.g. ES9026PRO):
That's 0.793 ms at 44.1k, and 0.461fs is 22330 Hz. The downside is that passband ripple is a bit crap (unfortunately there are no graphs provided for this part so we don't know how much of it is periodic and with what kind of frequency).
At the same group delay, using a conventional half-band filter gives these specs instead:
It does allow some aliasing but not a lot... basically no signal components below 0.45fs = 19845 Hz will alias at all. A more than worthwhile tradeoff for a much flatter passband in my book, which is why this type of filter traditionally is the default.

It seems like those should look similar to filters 1 and 5 here:

So... either great attenuation at 22k, but with some attenuation/ripple in the "audible range" (0.461fs is 20330Hz, isn't it?) or great reproduction until 20k, but with low attenuation at 22k (still around -10dB).

Again, I don't want to make any claims about practical audible differences, just looking at graphs...

I think chip real estate has traditionally been a greater concern than power consumption, as each additional filter tap means you need storage for an additional sample

I see. I had to look up what the taps do exactly, quoting Amir:

The number of taps is a figure of merit for FIR filters. The more taps, the lesser the ripple, and sharper the response. The drawback is increased memory, computational horsepower and latency through the system.

So what I wonder is: is there no way out of this? Is it a physical/mathematical limitation that says you need more taps for a sharper/cleaner filter and that inevitably increases latency?

dc655321 · Feb 22, 2022

BeerBear said:
Is it a physical/mathematical limitation that says you need more taps for a sharper/cleaner filter and that inevitably increases latency?

Yes.
With caveats, of course.

bennetng · Feb 23, 2022

BeerBear said:
So... either great attenuation at 22k, but with some attenuation/ripple in the "audible range" (0.461fs is 20330Hz, isn't it?) or great reproduction until 20k, but with low attenuation at 22k (still around -10dB).

Not only passband ripple and attenuation at Nyquist, but also attenuation depth at all frequencies above Nyquist. Within the same filter length, a steeper filter usually means poorer attenuation depth as well, as shown here:

SMSL SU-9 Balanced DAC Review

Can someone address this Reddit post on this subject.. is this guy right? I’m not an EE and honestly don’t know. This person claims anything besides brick wall is wrong wrong wrong. He conveniently forgets minimum phase filters are used in minimum latency applications like for stage or other...

www.audiosciencereview.com

For a practical approach in terms of getting good wideband measurement results: DACs and ADCs always have distortion and noise on the analog side, so the filter just needs to exceed the analog performance by a certain margin to avoid becoming a bottleneck. For example, my soundcard with a TI PCM1794A has a filter with 130dB attenuation to ensure it does not bottleneck analog performance.

Interestingly, the current generation of chips often only have like 80-120dB attenuation, but they seem to have greatly improved other things like modulator and such. Things like idle tones and rise of noise at ultrasonic range are pretty common on older converters, but usually greatly improved on newer chips.

Older chips like the WM8741 also have a longer filter with low ripple and high attenuation at Nyquist, but only claim -100dB THD performance.

BeerBear · Feb 23, 2022

maverickronin said:
That's generous enough that the filter in any individual DAC chip probably isn't going to be a problem by itself, but in an age of sophisticated EQ, room correction, and digital monitor speakers with the possibility of multiple AD/DA loops, ASRC, and especially FIR filters it can really add up.

Yep, and if you're applying DSP only to some speakers in the set you need to keep this in mind to prevent timing/phase issues. Same in the unlikely case that you're using different DACs at once.
Another major offender is Bluetooth. Its encoding/decoding adds significant latency (with some codecs performing better than others).

AnalogSteph · Feb 23, 2022

BeerBear said:
It seems like those should look similar to filters 1 and 5 here:

It takes the juxtaposition of these two graphs to get the big picture, so great job compiling them like this. Let's play a who's who of digital filters:
#5 is the fast roll-off linear phase filter, I agree.
#3 is the fast roll-off minimum phase filter.
#2 is slow roll-off minimum phase, #4 its linear phase cousin.
#1 is the apodizing job, also agree on that one.
#6 is brickwall.
#7 is hybrid.
You will often find passband behavior mirrored in the stopband (equiripple filter). It just looks a bit funny on a dB scale - 1 is 0 dB, 0.99 is -0.08 dB, 0 is -infinity dB, 0.01 is -40 dB.
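
The dB figures quoted above follow from 20*log10 of the linear amplitude. A short sketch, assuming a hypothetical equiripple deviation of 0.01 in both bands (the `to_db` helper and the `ripple` value are just for illustration):

```python
import math

def to_db(amplitude_ratio):
    """Convert a linear amplitude ratio to decibels."""
    return 20.0 * math.log10(amplitude_ratio)

ripple = 0.01  # hypothetical linear deviation, identical in passband and stopband

passband_low_db = to_db(1.0 - ripple)   # passband dips from 1.0 to 0.99
stopband_db = to_db(ripple)             # stopband sits at 0.01 instead of 0

print(round(passband_low_db, 3), round(stopband_db, 1))
```

The same linear error shows up as under a tenth of a dB in the passband and as a 40 dB floor in the stopband, which is why the two look so different on a log scale.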

BeerBear said:
So... either great attenuation at 22k, but with some attenuation/ripple in the "audible range" (0.461fs is 20330Hz, isn't it?) or great reproduction until 20k, but with low attenuation at 22k (still around -10dB).

Oops, mea culpa - pardon the tpyo.

There's a few different tradeoffs in there, but you've pretty much got the gist.

BeerBear said:
I had to look up what the taps do exactly, quoting Amir:

Let me borrow something from Wikipedia:

A direct form discrete-time FIR filter of order N. The top part is an N-stage delay line with N + 1 taps. Each unit delay is a z−1 operator in Z-transform notation.

A delay line in turn is the parallel version of a shift register. That's a whole bunch of flip-flops right there. (You can also build delay line filters in the analog domain though, one notable example being SAW filters that are finding use for RF band filtering in the mid-hundreds of MHz to low GHz.)
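
In software terms, the direct form structure quoted above can be sketched like this. The class name `DirectFormFIR` and the three example coefficients are made up for the illustration; the point is only that every tap needs one stored sample and one coefficient.

```python
from collections import deque

class DirectFormFIR:
    """Direct-form FIR filter: a delay line plus one multiply-accumulate per tap."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        # Delay line with the newest sample first; it starts out filled with silence.
        self.delay_line = deque([0.0] * len(self.coeffs), maxlen=len(self.coeffs))

    def process(self, sample):
        # Push the new sample in; the oldest one falls off the far end.
        self.delay_line.appendleft(sample)
        return sum(c * x for c, x in zip(self.coeffs, self.delay_line))

# Order N = 2 filter, so N + 1 = 3 taps.
fir = DirectFormFIR([0.25, 0.5, 0.25])
impulse_response = [fir.process(x) for x in [1.0, 0.0, 0.0, 0.0, 0.0]]
print(impulse_response)
```

Feeding in a single impulse simply reads the coefficients back out in order, with the peak of this symmetric filter arriving one sample (N/2) after the input.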

BeerBear said:
So what I wonder is: is there no way out of this? Is it a physical/mathematical limitation that says you need more taps for a sharper/cleaner filter and that inevitably increases latency?

As long as you're staying in the world of linear phase (= FIR) filters, that's pretty much the case.
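
To put very rough numbers on that, here is a sketch using the well-known rule of thumb (usually attributed to fred harris) that an FIR low-pass needs about atten_dB / (22 * transition_width / fs) taps. This is only a ballpark: it ignores passband ripple entirely and says nothing about how real DAC chips split the work across multiple oversampling stages. The function names are assumptions for the example.

```python
import math

def estimate_fir_taps(stopband_atten_db, transition_hz, sample_rate_hz):
    """Rule-of-thumb tap count: N ~ atten_dB / (22 * transition / fs)."""
    return math.ceil(stopband_atten_db * sample_rate_hz / (22.0 * transition_hz))

def linear_phase_delay_ms(num_taps, sample_rate_hz):
    """A symmetric (linear-phase) FIR delays everything by (taps - 1) / 2 samples."""
    return 1000.0 * (num_taps - 1) / 2.0 / sample_rate_hz

FS = 44_100
TRANSITION_HZ = 22_050 - 20_000  # flat to 20 kHz, stopband from Nyquist

results = {}
for atten_db in (50, 100):
    taps = estimate_fir_taps(atten_db, TRANSITION_HZ, FS)
    results[atten_db] = (taps, linear_phase_delay_ms(taps, FS))
    print(atten_db, "dB:", taps, "taps,", round(results[atten_db][1], 3), "ms")
```

Under this estimate the delay grows in direct proportion to the attenuation asked for and in inverse proportion to the transition width, which is the tradeoff in a nutshell. Tightening the passband ripple spec pushes the real numbers up from there.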

You can "cheat" by going partially or fully minimum phase (= IIR, recursive), of course. IIR filters are most closely related to the kind we are most often seeing in the analog world, think LC and the like. Historically, IIR filters were unpopular in DSP as introducing feedback allows rounding errors to accumulate. That's a major concern when all you have is 24-bit fixed point data like the classic Motorola 56K architecture, so generally FIR it was. Today's computers that will juggle 32-bit or even 64-bit floating point with ease are a much different story.

dc655321 · Feb 23, 2022

AnalogSteph said:
As long as you're staying in the world of linear phase (= FIR) filters, that's pretty much the case.

You can "cheat" by going partially or fully minimum phase (= IIR, recursive), of course. IIR filters are most closely related to the kind we are most often seeing in the analog world, think LC and the like.

I think you know this, but FIR filters can be made minimum phase (or other) too. That property is not exclusive to IIR filters.

DVDdoug · Feb 23, 2022

Latency is delay - When I'm listening to a Beatles recording with 60 years of "latency" a few more milliseconds don't bother me!

Latency is ONLY a problem when you are recording and monitoring yourself through the computer. If there is too much latency the delay in your headphones makes it difficult to perform. Some audio interfaces have zero-latency direct-hardware monitoring where the monitoring path doesn't go through the computer (but you can still monitor the backing track from the computer) or you can set up a separate monitoring system, etc.

Latency is related to buffering and a larger buffer (more latency) can help to prevent "glitches" so it can be a good thing!

In the old days of analog tape the space between the record head and playback/monitoring head (on a 3-head machine) created a delay so the performer would NEVER be monitoring through the tape machine. (And, you could loop-back to make an echo effect.)

A funny observation about filtering - I had a soundcard with no filtering! It was on a computer at work that I used for casual listening but I never heard anything wrong! One day I was doing some experiments and I connected an oscilloscope. (I don't remember what the experiments were about.) I saw a stair-stepped waveform! I was shocked! But after I got over the shock I realized that the noise is above the audio range plus it's filtered by the limitation of the speakers.

mieswall · Aug 27, 2022

Frgirard said:
OMG, why reopen a subject thousands of times mentioned.
The response is the Nyquist frequency.

Even so, taking that RME example with its gentle roll-off at the expense of going some 2kHz beyond the Nyquist frequency... wouldn't it be a suitable strategy to move the filter 2kHz back? The attenuation would be 1dB at 19kHz and some 10dB at 20kHz, but with no potential aliasing beyond the Nyquist frequency. That would "measure" ugly, but probably would sound better, as no real music can reach full amplitude at those frequencies and then this "invasive" filter wouldn't be doing any harm, right?

BeerBear · Aug 28, 2022

mieswall said:
Even so, taking that RME example with its gentle roll-off at the expense of going some 2kHz beyond the Nyquist frequency... wouldn't it be a suitable strategy to move the filter 2kHz back? The attenuation would be 1dB at 19kHz and some 10dB at 20kHz, but with no potential aliasing beyond the Nyquist frequency. That would "measure" ugly, but probably would sound better, as no real music can reach full amplitude at those frequencies and then this "invasive" filter wouldn't be doing any harm, right?

Well, the assumption is that humans can hear 20kHz. Most of us adults can't, but that's usually the ideal to strive for. You can of course make your own rules and decide what's good enough for you.
As I said before, I don't even think this is a problem for music playback, where low latency is not necessary. Because there you can apply high quality software digital filters that eliminate imaging without affecting the audible range—at least if you're using a computer (or something like that) as a source.

But for music production, if low latency is required, 44.1kHz just won't cut it, assuming the goal is to make a "technically perfect" product. 48kHz might just be enough, maybe. 88.2kHz would be playing it safe.
From what I see, most music production does not include this obsessive chase for technical perfection, which is why a lot of it is still being produced at 44.1kHz today. I'm talking about the sample rate used during mixing and mastering and so on, not just the delivery format. But technical perfection is probably not required for audible perfection (aka transparency), so maybe this is all theoretical. Nevertheless, technical perfection is cheap enough nowadays that it should be attractive to the audiophile crowd.

And it would be interesting to know just how much latency is needed at 44.1kHz to get a nice sharp filter. 1ms looks to not be enough, but I believe that 20ms is, because I've read that's roughly the latency of the SoX resampler.

AnalogSteph · Aug 30, 2022

BeerBear said:
And it would be interesting to know just how much latency is needed at 44.1kHz to get a nice sharp filter. 1ms looks to not be enough, but I believe that 20ms is, because I've read that's roughly the latency of the SoX resampler.

Well, here's some ADC filters that I would generally consider "44.1 kHz proof" (many of them EOL for years)...

Type	Passband edge	Stopband edge	Passband ripple	Stopband attenuation	Group delay (at 44.1 kHz)
AK5394	20.00 kHz	24.10 kHz	+/-0.001 dB	-120 dB	63/fs = 1.43 ms
AK5383 & AK5393	20.00 kHz	24.10 kHz	+/-0.001 dB	-110 dB	38.7/fs = 0.877 ms
AK5385	19.75 kHz	24.35 kHz	+/-0.005 dB	-100 dB	43.2/fs = 0.980 ms
AK5397 (Sharp)	20.15 kHz	24.04 kHz	+0.00015 / -0.00010 dB	-100 dB	41.5/fs = 0.941 ms
PCM4220/2 (Classic)	20.00 kHz	24.10 kHz	+/-0.00015 dB	-100 dB	39/fs = 0.884 ms
PCM4202/4, PCM1804	19.98 kHz	24.12 kHz	+/-0.005 dB	-100 dB	37/fs = 0.838 ms
CS5396	20.30 kHz	24.44 kHz	+/-0.005 dB	-117 dB	34/fs = 0.771 ms
CS5397	17.46 kHz	21.96 kHz	+/-0.005 dB	-117 dB	34/fs = 0.771 ms

Filter design is a tradeoff between multiple performance parameters for a given number of taps, so you have to specify several of them before the effort required can be estimated. It seems safe to say that a little over a hundred samples' worth of delay should go a long way though.

The Foobar2000 SoX resampler plugin also permits changing the phase response all the way from linear phase to minimum phase, which should be accompanied by a corresponding difference in in-band group delay if it does what it says on the tin.

As an aside, ADC stopband attenuation may be compromised by clock jitter. Might be part of why values in excess of 100 dB are rarely being pursued these days, alongside the usual group delay concerns.

BeerBear · Aug 31, 2022

Those numbers in the chart don't look that good to me, unless the filter slope is straight/linear.
You can get similar (around -100dB at 24kHz) attenuation with typical "sharp" filters used in current DACs, but due to the convex shape of the slope, attenuation at 22kHz is only around -10dB. Example.

Sokel · Aug 31, 2022

For the AK5385 I can measure if you like, as it's what E-MU has (as long as you tell me what to do in the Multitone software, total newbie).
That's how I measured Khadas tone1 filters:

bennetng · Aug 31, 2022

Looks like this one is clipped (should be ESS minimum phase fast rolloff filter).

Also, most analog measurements (including AP) cannot reveal the whole stopband shape without using more advanced methods, and give the false impression that most filters in your screenshot, for example, only have 90dB of attenuation. You can see how @pkane used a specialized signal to get 120dB attenuation, which is the real attenuation depth of, for example, the ESS linear phase fast rolloff filter.

Beta Test: Multitone Loopback Analyzer software

Would it be relatively simple in Multitone to have it show not just FR, but also phase? I know you do this in Deltawave which works quite well, but testing phase with a spaced tones using a test signal would be a nice addition. Or should I just keep using Deltawave for this?

www.audiosciencereview.com
