
Kernel Streaming, ASIO, WASAPI... and music players (Foobar, JRiver...)

maty

Major Contributor
Joined
Dec 12, 2017
Messages
4,596
Likes
3,160
Location
Tarragona (Spain)
I've recently completed a 60 day trial of Roon (offered via Qobuz) and could not find any difference in sound quality between JRiver 25 and Roon 1.6 even when using the volume control in each as long as the levels were matched...

Some weeks ago I had both JRMC v25.0.* and v24.0.78 (64-bit) installed. With identical configuration, v24 sounded better in my system. Why? I do not know.

I have never heard Roon, much less in my systems. Too expensive!
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
This is very interesting! I’d always thought the delay created by a FIR filter was a simple function of the sample rate and tap length:

Delay (in samples) = (N - 1) / 2

Presumably, fast convolution can’t happen faster than this; does that mean that direct convolution is actually slower?

@andreasmaaan - you are referring to the group delay imposed by an FIR filter, whereas @edechamps is referring to the computational complexity of direct vs FFT-based convolution algorithms.

Not quite apples 'n oranges... Maybe apples 'n lizards? ;)

Fast convolution is faster for block sizes of N > 32-64 vs direct convolution.
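
To make the "direct vs FFT-based" distinction concrete, here is a minimal sketch in pure Python. It is only an illustration: the function names are made up for this example, the radix-2 FFT is a textbook version rather than anything a real player uses, and it says nothing about where the speed crossover actually lies. It only shows that both routes produce the same output, with direct convolution costing on the order of N x M multiplies and the FFT route on the order of N log N.

```python
import cmath

def direct_convolve(x, h):
    """Direct (time-domain) convolution: len(x) * len(h) multiplies."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def fft(a, invert=False):
    """Textbook recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def fft_convolve(x, h):
    """Fast convolution: zero-pad, multiply spectra, transform back."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:
        size *= 2
    X = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    H = fft([complex(v) for v in h] + [0j] * (size - len(h)))
    Y = fft([a * b for a, b in zip(X, H)], invert=True)
    # The inverse transform above is unnormalized, so divide by size here.
    return [v.real / size for v in Y[:out_len]]

x = [1.0, 2.0, 3.0, 4.0]   # arbitrary test signal
h = [1.0, 0.5, 0.25]       # arbitrary 3-tap filter
print(direct_convolve(x, h))
print(fft_convolve(x, h))
```

The two printed lists agree up to floating-point rounding. Neither route changes the group delay of the filter itself, which is the point being made above.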
 

Ron Texas

Master Contributor
Joined
Jun 10, 2018
Messages
6,078
Likes
8,914
Based on my experience, ASIO has a bit more resistance to skipping when other applications are being used on the playback computer. Otherwise, it sounds just like WASAPI.
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
@andreasmaaan - you are referring to the group delay imposed by an FIR filter, whereas @edechamps is referring to the computational complexity of direct vs FFT-based convolution algorithms.

Yes. But even then, claiming that a FIR filter has an inherent signal delay of (N - 1) / 2 is wrong in general, because it assumes a centered main pulse (such as a linear phase filter). It's perfectly possible to design a FIR filter that has a main pulse closer to the beginning of the impulse response, resulting in lower overall delay. They're just less common, I guess.

The discussion around delay gets complicated with Fast Convolution however, because Fast Convolution is a block-based algorithm, which means you need to accumulate multiple samples before the algorithm can run (larger blocks = less computationally expensive), which in turn means that Fast Convolution introduces its own delay. An equivalent way to look at it is to say that using Fast Convolution has the effect of prepending zeroes to the beginning of the impulse response.
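
A small sketch of the first point, using the position of the largest tap as a stand-in for "where the main pulse sits". The two coefficient lists are made-up toy values, not designed filters, and `peak_delay_samples` is a hypothetical helper name.

```python
def peak_delay_samples(h):
    """Index of the largest-magnitude tap in the impulse response h."""
    return max(range(len(h)), key=lambda n: abs(h[n]))

# Symmetric taps (linear phase): the main pulse is centered.
linear_phase = [1, 2, 3, 4, 5, 4, 3, 2, 1]

# Same length, but the main pulse sits near the start of the response.
front_loaded = [0.5, 5, 3, 2, 1, 0.5, 0.25, 0.125, 0.0625]

n_taps = len(linear_phase)
print(peak_delay_samples(linear_phase), (n_taps - 1) / 2)
print(peak_delay_samples(front_loaded))
```

For the symmetric filter the peak lands at (N - 1) / 2, while the front-loaded one peaks at index 1 despite having the same number of taps.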
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
@andreasmaaan - you are referring to the group delay imposed by an FIR filter, whereas @edechamps is referring to the computational complexity of direct vs FFT-based convolution algorithms.

Not quite apples 'n oranges... Maybe apples 'n lizards? ;)

Fast convolution is faster for block sizes of N > 32-64 vs direct convolution.

Yes. But even then, claiming that a FIR filter has an inherent signal delay of (N - 1) / 2 is wrong in general, because it assumes a centered main pulse (such as a linear phase filter). It's perfectly possible to design a FIR filter that has a main pulse closer to the beginning of the impulse response, resulting in lower overall delay. They're just less common, I guess.

The discussion around delay gets complicated with Fast Convolution however, because Fast Convolution is a block-based algorithm, which means you need to accumulate multiple samples before the algorithm can run (larger blocks = less computationally expensive), which in turn means that Fast Convolution introduces its own delay. An equivalent way to look at it is to say that using Fast Convolution has the effect of prepending zeroes to the beginning of the impulse response.

Ok, so what we're talking about here is latency as opposed to group delay - correct?
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
They're just less common, I guess.

In my admittedly limited experience, much less common than a linear-phase filter.

The discussion around delay gets complicated with Fast Convolution however, because Fast Convolution is a block-based algorithm, which means you need to accumulate multiple samples before the algorithm can run (larger blocks = less computationally expensive), which in turn means that Fast Convolution introduces its own delay. An equivalent way to look at it is to say that using Fast Convolution has the effect of prepending zeroes to the beginning of the impulse response.

As if the terms themselves were not confusing enough, I think perhaps you're conflating delay with latency.
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
In my admittedly limited experience, much less common than a linear-phase filter.

I suspect non-linear-phase filters are quite common in the specific application of room correction, for example. Room correction requires very long FIR filters to accurately correct low frequencies, but if you want to use it on real-time signals, the 500ms+ delay induced by a 65K tap linear-phase filter would be impractical. The FIR filters I'm using in my home system are 32K taps in length but the main pulse appears at 3 ms, not in the middle of the impulse response.
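
As a rough check of those numbers: a linear-phase filter puts its main pulse at (N - 1) / 2 samples. The sketch below assumes "65K" means 65536 taps and tries a few common sample rates, since no sample rate is stated here; the helper name is made up.

```python
def linear_phase_delay_ms(taps, sample_rate_hz):
    """Delay of a linear-phase FIR filter, (N - 1) / 2 samples, in ms."""
    return (taps - 1) / 2 / sample_rate_hz * 1000.0

for fs in (44100, 48000, 96000):
    print(f"{fs} Hz: {linear_phase_delay_ms(65536, fs):.0f} ms")

# A main pulse at 3 ms corresponds to this many samples at 44.1 kHz:
print(round(0.003 * 44100))
```

Under those assumptions the delay is about 0.74 s at 44.1 kHz and about 0.34 s at 96 kHz, whereas a 3 ms main pulse is only around 132 samples into the response at 44.1 kHz.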

As if the terms themselves were not confusing enough, I think perhaps you're conflating delay with latency.

Well, maybe. To me they're the same thing, but evidently we're using different definitions :)

Ok, so what we're talking about here is latency as opposed to group delay - correct?

Well I'm confused now since we seem to be using subtly different definitions. To me, the component of group delay that is constant vs. frequency is the definition of "latency" or just "delay". I would also accept a definition of latency or delay that is based on measuring the time it takes for the peak of an impulse to travel through the system under test. I don't make a distinction between "latency" and "delay", but it looks like you guys do.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
Well I'm confused now since we seem to be using subtly different definitions. To me, the component of group delay that is constant vs. frequency is the definition of "latency" or just "delay". I would also accept a definition of latency or delay that is based on measuring the time it takes for the peak of an impulse to travel through the system under test.

Me too! I think the definitions are generally a bit unclear, probably because the concept of group delay harks back to a time before digital signal processing. I think we're all conceptually on the same page here in any case - the miscommunications simply came down to terminology.

Generally I would have thought of group delay as something inherent to a filter and independent of computation, and latency as a result of computational processing time. But perhaps I'm wrong to have made this distinction.
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
Generally I would have thought of group delay as something inherent to a filter and independent of computation, and latency as a result of computational processing time.

Yes and no. It's more complicated than that. As I explained, Fast Convolution adds its own delay on top of the delay that is inherent to the impulse response of the filter itself, and that delay is caused by the mathematical properties of the Fast Convolution algorithm, not by processing cost. And you can increase the computational complexity of a filter without necessarily increasing latency (assuming we mean the same thing by "latency"), but I guess that depends on the design of the system (and I suspect the answer to that question is different for a computer software audio pipeline as opposed to hardware DSP design, for example).

I think it would be best to discuss specific, concrete, practical examples of real filters running in real systems, otherwise this is going to get even more abstract and confusing.
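
In that spirit, here is a toy streaming sketch of the block-buffering delay. It is an illustration under stated assumptions, not how any particular player or convolver is implemented: `BlockConvolver` is a made-up class, each block is processed with direct convolution plus overlap-add for brevity (an FFT per block would have the same buffering behaviour), and computation is treated as instantaneous.

```python
from collections import deque

def direct_convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

class BlockConvolver:
    """One sample in, one sample out, but work happens once per full block."""

    def __init__(self, h, block_size):
        self.h = list(h)
        self.block_size = block_size
        self.pending = []                  # inputs waiting for a full block
        self.tail = [0.0] * (len(h) - 1)   # overlap carried to the next block
        self.ready = deque()               # computed outputs not yet emitted

    def push(self, sample):
        self.pending.append(sample)
        if len(self.pending) == self.block_size:
            y = direct_convolve(self.pending, self.h)
            for i, t in enumerate(self.tail):
                y[i] += t                  # overlap-add with the previous block
            self.ready.extend(y[:self.block_size])
            self.tail = y[self.block_size:]
            self.pending = []
        # Nothing can be emitted until the first block has been processed.
        return self.ready.popleft() if self.ready else 0.0

h = [1.0, 0.5, 0.25]                       # arbitrary short filter
block = 4
x = [float(n) for n in range(1, 13)]       # arbitrary 12-sample input

conv = BlockConvolver(h, block)
streamed = [conv.push(s) for s in x]

# Same result as direct convolution with zeroes prepended to h.
padded = [0.0] * (block - 1) + h
reference = direct_convolve(x, padded)[:len(x)]
print(streamed == reference)
```

In this sketch the stream comes out exactly as if `block - 1` zeroes had been prepended to the impulse response. A real pipeline would add driver buffers and processing time on top of that.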
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
I don't make a distinction between "latency" and "delay", but it looks like you guys do

I'm afraid there is nothing unique about the use of those terms in this thread.
They're (ab)used throughout the audio stack, from production to reproduction :confused:

Generally I would have thought of group delay as something inherent to a filter and independent of computation, and latency as a result of computational processing time. But perhaps I'm wrong to have made this distinction.

If you're wrong, then we both are - that is the distinction I had in mind as well.
According to one definition, latency encompasses all delays, from bits entering an application to the point of acoustic (or electric, depending on reference) signal production.

Fast Convolution adds its own delay on top of the delay that is inherent to the impulse response of the filter itself, and that delay is caused by the mathematical properties of the Fast Convolution algorithm, not because of processing cost.

Are you referring to Fast Convolution's block-oriented nature with this statement?

I think it would be best to discuss specific, concrete, practical examples of real filters running in real systems, otherwise this is going to get even more abstract and confusing.

I think this is a good idea in principle, but somewhat tricky in practice (depending on region of measurement).
I was recently looking for benchmarks of BruteFIR performance (found nothing), as I'm developing a convolution engine for my own use...
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
According to one definition, latency encompasses all delays, from bits entering an application to the point of acoustic (or electric, depending on reference) signal production.

Sure, that gets us a definition of latency. But then what's your definition of "delay"? Are you referring purely to the region in the impulse response of the filter definition that is before the main pulse?

Are you referring to Fast Convolution's block-oriented nature with this statement?

Yes.

(By the way: to complicate things further, there are even more advanced algorithms besides Direct Convolution and Fast Convolution, such as hybrids between the two, that have different tradeoffs in terms of delay/latency and computational cost. See this paper for example. I have not looked into these in detail. I suspect one reason why these are not as widely discussed is that there might be patents involved.)
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
If you're wrong, then we both are - that is the distinction I had in mind as well.
According to one definition, latency encompasses all delays, from bits entering an application to the point of acoustic (or electric, depending on reference) signal production.

I think we may both have been wrong ;)

But then what's your definition of "delay"? Are you referring purely to the region in the impulse response of the filter that is before the main pulse?

Having thought about it a bit more, a possible point of distinction between latency and group delay is that the former is not frequency dependent - at least not in any sense in which I've seen it used.
 

DDF

Addicted to Fun and Learning
Joined
Dec 31, 2018
Messages
617
Likes
1,355
I'm interested in knowing how you can achieve this low CPU usage, because every time I use Foobar or JRiver with Convolver, the fan gets going after 10 seconds and never stops. I have a HP Envy laptop with i7, 8 Mo RAM, bought last year.

I was able to significantly lower CPU spikes and fan activation in my i5 (but only 4 GB RAM) music laptop by reassigning every single program and OS scheduled task to 3am in the Windows scheduler. It's surprising how often Chrome alone "phones home". No guarantees but I suggest you give it a shot.
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
Having thought about it a bit more, a possible point of distinction between latency and group delay is that the former is not frequency dependent - at least not in any sense in which I've seen it used.

Sure, but to me "delay" and "group delay" are not quite the same term. To me "delay" and "latency" imply a constant delay irrespective of frequency, while "group delay" implies a delay that can potentially vary with frequency. Maybe that explains the confusion.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
Sure, but to me "delay" and "group delay" are not quite the same term. To me "delay" and "latency" imply a constant delay irrespective of frequency, while "group delay" implies a delay that can potentially vary with frequency. Maybe that explains the confusion.

I think that nails it. I was reading your "delay" as "group delay", when you (very reasonably) intended it to mean "latency".

Now who can come up with a plausible way to distinguish "constant group delay" from "latency", then? :p
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
See this paper for example.

Yes, Gardner's paper is a "classic", describing a non-uniform partitioning overlap-save algorithm (NUPOLS).
There are more modern treatments presented with greater clarity and depth (see F. Wefers et al).

My current project uses UPOLS (uniform partitioning), but the block size can be chosen as small as you like, which is advantageous.

But then what's your definition of "delay"? Are you referring purely to the region in the impulse response of the filter that is before the main pulse?

My definition of delay is anything DSP-related, from zero-stuffing (pure delay) to group delay.
One can and does always have latencies associated with digital audio, but there may be zero internal delays (by my definition, above).
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,386
Likes
24,749
Location
Alfred, NY
"Delay" as a standalone is a broad term. Time delay, group delay, phase delay are more precise and less likely to cause confusion.
 
OP
daftcombo

daftcombo

Major Contributor
Forum Donor
Joined
Feb 5, 2019
Messages
3,687
Likes
4,068
In Foobar, should one select "use 64 bit ASIO driver"?
Is it better/worse/same?
 

edechamps

Addicted to Fun and Learning
Forum Donor
Joined
Nov 21, 2018
Messages
910
Likes
3,620
Location
London, United Kingdom
In Foobar, should one select "use 64 bit ASIO driver"?
Is it better/worse/same?

That is again about which CPU instruction set is used. Some ASIO drivers (especially old ones) are only usable in 32-bit. Some poorly-written ASIO drivers might have bugs in 64-bit that don't trigger in 32-bit, or vice-versa. It's possible that 64-bit might be slightly faster from a CPU perspective. Aside from that, it shouldn't make any difference, and it definitely shouldn't make any difference to audio quality.
 
OP
daftcombo

daftcombo

Major Contributor
Forum Donor
Joined
Feb 5, 2019
Messages
3,687
Likes
4,068
On my computer, with my Apogee Duet 2 and ASIO4All, there are small glitches (like small clicks) before the music starts when I use 64-bit mode, and sometimes the very beginning of a song is eaten. I use the "Affix silence" plug-in to make Foobar wait 2 seconds before starting playback, so that the clicks are played before the beginning of the song and it plays in full. It only happens when I launch playback manually though, not between tracks. No big deal.
It does not happen with my Focusrite Scarlett 2i4 2nd Gen nor Topping D10 used with their respective drivers (with ASIO4All I'm not sure; IIRC it's fine too).

With the Duet I tried 32-bit this morning and didn't notice any glitch.
 