
GPU Audio - the future of DSP?

Ra1zel

Addicted to Fun and Learning
Joined
Jul 6, 2021
Messages
531
Likes
1,048
Location
Poland
Honestly, I would very much like to try the FIR convolver, because I think you could build a remarkable hi-fi setup without the problems induced by upsampling and linear-phase filters with many taps (I am thinking of some great crossovers).
I have no idea how it would be possible to reduce the inherent latency of high-tap-count linear-phase FIR filters. That's why all studio converters use minimum phase. It sounds like people at ASR are trying to fight math and physics again.

Btw, I have only ever achieved stable behavior below 1 ms latency with AES67 networked converters.
 

ZolaIII

Major Contributor
Joined
Jul 28, 2019
Messages
4,069
Likes
2,409
1 ms is a huge latency (practically disastrous) in the world of modern HPC systems, where the PCI bus is considered the worst offender, which is why faster links and switches are being developed. Latency is cumulative (every part contributes its own latency to the final sum), and traditional GPUs aren't suitable for latency-critical operations.
Audio isn't latency-critical, but GPUs still aren't really suitable for such operations, which is why they carry dedicated (usually outdated) DSPs for the purpose (for example, AMD still uses old Tensilica ones). Modern SoCs with modern interconnects are much better suited. Unfortunately, we never got a good, mainlined developer board with full documentation that would let such development even start. Some came close: HiSilicon with the Tensilica P5/P6 (never mainlined, and further work was dropped thanks to sanctions); the Broadcom SoCs on the more potent Pi boards with the V Core ADSP, which, although old, could have carried a lot of the development burden, but for which all documentation disappeared (purposely); the old TI OMAP PandaBoards... The industry obviously doesn't want this to happen, so it remains a paper dream to this day (especially when it comes to GPUs, for which Nvidia is the most to blame).
Still, there is hope, as modern CPUs are becoming more and more SoC-like, for example the recent Intel graphics units on the die (partly based on a Creative DSP design). Hopefully one day someone will break the chains. Best regards and have a good time.
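
To make the "latency is cumulative" point above concrete, here is a minimal Python sketch that adds up the per-stage latencies of a serial signal chain. The stage names and millisecond figures are hypothetical placeholders chosen purely for illustration; they are not measurements of any real GPU, bus or converter.

```python
# Hypothetical per-stage latencies in milliseconds (illustrative values only,
# not measurements of any real hardware).
STAGES_MS = {
    "input buffer": 1.33,
    "host-to-device transfer": 0.20,
    "processing": 0.50,
    "device-to-host transfer": 0.20,
    "output buffer": 1.33,
}


def total_latency_ms(stages: dict[str, float]) -> float:
    """Sum the latency of every stage in a serial signal chain."""
    return sum(stages.values())


if __name__ == "__main__":
    for name, ms in STAGES_MS.items():
        print(f"{name:>24}: {ms:.2f} ms")
    print(f"{'total':>24}: {total_latency_ms(STAGES_MS):.2f} ms")
```

Whatever the real numbers are, the total can never be lower than the slowest single stage, so speeding up only the processing step leaves the buffer and transfer costs untouched.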
 
OP
Davide

Senior Member
Joined
Jul 6, 2020
Messages
460
Likes
171
Location
Milan, Italy
As far as I'm concerned, no one here is trying to fight anything, much less physics and math.
This is a forum, and here we discuss, obviously within the limits of everyone's knowledge and skills, and doing so helps to increase them.
Your comments are indeed interesting; it is nice to receive quick technical information on non-mainstream topics.
What I, as a layman, struggle to understand is why GPUs cannot be considered suitable for audio processing when today they provide adequate latencies in video games (from the point of view of human interaction).
 

voodooless

Grand Contributor
Forum Donor
Joined
Jun 16, 2020
Messages
10,227
Likes
17,806
Location
Netherlands
Davide said:
What I, as a layman, struggle to understand is why GPUs cannot be considered suitable for audio processing when today they provide adequate latencies in video games (from the point of view of human interaction).
It has nothing to do with the GPU, but rather the filter technique itself: looking into the future has an inherent time penalty.
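
The "time penalty" mentioned above can be put into numbers. A causal linear-phase (symmetric) FIR filter with N taps delays the signal by (N - 1) / 2 samples, regardless of how fast the hardware computes it. The following is a minimal Python sketch of that arithmetic; the tap counts and the 48 kHz sample rate are example values assumed here, not figures from the thread.

```python
def linear_phase_delay_ms(num_taps: int, sample_rate_hz: float) -> float:
    """Group delay of a causal symmetric (linear-phase) FIR filter, in ms.

    A symmetric impulse response of length N is centred on sample (N - 1) / 2,
    so every frequency is delayed by that many samples.
    """
    if num_taps < 1:
        raise ValueError("num_taps must be at least 1")
    delay_samples = (num_taps - 1) / 2
    return 1000.0 * delay_samples / sample_rate_hz


if __name__ == "__main__":
    sample_rate_hz = 48_000.0  # assumed example rate
    for taps in (1023, 16383, 65535):  # assumed example filter lengths
        delay = linear_phase_delay_ms(taps, sample_rate_hz)
        print(f"{taps:>6} taps -> {delay:.2f} ms")
```

At 48 kHz a 65,535-tap filter comes out at roughly 683 ms of delay. A faster processor shortens the time needed to compute each output block, but not this built-in delay.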
 

pierre

Addicted to Fun and Learning
Forum Donor
Joined
Jul 1, 2017
Messages
962
Likes
3,048
Location
Switzerland
I don't know if you've ever tried mastering in a DAW with 7 or 8 plugins that do x32 upsampling and 64-bit processing (sometimes they are necessary so that the chain of these does not excessively degrade the audio quality).

The CPU runs at very high loads and the latency is just as high. And the problem is that if you have 1 second of latency and you are there fiddling with the effects, you cannot directly perceive the effect of what you are doing (it's a bit like ABX tests).

This is why taking advantage of GPUs, that already exist widely, is a valid solution.
If you have a modern CPU with 32 or 64 cores, the problem goes away if your DAW knows how to use them efficiently. The other option is to buy a DSP card from AVID, Blackmagic or Waves. Pro Tools or Resolve scale well.

As for GPUs, I am not sure I understand how they want to leverage them. GPUs have a lot of small cores, so they are good for compute-expensive operations like video encoding, but most audio operations are (relatively) light on the CPU. My understanding was that most of the latency now comes from FIR filters or from operations that are inherently sequential, rather than from the CPU being the bottleneck. To be efficient, you would also want the audio stream to go in and out of the GPU directly, possibly over HDMI.
 

ZolaIII

Major Contributor
Joined
Jul 28, 2019
Messages
4,069
Likes
2,409
@Dlomb11 let's put it this way: GPUs are hard to program, not well documented at all, really made for something else, and they include ASICs and DSPs for this kind of work (which, again, you don't have access to). They are also big, chunky, expensive and noisy. Now you get a better picture. Compare that to the SoC powering something like Samsung's latest buds (a complete analog/digital system with DSP and all), which you could fit even into something like a large 6.35 mm jack housing. So why on earth would you prefer to use a GPU? Development is stuck because what is needed is a development board with a Linux-kernel-mainlined SoC, a potent and well-documented DSP, and everything else potent enough, so that the work can be released publicly, or as open source if you wish. Today, at best, you get 64-bit FP processing (not for the sake of audio precision, just because it is that wide) done on the CPU FPU - MPC units. Even moving to SIMD units (again, not really easy to program) would be a big gain in both efficiency and processing power, as you would have 128- to 256-bit vectors to pack the values into (with standard-length instructions, and even mixed ones). Of course, this applies only to architectures that incorporate SIMD units, and it has to be done for each one independently. The future evolution of multipurpose, multifunctional accelerators (in consumer-grade products) should be based on flexible DSPs, and again only well-documented ones with appropriate toolchains, regardless of their additional graphics-processing capabilities or whether they were tailored for graphics in the first place (GPUs are large DSP areas, after all).
The Nvidia series described are, furthermore, tailored for graphics processing and even lose badly to average desktop CPUs running well-optimised code in double-precision floating-point calculations, because their FP64 performance is deliberately and severely crippled by the manufacturers (if you want unlocked FP64, you have to pay for it and get a Quadro or similar, and the same goes for AMD).
To be fair, they are much faster in 32-bit FP and integer (24-, 32- and 64-bit) operations.
An architecture standing on proprietary snake legs (no documentation, no higher-level access) is not worth the development time, and that includes pretty much all GPUs as well as DSP architectures from the likes of QC.
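
As a rough illustration of the SIMD packing argument in this post, the sketch below computes how many samples fit into one vector register for the widths mentioned (128 and 256 bits). It is plain arithmetic under the assumption of fixed-width lanes; the function name is made up for this example, and real throughput also depends on the instruction set and memory layout.

```python
def simd_lanes(vector_bits: int, element_bits: int) -> int:
    """Number of equally sized elements that fit in one SIMD register."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("vector width must be a multiple of the element width")
    return vector_bits // element_bits


if __name__ == "__main__":
    for vector_bits in (128, 256):        # widths mentioned in the post
        for element_bits in (64, 32):     # double and single precision
            lanes = simd_lanes(vector_bits, element_bits)
            print(f"{vector_bits}-bit vector, {element_bits}-bit samples: "
                  f"{lanes} per instruction")
```

So a 256-bit register holds four 64-bit or eight 32-bit samples, which is where the claimed gain over scalar FPU code comes from, provided the code is actually written or compiled to use it.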
 
OP
Davide

Senior Member
Joined
Jul 6, 2020
Messages
460
Likes
171
Location
Milan, Italy
Thank you @ZolaIII.
Very explanatory.
 

changster

Member
Joined
May 6, 2022
Messages
96
Likes
110
Location
Taipei
Latency as the key selling point? That sounds very unattractive. There’s a thing called buffering for this. :rolleyes:
 

pierre

Addicted to Fun and Learning
Forum Donor
Joined
Jul 1, 2017
Messages
962
Likes
3,048
Location
Switzerland
changster said:
Latency as the key selling point? That sounds very unattractive. There’s a thing called buffering for this. :rolleyes:

Can you explain what you mean? Latency is important in some use cases, like recording or mixing. The lower the latency the better, and the lower the tail latency the better. Buffering increases latency …
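
The relationship between buffering and latency is simple arithmetic: a buffer of a given number of frames cannot be handed on until it is full, so it adds frames / sample rate seconds of delay. Here is a minimal Python sketch; the buffer sizes, the 48 kHz rate and the `num_buffers` parameter are assumptions made for this example.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: float, num_buffers: int = 1) -> float:
    """Delay added by num_buffers buffers of the given size, in milliseconds."""
    if frames < 0 or num_buffers < 0 or sample_rate_hz <= 0:
        raise ValueError("frames and num_buffers must be >= 0, sample rate > 0")
    return 1000.0 * frames * num_buffers / sample_rate_hz


if __name__ == "__main__":
    sample_rate_hz = 48_000.0  # assumed example rate
    for frames in (32, 64, 256, 1024):  # assumed example buffer sizes
        one_way = buffer_latency_ms(frames, sample_rate_hz)
        print(f"{frames:>5} frames -> {one_way:.2f} ms per buffer")
```

At 48 kHz, 48 frames correspond to exactly 1 ms, so staying below 1 ms leaves room for only a few dozen samples per buffer, while a comfortable 1024-frame buffer already costs more than 21 ms.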
 

abdo123

Master Contributor
Forum Donor
Joined
Nov 15, 2020
Messages
7,425
Likes
7,941
Location
Brussels, Belgium
pierre said:
If you have a modern CPU with 32 or 64 cores.

It's really funny that you classify EPYC/Ryzen-powered servers as a 'modern CPU'.

You're not technically wrong, but that's not a term I normally see used to describe these.
 

Ra1zel

Addicted to Fun and Learning
Joined
Jul 6, 2021
Messages
531
Likes
1,048
Location
Poland
abdo123 said:
It's really funny that you classify EPYC/Ryzen-powered servers as a 'modern CPU'.
Threadrippers also have 32 cores. Either way, if we consider today's hardware good enough even for the most advanced audio applications, give it 5 more years and it's hard to imagine any audio workload being considered more than pedestrian. The next generation of Intel and AMD CPUs should launch this year with a 30% performance increase.
 