That is getting into the range where it can matter. Do standalone DSP boxes with ASICs or FPGAs do any better?
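It depends on what type of filtering they use. IIR filters cause minimal latency but won't be phase linear. FIR filters can be phase linear (PL), but their delay grows with filter length: finer frequency resolution needs a longer filter, and a linear-phase filter delays the signal by about half its length divided by the sample rate.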
Units like the miniDSP can do FIR filtering but simply don't have the processing power (SHARC DSP) to achieve decent filter resolution, so realistically they are IIR only and have manageable latency.
I recently asked Uli (Acourate's designer) about latency in his software. His reply:
Alan,
a lot of questions
1. Asio buffer
This simply means the number of samples handled by the Asio driver in the buffer. So the input samples are collected until the buffer is full. Then Asio switches to a second buffer for fill-up. In the meantime the first buffer can be read and processed. The same happens with the output buffers. The player writes into a buffer until it is full and stops writing. In the meantime Asio sends the content of a second buffer to the output sequentially. If the buffer is empty Asio switches to the first buffer. So the buffer switch time is buffersize/samplerate. With small buffers the time is shorter and thus the computer has to process more switches. Too many switches raise the CPU load.
2. FFT partition
The FFT calculation and the required CPU load or power depend on the partition size. Smaller partitions increase the CPU load. Usually AC uses a partition of 32768 samples. This means that every partitionsize/samplerate a new FFT calculation is done. So to fill one partition there are 32 Asio buffer switches (with Asio buffersize 1024).
3. Filter delay
With the Acourate (near) linearphase filters the filter delay is half filterlength/samplerate. This is independent of the Asio buffersize or FFT partitionsize.
4. Total delay
The total delay can be estimated as the sum of filter delay + 2*partitionsize + 2*buffersize. The delay can be reduced by using a minimumphase filter (downside = no phase correction, amplitude correction only; this is ok with video anyway, as the brain is busy with picture processing). Furthermore the minphase filterlength can be reduced, which also allows reducing the FFT partition and the Asio buffersize.
5. Non-uniform FFT partitioning
This is the high end of FFT processing. But AC does not use it because up to now I have not fully understood how to implement it including multi-threading. It starts with short partitions and thus reduces the latency very much. It requires minphase filters anyway.
As Acourate basically intends to improve the audio playback it uses (near) linearphase filters and thus the actual implementation with uniform partition sizes is IMHO ok.
6. CPU load
This is simply the load index as also shown by the task manager. A CPU can be very busy but still do nearly nothing e.g. by a bad algorithm. If the buffers are too small the CPU load goes higher because of a lot of buffer switches and thus program interruptions.
7. Realtime index
The index indicates how quickly the calculation is done. A simple example: a CD player takes 30 minutes to read and play a CD. Realtime index = 1x. Now a CD ripper does it with a speedup of factor 10. The index is 10x. You can also define the index in this case as 1/10 = 0.1 or 10%. In your given example (picture) the convolution time is 60% of the Asio bufferswitch time. The buffer is pretty small with 256 samples and thus also the CPU load is high. In case of (near) linearphase filters this does not really make sense as the filter latency is much higher. You still recognize a big latency.
8. Multicore CPU
AC uses multi-threading. Thus you get the best behaviour if you have a CPU core for each channel as each convolution is carried out on its own core. Hyperthreading does not help for this case as hyperthreaded cores do not contain a numeric processor, they simply cannot process the convolution maths.
9. Latency optimization
You can use powerful CPUs with many cores. This helps to lower the CPU load and to do the calculations quickly. But the main brake is the phase correction. So if you can dispense with phase correction use minimumphase filters. Then shorten the filter if necessary and reduce the buffer sizes.
10. Video optimization
Some video players like JRiver allow you to delay the picture. So JRiver even allows you to use linearphase filters because it knows about picture+audio+filter delay. Whereas AC does not know anything about the picture.
- Uli
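
For anyone who wants to plug their own numbers into Uli's formulas, here is a rough Python sketch of the arithmetic from points 1, 2, 3, 4 and 7. The function names are my own, and the 44100 Hz sample rate and 65536-tap filter length are assumptions for illustration only (Uli's mail gives neither). The 32768 partition and the 1024 and 256 buffer sizes come from his reply. Treat it as a back-of-envelope estimate, not a description of what Acourate does internally.

```python
# Back-of-envelope latency estimate based on the formulas in Uli's reply.
# All function names are hypothetical; sample rate and filter length are assumed.

def buffer_switch_time(buffer_size, sample_rate):
    """Point 1: time between ASIO buffer switches, in seconds."""
    return buffer_size / sample_rate

def switches_per_partition(partition_size, buffer_size):
    """Point 2: number of ASIO buffer switches needed to fill one FFT partition."""
    return partition_size // buffer_size

def linear_phase_filter_delay(filter_length, sample_rate):
    """Point 3: delay of a (near) linear-phase filter, half its length, in seconds."""
    return (filter_length / 2) / sample_rate

def total_delay(filter_length, partition_size, buffer_size, sample_rate):
    """Point 4: filter delay + 2*partitionsize + 2*buffersize, in seconds."""
    samples = filter_length / 2 + 2 * partition_size + 2 * buffer_size
    return samples / sample_rate

def realtime_fraction(convolution_time, buffer_size, sample_rate):
    """Point 7: convolution time as a fraction of the buffer switch time."""
    return convolution_time / buffer_switch_time(buffer_size, sample_rate)

SAMPLE_RATE = 44100      # assumed, not stated in the mail
FILTER_LENGTH = 65536    # assumed, not stated in the mail
PARTITION_SIZE = 32768   # from point 2
BUFFER_SIZE = 1024       # from point 2

print("buffer switch time [s]:", buffer_switch_time(BUFFER_SIZE, SAMPLE_RATE))
print("switches per partition:", switches_per_partition(PARTITION_SIZE, BUFFER_SIZE))
print("filter delay [s]:", linear_phase_filter_delay(FILTER_LENGTH, SAMPLE_RATE))
print("total delay [s]:", total_delay(FILTER_LENGTH, PARTITION_SIZE, BUFFER_SIZE, SAMPLE_RATE))
```

With those assumed values the partition term dominates: two partitions of 32768 samples are twice the filter delay, while the two 1024-sample buffers barely register. That matches Uli's point that shrinking the Asio buffer alone does little while linear-phase filters and large partitions are in use.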