
Streamer vs Computer

It's what the DAC does to the analog signal that makes all the difference?
As far as the streamer is concerned, yes. My Raspberry Pi (which I can get for around £45 all in) acting as a streamer will sound exactly the same, into the same DAC, as any other streamer at any price.

But as @Doodski says - pretty much all modern DACs are audibly transparent. Meaning the output is an audibly perfect (for everyone - no matter how golden the ears) representation of the input, so it must, by definition, sound the same.

Even those that don't quite meet that target of audible perfection for everyone are so close as to render the differences mostly meaningless, even for the few who can hear them. They might be able to just about tell the difference in a blind test with fast switching - but it is not sufficient to make a meaningful difference to the enjoyment of music.

Obviously there will be a few that are so bad by design that no-one should touch them with a bargepole (say the last half of the red section of the chart).
 
Audio is a very undemanding application for a modern PC.
QUESTION:

While I have played around with a few music playback software applications (LMS/Squeezelite, Roon, JRiver, MusicBee, Audirvana etc), it's always been on a "bring your own" PC.

Thus I have never used (aside from my old Slim Devices Transporter) any "hardware first" systems (e.g. Lumin, Aurender, SOtM, Eversolo etc).

So the question is, with reference to beasts like Taiko servers (and related software), is there any playback software/system that is actually multi-threaded (i.e. playback scales concurrently across multiple cores)? Even the much-vaunted HQPlayer appears to be single-threaded.

And I don't mean separate programs running/pinned to separate cores and pipelined together, but rather a single program running across multiple cores (i.e. the work required split into, say, four equal subsets; each subset runs concurrently on a different core and the resulting subset outputs are combined into a single result).
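To make that concrete, here is a toy fork-join sketch in Python - purely illustrative, nothing audio-specific, all names invented:

```python
# Toy fork-join sketch: one program's workload split into four subsets,
# each processed concurrently on its own core, results merged at the end.
# "work" is a stand-in for the real per-subset processing.
from concurrent.futures import ProcessPoolExecutor

def work(subset):
    return sum(x * x for x in subset)

if __name__ == "__main__":  # guard required for process-based pools
    data = list(range(1_000_000))
    subsets = [data[i::4] for i in range(4)]         # split into 4 parts
    with ProcessPoolExecutor(max_workers=4) as pool:
        total = sum(pool.map(work, subsets))         # combine the results
    print(total)
```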

If not, what is the point of having more than, say, 4 cores... as noted, you are just wasting money, resources, heat and noise for no gain? (It's really a rhetorical question unless an example can be pointed out.)


Peter


PS I have used "ffmpeg -threads 4" for video generation work but never for playback... not sure it makes sense or works for playback?
 
is there any playback software/system that is actually multi-threaded
IMHO just decoding any audio format to PCM requires so little CPU that the multithreading overhead for this would eliminate any performance gains. Resampling or DSP is a different story: each channel can be resampled separately.

Typically GUI apps are multithreaded - for serving the UI events, not for processing the stream. Players are usually multithreaded too - reading thread, processing thread, playing thread, control thread.
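That layout, as a minimal Python sketch (queue depths, block sizes and the no-op sink are all invented for illustration; a real player would write to an actual audio device):

```python
# Minimal sketch of the usual player thread layout: a reading thread
# fetches large blocks (as from disk or network), a processing thread
# decodes/filters them, and a playing thread feeds the sound device in
# a steady stream. Bounded queues pipeline the stages with backpressure.
import queue
import threading

read_q = queue.Queue(maxsize=8)   # reader -> processor
play_q = queue.Queue(maxsize=8)   # processor -> player

def reader(source_blocks):
    # Fetches large buffered blocks, as from a file or network stream.
    for block in source_blocks:
        read_q.put(block)
    read_q.put(None)              # end-of-stream marker

def processor():
    # Stand-in for decode/resample/DSP; here a simple pass-through.
    while (block := read_q.get()) is not None:
        play_q.put(block)
    play_q.put(None)

def player(sink):
    # Drains blocks continuously, paced by the sound device's buffer.
    while (block := play_q.get()) is not None:
        sink(block)               # real code: write PCM to the device

if __name__ == "__main__":
    blocks = [b"\x00" * 65536 for _ in range(16)]   # fake source data
    threads = [
        threading.Thread(target=reader, args=(blocks,)),
        threading.Thread(target=processor),
        threading.Thread(target=player, args=(lambda b: None,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```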

what is the point of having more than, say, 4 cores... as noted, you are just wasting money, resources, heat and noise for no gain?
None, provided the existing cores handle the overall load without buffer issues.
 
IMHO just decoding any audio format to PCM requires so little CPU that the multithreading overhead for this would eliminate any performance gains. Resampling or DSP is a different story: each channel can be resampled separately.

Typically GUI apps are multithreaded - for serving the UI events, not for processing the stream. Players are usually multithreaded too - reading thread, processing thread, playing thread, control thread.


None, provided the existing cores handle the overall load without buffer issues.

I guess I used the wrong terminology (getting old)... I really should have said parallel processing... as I noted, a single program's workload split into several units of work (UOW) running in parallel.

What you list (not criticizing) is co-dependent/pipelined programs running separately.

So with respect to DSP (outside perhaps of pro DAWs etc, of which I have no knowledge), is consumer DSP done with parallel techniques?

Thanks,

Peter
 
QUESTION:

While I have played around with a few music playback software applications (LMS/Squeezelite, Roon, JRiver, MusicBee, Audirvana etc), it's always been on a "bring your own" PC.

Thus I have never used (aside from my old Slim Devices Transporter) any "hardware first" systems (e.g. Lumin, Aurender, SOtM, Eversolo etc).

So the question is, with reference to beasts like Taiko servers (and related software), is there any playback software/system that is actually multi-threaded (i.e. playback scales **** concurrently across multiple cores)? Even the much-vaunted HQPlayer appears to be single-threaded.

If not, what is the point of having more than, say, 4 cores... as noted, you are just wasting money, resources, heat and noise for no gain? (It's really a rhetorical question unless an example can be pointed out.)


Peter


**** And I don't mean separate programs pipelined together but rather a single program running across multiple cores.
More cores means more heat and less computing power per core. So, as was said, audio is undemanding for a modern PC, and it's pointless to scale audio player software into multi-core pipelined processing. On the other side, if you render/calculate some complicated task, you want it done in the fastest possible way, and multi-core/multi-processor capabilities are crucial in that case. A PC is just a tool, like a hammer, pliers etc., and you need to pick the right tool for the job. 4 cores are plenty for audio and home applications overall.
 
More cores means more heat and less computing power per core. So, as was said, audio is undemanding for a modern PC, and it's pointless to scale audio player software into multi-core pipelined processing. On the other side, if you render/calculate some complicated task, you want it done in the fastest possible way, and multi-core/multi-processor capabilities are crucial in that case. A PC is just a tool, like a hammer, pliers etc., and you need to pick the right tool for the job. 4 cores are plenty for audio and home applications overall.
Agreed, I have always leveraged low-powered, fanless, diskless, RAM-boot Linux-based Intel PCs as endpoints.

I have fingerprinted many playback software apps and know their resource profiles well but, as noted, I have never used "whole system" playback devices.

Peter
 
A simple example of how undemanding audio processing is: the Trinnov Altitude 16 uses an entry-level Intel CPU with 2GB of RAM. That's for 16 channels!
 
Most software today uses at least SMP2 but can spread workers across the available cores during cold start if enabled - browsers, for example. In audio, most routines and plugins will also go up to SMP4, and newer ones will use the FPU units in that manner. We still haven't gotten to SIMD, AVX or embedded DSPs on pretty much any architecture, and the dominant one is ARM (streamers, players, DAPs...). That's the way we are going.
 
I guess I used the wrong terminology (getting old)... I really should have said parallel processing... as I noted, a single program's workload split into several units of work (UOW) running in parallel.

What you list (not criticizing) is co-dependant/pipelined programs running separately.
TBH I do not understand the difference. Audio playback has some steps which cannot run in parallel without pipelining each stage into the next. In audio the timing is crucial (not jitter, but "before the buffer runs out"), and running each critical path on a separate core gives more time headroom. It's perfectly possible that a single core with OS context-switching can handle all of those processes in time just fine; more cores give more safety margin.
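To put rough numbers on "before the buffer runs out" (the sample rate and buffer size here are just illustrative assumptions):

```python
# Back-of-envelope refill deadline for a playback buffer.
# Numbers are illustrative, not from any particular device.
rate = 48_000          # frames per second
buffer_frames = 4096   # assumed device buffer size

deadline_ms = buffer_frames / rate * 1000
print(f"{deadline_ms:.1f} ms to refill the buffer")   # ~85.3 ms

# Any core that wakes up and produces 4096 frames within ~85 ms keeps
# playback glitch-free; separate cores per stage simply widen that
# headroom against scheduling hiccups.
```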
 
TBH I do not understand the difference. Audio playback has some steps which cannot run in parallel without pipelining each stage into the next. In audio the timing is crucial (not jitter, but "before the buffer runs out"), and running each critical path on a separate core gives more time headroom. It's perfectly possible that a single core with OS context-switching can handle all of those processes in time just fine; more cores give more safety margin.
The software doesn't determine that; it's at the OS level, including scheduling and concurrency - ensuring that requests needing priority get it, that others don't fall out of the queue, and so on. In software it's about managing cost and making loops run faster, for example semaphores with defined conditionals tied to machine (hardware/assembly) instructions.
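For illustration, the classic shape of that semaphore pattern in Python (slot count and names invented; assumes a single producer and single consumer):

```python
import threading

# Bounded ring of audio blocks guarded by counting semaphores.
# With multiple producers or consumers, a mutex around the index
# updates would also be needed.
SLOTS = 4
ring = [None] * SLOTS
free_slots = threading.Semaphore(SLOTS)   # empty slots available
full_slots = threading.Semaphore(0)       # filled slots available
w_idx = 0
r_idx = 0

def produce(block):
    global w_idx
    free_slots.acquire()          # block until a slot is empty
    ring[w_idx] = block
    w_idx = (w_idx + 1) % SLOTS
    full_slots.release()          # signal the consumer

def consume():
    global r_idx
    full_slots.acquire()          # block until a slot is filled
    block = ring[r_idx]
    ring[r_idx] = None
    r_idx = (r_idx + 1) % SLOTS
    free_slots.release()          # hand the slot back
    return block
```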
It takes talent and a couple of years of hard work if you truly want to understand it. When you do that at Linux mainline level, closely following the progress and eventually even contributing for a long time, you are there.
 
So the question is, with reference to beasts like Taiko servers (and related software), is there any playback software/system that is actually multi-threaded (i.e. playback scales concurrently across multiple cores)? Even the much-vaunted HQPlayer appears to be single-threaded.

And I don't mean separate programs running/pinned to separate cores and pipelined together, but rather a single program running across multiple cores (i.e. the work required split into, say, four equal subsets; each subset runs concurrently on a different core and the resulting subset outputs are combined into a single result).

I cannot see any reason why a programmer needs to overcomplicate his software when it runs perfectly fine on one core. Since we have @HenrikEnquist here (author of Camilla) maybe he can chime in and explain whether there are any advantages to what you suggest. Just because I can think of none does not mean there are none.

If not, what is the point of having more than, say, 4 cores... as noted, you are just wasting money, resources, heat and noise for no gain? (It's really a rhetorical question unless an example can be pointed out.)

My answer would be: "to soothe irrational audiophile anxieties". Bigger numbers must mean better. Lower CPU usage must mean better. I guess it helps them sleep better at night.
 
The software doesn't determine that; it's at the OS level, including scheduling and concurrency - ensuring that requests needing priority get it, that others don't fall out of the queue, and so on. In software it's about managing cost and making loops run faster, for example semaphores with defined conditionals tied to machine (hardware/assembly) instructions.
Well, I still do not understand the difference asked about in https://www.audiosciencereview.com/...treamer-vs-computer.44847/page-5#post-2243064 - would you mind explaining, please?
It takes talent and a couple of years of hard work if you truly want to understand it. When you do that at Linux mainline level, closely following the progress and eventually even contributing for a long time, you are there.
My first commit series to the kernel was in 2007, not a major one though, that's true.
Since we have @HenrikEnquist here (author of Camilla) maybe he can chime in and explain whether there are any advantages to what you suggest.
Henrik has already split the processing thread into multiple threads: https://github.com/HEnquist/camilladsp/commit/8fc02513a03b302b3857bdf4f5debf6d4e3aa9d1 - it's convenient for larger DSP loads, which is why he made it optional.
 
@phofman there is nothing more to explain. I simply stated what shape the code is currently in, without overcomplicating it. And that's SMP2 to SMP4 for most, still on integers and moving to floating point, but a true FPU at best, and all on general-purpose CPU cores. People having fantasies about DSPs never tried to program one, or even write an instruction pack for one. It's not even remotely expected that small audio companies would be able to do such work. Instead the more clever ones eventually use already-written software and interfaces exposing it to user space, and even that is going slowly. Example: the QACT speaker PEQ through a USB bridge that Topping did with the D50 III. I was a QC developer; I also did independent DSP evaluation and recommendations for the likes of CEVA and Tensilica, as no one would ever fiddle with Hexagon. I was also there when the VideoCore documentation disappeared overnight. So even if you would like to develop DSP-specific code, you won't have a competitive development board to do it on. Hopes faded with the HiSilicon ban, right before the second attempt at Linux mainlining (as those had Tensilica P5 or P6 ones with great documentation and all together still held the design win).
 
What you list (not criticizing) is co-dependent/pipelined programs running separately.
Not really, I meant standard multi-threaded programs, not multiple single-threaded programs joined with a pipe. Decent, well-written players are multithreaded because that's the nature of what they do. Reading may be joined with decoding/processing into one thread, but typically the writing thread is separate, because output devices differ widely in their buffer sizes, and reading (e.g. over a network) can be paced differently (large buffered blocks of data) from the continuous writing the sound device requires.
 
@ZolaIII: IIUC you are talking about pure hardware devices with no general-purpose OS, but @fatoldgit talks about standard-OS-based streamers (typically running Linux/Android, such as all those specifically mentioned: Lumin, Aurender, SOtM, Eversolo).
 
@ZolaIII: IIUC you are talking about pure hardware devices with no general-purpose OS, but @fatoldgit talks about standard-OS-based streamers (typically running Linux/Android, such as all those specifically mentioned: Lumin, Aurender, SOtM, Eversolo).
Well, I already talked about general-purpose implementations and the state of them (so many times it's tiring), so a touch of grey, and on to truly heterogeneous many-core designs, or better said LUTs and matrix units. You take a well-maintained lib, compile it, interface it, and use it in a limited way.
The problem is when there isn't one, or they simply don't work as they should. When it comes to audio ones it's a wild forest, and in a bad way (PulseAudio and such), particularly when it comes to Android. Many big OEMs make their own proprietary shit, fragmenting the space further. It's beyond hope and of little use to the end user. Hope this makes you happy. If you want to dig, find a repo and we'll enjoy it.
 
I guess I used the wrong terminology (getting old)... I really should have said parallel processing... as I noted, a single program's workload split into several units of work (UOW) running in parallel.
Here is a screen capture of my M4 Mac mini Pro's Activity Monitor while playing a song within Apple Music:

[Attached image: Activity monitor.jpg]


Or an even simpler view - the load on the CPU:
[Attached image: CPU Usage.jpg]

Most cores are doing nothing.

Playing back audio is not even a blip to the CPU.
 
A simple example of how undemanding audio processing is: the Trinnov Altitude 16 uses an entry-level Intel CPU with 2GB of RAM. That's for 16 channels!
Another simple example.

A Raspberry Pi Zero 2 with 512MB RAM (available for around £15 if you exclude case, PSU, storage etc - about £45 with them included) is quite capable of running the piCorePlayer server/client combo.
 
Another example: the Radxa Rock Pi S with 512MB RAM (quad-core 1.3GHz, 15 USD) can capture 48kHz/32bit/16ch over USB gadget, resample all 16ch to 384kHz using float32 precision, play via I2S at 384kHz/32bit/16ch -> I2S capture of 10 channels (the max I/O I2S combination of that SoC), and send 5 of those captured 384kHz channels back via USB gadget to the host (5ch @ 384kHz/32bit being the UAC2 bandwidth limit). Powered via the USB port from the USB host, <500mA: https://audiosciencereview.com/foru...c-sw-dsd-direct-vs-any-dac.55386/post-2024214
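For scale, the raw PCM data rates of that chain, computed from the numbers above (a back-of-envelope sketch; it does not attempt to model USB/UAC2 protocol overhead):

```python
# Raw PCM data rates for the chain described above, from the post's numbers.
def mbit_per_s(channels, rate_hz, bytes_per_sample):
    return channels * rate_hz * bytes_per_sample * 8 / 1e6

print(mbit_per_s(16, 48_000, 4))    # USB capture in:    24.576 Mbit/s
print(mbit_per_s(16, 384_000, 4))   # I2S playback out: 196.608 Mbit/s
print(mbit_per_s(5, 384_000, 4))    # USB send back:     61.44  Mbit/s
```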
 
I cannot see any reason why a programmer needs to overcomplicate his software when it runs perfectly fine on one core. Since we have @HenrikEnquist here (author of Camilla) maybe he can chime in and explain whether there are any advantages to what you suggest
There isn't much to add to what phofman already wrote:
Typically GUI apps are multithreaded - for serving the UI events, not for processing the stream. Players are usually multithreaded too - reading thread, processing thread, playing thread, control thread.
The only part here that it makes sense to parallelize is the processing. But parallelizing work means that data has to be sent between threads, and this isn't free, and isn't instant. It helps if the work can be split up into reasonably sized independent tasks that each take at least some milliseconds to finish. It also depends on the algorithm. For example, let's take some heavy form of processing that can be split so left and right are processed in parallel in separate threads; but doing both channels in the same thread may enable some optimizations, so that the parallelized version uses considerably more CPU resources in total. I could imagine that PCM-to-DSD conversion works like that.
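As a toy illustration of that trade-off (all numbers invented; note that in CPython the GIL prevents pure-Python work from running truly in parallel, which itself demonstrates that parallelism only pays when the runtime and task sizes allow it - real DSP would do the heavy lifting in native code that releases the GIL):

```python
# Toy comparison: both channels in one thread vs one thread per channel.
# heavy_filter is a stand-in for real DSP; timings will vary by machine.
from concurrent.futures import ThreadPoolExecutor
import time

def heavy_filter(samples):
    acc = 0.0
    for s in samples:              # fake per-sample work
        acc += s * 0.997
    return acc

left = [0.1] * 200_000
right = [0.2] * 200_000

t0 = time.perf_counter()
heavy_filter(left)                 # both channels sequentially...
heavy_filter(right)                # ...in the same thread
serial = time.perf_counter() - t0

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    list(pool.map(heavy_filter, [left, right]))   # one thread per channel
threaded = time.perf_counter() - t0

print(f"serial: {serial*1000:.1f} ms, threaded: {threaded*1000:.1f} ms")
```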
 