Interesting discussion for the most part, and one that illustrates the difficulties that this type of project always seems to run into - too many ideas/possibilities and lack of formal agreement on scope/requirements. Some are looking for a simple 2-channel streamer, some looking for multi-input switching capability, some looking for multiple outputs for active crossovers, others for multiple outputs for multi-sub integration etc. Nailing down the specific feature set has to be step 1.
I went through all of this a couple years back when deciding on an active crossover setup for our living room. I've used Linux for active xover experiments dating back to Y2K (running BruteFIR on a Pentium III), do software professionally, have multiple Pi and x86 boxes lying around, and my solution was to buy a MiniDSP SHD.
IMHO there are 4 key aspects that drive the complexity - external inputs, sample rate handling, remote control and Room Correction. Deciding on the approach to these 4 really sets the direction and scope of the solution.
External inputs: is this strictly a streamer for stored digital content, or do you need to interface with other devices - a TV, integration into an HT system with HT bypass, an analog input for a turntable etc.? Analog or digital inputs? Analog inputs aren't too bad since the device can remain the clock master. Digital inputs mean either SRC on the inputs or changing the clock source, which can be problematic. Most multi-input audio interfaces don't include SRC on digital inputs (Okto doesn't, MOTU doesn't aside from the 8D from what I can tell, Focusrite doesn't).
Sample rate handling: DSP is inherently sensitive to sample rate. If you change the sample rate, all DSP processing has to be updated appropriately. This is surprisingly tricky to do on-the-fly, and it's why MiniDSP always applies SRC to all inputs in order to run at a fixed internal sample rate. An RPi based solution generally won't have hardware SRC available, and so this becomes a software problem. See what the SuperPlayer project is doing as an example of the problem - they have to hack the player to send a sample-rate change event so that the DSP engine can reconfigure itself - for IIR filters this means re-computing the coefficients; for FIR filters it means having 'equivalent' filters available for each rate, or else running the filter itself through SRC. Particularly problematic for external digital inputs which can change sample rates on the fly, but a consideration even for straight streaming with various hi-res formats in the library.
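To make the IIR case concrete, here's a minimal sketch of why the coefficients have to be recomputed. It is not taken from SuperPlayer or CamillaDSP; it just uses the standard RBJ Audio EQ Cookbook peaking formula. The function names and the example filter (1 kHz, -6 dB, Q 2) are made up for illustration.

```python
import cmath
import math


def peaking_biquad(fs, f0, gain_db, q):
    """RBJ cookbook peaking EQ; returns (b, a) normalized so a[0] == 1."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs          # the sample rate enters here
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha / amp
    b = ((1 + alpha * amp) / a0, -2 * cos_w0 / a0, (1 - alpha * amp) / a0)
    a = (1.0, -2 * cos_w0 / a0, (1 - alpha / amp) / a0)
    return b, a


def gain_db_at(b, a, fs, f):
    """Magnitude response in dB of the biquad (b, a) at frequency f, run at rate fs."""
    z1 = cmath.exp(-1j * 2 * math.pi * f / fs)   # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20 * math.log10(abs(h))


if __name__ == "__main__":
    # Same "1 kHz, -6 dB, Q 2" filter designed for two sample rates.
    b44, a44 = peaking_biquad(44100, 1000, -6.0, 2.0)
    b96, a96 = peaking_biquad(96000, 1000, -6.0, 2.0)
    print("44.1k coefficients:", b44, a44)
    print("96k coefficients:  ", b96, a96)
    # Stale 44.1k coefficients left running at 96k no longer cut at 1 kHz.
    print("stale filter gain at 1 kHz:", gain_db_at(b44, a44, 96000, 1000))
    print("fresh filter gain at 1 kHz:", gain_db_at(b96, a96, 96000, 1000))
```

The only place the sample rate shows up is w0, but that's enough: if the coefficients aren't redone on a rate change, the cut moves to f0 scaled by the ratio of the two rates (about 2.18 kHz in this example) instead of staying at 1 kHz.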
Remote Control: seemingly a minor point, but for anyone intending this to be a 'family room' device, it has to be family-friendly. Harmony integration at a minimum would seem to be required. Also implied here is 'reliability' - if the system goes AWOL when I'm out of town and the wife can't watch TV - well, however much I saved through DIY wasn't worth it.
Room Correction: DIY / open source solutions are available, but not point-and-shoot. DRC-FIR can give results ranging from 'spectacularly good' to 'unlistenable' and has about 500 knobs to twiddle. rePhase is extremely manual, and really more of a 'speaker correction' than 'room correction' tool. REW and its auto-EQ is probably the best from what I can tell, and can export biquads which are about as universal as you get. Almost certainly best handled with a separate PC to do the measurement/manipulation rather than trying to build it into the device. Pretty good bet that nothing open source will be as 'easy' as Dirac.
I didn't include output channels as a driver, but it does come into play. If the understanding is 'more than 2', then it puts you into the realm of needing an audio interface of some sort, or else just using HDMI. Interface support under Linux is better than it has been in the past, but still full of traps. The Okto is the obvious choice if you don't need analog inputs; the MOTU M4 is an attractive budget choice, although my reading is that there are some fixes in kernel 5.8 that address problems, and most ARM distros I checked still seem to be at 5.4.x. Other MOTU interfaces 'should work' in theory, but direct experience seems to be rare. HDMI should be automatically supported, but may require .asoundrc work and, as we've seen in Amir's testing, receivers with top-flight performance are rare, although they may be 'good enough'.
So, the above is really my laundry list of the things that make doing this sort of thing more difficult than it may first appear, and what ultimately drove me to just buy a MiniDSP SHD. Reading through the thread, it really does seem that an SHD Studio with 2 or 4 more output channels and having Volumio replaced by a better streamer (Moode or RopieeeXL or PiCorePlayer) is more or less what is being asked for. (at half the price, of course, and ideally not needing the additional MiniDSP software)
It does seem like 'the community' is already moving towards something that largely covers these requirements with CamillaDSP - both Moode and PiCorePlayer (in the form of SuperPlayer) seem to be close to having reasonable integration. I would definitely suggest thoroughly auditing these efforts as I suspect they're way ahead of where a from-the-ground-up effort would get to in any reasonable time frame. If you start from one of these though, I guess the crux of the question is where/how the ASR community can provide a value add over the basic platform. My thoughts:
- validated configurations (specific sound card setups that work, maybe Roon Bridge configuration, external vs internal volume control)
- extending the platform to handle external inputs (plus something like 'HT Bypass mode' which is a requested feature on the SHD; basically switch to analog in and set to fixed volume)
- working on making Room Correction easier/more seamless; probably already happening but routines to upload REW coefficients, convert etc
- Remote control - either via standard LIRC configs or even an ESP32 integration - so that a Harmony could be easily set up.
- multi-sub setup and config. IMHO there is a real gap here. Everyone 'knows' that multiple distributed subs is 'best', but actually achieving it is still fully manual from what I can tell. Tools for semi-automating this might be interesting.
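On the 'upload REW coefficients, convert' point above, here's a rough sketch of what such a routine might look like. Big caveats: the line format in the regex is my assumption of what a REW filter-settings text export looks like, the output dict only mirrors the general shape of a CamillaDSP-style Biquad/Peaking filter entry, and the names (rew_to_filters, rew_peq_N) are hypothetical. Check both formats against the real tools before relying on any of it.

```python
import re

# ASSUMED shape of one line in a REW filter-settings text export, e.g.:
#   Filter  1: ON  PK       Fc    63.0 Hz  Gain  -5.0 dB  Q  4.00
# Only enabled ("ON") peaking ("PK") filters are picked up; anything else is skipped.
REW_PK_LINE = re.compile(
    r"Filter\s+(\d+):\s+ON\s+PK\s+Fc\s+([\d.]+)\s*Hz"
    r"\s+Gain\s+(-?[\d.]+)\s*dB\s+Q\s+([\d.]+)"
)


def rew_to_filters(text):
    """Turn REW-style peaking filter lines into a dict of biquad filter entries."""
    filters = {}
    for match in REW_PK_LINE.finditer(text):
        index, fc, gain, q = match.groups()
        filters[f"rew_peq_{index}"] = {
            "type": "Biquad",
            "parameters": {
                "type": "Peaking",
                "freq": float(fc),
                "gain": float(gain),
                "q": float(q),
            },
        }
    return filters


if __name__ == "__main__":
    example = (
        "Filter  1: ON  PK       Fc    63.0 Hz  Gain  -5.0 dB  Q  4.00\n"
        "Filter  2: ON  None\n"
        "Filter  3: OFF PK       Fc   120.0 Hz  Gain  -3.0 dB  Q  2.00\n"
    )
    for name, entry in rew_to_filters(example).items():
        print(name, entry)
```

Because the entries carry freq/gain/Q rather than raw coefficients, the DSP engine can recompute the biquads itself when the sample rate changes, which sidesteps part of the sample rate problem described earlier.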