Exactly. The AVR contains a computer that's capable of receiving data through a network, and producing analogue signals for the speakers. It doesn't matter which network technology is used, as long as the speed is enough to supply the AVR's DACs with the required bitstream. But note that AirPlay is proprietary and includes DRM, so it's not a suitable stack for universal interconnection.
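As a rough check on the "speed is enough" point, the raw bit rate of uncompressed PCM is sample rate times bit depth times channel count. The sketch below is illustrative only: the function name and the two example formats are assumptions, not figures from any specific AVR.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Raw PCM payload rate in bits per second (no packet or protocol overhead)."""
    return sample_rate_hz * bit_depth * channels


# Example formats chosen for illustration (assumptions, not from the thread).
stereo_cd = pcm_bitrate_bps(44_100, 16, 2)
surround_7_1 = pcm_bitrate_bps(48_000, 24, 8)

print(f"CD stereo: {stereo_cd / 1e6:.3f} Mbit/s")
print(f"7.1 at 48 kHz / 24-bit: {surround_7_1 / 1e6:.3f} Mbit/s")
```

These figures cover bandwidth only. They say nothing about latency, sync, or which encodings the receiving device can decode.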
I see where you are coming from.
The interconnects I am talking about are not for sending audio from an iPad to an AVR. We are talking about two very different ecosystems.
Take, for example, the connection between a TV and an AVR or sound bar, where the TV sends the audio it has extracted from the video stream it is rendering. To understand this you need the next level of detail about what is being transferred: the TV receives an MP4 video stream. Now work out what the TV would need to do, in terms of formats and encodings, to send the sound from that stream to a sound bar over a network, and how it would enforce sync. What actually happens between the two devices, and what is the format (including encoding) of the stream going from the TV to the sound bar in this scenario? You will realize that this doesn't fit your model.
Or take a stand-alone streamer that wants to use an external DAC such as an Okto or Topping, because those do the digital-to-analog conversion better than what is in an AVR. We still need the AVR to receive the networked stream (say, a streamed MP3 file) and decode it with its codecs, because the Okto and Topping are not designed to include every codec ever needed. Including them would destroy the case for such devices at any reasonable price, since they would all have to license the codecs. This doesn't fit your model either.
You seem to be saying that all of those are "dumb" devices, so each of them should be made able to take an MP3 file over the network instead. But then you would not have those devices at all, or everything would turn into an all-in-one receiver (at least for the digital stages). This is not a realistic scenario in my opinion, so I ask to be left out of going down that path.
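To make the sync question concrete, here is a minimal sketch of the bookkeeping a networked link would need: each audio packet carries a presentation time, and the receiver turns the time remaining into a delay measured in samples. Everything here is hypothetical (the AudioPacket type, its field names, and the assumption that both devices share a clock). It does not describe how any particular TV or sound bar works.

```python
from dataclasses import dataclass


@dataclass
class AudioPacket:
    # Hypothetical packet: when the first sample should be heard, on a clock
    # that sender and receiver are assumed to share.
    presentation_time_s: float
    sample_rate_hz: int


def samples_until_playback(packet: AudioPacket, now_s: float) -> int:
    """Samples the receiver must hold the packet for; negative means it arrived late."""
    return round((packet.presentation_time_s - now_s) * packet.sample_rate_hz)


pkt = AudioPacket(presentation_time_s=10.020, sample_rate_hz=48_000)
print(samples_until_playback(pkt, now_s=10.000))  # packet arrived early
print(samples_until_playback(pkt, now_s=10.025))  # packet arrived late
```

The sketch leaves the hard parts open, which is the point of the questions above: how the two devices agree on a clock, and what encoding the packet payload uses.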
Why not jump on that bandwagon with an Open Source stack and high-quality smart speakers?
Because what you have in mind is an integrated all-in-one solution: a single device that takes an encoded music file stream and does everything it needs to within itself. That is possible in some simple use cases, such as using a smartphone to listen to an MP3 file. But when the requirements call for a better DAC than a smartphone has, or a surround sound processor that can play through a multi-channel speaker system, etc., then either you have to assume your smartphone acquires all those capabilities, or it is talking to a single box that does all of these things, so no digital interconnects are needed (it is done via I2S buses inside the box, which is fine because there is already a relatively universal connector there!).
I am not dismissing your model in the simpler use case. It is happening now. But I don't see its relevance for the above use cases where everything isn't happening in one box, as you assume.
You are not talking about a new form of interconnect but about a different ecosystem of devices. I suggest you start a separate thread on how that would work to replace all of the current "dumb" devices. The interconnects are not relevant in that world, and that new ecosystem design is not relevant to this existing world.
So I am going to stop here with your line of thinking. Thanks for your contribution.