I’ll repeat once more how the TV’s video stays in sync with the audio in an AVP, and this holds whether we’re talking about an HDMI v1.0 based AVP from 2008, the AVP-16, or the DVP-16. The TV connects to the HDMI output of the AVP, so the TV isn’t the source; the sources are the HDMI devices connected to the AVP’s HDMI inputs. Inside the AVP, the ASRC aligns the clocks from all inputs to the clock of the HDMI output going back to the TV, compensating for the latency introduced by the DSP. When hardware based audio-over-IP boards are added, they get a feed from the exact same ASRC clock that drives the HDMI outputs, which aligns the clocks of the HDMI outs with the “grandmaster” (or “leader clock”) of the audio-over-IP network. The end result is that every audio-over-IP device on the network (DACs, active speakers, etc.) is within nanoseconds of sync with the video coming out of the HDMI output port on the AVP.
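To make the ASRC step concrete, here’s a minimal sketch in Python of re-clocking a drifting input onto the shared output clock. The names and the linear interpolator are hypothetical and for illustration only; real AVPs do this with high-order polyphase filters and hardware PLLs:

```python
# Conceptual sketch of the ASRC stage: an input running on its own (slightly
# fast) clock is resampled into the output clock domain that the HDMI out
# and the audio-over-IP leader clock both share.
import numpy as np

FS_OUT = 48_000                      # master clock shared by HDMI out and AoIP
FS_IN  = 48_000 * (1 + 50e-6)        # a source clock drifting +50 ppm

def asrc_linear(x, ratio):
    """Resample x by 'ratio' (fs_in / fs_out) with linear interpolation."""
    n_out = int(len(x) / ratio)
    t = np.arange(n_out) * ratio          # fractional read positions in x
    i = t.astype(int)
    frac = t - i
    i = np.clip(i, 0, len(x) - 2)
    return x[i] * (1 - frac) + x[i + 1] * frac

# One second of a 1 kHz test tone captured on the (fast) source clock...
n_in = int(FS_IN)
tone = np.sin(2 * np.pi * 1_000 * np.arange(n_in) / FS_IN)

# ...re-clocked onto the output domain. Every consumer of FS_OUT (HDMI out,
# AoIP devices) now sees sample-aligned audio from this input.
out = asrc_linear(tone, FS_IN / FS_OUT)
print(len(out), "samples in the 48 kHz output domain")  # ~48000
```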
Now let’s discuss how the clocking works if eARC is used from a connected TV, i.e. if the TV is used as the switching device for all the source gear (Apple TV, Blu-ray player, PS5, etc.). This is important because the Hyperion line of AVPs has this option as well:
The AVP (as the eARC receiver) processes the audio (e.g., Atmos decoding, room correction), which introduces latency, and uses the CMDC to inform the TV of the delay. The TV then adjusts its video timing to match. This isn’t a literal “clock signal” sent back (eARC doesn’t transmit video or clocks bidirectionally), but rather synchronization metadata derived from the AVP’s internal clock. The clock embedded in the DMAC ensures low-jitter audio delivery (<0.25 UI peak-to-peak). And it does this continuously, in real time. So say you enable Dirac ART on the fly while a movie is playing, and Dirac adds 100ms of additional latency. The metadata fed to the TV via eARC instantly reports this, and the TV instantly and seamlessly delays the video by the additional 100ms to compensate.
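Here’s a toy model of that loop. It’s purely illustrative: the real CMDC register map and update protocol are defined in the HDMI 2.1 spec, and the 20ms base latency is an assumed figure, not a measured one:

```python
# Toy model of the eARC lip-sync loop: the AVP continuously reports its total
# audio latency, and the TV re-times its video to match.
class AVP:
    def __init__(self):
        self.base_latency_ms = 20        # decode + base DSP (assumed figure)
        self.dirac_art_ms = 0

    def enable_dirac_art(self, extra_ms=100):
        self.dirac_art_ms = extra_ms     # e.g. +100 ms, as in the example

    @property
    def reported_audio_latency_ms(self):
        # What the AVP exposes to the TV over the CMDC, updated continuously
        return self.base_latency_ms + self.dirac_art_ms

class TV:
    def __init__(self, avp):
        self.avp = avp
        self.video_delay_ms = 0

    def poll_cmdc(self):
        # TV reads the AVP's reported latency and delays video to match
        self.video_delay_ms = self.avp.reported_audio_latency_ms

avp = AVP()
tv = TV(avp)
tv.poll_cmdc(); print(tv.video_delay_ms)   # 20 ms
avp.enable_dirac_art(100)                  # toggle Dirac ART mid-movie
tv.poll_cmdc(); print(tv.video_delay_ms)   # 120 ms, video seamlessly delayed
```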
Based on the maximum safe and stable amount of latency compensation via eARC across the top TV brands, the most compensation an AVP manufacturer would want to rely on is 100ms; beyond that you can run into stability issues with some TVs. Since a linear-phase FIR filter delays the signal by roughly half its tap count, a 100ms budget caps FIR-based post-processing/room correction at 9600 taps @ 48kHz (4800 samples / 48,000 Hz = 100ms).
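The arithmetic behind that cap, assuming the usual (taps − 1)/2 group delay of a linear-phase FIR filter:

```python
# Linear-phase FIR group delay is (taps - 1) / 2 samples, i.e. ~taps / 2.
def fir_latency_ms(taps: int, fs: int = 48_000) -> float:
    return (taps - 1) / 2 / fs * 1_000

def max_taps(budget_ms: float, fs: int = 48_000) -> int:
    return int(budget_ms / 1_000 * fs * 2) + 1

print(fir_latency_ms(9600))    # ~100 ms at 48 kHz
print(max_taps(100))           # 9601 -> ~9600 usable taps
```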
So if a manufacturer wants to go beyond 9600 taps yet maintain lip sync with the video, it’s essential to have the sources connected to the AVP and the TV connected to the HDMI out, like every traditional AVP since 2006. Done this way, any amount of latency can be compensated for.
For example, the Trinnov Altitude can have its room correction set all the way up to 32768 taps; uncompensated, that would add 341ms of latency. The Storm EVO with Dirac ART can be cranked up to 8192 taps, adding 85ms. Both settings would be well beyond the threshold of human lip-sync tolerance (around 40ms) if the latency weren’t compensated for internally via the AVP’s own HDMI switching/buffering.
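A quick sanity check on those figures, using the same half-tap-count group-delay rule at 48 kHz:

```python
# Verify the Trinnov and Storm latency figures quoted above.
for name, taps in [("Trinnov Altitude", 32768), ("Storm EVO + Dirac ART", 8192)]:
    ms = taps / 2 / 48_000 * 1_000
    print(f"{name}: {taps} taps -> {ms:.0f} ms")   # 341 ms and 85 ms
```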
For this reason, a box like the DPR-16 connected via eARC will never be able to use post-processing that exceeds 9600 taps, whereas the AudioControl DPR-16 connected via its 4 HDMI input ports and HDMI out could theoretically compensate for far higher latency. Of course, the internal DSP would need the power to run it as well, and it’s unknown at this time what they’re using for DSP inside.
Hopefully this clears things up. But if any industry gurus want to explain it better, please chime in.