
Topping D50 III optical input bug with 44.1kHz files

Do you notice audible drop-outs using the Topping D50 III optical input with 44.1kHz files?

  • Yes. I can hear drop-outs with the optical input (44.1kHz files)

    Votes: 3 17.6%
  • No. I do not hear audible drop-outs with the optical input (44.1kHz files)

    Votes: 4 23.5%
  • I own a Topping D50 III but have not used the optical input yet

    Votes: 4 23.5%
  • I do not own a Topping D50 III

    Votes: 6 35.3%

  • Total voters
    17
I'm certain the DAC chip in this device doesn't do that. And I doubt that any device would be able to function at all employing that strategy. I've heard what happens when a DAC loses the clock and starts dropping/duplicating samples - which is effectively what such a strategy is. It is not pretty.
Well, while it's not pretty, it is used. Let me quote e.g. https://highendbyoz.com/wp-content/uploads/2022/07/Operating_Manual_Maximinus.pdf:

Re-clocking
– Enable this function in the user menu (see “Menu structure” below) . This very
important feature of the DAC allows for all jitter to be removed from the input source.
Data is read onto the device’s memory and then independently read out using a ultra
stable clock. When enabled, this option will completely replace the incoming clock with
an ultra low jitter TCXO based clock. The DSP monitors the incoming sample frequency
and detects standard sample rate signals - 44.1 kHz, 48 kHz, 88.2 kHz, 96 kHz, 176.4 kHz, 192
kHz, 352.8 kHz and 384 kHz. The on-board clock then completely replaces the incoming
clock. The source’s clock is used for other sampling frequencies. The DSP allocates a
huge internal FIFO buffer (1/2 second at 44.1), that stores the incoming audio to
decouple the incoming and outgoing data streams. Long absolute digital silences in the
music stream, such as between tracks and during pauses, are selectively shortened or
lengthened by the DSP to maintain data synchronization.
This results in a significant delay
between the audio source and the analog audio. You will not normally notice this delay
unless video is synchronized with the audio. For this reason you may want this feature to
be turned off when watching video, or the video might be delayed.

IIRC a FIFO reclocker board quite popular in RPi community does it too (using built-in FPGA), but I cannot find any quote now so will not pursue this path.

Another HW example which tracks no silence at all: https://www.mouser.com/pdfdocs/Semtech_GS2970A_DS.pdf p. 88 chapter 4.19.3.3 Audio FIFO Block:

The position of the write pointer with respect to the read pointer is monitored
continuously. If the write pointer is less than 6 samples ahead of the read pointer (point
A in Figure 4-46), a sample is repeated from the read-side of the FIFO. If the write pointer
is less than 6 samples behind the read pointer (point B in Figure 4-46), a sample is
dropped. This avoids buffer underflow/overflow conditions.
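
The rule in that datasheet excerpt could be sketched roughly as follows. This is only a toy model, not Semtech's implementation: the class name, the 64-sample capacity and the `simulate` helper are assumptions made up for illustration, only the 6-sample guard distance comes from the quote.

```python
from collections import deque


class DropRepeatFifo:
    """Toy FIFO that repeats or drops one sample when the fill level nears a limit."""

    def __init__(self, capacity=64, guard=6):
        self.capacity = capacity   # hypothetical FIFO depth in samples
        self.guard = guard         # guard distance (6 samples in the quoted text)
        self.buf = deque()
        self.last = 0              # last sample handed to the reader
        self.repeats = 0
        self.drops = 0

    def write(self, sample):
        # Writer is less than `guard` samples behind the reader: drop the sample.
        if len(self.buf) > self.capacity - self.guard:
            self.drops += 1
            return
        self.buf.append(sample)

    def read(self):
        # Writer is less than `guard` samples ahead: repeat the previous sample.
        if len(self.buf) < self.guard:
            self.repeats += 1
            return self.last
        self.last = self.buf.popleft()
        return self.last


def simulate(n_reads, ppm):
    """Run the reader for n_reads samples while the source clock is off by `ppm`."""
    fifo = DropRepeatFifo()
    for i in range(fifo.capacity // 2):   # start half full
        fifo.write(i)
    acc = 0                               # integer accumulator, units of 1e-6 sample
    for _ in range(n_reads):
        acc += 1_000_000 + ppm            # source produces (1 + ppm * 1e-6) samples
        while acc >= 1_000_000:
            fifo.write(0)
            acc -= 1_000_000
        fifo.read()
    return fifo.repeats, fifo.drops


print(simulate(200_000, 1000))    # fast source: only drops
print(simulate(200_000, -1000))   # slow source: only repeats
```

Once the fill level has drifted to one of the limits, every further sample of drift costs exactly one dropped or repeated sample, so the correction rate simply follows the clock mismatch.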

This method of crude async resampling (again without tracking silence) is commonly used in software, e.g. in gstreamer https://gstreamer-devel.narkive.com...o-sample-rate-conversion-with-gstreamer#post5 while a proper adaptive resampler has never been implemented https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/2681 .

I would believe the same method (again without tracking silence) is applied in PTP-clocked RTP reception of Dante receivers as these need very short latency (i.e. employ only short FIFO internally) and low computing overhead (unlikely to run proper adaptive resampling algorithms in their FPGAs).

Since Topping D50 III has a rather powerful XMOS microprocessor, this part would easily be able to perform this crude reclocking of I2S stream coming from the SPDIF receiver, to avoid having to switch clocks between the XMOS (USB) and the SPDIF receiver (which often produces audible artefacts). No idea whether it actually does.
 
Quiet or loud can't make any difference to digital interface dropouts. The hardware / software of the interface has no knowledge of the music, level or otherwise, that the data bits are carrying.
Well, try it for yourself with a D50 III. My D50 III replaced a Chord Mojo, which did not have this problem. It was a straight swap.
 
Not true.

By interface I mean the Toslink interface components which are likely to generate bit errors, not a complete audio interface device.

In which case, the only difference I can see them experiencing is that a bunch of zeros in the audio part of the data stream would result in a lower average frequency in the biphase-mark encoding that SPDIF uses. If, in the unlikely case, this were enough to make a difference, I'd have thought it would result in fewer errors rather than more.
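
For what it's worth, SPDIF subframes are biphase-mark coded, and the "fewer transitions for zeros" point can be checked with a small sketch. It ignores preambles and the status/parity bits of a real subframe, and the function names are just made up for this illustration.

```python
def biphase_mark_encode(bits, level=0):
    """Return two half-bit cells per data bit, biphase-mark coded."""
    cells = []
    for bit in bits:
        level ^= 1            # every bit cell starts with a transition
        cells.append(level)
        if bit:
            level ^= 1        # a 1 adds a second transition mid-cell
        cells.append(level)
    return cells


def count_transitions(cells, start_level=0):
    """Count level changes in the half-bit cell sequence."""
    prev, n = start_level, 0
    for c in cells:
        n += c != prev
        prev = c
    return n


silence = [0] * 16   # all-zero audio bits
loud = [1] * 16      # worst case: all ones
print(count_transitions(biphase_mark_encode(silence)))
print(count_transitions(biphase_mark_encode(loud)))
```

So digital silence halves the number of edges on the line compared with the all-ones worst case, which is the "lower average frequency" mentioned above.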

Or what am I missing?
 
Every day is a school day. And having read the descriptions of them (and if I am understanding correctly), they are nothing like the massive error rate I was describing.

However, these drop/repeat sample strategies shouldn't result in dropouts in the music. They will squeeze or stretch the waveform by a single sample duration (some distortion) rather than produce a short silence. How often this happens will depend on the degree of mismatch between the clocks.

With your first example above, with the half-second buffer and silence tracking, if the clock difference is only 50 ppm (I've no idea if this is realistic or not, it's just a typical crystal oscillator tolerance), then according to my fag-packet calculation it could be over an hour before the buffer over/underflows. Plenty of time to reset the buffer between periods of silence.

Of course if the clocks were derived from different frequency oscillators, and were only matched say to 0.1% then that would shorten to about 8 minutes.
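
The back-of-envelope numbers above come from dividing the buffer headroom by the relative clock error. A minimal sketch, assuming either the full 0.5 s of buffer or only half of it (a FIFO that starts half full) is available as headroom; the function name is just for illustration:

```python
def seconds_until_slip(buffer_seconds, clock_error_ppm):
    """Time for a clock mismatch to use up `buffer_seconds` of FIFO headroom."""
    return buffer_seconds / (clock_error_ppm * 1e-6)


for ppm in (50, 1000):   # 1000 ppm = 0.1 %
    full = seconds_until_slip(0.5, ppm)
    half = seconds_until_slip(0.25, ppm)
    print(f"{ppm} ppm: {full / 60:.1f} min (0.5 s headroom), "
          f"{half / 60:.1f} min (0.25 s headroom)")
```

With 0.5 s of headroom this gives roughly 167 minutes at 50 ppm and about 8 minutes at 0.1%; with only 0.25 s each way it is about 83 minutes and 4 minutes.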
 
I'm going to do a video highlighting this problem over the weekend and upload it.
 
I for one will be interested in seeing it.
 
I would believe the same method (again without tracking silence) is applied in PTP-clocked RTP reception of Dante receivers as these need very short latency (i.e. employ only short FIFO internally) and low computing overhead (unlikely to run proper adaptive resampling algorithms in their FPGAs).
In most large PTP installations everything is in synch. Live broadcast facilities can't tolerate splats or ticks.
 
And having read the descriptions of them (and if I am understanding correctly), they are nothing like the massive error rate I was describing.
I agree. I was giving examples of the sample dropping/duplication method being used quite commonly.
However, these drop/repeat sample strategies shouldn't result in dropouts in the music. They will squeeze or stretch the waveform by a single sample duration (some distortion) rather than produce a short silence.
Well, it depends on the implementation. Audible dropouts are reported in https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/2681 as linked above:
This can be observed by running these really simple pipelines on the same machine (and waiting for 5 - 30 minutes - depending on how accurate the clock in your sound card is - to observe a glitch or drop-out):

Adding or removing just one sample at a time, frequently, requires very precise measurement of the rate difference, which is difficult because software buffers/FIFOs get filled and consumed in batches, so their fill level fluctuates widely. So the simple algorithms (as described in the quotes above) just wait for some limit to be reached and then "fix" the buffer fill at once, logically affecting a block of many samples.
Plenty of time to reset the buffer between periods of silence.
Yes, that algorithm which tracks silence is quite acceptable, as its corrective action (which affects many samples) is not audibly noticeable. But even that is not 100% reliable, as there can be a continuous stream without any silent moments, and the algorithm must kick in eventually to avoid the buffer issue (e.g. continuous live recordings, long measurement sessions, etc.). Of course proper adaptive resampling has no such problems, but it increases latency, requires major computing resources or specialized hardware, etc.
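
The "wait for silence, but kick in eventually" behaviour could look something like the sketch below. The thresholds (`soft`, `hard`) and the function name are hypothetical; I do not know how any particular device decides this.

```python
def plan_correction(fill, capacity, block, soft=0.25, hard=0.05):
    """Return 'none', 'drop' or 'repeat' for the block about to be played.

    soft: fraction of the FIFO at either end where correction becomes desirable.
    hard: fraction at either end where correction can no longer be postponed.
    """
    ratio = fill / capacity
    silent = all(s == 0 for s in block)     # absolute digital silence
    too_full = ratio > 1 - soft
    too_empty = ratio < soft
    critical = ratio > 1 - hard or ratio < hard
    if not (too_full or too_empty):
        return "none"
    if silent or critical:
        # Inaudible if the block is silent, audible if forced during music.
        return "drop" if too_full else "repeat"
    return "none"   # still some headroom: keep waiting for a silent block


print(plan_correction(80, 100, [0, 0, 0]))   # silent block, FIFO filling up
print(plan_correction(80, 100, [3, -2, 5]))  # music, still headroom
print(plan_correction(97, 100, [3, -2, 5]))  # music, but no headroom left
```

The last case is the one that matters for continuous material: without any silent block the correction is eventually forced in the middle of the music.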

IIRC we already discussed these issues in https://www.audiosciencereview.com/...im-pro-distortion-on-spdif-input.41709/page-5 which IIRC ended with a software update of the streamer fixing the issue, hinting that the adaptive resampling between the incoming SPDIF/I2S stream and the outgoing I2S stream for the DAC was fixed by the vendor. Theoretically the same type of issue could happen in the device discussed in this thread, should the SPDIF I2S stream be handled by the XMOS processor.
 
Could this be fixed with a firmware update in that case?
 
In most large PTP installations everything is in synch. Live broadcast facilities can't tolerate splats or ticks.
OK, let's analyze what must happen in a Dante/AES67 D->A receiver with PTP sync. Very nicely described in https://www.ravenna-network.com/wp-...ynchronization-Fundamentals-in-a-Nutshell.pdf

The receiver gets an incoming network RTP stream with high-resolution timestamps, plus the incoming network PTP protocol, which fine-tunes the software clock of the receiver to be in precise alignment with the clocks of the other receivers (hence maintaining playback synchronicity). The receiver time-aligns each received RTP packet with its local (PTP) clock and needs to pass this packet to the hardware DAC. The DAC must physically consume this packet at exactly the same pace as that of the PTP software clock, otherwise the next packet cannot be passed to the DAC at that exact time.

Now how to generate the physical DAC master/I2S clock at the same pace as the software clock? The above-linked presentation just shows a hardware clock controlled by the PTP clock:

[Attached image: 1732354439930.png]


That is very logical, but it requires specialized hardware: a software-tunable hardware clock. I can imagine that the FPGA/chip-based implementations of the receiver include such a clock. But to be honest, my 2 cents are on the simple sample dropping/duplication algorithm with a standard fixed clock instead. Tiny internal buffers and tight feedback control would allow "corrupting" just single samples, which may be inaudible. I may (hopefully) be wrong; I could find no implementation details online.

But Dante also offers linux SDK https://www.getdante.com/products/solutions-for-manufacturers/dante-for-audio/
[Attached image: 1732354673332.png]

I have never seen a commercially available soundcard (including ARM SoC I2S interfaces) whose hardware clock can be fine-tuned in software by its driver. I would be happy to learn about one, because such hardware would solve many issues which currently have to be handled by asynchronous resampling. In fact it could be built quite easily, e.g. the Si5340 clock generator allows fine-tuning its clock via I2C in steps as small as 0.001 ppb https://www.skyworksinc.com/-/media.../public/data-sheets/si5341-40-d-datasheet.pdf . But I have never seen a physical soundcard with a tunable clock chip like this. It could be an interesting product for embedded linux, but asynchronous resampling in software is much, much cheaper (at the cost of added latency, of course).
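
To illustrate what a driver could do with such a tunable clock, here is a purely simulated sketch: a PI loop that trims the local clock so that the FIFO fill level stays at its target. Nothing here talks to real hardware; `trim_ppm` stands in for whatever register write a hypothetical driver would perform, and the gains `kp`/`ki` are arbitrary values chosen for this toy model.

```python
def simulate_clock_trim(src_ppm, steps=600, fs=48_000, kp=2.0, ki=0.1):
    """PI loop that trims a (hypothetical) tunable DAC clock to hold a FIFO level.

    src_ppm: how much faster the source clock runs than the untrimmed local clock.
    Returns the final trim in ppm and the final FIFO level error in samples.
    """
    error = 0.0       # FIFO fill minus target, in samples
    integral = 0.0
    trim_ppm = 0.0
    for _ in range(steps):            # one control step per simulated second
        integral += error
        trim_ppm = kp * error + ki * integral
        # The fill grows while the source is faster than the trimmed local clock.
        error += fs * (src_ppm - trim_ppm) * 1e-6
    return trim_ppm, error


trim, err = simulate_clock_trim(30.0)
print(f"trim = {trim:.3f} ppm, level error = {err:.4f} samples")
```

In this model the trim settles at the source offset and the FIFO level returns to its target, so no sample ever has to be dropped, repeated or resampled.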

I really hope some Dante designer would chime in and explain how the software PTP clock gets transformed into the physical DAC clock in their devices.
 
Yes, IF it's really the cause. I do not know that.
But we're not talking about network sync protocols here. Just simple Toslink input DAC. As far as I can tell this DAC uses an ES9038 DAC chip which - also as far as I can tell - implements a DPLL for clock recovery. Assuming the input data rate is within range of the DPLL capability - no sample repeat/drop should be necessary.

Far more likely, I would expect, is that any dropouts are caused by an unusually high error rate on the Toslink connections.

It would be interesting to know @VientoB , what Toslink data rate you have set (regardless of the sample rate of the files), and what your source is.
 
I really hope some Dante designer would chime in and explain how the software PTP clock gets transformed into the physical DAC clock in their devices.
The experience I have is with large live broadcast solutions. All devices are specialized. Getting the whole thing to work is heroic. But that's the cost of a truly site-synchronous design.
 
But we're not talking about network sync protocols here.
That's true. I am talking about merging two clock domains, SPDIF input vs. I2S -> DAC output; the network is not the important part here. I do not know if that happens in this particular device, but merging clock domains is very common in audio devices, and they handle it in various ways with various outcomes.
Just simple Toslink input DAC.
Actually it is a box with three master-clocked inputs - USB, I2S, BT and one DAC which internally supports ASRC.

As far as I can tell this DAC uses an ES9038 DAC chip which - also as far as I can tell - implements a DPLL for clock recovery. Assuming the input data rate is within range of the DPLL capability - no sample repeat/drop should be necessary.
I do not know how the master-clocked signals from the three input sources are fed to the single-input DAC. If they are simply switched before the DAC, then I absolutely agree that the above discussion about merging clock domains is irrelevant (and I said so at the beginning). If some MCU (like the XMOS) does the merge and the I2S link between this MCU and the DAC is master-clocked, then the clock domains are being merged there (even if there is an ASRC internal to the DAC, behind the master-clocked I2S).

Since the 10ch EQ performed by the XMOS is available only for USB, the input -> DAC selection is more likely not to be done in XMOS. But pictures of that device board show several large chips around XMOS, no idea what those do.

Far more likely, I would expect, is that any dropouts are caused by an unusually high error rate on the Toslink connections.
Yes, it can be the case, I do not dispute it at all. I am just showing another cause which may or may not be possible, depending on internal construction of the device (which I do not know).
 
But we're not talking about network sync protocols here. Just simple Toslink input DAC. As far as I can tell this DAC uses an ES9038 DAC chip which - also as far as I can tell - implements a DPLL for clock recovery. Assuming the input data rate is within range of the DPLL capability - no sample repeat/drop should be necessary.

Far more likely, I would expect, is that any dropouts are caused by an unusually high error rate on the Toslink connections.

It would be interesting to know @VientoB , what Toslink data rate you have set (regardless of the sample rate of the files), and what your source is.
Well, the files are coming from my NAS through a Chromecast Audio, the exact same way that my Chord Mojo was being supplied previously (I've also had an SMSL M300 SE in that position as well with no problems).
 
Does the problem also occur at 48 kHz?
 
No it only seems to be 44.1kHz
That's what I suspected.
Not an unknown problem: there were compatibility problems between Chromecast Audio and some devices from the start, e.g. Cambridge, Oppo DAC/CD players, etc. I also know that the Chord Mojo worked perfectly at the time.
Chromecast Audio only ran with interruptions at 44.1k on my Sony DSC-88 (Digital Signal Checker) workshop device, which is why I returned it.
Just to be clear, what doesn't work on a Sony DSC-88 doesn't comply with the SPDIF standard.

There are DACs that can handle a non-standard SPDIF signal better than others, very dependent on the transceiver chip used. The problem is also known from many modern TVs with SPDIF interfaces.
But neither Topping nor any other DAC manufacturer is to blame; it's not their job to be compatible with source devices that don't adhere to the standard.
You probably won't like it, but that's how it is.

I seem to remember that there was an update for Chromecast Audio regarding the SPDIF problem, but I'm not sure. You can check it.
 
Thanks, interesting post. I shall investigate. What would be a good replacement for the Chromecast Audio if I have to go down that route?

I need to check if the other people here who have had problems are using a Chromecast audio also...
 
The problem now exists with many source devices, not just televisions. In recent years, it has also been occurring more and more frequently with CD players. A bit ridiculous when you consider that SPDIF was primarily developed for these devices.

You should look at what makes sense for you: DAC or Chromecast Audio replacement.
The usual alternatives are a Wiim Mini or a Raspberry Pi with streamer software; there are now many options.

If you want to keep both, you could put an AK4118 board from Aliexpress in between. This has solved the problem with TVs for some people I know, but it is not guaranteed to work with Chromecast Audio.
 