We had a long discussion a while ago about why comparing the DAC output to the original is extremely difficult: most of the time the residual is fully dominated by the roll-off of whatever filters are present (a DC filter, if there is one, and of course the combined analog and digital reconstruction filters)... and note that this is the combined response of the DAC under test and the ADC used to digitize its output.
In the end this means the residual is dominated by simple linear errors in regions that are basically outside of what is relevant. These linear errors also include ever-so-slight gain drifts, again in both the DAC and the ADC.
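To put a number on how little linear error it takes, here is a rough sketch (my own arithmetic, not anything taken from DW) of the residual left by a pure level mismatch between the two signals. The function name `null_depth_db` is made up for illustration. If the capture is the original scaled by a factor g, the difference is (g - 1) times the original, so the null can be no deeper than 20*log10(|g - 1|). A mismatch of only 0.01 dB already limits the null to roughly -59 dB.

```python
import math

def null_depth_db(level_error_db):
    """Residual level, in dB relative to the signal, left by a pure level mismatch."""
    ratio = 10 ** (level_error_db / 20)   # linear gain of the mismatched copy
    return 20 * math.log10(abs(ratio - 1))  # level of (copy - original)

# Example mismatches in dB (arbitrary illustrative values)
for err_db in (0.1, 0.01, 0.001):
    print(f"{err_db} dB mismatch -> residual at {null_depth_db(err_db):.1f} dB")
```

The same arithmetic applies per frequency, which is why a fraction of a dB of filter roll-off near the band edges swamps everything else in the residual.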
DW's results are very much influenced by the exact settings... I've spent days fine-tuning the settings to get the best results. The most basic thing is to restrict the analysis range to 20 Hz...20 kHz, exact numbers, via pre- and post-filtering. The matching parameters (gain and delay) usually need to be tweaked manually for the lowest residual.
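For anyone wondering what "matching gain and delay" boils down to, here is a minimal sketch in plain Python. It is not how DW does it internally (I don't know its implementation), just the textbook idea: try a range of whole-sample delays, compute the least-squares gain for each, and keep the combination with the lowest residual. The function name, the `max_lag` parameter and the noise test signal are all made up for illustration; a real tool also has to handle sub-sample delay and clock drift, which this ignores.

```python
import math
import random

def match_gain_delay(ref, cap, max_lag=8):
    """Find (lag, gain, residual_db) such that cap[n] ~ gain * ref[n - lag].

    lag is in whole samples; residual_db is relative to the capture's energy.
    """
    n = len(ref)
    best = None
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, lag), min(n, n + lag)      # overlapping index range
        x = [ref[i - lag] for i in range(lo, hi)]  # delayed reference
        y = cap[lo:hi]
        sxx = sum(a * a for a in x)
        sxy = sum(a * b for a, b in zip(x, y))
        syy = sum(b * b for b in y)
        gain = sxy / sxx                           # least-squares gain
        err = sum((b - gain * a) ** 2 for a, b in zip(x, y))
        res_db = 10 * math.log10(max(err / syy, 1e-30))  # floor avoids log(0)
        if best is None or res_db < best[2]:
            best = (lag, gain, res_db)
    return best

# Synthetic test: the "capture" is the reference delayed by 3 samples, gain 0.98
random.seed(1)
ref = [random.gauss(0.0, 1.0) for _ in range(2000)]
cap = [0.0] * 3 + [0.98 * v for v in ref[:-3]]

lag, gain, residual_db = match_gain_delay(ref, cap)
print(lag, round(gain, 4), round(residual_db, 1))
```

With real captures the minimum is much shallower and much flatter than in this synthetic case, which is why the automatic result often benefits from manual nudging.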