There are many things that can lead to measurement differences at such sensitive levels: unit-to-unit manufacturing variation, temperature, susceptibility to RF, conducted RF on the interconnects, radiated RF and how it affects other gear in the setup, even the cleanliness of the AC mains.
@amirm and @WolfX-700 take care with their test setups, and so represent a more ideal case. Buyers of gear need to recognize that their particular situation may not be so ideal, and this level of performance may not be possible. For example, as Amir has shown, so much consumer gear doesn't go through proper RF (FCC or otherwise) testing, so there's no knowing how your other gizmos are bombing each other with RF, or how good they are at rejecting being bombed. RF can demodulate down into the audio band, raising the noise floor and adding spurious low-level tones. Ground loops between a PC and other gear can cause noise floors to rise and USB frame rates (8 kHz) to be visible on the plots.
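To give a feel for what that pollution looks like on a plot, here's a quick Python sketch. This is my own illustration, not anyone's measurement: the amplitudes, the 96 kHz sample rate, and the 60 Hz mains frequency are all made-up assumptions. It synthesizes a loopback-style capture with a noise floor, mains harmonics, and an 8 kHz USB frame-rate spur, then plots the spectrum:

```python
# Illustrative sketch: how ground-loop mains harmonics and the 8 kHz USB
# frame rate can show up on an FFT plot. All levels here are invented.
import numpy as np
import matplotlib.pyplot as plt

fs = 96_000          # sample rate, Hz (assumed)
n = 1 << 18          # capture length
t = np.arange(n) / fs

x = 10e-6 * np.random.randn(n)               # broadband noise floor (made-up level)
for k in (1, 2, 3, 5):                        # mains fundamental + harmonics (60 Hz grid assumed)
    x += (5e-6 / k) * np.sin(2 * np.pi * 60 * k * t)
x += 2e-6 * np.sin(2 * np.pi * 8_000 * t)     # USB frame-rate spur at 8 kHz

win = np.hanning(n)                           # window to keep spectral leakage down
spec = np.abs(np.fft.rfft(x * win)) / (win.sum() / 2)   # amplitude-normalized spectrum
db = 20 * np.log10(spec + 1e-20)
f = np.fft.rfftfreq(n, 1 / fs)

plt.semilogx(f[1:], db[1:])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Amplitude (dBV, illustrative)")
plt.title("Synthetic loopback: mains harmonics + 8 kHz USB spur")
plt.show()
```

What you'd see is the picket fence of mains harmonics at the low end and a lone spike at 8 kHz poking out of the floor, which is basically the signature to look for in real loopback captures.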
For another example, my home has a bad AC feed and is in an RF hot zone. I characterized the RTX6001 in my home environment:
https://www.audiosciencereview.com/...udio-measurement-gear.113/page-12#post-153972
Just changing the RTX's power cord to a lower-gauge cord with slightly poorer shielding increased the power-line harmonics in loopback by 5 to 10 dB.
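For anyone not fluent in dB: a 5 to 10 dB rise means the harmonics' voltage amplitude went up roughly 1.8x to 3.2x, from nothing but a cord swap. The conversion:

```python
# Voltage ratio for a given dB change: ratio = 10 ** (dB / 20)
for db in (5, 10):
    print(f"{db} dB -> {10 ** (db / 20):.2f}x voltage amplitude")
# 5 dB -> 1.78x voltage amplitude
# 10 dB -> 3.16x voltage amplitude
```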
Temperature dependence is also a gremlin, especially when a device is using highly sensitive I/V converters (current noise is tough to stabilize over temperature). Here are two measurements I took of a Lite Dac Ah back in '06 (using just an SB card), one at room temp, the other after 20 minutes in the freezer (not that cold).
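As a rough sanity check on how much of a shift plain physics can explain (my numbers below, not from those measurements): the Johnson noise of a resistor scales with sqrt(T), so a freezer trip only moves it by a fraction of a dB; anything bigger points at the semiconductor current noise and bias drift mentioned above. A back-of-envelope sketch, assuming a hypothetical 1 kOhm I/V feedback resistor:

```python
# Johnson (thermal) noise of a resistor: vn = sqrt(4 * k_B * T * R * BW).
# R and bandwidth are assumptions for illustration, not the Dac Ah's values.
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
R = 1_000.0          # hypothetical 1 kOhm feedback resistor
bw = 20_000.0        # 20 kHz audio bandwidth

for label, T in (("room (~295 K)", 295.0), ("freezer (~255 K)", 255.0)):
    vn = math.sqrt(4 * k_B * T * R * bw)
    print(f"{label}: {vn * 1e9:.0f} nV RMS")

# Ratio between the two temperatures: 10 * log10(295 / 255) ~= 0.63 dB.
```

About 571 vs 531 nV, a 0.63 dB difference, so when the freezer moves a measured floor by more than that, it isn't the resistors doing it.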
By all means it'll be interesting to see where this leads, and maybe the lesson learned will be to not accept gear from manufacturers (especially after the cherry-picked pot). But I bring this up to hopefully ease minds: don't worry so much about such small measured differences and how they'll affect your listening pleasure (they won't).