FR measurements are more standardized than non-linear distortion measurements. For example, unless otherwise noted, one naturally expects a speaker FR measurement to be done in far-field anechoic free space conditions (simulated or actual). Speaker FR measurements are not expected to be level-dependent unless there is something seriously wrong with the speaker or it is driven into compression (which should not be the case unless that is explicitly stated as part of the measurement). Background noise is not a big issue with typical FR measurement techniques.
In the end, speaker FR measurements are sufficiently standardized that multiple sources with different protocols can often be compared directly, as is often done in these review threads (with S&R, etc.). Different protocols might produce results with varying accuracy (e.g. anechoic vs. quasi-anechoic vs. NFS), but on average one can expect them to converge on the same numbers, and in practice they do, as has been shown many times.
In contrast, non-linear distortion measurements don't follow any widely agreed-upon standard (aside from the definition of the THD formula itself, which is a small part of the picture). You might get wildly different results depending on drive level, test signal (frequency, waveform), background noise, acoustical environment, and perhaps other factors such as short-term vs. long-term distortion. Protocols are all over the place, and it's rare to find two that are similar enough for the results to be compared directly. This is not surprising: there is still a lot we don't know about how to make non-linear distortion measurements that actually correlate well with human perception, so this area is still very much in the "experiment"/"trial and error" stage as opposed to the "standardization" stage.
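To make the drive-level point concrete, here is a rough Python sketch. It uses the common definition of THD (root-sum-square of the harmonic amplitudes divided by the fundamental amplitude; some tools divide by the total RMS instead) and runs a sine through a toy non-linearity at three levels. Everything named here is an assumption for illustration only: `toy_driver` and its coefficients are made up and do not model any real speaker, and the function names, record length, and harmonic count are arbitrary choices.

```python
import math


def harmonic_amplitude(signal, cycles):
    """Amplitude of the sinusoid that completes `cycles` periods over the record.

    Single-bin DFT; exact here because the test tone is coherently sampled.
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * cycles * i / n) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * cycles * i / n) for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def thd(signal, fundamental_cycles, num_harmonics=5):
    """THD relative to the fundamental, using harmonics 2..num_harmonics."""
    fundamental = harmonic_amplitude(signal, fundamental_cycles)
    harmonics = [
        harmonic_amplitude(signal, k * fundamental_cycles)
        for k in range(2, num_harmonics + 1)
    ]
    return math.sqrt(sum(h * h for h in harmonics)) / fundamental


def toy_driver(x):
    """Hypothetical memoryless non-linearity (NOT a model of a real speaker)."""
    return x + 0.05 * x**2 + 0.02 * x**3


def thd_at_level(level, n=4096, cycles=16):
    """THD of the toy driver's output for a sine of the given peak level."""
    tone = [level * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
    return thd([toy_driver(v) for v in tone], cycles)


for level in (0.1, 0.5, 1.0):
    print(f"drive level {level:.1f}: THD = {100 * thd_at_level(level):.3f} %")
```

With this toy model the same "speaker" goes from roughly 0.25 % THD at the lowest level to roughly 2.5 % at the highest, which is why a THD figure quoted without its drive level (and test signal) says very little. A real measurement adds noise, room effects, and frequency dependence on top of that.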