I'm not sure anyone is actually interested in this topic. But since I am, and I've been mulling it over and doing a lot of reading over the past few days, I thought I'd float a few ideas and some summaries of the existing research, just in case anyone is interested.
First of all, a number of "typical" distortion tests of the kind I described in the OP have now shown surprisingly low distortion thresholds (i.e. in the range of 0.01%).
Here's a neat summary of some of the findings from an article by Gaskell from 2011:
View attachment 16299
As we can see, the results are all over the place. This is partly because of the wide variety of methods and stimuli used, and the fact that many of these studies have sought to find not audibility thresholds, but rather objectionability thresholds.
Still, the 1980 study by Petri-Larmi et al., which was well conducted despite some limitations (primarily the use of vinyl as the source), did find that experienced listeners who had previously shown particularly good ability to discern distortion were able to reliably detect distortion as low as 0.003% "RMS" with program material. The Gaskell study from which I took that table also found that some subjects could reliably discern distortion at around 0.003% "RMS".
0.003% is equivalent to 90dB below the signal. This seems surprising when we look at research into masking.
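For anyone who wants to check the percent-to-dB figures used in this post, here's a quick sketch. It assumes the distortion percentage is an amplitude ratio (hence 20·log10 rather than 10·log10); the function names are just mine for illustration.

```python
import math

def percent_to_db(percent: float) -> float:
    """Level of a distortion product relative to the signal, in dB.

    Assumes `percent` is an amplitude (not power) ratio.
    """
    return 20.0 * math.log10(percent / 100.0)

def db_to_percent(db: float) -> float:
    """Inverse of percent_to_db."""
    return 100.0 * 10.0 ** (db / 20.0)

# The percentages that come up in this post
for pct in (0.1, 0.03, 0.01, 0.003):
    print(f"{pct}% -> {percent_to_db(pct):.1f} dB")
```

On this assumption 0.003% works out to roughly -90.5dB, so "90dB below the signal" is a fair rounding.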
Here's a classic graph from Fastl and Zwicker, showing masking thresholds for a 1kHz tone masker at various SPLs:
View attachment 16303
Basically, what the graph shows is that, with a 1kHz masker tone at 90dB, a maskee tone will be audible only if it is at least (for example) about 60dB at 1.5kHz, 55dB at 2kHz, 45dB at 5kHz, and so on. With a 70dB masker tone at 1kHz, the audibility thresholds for a maskee tone would be about 35dB at 1.5kHz, 25dB at 2kHz, 0dB (absolute threshold of audibility) at 5kHz, and so on.
The key things to note here are that:
- upward masking (masking of tones higher in frequency) is actually quite effective relative to typical levels of harmonic distortion generated by decent electronic components, especially when the maskee is close in frequency to the masker.
- as SPL increases, the bandwidth widens in which upward masking (but not downward masking) is effective.
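To put the graph readings quoted above into the same terms as distortion specs, here's a rough sketch that converts them into levels relative to the masker. The threshold numbers are only my approximate eyeball readings from the graph (the ones given above), not measured data, and the names are made up for illustration.

```python
# Approximate masked thresholds (dB SPL) for a 1 kHz tone masker, as read
# off the graph by eye. Layout: masker SPL -> {maskee frequency in Hz: threshold}.
THRESHOLDS_DB_SPL = {
    90: {1500: 60, 2000: 55, 5000: 45},
    70: {1500: 35, 2000: 25, 5000: 0},
}

def audible_above(masker_spl: int, maskee_hz: int) -> tuple:
    """Return (relative level in dB, equivalent percent) a maskee must exceed."""
    threshold = THRESHOLDS_DB_SPL[masker_spl][maskee_hz]
    rel_db = threshold - masker_spl
    percent = 100.0 * 10.0 ** (rel_db / 20.0)  # amplitude ratio assumed
    return rel_db, percent

for masker_spl, row in THRESHOLDS_DB_SPL.items():
    for hz in row:
        rel_db, pct = audible_above(masker_spl, hz)
        print(f"{masker_spl} dB masker, {hz} Hz maskee: {rel_db} dB ({pct:.3g}%)")
```

On these readings, a tone at 2kHz (where H2 of the masker would fall) would need to be within about 35dB of a 90dB masker, i.e. somewhere around 1.8%, before it rose above the masked threshold.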
Also note, however, that it's not quite this simple. As the maskee gets closer in frequency to a tone masker, beat tones may become audible. Moreover, nonlinearities in the ear itself tend to create audible secondary beat tones at specific frequencies. The following graph shows these effects in more detail for an 80dB 1kHz masker:
View attachment 16306
Although these effects slightly complicate the picture, it's nevertheless the case that even the ear's nonlinearities do not produce audible distortions unless the maskee is no more than 60dB below the masker (0.1% or higher) - at this masker frequency and SPL at least (although the trends are similar across the board).
Here's a similar graph using narrow-bandlimited Gaussian noise rather than a tone as the masker:
View attachment 16305
Although the noise masker tends to have a higher and wider peak (as expected from its wider bandwidth), farther away from the peak the masking curve is similar. For example, you can see that a 60dB tone masker and a 60dB noise masker centred at 1kHz give about the same masking threshold of about -50dB at 2kHz; this is far enough from the centre frequency of the bandlimited noise for the differences between noise masker and tone masker to be negligible.
So far I've only shown graphs for various masker SPLs at 1kHz. Now here's a graph showing masking thresholds for 60dB bandlimited Gaussian noise at a variety of frequencies:
View attachment 16301
The thing to note here is the general trend that, at lower frequencies, the bandwidth of a masker's effectiveness tends to be wider. This is well-established to be particularly the case below 500Hz.
So, to summarise, we can say the following about masking:
- The lower in frequency the masker, the more effective (wider bandwidth).
- The higher in level the masker, the more effective (wider upward bandwidth).
- Tone maskers and noise maskers behave similarly (although noise maskers are slightly more effective, particularly in the frequency range close to the noise).
But perhaps what's most interesting about these data is that, taking them as a whole, it seems hard to imagine that maskees below about -70dB and between the frequency of the fundamental and H4 (i.e. 4 x the frequency of the masker) could be audible under any circumstances. This is because, when the masker is 70dB in level or lower, any maskee below -70dB will tend to fall below the absolute threshold of audibility, while when the masker rises to levels above about 70dB, the range of upward masking widens and, accordingly, maskees higher in frequency (unless much higher in frequency) will tend to fall below the masking threshold. This should mean that under no circumstances could any maskee below -70dB and between H1 and H4 be audible, in any frequency range.
In turn, this should imply that so long as there are no harmonics above H4 / -70dB (0.03%), there should be no audible distortion (of course, I've not addressed here maskees lower in frequency than the masker, a classic example of which would be the IM product given by F2-F1).
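The arithmetic behind the -70dB argument can be sketched like this. Note the assumption: I'm using 0dB SPL as a rough stand-in for the absolute threshold of audibility, whereas the real threshold varies with frequency (the dotted line in the graphs). The function name is mine.

```python
def product_spl(masker_spl: float, rel_db: float) -> float:
    """Absolute level (dB SPL) of a distortion product at rel_db relative to the masker."""
    return masker_spl + rel_db

REL_DB = -70.0  # the -70dB (about 0.03%) figure from this post

for masker_spl in (50, 60, 70, 80, 90, 100):
    spl = product_spl(masker_spl, REL_DB)
    # 0 dB SPL is only a rough proxy for the threshold in quiet (assumption)
    if spl <= 0:
        regime = "at or below 0 dB SPL, so near or under the absolute threshold"
    else:
        regime = "above 0 dB SPL, so inaudibility would depend on masking"
    print(f"masker {masker_spl} dB SPL -> product at {spl:.0f} dB SPL: {regime}")
```

So for maskers up to about 70dB the product sits around or below the absolute threshold, and above that the argument rests on the upward masking widening with level.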
Even above 0.03%, a maskee would need to be relatively far from the masker to be audible. And even above H4, a maskee would need to be above absolute thresholds of audibility (the ear becomes less sensitive at lower and higher frequencies, as shown by the dotted line in the above graphs).
Moreover, in a typical music signal, a wide range of frequencies are present. This would seem unlikely to leave a lot of space for unmasked harmonics and/or IM products to become audible, unless rather high in level (certainly compared to what good electronics are capable of).
What could be happening then in these DBTs in which subjects are able to distinguish distortions as much as an order of magnitude below 0.03%?
I have a couple of theories. The first one is that distortions combine to create a wider bandwidth noise-like signal that can rise above the masking threshold provided by the signal. I'm not aware of any studies into masking of noise by noise, but such studies would seem a good place to start in examining this. On the other hand, the signal itself will tend to create a wide bandwidth masker wherever it is creating wideband distortion (although not necessarily, see my next idea).
Another possibility is that what subjects are reliably able to hear in these tests are IM products lower in frequency than the spectral content of the stimuli. For example, if subjects were played a musical passage at 100dB with its spectral content concentrated in the midrange and treble, and with little content below say 500Hz, the signal content itself would do nothing to mask IM products falling below the lower cutoff of the signal content. An IM product at say -80dB and one octave below the main spectral content of the signal may be completely unmasked, meaning that so long as it rises above absolute audibility thresholds, it will be audible.
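Here's a small sketch of that scenario. The two tone frequencies are hypothetical values I picked for illustration; the 500Hz cutoff, 100dB level and -80dB IM level are the example figures from the paragraph above. It lists the second- and third-order IM products of two tones and picks out the ones that land below the signal's lower cutoff.

```python
def im_products(f1: float, f2: float) -> dict:
    """Second- and third-order IM product frequencies (Hz) for two tones, f1 < f2."""
    return {
        "f2-f1": f2 - f1,
        "f1+f2": f1 + f2,
        "2f1-f2": abs(2 * f1 - f2),
        "2f2-f1": 2 * f2 - f1,
    }

F1, F2 = 2000.0, 2250.0      # hypothetical midrange tones
LOWER_CUTOFF_HZ = 500.0      # "little content below say 500Hz"
SIGNAL_SPL = 100.0           # playback level from the example
IM_REL_DB = -80.0            # IM product level relative to the signal

products = im_products(F1, F2)
below = {name: hz for name, hz in products.items() if hz < LOWER_CUTOFF_HZ}

for name, hz in below.items():
    print(f"{name} = {hz:.0f} Hz at {SIGNAL_SPL + IM_REL_DB:.0f} dB SPL, "
          f"below the {LOWER_CUTOFF_HZ:.0f} Hz cutoff of the signal")
```

With these made-up tones, only the difference product (F2-F1, at 250Hz) falls below the cutoff, and at -80dB relative to a 100dB signal it would sit at 20dB SPL. Whether that is actually audible depends on the absolute threshold at that frequency.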
Some evidence seems to support this theory. For example, in Petri-Larmi's 1980 study, these thresholds were found for various types of musical content:
View attachment 16309
It seems plausible to speculate that perhaps the piano and choir samples were most revealing of distortion because they contained passages with the least low frequency content, but plenty of IM-producing mid-high frequency content. However, we don't know what the spectral content of these samples was, and moreover, it seems to me that the same could likely be said of the violin and harpsichord samples.
A third possible theory is that, although distortion itself may not be directly audible (i.e. audibly adding to the signal tonally) at levels in the range of 0.003-0.03%, due to the critical bands present in our auditory system, addition of distortion at these levels may result in the stimulus seeming louder than an undistorted stimulus of the same absolute SPL. It's well-established that, even if two stimuli have the same absolute SPL, the stimulus with the wider bandwidth, or the most even distribution of sound pressure across the widest bandwidth, will be perceived to be louder due to the functioning of our auditory system's critical bands. It may thus be speculated that, even if harmonics do not exceed audibility thresholds in their own right, if they extend the bandwidth of a stimulus or distribute sound pressure across a wider bandwidth, subjects will perceive that stimulus to be louder than a narrower bandwidth signal. It's well-established that subjects tend to prefer music that is, or that seems to be, louder.
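As a rough illustration of the "wider bandwidth" part of this idea, here's a sketch that maps a 1kHz fundamental and its first few harmonics onto the Bark (critical-band rate) scale. The formula is the commonly quoted Zwicker and Terhardt approximation, which I'm bringing in as an assumption; it isn't taken from any of the studies discussed here.

```python
import math

def hz_to_bark(f_hz: float) -> float:
    """Critical-band rate in Bark (Zwicker and Terhardt approximation, assumed here)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

FUNDAMENTAL_HZ = 1000.0  # illustrative fundamental

for n in range(1, 5):
    f = n * FUNDAMENTAL_HZ
    print(f"H{n}: {f:.0f} Hz -> {hz_to_bark(f):.1f} Bark")
```

On this approximation H1 to H4 of a 1kHz tone are each more than one Bark apart, i.e. each harmonic lands in a different critical band from its neighbours. That only shows the harmonics occupy additional critical bands; it says nothing about whether components this low in level would be enough to shift perceived loudness.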
Another factor that lends some weight to this theory is that, in many distortion audibility studies, subjects have tended to prefer musical stimuli that were distorted enough to be reliably distinguished from undistorted stimuli, so long as these were not distorted enough for subjects to be able to identify them as "sounding distorted". It seems reasonable to at least speculate that this may have been because these distorted music stimuli seemed louder than undistorted stimuli due to their wider bandwidth, despite having very close to the same absolute SPL.
Again, we don't know enough about the spectral content or the distortion components of the stimuli used in these studies to do any more than speculate here.
Finally (for now), everything I've discussed so far completely sidesteps issues surrounding temporal masking, i.e. masking of a signal by a masker that occurs either before or after it in time. I don't think this should be a major factor in most electronics, but I'm far from an expert on electronics.
Anyway, I hope other members here are interested enough in this topic to offer their comments and maybe introduce new ideas or evidence that might get us a bit closer to making sense of all these data.
EDITED: with a few extra thoughts and some clarifications...