Masking Studies and Relationship to Nonlinear Distortion Audibility (?)

watchnerd · Oct 25, 2019

Theriverlethe said:
Ha! If you listen to music with silent passages or nothing below 60Hz, it will be very annoying.

Sure.

But that's not really mixed with music.

And even then, it would depend on level. At -100 dB, hum may not be annoying at all, even in the situation you describe.

Theriverlethe · Oct 25, 2019

watchnerd said:
Sure.

But that's not really mixed with music.

And even then, it would depend on level. At -100 dB, hum may not be annoying at all, even in the situation you describe.

watchnerd · Oct 25, 2019

Theriverlethe said:

I'm very familiar with the piece.

Do you think hum at -100 dB is audible during those quiet parts?

If not, how much is?

Blumlein 88 · Oct 25, 2019

watchnerd said:
What do we know about a fairly limited masking scenario:

60 Hz hum mixed with music?

Obviously depends upon the music. If it has content around 60 Hz, 120 Hz, and 180 Hz (hum is often at all three of those), the hum would have to be pretty high in level to be heard, or it would be noticeable when the music gets quiet. One of the most annoying cases is listening to spoken word with even a little hum. There is nothing to hide the hum, and even if it's just detectable it is bothersome, while the same hum under some continuously loud rock music is hard to even hear.

Theriverlethe · Oct 25, 2019

watchnerd said:
I'm very familiar with the piece.

Do you think hum at -100 dB is audible during those quiet parts?

If not, how much is?

It would probably reach the threshold of audibility around 50dB. So, your audio system would have to be set to play 150dB to hear it at -100dB. If any full scale sound plays at this level, your eardrums will rupture, so you probably won’t have to worry about the hum very long.
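The arithmetic above can be sketched in a few lines. This is only an illustration: the function name is made up here, and the 50 dB SPL threshold for 60 Hz hum is the rough figure from this post, not a measured value.

```python
def full_scale_spl_needed(threshold_spl_db, level_dbfs):
    """SPL that 0 dBFS must produce for a signal at level_dbfs to reach threshold_spl_db."""
    return threshold_spl_db - level_dbfs

# Hum at -100 dBFS, assumed audibility threshold of 50 dB SPL at 60 Hz
print(full_scale_spl_needed(50.0, -100.0))  # 150.0
```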

watchnerd · Oct 26, 2019

Blumlein 88 said:
Obviously depends upon the music. If it has content around 60 Hz, 120 Hz, and 180 Hz (hum is often at all three of those), the hum would have to be pretty high in level to be heard, or it would be noticeable when the music gets quiet. One of the most annoying cases is listening to spoken word with even a little hum. There is nothing to hide the hum, and even if it's just detectable it is bothersome, while the same hum under some continuously loud rock music is hard to even hear.

I think that's at the obviously heard level.

But....

[warning, anecdotal experience, not data below]

If I use my tube phono stage, and screw up the grounding or use a ****** tube, I can get hum that I can't hear at normal listening levels, but I can *feel it* if I put my fingers on the cone when nothing is playing. It will be vibrating softly.

So I would assume that, at some level, this is potentially mucking with low level resolution capabilities.

ahofer · Oct 26, 2019

The volume effect of harmonic distortion is interesting. Perhaps the attraction of underpowered tube amps, routinely called “more dynamic”, is that the higher level program material gets a perceived boost from all the extra harmonic distortion content.

Blumlein 88 · Oct 26, 2019

watchnerd said:
I think that's at the obviously heard level.

But....

[warning, anecdotal experience, not data below]

If I use my tube phono stage, and screw up the grounding or use a ****** tube, I can get hum that I can't hear at normal listening levels, but I can *feel it* if I put my fingers on the cone when nothing is playing. It will be vibrating softly.

So I would assume that, at some level, this is potentially mucking with low level resolution capabilities.

I don't believe any maskers can mask something louder than the masker. So it could mask frequencies near the hum. But if you're already down to where you can't hear the masker, could you hear something being masked? I'd say no, because your ear isn't hearing the masker for it to function. I suppose you could have the hum just barely, barely inaudible, and have signal right there too, so that both combined become barely, barely audible. Then continuous hum might mask something that wasn't continuous. That would be a very narrow, fleeting set of conditions. And maybe you'd get the non-continuous signal modulating the continuous hum, so it would come and go somewhat like the signal anyway.

I once, years ago, had a speaker with a passive radiator. On LP the passive radiator would be visibly moving a lot from TT rumble. I later had a different phono with a filter for everything below 30 Hz. That quieted the passive radiator movement. I couldn't hear any difference.

Theriverlethe · Oct 26, 2019

watchnerd said:
I think that's at the obviously heard level.

But....

[warning, anecdotal experience, not data below]

If I use my tube phono stage, and screw up the grounding or use a ****** tube, I can get hum that I can't hear at normal listening levels, but I can *feel it* if I put my fingers on the cone when nothing is playing. It will be vibrating softly.

So I would assume that, at some level, this is potentially mucking with low level resolution capabilities.

Listening to vinyl is definitely mucking up your low end, well within audibility.

watchnerd · Oct 26, 2019

Blumlein 88 said:
I don't believe any maskers can mask something louder than the masker. So it could mask frequencies near the hum. But if you're already down to where you can't hear the masker, could you hear something being masked? I'd say no, because your ear isn't hearing the masker for it to function. I suppose you could have the hum just barely, barely inaudible, and have signal right there too, so that both combined become barely, barely audible. Then continuous hum might mask something that wasn't continuous. That would be a very narrow, fleeting set of conditions. And maybe you'd get the non-continuous signal modulating the continuous hum, so it would come and go somewhat like the signal anyway.

I once, years ago, had a speaker with a passive radiator. On LP the passive radiator would be visibly moving a lot from TT rumble. I later had a different phono with a filter for everything below 30 Hz. That quieted the passive radiator movement. I couldn't hear any difference.

I know this all makes sense...

But, when cognitive bias kicks in, when I know I can feel it with my fingers vs not....I *swear* by my non-DBT-senses, it loses some detail...

Maybe some math can shatter my cognitive dissonance.

watchnerd · Oct 26, 2019

Theriverlethe said:
Listening to vinyl is definitely mucking up your low end, well within audibility.

Oh, most definitely!

But that wasn't really the question.

amirm · Oct 26, 2019

andreasmaaan said:
But perhaps what's most interesting about these data is that, taking them as a whole, it seems hard to imagine that maskees below about -70dB and between the frequency of the fundamental and H4 (i.e. 4 x the frequency of the masker) could be audible under any circumstances. This is because, when the masker is 70dB in level or lower, any maskee below -70dB will tend to fall below the absolute threshold of audibility, while when the masker rises to levels above about 70dB, the range of upward masking widens and, accordingly, maskees higher in frequency (unless much higher in frequency) will tend to fall below the masking threshold. This should mean that under no circumstances could any maskee below -70dB and between H1 and H4 be audible, in any frequency range.

Your use of dB without unit makes these conclusions hard to follow. If you mean dBSPL, then that is just a matter of amplification. A 70 dBFS signal can be 70 dBSPL or 100 dBSPL depending on how loud I play it. The latter lifts the signal from threshold of hearing (which by the way is lower than 0 dB) to +30 dBSPL.

In literature, the max dB SPL is usually assumed to be 120 dB (based on research of how loud live music can be). In that case, if you play your 70 dBFS signal at 120 dBSPL, then the distortion products will land at +50 dBSPL or way, way higher than threshold of hearing. Distortion products therefore need to be 120 dB or so lower to be below threshold of hearing (in mid frequencies).

In addition, since music has tons of primary tones, each creating their own harmonics, they add up in energy. This then competes with what is normally in music in that range. High frequencies drop exponentially in music so it does not take a lot to change their overall energy and with it, cause the high frequencies to be exaggerated.
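A rough sketch of how several distortion products add up in energy. It assumes the products are uncorrelated so that their powers simply sum, and the ten products at -90 dB are hypothetical values chosen for illustration, not measurements.

```python
import math

def power_sum_db(levels_db):
    """Combine uncorrelated components given in dB by summing their powers."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)

# Ten hypothetical distortion products, each 90 dB below the signal
combined = power_sum_db([-90.0] * 10)
print(round(combined, 1))  # -80.0
```

Under that assumption, every tenfold increase in the number of equal-level products raises the total by 10 dB.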

We have all heard this effect in extreme distortion of amplifiers where the sound gets harsh. In lower amounts, this harshness is not there and gets replaced with exaggeration of high frequencies.

Another effect is hiding of low level detail in music. Take our multitone test:

Clearly no detail below -70 dB will be audible in the sea of intermodulation distortion here. That gives us just 50 dB of distortion-free range relative to 120 dBSPL playback.

Theriverlethe · Oct 26, 2019

amirm said:
Your use of dB without unit makes these conclusions hard to follow. If you mean dBSPL, then that is just a matter of amplification. A 70 dBFS signal can be 70 dBSPL or 100 dBSPL depending on how loud I play it. The latter lifts the signal from threshold of hearing (which by the way is lower than 0 dB) to +30 dBSPL.

In literature, the max dB SPL is usually assumed to be 120 dB (based on research of how loud live music can be). In that case, if you play your 70 dBFS signal at 120 dBSPL, then the distortion products will land at +50 dBSPL or way, way higher than threshold of hearing. Distortion products therefore need to be 120 dB or so lower to be below threshold of hearing (in mid frequencies).

In addition, since music has tons of primary tones, each creating their own harmonics, they add up in energy. This then competes with what is normally in music in that range. High frequencies drop exponentially in music so it does not take a lot to change their overall energy and with it, cause the high frequencies to be exaggerated.

We have all heard this effect in extreme distortion of amplifiers where the sound gets harsh. In lower amounts, this harshness is not there and gets replaced with exaggeration of high frequencies.

Another effect is hiding of low level detail in music. Take our multitone test:

Clearly no detail below -70 dB will be audible in the sea of intermodulation distortion here. That gives us just 50 dB of distortion-free range relative to 120 dBSPL playback.

For the uninitiated, there is no such thing as a 70dBFS signal. dBFS is “decibels relative to full scale,” meaning that the highest possible level is 0dBFS. A dBFS level maps to dB SPL (sound pressure level) through a fixed offset set by the playback gain, so using e.g. -70dB seemed adequate for the purpose of this discussion. Amir is correct that a -70dBFS signal would produce 50dB SPL if 0dBFS produces 120dB SPL.
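As a minimal sketch of that mapping (the function name and the validity check are assumptions made for illustration), converting a dBFS level to dB SPL only requires knowing what SPL 0dBFS produces at the current volume setting:

```python
def dbfs_to_spl(level_dbfs, spl_at_full_scale):
    """Map a digital level in dBFS to dB SPL, given the SPL produced by 0 dBFS."""
    if level_dbfs > 0:
        raise ValueError("dBFS levels cannot exceed 0")
    return spl_at_full_scale + level_dbfs

print(dbfs_to_spl(-70.0, 120.0))  # 50.0
print(dbfs_to_spl(-70.0, 100.0))  # 30.0
```

The same -70dBFS component therefore lands at a different SPL for every volume setting, which is why bare "dB" figures are ambiguous.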

Where Amir goes wrong is in largely ignoring the masking research and misapplying Fletcher-Munson type curves. These curves establish an absolute threshold of human hearing using test tones in as close to a silent environment as possible. Why is it that normal human speech (60-70dB SPL), for which the human auditory system is most attuned, becomes completely inaudible at a rock concert (110dB SPL+)? This is called masking. Your friend will have to scream at you in the 90-100dB SPL range in order to be understood. Being generous, this leaves us maybe 30dB of dynamic range at very high SPL with a wide-bandwidth audio source. Any "micro-detail" in the distortion spectrum Amir shows in this chart is likely to be masked by the high-amplitude signal long before it gets masked or somehow perceptibly altered by distortion.

As SPL decreases, so does the masking effect. Fortunately, so does the kind of non-linear distortion we're talking about. E.g., an audio system that shows -70dBFS noise with a 0dBFS input signal may only show -90dBFS noise with a -10dBFS signal. Also, the 70dB SPL dynamic range figure assumes a pure tone with high-order distortion. E.g., a 7th harmonic at -70dBFS is way more likely to be heard next to a pure tone than a 2nd or 3rd harmonic closer to the masking tone. With complex signals like music, or the chart Amir helpfully provides, dynamic range of human hearing between signal and noise is likely much lower.

This non-linear distortion should not be confused with background noise like hum or hiss, which may become annoying with no input signal at -80dBFS or lower.

GrimSurfer · Oct 26, 2019

Theriverlethe said:
For the uninitiated, there is no such thing as a 70dBFS signal. dBFS is “decibels relative to full scale,” meaning that the highest possible level is 0dBFS. A dBFS level maps to dB SPL (sound pressure level) through a fixed offset set by the playback gain, so using e.g. -70dB seemed adequate for the purpose of this discussion. Amir is correct that a -70dBFS signal would produce 50dB SPL if 0dBFS produces 120dB SPL.

Where Amir goes wrong is in largely ignoring the masking research and misapplying Fletcher-Munson type curves. These curves establish an absolute threshold of human hearing using test tones in as close to a silent environment as possible. Why is it that normal human speech (60-70dB SPL), for which the human auditory system is most attuned, becomes completely inaudible at a rock concert (110dB SPL+)? This is called masking. Your friend will have to scream at you in the 90-100dB SPL range in order to be understood. Being generous, this leaves us maybe 30dB of dynamic range at very high SPL with a wide-bandwidth audio source. Any "micro-detail" in the distortion spectrum Amir shows in this chart is likely to be masked by the high-amplitude signal long before it gets masked or somehow perceptibly altered by distortion.

As SPL decreases, so does the masking effect. Fortunately, so does the kind of non-linear distortion we're talking about. E.g., an audio system that shows -70dBFS noise with a 0dBFS input signal may only show -90dBFS noise with a -10dBFS signal. Also, the 70dB SPL dynamic range figure assumes a pure tone with high-order distortion. E.g., a 7th harmonic at -70dBFS is way more likely to be heard next to a pure tone than a 2nd or 3rd harmonic closer to the masking tone. With complex signals like music, or the chart Amir helpfully provides, dynamic range of human hearing between signal and noise is likely much lower.

This non-linear distortion should not be confused with background noise like hum or hiss, which may become annoying with no input signal at -80dBFS or lower.

Good post, @Theriverlethe.

Masking is often taken out of its proper context. If one is listening to two tones (one signal and one artifact) at 80 and 10 dB respectively, it's easy to prove masking by using a calculator. The greater the disparity between the signal and artifact, the more the masking. Past a certain point (~30-40 dB) it's impossible for the human ear to discern the artifact, even as a node.

Change the frequency, and all bets are off. The disparity between the spl of the signal and artifact becomes far less of a factor. Tonality, as discerned by the human ear, will change.

Things get messy, ranging from ambiguous to downright untrue, when people make very broad assumptions about the ability to hear distortion or noise relative to reference levels when frequency data is absent. In such cases where inaudibility is truly desirable (as it should be to the audiophile), one should assume that the signal and artifacts are separated by frequency. This may not end up being correct in a specific case but erring on the side of conservatism rarely ends in disappointment, listening fatigue, or coloration, etc.

Blumlein 88 · Oct 26, 2019

http://hyperphysics.phy-astr.gsu.edu/hbase/Sound/mask.html

This link has a nice simple explanation of the various aspects of masking.
http://www2.bcs.rochester.edu/courses/crsinf/221/14.pdf

watchnerd · Oct 26, 2019

Theriverlethe said:
This non-linear distortion should not be confused with background noise like hum or hiss, which may become annoying with no input signal at -80dBFS or lower.

I find it even worse when it's right at the edge of perception.....like hearing it with my head turned in one direction, but not in another.

To me that's even worse than constant.

Old Listener · Nov 2, 2019

andreasmaaan said:
I'm not sure anyone is actually interested in this topic. But since I am, and I've been mulling it over and doing a lot of reading over the past few days, I thought I'd float a few ideas and some summaries of the existing research, just in case anyone is interested.

First of all, there have now been a number of "typical" distortion tests of the kind I described in the OP that have shown surprisingly low distortion thresholds (i.e. in the range of 0.01%).

Here's a neat summary of some of the findings from an article by Gaskell from 2011:

View attachment 16299

As we can see, the results are all over the place. This is partly because of the wide variety of methods and stimuli used, and the fact that many of these studies have sought to find not audibility thresholds, but rather objectionability thresholds.

Still, the 1980 study by Petri-Larmi et al, which was a well-conducted study despite some limitations (primarily the use of vinyl as the source), did find that experienced listeners who had previously shown particularly good ability to discern distortion were able to reliably detect distortion as low as 0.003% "RMS" with program material. The Gaskell study from which I took that table also found some subjects could reliably discern distortion at around 0.003% "RMS".

0.003% is equivalent to 90dB below the signal. This seems surprising when we look at research into masking.
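For reference, a small sketch of the percent-to-dB conversion used throughout this post. It treats the distortion percentage as an amplitude ratio (hence 20·log10); the function names are just illustrative.

```python
import math

def percent_to_db(percent):
    """Distortion as an amplitude ratio in percent -> level relative to the signal in dB."""
    return 20 * math.log10(percent / 100)

def db_to_percent(db):
    """Level relative to the signal in dB -> amplitude ratio in percent."""
    return 100 * 10 ** (db / 20)

for pct in (0.1, 0.03, 0.01, 0.003):
    print(f"{pct}% -> {percent_to_db(pct):.1f} dB")
# 0.1% -> -60.0 dB
# 0.03% -> -70.5 dB
# 0.01% -> -80.0 dB
# 0.003% -> -90.5 dB
```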

Here's a classic graph from Fastl and Zwicker, showing masking thresholds for a 1kHz tone masker at various SPLs:

View attachment 16303

Basically, what the graph shows is that when, say, a 1kHz tone at 90dB is the masker, a maskee tone will be audible only if it is at least (for example) about 60dB at 1.5kHz, 55dB at 2kHz, 45dB at 5kHz, etc. For a 70dB masker tone at 1kHz, the audibility thresholds for a maskee tone would be about 35dB at 1.5kHz, 25dB at 2kHz, 0dB (absolute threshold of audibility) at 5kHz, etc.
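To make the reading of the graph concrete, here is a toy lookup built only from the approximate values quoted in the paragraph above. The table and function names are invented for illustration, and a real check would interpolate the full curves rather than use six points.

```python
# Approximate masked thresholds (dB SPL) for a 1 kHz tone masker, as quoted above.
# Outer key: masker level in dB SPL. Inner key: maskee frequency in Hz.
THRESHOLDS_1KHZ_MASKER = {
    90: {1500: 60, 2000: 55, 5000: 45},
    70: {1500: 35, 2000: 25, 5000: 0},
}

def maskee_is_audible(masker_db, maskee_hz, maskee_db):
    """True if the maskee level exceeds the quoted masked threshold."""
    threshold = THRESHOLDS_1KHZ_MASKER[masker_db][maskee_hz]
    return maskee_db > threshold

# 2nd harmonic (2 kHz) at 0.1% (-60 dB) of a 90 dB fundamental: 30 dB SPL
print(maskee_is_audible(90, 2000, 90 - 60))  # False
# 5th harmonic (5 kHz) at -60 dB of a 70 dB fundamental: 10 dB SPL
print(maskee_is_audible(70, 5000, 70 - 60))  # True
```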

The key things to note here are that:

upward masking (masking of tones higher in frequency) is actually quite effective relative to typical levels of harmonic distortion generated by decent electronic components, especially when the maskee is close in frequency to the masker.

as SPL increases, the bandwidth widens in which upward masking (but not downward masking) is effective.

Also note, however, it's not quite this simple. As the maskee gets closer in frequency to a tone masker, beat tones may become audible. Moreover, nonlinearities in the ear itself tend to create audible secondary beat tones at specific frequencies. The following graph shows these effects in more detail for an 80dB 1kHz masker:

View attachment 16306

Although these effects slightly complicate the picture, it's nevertheless the case that even the ear's nonlinearities do not produce audible distortions until the maskee rises to within 60dB of the masker (0.1%) - at this masker frequency and SPL at least (although the trends are similar across the board).

Here's a similar graph using narrow-bandlimited Gaussian noise rather than a tone as the masker:

View attachment 16305

Although the noise masker tends to have a higher and wider peak (which is predicted by its wider bandwidth), farther away from the peak the masking curve is similar (for example, you can see that a 60dB tone masker and 60dB noise masker centred at 1kHz give about the same masking threshold of about -50dB at 2kHz; this is far enough away from the centre frequency of the bandlimited noise for the differences between noise masker and tone masker to be negligible).

So far I've only shown graphs for various masker SPLs at 1kHz. Now here's a graph showing masking thresholds for 60dB bandlimited Gaussian noise at a variety of frequencies:

View attachment 16301

The thing to note here is the general trend that, at lower frequencies, the bandwidth of a masker's effectiveness tends to be wider. This is well-established to be particularly the case below 500Hz.

So, to summarise, we can say the following about masking:

The lower in frequency the masker, the more effective (wider bandwidth).

The higher in level the masker, the more effective (wider upward bandwidth).

Tone maskers and noise maskers behave similarly (although noise maskers are slightly more effective, particularly in the frequency range close to the noise).

But perhaps what's most interesting about these data is that, taking them as a whole, it seems hard to imagine that maskees below about -70dB and between the frequency of the fundamental and H4 (i.e. 4 x the frequency of the masker) could be audible under any circumstances. This is because, when the masker is 70dB in level or lower, any maskee below -70dB will tend to fall below the absolute threshold of audibility, while when the masker rises to levels above about 70dB, the range of upward masking widens and, accordingly, maskees higher in frequency (unless much higher in frequency) will tend to fall below the masking threshold. This should mean that under no circumstances could any maskee below -70dB and between H1 and H4 be audible, in any frequency range.

In turn, this should imply that so long as there are no harmonics above H4 / -70dB (0.03%), there should be no audible distortion (of course, I've not addressed here maskees lower in frequency than the masker, a classic example of which would be the IM product given by F2-F1).

Even above 0.03%, a maskee would need to be relatively far from the masker to be audible. And even above H4, a maskee would need to be above absolute thresholds of audibility (the ear becomes less sensitive at lower and higher frequencies, as shown by the dotted line in the above graphs).

Moreover, in a typical music signal, a wide range of frequencies are present. This would seem unlikely to leave a lot of space for unmasked harmonics and/or IM products to become audible, unless rather high in level (certainly compared to what good electronics are capable of).

What could be happening then in these DBTs in which subjects are able to distinguish distortions as much as an order of magnitude below 0.03%?

I have a couple of theories. The first one is that distortions combine to create a wider bandwidth noise-like signal that can rise above the masking threshold provided by the signal. I'm not aware of any studies into masking of noise by noise, but these would seem to be a good place to start to try to get a bit closer to examining this. On the other hand, the signal itself will tend to create a wide bandwidth masker wherever it is creating wideband distortion (although not necessarily, see my next idea).

Another possibility is that what subjects are reliably able to hear in these tests are IM products lower in frequency than the spectral content of the stimuli. For example, if subjects were played a musical passage at 100dB with its spectral content concentrated in the midrange and treble, and with little content below say 500Hz, the signal content itself would do nothing to mask IM products falling below the lower cutoff of the signal content. An IM product at say -80dB and one octave below the main spectral content of the signal may be completely unmasked, meaning that so long as it rises above absolute audibility thresholds, it will be audible.
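A quick sketch of this idea, using two hypothetical tones and a hypothetical lower cutoff for the programme material. Following the reasoning above, it simply treats any IM product below that cutoff as unmasked, which ignores the (weaker) downward masking.

```python
def im_products(f1, f2):
    """Second- and third-order intermodulation frequencies (Hz) for two tones."""
    lo, hi = sorted((f1, f2))
    return {
        "f2-f1": hi - lo,
        "f1+f2": hi + lo,
        "2f1-f2": abs(2 * lo - hi),
        "2f2-f1": 2 * hi - lo,
    }

def products_below_cutoff(f1, f2, lower_cutoff_hz):
    """IM products that fall below the lowest frequency present in the signal."""
    return {name: f for name, f in im_products(f1, f2).items() if f < lower_cutoff_hz}

# Tones at 2000 Hz and 2400 Hz, signal content assumed to stop at 500 Hz
print(products_below_cutoff(2000, 2400, 500))  # {'f2-f1': 400}
```

Only the difference tone lands below the signal band here; the other products fall inside the band, where the signal itself can mask them.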

Some evidence seems to support this theory. For example, in Petri-Larmi's 1980 study, these thresholds were found for various types of musical content:

View attachment 16309

It seems plausible to speculate that perhaps the piano and choir samples were most revealing of distortion because they contained passages with the least low frequency content, but plenty of IM-producing mid-high frequency content. However, we don't know what the spectral content of these samples was, and moreover, it seems to me that the same could likely be said of the violin and harpsichord samples.

A third possible theory is that, although distortion itself may not be directly audible (i.e. audibly adding to the signal tonally) at levels in the range of 0.003-0.03%, due to the critical bands present in our auditory system, addition of distortion at these levels may result in the stimulus seeming louder than an undistorted stimulus of the same absolute SPL. It's well-established that, even if two stimuli have the same absolute SPL level, the stimulus with the wider bandwidth, or the most even distribution of sound pressure across the widest bandwidth, will be perceived to be louder due to the functioning of our auditory system's critical bands. It may thus be speculated that, even if harmonics do not exceed audibility thresholds in their own right, if they extend the bandwidth of a stimulus or distribute sound pressure across a wider bandwidth, subjects will perceive that stimulus to be louder than a narrower bandwidth signal. It's well-established that subjects tend to prefer music that is, or that seems to be, louder.

Another factor that lends some weight to this theory is that, in many distortion audibility studies, subjects have tended to prefer musical stimuli that were distorted enough to be reliably distinguished from undistorted stimuli, so long as these were not distorted enough for subjects to be able to identify them as "sounding distorted". It seems reasonable to at least speculate that this may have been because these distorted music stimuli seemed louder than undistorted stimuli due to their wider bandwidth, despite having very close to the same absolute SPL.

Again, we don't know enough about the spectral content or the distortion components of the stimuli used in these studies to do any more than speculate here.

Finally (for now), all I've discussed completely sidesteps issues surrounding temporal masking, i.e. masking of a signal by a masker that occurs either before or after it in time. I don't think this should be a major factor in most electronics, but I'm far from an expert on electronics.

Anyway, I hope other members here are interested enough in this topic to offer their comments and maybe introduce new ideas or evidence that might get us a bit closer to making sense of all these data.

EDITED: with a few extra thoughts and some clarifications...

Thank you for starting this thread and providing such detailed information. Much to think about with respect to amp+speaker distortions (my interest right now.)
