
Volume Matching JND

Speedskater

Major Contributor
Joined
Mar 5, 2016
Messages
1,639
Likes
1,360
Location
Cleveland, Ohio USA
If a listener can say that one sounds better than the other, then the volume difference is more than JND.
With a JND, the listener can't identify why they sound different.
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
If a listener can say that one sounds better than the other, then the volume difference is more than JND.
With a JND, the listener can't identify why they sound different.
You mean at this JND threshold there is a fluctuating perception of difference/no difference?
I think what you're saying is that this is how the JND is established, i.e. where that threshold line is found by step testing? So if they can't say there's a difference, it's below the JND & when they can statistically identify a difference it's above the JND.
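As a rough illustration of what "statistically identify a difference" could mean at a single step size, here is a minimal sketch. It assumes a two-alternative forced-choice trial where pure guessing is right half the time, and the function name `p_value_at_least` is hypothetical, not taken from any post or paper in this thread.

```python
from math import comb


def p_value_at_least(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """One-sided binomial probability of scoring `correct` or more out of
    `trials` by guessing alone, with per-trial success chance `p_chance`."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(
        comb(trials, k) * p_chance**k * (1.0 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Assumed example: 12 correct answers out of 16 trials at one level step.
    print(f"p = {p_value_at_least(12, 16):.4f}")
```

A step size where listeners keep beating a chosen significance level would sit above the threshold; one where they do not would sit below it. The cut-off and the number of trials are choices for the experimenter, not something the sketch decides.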
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,434
If a listener can say that one sounds better than the other, then the volume difference is more than JND.
With a JND, the listener can't identify why they sound different.
This isn't the usual definition of JND. Usually JND for loudness in this case is the minimum difference at which a listener will reliably hear one at a higher or lower level vs another signal. You can have a JND for frequency which would be the minimum ratio between two frequencies at which a listener will hear a higher or lower frequency.

The JND for hearing something as different (unspecified) appears to be lower than the JND for hearing a loudness difference. Which is why volume matching by ear makes two sources sound the same loudness, but they might still be heard as differing at smaller volume differences than the loudness JND.

Studying Weber's law might be helpful.
https://en.wikipedia.org/wiki/Weber–Fechner_law

And here is one bit of research showing a reasonably good correspondence with actually listening.

http://m.audres.org/cel/loud/NeelyAllen1998ISH.pdf

Weber's law would predict around 0.5 dB to be the JND for most sensations. You will see references to exploration of the near-miss to Weber's law. Equations can do a pretty good job of predicting various sensation- or perception-based JNDs at various levels, but they are usually off by this near-miss component.
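To make the link between a Weber fraction and a figure in dB concrete, here is a minimal sketch. It assumes the law is applied to sound intensity, so that a constant ratio ΔI/I corresponds to a constant level step of 10·log10(1 + ΔI/I) dB. The function names are hypothetical, and the 0.5 dB value is simply the one quoted above.

```python
import math


def level_step_db(weber_fraction: float) -> float:
    """Level change in dB for a relative intensity change dI/I."""
    if weber_fraction <= -1.0:
        raise ValueError("weber_fraction must be greater than -1")
    return 10.0 * math.log10(1.0 + weber_fraction)


def weber_fraction_from_db(delta_db: float) -> float:
    """Relative intensity change dI/I for a level change in dB."""
    return 10.0 ** (delta_db / 10.0) - 1.0


if __name__ == "__main__":
    k = weber_fraction_from_db(0.5)  # 0.5 dB is the figure from the post
    print(f"0.5 dB corresponds to dI/I of about {k:.3f}")
    print(f"round trip: {level_step_db(k):.3f} dB")
```

Under strict Weber's law the fraction, and so the dB step, would be the same at every listening level; the near-miss is the observation that measured values drift away from that constant.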
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
This isn't the usual definition of JND. Usually JND for loudness in this case is the minimum difference at which a listener will reliably hear one at a higher or lower level vs another signal. You can have a JND for frequency which would be the minimum ratio between two frequencies at which a listener will hear a higher or lower frequency.

The JND for hearing something as different (unspecified) appears to be lower than the JND for hearing a loudness difference. Which is why volume matching by ear makes two sources sound the same loudness, but they might still be heard as differing at smaller volume differences than the loudness JND.
Yes & for listening to music (which is mostly what is used in listening tests) it appears that anecdotal reports (I haven't seen any papers quoted) indicate 2-3dB is the JND
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
Here's research which throws the whole concept of volume matching using a 1KHz tone into question & the concept that we are less sensitive to amplitudes of LF & HF frequencies & more sensitive to mid frequencies
"Effects of relative and absolute frequency in the spectral weighting of loudness"

Which states that "The loudness of broadband sound is often modeled as a linear sum of specific loudness across frequency bands. In contrast, recent studies using molecular psychophysical methods suggest that low and high frequency components contribute more to the overall loudness than mid frequencies. In a series of experiments, the contribution of individual components to the overall loudness of a tone complex was assessed using the molecular psychophysical method as well as a loudness matching task."

"The stimuli were two spectrally overlapping ten-tone complexes with two equivalent rectangular bandwidth spacing between the tones, making it possible to separate effects of relative and absolute frequency."

"The lowest frequency components of the “low-frequency” and the “high-frequency” complexes were 208 and 808 Hz, respectively. Perceptual-weights data showed emphasis on lowest and highest frequencies of both the complexes, suggesting spectral-edge related effects. Loudness matching data in the same listeners confirmed the greater contribution of low and high frequency components to the overall loudness of the ten-tone complexes. Masked detection thresholds of the individual components within the tone complex were not correlated with perceptual weights. The results show that perceptual weights provide reliable behavioral correlates of relative contributions of the individual frequency components to overall loudness of broadband sounds."
 

Phelonious Ponk

Addicted to Fun and Learning
Joined
Feb 26, 2016
Messages
859
Likes
215
So it seems there is a lot of reason to believe that level differences are perceived as quality differences, though at what level is arguable. Given such knowledge, why would one not simply level match as closely as your equipment will measure the difference? There is no reason to believe that unmatched levels will give a better result.

Tim
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
So it seems there is a lot of reason to believe that level differences are perceived as quality differences, though at what level is arguable. Given such knowledge, why would one not simply level match as closely as your equipment will measure the difference? There is no reason to believe that unmatched levels will give a better result.

Tim
Yep & it would appear that there's no reason to criticise or consider invalid listening tests which are not matched to within 0.1dB; matching to 2 to 3dB seems appropriate
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,434
Yes & for listening to music (which is mostly what is used in listening tests) it appears that anecdotal reports (I haven't seen any papers quoted) indicate 2-3dB is the JND
For loudness yes for difference no.
 

Phelonious Ponk

Addicted to Fun and Learning
Joined
Feb 26, 2016
Messages
859
Likes
215
Yep & it would appear that there's no reason to criticise or consider invalid listening tests which are not matched to within 0.1dB; matching to 2 to 3dB seems appropriate

If what seems appropriate becomes well-established, I'd agree. In the meantime, I believe I would match as closely as my equipment will allow.

Tim
 

Phelonious Ponk

Addicted to Fun and Learning
Joined
Feb 26, 2016
Messages
859
Likes
215
For loudness yes for difference no.

If this is accurate, it's even more dangerous to preference testing. If you can detect a difference but can't tell it's loudness, it would be very natural to attribute it to something else.

Tim
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
If what seems appropriate becomes well-established, I'd agree. In the meantime, I believe I would match as closely as my equipment will allow.

Tim
Sure, no problem with anybody matching as closely as equipment allows & it seems that 2-3dB is well-established reading the posts here - unless there are some DBTs that indicate otherwise.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,595
Likes
239,636
Location
Seattle Area

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,434
Yep & it would appear that there's no reason to criticise or consider invalid listening tests which are not matched to within 0.1dB; matching to 2 to 3dB seems appropriate
I would ask what Tim is asking. Why not do the match and remove the issue? Even when using stepped volume in 1 dB steps you will be able to achieve .5 dB or less difference. Under what conditions is that not easy to do?
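The half-step claim can be checked with a small sketch. It assumes one device has a stepped control with uniform steps and that the starting level difference has already been measured; `closest_step` is a hypothetical helper written for this illustration.

```python
from typing import Tuple


def closest_step(diff_db: float, step_db: float = 1.0) -> Tuple[int, float]:
    """Return (steps, residual_db) for a stepped volume control.

    diff_db  -- measured level difference to cancel, in dB
    step_db  -- size of one volume step, in dB (assumed uniform)
    steps    -- whole number of steps that best cancels the difference
    residual_db -- mismatch left over after applying those steps
    """
    if step_db <= 0:
        raise ValueError("step_db must be positive")
    steps = round(diff_db / step_db)
    residual_db = diff_db - steps * step_db
    return steps, residual_db


if __name__ == "__main__":
    # Assumed example: device B measures 2.3 dB hotter than device A.
    steps, residual = closest_step(2.3)
    print(f"turn B down {steps} steps, residual mismatch {residual:+.2f} dB")
```

Because the nearest whole step is never more than half a step away, the leftover mismatch with 1 dB steps cannot exceed 0.5 dB in magnitude, provided the starting difference was actually measured.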
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
I would ask what Tim is asking. Why not do the match and remove the issue? Even when using stepped volume in 1 dB steps you will be able to achieve .5 dB or less difference. Under what conditions is that not easy to do?
As I said I've no problem with that but then it seems that ABX tests are dismissed which haven't been matched to <0.5dB. Your posts here demonstrate you & AJ doing exactly this:
http://audiosciencereview.com/forum...hat-is-measurable-thread.214/page-6#post-6223
(quote=AJ Soundfield)Well, it should be obvious if the output voltages from the T&A weren't measured/precisely matched for A vs B....then there was no "test", never mind the stats.

(quote=Blumlein88)I concur with this. The ADC/DAC in use has switched relay level control. Other than the top two steps, it used 1db per step. It appears once setup they simply switched between inputs on the pre. They may have gotten lucky and the match was good. Most likely was a .5 dB mismatch. We don't know, and without good level matching this means nothing.
And you restate the same criteria later in the thread here:
"If you can only change levels by 1 db, then you are almost surely looking at some mismatch in level. If you do the job properly that mismatch would not exceed .5 db. Call it reasonable doubt. Doubt which could have been assuaged by them explaining how they match levels and how closely they were matched. It also seems unnecessary for me to have to explain that to a DAC designer. Makes me think you are only interested in obstructing useful discussion by pest-like responses."
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,880
Likes
16,667
Location
Monument, CO
The tests I helped perform agree with the above posts in that detecting a difference in loudness required a much larger change than simply detecting a difference. I obviously did not make that clear, perhaps obfuscating it by mentioning the time factor. Level matching for something like ABX testing, looking for a positive result as being any perceived difference, provides a different JND than changing the volume and asking if it got louder or softer. The latter relates to phons and things like the Fletcher-Munson curves, natch.

We did some testing using headphones to take the speakers and room out of the equation.

At that time (early 80's) much ado was being made about DBT and ABX testing and the need for precise (~0.1 dB) level matching. We were trying to see if we could prove or disprove that theory. There was much debate since ABX was fairly new and accepted theory and practice showed 1 dB was about the detectable limit for a change in loudness. Fast-switching DBT and ABX systems blew that out of the water, showed a clear preference for a system only slightly louder, and forced a rethink of what level matching meant. Most dealers had little in the way of test equipment back then and level matching by ear was common. As were savvy but unscrupulous salesmen turning the volume up a hair on the component with the biggest commission... It was the latter that actually drove my boss to look into the issue as he lost a big sale to that practice, at least so far as everyone could tell (and the salesman supposedly admitted it over drinks one night). My boss went on a quest to prove "our" products were better in a matched test.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,434
Here's research which throws the whole concept of volume matching using a 1KHz tone into question & the concept that we are less sensitive to amplitudes of LF & HF frequencies & more sensitive to mid frequencies
"Effects of relative and absolute frequency in the spectral weighting of loudness"

Which states that "The loudness of broadband sound is often modeled as a linear sum of specific loudness across frequency bands. In contrast, recent studies using molecular psychophysical methods suggest that low and high frequency components contribute more to the overall loudness than mid frequencies. In a series of experiments, the contribution of individual components to the overall loudness of a tone complex was assessed using the molecular psychophysical method as well as a loudness matching task."

"The stimuli were two spectrally overlapping ten-tone complexes with two equivalent rectangular bandwidth spacing between the tones, making it possible to separate effects of relative and absolute frequency."

"The lowest frequency components of the “low-frequency” and the “high-frequency” complexes were 208 and 808 Hz, respectively. Perceptual-weights data showed emphasis on lowest and highest frequencies of both the complexes, suggesting spectral-edge related effects. Loudness matching data in the same listeners confirmed the greater contribution of low and high frequency components to the overall loudness of the ten-tone complexes. Masked detection thresholds of the individual components within the tone complex were not correlated with perceptual weights. The results show that perceptual weights provide reliable behavioral correlates of relative contributions of the individual frequency components to overall loudness of broadband sounds."

Actually I don't see that this calls into question volume matching at 1 khz. For one thing it didn't extend the testing as high as 1 khz. Typically, unless gear has rather large response issues at either end, matching at 1 khz would result in the same relative level as matching at other frequencies. If you prefer 208 hz then use that. If the gear has response differences exceeding .25 db at either end those are audible and best possible matching won't change that audibility due to FR differences.

I repeat, yet again, once more, why harp on this? Match the levels and move on. You appear only to be angling to include sloppy testing matched within 2-3 db which is sloppy beyond reason. 1 db is possible by ear, so matching worse than that is surely not much of an attempt at good testing. And yes, I did, do and will reject any test so sloppy as matched to those looser tolerances.

Further, I have been involved with and done testing matching levels by ear. It is more bother, and takes more time switching back and forth to do that than it takes to simply drop your multi-meter across the speaker leads and send a tone thru. So yet one more reason not to do it that way. It is more difficult, takes more time and is less accurate. If someone wishes to convince me of their carefully done listening results yet can't be bothered to own/use a multimeter, I won't lose sleep over dismissing those results. If you still find them worth consideration, repeat the test with matched levels if you wish to convince others.
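For anyone doing the multimeter check, the arithmetic is short enough to sketch. This assumes two AC voltage readings taken across the same load with the same test tone. The function names and the example readings are made up for illustration.

```python
import math


def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of A relative to B in dB, from two RMS voltage readings."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("voltages must be positive")
    return 20.0 * math.log10(v_a / v_b)


def voltage_tolerance_percent(tolerance_db: float) -> float:
    """Percentage voltage difference that corresponds to a dB tolerance."""
    return (10.0 ** (tolerance_db / 20.0) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical readings: 2.00 V from component A, 1.89 V from component B.
    print(f"A is {level_difference_db(2.00, 1.89):+.2f} dB relative to B")
    # How close the two readings must be for a 0.1 dB match.
    print(f"0.1 dB is about {voltage_tolerance_percent(0.1):.2f}% in voltage")
```

The second function shows why the meter's resolution matters: a 0.1 dB match means the two readings agree to a little over 1%, which most multimeters can resolve on a steady tone, assuming the meter is accurate at the test frequency.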
 
OP
J

John Kenny

Addicted to Fun and Learning
Joined
Mar 25, 2016
Messages
568
Likes
18
Actually I don't see that this calls into question volume matching at 1 khz. For one thing it didn't extend the testing as high as 1 khz. Typically, unless gear has rather large response issues at either end, matching at 1 khz would result in the same relative level as matching at other frequencies.
I initially read it that way, but on more careful reading I realised what was being said here: "The lowest frequency components of the “low-frequency” and the “high-frequency” complexes were 208 and 808 Hz, respectively." There were LF complexes in which the lowest frequency component was 208Hz & there were HF complexes in which the lowest frequency was 808Hz. So, no, you can't say "it didn't extend the testing as high as 1 khz."
If you prefer 208 hz then use that. If the gear has response differences exceeding .25 db at either end those are audible and best possible matching won't change that audibility due to FR differences.

I repeat, yet again, once more, why harp on this? Match the levels and move on. You appear only to be angling to include sloppy testing matched within 2-3 db which is sloppy beyond reason. 1 db is possible by ear, so matching worse than that is surely not much of an attempt at good testing. And yes, I did, do and will reject any test so sloppy as matched to those looser tolerances.
What you call "harping on", I call investigating - it's one of the areas I like about science - the investigative aspect

Further, I have been involved with and done testing matching levels by ear. It is more bother, and takes more time switching back and forth to do that than it takes to simply drop your multi-meter across the speaker leads and send a tone thru. So yet one more reason not to do it that way. It is more difficult, takes more time and is less accurate. If someone wishes to convince me of their carefully done listening results yet can't be bothered to own/use a multimeter, I won't lose sleep over dismissing those results. If you still find them worth consideration, repeat the test with matched levels if you wish to convince others.
OK, thanks - the practicalities do come into it. I'm not favouring one side or the other, I just wanted to hear people's opinions & find any useful research which shone some light on the JND for volume matching
 

AJ Soundfield

Major Contributor
Joined
Mar 17, 2016
Messages
1,001
Likes
68
Location
Tampa FL
Here's research which throws the whole concept of volume matching using a 1KHz tone into question & the concept that we are less sensitive to amplitudes of LF & HF frequencies & more sensitive to mid frequencies
"Effects of relative and absolute frequency in the spectral weighting of loudness"

That appears to be an acoustics science journal, not an audiophile one. Are the results you're hanging your hat on anecdotal in nature? Or were they obtained via controlled listening tests?
I'm not a member like you, so please clarify what methods they used, what the quality checks were, how they were proctored, etc., so that we know the results are rigorously scientifically valid.
 