
New Research on Audibility of Distortion in Headphones

The song choices highly impact this assessment. Why did you use preference test tracks for this use???
That's a good question. Sean suggested them. For now I'll just quote the upcoming AES paper that we'll present at the Copenhagen convention. Sean wrote this part: "Both tracks were used in distortion studies and provide a means for comparison. Fast Car has a wideband spectrum with considerable bass that can excite non-linearities in the transducer and modulate the female vocal. Spanish Harlem is instrumentally sparser and provides less masking of distortion generated by bass and produces larger differences in listeners' preference." I'll ask him more.
 
Subjectivity was really on our work table, but you can't prefer what you don't hear.
You won't hear real impairments if the test protocol is not transparent enough. You also dilute your results by including people who don't have proper acuity.

This is not like a preference test, where differences are always audible and every listener can have a valid opinion.
 
That's a good question. Sean suggested them. For now I'll just quote the upcoming AES paper that we'll present at the Copenhagen convention. Sean wrote this part: "Both tracks were used in distortion studies and provide a means for comparison. Fast Car has a wideband spectrum with considerable bass that can excite non-linearities in the transducer and modulate the female vocal. Spanish Harlem is instrumentally sparser and provides less masking of distortion generated by bass and produces larger differences in listeners' preference." I'll ask him more.
Both are inappropriate for this kind of testing. Distortion usually dominates in the bass. You have to have tracks that emphasize this. Again, development of a control would have shown this.
 
You objectively and hence provably generate a distorted clip. You then listen to it with your trained listeners to make sure impairments are audible. This is sometimes called a low anchor.
Yes I see, but a low anchor assumes that you anchor to what we know will be disliked, so that the rest is ranked in comparison to that. It also assumes that the clips can be distinguished, like equalization curves for example. A low anchor has no benefit if the listeners can't differentiate between the clips to be ranked.
 
Yes I see, but a low anchor assumes that you anchor to what we know will be disliked,
Distortion at lower levels is not about liking something but pure detection. It has nothing to do with preference. So drawing anything from that body of work is wrong.
 
Both are inappropriate for this kind of testing. Distortion usually dominates in the bass. You have to have tracks that emphasize this. Again, development of a control would have shown this.
Maybe, but the listeners were also trained with Fast Car. That's one of the stimuli used for the Klippel listening tests. In any case, I think we also make clear that the findings are only true within the scope of what we tested. We are not saying that distortion is universally inaudible for any headphones with any music.
 
Distortion at lower levels is not about liking something but pure detection. It has nothing to do with preference. So drawing anything from that body of work is wrong.
To be clear, you are saying that having a low anchor would have helped them detect distortion, although they were not able to differentiate the recording from the clean file. I think you are talking about the similarity test. But before that we ran an adaptive staircase ABX. This IS a low-anchored test. You start with distortion levels that 100% of people hear and go down in distortion until they can't reliably differentiate.
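As a rough illustration of the staircase idea (start where everyone hears the distortion, then step down until detection becomes unreliable), here is a minimal Python sketch of a weighted up-down track. It is not the procedure used in the study: the simulated listener, the -40 dB "true" threshold, the 2 dB slope, the step sizes, and every function name are assumptions made up for the example. The up/down step ratio is set so the track settles around 75% correct.

```python
import math
import random


def simulated_listener(level_db, threshold_db, slope_db, rng):
    """Hypothetical two-alternative listener: 50% correct by guessing,
    rising to 100% well above threshold, 75% exactly at threshold."""
    p_detect = 1.0 / (1.0 + math.exp(-(level_db - threshold_db) / slope_db))
    p_correct = 0.5 + 0.5 * p_detect
    return rng.random() < p_correct


def weighted_up_down(start_db, threshold_db, target=0.75,
                     step_down_db=1.0, n_trials=600, seed=1):
    """Run a weighted up-down staircase and return the mean reversal level."""
    rng = random.Random(seed)
    # The track is in equilibrium when
    # target * step_down == (1 - target) * step_up.
    step_up_db = step_down_db * target / (1.0 - target)
    level = start_db
    last_direction = 0
    reversals = []
    for _ in range(n_trials):
        correct = simulated_listener(level, threshold_db, 2.0, rng)
        direction = -1 if correct else 1
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)  # the track changed direction here
        level += -step_down_db if correct else step_up_db
        last_direction = direction
    usable = reversals[4:]  # discard the initial descent
    return sum(usable) / len(usable)


# Distortion level in dB relative to the signal (assumed scale).
estimate = weighted_up_down(start_db=-20.0, threshold_db=-40.0)
print(f"Estimated threshold: {estimate:.1f} dB")
```

With a real listener the `simulated_listener` call would be replaced by an actual trial response; everything else stays the same.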
 
Maybe, but the listeners were also trained with Fast Car. That's one of the stimuli used for the Klippel listening tests. In any case, I think we also make clear that the findings are only true within the scope of what we tested. We are not saying that distortion is universally inaudible for any headphones with any music.
But you did in your conclusions. That training was not useful either. I thought I explained that in our private messages.
 
To be clear, you are saying that having a low anchor would have helped them detect distortion, although they were not able to differentiate the recording from the clean file. I think you are talking about the similarity test. But before that we ran an adaptive staircase ABX. This IS a low-anchored test. You start with distortion levels that 100% of people hear and go down in distortion until they can't reliably differentiate.
That is not an ABX test. Nor is it proper protocol for such things. Just because Klippel created the test doesn't mean it is right.

There are standards and a ton of prior work in this area. There was no reason to go in this odd direction. Such tests routinely underestimate audibility of distortion.

To wit, it is a rare headphone that I can't get to distort.
 
But you did in your conclusions. That training was not useful either. I thought I explained that in our private messages.
We'll make sure to make it clearer that it's not what we are saying. I know I didn't say that.
 
That is not an ABX test. Nor is it proper protocol for such things. Just because Klippel created the test doesn't mean it is right.

There are standards and a ton of prior work in this area. There was no reason to go in this odd direction. Such tests routinely underestimate audibility of distortion.

To wit, it is a rare headphone that I can't get to distort.
Klippel was only for training. We did ABX, not the same test as Klippel where you are asked to recognise the distorted track.
 
Klippel was only for training. We did ABX, not the same test as Klippel where you are asked to recognise the distorted track.
I thought you said you mimicked the stepped and reversal protocol they used. That protocol is likely to produce a lot of frustration for listeners and reduce true positive detection.
 
As I said, it is NOT an optimal method if your goal is to find out if the system is transparent or not. In such testing, we want to do everything in our power to get a positive outcome. After all, our sample size is quite small and that alone lowers the chances that we are getting to the truth. This requires research into what tracks cause the most audible distortion, the test protocol, audience selection, etc. All the things we do day in and day out in lossy audio codec testing, for example. There was no need to create new methods instead of leveraging existing expertise in this area. Unless your goal is to say that distortion doesn't matter, only tonality does.

Again, it is trivial for me to get most headphones to distort. It is simply a matter of level and content. I have some that literally make clicking noises! How, in this landscape, can a study say that it is essentially a non-problem??? Have you not experimented yourself this way before creating such tests?
 
We did 3 different tests. The goal of this one was to find a threshold; Kaernbach is the most recognized test for that, we didn't invent it. Another was to see if people could find a difference between recordings of different headphones; for this one it was MUSHRA, minus the anchor. Normally MUSHRA is for preference and larger impairments. And finally, we did a straight-up 10-trial ABX to verify that what they heard in the MUSHRA was real.
If a headphone in this test were making clicking noises at normal listening level, this would have caught that. I just want to repeat that: we did have audible distortion on all these headphones. 100% of the listeners heard headphone captures that were very obviously distorted. But not at normal level. If they hadn't heard it at all, we could be doubtful.
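For readers wondering what a 10-trial ABX can and cannot show, here is a small Python sketch of the usual one-sided binomial calculation: the chance of getting at least a given number of trials right by pure guessing. The function name is made up for the example, and the scoring criterion actually used in the study is not stated here, so treat this only as the generic arithmetic.

```python
from math import comb


def abx_guess_probability(correct, trials):
    """Probability of scoring at least `correct` out of `trials`
    when every answer is a 50/50 guess (one-sided binomial tail)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


for score in (7, 8, 9, 10):
    p = abx_guess_probability(score, 10)
    print(f"{score}/10 correct: p = {p:.4f}")
```

With only 10 trials a listener needs 9 or 10 correct to get below the common 0.05 level, so a listener who hears a small difference only some of the time can easily fail such a run.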

Again, there was no preconceived goal.

In terms of leveraging expertise in the area, I thought we were doing just that by hiring Sean Olive to work with us on this. He has great experience and was very helpful.
The upcoming study, which we will present on the 28th of May at AES Europe, has also obviously passed peer review.
I accept your concerns. What we present are the results, and we find that the impairments are, at the very least, small and difficult to detect. Difficult, not impossible.
 
The upcoming study, which we will present on the 28th of May at AES Europe, has also obviously passed peer review.
Peer review is no indication that the proper test was conducted. Only that it meets certain standards as far as publication.

As a person who has truly conducted, and participated in, a large number of these tests, I am giving you the practical peer review. :)

Let me give you an example of the thought that must go into such designs. In lossy compression there is a concept of pre-echo, which is caused by the window function of the encoder. Distortion is spread across the window, causing artifacts to show up on transients. Knowing this, special content was selected that is highly revealing of such artifacts (e.g. pure vocals, German narration, glockenspiel, etc.). Many types of music that people think are hard to encode, such as classical music, are not so. The right content makes or breaks such tests.

Further, real training for hearing such impairments takes months. As such, assessment by trained listeners is almost always used instead of the general public.

These are all very different than preference tests for headphones/speakers.

I worry that the consequence of your study is that manufacturers continue to not care about proper fidelity and reduction of impairments. We are talking about distortions that are hundreds of times worse than any electronics. But folks will show a link to your study and use it as a get-out-of-jail card....
 
There is a huge difference between modern music and pop. Soundtracks for example are modern but can have incredible dynamics including deep bass. Even in pop there are well recorded tracks. Vast majority of my reference tracks are modern recordings.
I am just wondering, are you sure that you have tracks where the bass alone is mixed 20-30 dB louder than the rest of the spectrum, so you have to listen to bass at 100 dB but everything else at a safe volume? I just doubt that soundtracks, classical music, jazz, etc. recordings (let alone popular music) are mixed this way. I think the dynamic peaks of music usually have higher frequencies as well, coming from other instruments, regardless of genre.
 
Can you name a classical album with 60 dB dynamic range? I am genuinely curious.
The most obvious would be the 1812 overture with cannons, but it's debatable whether that is "music" so I'll pick another ;) .
Stravinsky Rite of Spring with NDR Philharmonic & Urbanski, Alpha 292. Peaks within a fraction of 0 dB (without clipping) and the quietest parts (as in track 10) reach -60 dB.
Others that come close (> 50 dB) are good recordings of Mahler's 1st and 2nd symphonies, for example with Budapest & Fischer on Classic Records.
Full enjoyment of these requires a QUIET listening room.
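To put the 60 dB figure in perspective, here is a tiny Python sketch of the arithmetic. The 95 dB SPL peak playback level is purely a hypothetical number picked for the example, as are the function names.

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def quietest_level_db_spl(peak_db_spl, dynamic_range_db):
    """Level of the quietest passage when peaks play at peak_db_spl."""
    return peak_db_spl - dynamic_range_db


ratio = db_to_amplitude_ratio(60)
quiet = quietest_level_db_spl(95, 60)  # 95 dB SPL peaks are an assumption
print(f"60 dB corresponds to an amplitude ratio of about {ratio:.0f}:1")
print(f"With 95 dB SPL peaks, the quietest parts sit near {quiet} dB SPL")
```

Under that assumed peak level the quietest passages land at 35 dB SPL, which is why the room noise has to be low for them to be heard.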
 
I am just wondering, are you sure that you have tracks where the bass alone is mixed 20-30 dB louder than the rest of the spectrum, so you have to listen to bass at 100 dB but everything else at a safe volume? I just doubt that soundtracks, classical music, jazz, etc. recordings (let alone popular music) are mixed this way. I think the dynamic peaks of music usually have higher frequencies as well, coming from other instruments, regardless of genre.
Based on what analysis you have done???

This is our threshold of hearing:

[Image: chart of the absolute threshold of hearing across the human hearing range]


0 dB SPL at 1000 Hz sits at the same hearing threshold as 50 dB SPL at 31 Hz! Go below 50 dB SPL at 31 Hz and you can't even hear it. So by definition, bass is recorded at far higher amplitude than higher frequencies.
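For anyone who wants to play with these numbers, here is a short Python sketch using Terhardt's well-known closed-form approximation of the threshold in quiet. This is a swapped-in formula, not the curve in the chart above, so the exact values will differ somewhat; the function name is made up for the example.

```python
import math


def threshold_in_quiet_db_spl(freq_hz):
    """Terhardt's approximation of the absolute hearing threshold (dB SPL).
    An approximation only; it is not the data behind the chart in the post."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


for freq in (31, 100, 1000, 4000):
    level = threshold_in_quiet_db_spl(freq)
    print(f"{freq:>5} Hz: {level:6.1f} dB SPL")

gap = threshold_in_quiet_db_spl(31) - threshold_in_quiet_db_spl(1000)
print(f"Threshold at 31 Hz is about {gap:.0f} dB above the one at 1 kHz")
```

The approximation puts the gap between 31 Hz and 1 kHz in the same region as the roughly 50 dB read off the chart.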

Here is a random high-res track from my library that is not even bass heavy: Madeleine Peyroux's Careless Love Album, track 1 (Dance Me to the End of Love):
[Image: spectrum of the track]


By 4 kHz, you are 35 dB down. And nearly 50 dB at 20 kHz. Again, this is a female, instrumental track.
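If you want to check this kind of level difference on your own material, one possible approach is to measure the energy at two frequencies and compare them in dB. Below is a minimal pure-Python sketch using the Goertzel algorithm. It runs on a synthetic signal (a 50 Hz tone plus a 4 kHz tone deliberately set 35 dB lower), not on the actual track, and the sample rate, tone frequencies, and function names are all assumptions for the example.

```python
import math


def goertzel_power(samples, sample_rate, freq_hz):
    """Squared magnitude of the DFT bin nearest freq_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(freq_hz * n / sample_rate)  # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def level_difference_db(samples, sample_rate, ref_hz, other_hz):
    """Level at other_hz relative to the level at ref_hz, in dB."""
    ref = goertzel_power(samples, sample_rate, ref_hz)
    other = goertzel_power(samples, sample_rate, other_hz)
    return 10.0 * math.log10(other / ref)


# Synthetic stand-in for a track: strong bass tone, weaker 4 kHz tone.
SAMPLE_RATE = 16000
N = 1600  # both tones complete whole cycles in this window
treble_amp = 10 ** (-35 / 20)  # 35 dB below the bass tone
signal = [math.sin(2 * math.pi * 50 * i / SAMPLE_RATE)
          + treble_amp * math.sin(2 * math.pi * 4000 * i / SAMPLE_RATE)
          for i in range(N)]

diff = level_difference_db(signal, SAMPLE_RATE, 50, 4000)
print(f"4 kHz relative to 50 Hz: {diff:.1f} dB")
```

On real music you would average over many windows and use a proper FFT-based analyzer; this only shows the dB comparison itself.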
 