
What kind of evidence is sufficient?

derp1n

Senior Member
Joined
May 28, 2018
Messages
479
Likes
629

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,406
Interesting! So you think there is a difference? I was similarly challenged by someone on head-fi some time ago. He was an audio expert, or liked to think of himself as one, and a great admirer of Apple's iPhone. Besides the equivalent quality, he kept mentioning the saving in storage space as a side benefit of 320 kbps (to which I replied that storage space wasn't an issue for an Android user). So I was challenged to follow some kind of scientific (pseudo-scientific?) procedure of his own choosing to satisfy his claim. Instead I offered to rip a 320 kbps track, compare it to the same track in FLAC, and try an honest comparison. It was a song by Brazil's Caetano Veloso, a world-class performer always surrounded by top musicians. I chose to focus on the very sophisticated percussion part in that song. The loss of detail was very obvious. In brief, the FLAC version made it possible to hear the percussionist play actual melodies; on the 320 kbps version there were just thuds with no sense of melody. I reported my "findings", suggesting he perform the same test, but he never bothered to even reply. Not scientific enough, I guess. I remember him mentioning level matching. I have no idea if my levels were matched, but I doubt it would have made a difference in this case: when a detail is not there, it is not there.

That claim seems to surface quite often. It appeared recently on a YouTube channel I subscribe to; the author was justifying the fact that his new recording would only be available in 320 kbps, and the purpose of the episode was to prove there was no difference (if you thought differently you were an "audiophool").

Was this done double blind? And using an ABX method or some other method? Not that I'm necessarily doubting you, just that these questions are not at all clear from your post.

Andreas
 

sergeauckland

Major Contributor
Forum Donor
Joined
Mar 16, 2016
Messages
3,460
Likes
9,162
Location
Suffolk UK
Can you prove this?
Yes easily. Just take a piece of audio, find the peaks and make a note of the peak levels. Convert to MP3 and check the peaks again. They'll be the same. As far as I know, there's nothing in the MP3 encoding algorithm that affects peak levels. There's also nothing, again as far as I know, that affects overall energy and hence perceived loudness.

If I'm wrong on this, perhaps somebody can point out where, but I was always under the impression that MP2, MP3 and AAC did not affect volume or peak levels. However, as mentioned in the edit, my understanding is that the algorithms make assumptions about the nature of the audio, and don't work well with heavily clipped audio.

I will say it's a long time since I studied how MP2 and MP3 work, so I may be mistaken.

S.
 

derp1n

Senior Member
Joined
May 28, 2018
Messages
479
Likes
629
Yes easily. Just take a piece of audio, find the peaks and make a note of the peak levels. Convert to MP3 and check the peaks again. They'll be the same.

I suggest you try it. :)

There are several ways that a lossy codec can cause peaks to change. Filtering, Gibbs, upsampling, inter-sample peak handling... distortion.
 

vert

Active Member
Forum Donor
Joined
May 30, 2018
Messages
285
Likes
258
Location
Switzerland
I haven't used every piece of conversion software out there, so this may be universally true, but I don't know. Nonetheless, given the criticality and repeatedly demonstrated audibility of small level changes, I'd still spend a minute or two verifying it before beginning a test.
Well, the levels were the same from the device (smartphone) I was using, at least. I wasn't using any special computer programme. I guess I don't understand how a detail is supposed to appear with a different volume if it is no longer there.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,511
Likes
25,349
Location
Alfred, NY
Small volume changes are often perceived as changes in clarity.

My suggestion: match average levels. If they're the same from the get-go, at least you've checked that box.
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,511
Likes
25,349
Location
Alfred, NY
Level matching is absolutely necessary since our ears are extremely sensitive to small level changes, while not necessarily perceiving the difference as being level. No level matching, the test is invalid.
 

Pio2001

Senior Member
Joined
May 15, 2018
Messages
317
Likes
507
Location
Neuville-sur-Saône, France
Phase shifts are audible with headphones. It is in rooms with reflections where audibility becomes very difficult. Here is Dr. Toole from his book

I don't know anything about crossovers, but I've been experimenting with Digital Room Correction lately. Both full band, with a target curve that is reached with a complete correction of the measured response around the listening position, and also limited to the low frequencies, with parametric filters.

The phase shifts of these corrections are audible through speakers. In both cases, if the correction is done in linear phase instead of minimum phase, things can go wrong.
At low frequencies, a room mode can easily lead to 100 ms of group delay (not to be confused with the phase shift, which is equivalent to 3 ms for the same mode). It is difficult to know if the delay alone is audible, because cancelling a delay without affecting the frequency response always introduces pre-ringing, which is very audible in itself.
For example, in this video, you can hear two strong equalisations of the same order of magnitude as the frequency response of a room that behaves badly at low frequencies. One is done with linear phase, the other with minimum phase:


The only difference between the two is the phase shift. Its audibility is obvious because of the huge pre-echo of the linear phase version.

At the end of this other video, after the demonstration of the frequency response change, you can hear the effect of an amount of phase shift comparable to the one introduced by acoustic room modes. It is artificially generated, so that the frequency response of the sample is unchanged. Here too, the phase shift introduces pre-echo, this time without the amplification of low frequencies. That's what you might get if you try to equalize your room modes with a linear phase correction (given that you do have strong room modes to equalize, and that your modes are minimum phase, like mine)!


Such pre-echo is not limited to low frequency room modes. Some DRC hardware is limited to 6144 taps for digital convolution. But I once tried a full band room correction with a higher precision (it was around 16000 taps, I think), and linear phase. To my surprise, audible pre-echo was introduced by the room correction around 2 kHz! It was audible using a "CEA 2010 burst" test signal at that frequency.
There was no strong correction at that frequency. After a bit of investigation, I found that the +/- 1 dB corrections made near 2000, 3000 and 4000 Hz to compensate for the directivity index of my speakers were the cause! These small corrections, 1 dB across half an octave, were enough to cause audible time smearing on the burst test signal if they were made with linear phase.

Here too, the audible problem was pre-echo.
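
A minimal sketch of why a linear-phase correction can ring before the event, assuming nothing about the DRC software used above. It compares a symmetric (linear-phase) windowed-sinc FIR with a one-pole recursive lowpass (minimum phase). The two filters do not share a magnitude response, and the function names, tap count and cutoff are illustrative choices only; the point is simply where the impulse-response energy sits relative to the main peak.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff):
    """Symmetric (linear-phase) FIR lowpass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming window
        taps.append(ideal * window)
    return taps

def one_pole_lowpass_impulse(num_samples, coeff):
    """Impulse response of y[n] = (1 - coeff) * x[n] + coeff * y[n-1] (minimum phase)."""
    return [(1 - coeff) * coeff ** n for n in range(num_samples)]

def energy_before_peak(impulse_response):
    """Return (index of the largest tap, fraction of total energy located before it)."""
    peak_index = max(range(len(impulse_response)), key=lambda i: abs(impulse_response[i]))
    total = sum(v * v for v in impulse_response)
    before = sum(v * v for v in impulse_response[:peak_index])
    return peak_index, before / total

for name, response in [
    ("linear phase FIR", windowed_sinc_lowpass(101, 0.05)),
    ("one-pole (minimum phase)", one_pole_lowpass_impulse(101, 0.9)),
]:
    index, fraction = energy_before_peak(response)
    print(f"{name}: peak at tap {index}, {fraction:.1%} of energy before the peak")
```

In this sketch the symmetric filter puts its peak in the middle of the tap array with a large share of its energy ahead of it, while the one-pole filter has its peak at the first sample and nothing ahead of it. A symmetric FIR of N taps centres its peak at (N-1)/2 samples, so the longer the filter, the earlier its pre-response can start.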
 

Pio2001

Senior Member
Joined
May 15, 2018
Messages
317
Likes
507
Location
Neuville-sur-Saône, France
Level matching is absolutely necessary since our ears are extremely sensitive to small level changes, while not necessarily perceiving the difference as being level. No level matching, the test is invalid.

Actually, for MP3 vs PCM comparisons, the best reference for level matching is the MP3 encoder itself.

If you try to match levels using, for example, the RMS level of both samples, there is a risk that, if the original had a lot of energy above 16 kHz, your measurement finds the mp3 to have a lower overall RMS level, because these frequencies are missing.
However, by matching the RMS level this way, all you actually do is mismatch the level of the remaining audible frequencies, to compensate for the disappearance of the inaudible content above 16 kHz!

Comparing the peak level of the PCM and the MP3 samples is worse: because of the change in frequency content, all square and step waveforms are changed. The overall level is unchanged, but the peak level can vary a lot. +1 dB on the highest peak for the mp3 file is common.

To compare a PCM file with an MP3, use a command line encoder, or at least a software that you perfectly know, so that you can make sure that the MP3 is not normalized (that would be surprising), that if it carries ReplayGain information you don't use it during playback to adjust the volume, and that the highest peak level detected by ReplayGain in the mp3 file is not above 0 dBFS (that is, "1" in the ReplayGain information).
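
The peak-versus-RMS point can be illustrated with a rough sketch. This is not an MP3 encoder: it assumes an idealised brick-wall cut at 16 kHz applied to a full-scale 1 kHz square wave, so that only harmonics 1, 3, ... 15 survive. The function names and the 2000-point grid are illustrative choices.

```python
import math

def band_limited_square(num_points, max_harmonic):
    """One period of a unit square wave rebuilt from its odd harmonics up to max_harmonic."""
    samples = []
    for i in range(num_points):
        phase = 2 * math.pi * i / num_points
        value = sum(math.sin(h * phase) / h for h in range(1, max_harmonic + 1, 2))
        samples.append(4 / math.pi * value)
    return samples

def peak_db(samples):
    """Highest absolute sample value in dB relative to 1.0 (0 dBFS)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level in dB relative to a full-scale square wave (RMS = 1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

# The unfiltered square wave sits at 0 dB for both peak and RMS by construction.
limited = band_limited_square(num_points=2000, max_harmonic=15)
print(f"peak change: {peak_db(limited):+.2f} dB")
print(f"RMS change:  {rms_db(limited):+.2f} dB")
```

Under these assumptions the RMS level drops only slightly, while the peak rises by more than 1 dB because of the overshoot at each edge. That is in line with the observation above that peak readings are a poor basis for matching levels.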
 

Jinjuku

Major Contributor
Forum Donor
Joined
Feb 28, 2016
Messages
1,279
Likes
1,180
Pretty simple: If you say you can clear a 10 foot high bar from a standing jump, I'm going to bring out a 10 foot high bar.

You don't need an N of 1000 jumpers. You don't need a lab stacked with gear. You don't need 7 PhDs for peer review.

Just a 10 foot high bar and a video camera.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,663
Likes
241,011
Location
Seattle Area
Interesting! So you think there is a difference?
Definitely. It's just not easy to find on all tracks and for all people. Here are my sample results:

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/19 19:45:33

File A: C:\Users\Amir\Music\Arnys Filter Test\keys jangling 16 44.wav
File B: C:\Users\Amir\Music\Arnys Filter Test\keys jangling 16 44_01.mp3

19:45:33 : Test started.
19:46:21 : 01/01 50.0%
19:46:35 : 02/02 25.0%
19:46:49 : 02/03 50.0%
19:47:03 : 03/04 31.3%
19:47:13 : 04/05 18.8%
19:47:27 : 05/06 10.9%
19:47:38 : 06/07 6.3%
19:47:46 : 07/08 3.5%
19:48:01 : 08/09 2.0%
19:48:19 : 09/10 1.1%
19:48:31 : 10/11 0.6%
19:48:45 : 11/12 0.3%
19:48:58 : 12/13 0.2%
19:49:11 : 13/14 0.1%
19:49:28 : 14/15 0.0%
19:49:52 : 15/16 0.0%
19:49:56 : Test finished.

----------
Total: 15/16 (0.0%)

====

foo_abx 2.0 beta 4 report
foobar2000 v1.3.5
2015-01-05 20:26:27

File A: On_The_Street_Where_You_Live_A2.mp3
SHA1: 21f894d14e89d7176732d1bd4170e4aa39d289a3
File B: On_The_Street_Where_You_Live_A2.wav
SHA1: 3f060f9eb94eb20fc673987c631e6c57c8e7892f

Output:
DS : Primary Sound Driver

20:26:27 : Test started.
20:27:01 : 01/01
20:27:09 : 02/02
20:27:16 : 03/03
20:27:22 : 04/04
20:27:28 : 05/05
20:27:34 : 06/06
20:27:40 : 06/07
20:27:51 : 07/08
20:28:01 : 08/09
20:28:09 : 09/10
20:28:09 : Test finished.

----------
Total: 9/10
Probability that you were guessing: 1.1%

-- signature --
7a3d0c1aaaf8321306ff6cfdd1f91ff68f828a54
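
For readers wondering where the percentages in these logs come from: they appear consistent with a one-sided binomial test against pure guessing (a 50% chance per trial). The sketch below reproduces the two final scores under that assumption; `guessing_probability` is a name chosen here, not part of foo_abx.

```python
from math import comb

def guessing_probability(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by coin-flip guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

for correct, trials in [(15, 16), (9, 10)]:
    print(f"{correct}/{trials}: {guessing_probability(correct, trials):.2%}")
```

For 9/10 this gives 11/1024, which rounds to the 1.1% in the second log; for 15/16 it gives 17/65536, which rounds to the 0.0% shown in the first.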
 

Thomas savage

Grand Contributor
The Watchman
Forum Donor
Joined
Feb 24, 2016
Messages
10,260
Likes
16,306
Location
uk, taunton
This has got over complicated, right.. you think something, just put forward what ‘you’ believe might sway competent individuals..

Really simple, honestly that’s all we want..

If the evidence is contrary to forum consensus it will have to have merit BEYOND your own self....

Can’t be any clearer than this: we are not here to trap truth, we are here to validate discussion.
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
Actually, for MP3 vs PCM comparisons, the best reference for level matching is the MP3 encoder itself.

If you try to match levels using, for example, the RMS level of both samples, there is a risk that, if the original had a lot of energy above 16 kHz, your measurement finds the mp3 to have a lower overall RMS level, because these frequencies are missing.
However, by matching the RMS level this way, all you actually do is mismatch the level of the remaining audible frequencies, to compensate for the disappearance of the inaudible content above 16 kHz!

Comparing the peak level of the PCM and the MP3 samples is worse: because of the change in frequency content, all square and step waveforms are changed. The overall level is unchanged, but the peak level can vary a lot. +1 dB on the highest peak for the mp3 file is common.

To compare a PCM file with an MP3, use a command line encoder, or at least a software that you perfectly know, so that you can make sure that the MP3 is not normalized (that would be surprising), that if it carries ReplayGain information you don't use it during playback to adjust the volume, and that the highest peak level detected by ReplayGain in the mp3 file is not above 0 dBFS (that is, "1" in the ReplayGain information).
A quick example:
https://forum.cockos.com/showpost.php?s=e211ec6f49c45ee4c73e3fc64b1fcdbb&p=2001665&postcount=30
 

Pio2001

Senior Member
Joined
May 15, 2018
Messages
317
Likes
507
Location
Neuville-sur-Saône, France

This link talks about mp3gain, software used to change the volume of mp3 files. You can also change the volume of your original WAV files the same way, with Audacity, for example.

If you don't change the volume of either the original or the MP3, they are matched by default. That's what I mean by "a software that you perfectly know": you need to make sure that there is no mp3gain option or something similar enabled by default somewhere in a hidden menu.
 
OP

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
To stay on topic is a solid basis for valid discussions.
While heroic ABX attempts, and the internals of lossy codecs as well, are of course interesting, they are not related to the thread topic. ;)
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,693
This link talks about mp3gain, software used to change the volume of mp3 files. You can also change the volume of your original WAV files the same way, with Audacity, for example.

If you don't change the volume of either the original or the MP3, they are matched by default. That's what I mean by "a software that you perfectly know": you need to make sure that there is no mp3gain option or something similar enabled by default somewhere in a hidden menu.

The whole thread:
https://forum.cockos.com/showthread.php?t=207884

The OP cannot use RG since his car stereo doesn't support it. Many untouched mp3 files, as you mentioned, can have peaks over 0 dBFS, which will induce clipping or trigger Windows' built-in limiter to push them down. That's something that needs to be addressed when performing listening tests.
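
As a rough sketch of how one might address it, the attenuation needed can be worked out from the reported peak, assuming a linear peak value where 1.0 corresponds to 0 dBFS. The function name is made up for illustration, and applying the same gain to both files before an ABX run is my suggestion rather than something from the linked thread.

```python
import math

def gain_to_avoid_clipping_db(peak):
    """Gain in dB that brings a linear peak (1.0 = 0 dBFS) down to full scale.

    Returns 0.0 when the peak is already at or below full scale.
    """
    if peak <= 1.0:
        return 0.0
    return -20 * math.log10(peak)

for peak in (0.95, 1.0, 1.122):
    print(f"peak {peak}: apply {gain_to_avoid_clipping_db(peak):+.2f} dB")
```

Applying the same attenuation to the PCM original keeps the two files level matched while removing the risk of clipping on the mp3.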

To be clear, I do not doubt that 320 kbps mp3 can be successfully ABXed.
https://hydrogenaud.io/index.php/topic,39970.0.html
 
OP

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
A perfectly defined set of rules, minus one point: to be definitive, the results should also be repeatable.
If an A-B test resulted in 95% correct answers, it is obvious that result should be repeatable.
If it was not repeatable we could probably assume the presenter was not honest in his presentation in some way.

As cited in another post, you suggested that controlled "blind" listening tests should be supervised. Could you clarify whether that is an important point?

There seems to be a misunderstanding about the 95% correct figure: repeatability does not obviously follow from the result. It should obviously follow if our null hypothesis is indeed false.
Also, if repeatability fails, it might be that replication is still possible; but even if not, it does not follow that any dishonesty was involved.
Our null hypothesis is random guessing, and even a 95% correct answer rate is possible under the null hypothesis. We consider such an observed result less compatible with the null hypothesis than, say, 75% (depending on the specific conditions), but it is nevertheless still possible.

Edit: I know that it sometimes looks like nitpicking (and counterintuitive), but these distinctions are unfortunately important.
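
To put numbers on this, here is a small sketch under assumptions of my own choosing: a 16-trial test, a 5% threshold, and a hypothetical listener who genuinely identifies the difference on 70% of trials. None of these figures come from the thread.

```python
from math import comb

def tail_probability(min_correct, trials, hit_rate):
    """P(at least min_correct successes) for a binomial with the given per-trial hit rate."""
    return sum(
        comb(trials, k) * hit_rate ** k * (1 - hit_rate) ** (trials - k)
        for k in range(min_correct, trials + 1)
    )

def pass_mark(trials, alpha=0.05):
    """Smallest score whose probability under pure guessing is at most alpha."""
    for score in range(trials + 1):
        if tail_probability(score, trials, 0.5) <= alpha:
            return score
    return None

trials = 16
needed = pass_mark(trials)
print(f"pass mark: {needed}/{trials}")
print(f"guesser passes:      {tail_probability(needed, trials, 0.5):.1%}")
print(f"70% listener passes: {tail_probability(needed, trials, 0.7):.1%}")
```

With these assumptions the pass mark is 12/16. A pure guesser still clears it in a few percent of runs, and the listener with a real but imperfect ability clears it in somewhat under half of them. So a single success does not prove the effect, and a failed repeat does not show that anyone was dishonest.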
 

garbulky

Major Contributor
Joined
Feb 14, 2018
Messages
1,510
Likes
829
I'm probably in the minority. For me, the evidence I need is obtained from extended listening to the device in the same way that I would normally enjoy such a device. And as for sounding good, I do have some pretty stringent, nitpicky ideals, which not everybody shares. So I think suffice it to say it has to sound good to me.

I would use people whose listening descriptions tend to match mine to see if some gear would be interesting to check out as well. However, I would usually also like it to meet some fairly low standards for measurements. For instance, say an SNR above 75 dB, THD of less than 1% (<5% for tubes), and a wide bandwidth of 20 Hz-20 kHz with no huge amount of roll-off. Though this doesn't preclude tube amps, I would prefer an amp to handle a 4 ohm impedance as well. If it's a DAC, at this stage I would prefer it to process 96 kHz files, as I have several of them from my recordings. In the future, it would be nice if they could do 192 kHz files. Luckily most DACs nowadays do 192 kHz without issue.

At no point though, do I think I'm doing science.
 

Thomas savage

Grand Contributor
The Watchman
Forum Donor
Joined
Feb 24, 2016
Messages
10,260
Likes
16,306
Location
uk, taunton
I'm probably in the minority. For me, the evidence I need is obtained from extended listening to the device in the same way that I would normally enjoy such a device. And as for sounding good, I do have some pretty stringent, nitpicky ideals, which not everybody shares. So I think suffice it to say it has to sound good to me.

I would use people whose listening descriptions tend to match mine to see if some gear would be interesting to check out as well. However, I would usually also like it to meet some fairly low standards for measurements. For instance, say an SNR above 75 dB, THD of less than 1% (<5% for tubes), and a wide bandwidth of 20 Hz-20 kHz with no huge amount of roll-off. Though this doesn't preclude tube amps, I would prefer an amp to handle a 4 ohm impedance as well. If it's a DAC, at this stage I would prefer it to process 96 kHz files, as I have several of them from my recordings. In the future, it would be nice if they could do 192 kHz files. Luckily most DACs nowadays do 192 kHz without issue.
No, you're in the majority. Have a look at the frailty of eyewitnesses.. yet the law says..

People are not programmed to care about things that counter their own significance..
 