Can YOU tell the difference between FLAC & MP3? - Part 2

audiofilet · Oct 17, 2021

Experiment: ABX test to determine whether I can detect an audible difference between 192 kHz/24-bit FLAC and 320 kb/s MP3

Equipment:

DAC: JDS Labs Atom DAC+
Amps: JDS Labs Atom Amp+, Monolith Liquid Spark (During testing the Liquid Spark unfortunately stopped working)
Headphones: DT 990 Edition new stock earpads, DT 770 Pro new stock earpads, Hifiman Sundara moderately worn earpads
ABX Version 10.2.1.8
EQ: Graphic > Custom by Jaak, Parametric > Harman Target by Oratory1990, FIR > Harman Target by Oratory1990
(Example Curves for DT 770 Attached below)

AutoEq/results/crinacle/gras_43ag-7_harman_over-ear_2018/Beyerdynamic DT 770 250 Ohm at master · jaakkopasanen/AutoEq


AutoEq/results/crinacle/gras_43ag-7_harman_over-ear_2018/Beyerdynamic DT 990 250 Ohm at master · jaakkopasanen/AutoEq


Source Material:

Various Artists, Audiophile Hi-Res System Test - Great Sampling Tracks Included in High-Resolution Audio

Download Audiophile Hi-Res System Test - Great Sampling Tracks Included by Various Artists in high-resolution audio at ProStudioMasters.com - Available in 192 kHz / 24-bit, 96 kHz / 24-bit AIFF, FLAC…

www.prostudiomasters.com

Track 1.5: Studio Recording - Voice & Jazz by Holly Cole @ 192 kHz/24-bit
Track 1.8: Piano - Steinway Piano @ 192 kHz/24-bit
Track 1.10: Stradivarius Violin - Concert Hall @ 192 kHz/24-bit

All three songs were purchased directly from the site.
Converted to 320kb/s MP3 via ffmpeg

[Attached images: FIR, Graphic EQ, Parametric EQ]

ABX Result 1 = Track 1.5
ABX Result 2 = Track 1.8
ABX Result 3 = Track 1.10
ABX Result 4 = Track 1.5

Headphones	Amplifier	Equalizer	ABX Result 1	ABX Result 2	ABX Result 3	ABX Result 4	Guessing
DT 990	JDS Atom+	Graphic EQ	4/10	5/10	3/10	4/10	50%
DT 990	JDS Atom+	Parametric EQ	5/10	7/10	6/10	5/10	40%
DT 990	JDS Atom+	FIR Filter	7/10	6/10	7/10	6/10	35%
DT 770	Liquid Spark	Graphic EQ	3/10	2/10	4/10	3/10	70%
DT 770	Liquid Spark	Parametric EQ	5/10	6/10	6/10	4/10	55%
DT 770	Liquid Spark	FIR Filter	7/10	7/10	6/10	8/10	20%
Sundara	JDS Atom+	Parametric EQ	6/10	5/10	6/10	6/10	32%
DT 770	JDS Atom+	Graphic EQ	2/10	4/10	3/10	5/10	75%
DT 770	JDS Atom+	Parametric EQ	7/10	6/10	7/10	7/10	30%
DT 770	JDS Atom+	FIR Filter	9/10	7/10	5/10	8/10	14%
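
Each row above holds four runs of 10 trials, so one way to read it is to pool the 40 trials per configuration and ask how likely that total would be from pure guessing. The sketch below is only an illustration: the scores are copied from the table, while the helper name `p_value` and the decision to pool across tracks are assumptions rather than part of the original test.

```python
from math import comb

# Correct answers per configuration, copied from the table above
# (four ABX runs of 10 trials each).
results = {
    ("DT 990", "JDS Atom+", "Graphic EQ"): [4, 5, 3, 4],
    ("DT 990", "JDS Atom+", "Parametric EQ"): [5, 7, 6, 5],
    ("DT 990", "JDS Atom+", "FIR Filter"): [7, 6, 7, 6],
    ("DT 770", "Liquid Spark", "Graphic EQ"): [3, 2, 4, 3],
    ("DT 770", "Liquid Spark", "Parametric EQ"): [5, 6, 6, 4],
    ("DT 770", "Liquid Spark", "FIR Filter"): [7, 7, 6, 8],
    ("Sundara", "JDS Atom+", "Parametric EQ"): [6, 5, 6, 6],
    ("DT 770", "JDS Atom+", "Graphic EQ"): [2, 4, 3, 5],
    ("DT 770", "JDS Atom+", "Parametric EQ"): [7, 6, 7, 7],
    ("DT 770", "JDS Atom+", "FIR Filter"): [9, 7, 5, 8],
}


def p_value(correct: int, trials: int) -> float:
    """One-sided probability of getting `correct` or more right by guessing."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials


for config, scores in results.items():
    total = sum(scores)
    trials = 10 * len(scores)
    label = " / ".join(config)
    print(f"{label:40s} {total}/{trials}  p = {p_value(total, trials):.4f}")
```

Pooled this way, the DT 770 / JDS Atom+ / FIR row comes to 29/40, which works out to p of roughly 0.003. Two caveats apply: the runs use different tracks, and checking ten configurations at once makes at least one low p-value more likely by chance alone.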

Test results without EQ were omitted because I simply was not able to detect any difference whatsoever, averaging a constant <3/10 across the board.

Conclusion:

I hate all three tracks now and never want to listen to them again. lol

I personally felt like the DT 770s were resolving on a different level compared to the rest, but only with the FIR and Parametric EQ. I could literally hear the subtle changes in the air as Holly was inhaling during Track 1.5. Her lip smacks, slight cotton mouth and background noises were exceedingly present and easy to pick up.
To be honest, the sound felt so surgical and over-the-top clear/open that my ears were getting fatigued after only a very short amount of time. I have a bit of tinnitus in my left ear and can almost positively say it was the 770's fault.
I felt the most confident wearing them and definitely guessed the least using the above-mentioned EQs. With the Graphic EQ it was the exact opposite: I literally could not tell a difference whatsoever and pretty much had to guess the whole time. Same thing without EQ, but I did not even include that result, because I stopped mid-test. It just sounded the same.
The differences were extremely subtle at times, if not non-existent, but two fairly distinguishable attributes stood out to me: clarity at higher frequencies and the rumble of sub-bass around 30-40 Hz. That is what I relied on the most during Track 1.5, for example.
I only once hit 9/10, which apparently is the required threshold for statistical significance, and I was not able to repeat that result, so objectively I failed all tests except for one.
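
For reference, the 9/10 threshold can be checked with an exact binomial calculation. This is a minimal sketch, assuming each trial is an independent 50/50 guess when no difference is audible; the function name `p_value` is illustrative.

```python
from math import comb


def p_value(correct: int, trials: int = 10) -> float:
    """Chance of scoring `correct` or more out of `trials` by pure guessing."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials


for correct in range(7, 11):
    print(f"{correct}/10: p = {p_value(correct):.4f}")
# 7/10: p = 0.1719
# 8/10: p = 0.0547
# 9/10: p = 0.0107
# 10/10: p = 0.0010
```

With only 10 trials, 8/10 falls just short of the usual 5% cutoff, which is why 9/10 is the first score that counts as a pass.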

DT 990s came in second. Very similar experience to the 770s, with one slight disadvantage. I live in NYC and there is traffic within close proximity to our complex. With the window closed I could barely hear anything through the HPs, but I did notice it ever so slightly at times. I don't think this directly influenced the test, but I still wonder if the result may have been different in a perfectly quiet environment.

The Sundaras were awesome. That was around the time when the Liquid Spark died, so I couldn't test them with it. I used Oratory1990's Optimal Harman Parametric EQ settings. Same issue regarding open-back, but I felt pretty confident.

alphachannel · Oct 17, 2021

Interesting review, thanks!
I have the 770s also and will use that FIR wav for APO too now. So EQ makes ALL the difference then?

abdo123 · Oct 17, 2021

Just to make sure: you could hear a difference, but you were still unable to pick correctly?

audiofilet · Oct 18, 2021

abdo123 said:
Just to make sure: you could hear a difference, but you were still unable to pick correctly?

My ability to detect the audible difference varied between EQ profiles; I posted a spreadsheet above with the results.
My highest scores were achieved using the FIR Filter and DT 770. With this combination I did actually perceive a small difference, as described in the conclusion.

alphachannel said:
Interesting review, thanks!
I have the 770s also and will use that FIR wav for APO too now. So EQ makes ALL the difference then?

In my experience, absolutely.

Without EQ I could not tell any difference, whatsoever.
I did not include those results, because it was always <3/10 with the DT 770 and 990.
Definitely check it out, it takes the 770s to a completely different level, so much so that I honestly haven't heard anything like it before.

AutoEq/Beyerdynamic DT 770 250 Ohm minimum phase 44100Hz.wav at master · jaakkopasanen/AutoEq


abdo123 · Oct 18, 2021

audiofilet said:
My ability to detect the audible difference varied between EQ profiles; I posted a spreadsheet above with the results.
My highest scores were achieved using the FIR Filter and DT 770. With this combination I did actually perceive a small difference, as described in the conclusion.

That is very cool! Thanks for taking the effort to do this!

Do you think you could test the DT 770 for longer next time? Maybe your guessing rate would drop even further.

Also do you think that the 'differences' you heard were worth the Hi-Res hype?

audiofilet · Oct 18, 2021

abdo123 said:
That is very cool! Thanks for taking the effort to do this!

Do you think you could test the DT 770 for longer next time? Maybe your guessing rate would drop even further.

Also do you think that the 'differences' you heard were worth the Hi-Res hype?

Longer? I spent about 2 minutes listening to each track and did a total of 40 ABX runs with sets of 10 each, meaning I spent roughly 800 minutes, or about 13 hours, performing these tests. That was my entire weekend, but I wanted to make sure the results were accurate.
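
The time estimate can be checked with a few lines of arithmetic. This sketch assumes, as the post implies, that each of the 10 trials in a run took about 2 minutes; the variable names are illustrative.

```python
runs = 40               # total ABX runs (10 configurations x 4 results each)
trials_per_run = 10     # each run is scored out of 10
minutes_per_trial = 2   # assumed: about 2 minutes of listening per trial

total_minutes = runs * trials_per_run * minutes_per_trial
hours, minutes = divmod(total_minutes, 60)
print(f"{total_minutes} minutes = {hours} h {minutes} min")
# 800 minutes = 13 h 20 min
```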

I actually wouldn't call it a difference; it's more of a nuance. Most people, including myself now, probably agree that objectively they are pretty much identical. However, under the right circumstances, which in this case was a particular EQ profile and amp combination, I do believe that there are elements of a high resolution source that are not as accurately represented in a compressed format.

In my opinion these very subtle nuances are not enough to say there's a difference, because the average listener either won't have the equipment or perfect Harman EQ profile to pick up on them, but I do see why some audiophiles claim there is.

Jim Taylor said:
Very interesting! How was the ABX administered? Did you have a partner in the room switch for you, or did you use a computer app? Jim

I used the latest version of the app. You can either play Track X or Y and then choose whether X is A or Y is B and vice versa.

After 1-2 hours you start getting into the groove. It was very interesting, but I absolutely hate those test tracks now.

caught gesture · Oct 18, 2021

audiofilet said:
Longer? I spent about 2 minutes listening to each track and did a total of 40 ABX runs with sets of 10 each, meaning I spent roughly 800 minutes, or about 13 hours, performing these tests. That was my entire weekend, but I wanted to make sure the results were accurate.

I actually wouldn't call it a difference; it's more of a nuance. Most people, including myself now, probably agree that objectively they are pretty much identical. However, under the right circumstances, which in this case was a particular EQ profile and amp combination, I do believe that there are elements of a high resolution source that are not as accurately represented in a compressed format.

In my opinion these very subtle nuances are not enough to say there's a difference, because the average listener either won't have the equipment or perfect Harman EQ profile to pick up on them, but I do see why some audiophiles claim there is.

Good for you for actually going to all this effort. If only other people were so diligent before making statements like “night and day difference” after changing something like a cable. I went through the same process a number of years ago and also came to the same conclusion. A 320mp3 is perfectly adequate as both a storage and playback medium when listening through my hifi. Talk of high frequency content being missing was absurd when I took a listening test and realised how much high content had been lost due to ageing ears!

somebodyelse · Oct 18, 2021

Just for the record which ffmpeg version was it, and what settings did you use to convert?

voodooless · Oct 18, 2021

That's pretty neat. It would be interesting to find out why the FIR-filtered result yielded a more audible difference than the EQs. It would also be interesting to test the PCM version EQ'ed with the three different methods.

On the other hand: why use MP3 for a test? None of the streaming services uses it, it's all either 256 kBit AAC or 320 kBit Vorbis. Both of which should be superior to 320 kBit MP3. Also: how was the downsampling done? Did you leave that up to the MP3 encoder, or did you convert to 16/44.1 first? If so, how?

audiofilet · Oct 18, 2021

caught gesture said:
Good for you for actually going to all this effort. If only other people were so diligent before making statements like “night and day difference” after changing something like a cable. I went through the same process a number of years ago and also came to the same conclusion. A 320mp3 is perfectly adequate as both a storage and playback medium when listening through my hifi. Talk of high frequency content being missing was absurd when I took a listening test and realised how much high content had been lost due to ageing ears!

It's extremely subtle, and mostly undetectable. Certainly not enough to charge people ridiculous amounts.

somebodyelse said:
Just for the record which ffmpeg version was it, and what settings did you use to convert?

This is the command I use on Audiophile Linux:
ffmpeg -i input.flac -ab 320k -map_metadata 0 -id3v2_version 3 output.mp3
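
The command above leaves encoder selection and sample-rate conversion to ffmpeg's defaults. MP3 has no 192 kHz mode (MPEG-1 Layer III tops out at 48 kHz), so a resample has to happen somewhere in the chain. The sketch below makes both choices explicit. It assumes an ffmpeg build with `libmp3lame` on the PATH; the helper names and the 44.1 kHz target are illustrative assumptions, not what was used for the test.

```python
import subprocess


def mp3_command(src: str, dst: str, sample_rate: int = 44100) -> list[str]:
    """Build an ffmpeg argument list for a 320 kb/s MP3 with explicit resampling."""
    return [
        "ffmpeg", "-i", src,
        "-c:a", "libmp3lame",     # name the MP3 encoder explicitly
        "-b:a", "320k",           # constant 320 kb/s, same as -ab 320k
        "-ar", str(sample_rate),  # assumed target rate; MP3 cannot carry 192 kHz
        "-map_metadata", "0",     # copy tags from the input file
        "-id3v2_version", "3",
        dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(mp3_command(src, dst), check=True)


if __name__ == "__main__":
    print(" ".join(mp3_command("input.flac", "output.mp3")))
```

Pinning the output rate this way means the FLAC and MP3 files differ in a known, documented way, which makes the comparison easier for others to reproduce.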

voodooless said:
That's pretty neat. It would be interesting to find out why the FIR-filtered result yielded a more audible difference than the EQs. It would also be interesting to test the PCM version EQ'ed with the three different methods.

On the other hand: why use MP3 for a test? None of the streaming services uses it, it's all either 256 kBit AAC or 320 kBit Vorbis. Both of which should be superior to 320 kBit MP3. Also: how was the downsampling done? Did you leave that up to the MP3 encoder, or did you convert to 16/44.1 first? If so, how?

Exactly, I'm currently researching convolution and FIR filters for that reason. You always hear it's superior to Parametric EQ but I really want to understand why.

You're saying 256 kb/s AAC would be superior to 320 kb/s MP3?
I can't really speak on that, but in what way?

voodooless · Oct 18, 2021

audiofilet said:
Exactly, I'm currently researching convolution and FIR filters for that reason. You always hear it's superior to Parametric EQ but I really want to understand why.

Looking forward to seeing what you'll find.

audiofilet said:
You're saying 256 kb/s AAC would be superior to 320 kb/s MP3?
I can't really speak on that, but in what way?

Well, that is what Apple claims anyway

It seems what AAC does better (among other things) is handling HF content above 16 kHz. It's not very easy to find any comprehensive comparisons though.

dasdoing · Oct 18, 2021

audiofilet said:
I spent roughly 800 minutes, or about 13 hours, performing these tests

Was the no-EQ test the first one?
You will obviously get better at it while doing this for such a long time.

Apesbrain · Oct 18, 2021

A similar test with 151 participants:

High Bitrate MP3 Internet Blind Test: Part 1 - PROCEDURE (Set B = MP3)

A blog for audiophiles about more objective topics. Measurements of audio gear. Reasonable, realistic, no snakeoil assessment of sound, and equipment.

archimago.blogspot.com

audiofilet · Oct 18, 2021

dasdoing said:
Was the no-EQ test the first one?
You will obviously get better at it while doing this for such a long time.

Yeah, I agree.
It did seem like I was getting better at analyzing certain stages of the tracks after 1-2 hours because I started memorizing them in detail.

No-EQ was actually the final test run, but I stopped half way because it was just identical. All the specific portions where I expected a change, because I had heard one during the other tests at that same point, simply were not there.

At least without equalization I can confidently say that they sound 100% identical.

voodooless said:
Looking forward to seeing what you'll find.

Well, that is what Apple claims anyway. It seems what AAC does better (among other things) is handling HF content above 16 kHz. It's not very easy to find any comprehensive comparisons though.

Interesting, I've still got everything set up. I'll convert to AAC and do another run this week. I literally know these tracks intimately now, so I honestly feel like I'd just have to listen to the AAC, but let's see anyway.

Anyway, 256kb/s not 320, correct?

dasdoing · Oct 18, 2021

audiofilet said:
At least without equalization I can confidently say that they sound 100% identical.

well, you should have ended the tests with the no-eq again. that way you would have an estimate of how much the unwanted "training" was a factor

danadam · Oct 18, 2021

audiofilet said:
I do believe that there are elements of a high resolution source that are not as accurately represented in a compressed format.

Huh? Elements of high resolution? Your test was not designed to determine that. It could just as well be (and IMO was) elements of standard resolution whose inaccurate representation you noticed.

voodooless · Oct 18, 2021

audiofilet said:
Anyway, 256kb/s not 320, correct?

Yup. If possible, feed the encoder a 24-bit source file (as you should do with MP3 or Vorbis as well).

dasdoing · Oct 18, 2021

MP3 isn't even being used anymore btw. AAC has a much better quality/bitrate ratio.

audiofilet · Oct 18, 2021

dasdoing said:
well, you should have ended the tests with the no-eq again. that way you would have an estimate of how much the unwanted "training" was a factor

Yeah, that's what I said, the tests without EQ were performed last. I was 100% guessing each time during those.

danadam said:
Huh? Elements of high resolution? Your test was not designed to determine that. It could just as well be (and IMO was) elements of standard resolution whose inaccurate representation you noticed.

Well, the source is 192 kHz/24-bit. Granted, whether it was just the higher resolution is not clear; I was just trying to say that there were elements of the original recording that sounded slightly less refined or less accurate in the compressed one.

voodooless said:
Yup. If possible, feed the encoder a 24-bit source file (as you should do with MP3 or Vorbis as well).

Done. I have a little time right now. I just need to look up a nice command for AAC conversion with ffmpeg, but it's probably the same.
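
A rough idea of what the AAC conversion could look like, as a sketch only. It assumes ffmpeg's built-in `aac` encoder and an `.m4a` output container; other AAC encoders exist and may not sound the same. The function name and the 44.1 kHz target rate are illustrative assumptions.

```python
def aac_command(src: str, dst: str, sample_rate: int = 44100) -> list[str]:
    """Build an ffmpeg argument list for a 256 kb/s AAC encode."""
    return [
        "ffmpeg", "-i", src,
        "-c:a", "aac",            # ffmpeg's native AAC encoder (assumed choice)
        "-b:a", "256k",           # the bitrate discussed in the thread
        "-ar", str(sample_rate),  # assumed target rate, made explicit
        "-map_metadata", "0",     # copy tags from the input file
        dst,
    ]


if __name__ == "__main__":
    # Print the equivalent shell command; run it with subprocess.run(..., check=True).
    print(" ".join(aac_command("input.flac", "output.m4a")))
```

Feeding the original 24-bit FLAC straight into the encoder, as suggested above, means the only variables are the codec and the resampling step.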

bravomail · Oct 18, 2021

audiofilet said:
Conclusion:

I hate all three tracks now and never want to listen to them again. lol

Kudos for testing this subject again. Results will depend hugely on the kind of music/track u r using. Classical music does not stress an MP3 encoder much. Try distorted/nonlinear heavy guitar music. U might notice a difference.
My go to test track is "Hush" by Deep Purple with its beginning "drummer" section played on guitar.
