
[Update] Can Flac files with different audio MD5 be identical?

Tanser

New Member
Joined
Nov 9, 2023
Messages
2
Likes
1
I've just downloaded two versions of a song: one was ripped from a CD, and the other was downloaded from Ototoy. When I check the audio MD5 in foobar2000, they appear different. However, when I use DeltaWave to compare these two files, it states, 'Files are a bit-perfect match at 16 bits,' and they look the same in the delta spectrogram. This has left me wondering if these two files are actually identical. Additionally, I'm not sure if DeltaWave labeling them as a 'bit-perfect match' means they are the same.





Update: Thanks for your suggestions. I've tried the null test in Audacity.
eac ori.png

webdl ori.png

These are the original audio MD5s of the two files: one was ripped with EAC and the other was web-downloaded.
Now let's compare them in Audacity.
compare ori.png

That's right: the two files have different numbers of leading or trailing samples with value 0.
Now I'll truncate the silence.
truncate silence.png


They look the same, so next I invert one and mix them.
mix.png


A beautiful straight line. But let's check the audio MD5 again, taken just after truncating silence and before inverting.
eac ts.png
webdl ts.png

Now they are finally the same. This demonstrates once again the reliability of CD and EAC. DeltaWave may already perform this alignment automatically, which would explain its 'bit-perfect match' result. This method could also be used to test so-called 'HI-FI CD rippers'.
Thank you guys.
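
For anyone who wants to repeat the check outside Audacity, here is a minimal sketch of the same idea in Python. It is not Audacity's Truncate Silence algorithm: it only strips exact-zero samples from both ends, hashes the raw 16-bit PCM, and does a sample-by-sample null test. The sample values and the helper names (trim_zeros, pcm_md5, nulls) are made up for illustration.

```python
import hashlib
import struct

def trim_zeros(samples):
    """Drop exact-zero samples from both ends; interior zeros are kept."""
    start, end = 0, len(samples)
    while start < end and samples[start] == 0:
        start += 1
    while end > start and samples[end - 1] == 0:
        end -= 1
    return samples[start:end]

def pcm_md5(samples):
    """MD5 of the samples packed as 16-bit signed little-endian PCM."""
    raw = struct.pack("<%dh" % len(samples), *samples)
    return hashlib.md5(raw).hexdigest()

def nulls(a, b):
    """True if subtracting b from a leaves only zeros (a perfect null)."""
    return len(a) == len(b) and all(x - y == 0 for x, y in zip(a, b))

# Hypothetical data: the same "music" with different amounts of padding.
cd_rip = [0, 0, 0, 5, -3, 0, 7, 0]
web_dl = [5, -3, 0, 7, 0, 0]

print(pcm_md5(cd_rip) == pcm_md5(web_dl))            # hashes before trimming
a, b = trim_zeros(cd_rip), trim_zeros(web_dl)
print(pcm_md5(a) == pcm_md5(b), nulls(a, b))         # hashes and null test after
```

With real files you would first decode both to PCM; that step is left out here.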
 


Joachim Herbert

Senior Member
Forum Donor
Joined
Jan 20, 2019
Messages
447
Likes
659
Location
Munich, Germany
Different metadata, maybe. Different copyright notes are enough to change md5.
 

Vincent Kars

Addicted to Fun and Learning
Technical Expert
Joined
Mar 1, 2016
Messages
763
Likes
1,483
"would not change the audio md5"
Correct.
FLAC contains an MD5 for the audio part only. You can use this to check whether the file is corrupted.
Just run flac -t FileToTest.FLAC

By design, if the MD5s are different, the audio data is different. If one file has even one more leading or trailing sample with value 0, the two will of course sound identical but will have different MD5s.
You might try a null test using any audio editor.
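
A small sketch of the point about the extra zero sample, assuming 16-bit signed little-endian PCM and made-up sample values. It does not touch real FLAC files; it only shows that hashing the same audio with one additional leading zero sample gives a different MD5.

```python
import hashlib
import struct

def pcm_md5(samples):
    """MD5 of the samples packed as 16-bit signed little-endian PCM."""
    raw = struct.pack("<%dh" % len(samples), *samples)
    return hashlib.md5(raw).hexdigest()

music = [0, 120, -340, 560, -780, 0]   # hypothetical sample values
padded = [0] + music                   # same audio plus one leading zero sample

print(pcm_md5(music))
print(pcm_md5(padded))
print(pcm_md5(music) == pcm_md5(padded))  # False: the hashed bytes differ
```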
 

kemmler3D

Major Contributor
Forum Donor
Joined
Aug 25, 2022
Messages
2,824
Likes
5,276
Location
San Francisco
If deltawave says they're identical, then they are probably identical. What could easily be different is a different number of leading / trailing samples. One more fraction of a millisecond of silence in one of the files will give a different MD5 value, so if they are ripped from different sources, you should probably expect MD5 to not match.
 

MaxwellsEq

Major Contributor
Joined
Aug 18, 2020
Messages
1,538
Likes
2,280
Hashes like MD5 are designed to ensure that even one bit difference makes a massive change in the hash so that an attacker cannot guess what one file contains based on another file with a very close hash.
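
A quick way to see this behaviour, as a sketch: hash a buffer, flip a single bit in it, hash again, and count how many of the 128 digest bits changed. The buffer contents and the helper name bit_difference are arbitrary choices for the example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = bytes(1000)            # 1000 zero bytes standing in for audio data
flipped = bytearray(original)
flipped[500] ^= 0x01              # flip one bit in the middle

d1 = hashlib.md5(original).digest()
d2 = hashlib.md5(bytes(flipped)).digest()

print(d1.hex())
print(d2.hex())
print(bit_difference(d1, d2), "of 128 digest bits differ")
```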
 

palm

Member
Joined
Mar 4, 2023
Messages
74
Likes
62
Is the audio md5 from foobar calculated before or after decompression? I don’t know how flac works exactly but I assume like zip you could have different compression ratios but the same decoded content.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,299
Likes
36,583
Deltawave will tell you if it adjusted by a few samples to align tracks.
 

tmtomh

Major Contributor
Forum Donor
Joined
Aug 14, 2018
Messages
2,581
Likes
7,299
Different CD drives have different offsets. This means they can rip the identical track but as noted above, the resulting file may have different durations of silence before and/or after the music. This is the most likely explanation here, though as @Vincent Kars notes, you can be certain (well, close enough to certain, anyway) that the audio is the same by running a null test in a free app like Audacity.

In fact, if you load both versions up in Audacity, you don't really even need to run the null test. Just line them both up so the musical data for each one is aligned with the other down to the level of the individual sample, then scroll left and right and examine the silence at the beginning and end. If you see that either one has more or fewer samples of silence than the other, you can be almost certain that's the culprit for the different md5 hashes.
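
The manual line-up can also be sketched in code. This is a brute-force illustration, not how DeltaWave or any ripper actually works: it tries small shifts in both directions and reports the first one where the overlapping samples match exactly. The function name find_offset, the max_shift default, and the sample values are assumptions for the example.

```python
def find_offset(a, b, max_shift=3000):
    """Return s such that a[i + s] == b[i] over the whole overlap, else None.

    Shifts are tried smallest-magnitude first; a positive s means that
    'a' carries s extra samples at the start compared with 'b'.
    """
    for s in sorted(range(-max_shift, max_shift + 1), key=abs):
        if s >= 0:
            x, y = a[s:], b
        else:
            x, y = a, b[-s:]
        n = min(len(x), len(y))
        if n > 0 and x[:n] == y[:n]:
            return s
    return None

# Hypothetical rips of the same track from drives with different offsets.
rip_a = [0, 0, 1, 2, 3, 0]
rip_b = [1, 2, 3, 0, 0, 0]

print(find_offset(rip_a, rip_b, max_shift=4))  # 2
```

A match at a non-zero shift means the audio is the same and only the padding differs, which is exactly the case where the MD5s disagree.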
 

Vincent Kars

Addicted to Fun and Learning
Technical Expert
Joined
Mar 1, 2016
Messages
763
Likes
1,483
palm said: "I don’t know how flac works exactly"
When FLAC encodes, it calculates the MD5 of the audio input, which is linear PCM. It uses the audio part of the input only: no tags, no artwork, etc.
It stores the MD5 in the header (the STREAMINFO metadata block).
When FLAC decodes, you get exactly the same linear PCM as output. FLAC can check this by computing the MD5 of the output and comparing its value with the stored one.
If there is a difference, the file is corrupted.
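
As a rough sketch of where that stored value lives: a FLAC stream starts with the marker "fLaC", followed by a 4-byte metadata block header and the 34-byte STREAMINFO block, whose last 16 bytes are the MD5 of the unencoded audio. The function name flac_audio_md5 is made up, and the example feeds it synthetic bytes instead of a real file. It assumes no ID3 tag has been prepended to the file.

```python
def flac_audio_md5(data: bytes) -> str:
    """Return the audio MD5 stored in STREAMINFO as a hex string."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing fLaC marker)")
    header = data[4:8]
    if len(header) != 4:
        raise ValueError("truncated metadata block header")
    block_type = header[0] & 0x7F                  # low 7 bits: block type
    length = int.from_bytes(header[1:4], "big")    # 24-bit block length
    if block_type != 0 or length != 34:
        raise ValueError("first metadata block is not STREAMINFO")
    streaminfo = data[8:8 + 34]
    if len(streaminfo) != 34:
        raise ValueError("truncated STREAMINFO block")
    return streaminfo[18:34].hex()                 # last 16 bytes hold the MD5

# Synthetic header for demonstration only: 18 zero bytes, then a fake MD5.
fake = b"fLaC" + bytes([0x80]) + (34).to_bytes(3, "big") + bytes(18) + bytes(range(16))
print(flac_audio_md5(fake))  # 000102030405060708090a0b0c0d0e0f
```

For a real file, passing the first 42 bytes read in binary mode should be enough. An all-zero result means the encoder did not store an MD5.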
 