
MQA creator Bob Stuart answers questions.

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
The compression ratio is compressed size / uncompressed size × 100, so lower is better.

So it says it's 1.4% less efficient?
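The definition above is easy to state in code. A minimal sketch (the byte counts are made-up examples, not figures from the table); multiplying before dividing keeps the example result exact:

```python
def compression_ratio(compressed_bytes, uncompressed_bytes):
    """Ratio as a percentage of the original size; lower is better."""
    return compressed_bytes * 100 / uncompressed_bytes

# e.g. a 100 MB WAV that a lossless codec shrinks to 58 MB:
print(compression_ratio(58_000_000, 100_000_000))  # 58.0
```

So a codec whose ratio is 1.4 percentage points higher produces files 1.4% of the original size larger, which is the "less efficient" reading above.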
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
The best in the chart is OptimFROG which is only 2.4% smaller than FLAC, and much slower.
I wonder if we can get a mathematical baseline to estimate the theoretical limit: what is the smallest % that is possible? That's what I meant earlier by measuring Shannon entropy of PCM/WAV files.
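One crude way to start on that baseline is the zeroth-order Shannon entropy of the sample histogram. A hedged sketch: this deliberately ignores inter-sample correlation, which is exactly what FLAC's predictors exploit, so it is only a rough upper bound on the achievable bits per sample, not the theoretical limit the post is asking about:

```python
import math
from collections import Counter

def entropy_bits_per_sample(samples):
    """Zeroth-order Shannon entropy: H = -sum(p * log2(p)) over the histogram."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Two equally likely sample values carry exactly one bit per sample:
print(entropy_bits_per_sample([0, 1] * 500))  # 1.0
```

A real estimate would need conditional entropy on previous samples (or entropy of the prediction residual), which is much closer to what the codecs in the table actually achieve.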
 
Last edited:

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,697
Location
Hampshire
FLAC is not that good, although the distance between good and great is very small. My team at Microsoft developed WMA Lossless with the goal of beating all the others in efficiency. You can see that ("WMAL") in this table from HA wiki: https://wiki.hydrogenaud.io/index.php?title=Lossless_comparison

View attachment 27481

That extra 1.4% of efficiency cost us in the marketplace: at the time, many embedded CPUs were too slow to decode it.

This is a bit like getting cream/butter from milk. The first pass will take out most of the fat. What is left over is very little.
According to that table, WMAL has slightly worse efficiency than FLAC. Of course, the exact results will depend on the material, but it's not looking good for WMAL.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,451
Likes
36,880
I can get FLAC to compress a 24-bit WAV file to only 8.9% of the size of the original. Of course, I cheated: I was encoding a tone at a quarter of the sample rate. Looks like that is as good as FLAC gets, because encoding all zeros gave the same file size in FLAC.
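The "cheat" is easy to reproduce. A hedged sketch of the idea (the post used 24-bit; this writes 16-bit mono via the stdlib `wave` module for simplicity, and the filename and amplitude are arbitrary): a sine at exactly fs/4 visits only four sample values per period, so FLAC's linear predictor models it almost perfectly.

```python
import struct
import wave

FS = 44100
A = 30000  # peak amplitude, just below 16-bit full scale

# One period of a sine sampled at exactly FS/4: [0, A, 0, -A] repeating.
pattern = [0, A, 0, -A]

with wave.open("quarter_rate_tone.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)    # 16-bit samples
    w.setframerate(FS)
    frames = bytearray()
    for n in range(FS):  # one second of audio
        frames += struct.pack("<h", pattern[n % 4])
    w.writeframes(frames)
```

Running `flac` on the result should get close to the floor described above, since the residual after prediction is essentially zero.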
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
You might get it even smaller if the amplitude is lower: all the higher order bits will be zero (assuming integer, not float).
The theoretical minimum will depend on the content of the music. Tests like yours give a lower bound. The upper bound could be measured from PCM files of complex signals having wide dynamic range with peak levels near max / 0 dB.
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
Upper bound is no compression at all. Try compressing white noise.
It can actually be bigger than the source file due to metadata.
 

PierreV

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
1,437
Likes
4,686
Mathematically, I can define a square wave that has a single, unique value for every time t; no need for a single time t to have multiple values:
F(t) = sign(cos(t))
Consider a transition point (say, t = pi/2). When t = pi/2, my square wave (voltage) is 0. Just before that, it's 1. Just after that, it's -1.
Thus, there are never multiple voltages at the same time. But there is an infinite rate of change, which is impossible in the real world.

Yes, I am aware of that; there are still discontinuities, though, and it is a convergence issue. The Heaviside approach is a convention, but a useful tool. sin works as well, of course, as do many other constructions. IMHO it is a discontinuous pulse train, not a "wave". The definition of a function requires each point of the domain to map to exactly one point of the co-domain, and that would be violated by the theoretical perfect square "wave" as drawn on audio sites. In the real world one needs a dx, even an infinitely small one; in that case it is a function. Without a dx, it is not a function, or is one by convention only.

It's nitpicking, maybe, but many people object, rightly imho, to the staircase representation of a digitally sampled signal as it opens the door to misinterpretation.

I feel the "square wave" representation leads to the same type of misinterpretation. Here's the mathematically correct representation of the function you gave for the "square wave" - a bit different from what people usually picture in their mind.

1560214858249.png

1560215327708.png


Note: not trying to lecture or be pedantic in any way, and I am sure you are well aware of all this. But if the staircase representation should be criticized as it is rightly in that famous video, the square "wave" deserves the same treatment I think.
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
I get your point. I wasn't trying to be pedantic either, but point out that of the 2 reasons mentioned here why a square wave can't exist in nature, one is more fundamental than the other.
Reason 1: because it has multiple values for a single instant in time
Reason 2: because it requires an infinite rate of change
The first reason is not fundamental because it is possible to define a square wave function that doesn't have multiple Y values for a given X value. However, the second is fundamental because it is impossible to define a square wave that doesn't have an infinite rate of change.
Whether a square wave really is a wave... I'll leave that to the philosophers. Suffice to say that it's a term in common use.
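The band-limit point can be made concrete by truncating the Fourier series of a square wave. A minimal sketch (the 1 kHz fundamental is an illustrative choice; the 25 kHz and 1 MHz cutoffs echo the bandwidths discussed later in the thread): with more harmonics the edges sharpen, but the slew rate at each transition stays finite, so the infinite rate of change never appears.

```python
import math

def bandlimited_square(t, f0, max_freq):
    """Fourier series of a square wave, truncated at max_freq:
    (4/pi) * sum over odd k of sin(2*pi*k*f0*t) / k."""
    total = 0.0
    k = 1
    while k * f0 <= max_freq:
        total += math.sin(2 * math.pi * k * f0 * t) / k
        k += 2
    return 4 / math.pi * total

# A 1 kHz square wave limited to 25 kHz vs 1 MHz of bandwidth,
# evaluated mid-plateau (t = 0.25 ms): both sit near +1,
# but the ripple near the edges differs.
print(bandlimited_square(2.5e-4, 1000, 25e3))
print(bandlimited_square(2.5e-4, 1000, 1e6))
```

This is also why the "perfect" square wave picture is a limit object: every physically realizable (band-limited) version is a smooth function with finite slope everywhere.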
 

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,407
Likes
4,004
Location
Pacific Northwest
Upper bound is no compression at all. Try compressing white noise.
It can actually be bigger than the source file due to metadata.
Interesting experiment. I just tried it: got 49.7% ratio (FLAC half the size of the original).
But that file was at -6 dB, so the top bit of amplitude wasn't used. I made another at digital full scale and it FLAC compressed at 50.2%.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Can you link to a paper about the fast integrator?

That's a good, albeit older, read: https://www.pnas.org/content/100/10/6151.

Please note that the study was done just above the threshold of hearing, so the integration times needed were of the order of milliseconds. At 1000x sound pressures - that is, at normal listening level - they are of the order of microseconds.

In later studies, coefficients of the integration algorithm were refined, yet its formula remained the same.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Thanks. But that misses the last part of my sentence. I like to see @Sergei run his listening tests and post his observation and files. Then we can get somewhere as opposed to a theoretical discussion, or dismissal of the results after the fact because the test files were not this way or that way.

Cat is out of the bag on this one. Will take Amir's suggestion when something like this is needed again.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Fair point. With my devil's advocate hat on, I'd argue telling you what you're supposed to hear before you hear it would affect what you hear. Is there a way to do a sealed post with the "here's what you should have heard and why" part that can only be opened some time later, or do we just have to trust people not to open spoiler tags in this sort of situation?
I like the idea.
Having said that I'm missing how this specific test is relevant. It's an interesting demonstration of a phenomenon I didn't know about, but it's something that can be captured and reproduced by the existing recording/playback chain.

It demonstrates that the hearing system entangles duration and loudness. The LTI theory doesn't entangle duration and power. Thus, an energy-preserving linear transform valid under the LTI, such as a filter, may not be automatically perceptually faithful.

This was an illustration of the statement that widening an analog "too sharp to represent" pulse by replacing it with a wider "sampling rate friendly" pulse having the same LTI energy, or even the same perceptually accurate energy, may not be faithful because the perceptual timing may be off.
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
I like the idea.


It demonstrates that the hearing system entangles duration and loudness. The LTI theory doesn't entangle duration and power. Thus, an energy-preserving linear transform valid under the LTI, such as a filter, may not be automatically perceptually faithful.

This was an illustration of the statement that widening an analog "too sharp to represent" pulse by replacing it with a wider "sampling rate friendly" pulse having the same LTI energy, or even the same perceptually accurate energy, may not be faithful because the perceptual timing may be off.

The key word in your statement being MAY.

This is waxing hypothetical taken to the extreme.

It's just another way of saying that the effects of anti-aliasing filters may be audible under some conditions.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
Put differently: construct square wave (A) using 1 MHz bandwidth. Construct square wave (B) using 25 kHz bandwidth. All else equal: frequency, amplitude, phase. We humans can't hear the difference between A and B. At least, I've never seen evidence suggesting we can.

Depends on the amplitude. The 25 kHz version may still keep the cone velocity subsonic, whereas the 1 MHz one could require the transducer cone to move faster than the speed of sound, generating a shockwave.

A regular sound wave can reach at most 194 dB SPL: very loud and ear-damaging, but that's it. A strong enough shockwave topples buildings. Or the main character in "Back to the Future" :)
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
The key word in your statement being MAY.

You got it. Precisely. Each of us only MAY get into a car accident on a given day. Still, isn't it wise to wear a seat belt every time anyway?

I view higher sampling rates in a similar light. Most of the time we don't need them to faithfully record music. Yet sometimes we do.

Maybe to capture that once-in-a-lifetime crazy electric guitar solo, the heart-grabbing edginess of which would be smoothed to the point of boredom by a lower sampling rate.
 

nscrivener

Member
Joined
May 6, 2019
Messages
76
Likes
117
Location
Hamilton, New Zealand
You got it. Precisely. Each of us only MAY get into a car accident on a given day. Still, isn't it wise to wear a seat belt every time anyway?

I view higher sampling rates in a similar light. Most of the time we don't need them to faithfully record music. Yet sometimes we do.

Maybe to capture that once-in-a-lifetime crazy electric guitar solo, the heart-grabbing edginess of which would be smoothed to the point of boredom by a lower sampling rate.

Actually, no, I take "may" to mean that it's not demonstrated and may not be an effect at all.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
The fact that some people under ideal conditions can discern 44-16 from higher rate formats was discussed and referenced earlier in this thread.
Was it? Didn't someone point out that in terms of sample rate and bandwidth they may simply be hearing audible products of intermodulation in the hardware rather than the format itself..?
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
That's a good, albeit older, read: https://www.pnas.org/content/100/10/6151.

Please note that the study was done just above the threshold of hearing, so the integration times needed were of the order of milliseconds. At 1000x sound pressures - that is, at normal listening level - they are of the order of microseconds.

In later studies, coefficients of the integration algorithm were refined, yet its formula remained the same.
Could you point me to the part of the paper which talks about a fast integrator working over a time frame of microseconds?
The lowest integration time I saw in the graphs there was ~1ms.
 

Costia

Member
Joined
Jun 8, 2019
Messages
37
Likes
21
Interesting experiment. I just tried it: got 49.7% ratio (FLAC half the size of the original).
But that file was at -6 dB, so the top bit of amplitude wasn't used. I made another at digital full scale and it FLAC compressed at 50.2%.
That doesn't sound right.
How did you generate the noise?
Did you make a stereo file with 2 identical channels?
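Those questions matter: full-scale uniform white noise uses all 16 bits of every sample, so FLAC should land near 100%, and a ~50% ratio usually means the generator didn't exercise the full word (low amplitude, duplicated stereo channels, or Gaussian noise at modest RMS). A hedged sketch of the maximal-entropy version of the test (filename and one-second length are arbitrary):

```python
import random
import struct
import wave

random.seed(0)  # deterministic, so the experiment is repeatable

with wave.open("white_noise.wav", "wb") as w:
    w.setnchannels(1)    # mono -- duplicated stereo channels would roughly halve the ratio
    w.setsampwidth(2)    # 16-bit
    w.setframerate(44100)
    frames = bytearray()
    for _ in range(44100):  # one second of full-scale uniform noise
        frames += struct.pack("<h", random.randint(-32768, 32767))
    w.writeframes(frames)
```

Compressing this with `flac` is the cleanest way to see the "upper bound is no compression at all" claim in practice.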
 

SIY

Grand Contributor
Technical Expert
Joined
Apr 6, 2018
Messages
10,359
Likes
24,661
Location
Alfred, NY
This is waxing hypothetical taken to the extreme.

It's goalpost-moving. We're now far into the territory of irrelevant, obfuscatory, and scattered. The original (incorrect) points related to signal generation, capture, and replay have long been forgotten, which may be the idea.
 