
Simple Dynamic Compression & Expansion

MRC01

Major Contributor
Joined
Feb 5, 2019
Messages
3,476
Likes
4,093
Location
Pacific Northwest
To compress or expand dynamics, I was wondering why a simple algorithm isn't used: convert each sample to a floating-point value in the range -1 to 1, then raise it to some power P. If P = 1.0, the signal is unmodified. If 0 < P < 1.0, it compresses dynamics. If P > 1.0, it expands dynamics. Of course, preserve the original sign of each sample (don't let squaring make them all positive). This has the benefit of being reversible: if you compress dynamics with, for example, P = 1/N, you can restore the original signal using P = N.
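A minimal sketch of the idea described above, assuming samples are already floats in the range -1 to 1. The function names `power_law` and `process` are made up for illustration.

```python
import math

def power_law(sample, p):
    """Raise the magnitude of one sample to the power p, keeping its sign."""
    return math.copysign(abs(sample) ** p, sample)

def process(samples, p):
    """Apply the power law to every sample independently."""
    return [power_law(s, p) for s in samples]

original = [0.0, 0.01, -0.25, 0.5, -1.0]
compressed = process(original, 1 / 2)  # P = 1/N with N = 2: small values are raised
restored = process(compressed, 2)      # P = N undoes it, up to rounding error

print(compressed)
print(restored)
```

Note how -0.25 becomes -0.5 while -1.0 stays at -1.0: values near zero are boosted the most, which is the per-sample reshaping discussed in the replies.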

I was considering coding this myself and experimenting with it. But I'd like to ask here to see if there is some reason it wouldn't work and I'd be wasting my time.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,246
Likes
17,159
Location
Riverview FL
MRC01 said: "I was considering coding this myself and experimenting with it. But I'd like to ask here to see if there is some reason it wouldn't work and I'd be wasting my time."

It would work, but I believe it would create harmonic distortion.

Create a sine wave, perform the expansion, and see if it is still a sine...
 
OP
MRC01

MRC01

Ah, I was thinking about how changing the shape of the wave could be a problem, but couldn't articulate exactly what the problem was. When I read your response it clicked and I thought "of course". That's why compression algorithms are more complex. They've got to boost the small waves by a linear constant to scale them up without changing their shape. For any real number R, R * sin(X) doesn't introduce new frequency components in the FFT. But compression algorithms can't boost the big waves, so they've got to be constantly looking ahead and adjusting the linear constant.
 

RayDunzl

Example Expansion:

Didn't handle the negative values, but you can see the difference in the positive half of the wave.

Audible? I dunno.

The orange wave is the blue wave ^1.5

Looks like it does something odd at the zero crossing (and nearby)

1589818698275.png




Expansion (blue) compared to same amplitude sine:

1589819199718.png
 
OP
MRC01

MRC01

Right, that's the same thing I saw in my own spreadsheet. The idea I proposed squishes (or expands) each individual bump of the wave. With compression, the values near zero are increased while values close to max are less affected. That means it's non-linear (obvious, since it's taking exponents). This must necessarily introduce new frequency components into the FFT - it adds frequencies that weren't already there.

One way to avoid this is to use only constants. This doesn't squish or expand each individual bump of the wave, but instead multiplies them all by some constant C >= 1.0. This is what a volume knob does. This of course requires changing that constant dynamically, where during quiet sections C is larger, then shrinks back to 1.0 during loud sections. This means it's linear, which doesn't add new frequency components in the FFT, it only changes the constants applied to the frequency components. In other words, it doesn't add frequencies that weren't already there.
 

Pluto

Addicted to Fun and Learning
Forum Donor
Joined
Sep 2, 2018
Messages
990
Likes
1,631
Location
Harrow, UK
RayDunzl said: "It would work, but I believe it would create ... distortion"
Exactly. Compression is, in itself, little more than adjusting a volume control, deftly enough that the listener is unaware of the operation except for the concomitant reduction in dynamic range.

If the attack and decay times are shorter than one cycle of the audio, the shape of the underlying waveform will be changed.

MRC01 said: "I was considering coding this myself and experimenting with it. But I'd like to ask here to see if there is some reason it wouldn't work and I'd be wasting my time."
There are so many "plug-ins" that do this kind of job that I would have a good play with them first. Then, if you find the existing algorithms wanting, invent and code away!

If reduction of dynamic range is at the heart of your mission (say for late-night listening to high dynamic range material) I'd be inclined to experiment along the lines of four or five bands and compressing the bands independently of each other to avoid pumping effects. But you have to be careful not to shift the resulting overall frequency balance.

Have fun!
 
OP
MRC01

MRC01

Pluto said: "... If reduction of dynamic range is at the heart of your mission (say for late-night listening to high dynamic range material) I'd be inclined to experiment along the lines of four or five bands and compressing the bands independently of each other to avoid pumping effects. But you have to be careful not to shift the resulting overall frequency balance. ..."
My goal is pure education. I was wondering, why are compression algorithms so complex when the method I mentioned in the OP is so much simpler? Now I know - to avoid distortion, it must be linear, scaling by constant factors, which means constantly changing the scaling factor.

BTW, you can see the distortion comparison directly from the spreadsheet, using FFT.
 

RayDunzl

MRC01 said: "BTW, you can see the distortion comparison directly from the spreadsheet, using FFT."

I got some numbers, but didn't know what to make of them.
 
OP
MRC01

MRC01

By "you can see the distortion comparison" I meant:
Multiplying the signal by any constant doesn't change which frequencies appear in the FFT.
Taking the signal to a power as described in the first post, adds new frequencies to the FFT. Those new frequencies are the distortion you mentioned in your first reply.
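The same comparison can be sketched outside a spreadsheet. This is an illustration only, assuming a test tone that fits a whole number of cycles into the analysis window and using a plain single-bin DFT rather than any particular FFT library. The names `harmonic_magnitude`, `N` and `CYCLES` are made up for the example.

```python
import cmath
import math

N = 1024     # samples in the analysis window
CYCLES = 8   # whole cycles of the test tone inside the window

def harmonic_magnitude(samples, k):
    """Magnitude of DFT bin k, normalized so a full-scale sine reads 1.0."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * abs(acc) / n

sine = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
scaled = [0.5 * s for s in sine]                            # multiply by a constant
powered = [math.copysign(abs(s) ** 1.5, s) for s in sine]   # sign-preserving power, P = 1.5

for name, signal in (("scaled", scaled), ("powered", powered)):
    # Fundamental, 2nd and 3rd harmonic of the test tone.
    mags = [harmonic_magnitude(signal, CYCLES * h) for h in (1, 2, 3)]
    print(name, [round(m, 4) for m in mags])
```

The scaled tone should show energy only at the fundamental, while the powered tone picks up a third harmonic that was not in the input.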
 

JustAnandaDourEyedDude

Addicted to Fun and Learning
Joined
Apr 29, 2020
Messages
518
Likes
820
Location
USA
To avoid generation of extra harmonics (by nonlinear distortion of sine wave tones) not present in the original music recording, the compression or expansion should be done in the amplitude-frequency domain. This implies a sample time window, and not an instant-by-instant scaling in the pressure-time domain. Perhaps a moving finite window or one encompassing an entire song or track.

The simplest compression-expansion algorithm I can think of is as follows, performed on a sound sample represented as amplitudes of a finite set of sine tones of discrete (distinct) frequencies. Let A(f) be the amplitude A as a function of frequency f, with A in units of pressure or volts or other linear scale and not a log or dB scale. Let B be the maximum amplitude in the sound sample, and C be the minimum amplitude among all the frequencies that you care to preserve, and D the amplitude corresponding to the threshold of hearing. Pick any real number S such that 0 <= S <= (B-D)/(B-C). For each frequency f, calculate A_ce(f) = B - S*(B - A(f)). Discard any negative amplitudes; they correspond to original amplitudes less than C, which you decided you do not care about.

Then the function A_ce(f) is a compressed (S < 1) or expanded (S > 1) version of A(f). The number S is a scaling factor for the dynamic range. The case S=0 gives extreme compression in which all of the original tones wind up with amplitude B to give music that has zero dynamic range. On the other hand, S=(B-D)/(B-C) gives extreme expansion in which the original tones with the minimum amplitude of C wind up with new amplitudes D, while original tones with amplitude B retain their amplitude, i.e., the dynamic range goes from B-C to B-D. The case S=1 just preserves the original tones and original dynamic range.
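The scaling rule above can be sketched on a plain list of per-frequency amplitudes. This assumes the amplitudes are already available on a linear scale (obtaining them from audio and resynthesizing is not shown), and it treats "discard" as "set to zero", which is one possible reading. The name `rescale_amplitudes` and the sample values are made up for illustration.

```python
def rescale_amplitudes(amps, s):
    """Apply A_ce = B - S*(B - A) to each linear amplitude, where B is the peak."""
    b = max(amps)
    rescaled = [b - s * (b - a) for a in amps]
    # Assumption: a negative result is discarded by clamping it to zero.
    return [max(0.0, a) for a in rescaled]

amps = [1.0, 0.5, 0.25]  # hypothetical amplitudes of three tones

print(rescale_amplitudes(amps, 0.5))  # compression: [1.0, 0.75, 0.625]
print(rescale_amplitudes(amps, 1.0))  # unchanged:   [1.0, 0.5, 0.25]
print(rescale_amplitudes(amps, 0.0))  # extreme:     [1.0, 1.0, 1.0]
```

In every case the peak amplitude B is left alone and only the distance of each tone below the peak is scaled by S.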

The above compression algorithm does not involve any interaction between distinct frequencies and will not generate any new frequency tones, i.e., no new harmonics. Of course, the compression or expansion will unavoidably change the tonality of a general music sample and the timbre of instruments and voices in it. The advanced compression algorithms are probably attempts to minimize these effects by frequency-dependent scaling, rather than a single scale factor such as S above. Bear in mind that I have no knowledge of how compression is actually carried out. I just based the scaling above by inferring from the current thread the requirement that the overall volume stay the same, which I translated to the peak amplitude B needing to stay the same. You can come up with something similar to make the average volume stay the same, too.

Compression and loudness wars are not on my radar for the most part. Highly commercial pop and rock dropped out of my musical diet some time during the nineties.
 

RayDunzl

If I were re-inventing it... I might start with:

The code would look ahead, decide the factor it thinks is needed for the next few (100? 1000?) milliseconds, and ramp the factor (with increment limits) at zero crossings.

Recalculate for next zero crossing, repeat.
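A rough sketch of that loop, not a tested compressor design. It assumes the whole signal is in memory as floats, measures the look-ahead window in samples rather than milliseconds, and uses made-up parameter names (`window`, `target_peak`, `max_gain`, `max_step`) with arbitrary default values.

```python
import math

def lookahead_gain(samples, window, target_peak=1.0, max_gain=10.0, max_step=0.05):
    """Scale samples by a slowly varying gain chosen from the upcoming peak."""
    out = []
    gain = 1.0
    sign = 0  # sign of the current half-cycle; 0 until the first nonzero sample
    for i, s in enumerate(samples):
        new_sign = (s > 0) - (s < 0)
        # Revise the gain only when the signal crosses zero, so each
        # half-cycle is multiplied by a single constant and keeps its shape.
        if new_sign != 0 and new_sign != sign:
            sign = new_sign
            peak = max((abs(x) for x in samples[i:i + window]), default=0.0)
            wanted = min(max_gain, target_peak / peak) if peak > 0 else gain
            # Increment limit: the gain moves at most max_step per update.
            gain += max(-max_step, min(max_step, wanted - gain))
        out.append(gain * s)
    return out

# Usage: a quiet tone (amplitude 0.1, 20 samples per cycle) is raised gradually.
quiet = [0.1 * math.sin(2 * math.pi * i / 20) for i in range(200)]
boosted = lookahead_gain(quiet, window=40)
print(max(abs(x) for x in quiet), max(abs(x) for x in boosted))
```

Because the gain can only fall by `max_step` per zero crossing, this sketch does not guarantee that a sudden loud passage stays below `target_peak`; a longer look-ahead or a larger step would be needed for that.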
 
OP
MRC01

MRC01

Yes, that sounds like a simple implementation. Essentially, your algorithm shifts the volume knob up and down dynamically depending on what's coming next in the music. More specifically, it depends on the peak amplitude in the next N milliseconds (more realistically, seconds). This algorithm would have limits not only for rate of change of the knob, but also knob position: what is the max boost it will give quiet parts?
 

RayDunzl

MRC01 said: "This algorithm would have limits not only for rate of change of the knob, but also knob position: what is the max boost it will give quiet parts?"

A range of 20dB (x10 linear change in voltage) would be pretty big...
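For reference, the standard conversion between decibels and a linear voltage (amplitude) ratio is ratio = 10^(dB/20). A small sketch, with helper names made up for the example:

```python
import math

def db_to_ratio(db):
    """Convert a level change in dB to a linear voltage (amplitude) ratio."""
    return 10 ** (db / 20)

def ratio_to_db(ratio):
    """Convert a linear voltage (amplitude) ratio to dB."""
    return 20 * math.log10(ratio)

print(db_to_ratio(20))            # 10.0
print(round(ratio_to_db(2), 2))   # 6.02
```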

If I still had my ancient in-line C code for writing WAV files I could probably play around with it, but I haven't written a line of anything for at least ten years.

---

Here are some wave files written sample by sample, since I couldn't figure out how to tickle the sound synthesizers in Windows. Make some rules, let it go, see what happens.

On Ruth

My Speakers are Broken

The Whale

---

I could GUI on my old Macintosh, but was completely baffled by Windows code, and gave up.

LTig

Master Contributor
Forum Donor
Joined
Feb 27, 2019
Messages
5,807
Likes
9,512
Location
Europe
When I listen to heavily compressed music I usually notice a reduction of bass when it gets loud. This makes sense because the bass uses most of the power. It's just disappointing to listen to.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,667
Likes
10,297
Location
North-East
A simple compressor/expander is just a non-linear transfer function that looks linear below some threshold amplitude and starts to apply a knee or a curve at higher values. Any non-linear transfer function will result in some harmonic distortion, so compressors will as well, but there will be less of it when the signal amplitude falls inside the linear region. Dynamic compressors use more complex logic to alter the curve based on content and time.

Here's a simple compressor applied to a 1kHz 0dBFS sine wave, note the transfer function:

1589852570399.png

This one generates only odd harmonics because the transfer function is symmetric.
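A sketch of why a symmetric curve yields only odd harmonics, and none at all for signals that stay below the threshold. This is not the tool used for the plots in this post. It assumes a hard knee applied to linear amplitude (real compressors usually define threshold and ratio in dB), and the names `compress` and `harmonic` and their default values are made up.

```python
import cmath
import math

def compress(x, threshold=0.5, ratio=4.0):
    """Hard-knee static curve: unity slope below threshold, 1/ratio slope above."""
    mag = abs(x)
    if mag > threshold:
        mag = threshold + (mag - threshold) / ratio
    return math.copysign(mag, x)  # same curve for both polarities: symmetric

def harmonic(samples, k):
    """Magnitude of DFT bin k, normalized so a full-scale sine reads 1.0."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * abs(acc) / n

N, CYCLES = 1024, 8  # window length and whole cycles of the test tone

def tone(amplitude):
    return [amplitude * math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]

loud = [compress(s) for s in tone(1.0)]    # peaks are above the threshold
quiet = [compress(s) for s in tone(0.4)]   # stays inside the linear region

for name, signal in (("loud", loud), ("quiet", quiet)):
    # Fundamental, 2nd and 3rd harmonic of the test tone.
    print(name, [round(harmonic(signal, CYCLES * h), 4) for h in (1, 2, 3)])
```

The loud tone should show a third harmonic but no second, while the quiet tone passes through with only its fundamental.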

Here's what a sine wave looks like after simple (symmetric) compression:
1589852615452.png


An asymmetric compressor might generate a different set of harmonics. This one attempts a simulation of a triode transfer function:
1589853157780.png


And here's the waveform:
1589853518585.png
 

Pluto

RayDunzl said: "If I were re-inventing it... I might start with: The code would look ahead...."
"Delay line" limiters have been around for at least 40 years, although how that delay was obtained in the analogue world is another story.
 