Simple Dynamic Compression & Expansion

MRC01 · May 18, 2020

To compress or expand dynamics, I was wondering why a simple algorithm isn't used: convert each sample to a floating-point value in the range -1 to 1, then raise it to some power P. If P = 1.0, the signal is unmodified. If 0 < P < 1.0, it compresses dynamics. If P > 1.0, it expands dynamics. Of course, preserve the original sign of each sample (don't let squaring them make them all positive). This has the benefit of being reversible: if you compress dynamics with, for example, P = 1/N, you can restore the original signal using P = N.
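For anyone who wants to try the idea, here is a minimal sketch of it in Python. The function name `power_compand` and the test values are made up for illustration, and it assumes the samples are already normalized to -1..1.

```python
import math

def power_compand(x: float, p: float) -> float:
    """Raise |x| to the power p and put the original sign back.

    Assumes x is already normalized to the range [-1.0, 1.0].
    """
    return math.copysign(abs(x) ** p, x)

# Hypothetical test signal: a few samples at different levels.
samples = [-1.0, -0.5, -0.01, 0.0, 0.01, 0.5, 1.0]

n = 2.0
compressed = [power_compand(s, 1.0 / n) for s in samples]  # P = 1/N
restored = [power_compand(c, n) for c in compressed]       # P = N
```

With P = 1/2 the 0.01 sample comes out near 0.1 while 1.0 stays at 1.0, so quiet samples are raised the most. The round trip is only exact up to floating-point rounding; if the compressed signal is quantized back to integer samples in between, the restore will be approximate.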

I was considering coding this myself and experimenting with it. But I'd like to ask here to see if there is some reason it wouldn't work and I'd be wasting my time.

RayDunzl · May 18, 2020

MRC01 said:
I was considering coding this myself and experimenting with it. But I'd like to ask here to see if there is some reason it wouldn't work and I'd be wasting my time.

It would work, but I believe it would create harmonic distortion.

Create a sine wave, perform the expansion, and see if it is still a sine...

MRC01 · May 18, 2020

Ah, I was thinking about how changing the shape of the wave could be a problem, but couldn't articulate exactly what the problem was. When I read your response it clicked and I thought "of course". That's why compression algorithms are more complex. They've got to boost the small waves by a linear constant to scale them up without changing their shape. For any real number R, R * sin(X) doesn't introduce new frequency components in the FFT. But compression algorithms can't boost the big waves, so they've got to be constantly looking ahead and adjusting the linear constant.

RayDunzl · May 18, 2020

Example Expansion:

Didn't handle the negative values, but you can see the difference in the positive half of the wave.

Audible? I dunno.

The orange wave is the blue wave ^1.5

Looks like it does something odd at the zero crossing (and nearby)

Expansion (blue) compared to same amplitude sine:

MRC01 · May 18, 2020

Right, that's the same thing I saw in my own spreadsheet. The idea I proposed squishes (or expands) each individual bump of the wave. With compression, the values near zero are increased while values close to max are less affected. That means it's non-linear (obvious, since it's taking exponents). This must necessarily introduce new frequency components into the FFT - it adds frequencies that weren't already there.

One way to avoid this is to use only constants. This doesn't squish or expand each individual bump of the wave, but instead multiplies them all by some constant C >= 1.0. This is what a volume knob does. This of course requires changing that constant dynamically, where during quiet sections C is larger, then shrinks back to 1.0 during loud sections. This means it's linear, which doesn't add new frequency components in the FFT, it only changes the constants applied to the frequency components. In other words, it doesn't add frequencies that weren't already there.

Pluto · May 18, 2020

RayDunzl said:
It would work, but I believe it would create ... distortion

Exactly. Compression is, in itself, little more than adjusting a volume control, deftly enough that the listener is unaware of the operation except for the concomitant reduction in dynamic range.

If the attack and decay times are below that of one cycle of audio, the shape of the underlying waveform will be changed.
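A rough way to see this numerically is sketched below. It is only an illustration under assumed choices: a one-pole peak follower, smoothing coefficients standing in for attack and decay times, and a made-up threshold and ratio. It measures how much the gain moves within a single cycle of a full-scale sine.

```python
import math

def envelope(samples, attack, release):
    """One-pole peak follower. `attack` and `release` are smoothing
    coefficients in (0, 1]; 1.0 means the envelope tracks instantly."""
    env = 0.0
    out = []
    for s in samples:
        level = abs(s)
        coeff = attack if level > env else release
        env += coeff * (level - env)
        out.append(env)
    return out

def compressor_gain(env, threshold=0.25, ratio=4.0):
    """Gain for a given envelope level: unity at or below the threshold,
    ratio:1 reduction (on a log scale) above it."""
    if env <= threshold:
        return 1.0
    return (env / threshold) ** (1.0 / ratio - 1.0)

def gain_ripple(attack, release, period=100, cycles=100):
    """Spread (max - min) of the gain over the final cycle of a sine."""
    sine = [math.sin(2 * math.pi * i / period) for i in range(period * cycles)]
    gains = [compressor_gain(e) for e in envelope(sine, attack, release)]
    last_cycle = gains[-period:]
    return max(last_cycle) - min(last_cycle)

fast_ripple = gain_ripple(attack=1.0, release=1.0)      # reacts within a cycle
slow_ripple = gain_ripple(attack=0.01, release=0.0005)  # spans many cycles
```

With the fast settings the gain swings widely inside every cycle, which is the same thing as reshaping the wave. With the slow settings the gain is nearly constant over a cycle, so each cycle is just scaled.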

MRC01 said:
I was considering coding this myself and experimenting with it. But I'd like to ask here to see if there is some reason it wouldn't work and I'd be wasting my time.

There are so many "plug-ins" that do this kind of job that I would have a good play with them first. Then, if you find the existing algorithms wanting, invent and code away!

If reduction of dynamic range is at the heart of your mission (say for late-night listening to high dynamic range material) I'd be inclined to experiment along the lines of four or five bands and compressing the bands independently of each other to avoid pumping effects. But you have to be careful not to shift the resulting overall frequency balance.

Have fun!

MRC01 · May 18, 2020

Pluto said:
... If reduction of dynamic range is at the heart of your mission (say for late-night listening to high dynamic range material) I'd be inclined to experiment along the lines of four or five bands and compressing the bands independently of each other to avoid pumping effects. But you have to be careful not to shift the resulting overall frequency balance. ...

My goal is pure education. I was wondering, why are compression algorithms so complex when the method I mentioned in the OP is so much simpler? Now I know - to avoid distortion, it must be linear, scaling by constant factors, which means constantly changing the scaling factor.

BTW, you can see the distortion comparison directly from the spreadsheet, using FFT.

RayDunzl · May 18, 2020

MRC01 said:
BTW, you can see the distortion comparison directly from the spreadsheet, using FFT.

I got some numbers, but didn't know what to make of them.

MRC01 · May 18, 2020

By "you can see the distortion comparison" I meant:
Multiplying the signal by any constant doesn't change which frequencies appear in the FFT.
Taking the signal to a power as described in the first post, adds new frequencies to the FFT. Those new frequencies are the distortion you mentioned in your first reply.
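The same comparison can be sketched outside a spreadsheet. The code below uses a plain DFT evaluated only at the harmonic bins rather than a full FFT, and the sample count, number of cycles, and helper name `harmonic_magnitudes` are arbitrary choices for illustration. The exponent 1.5 matches the expansion example earlier in the thread.

```python
import cmath
import math

def harmonic_magnitudes(signal, cycles, max_harmonic):
    """Amplitude of harmonics 1..max_harmonic of a tone that completes
    `cycles` whole periods within `signal` (direct DFT at those bins)."""
    n = len(signal)
    mags = []
    for h in range(1, max_harmonic + 1):
        k = h * cycles
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(signal))
        mags.append(2.0 * abs(acc) / n)
    return mags

N = 1024     # samples in the test signal
CYCLES = 8   # whole cycles, so each harmonic lands exactly on a bin
sine = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]

# Constant gain: same shape, smaller size.
scaled = [0.5 * s for s in sine]
# Sign-preserving power law with P = 1.5: the shape changes.
powered = [math.copysign(abs(s) ** 1.5, s) for s in sine]

scaled_h = harmonic_magnitudes(scaled, CYCLES, 5)
powered_h = harmonic_magnitudes(powered, CYCLES, 5)
```

For the scaled sine only the first entry is non-zero. For the powered sine the 3rd harmonic shows up at roughly a tenth of full scale, while the even harmonics stay at zero because the sign-preserving curve is symmetric.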

JustAnandaDourEyedDude · May 18, 2020

To avoid generation of extra harmonics (by nonlinear distortion of sine wave tones) not present in the original music recording, the compression or expansion should be done in the amplitude-frequency domain. This implies a sample time window, and not an instant-by-instant scaling in the pressure-time domain. Perhaps a moving finite window or one encompassing an entire song or track.

The simplest compression-expansion algorithm I can think of is as follows, performed on a sound sample represented as amplitudes of a finite set of sine tones of discrete (distinct) frequencies. Let A(f) be the amplitude A as a function of frequency f, with A in units of pressure or volts or other linear scale and not a log or dB scale. Let B be the maximum amplitude in the sound sample, and C be the minimum amplitude among all the frequencies that you care to preserve, and D the amplitude corresponding to the threshold of hearing. Pick any real number S such that 0 <= S <= (B-D)/(B-C). For each frequency f, calculate A_ce(f) = B - S*(B - A(f)). Discard any negative amplitudes; they correspond to original amplitudes less than C, which you decided you do not care about.

Then the function A_ce(f) is a compressed (S < 1) or expanded (S > 1) version of A(f). The number S is a scaling factor for the dynamic range. The case S=0 gives extreme compression in which all of the original tones wind up with amplitude B to give music that has zero dynamic range. On the other hand, S=(B-D)/(B-C) gives extreme expansion in which the original tones with the minimum amplitude of C wind up with new amplitudes D, while original tones with amplitude B retain their amplitude, i.e., the dynamic range goes from B-C to B-D. The case S=1 just preserves the original tones and original dynamic range.
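As a sketch of that mapping, assuming the spectrum is already available as a table of frequency and linear amplitude (the dict layout, function name, and example values here are made up):

```python
def scale_dynamic_range(amplitudes, s):
    """Apply A_ce(f) = B - S * (B - A(f)) to a dict {frequency_hz: amplitude}.

    B is the largest amplitude present. Results that come out negative
    are discarded, as described in the post.
    """
    b = max(amplitudes.values())
    out = {}
    for freq, a in amplitudes.items():
        new_a = b - s * (b - a)
        if new_a >= 0.0:
            out[freq] = new_a
    return out

# Hypothetical three-tone spectrum, linear amplitudes.
spectrum = {100: 1.0, 1000: 0.5, 5000: 0.1}

flattened = scale_dynamic_range(spectrum, 0.0)   # everything ends up at B
compressed = scale_dynamic_range(spectrum, 0.5)  # range halved
unchanged = scale_dynamic_range(spectrum, 1.0)   # S = 1 is the identity
expanded = scale_dynamic_range(spectrum, 1.5)    # quietest tone goes negative
```

With S = 0.5 the 0.5 tone moves to 0.75 and the 0.1 tone to about 0.55, while the peak stays at 1.0. With S = 1.5 the 5000 Hz tone would come out negative and is dropped.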

The above compression algorithm does not involve any interaction between distinct frequencies and will not generate any new frequency tones, i.e., no new harmonics. Of course, the compression or expansion will unavoidably change the tonality of a general music sample and the timbre of instruments and voices in it. The advanced compression algorithms are probably attempts to minimize these effects by frequency-dependent scaling, rather than a single scale factor such as S above. Bear in mind that I have no knowledge of how compression is actually carried out. I just based the scaling above by inferring from the current thread the requirement that the overall volume stay the same, which I translated to the peak amplitude B needing to stay the same. You can come up with something similar to make the average volume stay the same, too.

Compression and loudness wars are not on my radar for the most part. Highly commercial pop and rock dropped out of my musical diet some time during the nineties.

RayDunzl · May 18, 2020

If I were re-inventing it... I might start with:

The code would look ahead, decide the factor it thinks is needed for the next few (100? 1000?) milliseconds, and ramp (with increment limits) the factor at zero crossings.

Recalculate for next zero crossing, repeat.

MRC01 · May 19, 2020

Yes, that sounds like a simple implementation. Essentially, your algorithm shifts the volume knob up and down dynamically depending on what's coming next in the music. More specifically, it depends on the peak amplitude in the next N milliseconds (more realistically, seconds). This algorithm would have limits not only for rate of change of the knob, but also knob position: what is the max boost it will give quiet parts?
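A rough sketch of such a "knob rider" might look like the following. It is a simplification under stated assumptions: the gain is nudged on every sample instead of only at zero crossings, and the names and defaults (`lookahead`, `max_boost`, `max_step`, `target_peak`) are invented for the example.

```python
import math

def gain_curve(samples, lookahead, max_boost=4.0, max_step=0.001,
               target_peak=1.0):
    """Return one gain value per sample.

    The wanted gain brings the peak of the next `lookahead` samples up to
    `target_peak`, limited to the range 1.0..max_boost (knob position).
    The applied gain moves toward it by at most `max_step` per sample
    (knob speed).
    """
    gain = 1.0
    gains = []
    for i in range(len(samples)):
        peak = max(abs(s) for s in samples[i:i + lookahead])
        if peak > 0.0:
            wanted = min(max_boost, target_peak / peak)
        else:
            wanted = max_boost
        wanted = max(1.0, wanted)
        gain += max(-max_step, min(max_step, wanted - gain))
        gains.append(gain)
    return gains

# Hypothetical quiet passage: a sine at one tenth of full scale.
quiet = [0.1 * math.sin(2 * math.pi * i / 100) for i in range(5000)]
gains = gain_curve(quiet, lookahead=200)
boosted = [g * s for g, s in zip(gains, quiet)]
```

Here the gain ramps from 1.0 up to the 4x limit over about 3000 samples and then holds, so the quiet sine ends up peaking near 0.4. Note that if the lookahead is short compared with how fast the knob is allowed to turn, the gain may not come back down in time for a sudden loud passage, so a real version would need those two limits to be chosen together.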

RayDunzl · May 19, 2020

MRC01 said:
This algorithm would have limits not only for rate of change of the knob, but also knob position: what is the max boost it will give quiet parts?

A range of 20dB (x10 linear change in voltage) would be pretty big...

If I still had my ancient in-line C code for writing WAV files I could probably play around with it, but I haven't written a line of anything for at least ten years.

---

Here are some WAV files written sample by sample since I couldn't figure out how to tickle the sound synthesizers in Windows. Make some rules, let it go, see what happens.

On Ruth

My Speakers are Broken

The Whale

---

I could GUI on my old Macintosh, but was completely baffled by Windows code, and gave up.

LTig · May 19, 2020

When I listen to heavily compressed music I usually notice a reduction of bass when it gets loud. This makes sense because the bass uses most of the power. It's just disappointing to listen to.

pkane · May 19, 2020

A simple compressor/expander is just a non-linear transfer function that looks linear below some threshold amplitude and starts to apply a knee or a curve at higher values. Any non-linear transfer function will result in some harmonic distortion, so compressors will as well, but there will be less of it when the signal amplitude falls inside the linear region. Dynamic compressors use more complex logic to alter the curve based on content and time.

Here's a simple compressor applied to a 1kHz 0dBFS sine wave, note the transfer function:

This one generates only odd harmonics because the transfer function is symmetric.
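For readers without the plots, the behaviour can be reproduced with a small sketch. The curve below is a hard-knee stand-in with an assumed threshold of 0.5 and a 4:1 slope above it, not the exact transfer function from the screenshots, and the harmonics are measured with a plain DFT at the harmonic bins.

```python
import cmath
import math

def compress_sample(x, threshold=0.5, ratio=4.0):
    """Symmetric hard-knee curve: identity while |x| <= threshold,
    slope 1/ratio above it. Odd-symmetric, so f(-x) == -f(x)."""
    mag = abs(x)
    if mag > threshold:
        mag = threshold + (mag - threshold) / ratio
    return math.copysign(mag, x)

def harmonic_levels(signal, cycles, count):
    """Amplitudes of harmonics 1..count for a tone with `cycles` whole
    periods in `signal` (direct DFT at those bins)."""
    n = len(signal)
    levels = []
    for h in range(1, count + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * h * cycles * i / n)
                  for i, s in enumerate(signal))
        levels.append(2.0 * abs(acc) / n)
    return levels

N, CYCLES = 1024, 8
loud = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]  # 0 dBFS
quiet = [0.4 * s for s in loud]                      # stays below threshold

loud_h = harmonic_levels([compress_sample(s) for s in loud], CYCLES, 5)
quiet_h = harmonic_levels([compress_sample(s) for s in quiet], CYCLES, 5)
```

The full-scale sine picks up a 3rd harmonic of roughly 0.1 while the 2nd and 4th stay at zero. The quiet sine never reaches the threshold, so it passes through unchanged with no added harmonics.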

Here's what a sine wave looks like after simple (symmetric) compression:

An asymmetric compressor might generate a different set of harmonics. This one attempts a simulation of a triode transfer function:

And here's the waveform:

Pluto · May 19, 2020

RayDunzl said:
If I were re-inventing it... I might start with:

The code would look ahead....

"delay line" limiters have been around for at least 40 years, although how that delay was obtained in the analogue world is another story.
