Hi, I am new to this forum but have been trying to improve my audio setup for a long time. I produce music as a hobby and work as a computer scientist (in algorithm engineering). However, I have no experience with signal processing. For room compensation/frequency response correction, I once used a Behringer DEQ 2496. Then, I switched to a Hifiberry DAC+ DSP running HifiberryOS. In the last month, I have been working in my free time on a Loudness Compensation (LC) algorithm that runs on the DAC+ DSP. I implemented the algorithm in SigmaStudio (see the attached dsp.dspproj).
My LC algorithm analyzes the loudness of the incoming signal to adjust the LC intensity. It is intended to be used in the following way: the signal exiting the DSP should always be played back at the same volume, i.e., you set your DAC/amp to a constant, fixed volume and instead adjust the volume digitally at the source device. The main philosophy of this approach is that we assume the input signal has been created/mastered at 80 Phon and that it should sound the same at any loudness. There are upsides and downsides to this approach:
Pro:
- The DSP can be oblivious to the state of the volume control. Traditional LC implementations use the position of the amplifier's gain knob as a reference for the equalization intensity; this implementation does not require that.
- The algorithm adapts to signals of different loudness. Unless you are using a music streaming service with "equal loudness" turned on, the loudness of the signal output by your device may vary significantly depending on the content. With a traditional LC implementation you would have to adjust the gain on the amp (and thereby also change the LC intensity incorrectly), whereas my algorithm automatically adapts to changing loudness.
- This implementation does not change the signal at 80 Phon, if configured correctly. Most other implementations do not support calibration to a reference loudness and therefore almost certainly will alter the sound at 80 Phon.
Contra:
- With very dynamic signals, my LC algorithm may change the LC intensity unnecessarily and erratically. However, with the computed reference loudness averaged over the past ~400 ms, I could not perceive this effect in practice.
- This implementation requires an input signal with a bit depth of at least 24 bits. This is necessary because the volume must be controlled digitally at the source device, and attenuating in the digital domain discards resolution; 16 bits are only sufficient for signals played back near full scale.
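The bit-depth argument can be made concrete with a quick calculation: every 20·log10(2) ≈ 6.02 dB of digital attenuation at the source discards roughly one bit of resolution. A small Python sketch (the function name is mine, not part of the DSP project):

```python
import math

def effective_bits(bit_depth, attenuation_db):
    """Bits of resolution that remain after attenuating a signal
    digitally by `attenuation_db` dB; each 20*log10(2) ~= 6.02 dB
    of attenuation costs one bit."""
    return bit_depth - attenuation_db / (20.0 * math.log10(2.0))
```

Attenuating by 40 dB at the source leaves a 16-bit signal with only about 9 bits of resolution, while a 24-bit signal still retains about 17 bits.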
Algorithm description:
The algorithm first computes the digital loudness L(l,r) of the input signal (channels l and r) in dBFS, using the flipped ISO 226 80 Phon curve W_80 as frequency weighting (see attached picture) and a first-order low-pass filter F_T with a configurable time constant T (T = 0.4 s by default). More formally, it computes L(l,r) = 10 * log_10(F_T((W_80(l)^2 + W_80(r)^2) / 2)). It then computes the real perceived loudness P(l,r) = 80 + L(l,r) - L_80 in Phon, using the parameter L_80, which is the digital loudness in dBFS that results in a real perceived loudness of 80 Phon. This parameter depends on the gain of the amplifier and the playback device and should be calibrated as follows: play back a 1000 Hz sine wave and adjust the digital volume until a calibrated SPL meter reports 80 dB SPL; L_80 is then the digital loudness of that sine wave. This works because the ISO 226 80 Phon curve is neutral at 1000 Hz (80 Phon corresponds to 80 dB SPL there). Note that using the flipped ISO 226 80 Phon curve for weighting is philosophically consistent, because we assume the signal has been created/mastered at 80 Phon.
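The loudness estimation can be sketched in Python as follows. This is my own illustration, not the SigmaStudio implementation; the samples `l_w` and `r_w` are assumed to have already passed through the (hypothetical) W_80 weighting filter, which is a separate filter block in the project:

```python
import math

def loudness_estimator(fs=48000, T=0.4, L_80=-25.0):
    """Returns a per-sample processor that estimates the perceived
    loudness P(l,r) in Phon from W_80-weighted samples.
    T is the low-pass time constant in seconds; L_80 is the digital
    loudness (dBFS) calibrated to 80 dB SPL with a 1 kHz sine."""
    alpha = 1.0 - math.exp(-1.0 / (fs * T))  # one-pole smoothing coefficient
    state = 0.0

    def process(l_w, r_w):
        nonlocal state
        # F_T: first-order low-pass on the mean channel power
        power = (l_w * l_w + r_w * r_w) / 2.0
        state += alpha * (power - state)
        # L(l,r): digital loudness in dBFS (floored to avoid log of zero)
        L = 10.0 * math.log10(max(state, 1e-12))
        # P(l,r): real perceived loudness in Phon
        return 80.0 + L - L_80

    return process
```

A full-scale weighted signal (power 1, i.e., 0 dBFS) with L_80 = -25 dBFS settles at P = 80 + 0 - (-25) = 105 Phon once the low-pass has converged.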
Then, given the resulting real loudness x = P(l,r) in Phon, the algorithm computes three values d(x), a(x), and t(x), where d(x) = 1 - (x - 40) / 40 = (80 - x) / 40, a(x) = |d(x)|, and t(x) = 1 if and only if d(x) < 0. d(x) is the loudness deviation relative to 80 Phon (normalized so that 40 Phon maps to 1 and 120 Phon to -1), a(x) is the absolute deviation, and t(x) is the deviation type (t(x) = 1 means the signal is perceived as louder than 80 Phon).
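In Python, the three values look like this (a plain-math sketch of the formulas above; the function name is mine):

```python
def deviation(x):
    """Loudness deviation terms for perceived loudness x in Phon.
    Returns (d, a, t): signed deviation, absolute deviation, and
    deviation type (True means perceived as louder than 80 Phon)."""
    d = 1.0 - (x - 40.0) / 40.0  # equivalently (80 - x) / 40
    a = abs(d)
    t = d < 0.0
    return d, a, t
```

At 80 Phon the deviation vanishes, at 40 Phon it is +1, and at 120 Phon it is -1.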
Now, let y be (a channel of) the input signal, i.e., l or r. With a(x) and t(x), the algorithm then computes the output signal as follows:
If t(x) = 0, i.e., the signal is perceived as quieter than 80 Phon, then the output signal o(x,y) is o(x,y) = C_L(y) a(x) + y (1 - a(x)).
Similarly, if t(x) = 1, i.e., it is perceived as louder than 80 Phon, then we have o(x,y) = C_H(y) a(x) + y (1 - a(x)).
Here, C_L and C_H (see attached pictures) are sets of filters that transform the ISO 226 80 Phon curve into the ISO 226 40 Phon and 120 Phon curves in the audible band, respectively (see the relative curves in the attached picture). Because the output is a linear blend, the algorithm also closely follows the intermediate contours, e.g., the ISO 226 60 Phon and 100 Phon curves at 60 Phon and 100 Phon, respectively.
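The blending step can be sketched as follows, assuming per-sample access to the dry signal and to the same sample after the (hypothetical) C_L and C_H filter sets. Since C_L targets the 40 Phon curve, it is the branch for playback quieter than 80 Phon, and C_H the branch for louder playback:

```python
def compensate(x, y_dry, y_CL, y_CH):
    """Output sample o(x,y) for perceived loudness x in Phon.
    y_dry is the unfiltered input sample; y_CL and y_CH are the same
    sample after the C_L and C_H filter sets (computed elsewhere).
    At x = 80 the dry signal passes through unchanged; at x = 40
    (resp. 120) the output is the fully C_L- (resp. C_H-) filtered
    signal."""
    d = 1.0 - (x - 40.0) / 40.0        # signed deviation from 80 Phon
    a = abs(d)                         # absolute deviation = blend weight
    y_wet = y_CH if d < 0.0 else y_CL  # louder than 80 Phon -> C_H branch
    return y_wet * a + y_dry * (1.0 - a)
```

In between the anchor points the output is a proportional mix, e.g., at 60 Phon it is half wet (C_L) and half dry.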
Finally, the algorithm limits the signals to prevent clipping. However, if you have enough headroom (L_80 = -25 dBFS in my case), the limiters never act.
In conclusion, the algorithm transforms the input signal such that it is not altered if it is played back at 80 Phon (which is correct, because we assume it has been created/mastered at 80 Phon). If, however, it is played back quieter or louder, it is transformed such that, according to ISO 226, it sounds the same except for its loudness.
Let me know if you have questions regarding the algorithm or the implementation. Do you have suggestions for improving the algorithm?