
DR Measurement RMS

chuck_charles

New Member
Joined
Feb 17, 2026
Messages
4
Likes
0
Hello everyone!

Please excuse me if this is the wrong part of the forum to ask this, and redirect me if necessary; this is my first post.

I am currently trying to set up a lightweight command-line DR14 measurement and log for my Linux server.
Since the official plugin does not work under Linux and is not easily scriptable, I tried modifying a Python-based implementation.

This kind of worked: I had to modify the rounding of the final DR value, but then the DR and Peak values matched the official foobar plugin.
But the RMS bugs me to no end. It is agonizingly close, but over- or undershoots by about ±0.03 dB every time. One channel is always larger than the official value and the other lower.

For example "Woman Tonight" by "America" from the 2016 Hearts release by AF and Steve Hoffman:

The official plugin says:
DR Peak RMS Duration Track
DR12 -0.73 dB -15.53 dB 2:25 08-Woman Tonight

Mine outputs:
DR Peak RMS Duration Track
DR12 -0.73 dB -15.51 dB 2:25 08 Woman Tonight.flac
With the
RMS_left = -15.383717 dB and
RMS_right = -15.641628 dB

My code boils down to this:
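import librosa
import numpy as np

# Load at the native sample rate, keeping the channels separate
y, sr = librosa.load(song["path"], sr=None, mono=False)
y = y.T  # shape: (samples, channels)
s = y.shape
# Per-channel DR-style RMS (note the factor 2.0)
y_rms = np.sqrt(2.0 * np.sum(y**2.0, 0) / float(s[0]))
db_rms = 20.0 * np.log10(y_rms)
# Average the two per-channel dB values
db_rms = np.mean(db_rms)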

I tried everything I could think of to get closer to the original's RMS:
  1. Downmixing to mono and then calculating
  2. Using other means to get the mean of left and right RMS (geometric and harmonic)
  3. Truncating to 3s blocks and discarding the last < 3s block
  4. Summing 3s block energies then taking the RMS
  5. Forcing Python to use 32 bit float arithmetic
Nothing worked.
Does anyone have an idea what else I could try, or know how the plugin works "under the hood", since it's a compiled .dll?
 
I know the difference amounts to practically nothing, but I'd like to have 100% parity with the plugin to be able to tag my releases correctly.
 
Have you checked what method the foobar plug-in uses to average the RMS of 2 channels?

Some programs, e.g. Mathematica, when calculating the RMS amplitude (in its AudioMeasurements function), first convert the 2-channel audio to mono by averaging the left and right channel data, i.e. data_mono = 0.5(data_left + data_right).

Also, are you sure about the 2.0 in the RMS calculation ( ... 2.0*np.sum ... )?
 

Attachments

  • RMS Average.png
Forcing Python to use 32 bit float arithmetic
It already does that by itself, so if anything you could try forcing 64-bit instead (but that may not be necessary; see further below).

In dr14tmeter, the audio samples are stored as float32, in audio_file_reader.py:
Code:
convert_16_bit = numpy.float32(2**15 + 1.0)
...
target.Y = target.Y / (convert_16_bit)

The librosa that you are using also loads them as float32, at least according to the documentation:
Code:
librosa.load(..., dtype=<class 'numpy.float32'>, ...)

And then numpy.sum says:
Code:
numpy.sum(a, axis=None, dtype=None, ...)
...
dtype : dtype, optional

    ... The dtype of a is used by default unless ...
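For what it's worth, here is a minimal sketch of how the accumulator type could be forced, assuming the samples arrive as a float32 array of shape (samples, channels). The helper name `dr_rms_sum` and the random test signal are made up for illustration; only the `dtype` argument of `np.sum` comes from the documentation quoted above.

```python
import numpy as np

def dr_rms_sum(y, acc_dtype=None):
    """DR-style RMS (with the factor 2.0) per channel; y is (samples, channels).

    acc_dtype is handed to np.sum as the accumulator dtype (hypothetical helper).
    """
    n = y.shape[0]
    energy = np.sum(y ** 2.0, axis=0, dtype=acc_dtype)
    return np.sqrt(2.0 * energy / float(n))

# Illustrative data only: one minute of uniform noise stored as float32.
rng = np.random.default_rng(0)
y32 = rng.uniform(-0.5, 0.5, size=(44100 * 60, 2)).astype(np.float32)

rms_f32 = dr_rms_sum(y32)                        # accumulates in float32
rms_f64 = dr_rms_sum(y32, acc_dtype=np.float64)  # accumulates in float64
print(rms_f32.dtype, rms_f64.dtype)
print(20.0 * np.log10(rms_f32), 20.0 * np.log10(rms_f64))
```

How far the two results drift apart depends on the signal length and level, so it is worth printing both for a real track rather than relying on this synthetic one.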


db_rms = 20.0 * np.log10(y_rms)
db_rms = np.mean(db_rms)
The Python version you linked takes the mean first, which makes more sense (but is still wrong in general). In compute_dr14.py:
Code:
    y_rms = dr_rms(Y)
    y_rms = np.mean(y_rms)

    db_rms = decibel_u(y_rms, 1.0)
With your version, imagine 0 dB in the left channel and -100 dB in the right channel. The end result will be -50 dB, which doesn't seem right.

The correct way would be to do a normal rms on them:
Code:
rms_overall = np.sqrt( np.mean( y_rms**2.0 ) )
rms_overall_db = decibel_u(rms_overall, 1.0)
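To put numbers on the 0 dB / -100 dB example, here is a small sketch comparing the three ways of combining the per-channel values. The linear values 1.0 and 1e-5 are simply those two levels converted from dB; `to_db` is a local helper, not a function from dr14tmeter.

```python
import numpy as np

def to_db(x):
    return 20.0 * np.log10(x)

# Per-channel RMS as linear amplitudes: 0 dB left, -100 dB right.
y_rms = np.array([1.0, 1e-5])

mean_of_db = np.mean(to_db(y_rms))                 # average of the dB values
db_of_mean = to_db(np.mean(y_rms))                 # average of linear RMS, then dB
db_of_rms = to_db(np.sqrt(np.mean(y_rms ** 2.0)))  # RMS of the RMS values, then dB

print(f"{mean_of_db:.2f} {db_of_mean:.2f} {db_of_rms:.2f}")
# -50.00 -6.02 -3.01
```

The last value is what one would expect physically: removing one of two channels halves the total power, which is about 3 dB.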

With a synthetic signal like this (right channel is a copy of the left one but attenuated by 70 dB):

almost_square.png


Dynamic Range Meter 1.1.1 calculates RMS = -4.24 dB, the python version -7.17 dB and the correct code from above -4.16 dB with float32 in np.sum() and -4.24 dB with float64 in np.sum().

But that is still not exactly what Dynamic Range Meter is doing, AFAICT. The change from mean(y_rms) to sqrt(mean(y_rms^2)) makes almost no difference with normal signals. Here are the values for one album:
Code:
foobar  python_f32  python_f64  correct_f32  correct_f64  correct_f64_diff
 -7.65       -7.66       -7.61        -7.66        -7.61              0.04
 -9.37       -9.43       -9.36        -9.43        -9.36              0.01
 -7.43       -7.44       -7.39        -7.44        -7.39              0.04
 -7.81       -7.81       -7.76        -7.81        -7.76              0.05
 -6.67       -6.66       -6.62        -6.66        -6.62              0.05
 -6.29       -6.31       -6.26        -6.31        -6.26              0.03
 -7.32       -7.35       -7.30        -7.35        -7.29              0.03
 -7.96       -8.03       -7.96        -8.02        -7.96              0.00
 -9.08       -9.13       -9.08        -9.13        -9.08              0.00
 -8.24       -8.28       -8.22        -8.27        -8.22              0.02
 -8.76       -8.78       -8.73        -8.78        -8.73              0.03

The closest I can get is when I split the signal into blocks of 3 seconds, calculate dr_rms() for each block, then do the normal rms of those values:
Code:
foobar  block_rms  block_rms_diff
 -7.65      -7.65            0.00
 -9.37      -9.36            0.01
 -7.43      -7.43            0.00
 -7.81      -7.80            0.01
 -6.67      -6.66            0.01
 -6.29      -6.28            0.01
 -7.32      -7.31            0.01
 -7.96      -8.00           -0.04
 -9.08      -9.12           -0.04
 -8.24      -8.24            0.00
 -8.76      -8.75            0.01
In this case there is also no difference between float32 and float64 np.sum().
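A rough sketch of that block-wise calculation, for anyone who wants to reproduce it. This is a guess at the procedure described above, not the plugin's actual algorithm: it assumes a shorter final block is kept as its own block and that blocks and channels are combined with one plain RMS. The function name `block_rms_db` is made up.

```python
import numpy as np

def dr_rms(y):
    """DR-style RMS per channel (includes the factor 2.0); y is (samples, channels)."""
    return np.sqrt(2.0 * np.sum(y ** 2.0, axis=0) / float(y.shape[0]))

def block_rms_db(y, sr, block_seconds=3.0):
    """RMS of the per-block DR RMS values, in dB.

    Assumption: a shorter final block is kept as its own block, and
    blocks and channels are combined with one plain RMS.
    """
    block_len = int(round(block_seconds * sr))
    y64 = np.asarray(y, dtype=np.float64)
    blocks = [y64[i:i + block_len] for i in range(0, y64.shape[0], block_len)]
    per_block = np.array([dr_rms(b) for b in blocks])  # (blocks, channels)
    return 20.0 * np.log10(np.sqrt(np.mean(per_block ** 2.0)))

# Synthetic check: 7 s of a full-scale 1 kHz sine in both channels.
sr = 44100
t = np.arange(7 * sr) / sr
sine = np.sin(2.0 * np.pi * 1000.0 * t)
y = np.stack([sine, sine], axis=1)
level_db = block_rms_db(y, sr)
print(f"{level_db:.4f}")
```

For the sine the result should sit at essentially 0 dB regardless of how the blocks are cut, so differences between variants only show up on real music.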
 
Also, are you sure about the 2.0 in the RMS calculation ( ... 2.0*np.sum ... )?
DR14 is using 2.0 when calculating RMS of the blocks. Possibly so that the DR value of a pure sine wave is 0.

Or maybe it's an AES17 thing?


 
Thank you so much for the very comprehensive answer.

@NTK I know the 2.0 is not part of the "normal" RMS formula from EE, but it is mentioned in the 2009 Algorithmix paper, and without it the values are even further off:
1771414505859.png

@danadam On mean(log(Y)) versus log(mean(Y)): I know that the log distorts the arithmetic mean (by Jensen's inequality, the method in the Python code should always come out a bit lower than the "correct" way), but in my testing the difference was in the third decimal of my dB values, so I figured this is not the (main) source of the discrepancy.
I already tried your approach using the 3 s blocks (downmixing L/R before or after taking the log did not help) and still ended up with an error (the numerical stability should be good enough). Maybe the plugin treats the leftover seconds that do not fill the last block as a separate block and overemphasizes them relative to their size? I will try this next.
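As a quick check of the "third decimal" observation, here is a sketch that plugs the two per-channel values from the first post into the three averaging variants. Only the two dB figures come from the thread; everything else is plain arithmetic.

```python
import numpy as np

# Per-channel RMS values in dB, taken from the first post.
rms_db = np.array([-15.383717, -15.641628])
rms_lin = 10.0 ** (rms_db / 20.0)

mean_of_db = np.mean(rms_db)                                   # mean of logs
db_of_mean = 20.0 * np.log10(np.mean(rms_lin))                 # log of mean
db_of_rms = 20.0 * np.log10(np.sqrt(np.mean(rms_lin ** 2.0)))  # log of RMS

print(f"{mean_of_db:.4f} {db_of_mean:.4f} {db_of_rms:.4f}")
```

All three land within a few thousandths of a dB of each other and round to -15.51 dB, so the choice of averaging alone does not seem to account for the plugin's -15.53 dB.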

Again thank you all for the time you took to look into this.
 
Spent a little bit of time looking into the various DR meter implementations. Yes, they have redefined a "DR_RMS" by multiplying the standard definition by √2, so that a full-scale sine wave gives a DR_RMS value of 1.0 (or 0 dB). Annoying, but no big deal once you know what it is. Just add 3.0103 dB to the standard definition.
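A short sketch to confirm the offset numerically, assuming a full-scale 1 kHz sine sampled at 48 kHz (both values are arbitrary choices for illustration):

```python
import numpy as np

sr = 48000
t = np.arange(sr) / sr                     # one second of samples
sine = np.sin(2.0 * np.pi * 1000.0 * t)    # full-scale 1 kHz sine

std_rms = np.sqrt(np.mean(sine ** 2.0))    # standard RMS, close to 1/sqrt(2)
dr_rms_value = np.sqrt(2.0) * std_rms      # DR-style RMS, close to 1.0

std_db = 20.0 * np.log10(std_rms)
dr_db = 20.0 * np.log10(dr_rms_value)
offset_db = 20.0 * np.log10(np.sqrt(2.0))  # the constant offset between the two
print(f"{std_db:.4f} {dr_db:.4f} {offset_db:.4f}")
```

The standard RMS comes out near -3.0103 dB and the DR-style value near 0 dB, which matches the 3.0103 dB correction mentioned above.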

In the DR14 T.Meter, I found in the "compute_dr14.py" the following code snippet (between the two "..." are lines 96-99):
https://github.com/simon-r/dr14_t.meter/blob/master/dr14tmeter/compute_dr14.py

Python:
def compute_dr14(Y, Fs, duration=None, Dr_lr=None):
...
    y_rms = dr_rms(Y)
    y_rms = np.mean(y_rms)

    db_rms = decibel_u(y_rms, 1.0)
...
    return (dr14, db_peak, db_rms)

Where dr_rms() and decibel_u() are defined in "audio_math.py" as:
Python:
def dr_rms(y):
    n = y.shape
    return numpy.sqrt(2.0 * numpy.sum(y**2.0, 0) / float(n[0]))

def decibel_u(y, ref):
    return 20.0 * numpy.log10(y / ref)
 
However, in a different implementation, the signal is partitioned into 3 second blocks (following the DR algorithm in the Algorithmix paper), and the "averaged" RMS is the root mean square of the DR_RMS of all the 3 second blocks.
https://github.com/janw/drmeter/blob/main/drmeter/algorithm.py
Python:
from __future__ import annotations

import math

import numpy as np
import soundfile as sf

from drmeter.exceptions import FileTooShort
from drmeter.models import AudioData, DynamicRangeResult
from drmeter.utils import ignore_div0

BLOCKSIZE_SECONDS = 3
UPMOST_BLOCKS_RATIO = 0.2
NTH_HIGHEST_PEAK = 2

MIN_BLOCK_COUNT = 1 // UPMOST_BLOCKS_RATIO
MIN_DURATION = MIN_BLOCK_COUNT * BLOCKSIZE_SECONDS

SUPPORTED_EXTENSIONS = {f".{fmt.lower()}" for fmt in sf.available_formats()}


def _analyze_block_levels(
    data: sf.SoundFile,
    total_blocks: int,
    blocksize: int,
) -> tuple[np.ndarray, np.ndarray]:
    block_rms = np.zeros((total_blocks, data.channels))
    block_peak = np.zeros((total_blocks, data.channels))
    for nn, block in enumerate(data.blocks(blocksize=blocksize)):
        interim = 2 * (np.power(np.abs(block), 2))
        block_rms[nn] = np.sqrt(interim.mean(axis=0, keepdims=True))
        block_peak[nn] = np.abs(block).max(axis=0)

    block_rms.sort(axis=0)
    block_peak.sort(axis=0)
    return block_rms, block_peak


def dynamic_range(data: AudioData) -> DynamicRangeResult:
    blocksize = round(BLOCKSIZE_SECONDS * data.samplerate)
    total_blocks = math.ceil(data.frames / blocksize)
    if total_blocks < MIN_BLOCK_COUNT:
        raise FileTooShort(f"File cannot be shorter than {MIN_DURATION} seconds")

    block_rms, block_peak = _analyze_block_levels(data, total_blocks=total_blocks, blocksize=blocksize)
    with ignore_div0():
        rms_pressure = np.sqrt((np.power(block_rms, 2)).mean(axis=0))
        peak_pressure = block_peak[-1]

    upmost_blocks = round(total_blocks * UPMOST_BLOCKS_RATIO)
    upmost_blocks_rms = block_rms[-upmost_blocks:]
    pre0 = np.power(upmost_blocks_rms, 2).sum(axis=0)
    pre1 = np.repeat(upmost_blocks, data.channels, axis=0)
    pre2 = np.sqrt(pre0 / pre1)

    dr_score: np.ndarray = np.array(
        [
            20 * np.log10(_peak / _pre2) if _pre2 > 0 else 0.0
            for _peak, _pre2 in zip(block_peak[-NTH_HIGHEST_PEAK], pre2)
        ]
    )

    return DynamicRangeResult(dr_score=dr_score, peak_pressure=peak_pressure, rms_pressure=rms_pressure)

[Edit] Found out that taking the root mean square of the entire signal in one shot should give the same result as partitioning the signal into N equal partitions, calculating the RMS of each partition, and taking the RMS of the N RMS values. However, this is true only if all the partitions are the same size. If the last partition is shorter than 3 seconds, its samples effectively receive a higher weight, because its RMS (calculated from fewer samples) counts as much as that of each full-length partition. This can explain the very small differences between the dB RMS calculated over the entire signal in one shot and the one calculated by partitioning.
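The weighting effect can be seen with a toy signal. This is only an illustration with made-up numbers (two full blocks at amplitude 0.1 and a loud half-length tail); the helper names are hypothetical.

```python
import numpy as np

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def split(x, block_len):
    return [x[i:i + block_len] for i in range(0, len(x), block_len)]

def rms_of_block_rms(x, block_len):
    """Plain RMS of the per-block RMS values; a short last block counts as one block."""
    return rms(np.array([rms(b) for b in split(x, block_len)]))

def weighted_rms_of_block_rms(x, block_len):
    """Same, but each block is weighted by its number of samples."""
    energy = sum(len(b) * rms(b) ** 2 for b in split(x, block_len))
    return np.sqrt(energy / len(x))

# Two full blocks of amplitude 0.1 followed by a loud half-length tail.
x = np.concatenate([np.full(200, 0.1), np.full(50, 1.0)])

one_shot = rms(x)                              # whole signal at once
unweighted = rms_of_block_rms(x, 100)          # tail counts as a full block
weighted = weighted_rms_of_block_rms(x, 100)   # tail counts by its length
print(one_shot, unweighted, weighted)
```

The sample-weighted version reproduces the one-shot RMS, while the unweighted version comes out higher because the short, loud tail is over-represented.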
 
Okay, I tried appending the RMS of the last, smaller block to the RMS values and averaging from there, but the results are even worse (> 1 dB off in most cases).

@NTK Maybe you are on to something with the average of the blocks not being equal to the average of the whole. I could imagine that Maat used some sort of hack to prevent the sum from overflowing a 32-bit integer. Maybe they divided each value by the length first, rounded, and summed at the end, so that the error I am seeing is the result of a lot of rounding. But my gut tells me the rounding errors should cancel out if they are statistically independent.
 
Maybe the plugin uses the last seconds not in the last block as a separate block and overemphasizes them relative to their size?
That's how I did it too.

In the DR14 T.Meter, I found in the "compute_dr14.py" the following code snippet (between the two "..." are lines 96-99):
That's what I linked and quoted in my post :-)

Found out that taking the root mean square of the entire signal in one shot should give the same result as partitioning the signal into N equal partitions, calculate the RMS of each of the N partitions, and take the RMS of the N RMS's.
If the summing happens in float32 (whether on purpose or by accident), then the longer the signal, the higher the chance of running out of float precision and actually getting a different result :-)



In attachment there are two simple signals that show some weirdness in the calculations done by the plugin:
two_different_blocks.png


The first signal is a 3-seconds block at -3 dBFS, followed by a 3-seconds block at -43 dBFS, same in both channels. The second signal is the 3-seconds block at -3 dBFS in one channel and the 3-seconds block at -43 dBFS in the second channel. When you use the plugin to run the DR calculation on a single file, it produces a log containing RMS values for both channels separately. One could expect that the per channel RMS in the first file (and the overall RMS too) will be the same as the overall RMS in the second file. By my calculations it should be -6.0095 dB, so -6.01 after rounding. Well, the overall RMSes according to the plugin are:
Code:
DR         Peak         RMS     Duration Track
--------------------------------------------------------------------------------
DR3       -3.00 dB    -6.02 dB      0:06 ?-two_different_blocks_in_one_channel
DR0       -3.00 dB    -6.01 dB      0:03 ?-two_different_blocks_in_two_channels
--------------------------------------------------------------------------------
The per channel RMS in the first file:
Code:
                 Left              Right

Peak Value:     -3.00 dB   ---     -3.00 dB   
Avg RMS:        -6.02 dB   ---     -6.02 dB   
DR channel:      3.02 dB   ---      3.02 dB
and in the second file, for completeness:
Code:
                 Left              Right

Peak Value:     -3.00 dB   ---    -42.96 dB   
Avg RMS:        -3.00 dB   ---    -43.00 dB   
DR channel:     -0.00 dB   ---      0.04 dB
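For reference, a sketch of the expected combined level under idealized assumptions: exactly -3 dB and -43 dB as the DR-style RMS of the two blocks. The real test files may deviate slightly from these round numbers, so only the rounded result is meaningful here.

```python
import numpy as np

def db_to_lin(db):
    return 10.0 ** (db / 20.0)

def lin_to_db(x):
    return 20.0 * np.log10(x)

# Assumed idealized DR-style RMS levels of the two 3-second blocks.
loud_db, quiet_db = -3.0, -43.0

# File 1: both blocks in each channel, so the per-channel RMS is the RMS of the two levels.
# File 2: one block per channel, so the overall RMS is the RMS across the two channels.
# Both cases reduce to the same expression.
levels = db_to_lin(np.array([loud_db, quiet_db]))
combined_db = lin_to_db(np.sqrt(np.mean(levels ** 2.0)))
print(f"{combined_db:.2f}")
# -6.01
```

So -6.01 dB is what a plain RMS combination predicts in both cases, and the plugin's -6.02 dB for the first file is the odd one out.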
 

That's what I linked and quoted in my post :-)
Yeah! I missed your post :facepalm:
If the summing happens in float32 (whether on purpose or by accident), then the longer the signal the higher the chance of running out of float precision and actually getting a different results :-)
Round-off/underflow errors can be a problem. (2^-12)^2 = 5.96×10⁻⁸, and numpy.finfo(np.float32).eps = 1.192e-07, so with 32-bit floats we get √( 1^2 + (2^-12)^2 ) = 1. And it can be quite a bit worse if we are calculating the RMS of a long series of numbers.

In the DR meter calculation, it is probably much less of a problem as it uses the top 20% peaks.

[Edit]
I am wrong again :facepalm: The contribution of the smaller numbers in a series to the RMS is actually very small (it scales with the square of their ratio to the larger ones). So underflow errors are nowhere near as bad as I first thought. Float32 should still be accurate to the 7th or 8th digit.
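Both points can be checked in a few lines. This sketch only uses the numbers mentioned above (1 and 2^-12); the variable names are arbitrary.

```python
import numpy as np

eps32 = np.finfo(np.float32).eps     # 2**-23, about 1.192e-07
small = np.float32(2.0 ** -12)
small_sq = small * small             # 2**-24, about 5.96e-08, still float32

one32 = np.float32(1.0)
absorbed_f32 = (one32 + small_sq) == one32                      # lost in float32
absorbed_f64 = (np.float64(1.0) + np.float64(small_sq)) == 1.0  # kept in float64

# Effect on the RMS of the pair (1, 2**-12), computed in float64, in dB.
exact_db = 20.0 * np.log10(np.sqrt((1.0 + 2.0 ** -24) / 2.0))
dropped_db = 20.0 * np.log10(np.sqrt(1.0 / 2.0))
print(absorbed_f32, absorbed_f64, exact_db - dropped_db)
```

The small term is indeed swallowed in float32, but dropping it changes the result by well under a millionth of a dB, which is far below the two decimals the DR log reports.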
1771478924714.png
 