xr100 posted:

"Here are a few biological facts:
No human can hear outside 20 Hz to 20 kHz frequency response.
No human can hear harmonic distortion below 1% for music or below 0.5% for a pure sine wave.
No human can tell the difference between music played using 120 dB dynamic range and the same music played using 70 dB+ dynamic range.
No human can hear background noise of music played with an SNR of 70 dB and above.
IOW, any audio component that meets the above minimum scores will sound just as good as any other component that scores higher measurement results...
A long video, but I've linked to the relevant time location, and the point is covered in a couple of minutes."

The reply, point by point:

"No human can hear outside 20 Hz to 20 kHz frequency response."
1) Sound below 50 Hz or so is mostly body sensation, not cochlear sensation.

"No human can hear harmonic distortion below 1% for music or below 0.5% for a pure sine wave."
That depends very much on both the frequency of the sine wave and the order of the distortion. For 2nd-order distortion and low/mid frequencies, your statement is mostly true.

"No human can tell the difference between music played using 120 dB dynamic range and the same music played using 70 dB+ dynamic range."
If you mean "local SNR" in time, probably true. You can certainly tell the difference in noise floor in our listening rooms. Yes, they are custom built.

"No human can hear background noise of music played with an SNR of 70 dB and above."
I'm not sure what you meant there.
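As an aside on the 1% harmonic-distortion figure: THD on a test tone is straightforward to measure. The sketch below (my own illustration, not from this thread) reads harmonic amplitudes off an FFT of an integer number of cycles, so no windowing is needed; the function name and parameters are hypothetical.

```python
import numpy as np

def thd(x, fs, f0, n_harmonics=5):
    # Ratio of the RMS of harmonics 2..n to the fundamental amplitude,
    # read directly off FFT bins (valid for an integer number of cycles).
    N = len(x)
    spec = np.abs(np.fft.rfft(x)) / N
    def bin_amp(f):
        return spec[int(round(f * N / fs))]
    fund = bin_amp(f0)
    harm = [bin_amp(k * f0) for k in range(2, n_harmonics + 1)]
    return np.sqrt(sum(h * h for h in harm)) / fund

fs = 48000
t = np.arange(fs) / fs              # exactly 1 s -> 1000 whole cycles of 1 kHz
x = np.sin(2 * np.pi * 1000 * t) + 0.01 * np.sin(2 * np.pi * 2000 * t)
# thd(x, fs, 1000) is about 0.01, i.e. 1% (-40 dB)
```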
"sensation"
I wonder if you could clarify your use of the word "sensation"? In psychology, as I understand it, there is a distinction between "perception" (which follows processing in our heads) and "sensation" (the "raw data" sent by our senses for further processing). I appreciate that a strict split between the two isn't necessarily as clear-cut as was once believed.
I use the analytic signal, i.e. abs(ifft(positive spectrum)). You can weight the spectrum appropriately for a given ERB to get a really good idea of what's up in that ERB.
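A minimal numpy sketch of that analytic-signal envelope (the ERB weighting step mentioned above is omitted; the function name is my own):

```python
import numpy as np

def analytic_envelope(x):
    # Analytic signal via the FFT: double the positive frequencies,
    # zero the negative ones, inverse FFT, then take the magnitude.
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0          # Nyquist bin kept once
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(X * h))
```

For a pure sine on an exact FFT bin the envelope comes out flat at the sine's amplitude, which is a quick sanity check.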
Thanks, that's what I thought. About 38 years ago, with a little help from our HP apps guy, I was able to make one of their 20 Hz to 20 MHz spectrum analyzers do Dick Heyser's TDS on speakers. He gave me an Easter egg hidden in the firmware that turned off the forced autocal, which was triggered by what I was doing.
The lowest frequency that can be identified as a 'tone' is in the 10 Hz range. Modern music uses bass notes down to 16 Hz (C0, also the lowest tone on big pipe organs) without overtones/harmonics.
"I find it troubling that the '4-bit dithered' and '4-bit truncated' are so close together and that the '4-bit dithered' gets a worse score. I wonder if you have listened to them?"
I listened to only a few samples from your set, including "4-bit dithered" and "4-bit truncated". The distortion is pretty heavy in both cases, but with this particular sample the "4-bit truncated" version doesn't sound bad at all (even cool), and I'm sure many people will find it closer to the original than the dithered one. Did anybody else listen to them?
Dithering usually changes the artifact signature of the bit-truncation operation to the extent that using the df parameter for judging audio quality becomes questionable. An illustration is at the end of this post: https://www.audiosciencereview.com/...od-for-measuring-distortion.10282/post-290829. ADCs/DACs usually have low df levels, where the particular structure of the degradation becomes less important and df levels become more indicative. I will address the issue a bit later.

"If an objective is to characterise the performance of DACs, then how is this helpful, considering that, for example, noise shaping is fundamental to the operation of delta-sigma DACs?"
As the artifact signature is not defined for stationary signals in the df-metric, df levels in that case just measure the degradation of the initial waveform. Audibility of the degradation cannot be assessed, sorry.
Instead, you can apply your particular processing to real music, and then the df-metric will give some results regarding its audibility. Such results can differ across music samples.
In the df-metric there is no need to measure audibility of distortion with technical/stationary signals. Even if one could develop a reliable metric with such signals, it would be extremely difficult to generalize those findings to real music. Testing with t-signals is helpful mostly for designers of audio circuits/algorithms because it helps find issues that can be improved/fixed.
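For orientation only, here is a heavily simplified sketch of a df-style level: the RMS of the difference between the processed and original waveforms, in dB relative to the original's RMS. The actual df-metric also performs alignment and per-frame measurement (the per-frame df vector is what the artifact signature is built from); treat the function name and details below as my assumptions, not the real implementation.

```python
import numpy as np

def df_level(original, processed):
    # Simplified: 20*log10( RMS(difference) / RMS(original) ).
    # More negative = smaller difference = less degradation.
    original = np.asarray(original, dtype=float)
    processed = np.asarray(processed, dtype=float)
    diff_rms = np.sqrt(np.mean((processed - original) ** 2))
    ref_rms = np.sqrt(np.mean(original ** 2))
    return 20 * np.log10(diff_rms / ref_rms)

fs = 48000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 440 * t)
q8 = np.round(x * 128) / 128     # ~8-bit requantization
q4 = np.round(x * 8) / 8         # ~4-bit requantization
# the 4-bit version is degraded more, so its level is higher (closer to 0 dB)
```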
desc:Bit Reduction and Dither with Psychoacoustic Noise Shaping and Truncation Option
//tags: utility processing dither
//author: Cockos (EXCEPT added sections)
slider1:16<1,32,1>Output Bit Depth
slider2:1<0,3,1{Off,On,Disabled (Sample Rate Not Supported),Disabled (Overload)}>Psychoacoustic Noise Shaping
slider3:1<0,2,1{Off (Truncate),On (TPDF),On (Highpass TPDF)}>Dither
slider4:2<1,2,0.1>Dither Bit Width (LSB)
// Add slider
slider5:1<0,2,1{Negative,Positive,Zero}>Round
in_pin:left input
in_pin:right input
out_pin:left output
out_pin:right output
@init
N = 0;
coeffs = 0;
// All coefficients John Schwartz 2007.
// http://www.gnu.org/licenses/gpl.html
(srate == 44100) ? (
  N = 9;
  coeffs[0] = 2.372839;
  coeffs[1] = -3.132662;
  coeffs[2] = 3.203963;
  coeffs[3] = -2.853749;
  coeffs[4] = 1.971429;
  coeffs[5] = -1.013035;
  coeffs[6] = 0.369805;
  coeffs[7] = -0.091063;
  coeffs[8] = 0.013578;
  // Wannamaker 1992 coefficients for reference.
  //
  // coeffs[0] = 2.412;
  // coeffs[1] = -3.370;
  // coeffs[2] = 3.937;
  // coeffs[3] = -4.174;
  // coeffs[4] = 3.353;
  // coeffs[5] = -2.205;
  // coeffs[6] = 1.281;
  // coeffs[7] = -0.569;
  // coeffs[8] = 0.0847;
) :
(srate == 48000) ? (
  N = 9;
  coeffs[0] = 2.077677;
  coeffs[1] = -2.721001;
  coeffs[2] = 2.602012;
  coeffs[3] = -2.157415;
  coeffs[4] = 1.398085;
  coeffs[5] = -0.84755;
  coeffs[6] = 0.373337;
  coeffs[7] = -0.161701;
  coeffs[8] = 0.003758;
) :
(srate == 88200) ? (
  N = 10;
  coeffs[0] = -0.037508;
  coeffs[1] = 2.14333;
  coeffs[2] = -0.089328;
  coeffs[3] = -2.38445;
  coeffs[4] = 0.376261;
  coeffs[5] = 1.940341;
  coeffs[6] = -0.463485;
  coeffs[7] = -1.241821;
  coeffs[8] = 0.157735;
  coeffs[9] = 0.412081;
) :
(srate == 96000) ? (
  N = 10;
  coeffs[0] = -0.26496;
  coeffs[1] = 1.847721;
  coeffs[2] = 0.692557;
  coeffs[3] = -1.949565;
  coeffs[4] = -0.691542;
  coeffs[5] = 1.415063;
  coeffs[6] = 0.463352;
  coeffs[7] = -0.792872;
  coeffs[8] = -0.229575;
  coeffs[9] = 0.156314;
);
(N == 0) ? (
  psycho = 0;
  slider2 = 2;
  sliderchange(slider2);
);
errBufL = coeffs + N;
errBufR = errBufL + N;
@slider
resolution = 2 ^ (slider1 - 1);
psycho = (slider2 == 1);
tpdf = (slider3 == 1);
hiTPDF = (slider3 == 2);
ditherX = slider4 / 2.0;
// For added slider
trneg = (slider5 == 0);
trpos = (slider5 == 1);
trzero = (slider5 == 2);
// End
memset(errBufL, 0, N);
memset(errBufR, 0, N);
p = 0;
zL = zR = 0.5;
rndL = rndR = 0.0;
@sample
sL = spl0;
sR = spl1;
(psycho) ? (
  // Error feedback: subtract the shaped past quantization errors
  // before requantizing.
  i = 0;
  q = p;
  loop(N,
    sL -= coeffs[i] * errBufL[q];
    sR -= coeffs[i] * errBufR[q];
    i += 1;
    q += 1;
    (q == N) ? (q = 0); // % is expensive.
  );
);
(tpdf) ? (
  // TPDF dither: sum of two uniform values, centered on the
  // 0.5 rounding offset.
  zL = 0.5 + (rand(1) + rand(1) - 1.0) * ditherX;
  zR = 0.5 + (rand(1) + rand(1) - 1.0) * ditherX;
) :
(hiTPDF) ? (
  // Highpass TPDF: difference of successive uniform values.
  zL = 0.5 + rndL;
  zR = 0.5 + rndR;
  rndL = rand(1) * ditherX;
  rndR = rand(1) * ditherX;
  zL -= rndL;
  zR -= rndR;
);
// Implement negative/positive/zero rounding option.
(trneg) ? (
  spl0 = floor(sL * resolution + zL) / resolution;
  spl1 = floor(sR * resolution + zR) / resolution;
);
(trpos) ? (
  spl0 = ceil(sL * resolution + zL) / resolution;
  spl1 = ceil(sR * resolution + zR) / resolution;
);
(trzero) ? (
  // Truncate toward zero: round magnitudes down, keep the sign.
  snL = sign(spl0);
  snR = sign(spl1);
  spl0 = ((floor(abs(sL * resolution + zL))) * snL) / resolution;
  spl1 = ((floor(abs(sR * resolution + zR))) * snR) / resolution;
);
// End
spl0 = max(-1.0, min(spl0, 1.0));
spl1 = max(-1.0, min(spl1, 1.0));
(psycho) ? (
  (p == 0) ? (p = N - 1) : (p -= 1); // % is expensive.
  errBufL[p] = spl0 - sL;
  errBufR[p] = spl1 - sR;
  // On overload, disable shaping and report it via slider2.
  (abs(spl0) == 1 || abs(spl1) == 1) ? (
    psycho = 0;
    slider2 = 3;
    sliderchange(slider2);
  );
);
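The truncation-versus-dither question the thread keeps returning to can be reproduced outside the plugin. The sketch below is my own illustration, reusing the JSFX's 2^(bits-1) scaling and its default +/-1 LSB TPDF dither: truncating a constant input produces a fixed bias (a signal-correlated error), while TPDF dither plus rounding is correct on average, at the cost of added noise.

```python
import numpy as np

rng = np.random.default_rng(0)
bits = 4
res = 2 ** (bits - 1)            # same scaling as the JSFX above

def truncate(x):
    # plain floor quantization, no dither
    return np.floor(x * res) / res

def tpdf_round(x):
    # +/-1 LSB TPDF dither (sum of two uniforms), then rounding
    d = rng.random(x.shape) + rng.random(x.shape) - 1.0
    return np.floor(x * res + 0.5 + d) / res

x = np.full(100000, 0.3)         # a constant input exposes the bias
# truncate(x) sticks at 0.25, the quantization level below 0.3,
# while tpdf_round(x) averages out to 0.3
```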
"Not sure what you're saying here."
Probably you wanted to demonstrate the case where lower df levels result in more annoying degradation of the original. This can easily happen, as no objective parameter (especially a simple one like the df level), nor any set of them, can correlate reliably with perceived audio quality, due to the very specific character of human hearing (quaintly crafted by evolution). So, in addition to a measured quantity parameter, there should be some method for subjectively assessing the audibility of a particular type of degradation (a qualitative estimation). That's what artifact signatures are for.

The idea is simple: the audibility of similar distortions can be assessed by their amount. In the df-metric, the type of a degradation is characterized by an artifact signature of that degradation. The artifact signature is a vector of df levels measured with some music signal (the longer and more varied, the better); it shows how the signal is degraded over time. Such df vectors can be compared with one another, and their similarity can be measured in many ways. I use a simple Mean-Average-Error-based distance metric (other metrics give similar results). The resulting distance between artifact signatures tells us whether we can use df levels for assessing the audibility of the degradation; a distance of less than 1.5-2.0 dB is safe for such inferences.

This simple method assumes a real music signal, because only then does the vector of df levels consist of different values, which characterize the type of degradation. In the case of a stationary signal all df levels are equal and the artifact signature is undefined. So the df-metric cannot assess the audibility of distortions with stationary/technical signals. As we can use real signals for the estimation, there is no need to do this with t-signals, the more so as the relation between the audibility of a degradation with t-signals and the audibility of that degradation with m-signals is very unclear.
Of course one can define an error spectrum for a stationary signal. Not sure what you're saying here.
Did you listen (at low level) to the original and distorted signals?
Now, correct me if I'm wrong, but a smaller number is better (more negative), yes?
Did you listen to the d2a and d3a signals?
"The idea is simple - audibility of similar distortions can be assessed by their amount."
By "distortion," are you including the TPDF noise that was added in dithering?
Yes, from a listener's perspective both files are degraded/distorted versions of the original. One can be preferred over another depending on personal audio taste.
"The highly audible quantization effects in the truncated file can be considered a 'cool' effect in a 'bit-crushed' '80s way, and those of a certain age may find them quite reminiscent of, for instance, certain sound chips found in early home computers and games consoles."
And your music sample is exactly from that era ))
"How is their 'similarity' defined/known?"
By measuring the distance between vectors of df values (artifact signatures). The vectors are de-trended (the df offset is removed) and the mean average error is calculated.
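That distance, as described, can be sketched in a few lines: remove each vector's mean df level (the offset), then take the mean absolute error between them. A sketch of the idea only; the exact df-metric implementation may differ, and the function name is my own.

```python
import numpy as np

def signature_distance(df_a, df_b):
    # Distance between two artifact signatures: vectors of per-frame
    # df levels in dB. De-trend each by removing its mean (the df
    # offset), then take the mean absolute error.
    a = np.asarray(df_a, dtype=float)
    b = np.asarray(df_b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return np.mean(np.abs(a - b))
```

Two signatures that differ only by a constant df offset thus have distance 0: they represent the same type of degradation at different amounts.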
"Not quite sure what you mean?"
Usually dithering and truncation have different artifact signatures and cannot be compared by df levels. But the smaller the degradation of a signal (the lower the df levels), the more similar the artifact signatures of various types of bit reduction become (the type of dithering matters less for 32->24 than for 32->16/8/4). In other words, only the audibility of "small" bit reductions can be safely assessed by df levels.