Here we go -- an explanation of the signal and proof of dynamics processing -- unlike your misguided (perhaps incompetent) claim about it being just EQ.
I truly understand that you don't understand the ramifications of what I write. The fact is that your temporal information was far from complete -- what kind of nonsense were you TRYING to say?
It is sad when incompetents do so much pontificating.
Evidence (I know that the descriptions are obtuse, but a somewhat competent person who understands the temporal effects of an expander -- even though expansion is not its main purpose -- can see that there is REAL expansion going on).
'KSTR': Wanna apologize now -- or soon?
WAVEFORM & SPECTROGRAM
Let's look at the waveform.
*
Look at the light blue in the middle. That shows an average or RMS-type level. Notice that the RAW version has a larger RMS for the same peak. The decoded version has a lower RMS for the same peak level, therefore the DRC most likely expands the signal. Because of the way that FA works, it normally adds only a subtle improvement in dynamics.
*
Look at the peak levels vs. the dark blue -- the dark blue is less dense on the decoded version. That shows dynamics expansion.
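To make the peak-vs-RMS argument concrete, here is a tiny numeric sketch (NOT the FA/DolbyA law -- just a generic toy downward expansion): for the same peak level, the expanded version shows a lower RMS and a higher crest factor, which is exactly the signature described above.

```python
import numpy as np

def peak_rms_crest_db(x):
    """Peak level, RMS level, and crest factor, all in dB."""
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(np.square(x)))
    return 20 * np.log10(peak), 20 * np.log10(rms), 20 * np.log10(peak / rms)

rng = np.random.default_rng(0)
raw = rng.normal(0.0, 0.2, 48000)
raw /= np.max(np.abs(raw))                  # peak-normalize to 0 dBFS
# toy downward expansion: exponent > 1 pushes low levels further down
expanded = np.sign(raw) * np.abs(raw) ** 1.3
expanded /= np.max(np.abs(expanded))        # same 0 dBFS peak as raw

for name, x in (("raw", raw), ("expanded", expanded)):
    pk, rms, crest = peak_rms_crest_db(x)
    print(f"{name:9s} peak {pk:6.2f} dB  RMS {rms:6.2f} dB  crest {crest:5.2f} dB")
```

Same peak, lower RMS, higher crest factor -- the same pattern the sox stats further below show for the decoded file.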
Let's look at the spectrogram.
* The decoded version's 'maximum darkness/density' is approx the same level as the undecoded version's on the loudest sections of the signal, but the average density is a little less. This change in AVERAGE DENSITY is where you get SOME of the AVERAGE spectral difference.
AVERAGE SPECTRAL DIFFERENCE IS ALMOST MEANINGLESS -- but admittedly not totally. You cannot use it to judge sound quality unless there is a profound spectral loss of highs or lows. Expect a difference at the frequencies that show significant expansion. AVERAGE SPECTRAL MEASUREMENTS APPEAR TO SOMETIMES BE MISINTERPRETED.
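A toy demonstration of why average spectral measurements say little about dynamics: the two signals below have matched RMS and essentially the same average (Welch) spectrum, yet very different crest factors. All names and values here are illustrative only, not derived from the actual recordings.

```python
import numpy as np
from scipy.signal import welch

fs = 48000
rng = np.random.default_rng(1)
noise = rng.normal(0.0, 0.1, 2 * fs)

steady = noise.copy()
gate = ((np.arange(noise.size) // 8000) % 2).astype(float)   # on/off blocks
bursty = noise * 2.0 * gate
bursty *= np.sqrt(np.mean(steady**2) / np.mean(bursty**2))   # match RMS

f, p_steady = welch(steady, fs=fs)
f, p_bursty = welch(bursty, fs=fs)

crest = lambda x: np.max(np.abs(x)) / np.sqrt(np.mean(x**2))
print(f"avg PSD ratio: {np.mean(p_bursty) / np.mean(p_steady):.3f}")
print(f"crest steady {crest(steady):.2f}  crest bursty {crest(bursty):.2f}")
```

The averaged spectra come out nearly identical while the dynamics differ grossly, which is why an average spectral difference cannot be read as 'just EQ'.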
*
Especially in the highest frequency areas (and throughout the signal), where the signal is at approx/almost the same level,
the blank or light blue (noise) signal level is even more white (by about 10dB) than on the undecoded version. This shows noise reduction. It is very significant to get that much NR in the 1k to 3k range.
*
In some cases, the denser levels are a little less dense in some areas -- this is because the density was also relatively lower on the original, therefore you see a little, but not much, expansion.
But notice that the voids are significantly more white on the decoded version. These 'more white' sections with approx the same peaks show noise reduction AND expansion. When levels are closer together, there is LESS NR and LESS expansion, so you cannot expect a lot of processing in those regions.
*
In between the loud high-frequency transients, there seems to be more of an almost-white or white 'gouge' out of the signal. It almost appears to come from the higher freqs, as if the 'gouge' is coming from on high. That 'gouge' shows substantive and often very audible noise reduction.
*
The general variation in brightness (where there IS variation) is greater on the decoded version, yet the darkest, strongest areas are at pretty much the same levels. Sometimes, at certain freqs, the darkest areas are weaker, but the levels are generally less dark on the RAW versions also. Basically, that difference in darkness simply shows expansion. There really isn't much substantive NR where the 'white' is very temporally short, but it DOES show the opportunity for NR.
RAW HW SIGNAL CHARACTERISTICS
RAW FA INPUT (High res):
[jdyson@i10900X DaveBrubeck]$ sox /music/*Bru*/01*Turk.flac -n gain -n stats
Overall Left Right
DC offset -0.000280 -0.000164 -0.000280
Min level -0.915195 -0.915195 -0.742085
Max level 1.000000 1.000000 0.762011
Pk lev dB -0.00 -0.00 -2.36
RMS lev dB -22.73 -22.92 -22.55
RMS Pk dB -10.41 -10.41 -11.36
RMS Tr dB -139.72 -139.70 -139.72
Crest factor - 13.99 10.22
Flat factor 0.00 0.00 0.00
Pk count 2 2 2
Bit-depth 32/32 32/32 32/32
Num samples 72.2M
Length s 409.359
Scale max 1.000000
Window s 0.050
DECODED FA (High res):
[jdyson@i10900X DaveBrubeck]$ sox 01*Turk*DEC.wav -n gain -n stats
Overall Left Right
DC offset -0.000000 -0.000000 -0.000000
Min level -0.846001 -0.764966 -0.846001
Max level 1.000000 1.000000 0.685800
Pk lev dB -0.00 -0.00 -1.45
RMS lev dB -25.71 -25.75 -25.68
RMS Pk dB -11.26 -11.26 -13.07
RMS Tr dB -inf -inf -inf
Crest factor - 19.38 16.28
Flat factor 0.00 0.00 0.00
Pk count 2 2 2
Bit-depth 32/32 32/32 32/32
Num samples 72.2M
Length s 409.358
Scale max 1.000000
Window s 0.050
[jdyson@i10900X DaveBrubeck]$
Peak to RMS of 25dB is a little on the high side, and pretty good. Numbers at 30dB or above can sometimes be irritating. The raw FA peak-to-RMS is also pretty good at 23dB, but the difference between the two can show some expansion (the actual expansion is more than the numbers imply.) The crest factor gives an idea of the peak dynamics. I have some recordings with a crest factor >25, and they often have overly strong dynamics. Likewise, some recent 'loudness wars' material has a peak-to-RMS of perhaps 13dB (YUCK) and a crest factor of 4 or 5 (double YUCK.)
Again, both show pretty good numbers, but the reason for showing them is the implication (but not proof) of expansion.
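For reference, the implied overall expansion can be read straight off the two sox reports above:

```python
# Figures copied from the sox stats above (overall column, dB).
raw_peak, raw_rms = -0.00, -22.73        # RAW FA input
dec_peak, dec_rms = -0.00, -25.71        # decoded output

raw_peak_to_rms = raw_peak - raw_rms     # ~22.7 dB
dec_peak_to_rms = dec_peak - dec_rms     # ~25.7 dB
# difference in peak-to-RMS implies roughly 3 dB of overall expansion
print(f"implied overall expansion ~ {dec_peak_to_rms - raw_peak_to_rms:.2f} dB")
```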
Any one of the items above can sometimes be explained away, but all of the evidence above *together* shows:
* Dynamics processing
* Noise reduction.
The goal of FA appears to be moderately stealthy/non-obvious distortion, and the goal of the decoder is to avoid showing expansion artifacts while recovering as much of the original signal as remains. It appears that FA sometimes also adds about 1-6dB more loudness.
With even the slightest error in the decoder's internal EQ (SMALL errors), the decoder output can:
* generally blast your hearing,
* blast your hearing with bass,
* blast your hearing with high frequencies,
* produce extreme bass, or too little bass.
The above shows a hand-in-glove behavior and great likelihood of correctness.
A recent bass problem was related to the choice between the two valid EQ methods. One candidate choice is EQ between each layer; the other possible choice is EQ 'at the end'.
The HF EQ is 'at the end', but the MF EQ is between each layer. Originally, I made the mistake of trying to do the LF EQ 'at the end'. THAT became a tweakfest and was very unpleasant. Finally, I decided that MF wasn't special, and tried mapping the 'tweakfest' into EQ between the layers.
Voila -- the EQ between the layers was TWO 1st-order EQs. Almost trivial -- why didn't I think of that at the beginning?
* I thought that MF was special, so I used the HF technique.
* Mental lock-up, with blinders on.
* I kept pounding on it with no local feedback telling me that I should rethink the approach.
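For readers curious what 'TWO 1st-order EQs' between layers could look like, here is a minimal sketch. It assumes a first-order low-shelf topology and the 500Hz/75Hz, -3dB values mentioned later in this post; the actual filter direction, topology, and placement in the real decoder are my assumptions, not known specifics.

```python
import numpy as np
from scipy.signal import lfilter

def low_shelf_1st(f0_hz, gain_db, fs):
    """First-order low shelf via bilinear transform:
    gain_db below f0_hz, unity gain at high frequencies."""
    g = 10.0 ** (gain_db / 20.0)
    w = 2.0 * fs * np.tan(np.pi * f0_hz / fs)   # prewarped corner (rad/s)
    k = 2.0 * fs
    b = np.array([k + g * w, g * w - k])
    a = np.array([k + w, w - k])
    return b / a[0], a / a[0]

fs = 96000
# hypothetical inter-layer EQ: two cascaded 1st-order shelves
stages = [low_shelf_1st(500.0, -3.0, fs), low_shelf_1st(75.0, -3.0, fs)]

def interlayer_eq(x):
    """Apply both shelves in cascade (illustrative only)."""
    for b, a in stages:
        x = lfilter(b, a, x)
    return x
```

The point is the simplicity: each stage is a single pole/zero pair, which is why 'almost trivial' is a fair description once the between-layers placement is chosen.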
Why was DolbyA chosen for the dynamics processor?
Lots of testing.
* I am expert at writing dynamics processors, and I noted that the dynamics were close to those of DolbyA.
* I thought that the DolbyA observation was crazy, and it took me at least 1.5yrs after the experiment to accept that DolbyA was really used.
What about this layering thing?
Strange results when using one layer, and inadequate expansion.
I noticed a periodic optimal calibration level at 10dB intervals. It seemed like the compression was done in chunks. It took about 2-3yrs to realize that multiple DolbyA units were used. This was also during the time that I was busy developing the best-ever decoder for DolbyA recordings.
Why so much trouble with layering?
Imagine the layering of details. There could be EQ between layers -- or not. There could be signal-level changes, or not; calibration differences -- or not. The decoder needs to match the recording's calibration, but that issue couldn't be solidified until the EQ was correct and the levels between layers were correct. The EQ was very hit-or-miss, but at least I was guided by what was required on input/output, the characteristics of DolbyA, and also my listening to the signal. My 'rules' included using STANDARD EQ and STANDARD gain.
Why is the EQ needed?
* DolbyA units are NOT flat (ask anyone who has heard their raw output), so to convert them into a multi-band expander complex, careful EQ is needed. For decoding, even more careful EQ is needed than for encoding; encoding has more freedoms than decoding. I don't think that FA was ever intended to be decoded, so decoding was a real pain in the butt, especially without specs in hand.
* ALL FA EQ USES EVEN FREQUENCIES AND EVEN GAINS (e.g. 500Hz, 250Hz, 200Hz, 150Hz, 100Hz, 75Hz, 50Hz, 25Hz -- also 1kHz, 3kHz, 9kHz, 18kHz, 21kHz, 24kHz.) I found that the lower frequencies from the original, aborted LF EQ method morphed into 500Hz and 75Hz with a standard -3dB gain.
* The gains are ALL +-1.5dB, +-3dB, +-6dB, +9dB (I don't think that the gains even go as high as 9dB -- gotta refer to the code.)
* All FA EQ values are even numbers -- or the results suck REALLY BADLY. It's not just a matter of taste: a wrong value might hurt your hearing or speakers (or you might not hear much at all.) The slightest change to the 500Hz or 75Hz EQ (either gain or frequency) makes profound differences.
DA is different, using fractional numbers because of semiconductor component selection and the use of exponential coefficients related to the characteristics of the selected diodes/JFETs. The basic attack/release values are pretty much even numbers. There was a real trick to setting up the attack/release control as related to the state of the diodes and JFETs.
How much NR to expect? (definite secondary purpose)
On older recordings that have strong audible hiss, maybe between 10-20dB.
* I consider the Brubeck recordings to be on the lesser end of strong hiss, but the 'Nat King Cole Story' has more hiss, similar to most of the Simon & Garfunkel, the oldest Carpenters albums, Tijuana Brass, etc. In the case of strong audible hiss, I'd expect 10dB of 'gouged out' NR, plus about 10-12dB of almost-NR at the very highest frequencies.
* For in-between recordings with moderate/low audible hiss, expect perhaps 15-20dB of NR. Of course, on these recordings NR is less needed, since there is less noise to begin with. Compander NR tends to be more effective when it isn't needed as much.
* On newer, pseudo-digital or recent analog tape recordings, perhaps as much as 30dB of NR. Of course, 30dB of NR isn't needed, and it can sometimes hit the 16-bit limit. The reason why the recordings are only pseudo-digital: the FA encoding process IS analog.
Where is the 'expansion'? I don't hear it.
Expansion happens at -20dB and below. You will NOT always perceive it much as expansion, because most of the peaks are intact, or even louder relative to the lower levels, as short transients.
The low-level expansion acts more as 'reorganization', but it also provides low-level expansion and noise reduction. The FA decoder does not sound like a normal expander in the rack.
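As a hedged illustration of 'expansion at -20dB and below', here is a generic static downward-expander law. The -20dB threshold comes from the text above; the 1.2:1 ratio and the function itself are my own illustrative choices, not the decoder's actual curve.

```python
import numpy as np

def expander_extra_gain_db(level_db, threshold_db=-20.0, ratio=1.2):
    """Static downward-expander law: unity gain above threshold;
    below it, each input dB maps to `ratio` output dB, i.e. extra
    attenuation grows as the level drops (ratio is illustrative)."""
    below = np.minimum(level_db - threshold_db, 0.0)
    return below * (ratio - 1.0)

# peaks at/above -20 dB are untouched; -40 dB material is pushed ~4 dB down,
# which reads as noise reduction rather than audible 'expander pumping'
for lvl in (-10.0, -20.0, -30.0, -40.0):
    print(f"input {lvl:6.1f} dB -> extra gain {expander_extra_gain_db(lvl):5.1f} dB")
```

This shape is consistent with the description: peaks stay intact while low-level content (including hiss) is pulled further down.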
THE NR IS OF SECONDARY IMPORTANCE, BECAUSE NR WOULDN'T BE NEEDED WITHOUT FA ENCODING.
Why don't *some* people like the decoder?
* It shows that their beloved archives have been damaged all along and really DO have 'the digital sound' complained about in the 1980s. Most of the 'home' and 'snake oil' remedies are just that. This could be an ego thing about being cheated, or it might simply mean that they are happy anyway. I intend to keep it that way -- I don't want to be a downer for the innocent. Never try to 'educate' the innocent beyond a mention.
* There IS accommodation to the distorted stereo image, distorted dynamics, even the hiss. There is NO reason to convince people with accommodated hearing, because they are happy.
This accommodation to the distortions on the recordings is little different from accommodation in smell, taste, and vision.
The audiophile hobby is about happiness and enjoyment. My interest is not in hurting people's enjoyment,
but technical types should really have open minds about the improvements.
Why do the distributors distort the recordings?
I don't really know, but I'd suspect that it has something to do with IP, considering the active pushback that I am getting. Doesn't it make sense that they don't want to give away the family jewels to just anyone?
How much does the decoder cost?
Free forever; source is coming -- I just started cleaning and better organizing the source code, removing old reverse-engineering tests and guesses. I will probably remove the compressor, the limiter, and some of the obtuse user-controlled post- and pre-decoding EQ. The compressor design should be further developed, as should the limiter. Their quick, 10-minute development was an interesting experiment because I used the DolbyA attack/release scheme.
Why is the decoder slow?
There are a couple of reasons.
* Even the lowest-quality modes use very accurate Hilbert detectors for HF0 and HF1 (the 3kHz/9kHz bands.) Hilbert detectors don't work at LF and MF for DolbyA because the gain-control waveform must be compliant with a real DolbyA in order to remove the innate distortion. That distortion is created by the mix of fast attack/release and the low frequencies, and it must be undone in hand-in-glove fashion. The reason for using Hilbert detectors is that the DolbyA detector design is vulnerable to a form of gain-control modulation that creates an effect similar to IMD. This pseudo-IMD is quite different from normal gain-control ripple, which is depended upon for accurate LF and MF decoding.
* Everything is done in high precision, with several important transcendental operations and careful filtering of the control signal to keep it from moving too fast. This careful slowdown avoids creating some of the DolbyA fog. Even though the control signal is filtered, it is fast enough to pass the fastest transients.
* The steps are queued between threads so that the work can take better advantage of a multi-core CPU.
* There are not only 8 steps per layer, but also a standard 7 layers for FA decoding. This means 7 DolbyA-compatible decoders, with approx 10 threads per layer. That is a LOT of CPU load.
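To make the Hilbert-detector point concrete, here is a minimal sketch (not the decoder's code; `hilbert_envelope` is my own illustrative name) showing why an analytic-signal detector is attractive for the HF bands: on a steady tone its envelope is flat, with none of the 2f ripple a rectify-and-smooth detector produces.

```python
import numpy as np
from scipy.signal import hilbert

def hilbert_envelope(x):
    """Envelope via the analytic signal: |x + j*H{x}|.
    A rectify-and-smooth detector would leave ripple at twice the
    signal frequency; the analytic-signal magnitude does not."""
    return np.abs(hilbert(x))

fs = 48000
t = np.arange(fs) / fs
tone = 0.5 * np.sin(2 * np.pi * 3000 * t)   # steady tone in the 3 kHz band
env = hilbert_envelope(tone)
print(f"envelope ripple: {np.max(env) - np.min(env):.2e}")
```

The trade-off named above still applies: this accuracy costs a full transform per band, which is part of why the decoder is slow.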
Why, specifically, are the higher-quality modes very, very slow?
There are several reasons:
* All of the reasons above.
* The anti-MD trimming of sidebands. Basically, it moves the control-signal slew around to make the fog less apparent. The gain-control-signal*gain calculation is buried intimately in the anti-MD code, so there is NO separation between anti-MD and gain control. This allows intimate reshaping of the gain control. It is not an obvious hack, but a subtle way of bending the gain control to move modulation sidebands to lower-amplitude portions of the audio signal without temporal problems. It uses some nonlinear math on the full analytic signal.
* Lots of lower-sideband filtering. This helps create a smooth-sounding result with no negative side effects other than slow operation. It mitigates a form of IMD, which mostly manifests as an ugly or foggy sound.
* The bands are split so that distortion products mix less during gain-control slew. This also helps create a smoother sound (fewer distortion products due to nonlinear mixing.)
* The higher-quality mode calculations (very accurate Hilbert transforms and LF side filtering) are SO VERY SLOW that they run in 8 separate threads -- except for the highest-quality modes, reserved for decoding DolbyA materials, which run in 16 threads.
Spectrogram:
https://www.dropbox.com/s/xzbcj6uiri6073w/Screenshot from 2021-04-09 09-05-33.png?dl=0
Waveforms:
https://www.dropbox.com/s/or0nb665k52ff8f/Screenshot from 2021-04-09 09-00-03.png?dl=0