
Good status of the recording correction (FA) project


John Dyson (Active Member):
Rather good progress has been made. My suggestion: listen to the snippets before reading further -- then read. A cleaner, more natural result appears after processing by the decoder. Do not expect the fake, distorted, calliope-type bass of many consumer recordings -- that fake calliope sound isn't normal bass, but compressed and distorted bass.
Also, the highs might sometimes appear to be over-emphasized, which they JUST MIGHT BE. However, the highs should sound very coherent, and not smeared into a fog or fuzz as on many consumer recordings. Some EQ might still be needed, but the only recent EQ request was for a little more lower midrange, and that has been done. (Frustratingly, since there are no 'tweaks' per se, it is necessary to find an entire new EQ sequence -- but once the correct one is found, EVERYTHING works.)

https://www.dropbox.com/sh/tepjnd01xawzscv/AAB08KiAo8IRtYiUXSHRwLMla?dl=0

Not all snippets are perfect comparisons (RAW vs V2.2.2B), but they are NOT cherry-picked. Some show profound
improvement, some show minor improvement -- and for those who are used to the calliope-like bass of consumer recordings: where is the bass?

Sometimes there might be an EQ issue, but it should be on the order of a few dB at the ends of the spectrum, at most. Some results might need
'mastering'. 'RAW' in a filename marks the unprocessed CD version -- the one with all of the hiss on older recordings. All of the recordings use the same settings (except a single minor switch on about 3-4 Carpenters recordings, which would also apply to some of Linda Ronstadt's recordings had I demoed them.)

This SW undoes (apparently close to correctly now) most of the original 'digital sound' complaint from the 1980s. The 'digital sound' problem was never solved until now. The problem wasn't caused by digital tech at all, and couldn't be practically solved until the CPUs of the 2010 era. Even wide-dynamics CDs like Telarc's 'Ein Straussfest' (Erich Kunzel) are compressed by the 'standard' technique. When the CD is corrected, the cannons really ARE a lot louder than the rest of the recording, just like real life.

* On typical recordings, after decoding, you might notice anywhere from 0-6dB of the 'loudness wars' loss go away. That is, the recordings will sometimes sound less loud for the same peak levels. This effect is extremely variable, but the decoded result is almost never as loud as the original.

* Sorry about the slow, random progress on the project. Imagine software of this complexity, with multiple layers of UNAVAILABLE specs. At least I have DolbyA and DolbySR schematics to emulate those. There is NO HINT anywhere about what the compression on consumer recordings is -- but now I can tell you (early, slightly buggy docs do exist publicly now, soon to be updated.)

Pointer to software:
https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0

What does the software really do? It undoes the final phase of compression applied to consumer recordings before distribution. The program is not just an 'expander', and the problem cannot be solved with an 'expander box'. The compression is purely algorithmic, and was done by essentially 1960s-through-1980s electronic HW. The decoder closely emulates that 'ancient' HW, but in reverse.

The quality of the project results has followed a 'drunken walk', partially frustrated by my ±5dB hearing at both ends of the spectrum. I am not deaf, but I have extremely variable hearing -- therefore settings that work well at one time sound bad (to me and everyone else) the next time. The settings are NOT tweaks, but hearing must still be used as a guidepost. Once I find the approximately correct setting, everything 'locks in'. However, the bass and the final version of the midrange needed serious iterative testing -- I couldn't find any clear 'tells' like I could in many other places.

There is little or no real 'tweaking' in the design of the higher-level 'decoder' project, but there are integral (in dB) steps for EQ, like 1.5dB, 3.0dB, 6dB, 9dB, 10dB, 12dB type numbers. I had troubles, but solvable troubles, until the last two unsolved design issues. Up until the last three months I could find 'tells' for reverse-engineering errors, but on the last two matters I had to mostly depend on my own listening judgement, and with my extremely variable hearing it has been full-time whack-a-mole.

The design is highly layered, and the consumer recording compression is NOT solvable with a single expander. Instead, the expansion is done on a segmented basis, with attack times in the 2msec-8msec range and release times in the 40msec-160msec range, depending on frequency and waveshape. (Yes, the attack/release is mostly NOT an R-C time constant, but is instead a variable time constant based upon diode nonlinearities.)
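To illustrate what a variable (non-R-C) time constant can mean in code, here is a minimal C++ sketch of an envelope follower whose effective time constant shortens for larger excursions. This is purely my own illustration -- NOT the decoder's code -- and the speed-up curve and constants are invented:

#include <algorithm>
#include <cmath>
#include <cstdio>

// Toy envelope follower with a program-dependent time constant.
// A larger gap between the rectified input and the current envelope
// shortens the effective time constant, loosely imitating a diode
// detector that conducts harder when driven harder.
struct EnvFollower {
    double fs;         // sample rate (Hz)
    double attackMs;   // nominal attack time constant (ms)
    double releaseMs;  // nominal release time constant (ms)
    double env = 0.0;  // current envelope estimate

    double process(double x) {
        double rect = std::fabs(x);                       // rectify
        double baseMs = (rect > env) ? attackMs : releaseMs;
        double ratio = (env > 1e-9) ? rect / env : 1.0;   // excursion size
        // Invented speed-up curve: bigger excursions -> faster smoothing.
        double speed = 1.0 + std::min(std::fabs(std::log2(ratio)), 3.0);
        double coeff = 1.0 - std::exp(-1000.0 / ((baseMs / speed) * fs));
        env += coeff * (rect - env);                      // one-pole smoother
        return env;
    }
};

int main() {
    EnvFollower f{48000.0, 2.0, 80.0};  // 2ms/80ms nominal, as quoted above
    for (int n = 0; n < 8; ++n)
        std::printf("%f\n", f.process(n < 4 ? 1.0 : 0.1));
    return 0;
}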


To make a long story short, some people reading this might be thinking: here is the crazy guy again... This has been a painful project
for three reasons:


1) Naysaying and industry pushback about revealing poor commercial recording quality.
2) ZERO specs anywhere, plus the complexity/depth of the process.
3) Extremely variable hearing on the part of the primary developer, and the frustration of potential users because of that.

Importantly, this SW is free, but it is a simple-to-use command-line program. Secondly, the design is based on reverse engineering,
but with a lot of design adjuncts to overcome the rather obvious fact that the recordings were never intended to be recovered.
Changing settings, even the calibration level, is seldom needed. Normalized CDs and digital copies are troublesome, but otherwise
the recordings usually work out of the box now. FREE DOES NOT MEAN WORTHLESS: if you
use this software, it is probably the single most complex piece of audio processing that you have used (including automatic room
equalizers, which MIGHT come close.) This software does NOT have a GUI -- I don't do GUIs, I don't do Android, and I don't do Windows.

Starting from day one, the ideas below were paramount. However, I often violated the 'keep it simple' approach, and I would have
benefitted from keeping things simple. Eventually, the LF EQ that I so painfully worked through ended up being the simplest
approach that I had tried so far!!!

1) Attempted a simple design, recognizing (from reverse engineering) that the compressor HW would limit complexity.
2) Built a platform/infrastructure with built-in support to undo the distortion & sidebands created by a lot of fast compression.

* The infrastructure was prepared to correct for the noise/sidebands/distortion created by dense/fast compression WELL BEFORE
the full solution was found.
The general sense of the needed design has been well known for over 2yrs; the exact numbers and configurations
have been incredibly variable.

This recovery is NOT ad hoc, but purely algorithmic. The 'compression' used on consumer recordings is so consistent that, after a lot of
work on the decoder, there is now only one relatively uncommon adjustment -- some older recordings use a slightly different
EQ. Normally there is not even a requirement to adjust the calibration (the level match for the compression curves), and when needed,
it is a number like -1, 0, 1, 2 (also -2, which is the default.) An EQ mode switch is sometimes needed for certain older recordings.

There is evidence that the decoder is correct now (modulo minor EQ problems), simply because the user-supplied settings are 99% no longer variable. That is, the EQ and levels are set in that 'magic niche' where all recordings work (except a few.)

I am not prepared to officially announce this *almost finished* version of the FA decoder right away, because I'll be looking for a few last-minute bits of feedback (I sure hope no more requests for more bass!!!) However, here is a pointer to the decoder, along with primitive docs for its use. There are Linux & Windows versions for SSE3, AVX2 and AVX512. The versions are separate because of the huge amount of interacting, very advanced SIMD usage -- and adding class member function calls that aren't inlined just adds overhead. Maybe someday I'll segment the code better. AVX512 can be about 30% faster in certain cases. Use AVX2 when you can, or at least use a fast machine if you can. The source code will be freed as soon as I garbage-collect the code and try to document some of the more esoteric sections (e.g. anti-MD.) I use 2-core, 4-core and 10-core machines, but the decoder can benefit from even more cores (probably 18 or more) when running in high-quality modes. It takes advantage of SIMD, and benefits primarily from high SIMD performance.

Pointer to the decoder (again):

https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0
 
John Dyson (OP):
Minor update -- it *looks* like I'll be able to do a more tested release tonight. The big bugaboo has been to effectively 'play Tetris' and find the right building blocks for the most correct, clean bass. There are no 'tells' for correct bass other than judgement and comparisons with feedback from the user base when testing other versions. Historically, the bass has been a little 'dirty' or 'thin', but never the clean in-between. When doing 1st-order EQ, errors do not sound like the 2nd-order (and higher) EQ errors that consumers normally deal with. I am not finalizing the EQ based on 'what sounds good to me'. It is being finalized based on:
1) Simple design, based on HW design limitations of the 1960s through 1980s.
2) Clean bass with tight transients.
3) Bass that doesn't overwhelm vocals.
4) Bass that meets intensity criteria based on previous user comments.

I would prefer to do a very clean, objective design of the EQ, but there are ZERO specs. The project has been like this all along... The only reason I figured out the dynamics processing is that I could detect the dynamics behavior and match it to a ubiquitous dynamics processing device that has been common since the 1960s. That match was pure luck. It took *literally*, *really* 2yrs to find the correct dynamics processing method.

Anyway -- I have been doing a myriad of comparisons with at least 5-10 previous versions and the feedback associated with them. The A/B comparisons on the current version under test appear very favorable in most ways. Right now, I am running the 'final check'. This 'final check' consists of the demos (both public snippets and private full recordings) that I offer. If the results meet my criteria, then the snippets/full recordings will be uploaded to the 'same places' about 1-2Hrs before the release.

I typically do releases at 9:00AM or 9:00PM USA Eastern time. I'll announce when ready -- but like any SW, it might be delayed... No guarantees, but I expect it WILL be ready when predicted.
 
John Dyson (OP):
Release V2.2.2J is working very nicely. It is at the usual location, but I am only halfway through the demos; they will be uploaded when complete. I am now offering the decoder at other sites -- it is REALLY good now.
About the bass -- I pushed it as far as I can, but if I add too much 1st-order bass, it muffles the vocals.
To get more bass, there must be some post-decoding mastering with 2nd-order and/or parametric EQ.

There were a few profound bugs in the previous demos here today/yesterday -- a debugging test that I forgot to disable turned off the pre/de-emphasis altogether. The effect is that the highs tend to slide around (a strange effect) unless the pre/de-emphasis is enabled. There are other bad side effects as well. This is a fully functional release.

Decoder location (been available for a while): https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0
Snippet Demos (about +1Hr away): https://www.dropbox.com/sh/tepjnd01xawzscv/AAB08KiAo8IRtYiUXSHRwLMla?dl=0

More than likely, by the time you read this, the V2.2.2J snippets will be available. I am also able to keep snippets online much longer because there is now more space on my Dropbox account.

The *only* known caveat is noted in a message in the decoder download area. It is about the de-emphasis, and which route to take. I have chosen the most conservative (and probably most likely correct) choice for this release.

At this release, decoder development becomes my secondary project, but my new temporary primary project will help support the decoder. I am writing a simple little set of subroutines that will automatically convert ANY Laplace-domain filter spec to the 'z' domain -- with NO limitations on order, etc. I know how to do it, but it will take a few hours, maybe a day. I am tired of using the fixed-purpose 2nd-order converter, which works poorly on 1st-order EQ.
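For the curious, the standard tool for this kind of conversion is the bilinear transform: substitute s = 2*fs*(1 - z^-1)/(1 + z^-1) into H(s) = N(s)/D(s) and expand the polynomials. Below is a compact, order-independent C++ sketch of that idea -- my own illustration, not the project's code (note: no frequency pre-warping is applied):

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Multiply two polynomials in z^-1 (coefficients in ascending powers).
static std::vector<double> polymul(const std::vector<double>& p,
                                   const std::vector<double>& q) {
    std::vector<double> r(p.size() + q.size() - 1, 0.0);
    for (size_t i = 0; i < p.size(); ++i)
        for (size_t j = 0; j < q.size(); ++j)
            r[i + j] += p[i] * q[j];
    return r;
}

// Bilinear transform of H(s) = num(s)/den(s) to H(z), any order.
// num/den hold ascending powers of s; bz/az come out in ascending
// powers of z^-1, normalized so az[0] == 1.
static void bilinear(std::vector<double> num, std::vector<double> den,
                     double fs,
                     std::vector<double>& bz, std::vector<double>& az) {
    size_t n = std::max(num.size(), den.size()) - 1;   // filter order
    num.resize(n + 1, 0.0);
    den.resize(n + 1, 0.0);
    bz.assign(n + 1, 0.0);
    az.assign(n + 1, 0.0);
    for (size_t k = 0; k <= n; ++k) {
        // Each s^k becomes (2*fs)^k * (1 - z^-1)^k * (1 + z^-1)^(n-k).
        std::vector<double> term(1, std::pow(2.0 * fs, double(k)));
        for (size_t i = 0; i < k; ++i)     term = polymul(term, {1.0, -1.0});
        for (size_t i = 0; i < n - k; ++i) term = polymul(term, {1.0, 1.0});
        for (size_t i = 0; i <= n; ++i) {
            bz[i] += num[k] * term[i];
            az[i] += den[k] * term[i];
        }
    }
    double a0 = az[0];
    for (size_t i = 0; i <= n; ++i) { bz[i] /= a0; az[i] /= a0; }
}

int main() {
    // Example: 1st-order shelf H(s) = (s + w1) / (s + w2), fs = 48kHz.
    const double PI = 3.14159265358979323846;
    std::vector<double> bz, az;
    bilinear({2.0 * PI * 100.0, 1.0}, {2.0 * PI * 1000.0, 1.0}, 48000.0, bz, az);
    std::printf("b: %g %g  a: %g %g\n", bz[0], bz[1], az[0], az[1]);
    return 0;
}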

John
 
John Dyson (OP):
The snippets referred to above are ready.
Tomorrow, 4 Apr, approx 24Hrs after this posting, there'll be a wrap-up release based upon comments on this one.
This is pretty much a project wrap-up; internal cleanups (e.g. the better 's' to 'z' converter, more internal docs)
will be the primary thrust from now on.

Possible problems:
1) Try the alternative HF EQ (which just might help.) (The difference in sound is hard to describe.)
2) Try less LF EQ (add -3dB below 50Hz down to 20Hz.)
3) A broken decoder version... (whoops -- I cannot test AVX512 on Windows, for example.)

Even though the decoder has taken a long time to write (without specs, etc.), there are still loose ends from
time to time, and there are NO testers other than myself. In a commercial situation, I wouldn't allow a release to be done without
internal testing. I am stuck trying to get this finished without ANY resources.

Also, my hearing has been messed up over the last few hours -- so my choice of HF EQ (there are two possible choices) might not be correct -- but might also be perfect. I cannot tell. The LF might benefit from -3dB at 20Hz (or not.) Again, I cannot hear well.
(The decoder has internal support for the available choices -- I simply need to permanently enable the correct ones.)

These changes are 'block' additions or subtractions, not variable tweaks. There are 'rules', and things should not
just be 'changed' without considering the rules -- or choosing new rules and reworking the subsection from scratch.
 
John Dyson (OP):
For those who might be skeptical about the capability/quality of the decoder, I'll produce a REAL-WORLD data point with ACTUAL non-FA-encoded material (yes, early vinyl, not cr*pped on like most of our libraries.) I have very few before-and-after examples; here is a 100% CONCRETE one.

THE DECODER IS NO JOKE.

The decoder IS real, and works better than any remedial home-brew EQ, or buying $10k speakers that only reveal the distorted consumer recordings more accurately. When you use the decoder, it is probably the most sophisticated piece of audio processing software that you use, including any DAW or even a single-ended recovery program. This decoder is a TRUE, hand-in-glove decoding of the damage done to most of your favorite recordings. It is NOT a 'hail Mary' NR system, but a true decoder. The NR from this decoder can be 30dB or more compared with what you normally listen to, depending on the recording. It can be amazing. The NR is TRUE, and if there is noise modulation, it comes from a very, very noisy signal.

Subject: snippet of 'Linda Ronstadt, Just one look'.
1) Original vinyl example
https://www.dropbox.com/s/wgko1yljytaew3z/JustOneLook-vinyl-snip.flac?dl=0

2) Decoded with the decoder, missing the 80Hz EQ (which had been mistakenly omitted for the last several months)
https://www.dropbox.com/s/i1yd556jox20u1v/JustOneLook-dec-snip.flac?dl=0

3) Decoded similarly to today's later release. Yes, this version can/will produce the extreme bass that I dislike so much. I'll probably support a disable option for the bass EQ, but the bass EQ will be the default, and it IS needed for accurate reproduction of recordings (it's just that I sometimes do NOT like 'accurate' when it comes to lots of bass.)

https://www.dropbox.com/s/bqp6y53i2tg3h0e/JustOneLook-NEWdec-snip.flac?dl=0
 
John Dyson (OP):
Got good feedback from some people helping with testing/reviewing the results. Unfortunately, the mind is willing, but the flesh is weak. I'll have to do the release tomorrow, +26Hrs from now. This thing is ALMOST ready to wrap up.
 
John Dyson (OP):
In reply to a question: "So this is something to try to revert compression? Since this is a process that can't be mathematically reverted, what are you doing?"
Yes, it does revert the compression. But the compression is MUCH, MUCH more complex than a normal 'compressor'. The program uses .wav file I/O, respects most metadata, and supports BEXT (generation/modification), RF64, etc. Rates for consumer use -- input: 44.1k->96k, 176k->192k, 352k->384k; output: 66.15k->96k, 176k->192k, 352k->384k. Input formats: 16-bit, 24-bit, FP. Output formats: 24-bit, FP.

There is a better release coming out this afternoon (probably 9PM USA Eastern time -- I release at 9AM or 9PM); I finally found the 'bass' problem. It was a *stupid* bug: I had disabled a section of code for testing and forgot to re-enable it, hence a loss of 6dB below 50Hz. Silly me. I knew that there was a 'problem' with the bass, but I also knew that the code was correct -- I didn't realize that I had disabled it with a temporary 'if(0)' type statement!!!

Anyway, about the 'reversion' of the compression: the term 'reversion' or 'inversion' is probably more accurate than just 'expansion' because of the complexity of what is going on. Below are some characteristics of the compression used on consumer recordings:

1) Multi-band -- 20->80Hz, 80->3kHz, 3k->9kHz, 9kHz->20+kHz... However, the 3kHz and 9kHz bands are overlapped. The compression threshold for the 80->3kHz band is approx 10dB lower than for the other bands. The compression attack time is approx 2->4msec at HF and 4->8msec at LF. The release time is between 40->80msec at HF and 80->160msec at LF. The attack/release is not a straight RC time constant, but varies based upon current and previous signal state (based on signal history, it can be calculated in a reproducible fashion.) (See the band-table sketch after this list.)

2) Multiple levels -- the compression is done at levels starting at (arbitrary numbers): -60, -50, -40, -30, -20dB, and then -70, -60 again. Each range is compressed at something similar to 2:1 and has a 20dB input range for a 10dB output range. So the compression, in a sense, scrambles the signal.
3) Pre-emphasis and de-emphasis, even with the multi-band scheme. Part of the pre/de-emphasis is a 9dB boost/cut from 3kHz on up and a 12dB cut/boost at 80Hz and below. Also, there is an approx 9dB dip in the 1kHz to 3kHz range.
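To keep item 1 straight, here is the described band layout written out as a small C++ table. The numbers are copied from the description above; the real decoder's internal constants are not published, so treat this as a reading aid only:

#include <cstdio>

// Band layout as described above (the 3k and 9k bands overlap).
// thresholdOffsetDb is relative to the other bands.
struct BandSpec {
    double loHz, hiHz;
    double attackMsLo, attackMsHi;    // attack time range (ms)
    double releaseMsLo, releaseMsHi;  // release time range (ms)
    double thresholdOffsetDb;
};

static const BandSpec kBands[] = {
    {   20.0,    80.0, 4.0, 8.0, 80.0, 160.0,   0.0 },  // LF
    {   80.0,  3000.0, 4.0, 8.0, 80.0, 160.0, -10.0 },  // ~10dB lower threshold
    { 3000.0,  9000.0, 2.0, 4.0, 40.0,  80.0,   0.0 },  // overlaps next band
    { 9000.0, 20000.0, 2.0, 4.0, 40.0,  80.0,   0.0 },  // HF (20+kHz)
};

int main() {
    for (const BandSpec& b : kBands)
        std::printf("%5.0f-%5.0f Hz: attack %g-%g ms, release %g-%g ms, thr %+g dB\n",
                    b.loHz, b.hiHz, b.attackMsLo, b.attackMsHi,
                    b.releaseMsLo, b.releaseMsHi, b.thresholdOffsetDb);
    return 0;
}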

The most important thing about the signal mangling (on almost every consumer recording) is that the signal above approx -20dB is not strongly affected, so the sound is *generally* normal, except at the low levels.

The sound character of the compression is a 'swirly' or 'smushed' high end; e.g., cymbals are totally squashed. The midrange sounds 'nasal' compared to a clean recording -- plus lots of other frequency defects. The stereo image 'telescopes': even though the image is 'plausible', it is quite distorted, and the most easily detectable damage is that while the stereo image at 90deg and 0deg is plausible, the 45deg image has an empty or suppressed spot.

---------------------------------------------
The processor UNDOES all of the distortions above, and more. One of the side effects of the insanely fast attack/release times used is that a 'sideband cloud' is created around audible details. The processor/decoder even mitigates a large part of that 'sideband cloud', reaching back a little into the encode/decode cycles of certain NR systems.

Suffice to say, the project is incredibly complex for audio processing, and has taken a LONG AND PAINFUL time. I have exhausted the interest of several generations of potential users, but that is not really of much consequence in the longer term. This is NOT intended as a money-making enterprise, none of the technology has been 'stolen', and the 'meat' of the algorithms has already been privately distributed and will be readied for full public distribution in some number of months (code cleanup, documentation, etc.) There are some never-before-used algorithms in the code, and frankly I'll eventually need some help from REAL 'math experts' to work things out. (Nonlinear analytical function processing.)

Like I wrote, this has been painful, and I have done a lot of demos that, in retrospect, were very embarrassing. The only saving grace is that this SW is hideously advanced -- NO ONE has ever done this before, or even written software versions of the basic expander element used in the decoder.

I'll announce the release when ready. I am running my massive set of test recordings through the to-be-released version right now, and need to do some serious critical listening. The changes are only in very specific areas of the code -- not in the very large support infrastructure (band-splitting, .wav/metadata I/O, rate conversion, multi-thread messaging support, basic command-line infrastructure -- those are all SOLID.) The only changes are to some EQ at this point -- not even changes to the expanders, which are SOLID and being used regularly, more and more, by professionals for their own specific purposes. (The expander section by itself is not applicable to home users, but the entire complex is.)

I will announce shortly (hopefully approx +14Hrs from posting time.)

John
 

KSTR (Major Contributor, Berlin, Germany):
Ahm, you are undoing only the deep compression (at ~ -20dB and below), the "low level density enhancement", and not the IMHO much more annoying top-level limiting/compression/soft-clipping which kills all the louder peaks (snares etc.)? I personally always liked the low-level density enhancement, but only when the top levels are not affected, so the "punch" remains fully intact.
 
John Dyson (OP):
(Replying to KSTR's post above.)

The more annoying top-level compression is not done by the compression process that this decoder undoes. The real damage is that the low levels are very aggressively scrambled by the compression process. It isn't just one compressor -- 7 of them are used -- it is an ugly thing. Also, the stereo image is badly distorted. Compression always has an effect on the stereo image, but this compression is insidious because it is so fast, and because it is done in tandem on various portions of the signal.

I think very seldom would ANY artist use 7 multi-band compressors, where each individual band has a 2msec->8msec attack time and a 40msec->160msec release time, and the attack/release is faster for larger excursions. The compression is relatively gentle only for small level changes, and very fast for larger level changes. There is a LOT of scrambling going on, and lots of changing phase shift; even though the HF filters have fairly low Q (just under 0.50), the phase still shifts all over the place. I used to have a more complete diagram, and somewhere I still do. However, here is a very simplified ASCII diagram of what the decoder does:

input (CD) -> input EQ -> layer EQ -> DolbyA(0) -> layer EQ -> DolbyA(1) -> layer EQ -> DolbyA(2) -> layer EQ -> DolbyA(3) -> layer EQ -> DolbyA(4) -> layer EQ -> DolbyA(5) -> layer EQ -> DolbyA(6) -> output EQ -> output

The encoding process is the inverse of the decoding process. Yes, those 'DolbyA' units are the same kind of thing that was used to 'spruce up' vocals or do NR in the olden days.
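As a purely structural sketch of that diagram (hypothetical class names and empty processing bodies -- the decoder's real classes are not public), the chain maps onto C++ code like this:

#include <memory>
#include <vector>

// Hypothetical stage interface; the real decoder's classes are not public.
struct Stage {
    virtual ~Stage() = default;
    virtual void process(std::vector<float>& buf) = 0;
};

struct Eq : Stage {
    void process(std::vector<float>& buf) override { /* fixed EQ curve */ }
};

struct DolbyADecode : Stage {
    explicit DolbyADecode(int layer) : layer(layer) {}
    void process(std::vector<float>& buf) override { /* one expander layer */ }
    int layer;
};

// Build: input EQ -> [layer EQ -> DolbyA(k)] for k = 0..6 -> output EQ
std::vector<std::unique_ptr<Stage>> buildChain() {
    std::vector<std::unique_ptr<Stage>> chain;
    chain.push_back(std::make_unique<Eq>());            // input EQ
    for (int k = 0; k < 7; ++k) {
        chain.push_back(std::make_unique<Eq>());        // layer EQ
        chain.push_back(std::make_unique<DolbyADecode>(k));
    }
    chain.push_back(std::make_unique<Eq>());            // output EQ
    return chain;
}

int main() {
    auto chain = buildChain();
    std::vector<float> buf(1024, 0.0f);
    for (auto& stage : chain) stage->process(buf);      // run the whole chain
    return 0;
}

The encoder would then be the same chain with each stage inverted and run in the opposite order, per the statement above.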

Here are the current sets of comparisons. Not all decodes are major improvements, and there are problems with the 1970s Carpenters examples because it seems their EQ was slightly different. I will never claim that the results sound better -- 'how good' is a matter of personal perception. I do believe that the decoded results come closer (not perfectly) to what was artistically produced.

1) Listen for hiss.
2) Listen for stereo image.
3) Listen for less nasal sound after decoding.
4) If there are any examples, listen for cymbals, etc.
5) Listen for 'thud' bass instead of compressed/fake bass.

EXAMPLES: (RAW is from CD or download, DEC is decoded)
Dropbox's .flac player sucks -- so just be aware of that!!!


https://www.dropbox.com/sh/v90m7q56g64tfgo/AACao_I34J7x2ZJu91qpKG4wa?dl=0
 

KSTR (Major Contributor, Berlin, Germany):
@John Dyson, I'll give it a try and will try to report back, though at the moment I can only listen over headphones, which is not very well suited to checking spatial issues and compression, for me at least, even with crossfeed (a complete reconstruction/overhaul of my living room is due).
No offense, but your posts (and the provided docs) feel a bit dense and congested to me, and hard to understand even for experts; that's probably why there seems to be little response to this interesting stuff.

As for music mixing and mastering, I did some semi-pro recordings in the late '90s / early '00s where I used several layers of C4 multiband compression and L1/L2 limiters (you surely know those plugins, by Waves Inc, Israel) on busses and the master, so I generally know what you're speaking about, and the net benefit outweighed the destruction done, at least for that kind of music (prog rock and other, heavier stuff). With the C4 being minimum-phase in its bands, a lot of excess phase was introduced, which took out some of the speed and impact of the drums; a try at phase correction (convolution with the time-inverted impulse response of the master C4 set to 1:1 ratios) produced transient pre-ringing (as expected, in hindsight) which was worse, so I lived with it.
 
John Dyson (OP):
(Replying to KSTR's post above.)

Thanks for the comment. I haven't officially announced it yet, but the V2.2.3A release is available, along with snippets. (I can make the full versions privately available to discuss certain issues.) For people who haven't used the decoder before: read the general DolbyA document, and then look at the 'StartUsing' document. If you still have trouble getting started -- just tell me, and I'll definitely be able to get you started using the decoder. Sorry about the sucky docs, but the project is mostly just me, and it has totally overwhelmed me for >5yrs...

Snippets (with CD snippet originals.) Version ID is kept in the directory name, just FYI.
https://www.dropbox.com/sh/tepjnd01xawzscv/AAB08KiAo8IRtYiUXSHRwLMla?dl=0

Windows & Linux command line binaries:
https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0

The binaries don't have dynamic CPU-type support -- everything is tightly inlined to avoid subroutine calls passing long SIMD variables. So
there are SSE3 versions (da-win), CORE2 (da-core2), AVX2 (da-avx) and AVX512 (da-avx512).

On Linux, at least, on a 10-core 'X' processor, the da-avx512 version is about 30% faster than AVX2. The AVX2 version, however, is very efficiently scheduled, and will use every last CPU resource that it can find (up to about 20 cores when actually needed.) On current, reasonably quick machines, it can run in realtime in the 'lowest' high-quality mode. On super-fast machines (like a 10-core or more 'X' machine), the program can run in realtime in the normal highest-quality mode. The program will NOT run in realtime on a 2-core i5 or Atom, even in the lowest quality mode.

The program does LOTS of heavy math -- it is not just a simple set of gain-control routines -- and is actually lightning fast for what it does (it schedules the CPU very well, and attempts good cache locality.)

Oh well -- so much for the deep-tech stuff.

Enjoy!!! (I guess this is now the 'official' announcement. If you find a 'V2.2.3B' appear, just use that one. I won't update V2.2.3A unless there is good reason to.)
 

KSTR (Major Contributor, Berlin, Germany):
Some quick first impressions (honest and unfiltered):

Mrs. Robinson: The processed version has less midrange body, zingy-stinging treble, more sibilance. I would prefer the original here, in that regard. The soundstage is nicer on the processed take.

Here Comes The Sun: The original is surely boomy at the very low end, but the processed version doesn't sound really good either -- some sort of boxy sound. The treble on the drums is broken in either case (there is no way to recover missing top end). Spatial rendering I again like a bit better on the processed one.

Mamma Mia: Same tendency; processed sounds honky and zingy to me, less balanced.

Danny's Song: Way(!!!) too much lower midrange, very noticeable on the onset of the vocals in the processed version; that one went wrong, I'd say.

Different EQ grossly dominates (and mostly not for the better, I feel), and that makes a fair comparison almost impossible, because we adapt fairly quickly to EQ, and then instant switching to the other version always sounds just plain off, no matter which is which. Testing would certainly need to "recalibrate" the ears to a standard (say, pink noise for simplicity); otherwise I think it doesn't work (for me). Or just enough pause to reset in between... that's what I did on "Danny", and the pronounced tubbiness is obvious even after a "cold start" (on my Senn HD700, attached to an RME ADI-2 Pro FSR, which is certainly blameless).
What's your monitoring situation, btw?

Not much difference heard in dynamics (where I expected the most impact would be heard, but see the disclaimer wrt headphones) or noise -- at least not beyond what might come from the EQ effect alone; there is no way to know unless this parameter were the only change. I did a quick check with DeltaWave and see that a lot of low-level gain has been reduced, though (like 20dB++?) -- unless the plot is (additionally) skewed by other factors, that is (which it probably is, given the EQ and spatial differences).

In general, there is so much difference in several domains at once that it's really hard to do comparisons in focused "spot" tests... Ideally, one might need to make long, relaxed listening sessions (one album in one go) with less focusing on details, and just let the overall feeling of listening pleasure/annoyance sink in (after dialing in any general "correcting" EQ to taste) -- listening in a more holistic way, so to say -- and then make a preference rating. Then any better rendering of space and low-level dynamics might really get through...

Technical note: the different sample rates (and start offsets) of the snippets are also not ideal, introducing another variable.

Hope this helps.
 
John Dyson (OP):
(Quoting KSTR's impressions above.)

I understand your opinions, and they are valid because they are based on perception. As I have stated almost everywhere, sometimes the reversion back closer to the original will not be subjectively preferable to the more highly compressed version.

WRT the dynamics, I can make a 'recording' of the dynamics processing -- it would make your hair curl -- multiple sets of 10dB/15dB changes, sometimes with 5 of the seven layers active at any given time. Almost all of the dynamics processing is below -30dB, and is intended to 're-organize' rather than to 'expand.' The difference is not profound like a simple 1:2 dynamics processor; it is a clarification. There is one HELL of a lot more dynamics processing than even a DBX2 expander -- because the DBX2 isn't fast enough; it can only slew so quickly.

Here is an example: you mentioned sibilance on the S&G recording -- that is to be expected. Reports tell me that the master tape is noticeably sibilant. Probably >50% of the other 'questionable' changes result from reversion to near-original!!!

The release is done, and is meeting the goal. Whether one prefers one version or the other is more about accommodation and what one is used to than about 'perfection' -- because there is NO PERFECTION, ANYWHERE.

As I have mentioned before (maybe elsewhere), I truly cannot listen to current CDs almost AT ALL -- they suck so badly. I have been accommodated to clean recordings since I was very young -- I also made/recorded them (orchestral stuff.) I couldn't 'grok' most CDs -- so I quit the HiFi hobby. The decoded versions come much closer to the originals that I remember. The 'cleanness' programming is still strong in my hearing apparatus, and most CDs just sound like garbage -- and it is NOT a subtle sense of garbage.

Note: even the originals are imperfect. For a good, true example of an imperfect decode, listen to the Bacharach medley from the Carpenters -- THERE IS SOMETHING WRONG WITH THAT, but from an integrity standpoint, I will not cherry-pick the examples.

Here is a WONDERFUL example of the decoder doing good things -- listen to 'Take a Chance on Me' before decoding and after. The original CD is so bad that it is ALL GARBLE. The decoder makes a lot of good progress in RE-ORGANIZING THE GARBLE.

THAT is what the decoder is about, and one reason why I don't call it an expander -- don't expect traditional expansion, because it does more 're-organization' and is not audible as expansion -- EVEN THOUGH IT DOES expand.


If you note the improvement on 'Take a Chance on Me' -- that is a more extreme case of the other improvements. Not all people will like the improvements; their brains are used to correcting the sound imparted to almost all digital recordings. There is NO REASON to fix that for people accommodated like that, because they enjoy the current sound -- but current CDs are NOT electrically anywhere near what was originally produced.

The decoder does a heck of a lot more to improve the clarity of most recordings than going from a $1k to a $10k set of speakers, because the original material is so very flawed without correction.
 

KSTR (Major Contributor, Berlin, Germany):
Sorry John, you aren't even remotely responding to the boxy/tubby EQ thing, and now you even start stressing and SHOUTING?

BTW, I'm not that used/accommodated to it -- I actually never listen to the type of music you've chosen for most of the examples (at least the part I've checked out so far)... and probably only very few other people do, let alone on true HiFi rigs. I would not find anyone in my circles (and I'm already an old fart, closing in on 60).

Is there a way to use the prog without those huge EQ changes (which are global, affecting high and low signal levels)? Otherwise, I'll have to analyse and correct all the gross EQ myself, which is an unnecessary/tedious task. Only with no major EQ changes do I see any chance of a meaningful comparison.
 
John Dyson (OP):
(Replying to KSTR's post above.)
I only stress, NEVER SHOUT!!! :) I never get emotional in the way that I think some people do. My strongest emotion is 'frustration'. My main mantra is true kindness. I will try to use italics except for single-word emphasis -- is that better?

About hearing: I have ZERO pride in the precision of my hearing; in fact, it sucks t*rds. However, I know distortion when I hear it. I am NOT focusing on nonlinear distortion here -- it is all about garble, phase effects, stereo image damage, etc. I am more into mitigating 'distractions', which is the reason why I had to quit listening to the totally horrible-sounding CDs back in the late 1980s. The distraction of distorted sound on CDs was so bad that I couldn't listen. I never heard that garble, grind, or swishy highs on any natural recording previously.
It makes NO sense to blame it on 'digital tech', because the problem in the recordings still exists (recent test: 'Shake It Off', by Taylor Swift.)

Anyway -- I do clearly hear the timing/gain-level distortions in almost all CDs. It is SO CLEAR to me that I get confused when people don't hear it. In fact, if the decoder didn't follow very precise algorithmic rules, I wouldn't even demo it. Recently there was a bug in the LF, but I couldn't correct it, because there was no justification -- I can NOT do tweaks, or there would be no discipline at all. I really thought that the EQ was already done, because it was clearly in the source code -- but I had mistakenly disabled it. I would NOT make a change without a technical justification. THIS is the only way that I can maintain my own integrity and be able to make clear and accurate statements about the distortion in consumer recordings. If I fall into the world of opinion, then all hope is lost.

The decoder is incredibly precise -- I just found a minor bug: about -1.5dB at approx 15kHz instead of at about 13kHz -- which for some people is just outside the region of hearing; other people might feel it to be an 'enhancement'. The frequencies are actually calculated, so I don't know off-the-cuff what they really are. I just had to correct it -- fixing the bug is important for maintaining the internal consistency and integrity of EVERYTHING needing a justification based on the behavior of associated components, and for understanding/finding tells about the other needed pre/de-emphasis. This correction is the reason for moving from V2.2.3A to V2.2.3B.

Maybe the EQ problem is what you heard?
 
John Dyson (OP):
Minor update from V2.2.3A to V2.2.3B (a movement of EQ based on calculation -- basically moving the -1.5dB point from about 15kHz down to about 13kHz.)

https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0

Bugfix from V2.2.3A to V2.2.3B...

Code change:
Error in the EQ for the HF pre/de-emphasis.

The 'past-12k' side of the 3-phase de-emphasis started at too high a frequency.
Code change below (approx line 8235 of audioproc.cpp):

// double fcor2 = iminfo.f1 * gcor1 * gcor1;  // old, V2.2.3A
double fcor2 = iminfo.f1 * gcor1;             // fixed, V2.2.3B

Basically, the compensation for the 12kHz de-emphasis started approx 1.19x too high in frequency (the old code applied an extra factor of gcor1, which is about 1.19.)
This *VERY SLIGHT* error created a mild, metallic effect in the sound. It could be partially corrected
by a small rolloff at about 9kHz, but this is a better fix.
 
John Dyson (OP):
I just realized that there might be a *fact* that I haven't explained here... EQ is done differently from one recording to another.
The compression used on most consumer recordings is the 'great homogenizer'. What I mean is that some
significant differences from recording to recording are uncovered by undoing the 'great homogenizer'.

In order to force through and emphasize certain sounds on a recording, sometimes there has to be a profound
emphasis (anyone using compressors should understand that differences in levels are compressed/made smaller.)

A good example is 'Maxwell's Hammer'. For my own taste, there is too much bass. However, on other recordings,
the result is just right. I am VERY SURE that the same 'great homogenizer' design was used on both.

What does this mean? If you want your recordings descrambled, you might need to undo some misguided mastering
that was intended to punch through the nasty scrambling/compression done to the recording.


If you think that there is 3dB or even 6dB too much bass, feel free to cut the bass a little -- usually the bass boost will be in the 80 to 120Hz
region. The decoder does NOT do mastering (well, you can do mastering with the decoder -- it has lots of features),
but it cannot do it AUTOMATICALLY.

Most of the time, the damage is done to the lows, not so much the highs. People sometimes seem to like 'bass'. I do NOT like a lot of bass;
in fact, my headphones go down almost to DC (well, not quite, but they are relatively flat to below 20Hz.) This makes me very
bass-averse, and if I were tuning the decoder for my own taste, there would be several dB less bass. Instead, the decoder
is ACCURATE, my own hearing notwithstanding.
 
KSTR (Major Contributor, Berlin, Germany):
John wrote: "Maybe the [13/15kHz] EQ problem is what you heard?"
No, what I heard is this:

Robinson: [attached plot]

Sun: [attached plot]

Mamma Mia: [attached plot]

Danny: [attached plot]


I think we can stop here... the exact same pattern on all four files.

In general, the tonal balance is totally screwed; we're talking a scary peak/valley landscape of +8dB/-10dB of EQ changes applied across the 20Hz to 20kHz range! No wonder it all sounds honky and zingy. Do you ever sanity-check your data? Obviously you don't.

Honestly, I'm out of here. You're either a troll, or living in a bubble, or even worse, made up a story in order to spread Trojan horses.
 