John Dyson
Rather good progress has been made... All I suggest: listen to the snippets before reading further -- then read. A cleaner, more natural result appears after processing by the decoder. Do not expect the fake, distorted, calliope-type bass of many consumer recordings -- that fake calliope sound isn't normal bass, but compressed and distorted bass.
Also, the highs might sometimes appear to be over-emphasized, which they JUST MIGHT BE. However, the highs should sound very coherent, not smeared into a fog or fuzz as on many consumer recordings. Some EQ might still be needed, but the only recent EQ request was for a little more lower midrange, and that has been done. (Frustratingly, since there are no 'tweaks' per se, it is necessary to find an entire new EQ sequence -- but once the correct one is found, EVERYTHING works.)
https://www.dropbox.com/sh/tepjnd01xawzscv/AAB08KiAo8IRtYiUXSHRwLMla?dl=0
Not all snippets are perfect comparisons (RAW vs V2.2.2B), but they are NOT cherry-picked. Some show profound improvement, some show minor improvement, and those used to the calliope-like bass of consumer recordings may ask: where is the bass? Sometimes there might be an EQ issue, but it should be on the order of a few dB at the ends of the spectrum, at most. Some results might need 'mastering'. If you don't see 'RAW' in the filename: the raw CD is the version with all of the hiss on older recordings. All of the recordings use the same settings (except a single minor switch on about 3-4 Carpenters recordings; the same would apply to some of Linda Ronstadt's recordings if I had demoed them.)
This SW undoes (apparently close to correctly now) most of the original 'Digital Sound' complaint from the 1980s. The 'digital sound' problem was never solved until now. The problem wasn't caused by digital tech at all, and couldn't be practically solved until the CPUs of the 2010 era. Even wide-dynamics CDs like Telarc's 'Ein Straussfest' (Erich Kunzel) are compressed by the 'standard' technique. After correcting the CD, the cannons really ARE a lot louder than the rest of the recording, just like real life.
* On typical recordings, after decoding, you might notice anywhere from 0-6dB of the 'loudness wars' loss go away. That is, the recordings will sometimes sound less loud for the same peak levels. This effect is extremely variable, but the decoded result is almost never as loud as the original (a small measurement sketch follows these notes).
* Sorry about the slow, random progress on the project. Imagine software of this complexity, with multiple layers of NON-AVAILABLE specs. At least I have DolbyA and DolbySR schematics to emulate those. There is NO HINT anywhere of what the compression on consumer recordings is -- but now I can tell you (early, slightly bugged docs do publicly exist now, soon to be updated.)
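To make the 'less loud for the same peak levels' note above concrete: removing the loudness-wars compression raises the peak-to-RMS ratio (crest factor). Here is a minimal measurement sketch -- my own illustration, not the decoder's metering, and plain RMS stands in for a proper loudness model such as LUFS:
[CODE=cpp]
#include <algorithm>
#include <cmath>
#include <vector>

// Crest factor (peak-to-RMS ratio) in dB. After decoding, peaks stay at
// roughly the same level while the average level drops, so this number
// grows by roughly the 0-6dB mentioned above. Plain RMS is a stand-in
// for a real loudness measure -- illustration only.
double crest_factor_db(const std::vector<double>& samples)
{
    double peak = 0.0, sumsq = 0.0;
    for (double s : samples) {
        peak   = std::max(peak, std::fabs(s));
        sumsq += s * s;
    }
    double rms = std::sqrt(sumsq / samples.size());
    return 20.0 * std::log10(peak / rms);
}
[/CODE]
Run it over matching RAW and decoded snippets; the decoded file should show the larger number.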
Pointer to software:
https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0
What does the software really do? It undoes the final phase of compression applied to consumer recordings before distribution. The program is not just an 'expander', and the problem cannot be solved with an 'expander box'. The compression is purely algorithmic, performed by essentially 1960s-through-1980s electronic HW. The decoder closely emulates that 'ancient' HW, but in reverse.
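As a rough illustration of 'emulating the HW in reverse' -- my own sketch with made-up threshold/ratio/time constants, NOT the actual FA parameters: a feedback-style compressor computes its gain from its own output, which is exactly the compressed signal the decoder receives, so the decoder can re-derive that gain and undo it.
[CODE=cpp]
#include <cmath>

// One conceptual layer of decoding. A feedback compressor derives its gain
// from its own OUTPUT -- and the compressed signal is what the decoder
// gets -- so the decoder can re-derive the same gain and add it back.
// All constants are illustrative, NOT the real FA values.
struct FeedbackExpander {
    double env = 0.0;           // tracked envelope of the compressed signal
    double attack = 0.01;       // per-sample smoothing coefficients
    double release = 0.0005;
    double threshold_db = -30.0;
    double ratio = 2.0;         // compression ratio being undone

    double process(double x) {
        double level = std::fabs(x);
        env += ((level > env) ? attack : release) * (level - env);

        double env_db = 20.0 * std::log10(env + 1e-12);
        double over   = env_db - threshold_db;
        // For a feedback topology, an output 'over' dB above threshold means
        // the compressor cut over*(ratio-1) dB; add that back.
        double cut_db = (over > 0.0) ? over * (ratio - 1.0) : 0.0;
        return x * std::pow(10.0, cut_db / 20.0);
    }
};
[/CODE]
The real process is layered and frequency-segmented per the description below; this only shows the core idea of recomputing a compressor's gain from its own output.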
The quality of the project results has been like a 'drunken walk', partially frustrated by my ±5dB hearing at both ends of the spectrum. I am not deaf, but I have extremely variable hearing -- therefore settings that work well at one time sound bad (to me and everyone else) the next time. The settings are NOT tweaks, but hearing must still be used as a guidepost. Once I find the approximately correct setting, everything 'locks in'. However, the bass and the final version of the midrange needed serious iterative testing -- I couldn't find any clear 'tells' like I could in many other places.
There is little or no real 'tweaking' in the design of the higher-level 'decoder' project, but there are integral (in dB) steps for EQ, like 1.5dB, 3.0dB, 6dB, 9dB, 10dB, 12dB type numbers. I had troubles, but solvable troubles, until the last two unsolved design issues. Up until the last three months, I could find 'tells' for reverse-engineering errors, but on the last two matters I had to mostly depend on my own listening judgement, and with my extremely variable hearing -- it has been full-time whack-a-mole.
The design is highly layered, and the consumer recording compression is NOT solvable with a single expander. Instead, the expansion is done on a segmented basis, with attack times in the 2msec-8msec range and release times from 40msec to 160msec, depending on frequency and waveshape. (Yes, the attack/release is mostly NOT an R-C time constant, but a variable time constant based on diode nonlinearities -- a sketch follows.)
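Here is a minimal sketch of a variable, diode-style time constant -- only the 2msec-8msec / 40msec-160msec ranges come from the text above; the curve shape and constants are my assumptions:
[CODE=cpp]
#include <cmath>

// Envelope follower with a signal-dependent time constant, mimicking a
// diode detector: a larger gap between the input level and the stored
// envelope lowers the diode's dynamic resistance and speeds the charge.
// Curve shape and constants are guesses for illustration.
struct DiodeDetector {
    double env = 0.0;
    double fs  = 44100.0;                        // sample rate

    double step(double x) {
        double diff = std::fabs(x) - env;
        // Base times from the post: ~2-8msec attack, ~40-160msec release.
        double tau = (diff > 0.0) ? 0.004 : 0.080;   // seconds
        tau /= (1.0 + 4.0 * std::fabs(diff));        // diode-like speedup
        env += (1.0 - std::exp(-1.0 / (tau * fs))) * diff;
        return env;
    }
};
[/CODE]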
To make a long story short, some people reading this might be thinking: here is the crazy guy again... This has been a painful project for three reasons:
1) Naysaying and industry pushback over revealing the poor quality of commercial recordings.
2) ZERO specs anywhere, plus the complexity/depth of the process.
3) Extremely variable hearing in the primary developer, and the resulting frustration of potential users.
Importantly, this SW is free, with a simple-to-use command line. Secondly, the design is based on reverse engineering, but with a lot of design adjuncts to overcome the rather obvious fact that the recordings were never intended to be recovered. Changing settings, even the calibration level, is seldom needed. Normalized CDs and digital copies are troublesome, but otherwise the recordings usually work out-of-the-box now. FREE DOES NOT MEAN WORTHLESS: if you use this software, it is probably the single most complex piece of audio processing you have ever used (including automatic room equalizers, which MIGHT come close.) This software does NOT have a GUI -- I don't do GUIs, I don't do Android, and I don't do Windows.
Starting from day one, the ideas below were paramount. However, I often violated the 'keep it simple' approach, and I would have benefitted from keeping things simple. Eventually, the LF EQ that I so painfully worked through ended up being the simplest approach I had tried so far!
1) Attempt a simple design: since this is reverse engineering, the original compressor HW would have limited the complexity.
2) Build a platform/infrastructure with built-in support to undo the distortion & sidebands created by a lot of fast compression.
* The infrastructure was prepared to correct the noise/sidebands/distortion created by dense/fast compression WELL BEFORE the full solution was found (a toy demonstration of where those sidebands come from follows this list). The general sense of the needed design has been well known for over 2yrs; the exact numbers and configurations have been incredibly variable.
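The promised toy demonstration (illustration only, not decoder code): fast gain modulation is amplitude modulation, and the identity sin(a)cos(b) = 0.5*[sin(a+b) + sin(a-b)] puts energy at the carrier frequency plus/minus the modulation frequency:
[CODE=cpp]
#include <cmath>
#include <cstdio>

// Fast gain modulation IS amplitude modulation. A 1kHz tone whose gain
// wobbles at 100Hz acquires sidebands at 900Hz and 1100Hz; a fast
// compressor's gain signal is dense with such modulation components.
int main()
{
    const double kPi = 3.14159265358979323846;
    const double fs = 44100.0, fc = 1000.0, fm = 100.0;
    for (int n = 0; n < 441; ++n) {                       // 10msec of samples
        double t    = n / fs;
        double gain = 1.0 + 0.2 * std::cos(2.0 * kPi * fm * t);
        double y    = gain * std::sin(2.0 * kPi * fc * t);
        std::printf("%f\n", y);    // an FFT of y shows peaks at fc, fc +- fm
    }
    return 0;
}
[/CODE]
Those extra components are one source of the 'fog' mentioned earlier, and they are what the infrastructure was built to clean up.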
This recovery is NOT ad hoc, but purely algorithmic. The 'compression' used on consumer recordings is so consistent that, after a lot of work on the decoder, there is now only one relatively uncommon adjustment -- some older recordings use a slightly different EQ. Normally there is not even a requirement to adjust the calibration (the level match for the compression curves), and when needed, it is a number like -1, 0, 1, 2 (or -2, which is the default.) An EQ mode switch is sometimes needed for certain older recordings.
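To give a numeric feel for a calibration step -- interpreting it as a plain linear pre-gain is my assumption for this sketch; in the real decoder it is a level match into the compression curves:
[CODE=cpp]
#include <cmath>

// The calibration setting is a small integer dB offset (-2 is the default)
// that lines the input level up with the compression curves. Treating it
// as a simple linear pre-gain is an assumption for illustration only.
double calibration_gain(int cal_db)        // e.g. -2, -1, 0, 1, 2
{
    return std::pow(10.0, cal_db / 20.0);  // -2dB -> about 0.794x
}
[/CODE]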
There is evidence that the decoder is correct now (modulo minor EQ problems), simply because the user-supplied settings are 99% no longer variable. That is, the EQ and levels sit in that 'magic niche' where all recordings work (except a few.)
I am not prepared to officially announce this *almost finished* version of the FA decoder right away, because I'll be looking for a few last-minute bits of feedback (I sure hope no more requests for more bass!!!) However, here is a pointer to the decoder, along with primitive docs for its use. There are Linux & Windows versions for SSE3, AVX2 and AVX512. The versions are separate because of the huge amount of interacting, very advanced SIMD usage -- adding class member function calls that aren't inlined just adds overhead. Maybe, someday, I'll segment the code better. AVX512 can be about 30% faster in certain cases. Use AVX2 when you can, or at least use a fast machine. The source code will be freed as soon as I garbage-collect the code and try to document some of the more esoteric sections (e.g. anti-MD.) I use 2-core, 4-core and 10-core machines, but the decoder can benefit from even more cores (probably 18 or more) when running in high-quality modes. It takes advantage of SIMD, and benefits primarily from high SIMD performance.
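If you're unsure which build matches your CPU, here is a quick check using GCC/Clang's x86 builtins (on Linux, grepping /proc/cpuinfo for avx2 or avx512f also works):
[CODE=cpp]
#include <cstdio>

// Pick the decoder build that matches this CPU. __builtin_cpu_supports
// is a GCC/Clang builtin for x86; __builtin_cpu_init() is called first
// for portability to older GCC versions.
int main()
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f"))
        std::puts("AVX512 build (up to ~30% faster in certain cases)");
    else if (__builtin_cpu_supports("avx2"))
        std::puts("AVX2 build");
    else
        std::puts("SSE3 build");
    return 0;
}
[/CODE]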
Pointer to the decoder (again):
https://www.dropbox.com/sh/1srzzih0qoi1k4l/AAAMNIQ47AzBe1TubxJutJADa?dl=0