
Humpify -- apply "ESS hump"-style distortions to audio files

KSTR

Hi,

Some of you might have followed my thread Measurements: "ESS Hump" revisited (Khadas Tone Board V1.3), where I investigated the infamous "ESS hump" in close detail. It turned out that the distortion can be emulated very precisely with a simple static non-linearity (which is not the root cause, though; rather, a less-than-optimal circuit and choice of OpAmps produce an equivalent error pattern). I achieved an almost perfect fit of model to measurements for the KTB, in both the frequency and time domains, at any signal level. This offers a way to investigate the audibility of this very special distortion pattern in true isolation, free from other factors, simply by testing with files. It also allows for variations: the level of the distortion can be increased at will.


This is a simple command-line tool for Windows (source code is supplied in case you want to compile it for Linux etc.), and I'm hoping you are familiar with working on the command line. If not, the easiest way is to create a directory and put the program there along with the audio files you want to process. Then open a command shell by pressing WindowsKey+R, typing "cmd" and pressing Enter. Then navigate to the directory using the "cd" command (copy the path in Explorer with Ctrl-C and paste it with Ctrl-V).

Usage: humpify in_file out_file zoom_factor_db [-r]
 in_file        : WAV input file, must be 24bit packed integers format
 out_file       : generated WAV output file, 24bit packed integers
 zoom_factor_db : zoom factor in dB for the generated distortion, 0 ... +100
 -r             : optional, output file is residual only (not added to input file)

A typical call would look like this:
humpify "NORTH COUNTRY II.wav" test.wav 30
Note the quote characters (") surrounding the file name here; they are required when the name contains spaces or other reserved characters.

No file format checking etc is done by the program, I'm blindly assuming you are feeding proper .WAV files in 24 bit resolution (3-byte packed integer format), stereo or mono, with no frills like extra metadata etc. Sample rate does not matter. If you have regular 16-bit files you need to convert them to 24 bits first.
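
For readers unsure what "3-byte packed integer format" means, here is a minimal Python sketch of how such samples can be decoded, encoded, and widened from 16 bits. It is not taken from the humpify source; the function names are made up for illustration, and it assumes little-endian signed PCM as used in ordinary WAV files.

```python
def decode_s24le(raw: bytes) -> list[int]:
    """Decode packed 24-bit little-endian signed PCM into Python ints."""
    if len(raw) % 3:
        raise ValueError("length must be a multiple of 3 bytes")
    return [int.from_bytes(raw[i:i + 3], "little", signed=True)
            for i in range(0, len(raw), 3)]


def encode_s24le(samples: list[int]) -> bytes:
    """Encode ints as packed 24-bit little-endian PCM, clipping to the valid range."""
    lo, hi = -(1 << 23), (1 << 23) - 1
    return b"".join(min(max(s, lo), hi).to_bytes(3, "little", signed=True)
                    for s in samples)


def widen_16_to_24(sample16: int) -> int:
    """Convert a 16-bit sample to 24 bits by shifting it up by 8 bits."""
    return sample16 << 8
```

Widening 16-bit material this way changes nothing audible; it only pads each sample with a zero low byte so the file matches the layout the tool expects.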

The "-r" option outputs the added distortion in isolation rather than imposed on the input file, so you can listen to it directly. The zoom factor is applied here as well (use 100 for maximum level -- which will be -8dBFS peak).
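
The -8dBFS figure follows from plain dB arithmetic, assuming the unzoomed ripple sits at the roughly -108dB (4e-6 of full scale) reported for the KTB further down in this thread. A small Python sketch, with illustrative function names that are not part of the tool:

```python
import math

# Ripple amplitude relative to full scale, as quoted in the thread (about -108 dB).
BASE_RIPPLE = 4e-6


def db_to_linear(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def ripple_peak_dbfs(zoom_db: float) -> float:
    """Peak level of the generated ripple in dBFS for a given zoom factor."""
    return 20.0 * math.log10(BASE_RIPPLE * db_to_linear(zoom_db))


for zoom in (0, 30, 60, 100):
    print(f"zoom {zoom:3d} dB -> ripple peak about {ripple_peak_dbfs(zoom):6.1f} dBFS")
```

Because the zoom factor is a plain gain on the residual, every 20dB of zoom raises the ripple amplitude by a factor of ten.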

If you have well recorded music with a lot of dynamic range, that is, containing rather quiet passages, preferably in a hi-res format (24bit, and higher than 44.1kHz is useful as well here) then you will probably notice the distortion immediately with zoom factors of 40dB and more. That's the easy part ;-)

Once you go below roughly 30dB it's going to be a lot harder to tell the files apart in a controlled blind listening test (like with foobar's ABX plugin). The real question now is whether we have members who can identify the distortion at the actual true distortion level. Or, to make it a little easier, at zoom factors no larger than 10...20dB?

We have to keep in mind that the playback DAC is also important here. When it sports the hump itself it makes little sense to use the program with 0dB zoom factor because then the contribution from the program is too small compared to the baseline error. But at 10dB or more the generated distortion should dwarf any similar distortion in the playback DAC.

Obviously, the emulation does not mimic the complete behavior of a DAC, because only the specific distortion mechanism responsible for the hump is modelled and nothing else. Still, I think it can be very useful for evaluating basic audibility thresholds.
 

Attachments

  • humpify.zip (11.3 KB)
As a point of reference, what zoom factor corresponds to the level of distortion observed on the KTB?
 
@Pdxwayne ? You generally like to take these challenges. Had a chance to have a listen?
 
Wow, I totally missed this thread!

Thanks for the invitation.
: )
Yes, I am interested.

First I am going to try to figure out why my Gustard DAC behaves differently with the 16bit 44.1 vs the 24bit 44.1 setting in the Windows advanced audio settings when doing online timing listening tests. I will compare with my other two DACs. I hope to collect more information today and maybe ask other Gustard owners to confirm my observations.

After this, yes, will work on this.
: )
 
KSTR: Thanks a lot for your analysis. Please can you describe more in detail the steps which took you from the spectrum/distortion charts to the sine-modulated static transfer function which produces the same type of distortion? IMO that's very interesting. If this transfer function was a bit stable it could be easily compensated by an "antidistorting" calculation before the DAC. But I assume the actual distortion cannot be approximated by a semi-stable transfer function.
 
Hi Pavel,
There were several observations that finally led to the lightbulb moment, mainly from looking at the time-domain distortion residuals:
- No dependency on sample rate --> it is only related to the final 6-bit DAC stage, running at 100MHz (ASRC mode)
- No dependency on frequency: 100Hz, 1kHz and 10kHz all give the exact same picture (at levels where the normal distortions are low).
- The pattern changes cyclically when stepping through digital DC offsets; it repeats 64x over the whole sample value range (the 6-bit final DAC again)
- The sine-like ripple patterns in the residual slow down in frequency near the extrema of the input sine, where the slope decreases

I racked my brain over a mechanism that would cause the OpAmps to add this cyclic offset, but at some point it became clear that the modulation of the 6-bit DAC, when fed with a DC signal, will (likely) exercise only one segment of the DAC (one set of current switches) when the signal is in the middle of a segment, whereas it must exercise two segments when the signal is at the split point between segments. Basically multi-level PWM. That seems to give different amounts of positive and negative transitions (and maybe glitches) at different times, and that seems to throw off the OpAmps, as they demodulate those sharp edges differently depending on circumstances.
I'm not fully able to find all the right words for this but it must be something along these lines.

The key moment came when I simply tried to create the residual with a sine modulation of the transfer function, with 32 periods from zero to +FS (so 64 segments total). Bingo, the characteristic was clearly there, just some scaling was off. I fudge-factored the number of periods, and at around 26.5 periods (which means the modulation depth is somewhat lower than full scale) I found a close resemblance between the measured and emulated residual patterns at different levels. Double-checking with the spectra confirmed it (I was quite shocked by the striking fit for the lower harmonics, up to about the 20th). Finally I dialed in the magnitude of the sine modulation so that it matched what I had measured, and that was at -108dB (4e-6).
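
The model described here can be sketched in a few lines of Python. This is only an illustration built from the numbers in the post (26.5 periods from zero to +FS, amplitude 4e-6); the phase and sign of the sine, the function names and the `zoom_db` / `residual_only` parameters are assumptions, not the actual humpify code.

```python
import math

PERIODS = 26.5      # ripple periods between zero and +FS, as fitted in the post
AMPLITUDE = 4e-6    # ripple amplitude relative to full scale (about -108 dB)


def hump_residual(x: float, zoom_db: float = 0.0) -> float:
    """Ripple added to a sample x in [-1, 1]; sine phase and sign are assumed."""
    gain = 10.0 ** (zoom_db / 20.0)
    return gain * AMPLITUDE * math.sin(2.0 * math.pi * PERIODS * x)


def humpify_sample(x: float, zoom_db: float = 0.0, residual_only: bool = False) -> float:
    """Apply the static non-linearity, or return only the residual."""
    r = hump_residual(x, zoom_db)
    return r if residual_only else x + r


# Example: 10 ms of a low-level (-40 dBFS) 1 kHz sine at 48 kHz, zoomed by 30 dB.
fs, f0, level = 48000, 1000.0, 0.01
signal = [level * math.sin(2.0 * math.pi * f0 * n / fs) for n in range(fs // 100)]
processed = [humpify_sample(s, zoom_db=30.0) for s in signal]
```

Since the mapping depends only on the instantaneous sample value, it is a memoryless (static) non-linearity, which fits the observation that the pattern does not depend on signal frequency or sample rate.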

Will make some plots and show them.

-------------------

I would think pre-distortion with the inverted linearity modulation is very likely to be effective; a reduction of 20dB looks possible to me over a wide frequency range. It really looks like a stable static transfer nonlinearity.
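
As a rough illustration of the pre-distortion idea, the Python sketch below applies a first-order inverse (subtract the modelled ripple before the "DAC") to the sine model from this thread. It assumes a perfectly known, perfectly stable model, so it shows a best case only; the names are hypothetical and nothing here is measured data.

```python
import math

PERIODS, AMPLITUDE = 26.5, 4e-6   # model parameters quoted in the thread


def ripple(x: float) -> float:
    """Modelled static error for a sample x in [-1, 1]."""
    return AMPLITUDE * math.sin(2.0 * math.pi * PERIODS * x)


def dac(x: float) -> float:
    """Modelled DAC: ideal output plus ripple."""
    return x + ripple(x)


def predistort(x: float) -> float:
    """First-order inverse: subtract the error expected at this sample value."""
    return x - ripple(x)


def worst_error(process, points: int = 20001) -> float:
    """Largest deviation from the ideal output over a grid spanning -FS to +FS."""
    xs = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(abs(process(x) - x) for x in xs)


uncorrected = worst_error(dac)
corrected = worst_error(lambda x: dac(predistort(x)))
improvement_db = 20.0 * math.log10(uncorrected / corrected)
print(f"ideal-case improvement: {improvement_db:.1f} dB")
```

In this idealized setting the remaining error is second-order and the improvement comes out far larger than 20dB; with a real device, noise, drift and model mismatch would limit the gain, so the 20dB estimate above is the more realistic expectation.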

I'm currently trying to extract the transfer function directly from measurements for inversion, but have had no real success yet using dynamic signals. Heavy time-domain sync'd averaging is a must, as you might imagine, to arrive at low-noise data for a sample value mapping with interpolation. I might try "slowly changing DC" (the DAC is DC-coupled, the ADC can be as well) using a slow (1Hz or lower) triangular wave, but that of course makes averaging awkward, plus there are drift problems etc. OTOH, with a dynamic signal like a sine at higher frequencies I'm having trouble with not as many codes being exercised, plus sub-sample shift issues, additional (frequency-dependent) distortion creeping in, etc.
Work in Progress. If you, and/or @pkane, have some pointers how to get to the sample mapping I'd be happy, of course. I could offer to provide the measured data once we have figured out what's best suited...
 
Klaus, I have a couple of mechanisms I've been testing for transfer function extraction/correction. The one that's currently available in Deltawave is probably not what you'd need for this, as it's more of a deconvolution process based on averaged transfer function measured from multiple delta measurements. Probably not going to be easy to measure or to fix anything as low-level as this distortion appears to be.

The other one uses harmonic distortion to create the inverse non-linearity that can be applied before or after loopback (pre or post compensation). This one has the potential to address this. For example, I've been able to get my ADI-Pro FS to produce a THD of better than -130dB @1k:
adi2.png
It's not ready for prime-time, but if you want, I can share an early version of this feature.
 
Great, looking forward to this, Paul!

In DW, the linearity plot shows exactly what's going on with the emulated scenario, even at those very low levels of distortion... which is easy, as there is no noise and no (sub-)sample offset. With real data I'm struggling, but at one point I got a plot that was correct down to 16..17bit levels.

EDIT: Plot with the emulated data, 1 second of pink noise:
ripple-pattern.gif


I'll try the quasi-DC "Voltmeter approach" in the next few days (with heaps of averaging, like 10,000x -- which means 10,000 seconds of recording time for a 1-second ramp function) and see if that works out.
A brute-force lookup table will be a memory and CPU hog but a refined curve fitting algo could be feasible... not really my field of expertise, though.
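
To put the memory concern in perspective, here is a Python sketch of the middle ground between a full per-code table and a fitted curve: a coarse lookup table with linear interpolation. The sine model from this thread stands in for measured data, and the table size of 4097 entries and all names are arbitrary choices for illustration.

```python
import math

PERIODS, AMPLITUDE = 26.5, 4e-6


def ripple(x: float) -> float:
    """Stand-in for measured error data: the sine model from the thread."""
    return AMPLITUDE * math.sin(2.0 * math.pi * PERIODS * x)


def build_table(size: int) -> list[float]:
    """Sample the error curve at `size` evenly spaced points from -FS to +FS."""
    return [ripple(-1.0 + 2.0 * i / (size - 1)) for i in range(size)]


def lookup(table: list[float], x: float) -> float:
    """Linearly interpolate the table at sample value x (clamped to [-1, 1])."""
    pos = (min(max(x, -1.0), 1.0) + 1.0) / 2.0 * (len(table) - 1)
    i = min(int(pos), len(table) - 2)
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


table = build_table(4097)

# For comparison: one 64-bit float for every 24-bit code.
full_table_bytes = (1 << 24) * 8
print(f"full per-code table: {full_table_bytes / 2**20:.0f} MiB")
```

A smooth error curve like this one is reproduced closely by a few thousand interpolated points, whereas one 64-bit entry per 24-bit code would need 128 MiB. Whether a real measured curve is smooth enough for this is an open question here.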
 
Klaus, please would you have several-second captures (e.g. flacs) of a fixed sine signal at different frequencies and amplitudes? I could try to curve-fit the transfer function in octave/matlab, to see how much the transfer function varies.
 
Hi,
I use Linux, so I compiled it successfully with the command line: gcc -Wall humpify.c -o humpify -lm

But I have tried several zoom_factor_db values and the generated file is very bad:
- zoom_factor_db 100 is white noise
- but other settings like 0, 0.1, 1 or 10 generate a badly distorted version of the original (no blind test needed)

I have also tried the provided humpify.exe with Wine, but it produces the same very distorted noise.

Tested on Ubuntu 21.10
 
I used Audacity to export my 44.1 wav file to signed 24-bit PCM format. Then I used the humpify app to apply a zoom factor of 30. Even at 30, DeltaWave is saying that it is unlikely I will hear any difference. So I am not even going to attempt any ABX listening tests.

: )

pk_metric_right_channel.PNG
 
- but other settings like 0, 0.1, 1 or 10 generate a badly distorted version of the original (no blind test needed)
Is your wav really the basic format with the 44-byte header that humpify expects? E.g. the extended wav format (the default wav in SoX) produces a header 80 bytes long. To get the 44-byte header, the type -t wavpcm must be used.
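
To check which kind of header a file has, one can walk the RIFF chunks and see where the audio data actually starts. The Python sketch below does that for a file already read into memory; `find_chunks` is a hypothetical helper for illustration, not something the tool provides.

```python
import struct


def find_chunks(wav_bytes: bytes) -> dict:
    """Walk the RIFF chunks; return format info and the offset of the audio data."""
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    info, pos = {}, 12
    while pos + 8 <= len(wav_bytes):
        chunk_id, size = struct.unpack_from("<4sI", wav_bytes, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            # The first 16 bytes of the fmt chunk are common to all WAV variants.
            tag, channels, rate, _, _, bits = struct.unpack_from("<HHIIHH", wav_bytes, body)
            info.update(format_tag=tag, channels=channels, sample_rate=rate, bits=bits)
        elif chunk_id == b"data":
            info.update(data_offset=body, data_size=size)
            break
        pos = body + size + (size & 1)   # chunks are padded to an even length
    if "data_offset" not in info or "bits" not in info:
        raise ValueError("missing fmt or data chunk")
    return info
```

A `data_offset` of 44 together with `format_tag` 1 (plain PCM) and `bits` 24 is the layout described above; any other offset means the samples do not start where a fixed 44-byte skip assumes, and header bytes would then be processed as audio.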
 
Please forgive my laziness in not having implemented a proper .WAV import that can handle different formats. I will modify the code so that the header length is handled, PCM data is checked for, and both 16 and 24 bit data are accepted.

See Pavel's answer for a fix (I didn't know that SoX does not use the standard 44-bytes header).
 
Do you need the raw recordings? Or would block-averaged data help, which denoises the data and condenses the sine to exactly one single cycle (output format would be 64bit floats)?
 
My bad, yes with a 24bit signed wav the produced sound is OK.

Very quick test: the distortion is obvious at 60dB.
 
@phofman,
Quick try is looking better than expected.
0.25Hz triangular wave, 0dBFS, 192kHz, which means one ramp is 384000 samples long, giving some 18bits of covered input codes range.
66 averages (~18dB of noise reduction).
Preliminary removal of the DC offset in the recording, so ignore that part for the moment.
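
The numbers above can be checked with a short Python sketch. It assumes the noise is uncorrelated between averaged blocks, so that averaging N blocks improves the signal-to-noise ratio by 10*log10(N) dB; the function names are illustrative only.

```python
import math


def ramp_samples(triangle_hz: float, sample_rate: int) -> int:
    """Samples in one ramp (half a period) of a triangular wave."""
    return round(sample_rate / triangle_hz / 2)


def bits_covered(samples: int) -> float:
    """Equivalent number of bits if each sample of the ramp hits a distinct level."""
    return math.log2(samples)


def averaging_gain_db(n_averages: int) -> float:
    """SNR gain from synchronously averaging n blocks with uncorrelated noise."""
    return 10.0 * math.log10(n_averages)


def sync_average(blocks: list[list[float]]) -> list[float]:
    """Average time-aligned blocks sample by sample."""
    n = len(blocks)
    return [sum(column) / n for column in zip(*blocks)]


n = ramp_samples(0.25, 192000)
print(n, f"{bits_covered(n):.2f} bits", f"{averaging_gain_db(66):.1f} dB")
```

By the same rule, the 10,000 averages mentioned earlier would give 40dB of noise reduction, at the cost of a correspondingly long recording.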

Subtraction of input from output slope (from -FS to +FS, so at sample #192000 we are at input zero):
1644927038245.png

Noisy, but we can see the ripple pattern already, besides the overall curvature... the negative branch of sample values sees a compression once it's more than 0.3 of -FS or so. Similar kink showing up at the end of the positive range.
DC-distortion/nonlinearity of some kind (could be thermal). Contribution from the ADC is unknown but I would hope it is small.

Cleaning a bit with 12th-order linear phase lowpass filter at 300Hz:
1644927351532.png

There we have it. Zero center seems spot on (rotation-symmetric pattern).
And I'm counting ~26.3 periods, bingo.

Zooming in:
1644927472814.png

Ok, not a sine, rather a pretty much triangular pattern. The more I think about it, the more this looks like the pattern that should be expected.
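
For anyone who wants to experiment with that observation, a triangular ripple can be swapped in for the sine in the model with a one-line change. The Python sketch below assumes the same period count and peak amplitude as the sine model (26.5 periods, 4e-6); the measured values for the triangular shape are not given here, so treat both numbers and the function name as placeholders.

```python
import math

PERIODS, AMPLITUDE = 26.5, 4e-6   # assumed to carry over from the sine model


def triangle_ripple(x: float) -> float:
    """Triangular ripple with the same period, peak and zero crossings as the sine."""
    phase = 2.0 * math.pi * PERIODS * x
    # asin(sin(phase)) is a triangle wave spanning -pi/2 ... +pi/2.
    return AMPLITUDE * (2.0 / math.pi) * math.asin(math.sin(phase))
```

Like the sine version, this function is odd in x, which matches the rotation-symmetric pattern around zero seen in the plot.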

I would guess, encouraged by this preliminary shot from the hip, that the DC approach is going to give precise results with a bit of care.
 