
Study of the effects of Nonlinear Distortion on the Perceived Sound Quality

OP
Pinox67

Member
Joined
Apr 8, 2020
Messages
34
Likes
76
Location
Italy
I am going to watch that interview in a few moments.
A couple of thoughts. First, I admire you for the undertaking, which should be useful in a broad sense. The more we understand about what kinds of distortion matter, the better. My sense, though, is that this may be something that is already understood fairly well, thanks to prior work by a variety of other researchers. As such, before undertaking something as ambitious as what you are undertaking, I would first search the literature exhaustively. In fact, a paper that definitely summarizes the existing investigations and findings would be very useful all on its own, possibly deserving of publication, independently of the independent research you are planning to do.

Almost all the studies I have found concern the audibility threshold of various forms of linear and non-linear distortions, not if these can be used to improve the enjoyment of listening to real music. I realize that it is much more difficult to quantify this in numbers, but in the end this is what matters most to an audiophile and has probably not been studied in detail. If you can find something about it, I would be happy if you share it, maybe it will save us some work.

Second, I cannot help but notice something that seems a bit curious, in what you wrote and that I excerpted above. You alluded to realism, i.e., "the illusion of being in front of the live musical event". Implicitly you are saying that in order to achieve this illusion, the sound must be altered so that it does not sound the same as what was recorded at the live event, in order that the listener's emotional response will be more like what it is for a listener at the live event. This is what you are saying in essence, and since it is, it seems to me that you should state this in the plainest manner possible.

On the second point, perhaps I have not explained myself well. In an ideal world, where the whole chain of production and reproduction is perfectly linear, I don't think it is necessary to deliberately inject any form of distortion to achieve a sense of realism in reproduction. If it is added in production, it is for artistic purposes. In the real world, unfortunately, electronic devices in the whole chain introduce "bad" unwanted distortions that can destroy this sense of realism of the original source (one of the aspects of listening pleasure) that an audiophile is looking for. By controlling the injection of "good" distortion into reproduction, it should be possible to recover some, if not all, of these qualities using the extensively studied ear masking effect. Obviously, assuming the source material is not too compromised and the sound engineer has done his job correctly (e.g. if he introduces digital clipping, one of the worst distortions, I don't think much can be done).

Third, I don't think that there can be any genuine reason for not giving the listener the ability to control the level of added distortion. I think it self-evident that it is highly desirable for the listener to be provided the ability to control the level of added distortion, and even the ability to vary the makeup of the distortion. It is self-evident to me that the listener should be given control over anything that alters sound in any way, including the ability to disable the effect via a by-pass switch. It follows (directly) that any distortion added to amplification should be thought of as adjunct processing. I do not know of any good, genuine way to avoid this observation, yet this observation is never made by people who advocate that it is desirable for amplifiers to add distortion along with the amplification. This strikes me as patently disingenuous, and this is the stronger of the two reasons why people who argue for distortion in amplifiers seem disingenuous to me.) ...

On this point I agree with you. I would prefer to have reproduction chains that are as linear as possible. Then, if desired, the user can adjust the "good" distortion injection to taste on the preamp. What I can add is that it is very complex to control the second and third harmonic distortion independently in the analogue domain while leaving the other harmonics unchanged and at negligible levels. My designer friends are thinking about it. Anyway, if you have other types of tests in mind that work on the tracks digitally, I am open to discussion.
 

MrPeabody

Addicted to Fun and Learning
Joined
Dec 19, 2020
Messages
657
Likes
844
Location
USA
Almost all the studies I have found concern the audibility threshold of various forms of linear and non-linear distortions, not if these can be used to improve the enjoyment of listening to real music. I realize that it is much more difficult to quantify this in numbers, but in the end this is what matters most to an audiophile and has probably not been studied in detail. If you can find something about it, I would be happy if you share it, maybe it will save us some work.

That seems a fair answer. The existing research is concerned mainly with audibility of distortion, not specifically with how we react emotionally to different kinds and levels of distortion.

On the second point, perhaps I have not explained myself well. In an ideal world, where the whole chain of production and reproduction is perfectly linear, I don't think it is necessary to deliberately inject any form of distortion to achieve a sense of realism in reproduction. If it is added in production, it is for artistic purposes. In the real world, unfortunately, electronic devices in the whole chain introduce "bad" unwanted distortions that can destroy this sense of realism of the source (one of the aspects of listening pleasure) that an audiophile is looking for. By controlling the injection of "good" distortion into reproduction, it should be possible to recover some, if not all, of these qualities using the extensively studied ear masking effect. Obviously, assuming the source material is not too compromised and the sound engineer has done his job correctly (e.g. if he introduces digital clipping, one of the worst distortions, I don't think much can be done).

I appreciate the effort to explain this better. Your hypothesis is that adding distortion that is believed to be pleasing will compensate for various forms of non-pleasing distortion that have been added at various stages in the production process. It is an interesting hypothesis. Kind of like eating more good cholesterol in the effort to get rid of accumulated bad cholesterol.

There are a few things that I think are worth mentioning. The distortion that some people supposedly find pleasing is also the distortion that we don't much hear because of the masking effect. I don't think it would make sense to think that low-order distortion can be simultaneously inaudible and pleasing. It would have to be one or the other, and it is now fairly well established that low-order distortion is relatively inaudible. A better explanation for why tube and solid state amplifiers sometimes sound different is with the difference in output impedance and the ensuing difference in the way the voltage is split between speaker impedance and amplifier output impedance. This is a very well understood effect, that I previously mentioned, and that is often mentioned. It is exceedingly likely for multiple reasons that this is the more correct explanation for why the two kinds of amplifiers may sound different in some cases and circumstances. Since this is the better explanation for why the two types sound different, it is probably also the better explanation for why some people prefer the sound of one type over the other.
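To put rough numbers on the voltage-divider effect described above, here is a minimal sketch. The speaker impedance values and the two output impedances are hypothetical figures chosen only for illustration, and the function names are mine, not from any measurement tool.

```python
import math

def level_at_speaker_db(z_speaker_ohm, z_out_ohm):
    """Level at the speaker terminals relative to the amplifier's
    open-circuit voltage, in dB (simple voltage divider)."""
    ratio = abs(z_speaker_ohm / (z_speaker_ohm + z_out_ohm))
    return 20.0 * math.log10(ratio)

# Hypothetical speaker impedance magnitudes (ohms) at a few frequencies (Hz).
speaker_impedance = {50: 20.0, 200: 4.0, 1000: 8.0, 3000: 16.0}

def response_spread_db(z_out_ohm):
    """Difference between the loudest and quietest frequency, in dB."""
    levels = [level_at_speaker_db(z, z_out_ohm) for z in speaker_impedance.values()]
    return max(levels) - min(levels)

solid_state_spread = response_spread_db(0.05)  # assumed low output impedance
tube_spread = response_spread_db(3.0)          # assumed high output impedance

print(f"0.05 ohm source: {solid_state_spread:.2f} dB spread")
print(f"3 ohm source:    {tube_spread:.2f} dB spread")
```

With these assumed values the low-impedance source stays within a tenth of a dB across frequency, while the high-impedance source varies by several dB, which is a frequency response change and not a nonlinear distortion.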

Also worth mentioning is that you have made it apparent that you anticipate a particular outcome and are strongly biased toward it. As such, you need to use extra caution in everything you do, to avoid a biased outcome. And since your hypothesis, as you stated it above, is that the good distortion will only likely be beneficial when it serves to compensate for bad distortion, your experiment should start by identifying the kinds and levels of distortion that people don't like. You'll need to do this first, and apply the bad distortion to both the control samples and the samples to which the "good" distortion has been added. Then you'll need to do the experiment without the bad distortion having been added to any of the music samples, and compare the statistical results of this to the statistical results of the experiment done with the bad distortion having been added. This is the only way that you'll be able to establish whether people who like the good distortion like it specifically because it compensates for bad distortion, or for reasons that have nothing to do with compensation for bad distortion. This is not going to be an easy experiment.

By the way I watched and enjoyed the video you linked, with Erin @hardisj interviewing Earl Geddes. That was a great interview. I couldn't help but take notice when Mr. Geddes said that he does not believe that the reason that some people prefer tube amplifiers has anything to do with distortion. If you want to show otherwise, you'll not merely need to demonstrate that people really do like low-order distortion, but also that the level at which they prefer it is a level commonly encountered with tube amps regardless of brand or model or even of the volume level. It does not seem to me that this will be an easy thing to do, and again you'll want to keep a wide berth of confirmation bias.
 
OP
Pinox67

Member
Joined
Apr 8, 2020
Messages
34
Likes
76
Location
Italy
I appreciate the effort to explain this better. Your hypothesis is that adding distortion that is believed to be pleasing will compensate for various forms of non-pleasing distortion that have been added at various stages in the production process. It is an interesting hypothesis. Kind of like eating more good cholesterol in the effort to get rid of accumulated bad cholesterol.

Yes, we are there, the key word is "masking".

There are a few things that I think are worth mentioning. The distortion that some people supposedly find pleasing is also the distortion that we don't much hear because of the masking effect. I don't think it would make sense to think that low-order distortion can be simultaneously inaudible and pleasing. It would have to be one or the other, and it is now fairly well established that low-order distortion is relatively inaudible. A better explanation for why tube and solid state amplifiers sometimes sound different is with the difference in output impedance and the ensuing difference in the way the voltage is split between speaker impedance and amplifier output impedance. This is a very well understood effect, that I previously mentioned, and that is often mentioned. It is exceedingly likely for multiple reasons that this is the more correct explanation for why the two kinds of amplifiers may sound different in some cases and circumstances. Since this is the better explanation for why the two types sound different, it is probably also the better explanation for why some people prefer the sound of one type over the other.

I think they are two different things. The dependence of the damping factor of an amplifier on the frequency creates colorations in the sound and therefore linear distortions (if you have accurate references on the subject, forward them to me). However, it still makes sense to study the effect of injecting non-linear distortions on these.

Also worth mentioning is that you have made it apparent that you anticipate a particular outcome and are strongly biased toward it. As such, you need to use extra caution in everything you do, to avoid a biased outcome. And since your hypothesis, as you stated it above, is that the good distortion will only likely be beneficial when it serves to compensate for bad distortion, your experiment should start by identifying the kinds and levels of distortion that people don't like. You'll need to do this first, and apply the bad distortion to both the control samples and the samples to which the "good" distortion has been added. Then you'll need to do the experiment without the bad distortion having been added to any of the music samples, and compare the statistical results of this to the statistical results of the experiment done with the bad distortion having been added. This is the only way that you'll be able to establish whether people who like the good distortion like it specifically because it compensates for bad distortion, or for reasons that have nothing to do with compensation for bad distortion. This is not going to be an easy experiment.

This could be one good way to proceed, i.e. injecting both bad and good distortions into the source track, and then only the good ones. But there is always a problem: some aspects cannot be completely controlled. What is the distortion structure already contained in the test music track? Which distortion is added by the reproduction chain? Measurements can be made for the second, but there is a big question mark over the first. In other words, the original track already plays with a "bad" distortion, totally or partially known, that can't be removed. Adding more complicates the study a lot.

By the way I watched and enjoyed the video you linked, with Erin @hardisj interviewing Earl Geddes. That was a great interview. I couldn't help but take notice when Mr. Geddes said that he does not believe that the reason that some people prefer tube amplifiers has anything to do with distortion. If you want to show otherwise, you'll not merely need to demonstrate that people really do like low-order distortion, but also that the level at which they prefer it is a level commonly encountered with tube amps regardless of brand or model or even of the volume level. It does not seem to me that this will be an easy thing to do, and again you'll want to keep a wide berth of confirmation bias.

Differentiating the study for tube and solid state amplifiers, at different volumes, is something I would like to consider.

A much larger study could be organized by involving members of this site. I could make original and distorted music tracks available for download, without of course specifying the nature of each track. Everyone could listen to the tracks in the comfort of their own home, without exam stress, at controlled levels, and report their listening impressions. Information such as each participant's experience in the field, age, type of reproduction chain, price range, room, etc., should be collected anonymously to build a case study. It would be less rigorous, but certainly interesting: it would simulate what would happen if they added a distortion adjustment to their reproduction chains.
 

bigjacko

Addicted to Fun and Learning
Joined
Sep 18, 2019
Messages
529
Likes
253
If adding distortion makes music more pleasing, I wonder why the music mixers have not found this. Some guitar effects are distortion, so it should not be too hard for them to relate this to mixing. There must have been some experiment, or maybe an accident, that made people find out that adding distortion is better. Of course those are just guesses.
 

levimax

Major Contributor
Joined
Dec 28, 2018
Messages
1,040
Likes
1,459
Location
San Diego
If adding distortion makes music more pleasing, I wonder why the music mixers have not found this. Some guitar effects are distortion, so it should not be too hard for them to relate this to mixing. There must have been some experiment, or maybe an accident, that made people find out that adding distortion is better. Of course those are just guesses.

If you read Steve Hoffman over at SH forums, he often talks about running the original master tapes through several "tube" stages and other devices that "add distortion" when he produces "audiophile" remasters for SACDs, CDs, and vinyl. I don't like the concept, but I do own some of the AP and DCC remasters he has done and they sound great, much better than the original CDs or original vinyl.
 

mt42

Member
Joined
Mar 4, 2021
Messages
51
Likes
16
1) A friend of mine with 2 PhDs in psychoacoustics once told me that ~20% of the population prefers soft-distorted sound. No one likes clipping or zero-crossing distortions.
2) Once upon a time, I was looking for a tester with "golden ears". Lots of audio engineers applied, all of them were distortion-deaf. I ended up hiring a young Italian guy from a family with 10+ generations of musicians.
3) If you have a high-quality audio system, you may find out that 95% of CDs are pretty poorly recorded.
 
OP
Pinox67

Member
Joined
Apr 8, 2020
Messages
34
Likes
76
Location
Italy
If adding distortion makes music more pleasing, I wonder why the music mixers have not found this. Some guitar effects are distortion, so it should not be too hard for them to relate this to mixing. There must have been some experiment, or maybe an accident, that made people find out that adding distortion is better. Of course those are just guesses.

As far as I know, distortions are used in production (in the thread you can find some references, like here), although not well enough. I believe that their actual use depends a lot on the knowledge and skills of the sound engineers. The objectives they are pushed to pursue probably also play a part, such as optimization for streaming compression or equalization for lower-quality playback devices.
For playback, as I mentioned, my amp builder friends are used to adjusting their amps to add low-order harmonics to satisfy their customers, who have no technical knowledge but have a good ear and a lot of experience.

My interest is to study these effects more scientifically. An update: after the summer break, I mainly dedicated myself to perfecting the program I wrote to inject the distortions. Now I can inject distortions of any order and apply high pass and low pass filters of any order to simulate bandwidth limitations. However, the implemented nonlinear distortion model is the simplest, static type. As a next step I would like to model also the dynamic nonlinear distortion, i.e. with memory, which is always present in all audio devices. This is a bit more complex to manage: it is necessary to play with Volterra Kernels, not really a walk in the park...
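As an illustration of what a static (memoryless) injector can look like, here is a minimal sketch. It is not the program mentioned above, whose code is not shown in the thread: it assumes the distortion is built from Chebyshev polynomials, a common trick because T_n(cos θ) = cos(nθ), so for a full-scale sine each coefficient sets the level of exactly one harmonic. The function names and the injected levels are hypothetical.

```python
import math

def chebyshev(n, x):
    """Chebyshev polynomial of the first kind T_n(x), via the recurrence
    T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def inject_harmonics(samples, levels):
    """Static distortion u = x + sum(a_n * T_n(x)); levels maps order n to a_n."""
    return [x + sum(a * chebyshev(n, x) for n, a in levels.items()) for x in samples]

def harmonic_amplitude(samples, cycles):
    """Amplitude of the component that completes `cycles` periods in the buffer."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * cycles * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * cycles * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

N = 1024
tone = [math.cos(2 * math.pi * 8 * i / N) for i in range(N)]  # full-scale, 8 cycles
distorted = inject_harmonics(tone, {2: 0.01, 3: 0.001})       # illustrative levels

h1 = harmonic_amplitude(distorted, 8)    # fundamental
h2 = harmonic_amplitude(distorted, 16)   # second harmonic
h3 = harmonic_amplitude(distorted, 24)   # third harmonic
print(h1, h2, h3)
```

The one-coefficient-per-harmonic property only holds for a sine at full scale; at lower levels or with real music the same curve produces a different mix of harmonics and intermodulation products, which is part of why static models are only a first step.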
 

GimeDsp

Senior Member
Joined
Oct 30, 2020
Messages
391
Likes
307
Location
Earth
"Making moves" in audio production comes down to many things.
1. The band members, who have crafted "their sound" with guitars/strings/pedals/etc.

2. The recording/mixing engineer, who is using a mix of transparent and "character" gear to shape the sound.

3. Finally, the mastering engineer, who has training and tools and often a boss room and playback system to fine-tune, sometimes adding harmonic distortion.

In the case of modern music, it is squashed so badly that even the mastering engineer often hates what the clients demand.

So good music will sound good on any transparent and balanced system and bad music needs smoothing on playback adding pleasing distortion to hide the badness. But this masking will also ruin good music.
 

Duke

Addicted to Fun and Learning
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
797
Likes
1,791
Location
Princeton, Texas
The goal of the study is to understand how the distortions you have indicated can be masked with other "good" distortions (i.e. low-order), specifically injected at some point in the reproduction chain, which make the sound more pleasant, without hiding the details of the original sound texture.


I find this fascinating.

I was working with Earl Geddes on a loudspeaker project at the time that he and Lydia Lee were developing the GedLee Metric, and after he had finished analyzing the data, he said to me: "Duke, now I understand why you and your friends like tube amps". I say this NOT to derail with a debate about tubes, but to point out that distortion perception is counter-intuitive. I applaud your exploration into using distortion to work WITH human hearing instead of AGAINST it, by using a benign distortion to mask an objectionable one.
 

GimeDsp

Senior Member
Joined
Oct 30, 2020
Messages
391
Likes
307
Location
Earth
I find this very interesting.

I was working with Earl Geddes on a loudspeaker project at the time that he and Lydia Lee were developing the GedLee Metric, and after he had finished analyzing the data, he said to me: "Duke, now I understand why you and your friends like tube amps". I say this NOT to derail with a debate about tubes, but to point out that distortion perception is counter-intuitive. I applaud your exploration into using distortion to work WITH human hearing instead of AGAINST it, by using a benign distortion to mask an objectionable one.

It just goes back to the principle of "design for usage".

I run tech/repair at a small church, they have no band and rely on horrible YouTube videos for music playback. I am considering some tube pre amps to mask the bad distortions with "better" ones. But that's because the usage demands it.

If I listen to good quality music why would I want to mask anything? I don't.

So why would someone spend time and money on a transparent signal chain from source to speakers and then insert a masking device?

If you listen to bad music then design to mask it.
If you listen to good music then masking with distortion makes no sense.
 

Duke

Addicted to Fun and Learning
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
797
Likes
1,791
Location
Princeton, Texas
If I listen to good quality music why would I want to mask anything? I don't.

So why would someone spend time and money on a transparent signal chain from source to speakers and then insert a masking device?

If you listen to bad music then design to mask it.
If you listen to good music then masking with distortion makes no sense.


This isn't about masking music or degrading clarity. This is about improving perceived sound quality.

See Section 4 of this paper, on the psychoacoustics of distortion perception: http://www.gedlee.com/Papers/Distortion_AES_I.pdf

Distortion tends to show up as a tonal or timbral characteristic. For example, crossover distortion in an amplifier may show up as "harshness", even if the THD numbers are low. Now IF we could perceptually mask that "harshness" with a benign "warmth", without degrading clarity, imo THAT would be a worthwhile improvement.
 

GimeDsp

Senior Member
Joined
Oct 30, 2020
Messages
391
Likes
307
Location
Earth
This isn't about masking music or degrading clarity. This is about improving perceived sound quality.

See Section 4 of this paper, on the psychoacoustics of distortion perception: http://www.gedlee.com/Papers/Distortion_AES_I.pdf

Distortion tends to show up as a tonal or timbral characteristic. For example, crossover distortion in an amplifier may show up as "harshness", even if the THD numbers are low. Now IF we could perceptually mask that "harshness" with a benign "warmth", without degrading clarity, imo THAT would be a worthwhile improvement.

My point is that the vast majority of music already has tons of distortion added by people at every step of the way.

From carefully chosen guitar tones, to the character compressors that bring up the natural harmonics by limiting dynamic range (often used on drums), to the trained mastering engineer who has his/her own distortion-adding devices. It's not like there is a single recording free of added distortion, except maybe a full dynamic range live recording.

So the "musical" distortion you want to add is already there in 95% of music, so you can't add anything better than the pros.
"Character"-shaping rack gear and plug-ins are big business and often used.

https://www.empiricallabs.com/distressor/
https://www.sweetwater.com/store/search.php?s=tape+saturation

https://www.uaudio.com/blog/how-the-pros-choose-microphone-preamps/

https://www.izotope.com/en/learn/4-ways-to-add-augment-or-excite-upper-harmonics.html
 

GimeDsp

Senior Member
Joined
Oct 30, 2020
Messages
391
Likes
307
Location
Earth
Here is a good basic overview of distortion in music production.
It points out correctly that bass DI and drums/parallel drum buss is the go to for heavy distortion to fatten things up.

https://www.edmprod.com/distortion-saturation-guide/

As a sound engineer for many years I can say that adding distortion to sound will always result in a trade off.
 

Duke

Addicted to Fun and Learning
Manufacturer
Forum Donor
Joined
Apr 22, 2016
Messages
797
Likes
1,791
Location
Princeton, Texas
My point is that the vast majority of music already has tons of distortion added by people at every step of the way.


I don't dispute your point.

What @Pinox67 is investigating has to do with something different. It has to do with an application of the psychoacoustic phenomenon called "masking", which is described in Section 4 of the paper that I linked to.
 

GimeDsp

Senior Member
Joined
Oct 30, 2020
Messages
391
Likes
307
Location
Earth
Some music is mono'd on LF and some isn't. I have been going through songs I enjoy and doing this:
1. Take stereo track, split to mono
2. Invert one side
3. Sum into new track
4. Take that track and low-pass it (48 dB per octave) at 120 Hz, 100 Hz, then 80 Hz

It's quite interesting how some music has a lot of LF content that is panned uniquely to one side.
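The steps above can be sketched in code roughly as follows. This is only an illustration under stated assumptions: the "48 dB per octave" filter is taken to be an 8th-order Butterworth low-pass built from four cascaded biquads, the test signal is synthetic rather than a real song, and the 8 kHz sample rate is chosen just to keep a pure-Python demo fast.

```python
import math

def side_signal(left, right):
    """Invert one channel and sum: content common to both channels cancels."""
    return [l - r for l, r in zip(left, right)]

def butterworth_lowpass(samples, cutoff_hz, sample_rate, order=8):
    """Cascade of order/2 biquad sections; order 8 rolls off at about 48 dB/octave."""
    out = list(samples)
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0, sin_w0 = math.cos(w0), math.sin(w0)
    for k in range(order // 2):
        # Q of each section for a Butterworth response.
        q = 1.0 / (2.0 * math.cos(math.pi * (2 * k + 1) / (2 * order)))
        alpha = sin_w0 / (2.0 * q)
        a0 = 1.0 + alpha
        b0 = b2 = (1.0 - cos_w0) / (2.0 * a0)
        b1 = (1.0 - cos_w0) / a0
        a1 = -2.0 * cos_w0 / a0
        a2 = (1.0 - alpha) / a0
        x1 = x2 = y1 = y2 = 0.0
        for i, x in enumerate(out):
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1, y2, y1 = x1, x, y1, y
            out[i] = y
    return out

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

sample_rate = 8000
n = sample_rate  # one second of audio
centre = [0.5 * math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(n)]
bass_left = [0.5 * math.sin(2 * math.pi * 50 * i / sample_rate) for i in range(n)]
mid_right = [0.5 * math.sin(2 * math.pi * 400 * i / sample_rate) for i in range(n)]
left = [c + b for c, b in zip(centre, bass_left)]    # centre + bass panned left
right = [c + m for c, m in zip(centre, mid_right)]   # centre + midrange panned right

side = side_signal(left, right)
for cutoff in (120, 100, 80):
    low = butterworth_lowpass(side, cutoff, sample_rate)
    # Skip the first half second so the filter's start-up transient is excluded.
    print(cutoff, "Hz low-pass, RMS of side signal:", round(rms(low[n // 2:]), 4))
```

A track whose bass is fully mono would give a near-silent result after these steps; anything left over is low-frequency content that differs between the channels.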

The next test I want to do is:
1. isolate my listening chair
2. Try to locate with ear plugs
3. Try to locate with earplugs and earmuffs

It is quite possible that bone conduction via the floor can help localize LF. Since sound/vibration travels faster through wood, concrete and other rigid materials than through air, your brain would have information to locate the source well before the airborne sound even gets to you. Vibration and LF content that extends past the crossover region (like bass guitar, drums, etc.) would all help to locate.
 

xaviescacs

Active Member
Forum Donor
Joined
Mar 23, 2021
Messages
249
Likes
200
Location
Barcelona
This isn't about masking music or degrading clarity. This is about improving perceived sound quality.

See Section 4 of this paper, on the psychoacoustics of distortion perception: http://www.gedlee.com/Papers/Distortion_AES_I.pdf

Distortion tends to show up as a tonal or timbral characteristic. For example, crossover distortion in an amplifier may show up as "harshness", even if the THD numbers are low. Now IF we could perceptually mask that "harshness" with a benign "warmth", without degrading clarity, imo THAT would be a worthwhile improvement.

I'm just a guy who owns two headphone amps: an SS one, a Lake People, and a hybrid one, a Pathos Aurium, and I want to share some subjective impressions.

The hybrid amp makes things less harsh and more enjoyable. One clear example arises when playing an old recording, one of those analog recordings that have some background noise. With the hybrid amp, this noise is noticeably lower than with the SS amp, but the rest of the music seems to be the same in both. How is that possible? Does it get rid of the nasty things? I also believe this amp does something in the time domain, making "events" more separable and resulting in a more relaxed experience, as if one were able to hear more content per unit of time, like watching in slow motion. Something like what is called de-masking on the SPL Tube Vitalizer web page?

For the record, a few days ago I participated in a distortion hearing poll here at ASR and was able to pass it quite easily with the hybrid amp, which is rated at THD < 0.1.

I'm not an audio or electronics expert, just sharing my experience, and I will be following these studies because my subjective experience tells me there is something worth investigating here. Thanks everybody, and especially @Pinox67.
 

GimeDsp

Senior Member
Joined
Oct 30, 2020
Messages
391
Likes
307
Location
Earth
I've noticed songs are much drier and more sterile on headphones; that is definitely one place where tubes can help.
 

bigjacko

Addicted to Fun and Learning
Joined
Sep 18, 2019
Messages
529
Likes
253
@levimax @Pinox67 My thinking about this problem is: what if they already add enough distortion to sound good? Then adding further distortion will make things worse. Maybe some people did not add distortion, some added a bit but not enough, and some added a good amount; all those situations will need a different amount of added distortion, or even none, to sound good. Preferably I would have the mixers add the distortion they want as an effect of the song and we reproduce it without distortion, but who knows what is on their mind.
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,069
Likes
1,211
Location
California
I suggest first running a control test to make sure any audible effect is detected. Only then enter phase 2 of quantifying the differences if they exist.

You also need to perform a statistical analysis to make sure distortion was a significant factor and the chance of the result being due to guessing was less than 5%.
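As a rough illustration of the 5% criterion, here is a minimal sketch of a one-sided binomial test. It assumes a forced-choice blind test (for example ABX) where a guessing listener is right half the time; the trial counts are hypothetical and the function name is mine.

```python
from math import comb

def guessing_p_value(correct, trials, chance=0.5):
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (one-sided binomial tail)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

p_12_of_16 = guessing_p_value(12, 16)  # hypothetical listener A
p_10_of_16 = guessing_p_value(10, 16)  # hypothetical listener B

print(f"12/16 correct: p = {p_12_of_16:.4f}")
print(f"10/16 correct: p = {p_10_of_16:.4f}")
```

With 16 trials, 12 correct falls below the 5% threshold while 10 correct does not, which is why a single preference run per listener cannot separate a real effect from guessing.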
I notice that Amir posted this a few months ago and I might have missed it, but I didn't see the results of the individual listening trials posted, let alone the statistics that would accompany them. Given the level of scientific rigor I often see with these "informal" experiments (sorry), my guess is that you had a single run for each listener and simply asked them to provide their preference. If that's the case, then regrettably the experimental results are not usable at all and no conclusions whatsoever can be drawn from the so-called experiment.

Now I hope I'm wrong and that an appropriate number of blinded trials were run for each listener, in which case it shouldn't be a big deal to post that data.

Also it would lend some additional credibility if you posted the test files used so that others can try to reproduce your results.
 
OP
Pinox67

Member
Joined
Apr 8, 2020
Messages
34
Likes
76
Location
Italy
I notice that Amir posted this a few months ago and I might have missed it, but I didn't see the results of the individual listening trials posted, let alone the statistics that would accompany them. Given the level of scientific rigor ...

The results of the limited preliminary tests that I described some months ago have been useful to understand whether what was hypothesised (the relationship between specific types of distortion and listening pleasure, which relies on the masking effect of our ear) is worthy of a scientific study or not. These tests, let's say “non-formal”, gave encouraging results, in line with other experiences “in the field”. To proceed with the study, I agree with you, it is necessary to build a rigorous listening test protocol. This is one of the reasons why I opened the thread, looking for opinions and suggestions, which in fact have arrived in large numbers; I take this opportunity to thank everyone for the time dedicated to me.

At the same time, a prerequisite for setting up the test protocol is the definition of the mathematical model for digitally simulating nonlinear distortions. Interesting aspects emerged while working on the model, which led me to touch on many of the considerations described in Chap. 4 of the GedLee publication. I summarise below the main aspects of the adopted model, to highlight how complex the topic is and therefore to give an idea of how difficult it is to understand "how a particular amplifier sounds". In closing, I give the details of the test protocol that I am developing. Suggestions are always welcome.


Static nonlinear distortion model
In a nutshell, we can divide the distortions caused by a system (in this context, any electronic audio device) into two families: linear and nonlinear.
  • Linear distortions “only” alter the magnitude and phase of each frequency contained in the input signal. They are fully modelled by the transfer function, the curve most frequently measured and shown for any audio device: in general, a curve that is as flat as possible in the audible band is preferred, to avoid "colouring" the sound.
  • Nonlinear distortions instead introduce into the output frequency components that are not present in the input signal. Their frequencies are integer linear combinations of the frequencies in the input signal, and they are generally reported through THD (total harmonic distortion) and IMD (intermodulation distortion) values, which quantify them with single numbers. Lower values are better than higher ones, even if what really matters is the frequency structure of the distortions.
Both kinds of distortion are always present in a system, so we can represent our system as in the following figure:
Figure 1 - Simple Nonlinear Static amplifier model
Here an analog input signal x(t) passes through a system which transforms it into an output signal that we can conveniently write as g⋅y(t), where:
  • g is the gain applied by the system (generally a voltage gain), which we will assume constant and frequency independent, at least in the usual 20Hz-20kHz range. It can be controlled through the volume if the device is a preamplifier; it is a fixed constant if the device is a power amp.
  • f(x) models the nonlinear part of the system. For each instantaneous value of the input x(t) it returns a value u(t) that depends only on the value of x at that instant (apart from the transit delay through the device, which is not of interest here); moreover, it does not change over time. It is normalised (gain = 0dB). If f(x) = x (output = input) the system has no nonlinear distortion; any other relation creates new frequency components in u(t).
  • H(f) models the linear part, i.e. the transfer function that determines how each frequency component of its input is altered. This function also models the memory effects of the system, which are absent in f(x). The time-domain counterpart of H(f) is the impulse response h(t); the two are linked by the Fourier transform.
Both f(x) and H(f) depend on the type of amplifier; more specifically, f(x) is very different between the solid-state and tube families of amplifiers. Using a "black box" approach, we assume that f(x) can be represented by a polynomial, which is essentially the Taylor series expansion of the original function, truncated to a certain number of terms:

f(x) = a0 + a1⋅x + a2⋅x^2 + ... + aN⋅x^N
Feeding a pure sinusoid x(t) into f(x) and expanding the powers shows what each coefficient ai controls:

- a0: DC component, usually equal to 0.
- a1: component of the original signal, close to unity.
- ai, i > 1: harmonic distortion components of order i, i - 2, i - 4, ...

Therefore, the degree of f(x) also determines the highest harmonic order that is generated. For amplifiers in normal working conditions the distortions are generally below the 10th order; higher ones are buried in the thermal noise.
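
The rule that a term of degree i produces harmonics of order i, i - 2, i - 4, ... can be checked numerically. The following is only a minimal sketch, assuming a unit-amplitude cosine as the test tone; the function name `harmonics_of_power` is hypothetical and is not taken from the program described in this post.

```python
import numpy as np

def harmonics_of_power(i, n=256, tol=1e-9):
    """Return {harmonic order: amplitude} for cos(theta)**i over one cycle."""
    theta = 2 * np.pi * np.arange(n) / n           # exactly one period
    amps = np.abs(np.fft.rfft(np.cos(theta) ** i)) / n
    amps[1:] *= 2                                   # single-sided amplitude (DC bin is not doubled)
    # Keep only the bins that actually contain energy
    return {k: round(float(a), 6) for k, a in enumerate(amps) if a > tol}

# Hypothetical demonstration for the powers 2..5
for i in range(2, 6):
    print(i, harmonics_of_power(i))
```

Even powers contribute only to DC and even harmonics, odd powers only to the fundamental and odd harmonics, which is why a single coefficient ai affects several harmonic orders at once.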

For example, the following figure shows the f(x) curve obtained from distortion measurements (1kHz tone, 1Vrms, 0dB gain) of the preamplifier used for the tests, the low-distortion Threshold FET10/e. The deviation from the 45-degree line is amplified 10000 times (80dB) to make it visible.
Figure 2 - f(x) for Threshold FET10/e at 1KHz, 1Vrms, 0dB gain
The distortion, dominated by the second harmonic, is clearly asymmetrical. Furthermore:
  • Positive values of x(t) are slightly "compressed" at low levels and then "expanded" at higher levels.
  • Negative values of x(t) are always compressed more.
Simulation
Now let's try to transform the system of Fig. 1 into an equivalent one, where the target system is replaced by an ultra-linear one and its distortions are injected digitally. A first step is to consider the discrete version x[n] of x(t), introducing a DAC in the chain upstream of the amplifier, as shown in Fig. 3-a.
Figure 3 - Simple Nonlinear Static amplifier simulation
The next step is to replace the target system with two components:
  • An ideal amplifier downstream of the DAC, with zero linear and nonlinear distortion, which only amplifies its input signal by the gain g. In reality, we will be "satisfied" with an amplifier with very low distortion and a high bandwidth.
  • A component upstream of the DAC (the program I wrote) that digitally applies f(x) and H(f) (h[n] in the discrete time domain) to the discrete signal x[n]. Together they produce the unity-gain signal y[n] at the DAC input; the DAC converts it into y(t), the input of the linear amplifier, as shown in Fig. 3-b.
So, y[n] is obtained from x[n] with the simple formula:
y[n] = h[n] * f(x[n])
where * represents the convolution operation.
As an additional note, the DAC should be configured to avoid oversampling: we'll see why in the next section.
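
As an illustration of this static chain (a memoryless polynomial followed by a linear filter), here is a minimal sketch. It is not the program mentioned above: the function name `simulate_static`, the coefficient values and the one-tap impulse response are all assumptions chosen only to make the example run.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def simulate_static(x, a, h):
    """Static model: u[n] = f(x[n]) with f(x) = sum(a[i] * x**i), then y = h * u."""
    u = P.polyval(np.asarray(x, dtype=np.float64), a)  # memoryless nonlinearity f(x)
    return np.convolve(u, h)[: len(u)]                  # linear part, truncated to the input length

# Hypothetical example: 10 ms of a 1 kHz tone at fs = 352.8 kHz
fs = 352_800
t = np.arange(fs // 100) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)
a = [0.0, 0.99, 0.005, 0.005]   # assumed a0..a3, not measured values
h = [1.0]                       # assumed flat response (no linear distortion)
y = simulate_static(x, a, h)
```

Working in 64-bit floats keeps rounding errors far below the distortion levels being injected.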

Identification of f(x)
The coefficients of f(x) are calculated from the level in dB of each harmonic distortion order (up to 32). These levels can be obtained from a real device by measuring the harmonic distortion of a pure 1kHz tone. From these data, the coefficients ai of f(x) are obtained by solving a set of linear equations.
For example, with distortion components of -50dB, -60dB and -70dB for the second, third and fourth harmonic respectively (THD = 0.33%), each harmonic contributes as follows over one cycle of the fundamental (DC is omitted; the resulting curve is dashed):
Figure 4 - Distortion shape by frequency synthesis
The corresponding 4th-degree polynomial f(x) has coefficients a1 = 0.9897, a2 = 0.0037, a3 = 0.0039, a4 = 0.002; for a sinusoidal input, its terms add up to the same curve:
Figure 5 - Distortion shape by time domain components
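
One possible way to go from harmonic levels to polynomial coefficients is sketched below. It is an assumption on my side and not necessarily the same set of linear equations used by the program: it relies on the Chebyshev identity T_k(cos θ) = cos(kθ), so the wanted harmonic amplitudes are written as Chebyshev coefficients and converted to ordinary power-series coefficients with NumPy. The function names and the normalisation f(1) = 1 with zero DC at full scale are illustrative choices, so the numbers need not coincide exactly with the coefficients quoted above.

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P

def poly_from_harmonics_db(levels_db):
    """levels_db[k] is the level of harmonic k+2 relative to the fundamental, in dB.

    Returns a0..aN (lowest degree first) for a full-scale cosine input.
    """
    rel = 10.0 ** (np.asarray(levels_db, dtype=float) / 20.0)
    cheb = np.concatenate(([0.0, 1.0], rel))  # c0 = 0 (no DC at full scale), c1 = fundamental
    cheb /= cheb.sum()                         # T_k(1) = 1, so this enforces f(1) = 1
    return C.cheb2poly(cheb)                   # Chebyshev series -> power series

def harmonic_levels_db(a, n=1024):
    """Harmonic levels (orders 2..N) in dB relative to the fundamental, full-scale cosine."""
    theta = 2 * np.pi * np.arange(n) / n
    amps = np.abs(np.fft.rfft(P.polyval(np.cos(theta), a)))
    orders = np.arange(2, len(a))
    return 20 * np.log10(amps[orders] / amps[1])

a = poly_from_harmonics_db([-50, -60, -70])
print(np.round(a, 5))                         # polynomial coefficients a0..a4
print(np.round(harmonic_levels_db(a), 2))     # round-trip check of the harmonic levels
```

The second function feeds a full-scale cosine through the polynomial and reads the harmonic levels back from the FFT, which is a simple round-trip check of the identification step.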

Processing of x[n]
Several precautions must be taken in the discrete processing of x[n]:
  • Generally, the bandwidth of an amplifier exceeds 100kHz. This implies that the discrete signal processing, which can create ultrasonic frequencies, must use a sampling frequency fs of at least 200kHz to avoid aliasing. We have therefore chosen to oversample x[n] as the first operation, bringing it to at least fs = 352.8kHz or 384kHz (configurable up to 32x).
  • After the distortion injection, the signal can be brought back to a lower sampling frequency by decimation, downstream of an anti-aliasing filter that limits it to the new band. This operation is also configurable: at least fs = 176.4kHz or 192kHz is recommended.
  • Given the numerous operations on the signal, the risk of introducing distortions through rounding errors is contained by performing all calculations in 64-bit floating point. At the output, the signal is re-quantized to a bit depth of 24 bits. To avoid introducing unwanted distortions in this operation, the signal is dithered.
  • Since f(x) is normalised (f(1) = 1 or f(-1) = -1), the injected harmonics are relative to this full-scale value. If the gain g (volume) of the ideal amplifier is changed, the amount of distortion in the output signal g⋅y'(t) always stays at the same value in dB. This is not what happens with a real amplifier, where the distortion level depends on g. The implication is that the injected distortion matches what would actually be heard only at one particular listening level. When calculating f(x) it is however possible to set the desired value of g to rescale the amount of distortion injected.
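
The need for oversampling before applying f(x) can be illustrated with a small experiment. This is only a sketch with assumed values (a 15 kHz cosine passed through a pure cubic term); the function name `component_freqs` is hypothetical. The cube creates a third harmonic at 45 kHz, which cannot be represented at fs = 44.1 kHz and folds back into the audio band.

```python
import numpy as np

def component_freqs(fs, n, f0=15_000.0, tol=1e-6):
    """Frequencies (Hz) present in x**3 for a cosine at f0 sampled at fs.

    n is chosen so that every component falls exactly on an FFT bin.
    """
    t = np.arange(n) / fs
    y = np.cos(2 * np.pi * f0 * t) ** 3          # cubic distortion term only
    amps = np.abs(np.fft.rfft(y)) / n
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    return [round(float(f)) for f, a in zip(freqs, amps) if a > tol]

print(component_freqs(44_100, 441))     # low sampling rate: the 45 kHz harmonic aliases
print(component_freqs(352_800, 3528))   # 8x rate: the 45 kHz harmonic is represented correctly
```

At 44.1 kHz the third harmonic reappears at 45000 - 44100 = 900 Hz, an inharmonic component that a real amplifier would never produce; at 352.8 kHz it stays at 45 kHz, where the later anti-aliasing filter can handle it.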
Structure of distortions
Some interesting properties derive from the analysis of the equations that control the estimation of f(x):
  • The phases of the distortion harmonics relative to a pure tone are:
- order 1, 5, 9,…: 0 degrees
- order 2, 6, 10,…: -90 degrees
- order 3, 7, 11,…: 180 degrees
- order 4, 8, 12,…: +90 degrees

  • Odd-order terms (symmetric) add contributions to the fundamental:
- order 3: 3/4⋅a3⋅x^3
- order 5: 10/16⋅a5⋅x^5
- order 7: 35/64⋅a7⋅x^7
- order 9: 126/256⋅a9⋅x^9

  • Even-order terms (asymmetric) add a DC component:
- order 2: 1/2⋅a2⋅x^2
- order 4: 3/8⋅a4⋅x^4
- order 6: 10/32⋅a6⋅x^6
- order 8: 35/128⋅a8⋅x^8

  • If the coefficients of f(x) are all positive, the term of degree d adds (in-phase) contributions to the lower harmonics of order d-2, d-4, ..., in increasing amounts as the order decreases. This implies that distortion levels decrease with increasing order, separately for even and odd harmonics.

  • As the level of the input signal increases, the amount of distortion increases faster for higher-order harmonics (because of the dependence on x^n). For example, measuring the chain of the RME-ADI2 pro fs DAC and the Threshold FET10/e preamplifier gives this distortion pattern (1kHz tone, 0dBFS, 1Vrms in/out):
Figure 6 - Measured Harmonic Distortion of a 1KHz tone

The measured distortion per input level is as follows:
Figure 7 - Measured Harmonic Distortion per input level of a 1KHz tone

Deriving the curve f(x) with the described model gives the following version of the same graph, except of course for the background noise (-135dB).
Figure 8 - Harmonic Distortion per input level by the model

The agreement is moderate, but not perfect... why?
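
For reference, the static-model side of this comparison can be sketched as follows. The coefficients are hypothetical (not the measured Threshold data) and the function name `relative_harmonics_db` is an assumption; the point is only to show how a fixed polynomial predicts harmonic levels that depend on the input level.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def relative_harmonics_db(a, level_db, n=1024):
    """Levels of harmonics 2..N relative to the fundamental, for a cosine at level_db (dBFS)."""
    amp = 10.0 ** (level_db / 20.0)
    theta = 2 * np.pi * np.arange(n) / n
    spec = np.abs(np.fft.rfft(P.polyval(amp * np.cos(theta), a)))
    orders = np.arange(2, len(a))
    return 20 * np.log10(spec[orders] / spec[1])

a = [0.0, 1.0, 0.004, 0.004]   # assumed a0..a3, for illustration only
for level_db in (0, -10, -20):
    print(level_db, np.round(relative_harmonics_db(a, level_db), 1))
```

With a memoryless polynomial, the second harmonic falls by about 1 dB relative to the fundamental for each dB of level reduction, and the third by about 2 dB. Any departure from these fixed slopes in a measurement is something a static f(x) cannot reproduce.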


Dynamic nonlinear distortion model
The main reason for the only moderate agreement above is the nonlinear distortion model adopted: it is static, i.e. it assumes that the nonlinear component f(x) has no memory of the values the signal took at previous instants (only the linear part has memory). Unfortunately, this assumption does not hold for audio devices, which all exhibit memory effects of this kind, also called dynamic nonlinear distortions. From a physical point of view they are caused by the nonlinearity of the components that make up the amplifier, together with thermal effects. From a mathematical point of view, memory means that the value of u(t) at a given instant also depends on the values taken by the input at previous instants. In a sense, the way in which x(t) moves along the “static” f(x) curve modifies the curve itself, which alters the structure of the distortions; in other words, f(x) depends on the frequencies present in x(t).

An indication of this effect can be found in the measurement of harmonic distortion as a function of frequency: the more the curves deviate from a constant value, the more the system is affected by dynamic distortion. The following graph illustrates this effect for the Threshold FET10/e, which is very good compared to the average:
Figure 9 - Measured Harmonic Distortion per frequency
A powerful mathematical “black box” model that describes these effects is based on Volterra kernels. A strong simplification applicable in the audio field is the diagonal Volterra model, which still divides the system into a nonlinear and a linear part, as shown in Fig. 10:
Figure 10 - Diagonal Volterra Model
The output signal is obtained by adding together n parallel branches. The i-th branch again consists of a memoryless nonlinear part that models the nonlinearity of order i (a polynomial gi(x) of degree i), followed by a dedicated linear system that models the memory effect for that distortion order only. With this scheme, the contribution of each nonlinear order is modified in magnitude and phase before all are summed (remember that the distortion of order i also includes components of order i - 2, i - 4, ...). So the final distortion structure can differ a lot from the static one, where the distortions in f(x) were all summed in phase. The value of y[n] is expressed by the following "relatively" simple formula:
y[n] = Σ_i hi[n] * gi(x[n]), summed over all the parallel branches i (* is again the convolution)
The real complexity lies in identifying the linear responses hi[n] from measurements of real devices; the gi(x) polynomials are fixed in advance to simplify this estimation. Unfortunately, the measurements that are normally made do not explore this aspect sufficiently. As the term "dynamic" suggests, this type of distortion has a major impact on transients, which real musical signals are full of... by neglecting it, we lose non-secondary aspects of the amplifier's behaviour.
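
The structure of the diagonal Volterra model can be sketched in a few lines. This is not part of the simulation program, which implements only the static model; the function name is hypothetical, the kernels are invented for illustration, and gi(x) = x^i is an assumed choice for the fixed polynomials.

```python
import numpy as np

def simulate_diagonal_volterra(x, kernels):
    """y[n] = sum over i of (h_i * x**i)[n]; kernels[i-1] is the impulse response h_i.

    Assumes g_i(x) = x**i as the fixed memoryless nonlinearity of branch i.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.zeros(len(x))
    for order, h in enumerate(kernels, start=1):
        y += np.convolve(x ** order, h)[: len(x)]  # branch i: nonlinearity, then its own filter
    return y

# Hypothetical kernels: flat linear path, second order with a little memory, static third order
kernels = [[1.0], [0.004, -0.002], [0.003]]
t = np.arange(512) / 352_800
x = 0.5 * np.sin(2 * np.pi * 1000 * t)
y = simulate_diagonal_volterra(x, kernels)
```

When every kernel has a single tap, the model collapses to the static polynomial; the memory of each branch is what lets the level and phase of each distortion order vary with frequency.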

This overview shows how difficult it is to fully characterise the sound of an audio device. Anyway, I won't go any further now: there is an extensive bibliography on the subject for those who want to know more. I am currently investigating this model further; the consolidated simulation program implements only static distortion.


Subjective Listening Test Protocol
Moving on to lighter and more fun aspects, below is the procedure I am preparing for the listening tests.

Preparation
A certain number of tracks of different genres and of proven quality are selected. For each one, a test track is created in two steps:
  • Selection of a significant excerpt of the original track, no longer than 30 seconds.
  • Creation of a new high-resolution track (at least 192kHz/24bit) containing three subtracks in random order:
    • The original excerpt, oversampled only.
    • Two additional versions with different levels of injected second- and third-order distortion: (high, low) and (low, high). The low and high values are to be determined through preliminary audibility tests on the reference system used in the listening test.
When injecting the harmonics, the gain (or attenuation) applied to the original signal must be chosen so that:
  • The RMS levels of the different versions do not differ by more than 0.5dB.
  • There is no clipping.
  • The listening level, set at 90dB SPL and 70dB SPL, is taken into account (so there may be several versions of the same track).
These aspects can be verified with the same program that injects the distortions.

Execution
On the reference system for the test, the correct volume is set. The person carrying out the listening test can select the tracks independently (each containing the three subtracks A, B, C) and listen to them at will, without any external help. The listener then fills in a questionnaire with the following items, giving each a score from 1 (low) to 5 (high) for versions A, B and C of each track:

- A / B difference
- B / C difference
- Pleasant
- Distortion
- Timbre
- Dry
- Enveloping
- Quality of Bass
- Quality of Midrange
- Quality of Highs
Once the questionnaires for all tracks have been filled in, the data are collected and analysed with classic statistical methods.
 