Introduction
As a hi-fi enthusiast of many years, I have always tried to dig deeper into the topics underlying sound reproduction and psychoacoustics. On the latter topic, some time ago I came across the interesting book (online here) "Premium Home Theater - Design & Construction" by Earl Geddes & Lidia Lee (or GedLee). It contains some considerations that have greatly changed the way I think about audio playback. On page 58 it states verbatim, about the relationship between classic distortion measurements and listening impression:
- There is virtually no correlation between Total Harmonic Distortion (THD) or Intermodulation Distortion (IMD) measurements of a system and the subjective impression of the sound quality of that system. The correlations were weak, but most shockingly they were negative—according to these tests people liked THD distortion. This is actually somewhat true in general that people prefer some forms of distortion to no distortion.
- Signal based distortion measurements (THD, IMD, MTD), which are based on a purely mathematical formula which does not take into account the characteristics of human hearing, do not hold out much hope for ever being an accurate measure of subjective impression.
- Measures need to be based on the actual nonlinear characteristic of the system and scaled to account for the human hearing system.
A preference for the presence of distortion (even if of a specific type, as we will see shortly) over its absence may at first seem contrary to common sense. One would expect that, to convey the sensations the artist intended, the original music track must be reproduced by a system that plays back any track with the highest possible fidelity. It would follow that, among the various parameters that characterize the components of a system, nonlinear distortion should be as low as possible.
Combining personal experience with an in-depth reading of the works of GedLee and others, such as Bob Katz (here) or Nelson Pass (here), to name a few, it emerges that we need to consider the entire sound chain, from production to reproduction. From the moment the signal is captured at the source to its recording on disc (to be nostalgic), the audio signal passes through a series of inherently nonlinear electronic systems. It then moves on to the reproduction chain, which again consists of nonlinear components. Each component in the chain adds nonlinear distortion to the signal output by the previous component. All these distortions, although individually very small with elementary test signals, are considerably larger with real signals and compound along the signal chain, resulting in high-order harmonic distortion at our loudspeakers. These can potentially be audible, and they have nothing to do with the sound as it was generated by the original source. Unfortunately, it is practically impossible to identify and eliminate these distortions, which create effects such as listening fatigue and a lack of naturalness, even assuming that the sound engineer has made no mistakes.
How do we get out of this? Help comes from nature, which has endowed us with a highly complex and nonlinear hearing system. Specifically, we can leverage the masking effect, which raises the threshold of hearing for signals close to others of a higher level. Lossy codecs such as MP3 are also based on this principle. More details can be found in GedLee's book, Chap. 2; below is a graph from the book showing how the threshold rises depending on the dB level of a masking tone.
Hence the justified practice of building amplifiers that add low-order nonlinear distortion during playback; audiophiles with a preference for "tube sound" know these as "euphonic" distortions. They generally have the dual effect of:
- adding a sort of loudness to the original music content;
- hiding unwanted higher-order distortions (already contained in the music track or created by the playback chain).
However, an excess of these components has the effect of obscuring sonic detail or making the sound too artificial. In fact, the amount of each harmonic added is one of the aspects that most determines the "character" of an amplifier and affects the "winning" combinations with other audio components.
My study described below, which has only just begun and relies on my own and my friends' resources, therefore aims to investigate in detail how specific amounts of low-order distortion affect perceived sound quality, using real music. Given the above, I expect preferences to differ depending on:
- audio playback chains;
- type of music;
- mastering of audio track;
- listening levels.
Test Preparation
As a first step, I wrote a program that adds a configurable amount of second and third harmonic distortion (asymmetric and symmetric distortion, respectively) to a digital audio track of any sample rate or bit depth. This simulates the addition of "good" nonlinear distortion to the "original" signal. The output track is automatically compensated to avoid clipping and to remove any DC offset introduced. Moreover, to avoid adding further distortion, it is dithered and noise-shaped when it is re-encoded after the high-precision internal processing. The idea is that, with a downstream amplification chain that is as "transparent" as possible, we can get a fairly precise idea of the perceptual effect of these harmonics.
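One way such a processing step could look is sketched below. This is not the program used for the tests: it assumes a memoryless waveshaper built from Chebyshev polynomials (so that the requested levels are exact for a full-scale sine), followed by DC removal, peak normalization, and TPDF dither with the simplest first-order error-feedback noise shaper. The names `add_harmonics`, `requantize` and `tone_level_db` are hypothetical.

```python
import math
import random

def add_harmonics(samples, h2_db=None, h3_db=None):
    """Add 2nd/3rd harmonic distortion to float samples in [-1, 1].

    Levels are in dB relative to a full-scale sine (assumed convention).
    """
    h2 = 10 ** (h2_db / 20) if h2_db is not None else 0.0
    h3 = 10 ** (h3_db / 20) if h3_db is not None else 0.0
    # Chebyshev polynomials: T2(cos a) = cos 2a, T3(cos a) = cos 3a
    out = [x + h2 * (2 * x * x - 1) + h3 * (4 * x ** 3 - 3 * x) for x in samples]
    mean = sum(out) / len(out)            # remove the DC added by the even term
    out = [y - mean for y in out]
    peak = max(abs(y) for y in out)
    if peak > 1.0:                        # scale down only if it would clip
        out = [y / peak for y in out]
    return out

def requantize(samples, bits=16, rng=None):
    """Quantize to signed integers with TPDF dither and 1st-order noise shaping."""
    rng = rng or random.Random()
    full_scale = 2 ** (bits - 1) - 1
    err = 0.0
    out = []
    for x in samples:
        target = x * full_scale - err             # feed back the previous error
        dither = rng.random() - rng.random()      # triangular PDF, +/-1 LSB
        value = round(target + dither)
        value = max(-full_scale - 1, min(full_scale, value))
        err = value - target
        out.append(value)
    return out

def tone_level_db(samples, freq, fs):
    """Amplitude in dB (0 dB = full-scale sine) of one frequency component."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    return 20 * math.log10(max(2 * math.hypot(re, im) / n, 1e-12))

fs = 96000
tone = [math.cos(2 * math.pi * 1000 * i / fs) for i in range(9600)]  # 100 cycles
distorted = add_harmonics(tone, h2_db=-40, h3_db=-60)
fund = tone_level_db(distorted, 1000, fs)
print(round(tone_level_db(distorted, 2000, fs) - fund, 1))  # 2nd harmonic, dB re fundamental
print(round(tone_level_db(distorted, 3000, fs) - fund, 1))  # 3rd harmonic, dB re fundamental
pcm = requantize(distorted, bits=24, rng=random.Random(0))
```

With a full-scale 1 kHz tone the two printed values should come out at about -40 dB and -60 dB. For signals below full scale a static polynomial like this produces lower relative harmonic levels, so the dB settings in this sketch are only a reference to the full-scale case.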
To test the program I first worked with test signals. For example, a pure tone input at 1 kHz, 0 dBFS (96 kHz / 24 bit) has this spectrum:
With the second and third harmonic distortion set to -40 dB and -60 dB respectively, the program outputs a tone with this spectrum:
For a two-tone signal at 5 kHz and 6 kHz (44.1 kHz / 16 bit):
With third harmonic distortion at -40 dB, the output is:
While with second harmonic distortion only, at -40 dB, it is:
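The frequencies to expect in these two-tone spectra can be worked out by hand. As a sketch, assume the second and third harmonic distortion behave like pure x^2 and x^3 terms (an assumption about the model, with the hypothetical helper `distortion_products`). Each term then generates every sum and difference of as many input tones as its order:

```python
from itertools import product

def distortion_products(freqs, order):
    """Frequencies generated by an x**order term acting on a sum of tones."""
    signed = [sign * f for f in freqs for sign in (1, -1)]
    return sorted({abs(sum(combo)) for combo in product(signed, repeat=order)})

tones = [5000, 6000]
nyquist = 44100 / 2
for order in (2, 3):
    found = distortion_products(tones, order)
    print(order, found, "aliasing" if max(found) > nyquist else "no aliasing")
```

For 5 kHz and 6 kHz the second-order term gives DC, 1 kHz, 10 kHz, 11 kHz and 12 kHz, while the third-order term gives 4, 5, 6, 7, 15, 16, 17 and 18 kHz. All of these stay below the 22.05 kHz Nyquist limit, so no aliasing is expected at 44.1 kHz.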
Below is what happens when a distortion of -70 dB in both the second and third harmonics is applied twice to a pure tone at 1 kHz, 96 kHz / 24 bit (this simulates a chain of two devices with the same nonlinear distortion):
The levels of the second and third harmonics increase by 6-7 dB, and higher-order harmonics (fourth and fifth) appear, albeit at low levels.
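A rise of roughly 6 dB is what a simple model predicts. As a sketch, assume each device is a memoryless polynomial $f(x) = x + a_2 x^2 + a_3 x^3$ with small coefficients (an illustrative assumption, not necessarily the exact curve the program uses). Composing it with itself and keeping terms up to fourth order gives:

```latex
\begin{aligned}
f(f(x)) &= f(x) + a_2 f(x)^2 + a_3 f(x)^3 \\
        &= x + 2a_2 x^2 + \left(2a_3 + 2a_2^2\right) x^3 + \left(a_2^3 + 5a_2 a_3\right) x^4 + \dots
\end{aligned}
```

The second-order coefficient doubles, which is $20\log_{10} 2 \approx 6$ dB, and the third-order one approximately doubles as well. The new fourth-order and higher terms are products of small coefficients, which is consistent with their appearing only at very low levels.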
With real music tracks, I tried subtracting the original signal from the distorted one, obtaining a very low-level, distorted copy of the original track that sounds quite similar to it.
Once the program was tested, to start the actual listening sessions I selected 4 well-known Jazz / Pop tracks with good recording quality. For each I created 2 further versions: one with mainly second harmonic distortion (-50 dB) and the other with mainly third harmonic distortion (-70 dB). I chose high values in order to make the perceptual effect of each type of distortion evident. The value of Gm(f), the GedLee metric that tries to correlate distortion with subjective rating while taking the masking effect into account, i.e.:
is always less than 1 with these values (T(x,f) is the nonlinear transfer characteristic). From each music track I then selected significant excerpts of 15-20 seconds, each dominated by female voice, or by percussion, wind or string instruments.
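As a rough cross-check of the "less than 1" statement, the sketch below evaluates the metric numerically. It rests on two assumptions that are not taken from this post: that the metric has the frequency-independent form usually quoted from Geddes and Lee's papers, Gm = sqrt( integral over [-1, 1] of cos^2(pi*x/2) * (d^2T/dx^2)^2 dx ), and that the transfer characteristic is a plain polynomial T(x) = x + a2*x^2 + a3*x^3. The function name `gm_polynomial` is hypothetical.

```python
import math

def gm_polynomial(a2, a3, steps=2000):
    """Gm for T(x) = x + a2*x**2 + a3*x**3, by midpoint integration on [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        weight = math.cos(math.pi * x / 2) ** 2   # de-emphasizes large excursions
        d2 = 2 * a2 + 6 * a3 * x                  # second derivative of T(x)
        total += weight * d2 * d2 * h
    return math.sqrt(total)

# For a full-scale cosine, x**2 yields a 2nd harmonic of amplitude a2/2 and
# x**3 a 3rd harmonic of amplitude a3/4 (approximately, relative to the fundamental).
a2 = 2 * 10 ** (-50 / 20)   # 2nd harmonic at about -50 dB
a3 = 4 * 10 ** (-70 / 20)   # 3rd harmonic at about -70 dB

print(gm_polynomial(a2, 0.0))   # version with mainly second harmonic distortion
print(gm_polynomial(0.0, a3))   # version with mainly third harmonic distortion
```

Under these assumptions both values come out far below 1 (on the order of 0.01 or less).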
Test Execution
The 3x4 music tracks above were played several times at a controlled level (about 82 dB) and at very close distance on a couple of high-quality solid-state audio systems, installed in dedicated, acoustically treated medium-small rooms. In each session the listeners of course did not know which of the three versions they were listening to, since each was randomly labeled with a number. For each session I asked them to answer these questions:
- which version is the most pleasant to listen to;
- perceived differences between versions, if any.
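The random numbering for each session could be generated along these lines. This is only a sketch with a hypothetical helper `blind_labels`; the returned key stays with whoever runs the session and is not shown to the listeners.

```python
import random

def blind_labels(versions, rng=None):
    """Map anonymous numeric labels (1, 2, 3, ...) to versions in random order."""
    rng = rng or random.Random()
    shuffled = list(versions)
    rng.shuffle(shuffled)
    return {label: version for label, version in enumerate(shuffled, start=1)}

versions = ["original", "second harmonic", "third harmonic"]
key = blind_labels(versions)
print(sorted(key))   # the labels the listeners see
```

Drawing a fresh key for every track and session avoids listeners learning that, say, number 2 is always the same version.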
Test Results
The small sample is quite consistent in its results:
- Tracks with second harmonic distortion were preferred in about 75% of cases and those with third harmonic distortion in about 25%; surprisingly, the original tracks were never preferred.
- Tracks with second harmonic distortion were considered warmer and with more "body"; at other times too soft and a little confused.
- Tracks with third harmonic distortion were considered the brightest and most defined; at other times excessively defined and tiring.
Preliminary Conclusions
To date, the test results appear to be in line with those of GedLee and other works. Of course, more extensive tests are needed, with more people (including untrained listeners), on medium-quality audio systems, and combining different distortion levels with different listening levels, and then with more playback systems, music types and recordings. Yes, it is not a quick task!
In the meantime, comments and suggestions to refine the study are welcome, as are comparisons with similar experiences.