A case for multi-tone and higher-order testing of headphone distortion
This is the latest formulation of what I had already documented in
https://www.audiosciencereview.com/...n-susvara-headphone-review.50705/post-1888972 (post #1,183; the old methodology dB calibration in those measurements is possibly high by a few dB compared to the latest calibration methodology).
tl;dr:
- Two headphones EQed to the same in-ear frequency response that have very low harmonic distortion in a given frequency band can greatly differ in their multi-tone distortion within that band, therefore it may be of interest to use multi-tone distortion measurements as a better measure of exceptionality.
- In a recent measurement of a demo unit, a headphone (name and distortion measurement not to be disclosed unless the other party were to provide their context) that measured reasonably and as expected up to the third harmonic was found to have significant harmonic distortion of the fifth order and higher where the other headphones for the same measurement conditions primarily exhibited second order distortion with the rest being close to or below the noise floor.
This post will show my measurement methodology, present the harmonic distortion measurements, and then the commensurate multi-tone distortion measurements demonstrating why this type of measurement may be of interest to ASR. This post excludes measurements of certain headphones that I cannot disclose without the other party's authorization, yet would have likely been of particular interest to ASR.
Methodology
In-ear microphones, audio interface, and SPL calibration:
https://www.head-fi.org/threads/con...les-mattering-for-audio.970430/#post-17816683 had covered my latest measurement equipment, namely in-ear microphones plugged into RØDE VXLR+ adapters plugged into a MOTU M2 whose DAC's line output is connected via a male dual TRS to male 4.4 mm TRRRS adapter to a FiiO K9 Pro ESS DAC/amp's balanced line input in preamp mode (having the DAC and ADC on the same device facilitates more accurate and consistent phase response measurements and likewise measurement averaging for reducing the effective noise floor), measurements being captured using Room EQ Wizard (REW). The main difference is that I had developed new mounts for my in-ear microphones and a new method of SPL calibration as seen below.
Figure 1:
- (a): The 4 mm electret microphone capsules used are KEEG1538WB-100LB (https://canada.newark.com/kingstate/keeg1538wb-100lb/microphone-condenser-wire-w-proof/dp/48W5995), supplied by https://www.earfish.eu/ primarily for HRTF measurement. The dust covers were removed since they were bound to be rubbed off upon insertion into the new mounts. The same Radians Custom Molded Earplugs CEP001-R Red putty was used to form a plug to be inserted into the 1/2" to 1" adapter that came with my SPL calibrator.
- (b): The SPL calibrator used is the Tekcoplus ND9B (SLTK-885B) SPL calibrator. The tolerances are likely not as advertised (no Class 1 certification or calibration report is provided), but should be sufficient for my purposes. I had also purchased the Latnex SM-130DB SPL meter with "certificate" of calibration from facilities in Taiwan only specifying supposed compliance with ISO 9001:2015(CNS12681), the manual specifying IEC 61672-1:2002 class 2, IEC651 Type2, and ANSI1.4 Type2 as references. It reads between 0.4 dB and 0.9 dB too high compared to the ND9B 94 dB reference, which is within its advertised ±1.8 dB @ 1 kHz spec. In this case, the microphone capsule is mounted to the plug, inserted into the adapter, then inserted into the SPL calibrator which is turned on, SPL calibration being conducted via the instructions in https://www.roomeqwizard.com/help/help_en-GB/html/inputcal.html for "Use an external signal" until both channels register 94 dBC.
- (c): The in-ear microphone mounts were formed using the Radians Custom Molded Earplugs CEP001-R Red putty (try at your own risk; it is highly recommended to use an otoblock in addition to embedding a looped string for safely removing the plugs once formed; as a general rule of thumb, use no more than an 8 mm to 8.5 mm diameter ball of each reagent per plug and press the material in to be flush with the canal entrance and no deeper). A rotary tool was used to form a mounting channel as well as a small hole on the side through which the microphone capsule can be elastically inserted for measurements and removed for calibration. It was difficult to get a perfect match of shape and depth between the left and right plugs, but this sufficed considering the variable of the physiological differences between the left and right ears and sides of the head. The plugs provide around 15 to 20 dB of attenuation for the test tones, though not all impression attempts had yielded good attenuation or seal.
- (d): The in-ear microphones are inserted like so and the downward tension on the microphone wires minimized, the headphones then being placed on the head with the intuitive feedback not offered by industry standard heads. The REW "Measure" window (see https://www.roomeqwizard.com/help/help_en-GB/html/makingmeasurements.html) is opened at the default -12 dBFS test signal level, the measurement length set and the measurement range set from 2 Hz to 96 kHz (purely for interest) with the measurement chain set to a sample rate of 192 kHz (averaged measurements run faster this way), "Check levels" clicked and the MOTU M2 and FiiO K9 Pro ESS volume knobs adjusted until the reference level (e.g. 94 dBC) is displayed, and the sweep then played with a piece of Amazon LENRD-shaped acoustic foam held behind the driver to absorb the backwave and prevent room reflections from interfering with the group delay and CSD measurements. Compared to my previous in-ear microphone mounts in https://www.head-fi.org/threads/mez...eadphone-official-thread.959445/post-17743502 (post #5,152), these provide much better positional consistency of the microphones as seen in Figure 4. They are also substantially more comfortable, and I can remove the plugs in the middle of a measurement session without concerns for measurement consistency.
Magnitude response calibration:
It was deemed uneconomical to invest in an Earthworks M23R measurement microphone and a commensurate Class 2 SPL calibrator. I had originally intended on upgrading my 4 mm electret microphone capsules to the TOM-1537L-HD-LW100-B-R (
https://www.digikey.ca/en/products/detail/pui-audio-inc/TOM-1537L-HD-LW100-B-R/12152294; see
https://www.digikey.ca/en/products/detail/pui-audio-inc/TOM-1537L-HD-R/7898333 for the datasheet) which had a 68 dB SNR as opposed to the 58 dB SNR of the KEEG1538WB-100LB. Though I did not have a calibrated measurement microphone, I did already have a pair of Genelec 8341A studio monitors. An indoor Genelec GLM calibration was done with the GLM microphone 1 m away on-axis from the tweeter facing away from the wall to the left of my desktop, the microphones being mounted as below.
Figure 2: TOM-1537L-HD-LW100-B-R 4 mm electret microphone mounted with lower edge 1 cm above the tip of the GLM calibration microphone 1 m away on-axis from the tweeter. Alligator clips were used before committing to solder more closely matched pairs to adapter cables with colour-coded shrink tubing.
Figure 3: 1/12 octave smoothing overlay of microphone responses in the configuration seen in Figure 2. As can be seen, these four capsules meant for use with two pairs of mics are impressively linear in "free-field" (in this context meaning without any other housing to interact with). I chose Mics 2 and 4 as my references. Measurements taken with the capsules only 0.5 cm above the GLM microphone's tip showed small variations likely more so related to height within the sound field than to the proximity to said tip. Comparisons with 3.51 ms impulse windowing and the same 1/12 octave smoothing showed similar trends for differences within the treble and top octave.
Unfortunately, when driven by the RØDE VXLR+ adapters which should have been providing plug-in power within spec, the TOM-1537L-HD-LW100-B-R capsules as seen in Figure 4 exhibited high second-order harmonic distortion tracking the measured headphone frequency response in a manner suggesting that the distortion came from the capsules themselves. It did not seem like I could have made a wiring error. As a last resort, the KEEG1538WB-100LB capsules, with which I had already measured impressively low distortion levels for the Meze Elite in the past, were removed from their original in-ear mounts and inserted into the same left ear custom mount. 1/3 octave smoothed measurements were taken using TOM-1537L-HD-LW100-B-R Mics 2 and 4 in that same left ear custom mount, these being exported into Excel where they were RMS averaged, 1/3 octave smoothed measurements for the KEEG1538WB-100LB then being taken and exported into Excel where I could make calibration files to be imported into REW. These worked well and exhibited reasonable headphone channel matching for the measurements taken on April 5, though by April 10, the right KEEG1538WB-100LB which was known to have bass and treble roll-off was found to be much less stable than the left capsule.
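As a rough illustration of the spreadsheet step described above, here is a minimal Python sketch of RMS averaging two reference traces and deriving per-frequency calibration offsets for another capsule. The function names and all numbers are hypothetical placeholders, not my actual data, and the "frequency dB" text layout for the calibration lines is an assumption that should be checked against REW's documentation.

```python
import math

def rms_average_db(trace_a_db, trace_b_db):
    """RMS (power) average of two magnitude traces given in dB."""
    averaged = []
    for a_db, b_db in zip(trace_a_db, trace_b_db):
        mean_power = (10 ** (a_db / 10) + 10 ** (b_db / 10)) / 2
        averaged.append(10 * math.log10(mean_power))
    return averaged

def calibration_offsets_db(capsule_db, reference_db):
    """Deviation of the capsule from the reference at each frequency."""
    return [c - r for c, r in zip(capsule_db, reference_db)]

def format_cal_lines(freqs_hz, offsets_db):
    """Assumed plain-text layout: one 'frequency dB' pair per line."""
    return [f"{f:.3f} {o:.3f}" for f, o in zip(freqs_hz, offsets_db)]

# Made-up example values (NOT real measurements).
freqs_hz = [100.0, 1000.0, 10000.0]
reference_mic_a_db = [94.0, 94.0, 90.0]
reference_mic_b_db = [94.4, 94.0, 91.0]
capsule_under_cal_db = [92.5, 94.0, 88.0]

reference_db = rms_average_db(reference_mic_a_db, reference_mic_b_db)
offsets_db = calibration_offsets_db(capsule_under_cal_db, reference_db)
for line in format_cal_lines(freqs_hz, offsets_db):
    print(line)
```

The sign convention here assumes the calibration file describes the capsule's own deviation, which the measurement software then removes; if the software expects the opposite convention, the offsets would need to be negated.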
Figure 4: Here is the unaveraged 4M length counterpart to Figure 9 where the TOM-1537L-HD-LW100-B-R capsule was used with the same VXLR+ adapter. This is a markedly different result from my measurements that use my older KEEG1538WB-100LB capsules, there only being second-order harmonic distortion above the noise floor that tracks the magnitude response and is vertically stretched perhaps due to nonlinear increase in distortion with respect to level.
Finally, here is a showcase of the magnitude response measurement consistency for the left driver of my Meze Elite with "V3.1 PEQ"; see Figure 13 in
https://www.head-fi.org/threads/rec...-virtualization.890719/page-121#post-18027627 (post #1,812) for how this EQ profile compares to my threshold of hearing EQ compensated left ear 30-degree speaker HRTF.
Figure 5: Red trace: 4M length measurement taken on April 5 immediately after creating the calibration file. Cyan trace: 4M length measurement taken on April 10. Blue trace: 512k length measurement with 1/48 octave smoothing taken later on April 10 amid quicker measurements comparing FR variations with pad position, this measurement being representative of returning the cups to my preferred centered seating used for the previous two measurements.
Observations
Harmonic distortion:
In the below "4M8R" means that the measurement length in REW was "4M" and it was averaged across 8 repetitions to minimize the effective noise floor.
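For a sense of how much 8 repetitions can buy, here is a small Python sketch assuming the repetitions are averaged synchronously and the noise is uncorrelated between repetitions, in which case the noise power should fall by a factor equal to the number of repetitions. The function names and the simulation settings are my own illustrative choices; this is not a description of REW's internals.

```python
import math
import random

def expected_noise_reduction_db(repetitions):
    """Ideal noise floor reduction for uncorrelated noise, in dB."""
    return 10 * math.log10(repetitions)

def simulated_noise_reduction_db(repetitions, samples=20000, seed=1):
    """Compare the power of single noise draws against averaged draws."""
    rng = random.Random(seed)
    single_power = 0.0
    averaged_power = 0.0
    for _ in range(samples):
        draws = [rng.gauss(0.0, 1.0) for _ in range(repetitions)]
        single_power += draws[0] ** 2
        averaged_power += (sum(draws) / repetitions) ** 2
    return 10 * math.log10(single_power / averaged_power)

print(round(expected_noise_reduction_db(8), 2))   # 9.03
print(round(simulated_noise_reduction_db(8), 2))  # close to the ideal value
```

Noise that is correlated between repetitions, such as a steady mains tone locked to the measurement timing, would not be reduced this way.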
For all of these measurements, resting a hand on a grounded chassis like the MOTU M2's or FiiO K9 Pro ESS's attenuated a 60 Hz mains tone and its harmonics. Per
https://audiosciencereview.com/foru...hones-with-motu-m2-and-rew.49384/post-1783062 (post #6), use of the RØDE VXLR+ incurs upwards of 10 dB more noise centered around 900 Hz than the RØDE VXLR Pro, which uses an internal transformer to convert the unbalanced plug-in power signal to a balanced one at the expense of greater third-order harmonics below 1.6 kHz. As such,
the second-order harmonic distortion between 160 Hz and 600 Hz is more likely to be noise limited than the frequencies above it.
The demo unit measurements were taken at the audio shop in the dedicated listening room with sliding glass door, my SPL meter registering an ambient noise of around 45 dBA. These measurements were taken with the MOTU M2 USB connection plugged into my laptop with charger connected and the FiiO K9 Pro ESS plugged directly into the mains socket. All these demo units were connected to the DAC/amp with their stock unbalanced 1/4" TRS cable due to balanced cables having been unavailable.
Figure 6: Meze Elite demo unit with "V3.1 PEQ" left driver harmonic distortion at 94 dBC, 4M8R. An exceptional result comparable to my at-home result seen in Figure 7 and within Meze's spec, but a bit more noise limited in the bass up to 400 Hz.
The measurements below were taken at home with my own headphones, my SPL meter having registered an ambient noise of around 31 dBA. These measurements were taken with the MOTU M2 USB connection plugged into my desktop and the FiiO K9 Pro ESS plugged directly into the mains socket, its USB connection having still been connected to my desktop for use in regular listening.
Figure 7: Meze Elite Tungsten with "V3.1 PEQ" left driver harmonic distortion at 94 dBC, 4M8R. Mainly cleaner or less noise limited below 400 Hz, likewise for the 100 dBC measurements.
Figure 8: Audio-Technica ATH-M50xBT with "V3.1 PEQ" (effectively matching the magnitude response of Figure 6) left driver harmonic distortion at 94 dBC, 4M8R. With EQ, the distortion performance above 140 Hz competes with the EQed Meze Elite.
Figure 9: HiFiMan HE1000se left driver harmonic distortion at 94 dBC, 4M8R. Only competitive in the upper bass and midrange. This is the third out of three HE1000se units I had encountered, the second due to a mistake on my part and the third due to the second unit having had clear Q/C issues with its distortion per
https://www.head-fi.org/threads/totl-disappointments.925164/page-63#post-17949397 (post #936), whereby the first unit may have measured a bit better in the treble.
Multi-tone distortion:
Most interesting are the multi-tone distortion measurements which reveal cases where headphones with comparable harmonic distortion levels above 200 Hz differ in multi-tone distortion performance, in this case, the EQed Meze Elite coming out on top. Here, multi-tone distortion test signals provide an approximation of the busiest sections of music which can in some cases indeed be found through FFTs to comprise a superposition of tones following an approximately pink spectrum like below:
Figure 10: FFT spectrum of the first big tutti of the opening of Boulez' recording of Mahler Symphony No. 5. I've found most orchestral music to technically rather follow a brown/red spectrum envelope unless substantial brass was involved.
These measurements are taken with REW's RTA Window (
https://www.roomeqwizard.com/help/help_en-GB/html/spectrum.html), Signal Generator (
https://www.roomeqwizard.com/help/help_en-GB/html/siggen.html), and SPL meter (
https://www.roomeqwizard.com/help/help_en-GB/html/splmeter.html). The RTA window is set to a 4M FFT length with a rectangular window and "Exponential 0.50" averaging, my typically managing to show the result after only one or two averaging iterations.
Multi-tone distortion (of various causes) would show up as spuriae above the noise floor at the base of or between the main tones of the signal.
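To make the test signal more concrete, here is a minimal Python sketch of a 1/10 decade multi-tone whose tones are snapped to exact FFT bins, which is what allows a rectangular window to be used without leakage when the DAC and ADC clocks are synchronized. Several things here are assumptions on my part rather than confirmed REW behaviour: that "4M" means 4 x 1024 x 1024 samples, that the tones span 20 Hz to 20 kHz, that "pink spectrum" means tone levels falling at 10 dB per decade, and that all tones start at zero phase (a real generator would likely choose phases to limit the crest factor). All function names are hypothetical.

```python
import math

def pink_multitone_tones(sample_rate_hz=192_000, fft_length=4 * 1024 * 1024,
                         f_low_hz=20.0, f_high_hz=20_000.0,
                         tones_per_decade=10, slope_db_per_decade=-10.0):
    """Return (fft_bin, frequency_hz, level_db) for each bin-aligned tone."""
    bin_width_hz = sample_rate_hz / fft_length
    steps = round(tones_per_decade * math.log10(f_high_hz / f_low_hz))
    tones = []
    for k in range(steps + 1):
        nominal_hz = f_low_hz * 10 ** (k / tones_per_decade)
        fft_bin = round(nominal_hz / bin_width_hz)  # snap to nearest bin
        freq_hz = fft_bin * bin_width_hz
        # Assumed pink law: level relative to the lowest nominal frequency.
        level_db = slope_db_per_decade * math.log10(freq_hz / f_low_hz)
        tones.append((fft_bin, freq_hz, level_db))
    return tones

def synthesize(tones, sample_rate_hz, num_samples):
    """Sum zero-phase sines, scaled so the waveform cannot exceed +/-1."""
    amplitudes = [10 ** (level_db / 20) for _, _, level_db in tones]
    scale = 1.0 / sum(amplitudes)
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        value = sum(a * math.sin(2 * math.pi * f * t)
                    for a, (_, f, _) in zip(amplitudes, tones))
        samples.append(scale * value)
    return samples

tones = pink_multitone_tones()
print(len(tones), "tones, bin width", 192_000 / (4 * 1024 * 1024), "Hz")
preview = synthesize(tones, 192_000, 1000)  # short preview only
```

Because every tone completes a whole number of cycles within the FFT length, any energy that shows up between the tone bins comes from the device under test or the noise floor rather than from spectral leakage.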
Here are TRS to XLR loop-back measurements from my MOTU M2's balanced line out to its left XLR input (I admit that an RCA to TRS adapter for testing single-ended ADC performance may have been a better test if I had such an adapter) with the preamp gain set to 12 o'clock as in the original measurements. The SPL scale was as calibrated with my SPL calibrator. The below results were only possible for the 4M FFT length and rectangular window.
Figure 11: -10 dBFS 1/10 octave pink spectrum multitone volume-matched to the levels seen in my headphone measurements; i.e. the ADC electronics see tones at the same scale as when receiving those signals from my microphones picking up the headphone signals. Spuriae are around 95 dB down at 1 kHz.
Figure 12: -10 dBFS 1/10 octave pink spectrum multitone with max volume. Spuriae are around 90 dB down at 1 kHz, so still well below the distortions measured out of my headphones.
Figure 13: At-home RTA noise floor for the left in-ear microphone with the Meze Elite worn; at least in this measurement, the 60 Hz mains tone was hard to suppress. The noise floor with the HE1000se was cleaner in the bass. I cannot share the at-shop noise floor measurement insofar as it shows the frequency response of one of the headphones that cannot be disclosed. Otherwise, the noise floor can be inferred from the graphs.
At-shop demo unit measurement:
Figure 14: Meze Elite demo unit with "V3.1 PEQ" left driver 94 dBA 1/10 decade pink spectrum multi-tone. Quite low, especially in the bass, the midrange distortion lobes being upwards of 56 dB down, or 0.16%, probably well below audible. I also have 99.8 dBA (approximating some of the loudest orchestral tutti like with the opening of Mahler Symphony No. 5) 1/24 octave pink spectrum multi-tone measurements, which mainly incur a denser distortion floor that is likewise raised by around 6 dB relative to the tones. The lower midrange multi-tone distortion happened to be worse on this unit than on my own unit at home as seen in Figure 20.
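For anyone wanting to check the conversion between "dB down" and percent used in the caption above, here is a short Python sketch of the standard amplitude-ratio arithmetic. The function names are just illustrative.

```python
import math

def db_down_to_percent(db_down):
    """Amplitude ratio in percent for a component 'db_down' dB below the tones."""
    return 100 * 10 ** (-db_down / 20)

def percent_to_db_down(percent):
    """Inverse conversion: percent of the tone amplitude to dB below it."""
    return -20 * math.log10(percent / 100)

print(round(db_down_to_percent(56), 2))  # 0.16
print(round(db_down_to_percent(50), 2))  # 0.32, i.e. a floor raised by 6 dB
print(round(percent_to_db_down(0.1), 1)) # 60.0
```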
At-home personal unit measurements; my laptop happened to have exported the images smaller than on my desktop, my having not remembered the exact size number while at the shop:
Figure 15: Meze Elite Tungsten with "V3.1 PEQ" left driver 94 dBA 1/10 decade pink spectrum multi-tone. Cleaner in the lower midrange.
Figure 16: Audio-Technica ATH-M50xBT with "V3.1 PEQ" left driver 94 dBA 1/10 decade pink spectrum multi-tone.
Figure 17: HiFiMan HE1000se left driver 94 dBA 1/10 decade pink spectrum multi-tone. Better than the ATH-M50xBT in some places, the worst in the treble. The bass noise floor is lower thanks to the pads contacting the skin further from my ears, incurring less heartbeat noise.
To check on the possible influence of higher bass distortion on the increased full-range multi-tone distortion measurements, here are measurements with the tones below 100 Hz truncated and the rest left at the same level, inherently reducing the test signal's amplitude:
Figure 18: Meze Elite Tungsten with "V3.1 PEQ" left driver 94 dBA 1/10 decade pink spectrum multi-tone, tones below 100 Hz truncated. Quite clean. The multi-tone distortion is yet further improved, but the amplitude of the signal was likely also reduced. Such a spectral profile is probably also quite unlikely to be seen within music.
Figure 19: Audio-Technica ATH-M50xBT with "V3.1 PEQ" left driver 94 dBA 1/10 decade pink spectrum multi-tone, tones below 100 Hz truncated. Upper bass and lower midrange harmonic distortion is still rather high despite the same single-tone sweep harmonic distortion in that range having been more competitive.
Figure 20: HiFiMan HE1000se left driver 94 dBA 1/10 decade pink spectrum multi-tone, tones below 100 Hz truncated. Here, the treble distortion was still salient. Though the single-tone sweep harmonic distortion between 100 Hz to 500 Hz was comparable to the EQed ATH-M50xBT, the HE1000se has notably better multi-tone distortion in that range.
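To estimate how much the signal level changes when the tones below 100 Hz are truncated, here is a rough Python sketch. It assumes a 20 Hz to 20 kHz 1/10 decade tone set with tone levels falling at 10 dB per decade (my reading of "pink spectrum", not confirmed against the generator), and uses the standard analytic A- and C-weighting curves. The function names are hypothetical, and the result depends entirely on these assumptions.

```python
import math

def a_weighting_db(f_hz):
    """Standard analytic A-weighting curve (about 0 dB at 1 kHz)."""
    f2 = f_hz * f_hz
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2))
    return 20 * math.log10(ra) + 2.00

def c_weighting_db(f_hz):
    """Standard analytic C-weighting curve (about 0 dB at 1 kHz)."""
    f2 = f_hz * f_hz
    rc = (12194.0 ** 2 * f2) / ((f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2))
    return 20 * math.log10(rc) + 0.06

def tone_list(f_low_hz=20.0, decades=3, tones_per_decade=10,
              slope_db_per_decade=-10.0):
    """(frequency_hz, level_db) pairs under the assumed pink tone law."""
    tones = []
    for k in range(decades * tones_per_decade + 1):
        f_hz = f_low_hz * 10 ** (k / tones_per_decade)
        tones.append((f_hz, slope_db_per_decade * k / tones_per_decade))
    return tones

def total_level_db(tones, weighting=None):
    """Power sum of all tones, optionally frequency weighted."""
    power = 0.0
    for f_hz, level_db in tones:
        w_db = weighting(f_hz) if weighting else 0.0
        power += 10 ** ((level_db + w_db) / 10)
    return 10 * math.log10(power)

def truncation_change_db(tones, cutoff_hz, weighting=None):
    """Level change from removing every tone below cutoff_hz."""
    kept = [(f, l) for f, l in tones if f >= cutoff_hz]
    return total_level_db(kept, weighting) - total_level_db(tones, weighting)

tones = tone_list()
for name, w in (("Z", None), ("C", c_weighting_db), ("A", a_weighting_db)):
    print(name, round(truncation_change_db(tones, 100.0, w), 2), "dB")
```

Under these assumptions the unweighted level would drop by roughly 7 dB, while the A-weighted level would change by well under 1 dB since A-weighting already discounts the bass tones heavily. With equal-amplitude tones instead (slope of 0 dB per decade), the unweighted drop would be much smaller.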
Conclusion
Though different headphones may show comparable single-tone sweep harmonic distortion performance within a given frequency band, they can still differ significantly in their multi-tone distortion performance for those bands. As such, multi-tone distortion measurements may provide another meaningful and more practical (insofar as most music comprises a substantial superposition of tones) means for assessing the comparative performance between headphones. Sure, maybe even my HE1000se's multi-tone distortion may not be audible in practical listening, but multi-tone spuriae close to the noise floor would be quite a badge of exceptionality giving consumers full confidence in the inaudibility of distortion artifacts through the given headphone. Likewise, where manufacturers may be content with achievement under the harmonic distortion metric, others might be found to still be quite competitive in the multi-tone distortion metric, encouraging further technological competition and progress.
- Again, this post excluded measurements of certain headphones that would have provided a more compelling case, but these cannot be shared unless the other party were to share their context or match the pink spectrum multi-tone and the levels thereof and convincingly demonstrate a discrepancy from my measurements.
Likewise, in the case that the measurement system is capable of showing higher-order harmonic distortion, these ought to be checked at least up to the fifth harmonic to reveal any surprises or evince concerning issues with a new design choice.
For consideration
Should such multi-tone distortion measurements be adopted, it would be up for debate what parameters should be used. As seen here, I had chosen a pink spectrum as a practical average for music. A rectangular window should have a sufficiently high dynamic range for transducer measurements, and should supply the highest frequency resolution and floor particularly in the bass, especially when using up to a 4M FFT length provided that DAC to ADC sample synchronization is ensured. 1/24 octave measurements as in https://www.audiosciencereview.com/...n-susvara-headphone-review.50705/post-1888972 (post #1,183) are a reasonable extreme, but obscure resolution in the bass which is better seen with 1/20 or 1/10 decade multi-tone. Another multi-tone signal may be decided upon. For volume normalization, I chose 94 dBA for 1/10 decade pink spectrum multi-tone and 100 dBA for 1/24 octave pink spectrum multi-tone as approximating the perceived loudness of some of the loudest live orchestral tutti; this or whether to instead use dBC or a single tone may also be deliberated. Consideration may also be given to normalizing headphones to a reference EQ profile for a fairer comparison of driver distortion capability in yielding the same tonality.
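To put numbers on the tone spacing trade-off, here is a small Python sketch comparing how densely 1/24 octave, 1/20 decade, and 1/10 decade tones are packed, and how many FFT bins separate adjacent tones near 20 Hz. It assumes a 192 kHz sample rate and that "4M" means 4 x 1024 x 1024 samples; the function names are hypothetical.

```python
import math

def tones_per_decade(denominator, unit):
    """Tone density per decade for 1/denominator octave or decade spacing."""
    if unit == "decade":
        return float(denominator)
    if unit == "octave":
        return denominator * math.log2(10)
    raise ValueError("unit must be 'octave' or 'decade'")

def adjacent_spacing_bins(f_hz, denominator, unit,
                          sample_rate_hz=192_000, fft_length=4 * 1024 * 1024):
    """FFT bins between a tone at f_hz and the next tone above it."""
    if unit not in ("octave", "decade"):
        raise ValueError("unit must be 'octave' or 'decade'")
    base = 10.0 if unit == "decade" else 2.0
    spacing_hz = f_hz * (base ** (1 / denominator) - 1)
    return spacing_hz / (sample_rate_hz / fft_length)

for denominator, unit in ((24, "octave"), (20, "decade"), (10, "decade")):
    density = tones_per_decade(denominator, unit)
    gap = adjacent_spacing_bins(20.0, denominator, unit)
    print(f"1/{denominator} {unit}: {density:.1f} tones per decade, "
          f"{gap:.1f} bins between tones near 20 Hz")
```

Fewer bins between adjacent tones leave less room in the bass for intermodulation products and the noise floor to be seen separately from the tones themselves.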