The consequences of including ultrasonics in audio systems

Promit · Jul 24, 2020

This is prompted by another bad-faith discussion which I will not deign to link, but I think the core subject matter is worth discussing on its own. One of the fads that has been big over the last decade or more has been "high resolution audio", built around the idea that hifi audio is in need of more bandwidth and bit depth than the 16/44.1 format of Red Book CD. This started with the push for 24/96, and often goes as far as 24/192 source formats. This whole idea was notably backed by musician Neil Young with the Pono music service and digital music player, and somewhat famously rebutted by Xiph.org here: https://web.archive.org/web/20200310055211/https://people.xiph.org/~xiphmont/demo/neil-young.html

The Xiph article notes something quite crucial: audio systems are not necessarily designed to reproduce ultrasonics properly, and supplying them may have consequences, most notably spraying intermodulation distortion (IMD) back into the audible spectrum. The funny consequence here is of course that high resolution audio may indeed be clearly, provably audible compared to "standard" resolution, just for all the wrong reasons. IMD isn't usually desirable or pleasant to listen to, but in the world of audiofool nonsense, "different" plus "expensive" often equals "better".
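To make the IMD mechanism concrete, here is a minimal sketch rather than a model of any real amplifier or tweeter. It assumes a hypothetical memoryless nonlinearity y = x + A2*x^2 + A3*x^3 with made-up coefficients, and two illustrative ultrasonic tones at 24 kHz and 27 kHz. The second-order term creates a difference product at 27 - 24 = 3 kHz, well inside the audible band.

```python
import cmath
import math

FS = 192_000   # simulation rate, high enough that no distortion product aliases
N = 19_200     # 0.1 s, so every tone below completes a whole number of cycles

def tone_level_db(samples, freq_hz):
    """Level of the component at freq_hz, in dB relative to a full-scale sine."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq_hz * i / FS)
              for i, v in enumerate(samples))
    amplitude = 2 * abs(acc) / len(samples)
    return 20 * math.log10(max(amplitude, 1e-12))

# Two ultrasonic tones, inaudible on their own (illustrative values).
F1, F2, AMP = 24_000, 27_000, 0.4
clean = [AMP * (math.sin(2 * math.pi * F1 * i / FS) +
                math.sin(2 * math.pi * F2 * i / FS)) for i in range(N)]

# Hypothetical mildly nonlinear stage (amp or tweeter): y = x + A2*x^2 + A3*x^3.
A2, A3 = 0.05, 0.02
distorted = [x + A2 * x**2 + A3 * x**3 for x in clean]

DIFF = F2 - F1   # 3 kHz second-order difference product, squarely audible
clean_db = tone_level_db(clean, DIFF)
distorted_db = tone_level_db(distorted, DIFF)
print(f"{DIFF} Hz before nonlinearity: {clean_db:.1f} dB")
print(f"{DIFF} Hz after nonlinearity:  {distorted_db:.1f} dB")
```

With these assumed coefficients the difference tone has amplitude A2 * AMP^2 = 0.008, roughly 42 dB below full scale, while the undistorted signal has essentially nothing at 3 kHz. Real devices will have very different (and frequency-dependent) nonlinearity, so the absolute numbers mean little; the point is only that inaudible inputs can produce an audible output.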

That brings me to the point of this discussion: what are the consequences of including ultrasonics in the playback chain, from the source through DAC, amp, and speakers? Is it ever beneficial to do so? How often is it actively harmful to do so? Are the shape and strength of the bandwidth filter consequential in practice? Does it matter if you have something like the Adam A7X monitors, which claim a reference response out to 50 kHz? Is it a mistake to have Windows configured for 24/96 output? I feel like there's very little technically informed discussion and testing of what ultrasonics mean in practice and whether high resolution audio is just a fashionable nothing or an active distortion generator.

pozz · Jul 24, 2020

The only time I've heard ultrasonic IMD through my speakers is when I deliberately ran 20Hz-30kHz sweeps through my system.

Otherwise I haven't encountered a situation where it's mattered. I generally don't play anything above 44.1/24.

Chromatischism · Jul 24, 2020

Be more respectful to the dogs in the neighborhood.

GeorgeBynum · Jul 25, 2020

While we have neither the power nor the transducers to be dangerous, I worked on non-woven web (think paper or "plastic" clothing like Tyvek) lines that bonded fibers by melting them with ultrasonic energy. There were sensors all over the bonding station to detect leaks. It would destroy skin, and you would have no idea what was happening.

pozz · Jul 25, 2020

@amirm If it's possible with the current setup could you run an ultrasonic IMD study of speakers?

MRC01 · Jul 25, 2020

I think engineering and psychoacoustics justify a higher sampling rate and bit depth. 44-16 is transparent most of the time, but not quite transparent under some conditions, with some listeners, with some material. That said, this idea has been tarnished by people (like Neil Young and many others) who advocate it for the wrong reasons and don't know what they're talking about. It's not about ultrasonics!

The reason for using a higher sampling rate than 44.1 kHz has been discussed extensively here, but in summary a higher sampling rate makes it easier to implement transparent anti-aliasing (AA) and reconstruction filters. Many current DACs widen the filter transition band so that it ends around 24 kHz, which is above Nyquist (22.05 kHz) and technically "wrong". They do this because the transition band from 20,000 to 22,050 Hz is so narrow that it's hard to implement a filter that goes from 0 dB to -infinity dB and runs in real time on limited hardware, while keeping passband amplitude & phase perfectly flat. Stretching the transition band even just a little wider eliminates that problem. A slightly higher sampling rate, even 48 kHz, would make this easier and obviate the need for this hack.
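As a rough illustration of why the width of the transition band matters, the sketch below uses Kaiser's well-known rule of thumb for the length of a windowed FIR low-pass filter. The 100 dB stopband target and the band edges are assumptions picked for illustration, and real DACs use oversampled, multi-stage filters, so treat the numbers as relative rather than as a description of any actual chip.

```python
import math

def kaiser_taps(fs_hz, pass_hz, stop_hz, atten_db=100.0):
    """Kaiser's estimate of FIR length for a given transition band and stopband attenuation."""
    delta_omega = 2 * math.pi * (stop_hz - pass_hz) / fs_hz   # transition width, rad/sample
    return math.ceil((atten_db - 7.95) / (2.285 * delta_omega)) + 1

cases = [
    ("44.1 kHz, stop at Nyquist (22.05 kHz)", 44_100, 20_000, 22_050),
    ("44.1 kHz, widened to 24 kHz",           44_100, 20_000, 24_000),
    ("48 kHz, stop at Nyquist (24 kHz)",      48_000, 20_000, 24_000),
    ("96 kHz, stop at Nyquist (48 kHz)",      96_000, 20_000, 48_000),
]
for label, fs, f_pass, f_stop in cases:
    print(f"{label}: about {kaiser_taps(fs, f_pass, f_stop)} taps")
```

The estimate scales inversely with the transition width, so roughly doubling the width from about 2 kHz to about 4 kHz roughly halves the filter length for the same attenuation.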

As for why to use more than 16 bits, consider music having some of the widest dynamic range typically found in recordings, say a large ensemble symphonic or choral recording having 60 dB of dynamic range. 16 bits gives us 93 dB when dithered, so what's the problem? The quietest parts are 60 dB below full scale, so the top 10 bits are all 0, leaving only 6 bits of resolution (really 5.5, since the LSB is dither). And the overtones that enable us to differentiate, say, a flute from an oboe, are higher frequencies which are about 18 dB lower in level than the fundamentals, so they've got only 3 bits of resolution (really 2.5, since the LSB is dither). And they're in the 2k - 5k range where our hearing is most sensitive. 24-bit eliminates this problem, giving higher resolution (or lower noise, since it's dithered) in the quietest passages.
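The bit-counting above follows from the usual rule that each bit is worth about 6.02 dB. A small sketch of that arithmetic, using the 60 dB and 18 dB figures from the paragraph (the function name is just for illustration, and the half bit lost to dither is left out):

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB per bit

def bits_in_use(bit_depth, level_below_fs_db):
    """Bits left to describe a signal sitting level_below_fs_db under full scale."""
    return bit_depth - level_below_fs_db / DB_PER_BIT

for depth in (16, 24):
    quiet = bits_in_use(depth, 60)            # quiet passage, 60 dB down
    overtones = bits_in_use(depth, 60 + 18)   # overtones a further 18 dB down
    print(f"{depth}-bit: {quiet:.1f} bits for the quiet passage, "
          f"{overtones:.1f} bits for its overtones")
```

At 16 bits this gives about 6 and 3 bits respectively, matching the figures in the post; at 24 bits the same passages get 8 more bits each.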

All that said, I believe that 44-16 is transparent 99% of the time. I think it would only take 48-24 to be 100% transparent all the time, for all people, for all kinds of music. Certainly not 96, 192 or any of the crazy stuff that is often advocated in the "high res" circles.

phoenixdogfan · Jul 25, 2020

Making your Lab's life a living hell?

Daverz · Jul 25, 2020

With the acoustic instrumental music I usually listen to, there's very, very little energy above 20 kHz, so even with amps and tweeters with extended bandwidth, I don't think it will cause any issues. Perhaps with electronica played very loud?

MRC01 · Jul 25, 2020

When it comes to acoustic instruments, the extreme HF is with twocky clicky percussive sounds, like sticks smacking wood or metal, castanets, etc. Cymbals and triangles, of course. Plucked harp strings. Some kinds of bagpipes. Trumpets playing FF in the high register. With vocals, sibilant consonants.

Notwithstanding these examples, the norm is that most musical instruments have little energy above 10 kHz and what is there, diminishes rapidly as you go higher.

Jangling a set of keys in front of the mic generates incredible amounts of HF energy extending well above 20 kHz before it attenuates. That's not a musical instrument, but it is a natural sound and interesting to know. This is a trick recording engineers sometimes use to check the equipment. And it's a sound that can be very effective and useful for ABX testing for extreme HF response (whether in headphones, speakers, a DAC or whatever).

Blumlein 88 · Jul 25, 2020

MRC01 said:
I think engineering and psychoacoustics justifies a higher sampling rate and bit depth. 44-16 is transparent most of the time, but not quite transparent under some conditions, with some listeners, with some material. That said, this idea has been tarnished by people (like Neil Young and many others) who advocate it for the wrong reasons and don't know what they're talking about. It's not about ultrasonics!

All that said, I believe that 44-16 is transparent 99% of the time. I think it would only take 48-24 to be 100% transparent all the time, for all people, for all kinds of music. Certainly not 96, 192 or any of the crazy stuff that is often advocated in the "high res" circles.

Something that gives me pause in regard to this is the test Amandine Pras et al. did comparing 88.2 kHz to 44.1 kHz sample rates, all at 24 bits. Using no processing and very high quality gear for recording and playback, they compared 88/24, 44/24 and 88/24 downsampled to 44/24. Only the downsampled result was discernible. They were using a high quality Pyramix for downsampling. The natively recorded 44 was not heard as different from the natively recorded 88.

I know Arnie's old jangling keys test file was something I could with great difficulty hear in an ABX comparison. When I used a modern downsampler instead of what he used many years ago, it was no longer possible for me to ABX it.
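For anyone scoring their own ABX runs, one common way to judge a result is a one-sided binomial test against pure guessing. This is a generic sketch, not the scoring used by any particular ABX tool, and the trial counts are made up for illustration.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: chance of at least `correct` hits by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

for correct, trials in [(9, 16), (12, 16), (16, 16)]:
    print(f"{correct}/{trials} correct: p = {abx_p_value(correct, trials):.4f}")
```

A small p-value means the score would be unlikely from guessing alone; a score near half the trials says nothing either way.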

My preference is always to hear a recording in its native format if possible.

https://www.academia.edu/441305/Sampling_Rate_Discrimination_44.1_KHz_Vs._88.2_KHz

JeffS7444 · Jul 25, 2020

If something's wrong at ultrasonic frequencies at sufficiently high amplitudes, I think it'll have an effect! Many years ago at a part-time hifi store job, a few of us got to horsing around with a signal generator and a Magneplanar MG3 which was equipped with ribbon tweeters. And I think you can figure out where this is heading: My recollection is that while I did not discern a pure tone at 20 kHz or beyond, if I positioned myself just-so with respect to the tweeter, I got a blast of something painful, but it was concentrated in a very narrow beam; narrow enough that I think it'd be tricky to set up speakers with sufficient precision for the full effect.

Daverz · Jul 25, 2020

Spectrogram of Transmission from the Pan Sonic album Cathodephase.

[Attached image: Pan Sonic - Lähetys _ Transmission.flac.jpg]

It would be interesting to see what a hi-res version of something like this would look like.
