
bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,056
Likes
927
Yes, this nativedsd nonsense reminds me of the article from mansr.
https://troll-audio.com/articles/pcm-and-dsd/
Misconceptions

In the world of hi-fi, it is common to encounter misleading statements about DSD, the marketing name for 1-bit audio. A few examples (found on dsd-guide.com and blog.nativedsd.com):

  • “There are no samples, there are no words, there is no code.”
  • “DSD is a lot closer to analog than PCM ever thought to be.”
  • “[PCM] has really nothing to do with audio. You have minimal resolution at zero crossing, whereby with DSD you have maximum resolution and on and on.”
  • “DSD (or the more correct non marketing term, Pulse Density Modulation – PDM) is an analog signal itself.”

These quotes are all nonsense. A DSD signal is discrete in time and amplitude, ergo digital.

------------------
So Andy, after realizing Merging's false explanations about DSD/DXD/impulse response/bandwidth, you decided to post more misinformation?
 
Last edited:

Frank Dernie

Major Contributor
Forum Donor
Joined
Mar 24, 2016
Messages
3,897
Likes
8,008
Location
Oxfordshire
No.
Try any half decent brass band live - and, immediately after returning home, any brass band recording on CD.
( which made me immediately sad - no such opportunity to listen to a brass band parade or similar at the moment live, in most parts of the world... )
I would be prepared to bet a small quantity of my own money that the difference between a live brass band and playing back a CD of one at home has nothing whatever to do with frequency.
To what species of creature are you planning to replay your recordings and using what sort of loudspeaker?
 

Kal Rubinson

Major Contributor
Industry Insider
Joined
Mar 23, 2016
Messages
2,442
Likes
3,352
Location
NYC/CT
All good - I was just trying to join the dots that the change in voltage at the speaker terminals causes the speaker cone to move which causes the change in air pressure which causes the ear drum and other bits to move which causes the hair cells to move and then understand where this stops being a causal linear(ish) chain.
Fussy edited version: All good - I was just trying to join the dots that the change in voltage at the speaker terminals causes the speaker cone to move, which causes the change in air pressure, which causes the ear drum and other bits to move, which causes deflection of the basilar membrane, resulting in deflection of the cilia (hairs) of the hair cells (insert ionic channel operations here), and then understand where this stops being a causal linear(ish) chain. :)
 

andymok

Senior Member
Joined
Sep 14, 2018
Messages
482
Likes
476
Location
Hong Kong
Yes, this nativedsd nonsense reminds me of the article from mansr. [...]
So Andy, after realizing Merging's false explanations about DSD/DXD/impulse response/bandwidth, you decided to post more misinformation?
I wouldn’t call yours misinformation, though I do not share the same view.
 
Joined
Sep 16, 2020
Messages
52
Likes
18
Location
Slovakia
A DSD signal is a one-bit, two-level PDM signal. The word level (not value) is important: a level of 0 or 1 at a given time tells you nothing about the exact signal value at that time. The DSD signal levels at a given instant really are not samples as we know them in the PCM domain, so the DSD signal levels do not represent digits. That makes it a bit hard to call them digital.
On the other hand, a DSD signal requires nothing more than low-pass filtering to become an analog signal. That is the only thing DACs with a separate DSD path do, and a DSD-only DAC implementation can be fully discrete. So to say that a DSD signal is closer to an analog signal than a PCM signal is simply true.
It is quite possible that the original meaning of DSD was something like Direct Sigma Delta, but such a term would not have marketed well with commercial SACD products. If you understand the nature of the DSD signal, the term Direct Stream Digital sounds a bit artificial.
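To illustrate the "low-pass filtering is all you need" point: here is a minimal first-order delta-sigma loop in Python. This is a textbook sketch with made-up parameters (rate, tone, filter length are my own choices), not any particular chip's implementation.

```python
import numpy as np

def delta_sigma_1bit(x):
    """First-order delta-sigma modulator: input in (-1, 1) -> two-level stream of +/-1."""
    acc, y = 0.0, 0.0
    out = np.empty_like(x)
    for i, s in enumerate(x):
        acc += s - y                      # integrate the error against the previous output
        y = 1.0 if acc >= 0 else -1.0     # 1-bit quantizer
        out[i] = y
    return out

fs = 64 * 44100                           # DSD64-like oversampling rate
t = np.arange(28224) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)    # 1 kHz tone at half scale

bits = delta_sigma_1bit(x)                # the DSD-like two-level stream

# "Conversion to analog" is just low-pass filtering; a crude moving average suffices here
y = np.convolve(bits, np.ones(256) / 256, mode="same")

err = np.max(np.abs(y[2000:-2000] - x[2000:-2000]))
print(f"max reconstruction error after low-pass filtering: {err:.3f}")
```

The filtered stream tracks the input to within a few percent even with this crude filter; a real DAC uses an analog low-pass instead, but the principle is the same.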
 
Joined
Apr 29, 2020
Messages
266
Likes
332
Location
USA
I expect I'll get a flogging here, but I'd like to understand where my logic is wrong, so please bear with me if it's all been said before. The hair cells in your ears respond to movement in the air. Movement makes me think of rate of change; in a sine wave the rate of change is proportional to voltage and frequency. If my hair cells respond to 1 V at 10 kHz, why don't they respond to 0.33 V at 30 kHz (i.e. the same rate of change)? I'm not saying that at 30 kHz they can accelerate, decelerate and actually track the 30 kHz; I'm just asking whether a high frequency transient will have an influence on the perceived sound, and if not, why not?
All good - I was just trying to join the dots that the change in voltage at the speaker terminals causes the speaker cone to move which causes the change in air pressure which causes the ear drum and other bits to move which causes the hair cells to move and then understand where this stops being a causal linear(ish) chain.
As @Kal Rubinson indicated, the basilar membrane of the cochlea in mammalian ears performs something like a Fourier (frequency-based) decomposition of the sound, whether the cochlea is excited by air conduction or bone conduction. The basilar membrane resonates locally at different locations along its coiled length in response to tones of different frequencies, ranging from around 20 kHz at its base to around 20 Hz at its apex in human ears. This stimulates the hair cells at that location to release synaptic neurotransmitters that in turn stimulate the cochlear nerve fibers at that location to fire, which the brain perceives as a tone of the corresponding pitch. Check out the linear differential equation representation of a simple driven damped harmonic oscillator in the Wikipedia page on "Resonance". If sound tone amplitudes are within normal hearing levels, the vibration response of the basilar membrane may be linear(-ish) with respect to the tone amplitude at any fixed frequency. However, at any fixed location along the membrane, the response is nonlinear with respect to the tone frequency, due to the resonance. As there is no basilar membrane location with corresponding hair cells that resonates (above the threshold needed for the neurons to fire) in response to tones of frequency greater than about 20 kHz in humans, this mechanism cannot sense tones of greater frequency (ultrasonic). Incidentally, this is why the Fletcher-Munson iso-loudness-level curves in the SPL vs frequency plane are depicted as rising vertically at around 20 kHz: no matter how high the SPL of an ultrasonic tone, you are not going to sense it by this mechanism. The ultrasonic tones get partially reflected off the eardrum, and the rest propagate through the middle and inner ears and other tissue and eventually die out as their energy dissipates as heat.

The basilar membrane and the hair cells respond to oscillating pressures exerted by external air on the eardrum. Further into the ear after the eardrum, the pressure waves are transmitted through bone, cochlear fluids and membranes at the speed of sound of the specific material. The bones in the middle ear serve mainly as a piston arrangement between two cylinders of unequal area to provide hydraulic pressure amplitude gain. The entire system, including the air carrying the sound to the ear, is mechanical in nature. The solids and fluids involved all have properties of density (and thus mass) and "viscosity" (and thus friction). The time derivative of the pressure oscillation at any particular point is not particularly significant. The firing of the neurons at any particular location is a response to the amplitude of the resonant oscillation of the basilar membrane at that location. The amplitude or amount of peak motion follows of course from Newton's Second Law of Motion. Basically it depends on the magnitude and duration of the impressed net force, the inertial mass and the damping. No macroscopic object with positive rest mass moves instantaneously by a finite non-zero amount. When a force is applied (in this case a pressure difference of fixed SPL over a small area, together with a motion-resisting frictional shear force), that lump of matter with fixed mass is accelerated, its speed increases and it suffers a displacement. The speed increases linearly with the time duration that the force is applied for, and the displacement increases as the square of the duration. Then the force reverses, and the lump begins to displace in the opposite direction. So if the SPL is kept fixed, a doubling of the tone frequency halves the duration of each half period and quarters the amplitude of the displacement. Thus the displacement of the basilar membrane (as well as of all the other parts) drops off rapidly (quadratically, not linearly) with rising tone frequency.
Further, the magnitude of the frictional damping force rises linearly with frequency and causes additional attenuation of the displacement amplitude for increasing tone frequency. The time derivative of pressure plays little part in these primary effects. A nearly square wave with alternating sign, which has a high derivative during its rise and fall, will produce a displacement amplitude somewhat larger than would a sine wave of the same peak amplitude (and produce a complex response, equivalent to multiple simultaneous frequencies). However, a short alternating-sign pulse of the same peak amplitude and frequency (but with each pulse lasting only a small fraction of the half-period of the wave), and having a derivative similar to the nearly square wave, will produce a peak displacement less than does the sine wave. The average force applied during a half period is greater than that of the sine wave in one case and less in the other case. The square-like and pulse-like waves will still be perceived as primarily having the same frequency as the sine wave, but accompanied by some undertones and overtones of lesser amplitude. The increasing attenuation of the displacement amplitude with rising tone frequency would certainly limit how far into the ultrasonic we could sense, but the lack of resonance of the basilar membrane of high enough amplitude to trigger neuronal firing for frequencies above around 20 kHz (in young children with undamaged hearing) causes a sharp cut-off in what we can hear, so we need not actually bother calculating the attenuation at 96, 192 or 384 kHz.
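The quadratic fall-off above resonance follows directly from the standard driven damped oscillator amplitude formula. A quick numerical check (the mass, damping and stiffness values are toy numbers of my own, not real cochlear parameters):

```python
import numpy as np

# Steady-state displacement amplitude of m x'' + c x' + k x = F0 sin(w t):
#   X(w) = F0 / sqrt((k - m w^2)^2 + (c w)^2)
m, c, F0 = 1.0, 0.5, 1.0
k = (2 * np.pi * 1000.0) ** 2      # resonance near 1 kHz (arbitrary toy value)

def amplitude(f_hz):
    """Displacement amplitude at drive frequency f_hz (Hz)."""
    w = 2 * np.pi * f_hz
    return F0 / np.sqrt((k - m * w**2) ** 2 + (c * w) ** 2)

# Well above resonance, X ~ F0 / (m w^2): doubling the frequency quarters the displacement
ratio = amplitude(20e3) / amplitude(40e3)
print(f"X(20 kHz) / X(40 kHz) = {ratio:.2f}")
```

The printed ratio comes out close to 4, matching the "doubling the frequency quarters the displacement" argument above.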

We must discount the stories of audiophiles enthused by what they hear in "HiRes" audio in "sighted" listening, because of the giant confounding factor of cognitive bias. They have been primed to believe by companies touting DSD, MQA, high-bitrate PCM, who tell a seemingly coherent story and use Kahneman's and Tversky's System 1 to induce belief motivated by consumers' desires for increased music enjoyment and one-upmanship over the Joneses. Calling the tones hypersonic instead of ultrasonic not only does not change the underlying physics, but is an abuse of usage of the term hypersonic, which has had a well-established and different meaning in fluid and solid mechanics for over a century. Spending resources on research into how the brain might sense ultrasonic tones requires justification first by rigorous DBT, designed to exclude aliasing of ultrasonic tones into audible range in the equipment, by credible impartial researchers, proving that the brain can do so. As several folks have from time to time posted on ASR, such effects even if proved to physically exist, will be small and most likely not significantly change your experience of recorded music.
 
Last edited:

March Audio

Major Contributor
Manufacturer
Joined
Mar 1, 2016
Messages
6,051
Likes
8,176
Location
Albany Western Australia
Correct. Simple - the higher the frequency response, the better the transient.
Which is irrelevant, because you can't hear it. You haven't grasped the connection between the rise time of a transient and frequency response.

There have been no credible studies showing perception above the usually discussed limit (around 20 kHz). Some subjects show perception slightly above this, but it requires extremely high SPLs.

So the discussion simply ends there.
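The rise-time/frequency-response connection is a fixed relationship. For a first-order (single-pole) low-pass system the 10-90% rise time is tr = ln(9)/(2*pi*fc), about 0.35/fc. A quick sketch (single-pole model; the example cutoff frequencies are my own):

```python
import math

# Step response of a single-pole low-pass: v(t) = 1 - exp(-t / tau), tau = 1 / (2 pi fc).
# The 10% and 90% crossings give t_r = tau * ln(9) ~= 0.35 / fc.
for fc in (20e3, 40e3, 80e3):
    tau = 1.0 / (2 * math.pi * fc)
    tr = tau * math.log(9)
    print(f"fc = {fc/1e3:4.0f} kHz -> 10-90% rise time = {tr*1e6:5.2f} us")
```

Doubling the bandwidth halves the rise time, which is why "faster transients" is just another way of saying "wider frequency response": there is no extra information in the rise time once the bandwidth is fixed.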
 
Last edited:
Joined
Mar 22, 2019
Messages
35
Likes
43
Location
East Coast
I consider DSD a pox on our audio world.
Not sure why this enmity towards a technology which is part of virtually every ADC and DAC chip, where a DSD bitstream is an intermediate signal created by a delta-sigma modulator and subsequently processed by a low-pass filter (digital in an ADC, analog in a DAC).
 

PaulD

Senior Member
Joined
Jul 22, 2018
Messages
330
Likes
819
Not sure why this enmity towards a technology which is part of virtually every ADC and DAC chip, where a DSD bitstream is an intermediate signal created by a delta-sigma modulator and subsequently processed by a low-pass filter (digital in an ADC, analog in a DAC).
Because it should never exist outside the DAC chip, and because of the complete nonsense that surrounds it in the audiofool press and the associated misinformation around PCM.
 

Blumlein 88

Major Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
10,176
Likes
14,032
Not sure why this enmity towards a technology which is part of virtually every ADC and DAC chip, where a DSD bitstream is an intermediate signal created by a delta-sigma modulator and subsequently processed by a low-pass filter (digital in an ADC, analog in a DAC).
What PaulD said. Plus, it usually isn't one-bit DSD inside the chip. It can be, but usually isn't. It creates a format that is inconvenient to record, edit and distribute. It is an additional format providing no benefits except for the audio phoo that goes with it. The PCM format in and out is what is most useful, and DSD has no business being pulled outside the chip as another format.
 

trl

Major Contributor
King of Mods
Joined
Feb 28, 2018
Messages
1,174
Likes
1,118
Location
Iasi, RO
• S/N ratio of DSD at high volume is bad. At high volumes, you can hear noise. Not good at all!
This happens on the ASUS Essence One MkII too, and is caused by the poor firmware implementation and/or the USB transport and DAC chip itself. With newer chips and implementations, however, there should be no noise introduced by DSD decoding. Anyway, software DSD decoding might be even better, without any losses.
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,056
Likes
927
DSD signal is one bit two level PDM signal. The word level (not value) is important. [...]
As long as it is not a "qubit", a bit represents two states. Digital data are just combinations of bits representing different things, not necessarily audio. When the bits represent audio, they can be PCM or DSD.

Here are some PCM files encoded at 384 kHz, 1-bit (only two amplitude values). Being encoded in 1 bit doesn't change the fact that they are still digital data.
https://www.audiosciencereview.com/...od-for-measuring-distortion.10282/post-334093

DSD involves strong noise shaping, but it is still sampled at a regular interval (e.g. multiples of 2.8224 MHz), and it is strictly digital.
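Both the regular-interval sampling and the noise shaping are easy to see in a spectrum. Here is a sketch using a minimal textbook first-order sigma-delta loop (my own toy code and parameters, not any vendor's modulator): the quantisation noise is pushed above the audio band, where a DAC's low-pass filter removes it.

```python
import numpy as np

def sdm_1bit(x):
    """Minimal first-order sigma-delta loop: regular-rate input -> +/-1 stream."""
    acc, y = 0.0, 0.0
    out = np.empty_like(x)
    for i, s in enumerate(x):
        acc += s - y
        y = 1.0 if acc >= 0 else -1.0
        out[i] = y
    return out

fs, n = 2822400, 1 << 16                  # DSD64 rate: 64 x 44.1 kHz, regular sampling
f0 = 24 * fs / n                          # put the test tone exactly on an FFT bin (~1 kHz)
t = np.arange(n) / fs
bits = sdm_1bit(0.5 * np.sin(2 * np.pi * f0 * t))

spec = np.abs(np.fft.rfft(bits)) ** 2
freqs = np.fft.rfftfreq(n, 1 / fs)
spec[20:29] = 0                           # excise the tone and its immediate skirt
inband = spec[freqs <= 20e3].sum()        # residual noise in the audio band
outband = spec[freqs > 20e3].sum()        # shaped noise, removed by the low-pass filter
print(f"quantisation noise power above 20 kHz vs in audio band: {outband/inband:.0f}x")
```

The out-of-band noise dominates by a large factor, which is the whole point of the noise shaping: a 1-bit stream at a high regular rate can carry a clean audio-band signal.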
 
Last edited:

rkbates

Member
Forum Donor
Joined
Jul 24, 2020
Messages
27
Likes
36
Location
Down Under
As @Kal Rubinson indicated, the basilar membrane of the cochlea in mammalian ears performs something like a Fourier (frequency-based) decomposition of the sound [...]
Thanks for the reply - your explanation really pulls together all the bits and pieces for me (and adds some new ones), so I really appreciate the effort you have put into helping me (and probably a few others) increase our understanding.
 
Joined
Sep 16, 2020
Messages
52
Likes
18
Location
Slovakia
As long as it is not "Qubit", a bit represents two states. [...]
I reacted to a previous post, which questioned the idea that a DSD signal is by nature closer to analog than a PCM signal and called that a misconception. From the point of view of how delta-sigma DACs are implemented, a DSD signal is closer to analog: its conversion to analog is much easier.

I didn't say that DSD is not a digital signal. It is encoded digitally, can be stored on digital media and can be processed digitally without needing an A/D converter, so it is digital.

A PCM signal encoded in 1 bit represents only 2 signal amplitude values; PCM is intended to use the sample values as actual signal amplitudes.

On the other hand, the DSD levels 0 and 1 do not represent signal amplitude; rather, they encode the signal difference. With a 1-bit DSD signal you can encode any number of analog values if the sampling frequency is high enough. So you cannot simply compare 1-bit PCM and DSD signals.
 

rkbates

Member
Forum Donor
Joined
Jul 24, 2020
Messages
27
Likes
36
Location
Down Under
Yes, this nativedsd nonsense reminds me of the article from mansr.
https://troll-audio.com/articles/pcm-and-dsd/ [...]
The Troll Audio explanation showing how bit depth, sampling rate and noise shaping can all be traded off to get audio of the required quality is excellent!
 
Joined
Sep 16, 2020
Messages
52
Likes
18
Location
Slovakia
mansr presents his point of view as the true one and calls the others misconceptions. You can find many discussions on the topic of PCM vs. DSD, and most of them are biased, as is the big font on "Misconceptions". One simply has to look at the same thing from more points of view, and the misconceptions disappear.

mansr wrote:
"If the negative-going pulses are disregarded, there is indeed a resemblance to a PDM signal. This interpretation is, however, not helpful. Any similarity to PDM is a coincidence, not a design intent, and viewing it as such does not aid analysis or understanding of 1-bit audio signals."
...
"As it happens, actual PDM is in fact analogue; it just has nothing to do with DSD. "
...
"The commonly used method of producing a 1-bit noise-shaped encoding of a signal is known as sigma-delta (or delta-sigma) modulation, sometimes abbreviated SDM"

from Wikipedia:

https://en.wikipedia.org/wiki/Pulse-density_modulation
"In a PDM signal, specific amplitude values are not encoded into codewords of pulses of different weight as they would be in pulse-code modulation (PCM); rather, the relative density of the pulses corresponds to the analog signal's amplitude. The output of a 1-bit DAC is the same as the PDM encoding of the signal. "

https://en.wikipedia.org/wiki/Pulse-density_modulation
"PDM is the encoding used in Sony's Super Audio CD (SACD) format, under the name Direct Stream Digital. "

https://en.wikipedia.org/wiki/Direct_Stream_Digital
"DSD uses pulse-density modulation encoding - a technology to store audio signals on digital storage media which are used for the SACD. The signal is stored as delta-sigma modulated digital audio, a sequence of single-bit values at a sampling rate of 2.8224 MHz "

So it is about different points of view, and I would rather consider the Wikipedia one the "standard" view. But I have nothing against another angle of view if it is used to show something interesting.
 
Last edited:

BDWoody

Chief Cat Herder
Moderator
Forum Donor
Joined
Jan 9, 2019
Messages
3,007
Likes
6,226
Location
Mid-Atlantic, USA. (Maryland)
As @Kal Rubinson indicated, the basilar membrane of the cochlea in mammalian ears performs something like a Fourier (frequency-based) decomposition of the sound, whether the cochlea is excited by air conduction or bone conduction. The basilar membrane resonates locally at different locations along its coiled length in response to tones of different frequencies, ranging from around 20 kHz at its base to around 20 Hz at its apex in human ears. This stimulates the hair cells at that location to release synaptic neurotransmitters that in turn stimulate the cochlear nerve fibers at that location to fire, which the brain perceives as a tone of the corresponding pitch. Check out the linear differential equation representation of a simple driven damped harmonic oscillator in the Wikipedia page on "Resonance". If sound tone amplitudes are within the normal hearing levels, the vibration response of the basilar membrane may be linear(-ish) with respect to the tone amplitude at any fixed frequency. However, at any fixed location along the membrane, the response is nonlinear with respect to the tone frequency, due to the resonance. As there is no basilar membrane location and corresponding hair cells that resonates (above the threshold needed to the neurons to fire) in response to tones of frequency greater than about 20 kHz in humans, this mechanism cannot sense tones of greater frequency (ultrasonic). Incidentally, this is why the Fletcher-Munson iso-loudness-level curves in the SPL vs frequency plane are depicted as rising vertically at around 20 kHz: no matter how high the SPL of an ultrasonic tone you are not going to sense it by this mechanism. The ultrasonic tones get partially reflected off the eardrum, and the rest propagate through the middle and inner ears and other tissue and eventually die out as their energy dissipates as heat.

The basilar membrane and the hair cells respond to oscillating pressures exerted by external air on the eardrum. Further into the ear after the eardrum, the pressure waves are transmitted through bone, cochlear fluids and membranes at the speed of sound of the specific material. The bones in the middle ear serve mainly as a piston arrangement between two cylinders of unequal area to provide hydraulic pressure amplitude gain. The entire system including the air carrying the sound to the ear is mechanical in nature. The solids and fluids involved all have properties of density (and thus mass) and "viscosity" (and thus friction). The time derivative of the pressure oscillation at any particular point is not particularly significant. The firing of the neurons at any particular location is a response to the amplitude of the resonant oscillation of the basilar member at that location. The amplitude or amount of peak motion follows of course from Newton's Second Law of Motion. Basically it depends on the magnitude and duration of the impressed net force, the inertial mass and the damping. No macroscopic object with positive rest mass moves instantaneously by a finite non-zero amount. When a force is applied (in this case a pressure difference of fixed SPL over a small area, together with a motion-resisting frictional shear force), that lump of matter with fixed mass is accelerated, its speed increases and it suffers a displacement. The speed increases linearly with the time duration that the force is applied for and the displacement increases as the square of the duration. Then the force reverses, and the lump begins to displace in the opposite direction. So if the SPL is kept fixed, a doubling of the tone frequency halves the duration of each half period and quarters the amplitude of the displacement. Thus the displacement of the basilar membrane (as well as of all the other parts) drops off rapidly (quadratically, not linearly) with rising tone frequency. 
Further, the magnitude of the frictional damping force rises linearly with frequency, causing additional attenuation of the displacement amplitude as the tone frequency rises. The time derivative of pressure plays little part in these primary effects. A nearly square wave of alternating sign, which has a high derivative during its rise and fall, will produce a displacement amplitude somewhat larger than a sine wave of the same peak amplitude (and a complex response, equivalent to multiple simultaneous frequencies). However, a short alternating-sign pulse of the same peak amplitude and frequency (with each pulse lasting only a small fraction of the half period of the wave), having a derivative similar to that of the near-square wave, will produce a peak displacement smaller than the sine wave's: the average force applied during a half period is greater than the sine wave's in one case and less in the other. The square-like and pulse-like waves will still be perceived as primarily having the same frequency as the sine wave, accompanied by some undertones and overtones of lesser amplitude. The increasing attenuation of displacement amplitude with rising tone frequency would by itself limit how far into the ultrasonic we could sense, but the absence of any basilar-membrane resonance of high enough amplitude to trigger neuronal firing above around 20 kHz (in young children with undamaged hearing) imposes a sharp cut-off on what we can hear, so we need not actually bother calculating the attenuation at 96, 192 or 384 kHz.
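The quadratic drop-off can be checked with a toy calculation (ignoring the restoring and damping forces; the force-per-mass value is an arbitrary placeholder): a mass starting at rest under a constant force applied for one half period T/2 is displaced by x = ½(F/m)(T/2)², so doubling the frequency quarters the displacement:

```python
def half_period_displacement(freq, force_per_mass=1.0):
    """Displacement of a mass starting at rest under a constant force
    F/m applied for one half period T/2 = 1/(2*freq):
    x = 0.5 * (F/m) * (T/2)**2."""
    half_t = 0.5 / freq
    return 0.5 * force_per_mass * half_t**2

d_1k = half_period_displacement(1000.0)
d_2k = half_period_displacement(2000.0)
print(d_2k / d_1k)  # 0.25: doubling the frequency quarters the displacement
```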

We must discount the stories of audiophiles enthused by what they hear from "HiRes" audio in sighted listening, because of the giant confounding factor of cognitive bias. They have been primed to believe by companies touting DSD, MQA and high-bitrate PCM, which tell a seemingly coherent story and exploit Kahneman and Tversky's System 1 to induce belief, motivated by consumers' desire for increased music enjoyment and one-upmanship over the Joneses. Calling the tones "hypersonic" instead of ultrasonic not only fails to change the underlying physics, it abuses the term hypersonic, which has had a well-established and different meaning in fluid and solid mechanics for over a century. Spending resources on research into how the brain might sense ultrasonic tones first requires justification by rigorous DBT, designed to exclude aliasing of ultrasonic tones into the audible range in the equipment, conducted by credible impartial researchers, demonstrating that the brain can do so. As several folks have posted on ASR from time to time, such effects, even if proved to physically exist, will be small and most likely will not significantly change your experience of recorded music.
What an excellent post.

The 'casual' knowledge on this forum is pretty amazing.

Cheers.
 

bennetng
So the DSD signal levels do not represent digits. Therefore it is a bit hard to call them digital.
I was replying to this claim.

I didn't say that DSD is not a digital signal. It is encoded digitally, can be stored on digital media, and can be processed digitally without the need for an A/D converter, so it is digital.
OK then.
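For what it's worth, the "discrete in time and amplitude" point is easy to demonstrate with a toy first-order delta-sigma modulator (a crude sketch of the kind of modulator behind DSD/PDM, not any vendor's actual implementation): it turns a signal into a stream of ±1 samples whose local pulse density tracks the signal level.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: encode values in [-1, 1] as a
    stream of +/-1 bits. The output is discrete in time (one bit per
    input sample) and in amplitude (two levels), i.e. digital."""
    integ = 0.0      # integrator state (accumulated error)
    feedback = 0.0   # previous output bit fed back
    bits = []
    for x in samples:
        integ += x - feedback
        bit = 1.0 if integ >= 0.0 else -1.0
        bits.append(bit)
        feedback = bit
    return bits

# A constant 0.25 input yields +1 bits 62.5% of the time and -1 bits
# 37.5% of the time: the *density* of pulses carries the level, but
# every sample is still just one bit.
stream = delta_sigma_1bit([0.25] * 1000)
print(sum(stream) / len(stream))  # ≈ 0.25
```

The "there are no samples, there is no code" marketing line falls apart here: the stream above is nothing but samples, each carrying one bit of code.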
 