Does DSD sound better than PCM?

Status
Not open for further replies.

John Deas

Member
Joined
Apr 24, 2019
Messages
34
Likes
3
The paper is here.

The relevant passage is:

"In some musical instruments, acoustic pressure builds up extremely fast during onset transients reaching tens of dBs within a few microseconds. For example, transient onset of xylophone shows waveforms with rise time of less than 10 μs with instantaneous peak output reaching 126dB SPL. Trumpet playing forte can register 120-130dB peak SPL with steep rise of the waveform within only 10μs to full signal level. Snare drum reaches 130dB and cymbals 136dB peak SPL within microseconds [9]. Rogowski's values were measured during a short musical selection captured using 1/4 inch, Brüel & Kjaer 4135 microphone and a 192kHz, 12bit A/D conversion. Can CD-rate sampling of audio every 22.7μs register the full waveform detail of the sound of these instruments? Based on these onset requirements, to achieve a transparent recording medium, one should sample audio with less than 1μs between samples to accurately capture steep waveform changes. One revealing transient test of recording system is to use the sound of dangling keys or striking wine glasses as a source. Our familiarity with these sounds is frequently refreshed and can be used to inform us of problems with the transparency of the recording system. A brutal case of poor transparency occurred in early consumer CD players that used single D/A converter working with both channels in multiplex. Noticeable directional shifts were sometimes heard on transients when left and right channel converted the same onset in a sequence of samples."

There's some confused information in that. Most importantly, the author seems to forget the relationship between risetime and frequency.

If an acoustic source's output has a risetime of under 22.7us, that is simply because it contains frequency content above 22.05KHz (ie the maximum frequency that redbook can accurately capture, and a bit above the maximum frequency that most humans can hear).

So the higher frequency content is present, and measurable (both in the frequency domain and the time domain, which is the domain that the above passage focuses on) - but it is not audible. The highest audible content that is present simply can't have a risetime of less than around 23us (or more, depending on the hearing ability of the person listening, of course).

In other words, redbook cannot capture this instrument in its fullness, but it can capture every component of the instrument that is audible to humans.
Cheers for that, interesting stuff, it's something I want to investigate further....
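The risetime/frequency relationship in the quoted explanation can be put into rough numbers. The sketch below uses the common rule of thumb for a single-pole low-pass response (10-90% risetime ≈ 0.35 / bandwidth). That rule and the function names are assumptions chosen for illustration, not something taken from the paper or the posts above, and the exact figures shift with how risetime is defined. The point is only that a shorter risetime implies a proportionally wider bandwidth.

```python
# Rule-of-thumb link between 10-90% risetime and bandwidth for a
# single-pole low-pass response. The 0.35 constant is an assumption
# tied to that filter shape; other definitions give other constants.
RISETIME_BANDWIDTH_PRODUCT = 0.35


def bandwidth_for_risetime(rise_time_s):
    """Approximate bandwidth (Hz) needed to pass a given risetime (s)."""
    return RISETIME_BANDWIDTH_PRODUCT / rise_time_s


def risetime_for_bandwidth(bandwidth_hz):
    """Approximate fastest risetime (s) a given bandwidth (Hz) can carry."""
    return RISETIME_BANDWIDTH_PRODUCT / bandwidth_hz


if __name__ == "__main__":
    for rise_us in (22.7, 10.0, 1.0):
        bw_khz = bandwidth_for_risetime(rise_us * 1e-6) / 1e3
        print(f"risetime {rise_us:5.1f} us -> roughly {bw_khz:6.1f} kHz of bandwidth")
    audible_us = risetime_for_bandwidth(20e3) * 1e6
    print(f"20 kHz of bandwidth -> risetime of roughly {audible_us:.1f} us")
```

Under this rule, a 1 us risetime corresponds to content in the hundreds of kHz, far outside the audible band.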
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,631
Likes
10,205
Location
North-East
Thanks, yep seen it a long time ago and it's a great explanation but sorry I just don't get the timing explanation he gives and cannot relate this in any way to music capture with the complexities of varying waveform. The paper I referred to states microseconds between samples that in the author's view will lead to loss of very(!) fast transients in the music?

Here's a post by @mansr that covers the math that explains why time resolution is not an issue: https://troll-audio.com/articles/time-resolution-of-digital-audio/
 

dc655321

Major Contributor
Joined
Mar 4, 2018
Messages
1,597
Likes
2,235
Here's a post by @mansr that covers the math that explains why time resolution is not an issue: https://troll-audio.com/articles/time-resolution-of-digital-audio/

A thousand times, yes.

If anyone is curious (and you should be), a -6dB RBCD (Red Book CD: 16-bit, 44.1kHz) signal has a time resolution that looks like this (based on @mansr's derivation):

Figure_1.png



Still under 1 microsecond at 10Hz, so sampling at better than 1MHz is not required, contrary to what the paper (above) suggests.
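For anyone who wants to reproduce numbers of this kind, here is a minimal sketch. It assumes a simple slope-based estimate: a time shift becomes detectable once it changes a sample of the sine by at least one quantization step (1 LSB) at the point of steepest slope. The formula and the function name are assumptions for illustration and may differ in detail from the derivation in the linked article.

```python
import math


def time_resolution_s(freq_hz, level_dbfs=-6.0, bits=16):
    """Smallest time shift (s) that moves a sine by one LSB at its steepest point.

    Assumed model: amplitude is expressed in LSBs, the steepest slope of
    A*sin(2*pi*f*t) is 2*pi*f*A, and a shift is "resolved" when it changes
    the sampled value by one LSB.
    """
    full_scale_lsb = 2 ** (bits - 1)              # 32768 for 16-bit audio
    amplitude_lsb = full_scale_lsb * 10 ** (level_dbfs / 20)
    max_slope_lsb_per_s = 2 * math.pi * freq_hz * amplitude_lsb
    return 1.0 / max_slope_lsb_per_s


if __name__ == "__main__":
    for f in (10, 100, 1000, 10000, 20000):
        print(f"{f:6d} Hz: {time_resolution_s(f) * 1e9:10.3f} ns")
```

With these assumptions the estimate at 10 Hz comes out just under 1 microsecond, and it shrinks in proportion as the frequency rises.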
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,522
Likes
37,050
Yep totally agree but I'm pretty sure she (Cookie Marenco) having been in the business for so long would have heard PCM through any number of pathways/DAC's and still favours DSD.
Yet, like any human, if she has a bias against anything vs DSD, no matter how she came to hold that opinion, she'll have that in her mind whenever she hears other pathways or formats. You would have to assume she heard a real difference. And that if she heard another format where that difference had been solved she'd know it. I think that is not the case in regards to DSD vs PCM. We have lots of reasons to think that she is mistaken despite her long experience. I too think mainly she might hear and like the sound of using analog tape and processing. She could commit the final result to digital in PCM and lose nothing in the sound, but she doesn't believe that.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,522
Likes
37,050
Thanks, yep seen it a long time ago and it's a great explanation but sorry I just don't get the timing explanation he gives and cannot relate this in any way to music capture with the complexities of varying waveform. The paper I referred to states microseconds between samples that in the author's view will lead to loss of very(!) fast transients in the music?

I'm going to restate what was in the video in slightly different terms.
Let us imagine we are sampling at 44,100 times per second. 22.7 microseconds between samples. Let us imagine the slightly unrealistic waveform that is at zero for the first sample and is going up linearly by 1 millivolt per microsecond.

First sample is zero, second sample is 22.7 millivolts, third sample is 45.4 millivolts, and fourth sample is 68.1 millivolts.

Now let us move the start of the waveform 2 microseconds after the first sample.

First sample is zero, second sample is 20.7 millivolts, third sample is 43.4 millivolts, and fourth sample is 66.1 millivolts. Even though the waveform started later by less than 1/10th of the time between samples, you can differentiate those two situations by the samples that follow.

Remember in the video he says as long as properly bandwidth limited, there is only one set of samples that fit a given waveform. So despite what you might think, and skipping over intricacies of how digital sampling works (but it does indeed work), that waveform would be sampled and reproduced so that the analog output would actually start that linear increase in between two sample points. And it would be in between them by the correct time of 2 microseconds. It might be hard to visualize this would work with multiple waveforms or complex waveforms of musical instruments, but it does indeed work as it should. So 1 microsecond timing is easy even at a 44,100 Hz sample rate. You can do more than a thousand times better as mansr's article explains.
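The ramp example can be checked with a few lines of arithmetic. This is only a sketch of the simplified thought experiment: it assumes the waveform is known to be a straight ramp of 1 mV per microsecond, so the onset can be recovered by extrapolating two samples back to zero. Real reconstruction relies on band-limiting rather than on knowing the shape in advance, and the function names here are made up for illustration.

```python
SAMPLE_INTERVAL_US = 22.7   # approximate spacing of samples at 44,100 Hz
SLOPE_MV_PER_US = 1.0       # ramp steepness used in the thought experiment


def sample_ramp(onset_us, n_samples=4):
    """Sample a ramp that is zero until onset_us, then rises linearly."""
    samples = []
    for n in range(n_samples):
        t_us = n * SAMPLE_INTERVAL_US
        samples.append(max(0.0, (t_us - onset_us) * SLOPE_MV_PER_US))
    return samples


def estimate_onset_us(samples):
    """Extrapolate the line through samples 1 and 2 back to zero volts."""
    v1, v2 = samples[1], samples[2]
    slope = (v2 - v1) / SAMPLE_INTERVAL_US      # mV per microsecond
    return 1 * SAMPLE_INTERVAL_US - v1 / slope  # time at which the line hits 0


if __name__ == "__main__":
    for onset in (0.0, 2.0):
        samples = sample_ramp(onset)
        rounded = [round(v, 1) for v in samples]
        print(f"onset {onset} us: samples {rounded} mV, "
              f"estimated onset {estimate_onset_us(samples):.2f} us")
```

The two sets of samples differ even though the onsets are only 2 microseconds apart, and the onset estimate lands between the sample instants.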
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,198
Likes
16,981
Location
Riverview FL
speed of sound = ~104,546mm/s

1 microsecond = 104546/1000000 = 0.104546mm

Better to not breathe or let your heart beat while listening, it will upset the timing of the transients...
 

danadam

Addicted to Fun and Learning
Joined
Jan 20, 2017
Messages
956
Likes
1,496
The paper is here.
That's an invalid link.

For example, transient onset of xylophone shows waveforms with rise time of less than 10 μs with instantaneous peak output reaching 126dB SPL. Trumpet playing forte can register 120-130dB peak SPL with steep rise of the waveform within only 10μs to full signal level. Snare drum reaches 130dB and cymbals 136dB peak SPL within microseconds

Does anyone have a pointer to a hi-res recording of such examples?

If an acoustic source's output has a risetime of under 22.7us, that is simply because it contains frequency content above 22.05KHz

Couldn't agree more (although it should be lowercase "k" :) )

I'm going to restate what was in the video in slightly different terms.

To illustrate, 2 channel file with 441 Hz tone that has the second channel delayed by 1/3 of a sample (see delay.flac.zip):
441Hz.png 441Hz.zoom.png
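A rough way to reproduce this kind of test without the attachment: the sketch below generates two 16-bit channels of a 441 Hz tone at 44.1 kHz, delays the second by 1/3 of a sample, and recovers the delay from the phase difference at 441 Hz. The amplitude, the number of samples, and the function names are assumptions for illustration, not details of the attached file.

```python
import cmath
import math

FS = 44100            # sample rate in Hz
FREQ = 441            # tone frequency in Hz (exactly 100 samples per period)
AMPLITUDE = 16384     # about half of 16-bit full scale (assumed level)
N_SAMPLES = 1000      # ten whole periods, so the single-bin sum is leak-free
OMEGA = 2 * math.pi * FREQ / FS   # radians per sample


def tone(delay_samples):
    """16-bit quantized sine, delayed by a (possibly fractional) sample count."""
    return [round(AMPLITUDE * math.sin(OMEGA * (n - delay_samples)))
            for n in range(N_SAMPLES)]


def bin_at_tone(x):
    """Single-frequency DFT of x at the tone frequency."""
    return sum(v * cmath.exp(-1j * OMEGA * n) for n, v in enumerate(x))


def estimate_delay_samples(reference, delayed):
    """Delay of `delayed` relative to `reference`, from their phase difference."""
    phase_diff = cmath.phase(bin_at_tone(reference) * bin_at_tone(delayed).conjugate())
    return phase_diff / OMEGA


if __name__ == "__main__":
    left = tone(0.0)
    right = tone(1.0 / 3.0)
    delay = estimate_delay_samples(left, right)
    print(f"estimated delay: {delay:.4f} samples = {delay / FS * 1e6:.3f} us")
```

The quantized integer samples still carry the sub-sample delay; nothing has to line up with the sample instants for it to be preserved.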
 

Attachments

  • delay.flac.zip
    213.6 KB · Views: 105

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,522
Likes
37,050
speed of sound = ~104,546mm/s

1 microsecond = 104546/1000000 = 0.104546mm

Better to not breathe or let your heart beat while listening, it will upset the timing of the transients...
I thought all serious audiophiles trained like yogis so that they can slow their heart rate and breathing to nearly nothing while they are listening to music.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,522
Likes
37,050
Does anyone have a pointer to a hi-res recording of such examples?
http://www.hometheaterhifi.com/images/stories/audio/cymbal-samples/cymbal-reviews-index.html

You can download samples of cymbals recorded here at 176 kHz sample rates using a wide bandwidth Earthworks microphone. Some have output at significant levels out to 60 kHz.

I don't think the zero to 130 dB in microseconds is true. For starters it would need to be several hundred kHz for this to happen so we'd not hear it. For another there is no evidence of it in these recordings. Here is one example, but they are all like this. Notice though struck sharply with a stick, there are several cycles for the level to build. Makes sense. You are hitting a piece of metal, the energy travels across it and then reflects back building a resonance. Though faster than air, metal has a finite speed of sound. So it takes more than microseconds for it to travel several inches across a metal cymbal and reflect back and forth a few times to build up to maximum level before a long slow decay. The imagined super steep transients of a harshly struck cymbal are much slower in measured terms than the picture in audiophile heads.

1556573533577.png
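A back-of-envelope version of the travel-time argument, as a sketch only. The wave speeds below are assumed round figures for compressional waves in bronze-like alloys, not measured values for any particular cymbal, and the bending waves that actually dominate a cymbal's vibration are slower still, so these crossing times are on the optimistic side. The distances are also just examples of "several inches".

```python
INCH_IN_METRES = 0.0254

# Assumed, illustrative wave speeds in m/s. Real values depend on the
# alloy and on the type of wave; treat these as placeholders.
ASSUMED_SPEEDS_M_S = (3000.0, 5000.0)


def crossing_time_us(distance_inches, speed_m_s):
    """Time in microseconds for a wave to travel the given distance once."""
    return distance_inches * INCH_IN_METRES / speed_m_s * 1e6


if __name__ == "__main__":
    for inches in (4, 8):
        for speed in ASSUMED_SPEEDS_M_S:
            t = crossing_time_us(inches, speed)
            print(f"{inches} in at {speed:.0f} m/s: {t:5.1f} us per crossing")
```

Even a single crossing takes tens of microseconds under these assumptions, before any back-and-forth reflections are counted.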


 

captain paranoia

Active Member
Joined
Feb 9, 2018
Messages
293
Likes
218
I thought all serious audiophiles trained like yogis so that they can slow their heart rate and breathing to nearly nothing while they are listening to music

Well, I don't think I move very far in one microsecond.

I also can't slow down the speed of sound in dry air from 343m/s to 105m/s...
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,198
Likes
16,981
Location
Riverview FL
I also can't slow down the speed of sound in dry air from 343m/s to 105m/s...

That's a good point.

Speed of sound = 343m/s = 343000mm/s = .343mm per microsecond.

I can ask Nurse Ratched to adjust my restraints now.
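The corrected conversion as a small sketch, using the 343 m/s figure quoted above. The function name and the choice of example times are assumptions for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0   # dry air, figure quoted in the thread


def distance_travelled_mm(time_us):
    """Distance sound travels in air, in millimetres, during time_us microseconds."""
    return SPEED_OF_SOUND_M_S * 1000.0 * time_us * 1e-6


if __name__ == "__main__":
    for t_us in (1.0, 10.0, 22.7):
        print(f"{t_us:5.1f} us -> {distance_travelled_mm(t_us):.3f} mm")
```

So a 1 microsecond timing difference corresponds to moving a microphone or an ear by about a third of a millimetre.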
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
That's an invalid link.

Sorry, I realise now that I had already logged into my AES account and hence loaded the paper through a closed paywall.

This is the link that @John Deas posted earlier, but I think you need to pay to access it unfortunately.

FWIW, there is no data or graphs etc in the article for the claims in the passage I quoted in post #660. There is a footnote referencing an article called “Specific Hearing Loss in Young Percussion and Brass Wind Players Due to Music Noise Exposures” by Rogowski, Rakowski, and Jaroszewski. I can't find this article online though.

Anyway, 10us doesn't seem completely impossible to me. That would put the frequency at about 45kHz+, which could be captured with certain mics and obviously more or less any decent ADC. In the case of a trumpet, for example, the instrument tends to produce harmonics with similar or greater intensities than the fundamental many, many octaves higher. Unfortunately I can't find a spectrum of a trumpet playing any of its highest notes, but this web page shows recorded spectra of trumpeters playing middle and low Cs. Here is an example of one such spectrum:

1556616978005.png


There is more intensity in H16 than in the fundamental. With a string of extremely high-order harmonics like that, it's conceivable that the instrument could be made to produce significant content well above 20kHz, I think.
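To get a feel for how high the harmonic series would have to run, here is a small sketch. It assumes ideal integer harmonics and equal-tempered concert pitches (C4 ≈ 261.63 Hz, C6 ≈ 1046.50 Hz). Whether the "middle C" on that web page is concert or written pitch is not something this example knows, so treat the fundamentals as placeholders.

```python
import math


def harmonic_hz(fundamental_hz, n):
    """Frequency of the n-th harmonic (n=1 is the fundamental)."""
    return fundamental_hz * n


def first_harmonic_above(fundamental_hz, limit_hz=20000.0):
    """Lowest harmonic number whose frequency exceeds limit_hz."""
    return math.floor(limit_hz / fundamental_hz) + 1


if __name__ == "__main__":
    for name, f0 in (("C4", 261.63), ("C6", 1046.50)):
        n = first_harmonic_above(f0)
        print(f"{name}: H16 = {harmonic_hz(f0, 16):.0f} Hz, "
              f"first harmonic above 20 kHz is H{n} at {harmonic_hz(f0, n):.0f} Hz")
```

For a low or middle note the series has to stay strong for dozens of harmonics to reach 20 kHz, while a high note gets there after far fewer.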

And although the author quotes pressure levels like 125-130dB, there is no mention of the distances at which these were measured. No doubt it was extremely close-range.

Still, I agree with @Blumlein 88 that this would not be something that happens under normal circumstances. And it would not be an audible part of the instrument's output under any circumstances, of course.
 

John Deas

Member
Joined
Apr 24, 2019
Messages
34
Likes
3
Thanks for all the recent posts - I haven't replied because I'm digesting all the info' provided....:)
 

Miska

Addicted to Fun and Learning
Audio Company
Joined
Feb 20, 2019
Messages
615
Likes
448
It depends how you define “better”.

Certainly, digital mixes using 32 bit floating point processing (or even 24 bit for that matter) and then bounced using appropriate (readily available) dither and anti-aliasing will introduce less noise and distortion than a mix that has passed through an analogue console and effects units.

With some amount of digital shuffling, 32-bit floating point quickly loses its precision. Not to even mention fixed point approaches used by hardware. Also many of the algorithms used in the digital audio gear and DAWs are far from great, cutting many corners for cost/resources or just being a bad implementation.
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
With some amount of digital shuffling, 32-bit floating point quickly loses its precision.

Are you suggesting that 32-bit floating creates levels of noise and/or distortion that even approach that of (even the best) analogue mixing consoles/effects units?

Also many of the algorithms used in the digital audio gear and DAWs are far from great, cutting many corners for cost/resources or just being a bad implementation.

That's true. There are examples of better and worse implementations of both digital and analogue effects units. But I'd be surprised if you were trying to argue that a well-implemented digital effects processor introduces comparable amounts of noise and distortion to an equivalent (also well-implemented) analogue effects unit. Is this what you're suggesting?
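One narrow way to put a number on the 32-bit float question is to push a test tone through a chain of gain changes, rounding to float32 after every stage, and compare against the same chain in double precision. This is a sketch under stated assumptions: the 997 Hz tone, the gain value, and 100 stages are arbitrary choices, and the result says nothing about poorly implemented algorithms, fixed-point hardware, or any analogue unit.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def error_after_gain_stages_db(n_stages=100, n_samples=1000):
    """Accumulated float32 rounding error, in dB relative to the signal."""
    signal = [0.5 * math.sin(2 * math.pi * 997 * n / 44100) for n in range(n_samples)]
    reference = list(signal)                 # processed in double precision
    single = [to_f32(v) for v in signal]     # rounded to float32 at every stage
    for stage in range(n_stages):
        gain = 1.2345 if stage % 2 == 0 else 1.0 / 1.2345
        reference = [v * gain for v in reference]
        single = [to_f32(v * gain) for v in single]
    error_power = sum((a - b) ** 2 for a, b in zip(single, reference))
    signal_power = sum(b ** 2 for b in reference)
    # Guard against log10(0) in the unlikely case of zero accumulated error.
    return 10 * math.log10(max(error_power, 1e-300) / signal_power)


if __name__ == "__main__":
    print(f"error after 100 float32 gain stages: {error_after_gain_stages_db():.1f} dB")
```

Each float32 rounding contributes a relative error of at most about 2^-24, so even the worst case after 100 such stages stays below -100 dB relative to the signal.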
 

Miska

Addicted to Fun and Learning
Audio Company
Joined
Feb 20, 2019
Messages
615
Likes
448
http://www.hometheaterhifi.com/images/stories/audio/cymbal-samples/cymbal-reviews-index.html

You can download samples of cymbals recorded here at 176 kHz sample rates using a wide bandwidth Earthworks microphone. Some have output at significant levels out to 60 kHz.

I don't think the zero to 130 dB in microseconds is true. For starters it would need to be several hundred kHz for this to happen so we'd not hear it. For another there is no evidence of it in these recordings. Here is one example, but they are all like this. Notice though struck sharply with a stick, there are several cycles for the level to build. Makes sense. You are hitting a piece of metal, the energy travels across it and then reflects back building a resonance. Though faster than air, metal has a finite speed of sound. So it takes more than microseconds for it to travel several inches across a metal cymbal and reflect back and forth a few times to build up to maximum level before a long slow decay. The imagined super steep transients of a harshly struck cymbal are much slower in measured terms than the picture in audiophile heads.

View attachment 25483


Earthworks is specced up to 30 kHz?

This is one example of fairly fast rise time:
http://www.cco.caltech.edu/~boyk/spectra/11.htm#b

I did reproduce/record quite quick transients myself with claves (metal tube model), wood block, castanets and soprano glockenspiel for example. I'm pretty sure a lot of the rise time limitations come from microphones. Too bad I don't have the 100 kHz Sanken microphone...
 

Miska

Addicted to Fun and Learning
Audio Company
Joined
Feb 20, 2019
Messages
615
Likes
448
Are you suggesting that 32-bit floating creates levels of noise and/or distortion that even approach that of (even the best) analogue mixing consoles/effects units?

Depends on what you are doing. You cannot compare the two directly, since the error processes are different.

That's true. There are examples of better and worse implementations of both digital and analogue effects units. But I'd be surprised if you were trying to argue that a well-implemented digital effects processor introduces comparable amounts of noise and distortion to an equivalent (also well-implemented) analogue effects unit. Is this what you're suggesting?

My point is that theory and practical real world don't always meet so nicely.

The topic covers everything used in the production workflow, not just a single effects processor. Primarily it is analog vs digital mixing desk, the latter typically associated with a DAW in the signal path, while in the former case the DAW may run only the A/D/A and desk automation. There is also the question of whether you use a digital desk's built-in A/D/A converters or external ones with an analog desk.

Let's say we compare the Neve 88D vs the Neve 88RS. I'm not going to announce an outright winner for a typical studio workflow. The former seems to run an Analog Devices SHARC DSP, because it says 40-bit floating point.

Or compared to some high voltage desks like the 5088. Which setup has best mic-pre's, best sounding compressors, etc?
 

andreasmaaan

Master Contributor
Forum Donor
Joined
Jun 19, 2018
Messages
6,652
Likes
9,399
Depends on what you are doing. You cannot compare the two directly, since the error processes are different.

No, the error processes are not the same thing as noise and distortion, which are outcomes.

Are you suggesting that the level of noise and distortion (relative to the signal) introduced by 32-bit floating are comparable in level to the noise and distortion (relative to the signal) introduced by the best analogue consoles and units?

Let's say we compare the Neve 88D vs the Neve 88RS. I'm not going to announce an outright winner for a typical studio workflow. The former seems to run an Analog Devices SHARC DSP, because it says 40-bit floating point.

Or compared to some high voltage desks like the 5088. Which setup has best mic-pre's, best sounding compressors, etc?

This isn't a discussion about which sounds best. I've stated already that I'm agnostic on that question. The discussion here is about which is higher fidelity.

PS. I certainly agree that there are valid reasons one might choose an analogue console or effects units to mix with :) Fidelity just isn't one of them.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,198
Likes
16,981
Location
Riverview FL
Earthworks is specced up to 30 kHz?

This is one example of fairly fast rise time:
http://www.cco.caltech.edu/~boyk/spectra/11.htm#b

I did reproduce/record quite quick transients myself with claves (metal tube model), wood block, castanets and soprano glockenspiel for example. I'm pretty sure a lot of the rise time limitations come from microphones. Too bad I don't have the 100 kHz Sanken microphone...


This was my "successful" attempt at a limited transient. Tapping a pair of stainless spoons together.

Everything else I'd tried was appreciably slower/lower in the initial frequency/attack.

This with a UMIK-1.

1557878316593.png
 