
BDWoody

Chief Cat Herder
Moderator
Forum Donor
Joined
Jan 9, 2019
Messages
6,949
Likes
22,627
Location
Mid-Atlantic, USA. (Maryland)
If DSD is better quality then why do audiophiles prefer R2R DACs, which look like PCM internally, as opposed to the sigma-delta DACs that look like DSD internally?

Because someone somewhere told them they were supposed to.
 

tmtomh

Major Contributor
Forum Donor
Joined
Aug 14, 2018
Messages
2,636
Likes
7,497
If DSD is better quality then why do audiophiles prefer R2R DACs, which look like PCM internally, as opposed to the sigma-delta DACs that look like DSD internally?



My Samsung phone has a function that does this. You turn on "bandwidth upscaling" in the sound settings and it brings the sample rate up to 192 kHz, adding a bunch of random UHF noise to make your music "hi-res."

Nostalgia.
 

bennetng

Major Contributor
Joined
Nov 15, 2017
Messages
1,634
Likes
1,692
the DSD version apparently retains the equivalent of 18 bits depth of amplitude information (the ENOB) in the audio passband
Digging more into this topic of ENOB in the non-shaped passband (~20kHz for DSD64) and the freeware PCM-DSD_Converter: the plots suggested that it has 24 bits of passband ENOB by shaping more noise into 40-100kHz than other encoders, like this:
PCM-DSD Conversion Comparision Part 4 64bit(double) Quality Group Two.png


So I tried it myself and analyzed it with DeltaWave. Here is the reference 44.1kHz 64-bit float signal, a sweeping tone with fading amplitude (from red to blue) and a magenta background:
64f.png


Here is the 24-bit dithered version:
24.png


PCM-DSD_Converter with default settings. Noise floor looks pretty similar to the plots on that website.
4095.png


It allows a user-supplied upsampling filter (FIRFilter.dat). The default filter is pretty long (4095 samples) and encoding is slow. I made a 511-sample one which is faster and, as a side effect, has a gradual roll-off from 20kHz instead of a very fast cutoff, but still has full attenuation at 22kHz. I attached the filter file so the settings can be reproduced.
511.png
 

Attachments

  • FIRFilter.zip
    2.7 KB · Views: 70
Last edited:

TimF

Senior Member
Forum Donor
Joined
Dec 15, 2019
Messages
491
Likes
874
Music, time perception, memory, and reaction times
Voss and Clark (1975) showed that the power spectrum for intensity fluctuations in a recording of Bach's Brandenburg Concerto No. 1 (Figure 2G), and in many other instances of recorded music and human voices heard over the radio, was approximately 1/f over about 3 decades of frequency. Musha (1981) also summarized several of his own studies which established that 1/f noise in the spatial frequency domain characterizes some cartoons and paintings, and that transcutaneous pain reduction is more effective when applied according to a 1/f sequence. Gilden, Thornton, and Mallon (1995) reported approximately 1/f power spectra for time series composed of the errors made by human subjects in estimating various time intervals (Figure 2H). Similar power spectra also were found for human reaction times in a memory task (Clayton & Frey, 1995), in many other traditional tasks used in experimental psychology (Gilden, 1997), in coordination of finger-tapping with a metronome (Chen, Ding & Kelso, 1997), and even in simple detection responses (Van Orden, Holden and Turvey, 2005). In psychological data, fluctuations in the dependent variable that cannot be accounted for by the changes in the independent variable(s) are called “error” in the sense of the residuals from a linear regression. Such error is usually considered to arise from a white noise process. Gilden et al. (1995) modeled time estimation errors by a linear combination of 1/f noise from an internal clock and white noise from the motor process producing a key press. Gilden (1997) extended this model to other reaction times, and in so doing, partitioned the unexplained dependent variable fluctuations, or error, into two components: 1/f and white. He found that a substantial proportion of residual error is 1/f. Ward and Richard (reported in Ward, 2002) modeled the 1/f noise component by an aggregation of three AR(1) processes with different parameters, and showed that a manipulation of decision load in a classification task, which changed the slope of the power spectrum, affected the process with the mid-range parameter most.

I post the above as mischief.
 

JustAnandaDourEyedDude

Addicted to Fun and Learning
Joined
Apr 29, 2020
Messages
518
Likes
819
Location
USA
Digging more into this topic of ENOB in the non-shaped passband (~20kHz for DSD64) and the freeware PCM-DSD_Converter: the plots suggested that it has 24 bits of passband ENOB by shaping more noise into 40-100kHz than other encoders, like this:
PCM-DSD Conversion Comparision Part 4 64bit(double) Quality Group Two.png


So I tried it myself and analyzed it with DeltaWave. Here is the reference 44.1kHz 64-bit float signal, a sweeping tone with fading amplitude (from red to blue) and a magenta background:
View attachment 77188

Here is the 24-bit dithered version:
View attachment 77189

PCM-DSD_Converter with default settings. Noise floor looks pretty similar to the plots on that website.
View attachment 77190

It allows a user-supplied upsampling filter (FIRFilter.dat). The default filter is pretty long (4095 samples) and encoding is slow. I made a 511-sample one which is faster and, as a side effect, has a gradual roll-off from 20kHz instead of a very fast cutoff, but still has full attenuation at 22kHz. I attached the filter file so the settings can be reproduced.
View attachment 77191

Yes, you are right. With the aggressive noise-shaping in the PCM-DSD Converter s/w, it does seem like they are able to achieve roughly 24 bit ENOB in the audio passband. Your test tone with your shorter sample for the filter with the slower roll off does enable the DSD to very nearly do as well as 24 bit dithered PCM. IIRC, I think I got that 18-bit ENOB from Siau's 2013 interview, where he was probably being conservative and basing his number on a less sophisticated noise shaping algorithm. Anyway, good detective work by you. I gotta download some of this software someday, and experiment.
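
For anyone wanting to put numbers on the 18-bit vs 24-bit ENOB figures above, here is a minimal sketch using the usual ideal-quantiser relation SINAD = 6.02 × ENOB + 1.76 dB for a full-scale sine. The function names are made up for illustration, and the result says nothing about any particular modulator, only what in-band SINAD each ENOB figure corresponds to.

```python
def sinad_db_from_enob(bits: float) -> float:
    """Ideal SINAD (dB) of a full-scale sine at a given effective number of bits."""
    return 6.02 * bits + 1.76


def enob_from_sinad_db(sinad_db: float) -> float:
    """Inverse relation: effective number of bits from a measured SINAD (dB)."""
    return (sinad_db - 1.76) / 6.02


for bits in (16, 18, 24):
    # 16 -> 98.08 dB, 18 -> 110.12 dB, 24 -> 146.24 dB
    print(f"{bits} bits -> {sinad_db_from_enob(bits):.2f} dB")
```

So the gap between the 18-bit and 24-bit claims is about 36 dB of in-band noise, which is a lot to attribute to the noise shaper alone.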
 

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,700
Location
Hampshire
Yes, you are right. With the aggressive noise-shaping in the PCM-DSD Converter s/w, it does seem like they are able to achieve roughly 24 bit ENOB in the audio passband. Your test tone with your shorter sample for the filter with the slower roll off does enable the DSD to very nearly do as well as 24 bit dithered PCM. IIRC, I think I got that 18-bit ENOB from Siau's 2013 interview, where he was probably being conservative and basing his number on a less sophisticated noise shaping algorithm. Anyway, good detective work by you. I gotta download some of this software someday, and experiment.
The noise level in the audio band can vary a lot between modulators, so quoting a single ENOB figure as applicable to DSD in general would be a mistake. The trouble is that the lower you push the noise, the greater the risk of the modulator becoming unstable or exhibiting other unwanted behaviour. In the comparison plot, notice that the green trace shows distortion products at 200 Hz intervals. The yellow trace looks good at low frequencies, but the amount of noise below 100 kHz is likely to cause audible IMD products on some DACs (iFi comes to mind). Nothing is ever simple.
 
OP
Saidera

Senior Member
Joined
Jul 18, 2020
Messages
388
Likes
309
Location
Adelaide, South Australia, Australia
Foobar DSD vs PCM with Sonata HD Pro

PCM at 44.1kHz uses about 10 times less CPU than DSD128.

Keep the ASIO volume for PCM at 100 (it has no effect for DSD ASIO) and leave the hardware volume the same when flicking back and forth between DSD ASIO and ASIO. This should give a rough volume match.

And behold, there is no discernible difference, let alone any change in sound favouring DSD. Just more CPU usage and data transfer I guess. So it is the fundamental abilities of a DAC that matter the most, not the assumption that high data rates may overcome oversampling filter inaccuracies or bring the best sound possible.

Even saying mp3s are good for outdoors while hires is for focused listening doesn’t seem relevant. Apple and Qualcomm etc have won the 24/44.1 or 48 forced standardisation path – for common consumers it is enough – accepted. Archimago is right about post-hires. Only engineers could need more.

BUT I do add that there are very subtle improvements in sound quality with very CPU-intensive FIR filter PCM-to-DSD conversions (60k taps gives more obvious changes, 30k taps not so much), which are definitely nowhere near the realm of low-level Philips-based/foobar processing. The amount of extra CPU work doesn’t justify it though.

Finally there is something only achievable with special rare hardware – the analog recording of PCM to the DSD64 format and then SBM Direct noise shaping and 30k taps processing to achieve a CD file. It is like the effect plugins/hardware which engineers use but it is so much more subtle.

So the different views in the DSD vs PCM debate depend on how much access you have to all kinds of information and hardware. This is sound manipulation by format, not DSP. SBM Direct is from the late 90s. Recently Archimago has shown what SBM does in his post-hires post.

So we have a heap of ways to play back sound – noise shaping to simulate hires from CDs, upscaling using DSEE HX and its derivatives in combination with DSD upscaling in its various CPU intensities and methods… Now we do know SBM causes worse sound compared to the same non-SBM pure CD releases when the same albums are compared. (SBM Direct is the exception to this.) We know DSEE HX is just DSP mashed together – stereo enhancement and frequency spectrum cleanup which doesn’t change the sound so much (but it does) and its upscaling is separate from its DSP effect. Anything can upscale. We don’t need DSEE HX for that. We can just use our own DSP too. But DSEE HX does it for us automatically. DSD upscaling is just too intensive or requires hardware far out of reach of average listeners. The watered-down FPGA DSD upscaling technologies are not as good as CPU processing and still cost a lot of money.

=SBM Direct for CDs, DSEE HX for mp3s, DSD for overclocking CPUs.

Sony Marketing tried their hardest to push the message that DSD systems trump conventional playback systems we all have, but that is not the case. Our mp3s and MDs certainly lose against DSD but at a consumer level there simply is no place for DSD.
 
OP
Saidera

Senior Member
Joined
Jul 18, 2020
Messages
388
Likes
309
Location
Adelaide, South Australia, Australia
The man behind DSD, Mr Ayataka Nishio, explains:

In 1988, recordings were made in 16 bit. In 1989, Sony Classical was set up and asked for ways to record better. 20 bit recording was put forward. More ‘depth’ and ‘air’ were required. 16 bit recording led to CDs that couldn’t take advantage of the full 16 bit dynamic range. By using 20 bits one could derive the best 16 bit audio but to do this ‘SBM’ had to be created in 1990. By focusing on reducing noise in the 500 to 5000Hz range using an original SBM noise curve they succeeded in selling a lot of CDs. The noise would rise from 15kHz onwards.
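
To illustrate the general idea of trading in-band noise for out-of-band noise when going from a longer word length to 16 bit, here is a rough sketch of first-order error-feedback requantisation with TPDF dither. This is NOT Sony's SBM: the actual SBM noise curve is not given here, and a first-order shaper simply tilts the noise towards high frequencies rather than carving out 500 to 5000Hz. The function name and parameters are my own, for illustration only.

```python
import math
import random


def requantize(samples, bits=16, shape=True, seed=0):
    """Reduce float samples in [-1, 1) to `bits`-bit integer codes.

    With shape=True the previous sample's quantisation error is subtracted
    from the next input (first-order error feedback), so the error spectrum
    is multiplied by (1 - z^-1) and tilts towards high frequencies.
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    error = 0.0
    out = []
    for x in samples:
        target = x * scale - (error if shape else 0.0)
        # TPDF dither spanning +/-1 LSB decorrelates the error from the signal.
        dither = rng.random() - rng.random()
        q = math.floor(target + dither + 0.5)
        q = max(-scale, min(scale - 1, q))  # clip to the legal code range
        error = q - target
        out.append(q)
    return out


# Hypothetical test signal: a 1 kHz sine at a quarter of full scale, 44.1 kHz.
sine = [0.25 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(1000)]
print(requantize(sine)[:8])
```

Because each error is fed back into the next sample, the errors cancel in the long run: the running sum of the output never drifts more than a couple of LSB from the running sum of the input, which is another way of saying the noise is pushed away from low frequencies.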

DSD had to be a multiple of 44.1kHz. Unfortunately players nowadays are not 1 bit. The origin of DSD is from considerations of an A/D converter for 20 bit audio. Its processing used 4 bit 5.6 MHz DSM. When experimenting for improvements, they were comparing the spectrums of this 4 bit 5.6 MHz DSM signal and analog signal. Someone said, ‘Wait – if this were 1 bit, computing would be simplified and efficient, storage would be possible with smaller space and semiconductors wouldn’t be so burdened.’

This was the Day 1 for DSD.

In Feb 1994 they recorded harp in DSD and found it captured ‘air’. The small details wouldn’t simply disappear straight away as they do on a CD. 20 bit was apparently good enough but they compared 44.1kHz with 2.8MHz and decided DSD was better. The difference between PCM and DSD is that PCM takes ‘flashes’ like a camera for its sampling, while DSD keeps computing until there is no difference between the continuous analog signal and the resulting digital stream. The entire analog amount is taken into consideration by the 1s and 0s of 1 bit audio. DSD ‘remembers the past and the future and the present instant’ in terms of replacing the analog with digital. PCM only cares about the present instant.

Nishio-san’s analogy is a beer factory: the bottles are closer together if the sample rate is higher. Leakage happens between the current and the next bottle on the conveyor belt. Quantisation is simply a person choosing which size bottle to use. It has to be exact. Quantisation errors are the leakages due to not meeting the specifications. So to improve on PCM, there has to be a higher sampling rate and many more ‘specifications’ for the bottle types so that leakage is reduced. DSD only has one type of bottle, and it’s either filled or empty. There is a tank and a lever and there is no leakage. The bottles are tightly packed on the belt anyway. What this tried to show is that the analog is directly transformed into a digital equivalent.
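
For anyone who wants to see what the "one type of bottle, filled or empty" picture looks like in practice, here is a minimal sketch of a first-order 1-bit delta-sigma modulator. It is only an illustration under simplifying assumptions: real DSD encoders use higher-order modulators, and the function names are made up. The only "memory" in it is the integrator state, which depends on past samples.

```python
def delta_sigma_1bit(samples):
    """First-order 1-bit delta-sigma modulator for samples in [-1, 1]."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the previous output.
        integrator += x - feedback
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def bit_average(bits):
    """Map bits to +1/-1 and average them: a crude stand-in for a low-pass filter."""
    return sum(1.0 if b else -1.0 for b in bits) / len(bits)


bits = delta_sigma_1bit([0.5] * 10000)
print(bits[:8])
print(bit_average(bits))  # close to the 0.5 input level
```

The density of 1s follows the input level, so low-pass filtering the bitstream recovers the signal, with the quantisation error pushed up to high frequencies.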

Sure DSD can directly bring the studio sound to the consumer. But so can high rate PCM. For listeners there appears to be no difference unless DSD is compared to CD instead of 24/48. For engineers DSD is too much trouble and is nonexistent anyway. Upsampling also has questionable benefit. SBMD does have the ability to reduce PCM harshness. DSD masters are convenient in bringing analog tapes and studio masters to the masses. But in theory and in practice, DSD is simply a very necessary format we can’t live without and definitely it filled the void until high rate PCM could catch up, despite much of the DSD sound being a mysterious mechanism.

SBMD, DSEE HX, Archimago’s Post-Hires declaration: all showing that CD, mp3, and 24/48 are sufficient to convey the sound accurately. The high frequency noise we can’t hear seems a terrible waste of data.

Source: Google Books (See attachments)
 

Attachments

  • d1.PNG
    d1.PNG
    905 KB · Views: 102
  • d2.PNG
    d2.PNG
    740.2 KB · Views: 104
  • d3.PNG
    d3.PNG
    845 KB · Views: 106
  • d4.PNG
    d4.PNG
    1.4 MB · Views: 109
Last edited:

JohnYang1997

Master Contributor
Technical Expert
Audio Company
Joined
Dec 28, 2018
Messages
7,175
Likes
18,292
Location
China
Heck. The memory of the system is a disadvantage, not a merit. It means that performance may not be predictable and can be affected by the past.
 

mansr

Major Contributor
Joined
Oct 5, 2018
Messages
4,685
Likes
10,700
Location
Hampshire
Someone said, ‘Wait – if this were 1 bit, computing would be simplified and efficient, storage would be possible with smaller space and semiconductors wouldn’t be so burdened.’
Someone may of course have said that, but in that case they were wrong.
 

misterdog

Addicted to Fun and Learning
Joined
Dec 7, 2018
Messages
503
Likes
389
If I create some software to convert the sound inside my Honda Civic to sound like that of a Ferrari, will I feel like I am inside a Ferrari?

I shall market this software as FOC (Ferrari over Civic).

When musicians, engineers and record companies put as much effort into mastering and producing Red Book as they do DSD, then it sounds damn fine.
 
Joined
Aug 24, 2020
Messages
86
Likes
47
Although interesting and quite informative, this thread about DSD is yet another, (n+1)th, attempt at showing just how useless it is supposed to be compared to PCM.

The reason for the above statement? Not a single member - at least in this thread - has even hinted that he/she has ever had access to a live microphone feed and subsequent monitoring of both PCM and DSD recordings.
As in ACTUALLY BEING THERE - not just spreading hearsay from other people who did compare the three.

I am doing it - constantly. Because I record music - mostly acoustical, any genre for which musicians themselves must have no clue what electricity is. Vocal, choir, classical, jazz, ethno - as long it is unplugged. Musicians are allowed LED lamps so that they can see their sheet music - everything else is off limits.
There are cases where I have to deal with Nikola Tesla's "Call From The Grave" - 50/60Hz hum (and all its harmonics...) - plus any ultrasonic noise from the lighting. I will try to suppress the lighting to the max... - but sometimes, it is unfortunately impossible to do that.

The recording - and subsequent mastering, IF and WHEN required - plays a far more important role than does the recording medium.
A properly recorded analog cassette master will always trounce a poorly recorded hirez - regardless of whether it is PCM, DSD or whatever.
That means only relatively simple miking techniques that can preserve tiny time differences in original sound are really suitable. Any multimiking is bound to blur these tiny differences - or nullify them completely.

Let me say it up front - I prefer DSD over PCM. To my ears, it sounds closer to the live mic feed than does PCM. Still, this holds true only if and when DSD is of sufficiently high oversampling frequency. DSD64 - or what is more known as SACD - is just not good enough. There is far too much ultrasonic noise above 20 kHz - which can and does affect the performance in nominally audible band ( 20 Hz-20kHz ). TBH, in that case, I am likely to prefer any 24bit depth PCM with sampling at least 88.2kHz. It takes DSD128 and above for the DSD to really start to deliver.

It means that for roughly comparable - not the SAME - audio quality, PCM will require a smaller file than DSD. But once one accepts that "waste", there is a sonic reward.
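
To put rough numbers on the file-size point, here is a small sketch of the raw, uncompressed stereo data rates. The only facts assumed are that DSD is 1 bit per sample at a multiple of 44.1 kHz and that PCM is bit depth times sample rate; the function names are just for illustration, and container overhead or lossless compression is ignored.

```python
BASE_RATE_HZ = 44_100


def dsd_bits_per_second(multiple: int, channels: int = 2) -> int:
    """Raw data rate of 1-bit DSD at `multiple` x 44.1 kHz."""
    return multiple * BASE_RATE_HZ * channels


def pcm_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw data rate of linear PCM."""
    return sample_rate_hz * bit_depth * channels


formats = {
    "PCM 16/44.1": pcm_bits_per_second(44_100, 16),
    "PCM 24/88.2": pcm_bits_per_second(88_200, 24),
    "PCM 24/192": pcm_bits_per_second(192_000, 24),
    "DSD64": dsd_bits_per_second(64),
    "DSD128": dsd_bits_per_second(128),
    "DSD256": dsd_bits_per_second(256),
}

for name, bps in formats.items():
    megabytes_per_minute = bps * 60 / 8 / 1e6
    print(f"{name:12s} {bps / 1e6:6.2f} Mbit/s  {megabytes_per_minute:6.1f} MB/min")
```

On these raw figures, DSD128 stereo runs at about 11.3 Mbit/s against about 4.2 Mbit/s for 24/88.2 PCM, so the DSD file is a bit under three times the size.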

DSD (if sampling is high enough...) can preserve spatial cues better than "equivalent" PCM. For that, the recording venue has to be above a certain size - a typical jazz club is simply too small for this to be readily audible. But once a large enough concert hall or church is in question, one can hear/sense/perceive how the sound "travels"... - giving, even on reproduction, the proper sense of the size of the venue, even on 2 channels only.
For those insisting Redbook CD is all that it takes ... - bounce the master in DSD to RBCD - and you get the typical flat soundstage with next to no depth.

DSD - even the lowly DSD64 - has frequency response extended above 20 kHz. By the time present standard in DSD recording - DSD256 - is achieved, it can cover response up to 100 kHz with about 100dB SNR even at that frequency - or something in the general vicinity of these two figures.
There is music WELL above 20 kHz - even well above 100 kHz. But present definition of HiRez Audio, which requires frequency response to about 40 kHz, is a good starter and a reasonable value for the interim period, before the full 100 kHz bandwidth becomes the standard.

You might ask - why ?
https://www.cco.caltech.edu/~boyk/spectra/spectra.htm

The instrument that is on average the most alive above 20 kHz is harpsichord. In a good recording, there are overtones to at least 50 kHz. Although relatively low in level, they ARE there - and are not from lighting, CRTs, digital artefacts or any other source than the instrument itself.
The second instrument, with somewhat less broad but far more precisely defined content above 20 kHz, is chimes.


I agree DSD is difficult to work with, that it has to be converted into some form of PCM for editing - and, besides that, NOTHING else can be done to it in the digital domain. In that, it is only slightly more forgiving than recording direct to analog disk - multiple takes are possible, which can be later edited/spliced.
But there is no EQ, pitch autotune, or any other of the untold number of plugins available in PCM.
There are far fewer musicians capable of recording without the usual "safety net of PCM - we'll fix it in the mix" approach.
And even fewer willing to leave an error here or there in the finished product - despite the performance "in one go" containing that inexplicable but perceptible spirit a note by note perfect studio edited recordings almost always lack.

I agree that old recordings are NOT the proper representation of what DSD can do - either from analog tapes, but even more so from the early digital recording era.
I see there are now some attempts to standardize the way old recordings should be converted to DSD in order to be "legit" new DSD recordings. Although a move in right direction, the audio quality can never equal a well made DSD recording of today.

DSD will forever remain a niche within a niche - but, for those who understand it, know what it takes and are willing to go the extra mile, it will perhaps continue to represent the last safe haven of honest music recording.
 