
Does DSD recording benefit Japanese traditional instruments?

On my return home, I will order the SACD at HMV Japan.

I have my "secret" method of extracting the DSD layer intact into DSF files (DSD64, 2.8 MHz, 1 bit) using a specific model version of the Sony PlayStation. Of course, I can also rip the CD layer very accurately using dBpoweramp CD Ripper.
I am curious to get your thoughts. I used the Sony Blu-ray player trick to extract the DSD layer and EAC to rip the CD layer, and had 100% accuracy.

Interesting thread and conclusions. I have a question though - many professional studios use DSD for recording. Merging has an entire line of DSD capable processors. What is the advantage of a DSD workflow for professional studios? And if it's good for the pros, why isn't it good for us?

I don’t think DSD is used for recording that much. Most modern music is done with PCM or DXD, which is PCM. There is probably no advantage to DSD in the present day (as opposed to when DSD was developed in 1995). At that time, the best PCM DAC was the PCM1702, which had a THD+N of just -96 dB.

So I think this is the first time I have seen someone in the modern day (2023) advocate for DSD recording with a rationale beyond “sounds good”.

Even being able to ABX the CD and DSD layer, if you want the ASMR effect, the CD is actually better!
 
@voodooless
That’s a great question! And @pkane brought it up too just right now via private message...

We know that the *files* match in volume but there’s no guarantee that Foobar actually matches the volume at playback.

I will run the output of my headphone amp into the E1DA and see if there’s a difference.
 
Probably mostly because of marketing. Then again I doubt many studios have a full DSD workflow. It’s much easier to convert to DSD afterwards.

I see. I actually have a contact from Merging (because I bought a Merging DAC from him). At the time, it was much earlier in my audio journey and I was sold on the idea of DSD because "it's what the pros use". Since then, I have discovered that convolution in DSD is possible but it brings my very powerful PC to its knees, and I couldn't hear much of a difference between DSD and PCM anyway. So I started wondering why the heck pros use it if it's so inconvenient and the differences are so hard to hear. Choosing DSD actually cost me money, and not just on the DAC. I had a CPU die prematurely from subjecting it to temperatures >85C due to the heavy duty cycle of DSD. I rebuilt the computer with a very powerful CPU capable of DSD convolution, but decided to do all my processing in PCM. I now have a ridiculously powerful PC with an expensive quiet cooling solution which is barely ticking along at 5% CPU usage when convolving in PCM!

Hence the question, whether there is any advantage to studios for using DSD. This thread really piqued my interest and made me wonder if I should go back to DSD.
 
If you are looking for an academic paper that shows how many of them use DSD, I am not aware of one.

I’d be surprised if there were such a thing. I was expecting you could rattle off more than one.

However, the fact that Merging are still in business selling a lot of Anubis, Hapi, and Horus interfaces and have not gone under suggests that there is market demand for DSD.

Also could be suggestive of effective marketing.
 
Just one follow-up of my above message #16.

Even though I am interested in a spectral comparison between the native DSD layer and the CD layer of that SACD, nowadays I seldom listen to native DSD with DSD-capable DACs. Please refer to the specific post on my project thread for the details:
- Summary of rationales for "on-the-fly (real-time)" conversion of all music tracks (including 1 bit DSD tracks) into 88.2 kHz or 96 kHz PCM format for DSP (XO/EQ) processing: #532
 
So why do studios use it?
Audio engineers are as prone to remarkable claims about formats, the extent of human hearing, and such, as us idiots at the receiving end of their efforts. Rupert Neve believed human hearing is affected by signals at 100 kHz, and he has a following among engineers. We should not be surprised by this, or expect rationality to exist in any part of the audio industry.

There might just be a case that the noise shaping in a DSD ADC could result in less noise when recording than a PCM ADC, but I wouldn't buy that without seeing proof.
 
Probably mostly because of marketing. Then again I doubt many studios have a full DSD workflow. It’s much easier to convert to DSD afterwards.


Also, in a sense, vice versa: DSD was literally conceived as an archival format designed to be downconverted to multiples of the CD sample rate for consumer products.

Though I'm skeptical that's the reason any studios use it today. (Indeed, years ago Emil Berliner/DG compared DSD to hi rez PCM and chose the latter for its archiving)
 
Even more interesting now. I loaded up the ABX interface in Foobar and recorded the output through my E1DA Cosmos ADC at 32-bit/176.4 kHz.

(Edit: When listening, I use the XLR4 output. When recording, I used the dual 3.5mm jacks to dual 3-pin XLR. Input impedance of the E1DA is 640 ohms.)

I played "A" and then played "B" as specified while recording.

There is a volume level mismatch but it's not so simple.

Reference is DSD -->Comparison is CD layer.

Initial peak values Reference: -35.791dB Comparison: -31.731dB (4.060 dB difference for peaks)
Initial RMS values Reference: -51.402dB Comparison: -50.504dB (0.898 dB difference for RMS).

Which is interesting, because if I turn up the volume knob I can match either RMS or peak, but not both. Here is the delta spectrogram.
1680843343555.png

1680843657408.png
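The "match RMS or peak, but not both" observation can be checked with simple dB arithmetic. The sketch below is only an illustration using the levels reported in this post; the helper names are made up for the example. A plain volume change shifts peak and RMS by the same amount, so two recordings with different crest factors (peak minus RMS) cannot be matched on both at once.

```python
# Peak and RMS levels (dB) as reported by DeltaWave in the post above.
dsd = {"peak": -35.791, "rms": -51.402}
cd = {"peak": -31.731, "rms": -50.504}

def crest_factor_db(level):
    """Crest factor = peak level minus RMS level, both in dB."""
    return level["peak"] - level["rms"]

def apply_gain_db(level, gain_db):
    """A plain volume change shifts peak and RMS by the same number of dB."""
    return {name: value + gain_db for name, value in level.items()}

peak_gap = cd["peak"] - dsd["peak"]   # gain needed on the DSD capture to match peaks
rms_gap = cd["rms"] - dsd["rms"]      # gain needed on the DSD capture to match RMS

dsd_peak_matched = apply_gain_db(dsd, peak_gap)
dsd_rms_matched = apply_gain_db(dsd, rms_gap)

print(f"peak gap: {peak_gap:.3f} dB, RMS gap: {rms_gap:.3f} dB")
print(f"crest factor DSD: {crest_factor_db(dsd):.3f} dB, CD: {crest_factor_db(cd):.3f} dB")
print(f"after peak match, DSD RMS  = {dsd_peak_matched['rms']:.3f} dB")
print(f"after RMS match,  DSD peak = {dsd_rms_matched['peak']:.3f} dB")
```

With these numbers the peak gap is 4.060 dB and the RMS gap is 0.898 dB, and the two captures differ in crest factor by about 3.16 dB, which is the part no volume setting can remove.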


DeltaWave v2.0.8, 2023-04-06T21:52:15.3197955-07:00
Reference: DSD176.wav[?] 1174212 samples 176400Hz 32bits, stereo, MD5=00
Comparison: CD176.wav[?] 1169156 samples 176400Hz 32bits, stereo, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -500dB
Correct Non-linearity: False
Correct Drift:True, Precision:30, Subsample Align:True
Non-Linear drift Correction:False
Upsample:True, Window:Kaiser
Spectrum Window:Kaiser, Spectrum Size:32768
Spectrogram Window:Hann, Spectrogram Size:4096, Spectrogram Steps:2048
Filter Type:FIR, window:Kaiser, taps:262144, minimum phase=False
Dither:False bits=0
Trim Silence:False
Enable Simple Waveform Measurement: False

Discarding Reference: Start=1.5s, End=1s
Discarding Comparison: Start=1.5s, End=1s

Initial peak values Reference: -35.791dB Comparison: -31.731dB
Initial RMS values Reference: -51.402dB Comparison: -50.504dB

Null Depth=5.036dB
Cross-correlation found periodic/simple waveform! Trying simple measurement option!
X-Correlation offset: -36710 samples
Drift computation quality, #1: Excellent (1.27μs)


Trimmed 0 samples ( 0.00ms) front, 0 samples ( 0.00ms end)


Final peak values Reference: -35.791dB Comparison: -37.804dB
Final RMS values Reference: -51.374dB Comparison: -56.499dB

Gain= 6.0565dB (2.0083x) DC=0 Phase offset=-208.104438ms (-36709.623 samples)
Difference (rms) = -52.97dB [-99.42dBA]
Correlated Null Depth=15.39dB [57.02dBA]
Clock drift: 0.41 ppm


Files are NOT a bit-perfect match (match=0.28%) at 16 bits
Files are NOT a bit-perfect match (match=0%) at 32 bits
Files match @ 49.7006% when reduced to 8.31 bits


---- Phase difference (full bandwidth): 57.9239425962841°
0-10kHz: 3.61°
0-20kHz: 2.71°
0-24kHz: 2.53°
Timing error (rms jitter): 20.5ns
PK Metric (step=400ms, overlap=50%):
RMS=-99.8dBr
Median=-102.3
Max=-90.5

99%: -91.47
75%: -99.36
50%: -102.3
25%: -104.24
1%: -106.01

gn=0.49793681914921, dc=0, dr=4.09543089287368E-07, of=-36709.6228327707

DONE!

Signature: 9f5b3ff95f25e6d6d564b320f79d118a

RMS of the difference of spectra: -114.068002765477dB
DF Metric (step=400ms, overlap=0%):
Median=-2.9dB
Max=-1.9dB Min=-5.7dB

1% > -5.73dB
10% > -5.59dB
25% > -3.75dB
50% > -2.93dB
75% > -2.16dB
90% > -1.64dB
99% > 0.0dB

Linearity 13.8bits @ 0.5dB error


So I decided to edit the recording of the DSD layer and amplify it by 4.06 dB to match the PEAKS.

Test #1: ABX Matched Peaks (p = 0.0021)
Because the E1DA has a minimum 1.7V sensitivity, I had to really boost my headphone amp. Previously I was at -31.5 dB at low gain. I went to 0 dB at low gain and it still wasn't loud enough, so I switched to high gain and ended up at -8 dB on the volume.

Initial peak values Reference: -31.731dB Comparison: -31.731dB
Initial RMS values Reference: -47.342dB Comparison: -50.504dB

At this point of the evening, the CD layer did not give me a very strong ASMR effect. Even when I knew the reference "B" mode was the CD layer, I could only occasionally generate the ASMR effect. So for this ABX test, I had to keep switching back and forth until I actually thought I felt something. I still got a p=0.0021 which is statistically significant.

1680844688889.png

DeltaWave v2.0.8, 2023-04-06T22:07:05.8074594-07:00
Reference: DSD176_matchPEAK.wav[?] 1174212 samples 176400Hz 32bits, stereo, MD5=00
Comparison: CD176.wav[?] 1169156 samples 176400Hz 32bits, stereo, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -500dB
Correct Non-linearity: False
Correct Drift:True, Precision:30, Subsample Align:True
Non-Linear drift Correction:False
Upsample:True, Window:Kaiser
Spectrum Window:Kaiser, Spectrum Size:32768
Spectrogram Window:Hann, Spectrogram Size:4096, Spectrogram Steps:2048
Filter Type:FIR, window:Kaiser, taps:262144, minimum phase=False
Dither:False bits=0
Trim Silence:False
Enable Simple Waveform Measurement: False

Discarding Reference: Start=1.5s, End=1s
Discarding Comparison: Start=1.5s, End=1s

Initial peak values Reference: -31.731dB Comparison: -31.731dB
Initial RMS values Reference: -47.342dB Comparison: -50.504dB

Null Depth=10.534dB
Cross-correlation found periodic/simple waveform! Trying simple measurement option!
X-Correlation offset: -36710 samples
Drift computation quality, #1: Excellent (1.27μs)


Trimmed 0 samples ( 0.00ms) front, 0 samples ( 0.00ms end)


Final peak values Reference: -31.731dB Comparison: -33.72dB
Final RMS values Reference: -47.314dB Comparison: -52.434dB

Gain= 2.0061dB (1.2598x) DC=0.00008 Phase offset=-208.10444ms (-36709.623 samples)
Difference (rms) = -48.91dB [-95.25dBA]
Correlated Null Depth=16.82dB [53.71dBA]
Clock drift: 0.41 ppm


Files are NOT a bit-perfect match (match=0.18%) at 16 bits
Files are NOT a bit-perfect match (match=0%) at 32 bits
Files match @ 49.7021% when reduced to 7.64 bits


---- Phase difference (full bandwidth): 62.7350578409832°
0-10kHz: 3.81°
0-20kHz: 3.00°
0-24kHz: 2.85°
Timing error (rms jitter): 20.6ns
PK Metric (step=400ms, overlap=50%):
RMS=-94.1dBr
Median=-96.7
Max=-84.7

99%: -85.66
75%: -93.74
50%: -96.72
25%: -98.87
1%: -100.39

gn=0.793767648814304, dc=7.88221567760325E-05, dr=4.09711334223552E-07, of=-36709.6232590757

DONE!

Signature: cb92623304f0b9d846a97c0e416d9246

RMS of the difference of spectra: -110.056908410277dB
DF Metric (step=400ms, overlap=0%):
Median=-2.9dB
Max=-1.9dB Min=-5.7dB

1% > -5.73dB
10% > -5.59dB
25% > -3.75dB
50% > -2.93dB
75% > -2.16dB
90% > -1.64dB
99% > 0.0dB

Linearity 12.2bits @ 0.5dB error

foo_abx 2.1 report
foobar2000 v1.6.12
2023-04-06 22:07:39

File A: DSD176_matchPEAK.wav
SHA1: c920901871e6caf8b791ffb476414abbad993e3b
File B: CD176.wav
SHA1: f4f0f03fa19828b89c012b9e5902a044e5ba3998

Output:
ASIO : Sony Headphone Amplifier Driver
Crossfading: NO

22:07:39 : Test started.
22:08:39 : 01/01
22:08:52 : 02/02
22:09:34 : 03/03
22:10:08 : 04/04
22:10:14 : 05/05
22:10:20 : 05/06
22:11:05 : 06/07
22:11:14 : 06/08
22:11:20 : 07/09
22:11:27 : 08/10
22:11:35 : 09/11
22:11:48 : 10/12
22:11:54 : 11/13
22:12:04 : 12/14
22:12:11 : 13/15
22:12:22 : 14/16
22:12:22 : Test finished.

----------
Total: 14/16
p-value: 0.0021 (0.21%)

-- signature --
4958b6611892343b269ae40124fc1c738d2a6574
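For anyone wondering where the p-value comes from: it is consistent with a one-sided binomial test against pure guessing. The sketch below assumes that is how the number is computed (the function name is made up for the example); it reproduces the 0.0021 reported for 14/16.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: the chance of getting at least `correct`
    answers right out of `trials` by pure guessing (50% per trial)."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

print(f"14/16: p = {abx_p_value(14, 16):.4f}")
print(f"16/16: p = {abx_p_value(16, 16):.6f}")
```

For 14/16 this is 137/65536, which rounds to 0.0021. A perfect 16/16 gives 1/65536, about 0.000015, which rounds to 0 at four decimal places.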

Test #2: ABX Matched RMS (p < 0.0001, reported as 0; volume difference is notable)
1680845117639.png


DeltaWave v2.0.8, 2023-04-06T22:24:09.2693705-07:00
Reference: DSD176_matchRMS.wav[?] 1174212 samples 176400Hz 32bits, stereo, MD5=00
Comparison: CD176.wav[?] 1169156 samples 176400Hz 32bits, stereo, MD5=00
Settings:
Gain:True, Remove DC:True
Non-linear Gain EQ:False Non-linear Phase EQ: False
EQ FFT Size:65536, EQ Frequency Cut: 0Hz - 0Hz, EQ Threshold: -500dB
Correct Non-linearity: False
Correct Drift:True, Precision:30, Subsample Align:True
Non-Linear drift Correction:False
Upsample:True, Window:Kaiser
Spectrum Window:Kaiser, Spectrum Size:32768
Spectrogram Window:Hann, Spectrogram Size:4096, Spectrogram Steps:2048
Filter Type:FIR, window:Kaiser, taps:262144, minimum phase=False
Dither:False bits=0
Trim Silence:False
Enable Simple Waveform Measurement: False

Discarding Reference: Start=1.5s, End=1s
Discarding Comparison: Start=1.5s, End=1s

Initial peak values Reference: -34.893dB Comparison: -31.731dB
Initial RMS values Reference: -50.504dB Comparison: -50.504dB

Null Depth=11.893dB
Cross-correlation found periodic/simple waveform! Trying simple measurement option!
X-Correlation offset: -36710 samples
Drift computation quality, #1: Excellent (1.27μs)


Trimmed 0 samples ( 0.00ms) front, 0 samples ( 0.00ms end)


Final peak values Reference: -34.893dB Comparison: -36.882dB
Final RMS values Reference: -50.476dB Comparison: -55.596dB

Gain= 5.1681dB (1.813x) DC=0.00005 Phase offset=-208.104438ms (-36709.623 samples)
Difference (rms) = -52.07dB [-98.4dBA]
Correlated Null Depth=17.77dB [66.78dBA]
Clock drift: 0.41 ppm


Files are NOT a bit-perfect match (match=0.25%) at 16 bits
Files are NOT a bit-perfect match (match=0%) at 32 bits
Files match @ 49.7027% when reduced to 8.16 bits


---- Phase difference (full bandwidth): 50.1018817100633°
0-10kHz: 3.65°
0-20kHz: 2.75°
0-24kHz: 2.58°
Timing error (rms jitter): 20.6ns
PK Metric (step=400ms, overlap=50%):
RMS=-96.9dBr
Median=-99.6
Max=-87.4

99%: -88.41
75%: -96.54
50%: -99.59
25%: -101.73
1%: -103.24

gn=0.551560833948283, dc=5.47706997664166E-05, dr=4.10209660015685E-07, of=-36709.6228602561

DONE!

Signature: 060a5f82d974e690f4d12a987fd06891

RMS of the difference of spectra: -113.170683502368dB
DF Metric (step=400ms, overlap=0%):
Median=-2.9dB
Max=-1.9dB Min=-5.7dB

1% > -5.73dB
10% > -5.59dB
25% > -3.75dB
50% > -2.93dB
75% > -2.16dB
90% > -1.64dB
99% > 0.0dB

Linearity 13.4bits @ 0.5dB error

foo_abx 2.1 report
foobar2000 v1.6.12
2023-04-06 22:20:46

File A: DSD176_matchRMS.wav
SHA1: 95a6c5abbb6bb72bf6ee1c6957669a8b660129ec
File B: CD176.wav
SHA1: f4f0f03fa19828b89c012b9e5902a044e5ba3998

Output:
ASIO : Sony Headphone Amplifier Driver
Crossfading: NO

22:20:46 : Test started.
22:21:23 : 01/01
22:21:31 : 02/02
22:21:42 : 03/03
22:21:48 : 04/04
22:21:58 : 05/05
22:22:05 : 06/06
22:22:10 : 07/07
22:22:18 : 08/08
22:22:24 : 09/09
22:22:29 : 10/10
22:22:38 : 11/11
22:22:44 : 12/12
22:22:49 : 13/13
22:22:55 : 14/14
22:23:01 : 15/15
22:23:08 : 16/16
22:23:08 : Test finished.

----------
Total: 16/16
p-value: 0 (0%)

-- signature --
2e5df8425e2c0a6dbfa69fbdcae61b421e025324

With this comparison, the volume difference was notable. I could hear it across the entire track (including bass), whereas before there was just a subtle difference in the stringed instruments. Running my headphone amp at -31.5 dB low gain the first time, and then having to go past 0 dB low gain and into high gain at -8 dB, may alter the bass response too.

There is a difference in sound between playing the music normally versus recording the music and then boosting the amplification of the recording at the headphone amp stage. Because it is a copy of a copy, I am adding more THD+N.

The sound of DSD is ultrasonic noise

There are silly levels of ultrasonics when recording the DSD playback. Remember, I'm running this through a zero negative feedback headphone amplifier, which only makes things worse. You cannot truly volume match this because the waveforms are fundamentally different. PK Metric remains in the -90 dB range because the spectral content is the same in the 20 Hz - 20 kHz band.

1680845360708.png


Over the last few months, as I've done my measurements of different setups, I've recently made the statement that "if there is such a thing as a DSD sound, it’s ultrasonic noise.”

I also made a comment when measuring a tube amplifier: "I wonder if tubes produce the effect of analog dither"

So my opinions
1) Ultrasonic noise might be generating IMD into the audible band.
2) Ultrasonic noise may account for the “preferred” coloration of DSD or tube amplification.
3) There is no missing fundamental effect when listening exclusively to ultrasonics. Although I did describe a pressure sensation when trying it.
4) Both the CD and SACD layer are enjoyable but the SACD layer is preferred to me.
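To illustrate the mechanism behind opinion 1 (and only the mechanism, not evidence that it happens in this signal chain), here is a minimal sketch. The two tone frequencies, their amplitude, and the distortion coefficient are all hypothetical. Two ultrasonic tones passed through a mildly nonlinear stage produce a difference tone that lands in the audible band.

```python
import cmath
import math

FS = 176_400              # sample rate used for the recordings in this thread
N = FS // 10              # 0.1 s, so 1, 30 and 31 kHz all complete whole cycles
F1, F2 = 30_000, 31_000   # hypothetical ultrasonic tones, in Hz
A = 0.5                   # hypothetical amplitude of each tone
K2 = 0.1                  # hypothetical second-order distortion coefficient

def two_tone(n):
    """Sum of the two ultrasonic tones at sample index n."""
    t = n / FS
    return A * math.cos(2 * math.pi * F1 * t) + A * math.cos(2 * math.pi * F2 * t)

def amplitude_at(signal, freq):
    """Amplitude of a single frequency component via a one-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

clean = [two_tone(n) for n in range(N)]
distorted = [x + K2 * x * x for x in clean]   # mild second-order nonlinearity

f_diff = F2 - F1   # 1 kHz difference tone, inside the audible band
print(f"1 kHz in clean signal:     {amplitude_at(clean, f_diff):.6f}")
print(f"1 kHz in distorted signal: {amplitude_at(distorted, f_diff):.6f}")
```

The clean signal has essentially nothing at 1 kHz, while the distorted one shows a component of amplitude K2 * A^2 = 0.025. Whether a real amplifier is nonlinear enough for this to be audible is a separate question that only measurement can answer.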

And my facts
1) Digital files at the same volume are not necessarily played back at the same actual volume
2) Ultrasonic noise is very high which makes waveform matching difficult
3) In my signal chain, with all sorts of volume-matching strategies, there remains a detectable difference in ABX testing
4) Listener fatigue is real. My ability to discern the ASMR effect in the late evening was worse than earlier in the day.
5) For this album, there are differences between the SACD and CD layer.
 
Those differences between peak and RMS values suggest extra peak limiting on the CD. And the difference file shows that also with the very short spikes, these are the ones left on the DSD file but limited on the CD. So are you really comparing 2 identical files?

Edit: Clipping the CD file will do almost the same thing, so maybe that's what's going on.
 
1) Digital files at the same volume are not necessarily played back at the same actual volume

Right. This is especially true when mixing formats between DSD and PCM. DSD is usually recorded at -3 to -6 dB relative to PCM to leave some headroom, so as not to overload the sigma-delta modulator. SACD specifies -6 dB, for example, although most DSD content I've seen seems to be recorded or converted closer to -3 dB. Some playback software automatically adjusts for this difference, so if two files, DSD and PCM, are produced for the same dynamic range, the result will be a different playback level. This is why it's important to level match at the output, not at the input :)
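As a rough sketch of what those headroom figures mean in linear terms, and of level matching on measured output levels rather than file levels (the function names are made up for the example; the measured levels are the RMS values posted earlier in the thread):

```python
import math

def db_to_ratio(db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)

def ratio_to_db(ratio):
    """Convert a linear amplitude ratio to dB."""
    return 20 * math.log10(ratio)

def matching_gain_db(target_db, measured_db):
    """Gain to apply so that a measured output level reaches the target level."""
    return target_db - measured_db

print(f"-6 dB headroom = {db_to_ratio(-6):.3f}x amplitude")
print(f"-3 dB headroom = {db_to_ratio(-3):.3f}x amplitude")
print(f"6.0565 dB gain = {db_to_ratio(6.0565):.4f}x")

# RMS levels measured at the amplifier output: CD layer vs DSD layer.
gain = matching_gain_db(-50.504, -51.402)
print(f"gain to bring the DSD capture up to the CD capture: {gain:.3f} dB")
```

So -6 dB is roughly half the amplitude and -3 dB roughly 0.71x, and 6.0565 dB corresponds to the 2.0083x that DeltaWave printed in its first report.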
 
So are you really comparing 2 identical files?

Definitely not identical. But that is part of the hypothesis or question.

We cannot confirm that DSD 11.2 is necessary for the recording, but it should be possible to confirm whether it was “worthwhile” to distribute in DSD 2.8 as opposed to plain CD.

My answer is that it is ABX’able. It doesn’t seem like a simple mastering choice, because the first set of tests show that the inputs are very similar. But it does show that the outputs are even more different.

The outputs still have a PK Metric of -90 dB or so, yet visually you can see very big differences.
 
So the conclusion is that you are listening to different masters and different playback levels? Yeah, no wonder you hear differences.
 
I myself was also able to hear a difference in blind tests between DSF and FLAC downloaded from one audiophile site. But after converting the DSF to CD-quality 44.1 kHz WAV, I was suddenly not able to hear any difference in blind tests.

Here is the actual thread:

Therefore, most probably I was hearing a difference not between formats but between masters.
An easy test, which saved me expensive trips down the audiophile rabbit holes.
 
Hello @GXAlan and friends,

The specific hybrid SACD arrived on my desk, and I successfully extracted the DSD layer into DSF files using an old model of the Sony PlayStation and "the" unofficial SACD ripping tool. Of course, I could also rip the CD layer into AIFF PCM 16 bit 44.1 kHz files using dBpoweramp CD Ripper.

I am in the middle of intensively analyzing track 11 in DSF and AIFF using MusicScope.

My comparative listening with my audio setup will be performed very soon, as follows.

1. Native DSD vs. AIFF using JRiver MC --> single DAC (OKTO DAC8PRO in 2-Ch stereo mode, of course capable of DSD and PCM) --> single HiFi amplifier (Accuphase E-460) --> passive LCR-network --> five-way SP setup (with L&R sub-woofer, L&R super-tweeter) under objectively proved/determined exact level matching in 0.1 dB precision.

2. DSD vs. AIFF using my DSP-based multichannel multi-SP-driver multi-amplifier time-aligned stereo 5-way 10-channel audio system; in this case both of DSF and AIFF would/should be converted into 88.2 or 96 kHz 24 bit PCM on-the-fly by JRiver for DSP processing with EKIO again under objectively proved/determined exact level matching in 0.1 dB precision.


BTW, I have already converted the extracted DSF file of Track-11 into 16 bit 44.1 kHz AIFF using dBpoweramp Music Converter and did comparative objective analysis by MusicScope between the DSF->AIFF file and the CDLayer-ripped-AIFF file.

Therefore, intensive comparative listening will be also done soon;
3. DSF->AIFF file vs. CDLayer-ripped-AIFF using my DSP-based multichannel multi-SP-driver multi-amplifier time-aligned stereo 5-way 10-channel audio system; in this case both of the AIFF tracks would/should be converted into 88.2 or 96 kHz 24 bit PCM on-the-fly by JRiver for DSP processing with EKIO again under objectively proved/determined exact level matching in 0.1 dB precision.

Hopefully, the details of my above investigations will be posted here in this thread within a few weeks.
 
Therefore, most probably I was hearing a difference not between formats but between masters.

Even if you use the same SACD player or the same software audio player (like JRiver or Roon), there could be subtle differences in the internal processing on the DAC chip, or in the software, for a DSF feed vs. a PCM feed.

A "difference between the masters" could of course also cause audible differences.

The issue of audible/objective differences between DSF (DSD) and PCM will therefore always be complicated and never perfectly settled, I assume.
 
The issue of audible/objective differences between DSF (DSD) and PCM will therefore always be complicated and never perfectly settled, I assume.

After DSF was converted to PCM I was not able to hear the difference in blind tests on the same hardware.
This along with scientific consensus gives me 99% assurance that the difference was audible because of master or transcoding parameters.
 