Well, I can also imagine that the original idea was for it to be used eventually, once the system had been vetted, as an archiving format.
Certainly, as a recording or production format, it was rather unfit for purpose. That had to be known very early on. Your/BE's scenario would mean that Sony knew from the start that they would likely have to use PCM somewhere in the production chain of SACDs (DXD or "DSD-Wide", as it later was known), rather than that being a consequence of re-orienting DSD to commercial SACDs, as in my scenario.
Indeed, they probably knew from the beginning that editing in DSD would be too complex because of its deviation from 2's complement representation. I work very close to Merging (geographically speaking), who developed DxD with Philips (which, as you suggest, is PCM), and I always wanted to say hi and thanks to them, but I don't have a good reason to bother them.
Message edited for correction, additions and hyperlinks added on September 7 and 13, 2025. The additions are written in italic.
As explained by Black Elk on SHF, Philips, unlike Sony, did not have a pro-audio division and thus chose to work with Merging, transferring its expertise in digital signal processing and letting Merging bring to market the appropriate tools to support SA-CD production. According to BE, Philips's engineers thought that processing at 16 Fs PCM (705.6 ksps in a 44.1 ksps-based system or 768 ksps in a 48 ksps-based system) was necessary to preserve the time-domain performance of DSD signals, but Merging didn't have the technical platform to achieve that sample rate at the time. As a result, Philips and Merging agreed to fall back on 8 Fs PCM (352.8 ksps or 384 ksps), which was within the capabilities of Merging's Pyramix Virtual Studio hardware and software portfolio at the time. Merging decided to name its format DXD, which angered Sony, which owns the rights to the DSD acronym and felt that Merging was commercially parasitic by taking advantage of the fame of the DSD name. But eventually everyone agreed to allow Merging to use the name DXD.
The Merging system was not put on the market before 2004. Before that, DSD was already editable, within some limitations, thanks to Sony hardware and software.
However, converting DSD to PCM and back to DSD for export after signal processing was never the only production option in Pyramix.
The following is my understanding from years of research and reading in Internet archives.
As soon as DSD support was added to the Pyramix system, it offered the choice between a "DXD Mixing Project" and a "DSD Project" (see for instance Appendix III of the old Pyramix 4.3 user manual). The former relies on converting the entire DSD files to DXD for processing and gives access to the full-fledged production tools, such as equalization, mixing and dynamic processing. The latter is capable of editing only, but keeps the DSD files to be edited as they are.
The DSD editing capability of the Pyramix DAW was designed in the early 2000s by Philips engineers Derk Reefman and Peter Nuijten and presented at the 110th AES Convention in Amsterdam.
The authors claimed that their processing offered more flexibility than a previous system devised by Sony (see below) and that it had been incorporated in the Pyramix system. Actually, DSD support was added to Pyramix Virtual Studio from version 4.0 of the software engine, which Merging released in the first half of 2002. As can be clearly seen in the above block diagram, the two DSD streams to be pasted together to make an edit are left unchanged at the input and at the output of the switch matrix, except for the few milliseconds around the edit point. Incidentally, the authors also made interesting statements about the results of double-blind tests conducted at the "Natlab" (the former Philips audio research laboratory in Eindhoven, Netherlands) with trained listeners from the Philips and Polyhymnia recording teams. In particular, the low-pass filter following the gain cells C1 and C2 seen in the above diagram was noticeable when its corner frequency was set to 50 kHz, but not when it was set to 80 kHz or higher.
The hardware part of the Pyramix system in which DSD cross-fading as well as DXD processing were performed included the Merging Mykerinos PCI audio card which was based on a Philips TriMedia processor.
However popular Pyramix was and still is, it was not the first editing system for DSD. The very first one was the product of a joint Sony/Sonic Solutions effort to prototype a 2-channel DAW. The complete system was presented at the 104th AES Convention in Amsterdam in 1998 by Ayataka Nishio (Sony) and James A. Moorer (Sonic Solutions). An open-access version of this paper is available on J. A. Moorer's website.
Sony engineer Ayataka Nishio is credited with leading the development of the first digital signal processor able to edit DSD, which was used in this DAW: the Sony CXD2926. He had presented this chip as early as 1997 at the 102nd AES Convention in Munich. This chip was capable of level control, switching between DSD streams, cross-fading, fade-in/fade-out and, under some conditions, equalization. The level control is absolutely straightforward. The switching part was the most difficult to implement because, as pointed out by NTTY, a DSD stream has no "0" level and hence signal-dependent DC levels are part of the DSD signal, not to mention the fact that, by its very nature, the state of each DSD sample depends on the accumulation of previous samples. Switching therefore requires very thoughtfully designed processing, with a carefully synchronized fade-out/fade-in, to avoid an abrupt transition between two DSD streams. By the way, fade-out/fade-in processing was incorporated in the decoder (of the Sony CXD275x or Philips "Furore" family) of every SA-CD player from the very beginning of the format to avoid "clicks" at the start of a track or from track to track.
A. Nishio and one of his colleagues went out of their way to describe the output of the delta-sigma modulator of the cross-fade process in this chip in a paper they presented at the 110th AES Convention in Amsterdam, which contains many FFT analyses illustrating the process step by step.
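To make the cross-fade principle more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not the CXD2926 algorithm: the modulator is a plain first-order loop, the fade law is linear, and the names `remodulate` and `crossfade` are hypothetical. It also ignores the state matching a real editor needs where the re-modulated section hands over to the untouched streams.

```python
def remodulate(samples):
    """First-order delta-sigma loop: multi-level samples in, +1/-1 bits out."""
    state, bits = 0.0, []
    for x in samples:
        state += x                      # integrate the input
        bit = 1 if state >= 0 else -1   # 1-bit quantiser: there is no "0" level
        state -= bit                    # subtract what was actually output
        bits.append(bit)
    return bits


def crossfade(stream_a, stream_b, start, length):
    """Play A, fade linearly to B over `length` samples from `start`, then play B."""
    mixed = []
    for i in range(length):
        g = (i + 1) / (length + 1)      # gain of B rises towards 1
        n = start + i
        # Weighted sum of two 1-bit streams: a multi-level signal.
        mixed.append((1.0 - g) * stream_a[n] + g * stream_b[n])
    # Outside the fade window the original bits are passed through untouched.
    return stream_a[:start] + remodulate(mixed) + stream_b[start + length:]


# Two hypothetical test streams: constant levels +0.5 and -0.5 encoded to 1 bit.
a = remodulate([0.5] * 8192)
b = remodulate([-0.5] * 8192)
edited = crossfade(a, b, start=2048, length=4096)
```

The point of the sketch is that only the fade window goes through a multi-level stage and a new modulator, while everything before and after is the original bit stream.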
Based on this chip, Sony later created the Sonoma hardware/software recorder/editor combination to support production.
According to Black Elk, this DAW was not put on the pro audio market at the beginning, but lent to various facilities in the world to help support recording music for SA-CD production. It is still in use today in some studios, as can be seen here:
https://www.sonymusicstudio.jp/s/studioen/page/equipment?ima=5651#section5
Sonoma was a mobile recorder/editor running on a computer under Windows. Here is the back of such a computer, with the connector side of an early Sonoma processing card, which interfaces with outboard A/D and D/A converters:
And here is a view of the Sonoma processing card, with four Sony CXD2926 DSD edit processors, which were dubbed "E-chip":
It can be noted that the DSP chips are labelled "CXD2926AQ". In the Sony Semiconductor naming system, the suffix letter "Q" means "QFP package" (as can be seen) and "A" means "improved specifications". The visible chips thus belong to a second generation. According to Nishio's presentation, the CXD2926 is a 4-track input/2-channel output chip, hence the 4 chips on the processing board, which can handle up to 8 channels of DSD (remember that an SA-CD has up to 8 channels in total: 2 in stereo, 6 in multichannel).
The little add-on card at the upper right corner of the Sonoma processing board is another part of the pro environment created by Sony Oxford, Sony's European pro-audio division: the Super-MAC digital multi-track interface. The output RJ45 connector of the interface can be seen at the back of the computer in the first photograph. It was able to transmit up to 24 DSD channels, or 6 to 48 channels of PCM audio from 16 to 24 bits and 44.1 ksps to 384 ksps, over Ethernet cable. The Super-MAC interface was later improved into the Hyper-MAC standard, capable of transmitting up to 192 channels of DSD audio.
I wrote that the CXD2926 is capable of equalization under some conditions. This is where PCM came in, but not through a conversion from DSD to PCM and back to DSD. The process is much weirder and is described in Ayataka Nishio's patents on the subject (US 5,835,043 and US 5,946,402). A PCM signal path with a suitable sample rate is derived from the input DSD stream by a decimating filter. This PCM signal path contains only the frequency pass-band of interest (obviously). This PCM signal is equalized as desired, then the computation results are added directly to the DSD stream after interpolation up to the DSD sample rate.
Alternatively, the processed PCM signal is subtracted from the input PCM stream to get the difference between the two. This difference signal is then linearly interpolated up to the DSD sample rate and added to the input DSD stream.
In each case, the DSD signal is delayed by the time needed to do the processing on the PCM streams. The addition of the interpolated PCM signal to the DSD signal produces a multi-bit signal at the DSD sample rate, which is re-modulated back to 1 bit by a following sigma-delta modulator. Weird, but it works that way.
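The second variant (difference signal added to the delayed DSD stream) can be sketched in a few lines of Python. This is only my reading of the signal flow, under loud assumptions: the decimating filter is a crude block average, the "equalizer" is a one-pole low shelf, the output modulator is first-order, and every function name is hypothetical. None of these details come from the patents or the chip.

```python
def remodulate(samples):
    """First-order delta-sigma loop: multi-level samples in, +1/-1 bits out."""
    state, bits = 0.0, []
    for x in samples:
        state += x
        bit = 1 if state >= 0 else -1
        state -= bit
        bits.append(bit)
    return bits


def decimate(bits, ratio):
    """Crude decimating filter: average each block of `ratio` bits into one PCM value."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]


def low_shelf_difference(pcm, boost, alpha=0.2):
    """Equalise in PCM and return only (equalised - original).

    The assumed equaliser adds a one-pole low-pass back with gain `boost`,
    so the difference is simply boost * lowpass(pcm).
    """
    lp, diff = 0.0, []
    for x in pcm:
        lp += alpha * (x - lp)
        diff.append(boost * lp)
    return diff


def interpolate(pcm, ratio):
    """Linear interpolation back up to the 1-bit sample rate."""
    out = []
    for cur, nxt in zip(pcm, pcm[1:] + pcm[-1:]):
        out.extend(cur + (nxt - cur) * i / ratio for i in range(ratio))
    return out


def equalise_dsd(bits, ratio=64, boost=0.5):
    """Add the interpolated PCM difference to the delayed 1-bit stream, then re-modulate."""
    correction = interpolate(low_shelf_difference(decimate(bits, ratio), boost), ratio)
    delay = ratio // 2                      # rough alignment with the block average
    # Zero padding is not valid DSD; it only feeds the multi-level adder below.
    delayed = [0] * delay + bits[:len(bits) - delay]
    summed = [d + c for d, c in zip(delayed, correction)]   # multi-level signal
    return remodulate(summed)
```

For example, `equalise_dsd(remodulate([0.25] * 25600))` returns a 1-bit stream of the same length whose low-frequency level has been raised by the shelf. The input length should be a multiple of `ratio`, otherwise the output is truncated to the last full block.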
According to the above-mentioned presentation of the CXD2926 by A. Nishio, DSD streams are decimated to 24 bits at 8 Fs (352.8 ksps) for the purpose of computation in the "E-Chip". But Nishio's patents state, on the one hand, that the DSD stream is decimated to a lower 1 Fs sample rate (44.1 ksps) and, on the other hand, that his methods achieve significant cost savings in computational hardware compared to direct processing of the DSD stream (more on that later; see also J. A. Moorer's 1996 article on that very subject). Therefore, it can be inferred that the methods described in these patents may have been implemented in the second-generation Sony SA-CD multichannel decoders (the CXD2752 and CXD2753) for channel trim level control and limited bass-management features*, as an economically realistic way to do in consumer SA-CD players what had previously been done in professional hardware.
But that was just the beginning.
Soon, Sony Oxford entirely discarded the weird process that had been devised by Nishio for equalization and designed digital processing based on multi-level delta-sigma modulation. To that end, 1-bit DSD delta-sigma input streams are transformed into multi-level (8-bit) sigma-delta streams at the same 2.8224 MHz sample rate as the input signal, in order to interface them with internal digital processing blocks on an entirely new processing board. It is this intermediate interfacing digital format that has been dubbed "DSD-Wide". The rationale behind the technical choice made by Sony Oxford and the overall system description were presented at the 110th AES Convention in Amsterdam in 2001. It is from this presentation that I extracted these rare low-res photos of the new processing board (I hope publishing them is fair use):
Later on, Sony Oxford leveraged this new expertise in signal processing on multi-level delta-sigma modulation by releasing D-MAP processors on the pro market, sold to third parties. D-MAP stands for "DSD-Modular Audio Processing". It takes the form of small circuit boards where algorithms developed by Sony Oxford are implemented in FPGAs (Field Programmable Gate Arrays). Various modules with special capabilities were put on the market, such as this one dedicated to building compressors/limiters, where we can see a photo of the hardware module (excerpt from a promotional brochure of the time):
Sony Oxford even developed a technology demonstrator of a fully digital preamplifier based on a D-MAP development board, in the hope of attracting the interest of consumer electronics manufacturers (excerpt from a Sony Oxford commercial brochure of the time):
Of course, as we all know, no consumer electronics manufacturer expressed interest in incorporating this technology in their products.
However, to my knowledge, the Sony CXD2926 DSD processor has been used at least in some high-end consumer products: the various versions of the emm Labs CDSD and the Esoteric K-03 and D-05.
As far as the pro-audio market is concerned, what has been achieved?
Sonic Solutions released a seemingly rare DAW extension based on a first processing card named DSD.1, with Sony CXD2926 processors, and later a more powerful extension named DSD.X, with Sony Oxford D-MAPs. To my knowledge, this system is the only DSD-capable DAW that has ever been created to run on Apple's Macintosh system. It has been used by some in the UK, including the BBC: I was able to find a thread on a UK pro-audio forum where a person was desperately searching for the system in order to read the digital file format used by this Sonic Solutions DAW and was directed towards the BBC. It should be stressed that James A. Moorer of Sonic Solutions had previously given a presentation about DSD signal editing with Ayataka Nishio at an AES Convention in 1998.
Besides, J. A. Moorer had written on the subject of signal processing applied to 1-bit streams as early as 1996, at the 101st AES Convention in Los Angeles (the content of this paper is freely accessible on J. A. Moorer's website). So there must have been a relatively old relationship between Sony and Sonic Solutions on that subject. Here is a screenshot of a long-gone website with a view of the hardware parts:
Probably the best-known DSD-capable DAW is the SADiE DSD8 from the series 5 (from 2003 onward), which used Sony Oxford D-MAP MixEQ modules and Sony CXD2926 "E-Chip" processors:
As for Sonoma, the software system was later transferred to Gus Skinas's Super Audio Center (Boulder, Colorado, USA) around 2004.
Gus Skinas kept developing and distributing it for a while, until a date I don't remember. But I have heard from Mr Skinas, in an interview he gave, that the last Sonoma system he delivered was for Daft Punk, to be used during the production of the famous Random Access Memories album. The hardware side of the Sonoma system still made use of D-MAP modules from Sony Oxford. The system was extended to 24 tracks, and then again to 32 tracks, as can be seen in this ad blurb of the time:
One last client for D-MAP modules is Genex, which released a DSD-capable DAW, but at a later date (2007). It was the "Mix+" hardware/software engine. Here are views of the hardware enclosure and the internal processing card with D-MAP modules whose processors are covered with heat sinks (photos found on an auction website!):
By that time, Oxford had already split from Sony (in 2006) and had become known as Oxford Digital. To this day, this enterprise still has processing of 1-bit DSD in its technology licensing portfolio.
Of course, all of that is old history, even if many of the above-mentioned DAWs are still in use in multiple places.
DSD-Wide is often presented as a PCM signal and has thus been nicknamed "PCM-Narrow" by some. But, to the best of my understanding, that is definitely not true, for the following fundamental reasons.
PCM is a kind of modulation where the analogue signal of interest (the audio) is sampled at a certain rate and the resulting samples are converted into digital words which correspond to the respective level of each sample of the signal of interest. It is a straightforward, direct description of the signal of interest.
Sigma-delta modulation does not contain samples of the signal of interest (the audio). The system relies on noise shaping, whose very principle is to take into account the state of the output in order to shape the noise (the quantization error). What is sampled is not the signal of interest but an error signal, which is the product of processing both the input and the output of the system. Whether the modulation contains only two levels (1 single bit) or multiple levels (eventually expressed with multiple bits in a digital system), depending on the design of the modulator, changes nothing about the fundamental difference, i.e. a delta-sigma modulator samples a derivative function of the signal, not the signal itself.
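For readers who prefer code to words, here is a minimal Python sketch of the two mechanisms. It is an illustration under my own assumptions (a first-order error-feedback loop, hypothetical function names, a constant test signal) and not a model of any real converter. It only shows the mechanics and does not settle how a multi-level stream should be classified.

```python
def pcm_encode(samples, bits=16):
    """PCM: each word is the rounded level of the corresponding sample."""
    full_scale = 2 ** (bits - 1) - 1
    return [round(x * full_scale) for x in samples]


def sigma_delta(samples, levels=2):
    """First-order noise-shaping loop with `levels` output levels in [-1, 1]."""
    step = 2.0 / (levels - 1)
    error, out = 0.0, []
    for x in samples:
        v = x + error                       # input plus the fed-back error
        k = min(max(round((v + 1.0) / step), 0), levels - 1)
        y = -1.0 + k * step                 # quantised output level
        error = v - y                       # quantisation error kept for the next sample
        out.append(y)
    return out


signal = [0.3] * 1000                       # hypothetical constant test signal
words = pcm_encode(signal)                  # every word describes one sample
one_bit = sigma_delta(signal, levels=2)     # only +1.0 / -1.0 ever appear
wide = sigma_delta(signal, levels=256)      # same loop, 8-bit output
```

In the PCM list every word can be read on its own. In the 1-bit list no single value equals 0.3; the level only reappears once the stream is averaged (low-pass filtered), and the same loop is used whether the quantiser has 2 or 256 levels.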
At least that's what I understand about how both systems work. But I also think that once you get out of the cold technical analysis, all the debates around this subject boil down to the debate about the sex of angels.
Note:
* As anecdotal evidence for this hypothesis, I can personally testify that I encountered unexpectedly huge distortion when playing, on a disc player based on a Sony CXD2753 chip with the trim level function enabled, one of those rare early SA-CDs that contain DSD signals briefly over-modulated above the 0 dB SA-CD level prescribed by the Scarlet Book specifications. It is well known that a DSD signal modulated above 0 dB SA-CD may clip PCM stages when 0 dBFS PCM is set equal to 0 dB SA-CD, i.e. with no headroom to accommodate over-modulated DSD peaks during conversion from DSD to PCM. It is therefore possible that the distortion was the product either of clipping in the PCM signal mixed with the DSD signal prior to re-modulation back to 1 bit, or of saturation of the output sigma-delta modulator due to excessive digital input level.