As explained by Black Elk on SHF, Philips, contrary to Sony, did not have a pro-audio division and thus chose to work with Merging, transferring its expertise in digital signal processing and letting Merging bring to market the appropriate tools to support SA-CD production. According to BE, Philips's engineers thought that processing at 16 Fs PCM (705.6 ksps in a 44.1 ksps based system or 768 ksps in a 48 ksps based system) was necessary to keep the time domain performance of DSD signals, but Merging didn't have the technical platform to achieve that sample rate at the time. As a result, Philips and Merging agreed to fall back on 8 Fs PCM (352.8 ksps or 384 ksps), which was within the capabilities of Merging's hardware and software portfolio at the time. Merging decided to name its format DXD, which angered Sony, which owns the rights to the DSD acronym and felt that Merging was being commercially parasitic by taking advantage of the fame of the DSD name. But eventually everyone agreed to allow Merging to use the name DXD.
The following is my understanding, from years of research and reading in the Internet archives.
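For reference, the sample rates quoted above are simple multiples of the base rate. Here is a small Python sketch of the arithmetic; the function name `pcm_rate_hz` is mine, and the DSD rate used for comparison is the standard 64 Fs figure (2.8224 MHz), which is my addition and not part of BE's account:

```python
# Base sample rates (Fs) of the two PCM families, in Hz.
BASE_RATES_HZ = (44_100, 48_000)

# Standard DSD rate: 64 x 44.1 kHz = 2.8224 MHz.
DSD64_RATE_HZ = 64 * 44_100


def pcm_rate_hz(base_rate_hz, multiple):
    """Return the PCM sample rate for a given multiple of the base rate."""
    return base_rate_hz * multiple


for base in BASE_RATES_HZ:
    for multiple in (16, 8):
        rate = pcm_rate_hz(base, multiple)
        print(f"{multiple} Fs on a {base / 1000:g} ksps base: {rate / 1000:g} ksps")

# 8 Fs in the 44.1 ksps family is exactly one eighth of the DSD rate.
print("DSD rate / 8 Fs rate =", DSD64_RATE_HZ // pcm_rate_hz(44_100, 8))
```

So the fallback from 16 Fs to 8 Fs halves the PCM sample rate, and in the 44.1 ksps family it leaves an integer ratio of 8 between the DSD rate and the PCM rate.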
The Merging system was not put on the market before 2004. Before that, DSD was already editable, within some limitations, thanks to Sony hardware and software.
Sony engineer Ayataka Nishio is credited with having led the development of the first digital signal processor to edit DSD for production purposes. He presented this chip as early as 1997 at an AES convention in Munich. Based on this chip, Sony created the Sonoma hardware/software recorder/editor combination to support production. This DAW was not put on the pro audio market at the beginning, but lent to various facilities around the world to help support recording music for SA-CD production. It is still in use today in some studios, as can be seen here:
https://www.sonymusicstudio.jp/s/studioen/page/equipment?ima=5651#section5
View attachment 392238
Sonoma was a mobile recorder/editor running on a computer under Windows. Here is the back of such a computer, with the connector side of an early Sonoma processing card, which interfaces with outboard A/D and D/A converters:
View attachment 392211
And here is a view of the Sonoma processing card, with four Sony CXD2926 DSD edit processors, which were dubbed "E-chip":
View attachment 392218
It can be noted that the DSP chips are labelled "CXD2926AQ". In the Sony Semiconductor naming system, the suffix letter "Q" means "QFP package" (as can be seen) and "A" means "Improved specifications". The visible chips thus belong to a second generation.
According to Nishio's presentation, the CXD2926 is a 4-track input/2-channel output chip, hence the 4 chips on the processing board, which can handle up to 8 channels of DSD (remember an SA-CD has 8 channels total, 2 in stereo, 6 in multichannel). This chip was capable of level control, switching between DSD streams, cross-fade, fade-in/fade-out and, under some conditions, equalisation. The level control is absolutely straightforward. The switching part was the most difficult to implement because, as pointed out by NTTY, a DSD stream has no "0" level and hence signal-dependent DC levels are part of the DSD signal. Dealing with switching requires very thoughtfully designed processing, with a carefully synchronised fade-out/fade-in process to avoid an abrupt transition between two DSD streams. By the way, such processing was incorporated in the decoder (of the Sony CXD275x or Philips "Furore" family) of every SA-CD player from the very beginning of the format, to avoid "clicks" at the beginning of a track or from track to track.
The little add-on card at the upper right corner of the Sonoma processing board is another part of the pro environment created by Sony Oxford, Sony's European pro-audio division: the Super-MAC digital multitrack interface. It was able to transmit up to 24 DSD channels, or 6 to 48 channels of PCM audio from 16 to 24 bits and 44.1 ksps to 384 ksps, through an Ethernet cable. This interface was later improved to the Hyper-MAC standard. The output RJ45 connector of the interface can be seen at the back of the computer in the first photograph.
I wrote that the CXD2926 is capable of equalisation under some conditions. This is where PCM came in, but not through a conversion from DSD to PCM and back to DSD. The process is much weirder and is described in Ayataka Nishio's patent on the subject (US 5,835,043). A PCM signal path with a suitable sample rate is derived from the input DSD stream. This PCM signal path contains only the frequency pass-band of interest (obviously). This PCM signal is equalised as desired, then compared with the un-equalised PCM signal to get the difference between the two. This difference signal is linearly interpolated up to the DSD sample rate and added to the input DSD stream, which has been delayed by the time necessary to do the processing on the PCM streams. This addition produces a multi-bit signal at the DSD sample rate, which is re-modulated back to 1 bit by a following sigma-delta modulator. Weird, but it works that way.
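To make the signal flow easier to follow, here is a toy Python sketch of the steps as I understand them. It is not Sony's implementation nor the exact method of the patent: the block-average decimator, the first-order modulator, the ratio of 8 and all function names are my own simplifying assumptions, and because the sketch works offline on whole buffers, the delay compensation of the real hardware is reduced to simple index alignment.

```python
def sigma_delta_1bit(samples):
    """First-order error-feedback modulator (assumed): multi-level in, +1/-1 out."""
    out, err = [], 0.0
    for u in samples:
        v = u + err                  # input plus the previous quantisation error
        y = 1 if v >= 0 else -1      # 1-bit quantiser
        err = v - y                  # error carried over to the next sample
        out.append(y)
    return out


def decimate(bits, ratio):
    """Crude DSD-to-PCM path (assumed): average each block of `ratio` samples."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]


def interpolate_linear(pcm, ratio):
    """Linear interpolation of a PCM-rate signal up to the 1-bit stream rate."""
    out = []
    for a, b in zip(pcm, pcm[1:] + pcm[-1:]):
        out.extend(a + (b - a) * k / ratio for k in range(ratio))
    return out


def equalise_dsd(bits, eq, ratio=8):
    """Difference-path equalisation; len(bits) must be a multiple of `ratio`."""
    pcm = decimate(bits, ratio)                        # PCM path from the 1-bit stream
    diff = [e - p for e, p in zip(eq(pcm), pcm)]       # equalised minus original
    correction = interpolate_linear(diff, ratio)       # back up to the stream rate
    multi_level = [b + c for b, c in zip(bits, correction)]
    return sigma_delta_1bit(multi_level)               # re-modulate to 1 bit


# Demo: a 1-bit stream carrying a DC level of 0.5, "equalised" with a plain gain.
stream = sigma_delta_1bit([0.5] * 4096)
halved = equalise_dsd(stream, lambda pcm: [0.5 * p for p in pcm])
print(sum(stream) / len(stream), sum(halved) / len(halved))
```

One thing the sketch makes visible: when the equaliser does nothing, the difference signal is zero and the 1-bit stream goes through the final modulator unchanged, so only the correction passes through the PCM path.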
But that was just the beginning.
Soon, Sony Oxford entirely discarded the weird process that had been devised by Nishio for equalisation and designed digital processing based on multi-level delta-sigma modulation. To that end, DSD 1-bit delta-sigma input streams are transformed into multi-level (coded on 8 bits) sigma-delta streams at the same 2.8224 MHz sample rate as the input signal, in order to interface them with internal digital processing blocks on an entirely new processing board. It is this intermediate interfacing digital format that has been dubbed "DSD-Wide". The rationale behind the technical choice made by Sony Oxford and the overall system description were presented at an AES convention in Amsterdam in 2001. It is from this presentation that I extracted these rare low-res photos of the new processing board (I hope publishing them is fair use):
View attachment 392226
Later on, Sony Oxford leveraged this new expertise in signal processing on multi-level delta-sigma modulation by releasing D-MAP processors on the pro market, sold to third parties. D-MAP is the acronym of "DSD-Modular Audio Processing". It takes the form of small PCB boards where algorithms developed by Sony Oxford are implemented in FPGAs (Field Programmable Gate Arrays). Various modules with specific capabilities were put on the market, such as this one dedicated to compression/limiting, where we can see a photo of the hardware module (excerpt from a promotional brochure of the time):
View attachment 392227
Sony Oxford even developed a technology demonstrator of a fully digital preamplifier based on a D-MAP development board, in the hope of attracting the interest of consumer electronics manufacturers (excerpt from a Sony Oxford commercial brochure of the time):
View attachment 392228
Of course, as we all know, no consumer electronics manufacturer expressed interest in incorporating this technology in their products.
As far as the pro-audio market is concerned, what has been achieved?
Sonic Solutions released a seemingly rare DAW extension based on a first processing card named DSD.1, with Sony CXD2926 processors, and later a more powerful extension named DSD.X, with Sony Oxford D-MAPs. To my knowledge, this system is the only DSD-capable DAW that has ever been created to run on Apple's Macintosh system. It has been used by some in the UK, including the BBC: I was able to find a thread on a UK pro-audio forum where a person was desperately searching for a system able to read the digital file format used by this Sonic Solutions DAW and was led towards the BBC. It should be noted that a Sonic Solutions guy named James A. Moorer had previously given a presentation about DSD signal editing with Ayataka Nishio at an AES convention in 1998. So there must have been a relatively old relationship between Sony and Sonic Solutions on that subject. Here is a screenshot of a long-gone website with a view of the hardware parts:
View attachment 392230
Another client of Sony Oxford for D-MAP modules was Genex, which released a DSD-capable DAW, but at a later date (2007). It was the "Mix+" hardware/software engine. Here are views of the hardware enclosure and the internal processing card with D-MAP modules whose processors are covered with heat sinks (photos found on an auction website!):
View attachment 392232
View attachment 392233
Lastly, probably the best-known DSD-capable DAW is SADiE 5.0, which used Sony Oxford D-MAP modules:
View attachment 392235
As for Sonoma, the software system was later transferred to Gus Skinas's Super Audio Center (Boulder, Colorado, USA), which kept developing and distributing it for a time, until an end date I don't remember. But I heard from Mr Skinas, in an interview he gave, that the last Sonoma system he delivered was for Daft Punk, to be used during the production of the famous Random Access Memories album. The hardware side of the Sonoma system still made use of D-MAP modules from Sony Oxford. The system was extended to 32 tracks, as can be seen in this ad blurb of the time:
View attachment 392236
Of course, all of that is old history, even if many of the above-mentioned DAWs are still in use in multiple places.
DSD-Wide is often presented as a PCM signal and thus has been nicknamed "PCM-Narrow" by some. But, to the best of my understanding, it is definitely not, for the following fundamental reason.
PCM is a kind of modulation where the analogue signal of interest (the audio) is sampled at a certain rate and the resulting samples are converted into digital words which correspond to the respective level of each sample of the signal of interest. It is straightforwardly a direct description of the signal of interest.
Sigma-delta modulation does not contain samples of the signal of interest (the audio). The system relies on noise-shaping, whose very principle is to take into account the state of the output in order to shape the noise (the quantisation error). What is sampled is not the signal of interest but an error signal which is the product of the processing of both the input and the output of the system. Whether the modulation contains only two levels (1 single bit) or multiple levels (possibly expressed with multiple bits in a digital system), depending on the design of the modulator, changes nothing about the fundamental difference, i.e. a delta-sigma modulator samples a derivative function of the signal, not the signal itself.
At least that's what I understand about how both systems work. But I also think that once you get out of the cold technical analysis, all the debates around this subject boil down to the debate about the sex of angels.
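To illustrate the distinction as I understand it, here is a minimal Python sketch comparing a memoryless quantiser (the PCM case) with a first-order noise shaper fed the same samples. This is only a toy model under my own assumptions: real DSD modulators are higher-order designs, the five-level case merely stands in for a multi-level format such as DSD-Wide, and the function names are mine.

```python
def nearest(value, levels):
    """Return the quantiser level closest to `value`."""
    return min(levels, key=lambda lv: abs(lv - value))


def pcm_quantise(samples, levels):
    """PCM-style: every sample is quantised on its own, with no memory."""
    return [nearest(x, levels) for x in samples]


def sigma_delta(samples, levels):
    """First-order noise shaper (assumed): each output depends on past errors."""
    out, err = [], 0.0
    for x in samples:
        v = x + err              # input plus the fed-back quantisation error
        y = nearest(v, levels)   # coarse quantiser
        err = v - y              # error carried over to the next sample
        out.append(y)
    return out


two_levels = (-1.0, 1.0)                    # 1-bit case
five_levels = (-1.0, -0.5, 0.0, 0.5, 1.0)   # toy multi-level case
signal = [0.25] * 400                       # constant input for simplicity

for name, levels in (("2 levels", two_levels), ("5 levels", five_levels)):
    direct = pcm_quantise(signal, levels)
    shaped = sigma_delta(signal, levels)
    print(name,
          "| direct quantiser mean:", sum(direct) / len(direct),
          "| noise shaper mean:", sum(shaped) / len(shaped),
          "| first shaper outputs:", shaped[:8])
```

With the direct quantiser, each output word is the (rounded) level of its own sample. With the noise shaper, no individual output value equals the input level; the input only emerges from the average of many outputs, and this holds whether the quantiser has two levels or more.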