Introduction: In 2021, many USB DACs and portable DAPs advertise "DSD support" and "native DSD playback", keywords that matter in terms of sales. More and more devices also support formats such as DSD128 (5.6MHz), twice the rate of DSD64 (2.8MHz), and DSD256 (11.2MHz), four times the rate. For home USB DACs, supporting such high sample rates is useful mainly as a way to show off the manufacturer's technical prowess (conversely, there is little way to stand out unless you do). I suspect that from 2023 onwards it will become the norm for all chips to handle all these high rates.
Classical recording work: It is widely known that DSD works best for 'one shot' recordings, such as jazz and classical concerts, or for digitising old analogue tapes. Once recorded, DSD data is cumbersome to edit (or rather, it degrades each time it is edited), so high-res PCM is better for studio albums where various takes are cut and pasted together later. In classical recording, for example orchestral studio sessions, it was common practice to set up a number of independent microphones for each instrumental section, record each separately (48 tracks or so) digitally, and then cut and paste the recordings on a computer afterwards, adjusting the timbre and volume of each section.
A recording producer and engineer must understand not only the sound quality but also the flow and content of the music: "the trumpets should be more prominent here, so let's raise the trumpet track", or "the violins are covered by the singers here, so let's suppress them a little". The job is to enhance the musicality, not just to polish the sound. If you look at snapshots of studios of the time, you will often see a producer, score in hand, soundchecking a just-recorded passage with the conductor. In the same way that a film director obsesses over a single emotional scene, it was commonplace in music to spend two or three hours re-taking a one-minute passage to get it right, and to spend 100 hours editing a one-hour album after the recording session.
It is rare to find a violin concerto by a famous performer where the orchestra and the violinist's solo were recorded on separate days in different studios. That is common practice in rock and pop, the so-called karaoke method: by recording separately, the song is easier to edit later, and if you don't like the sound of a single note (or played it wrong), you can fix it. There is a famous story about a great opera singer who, because of age, could no longer sing the high notes, so those notes were replaced with another singer's voice. All of this is the result of painstaking multitrack recording and careful editing on a generous budget. These days the music industry, not just classical labels, is on a tight budget, and every studio, however good its multitrack facilities, wants to produce albums in the shortest possible time, because the labour cost of re-recording and editing dozens of times is too high. Orchestra members are paid by the hour, with overtime determined by the trade union, so keeping them in the studio for retakes costs money. Music fans, meanwhile, have long complained that heavily edited studio recordings are too perfect and feel uncomfortable, nothing like the actual live concert experience. If that is the case, it is better to record a live concert performance in high quality and sell it, with no editing needed afterwards. In the 80s, 90% of new releases on the major labels were studio recordings and 10% live recordings, but nowadays the situation seems to have reversed!
Most classical concert programmes are performed three or four times over the course of a week, so for recent albums the recorder tends to be kept running every day, and the best performances of each movement are cut and pasted into a single album (it would be a shame if a whole take were rejected because an audience member coughed). For such simple editing work, DSD is not too difficult. The microphones at the concert venue can be pre-balanced using the venue's analogue mixer, and the finished stereo and surround mixes recorded on a DSD recorder. All that's left is to prepare the cover photo and liner notes and either press the SACD or sell it as a download on a DSD distribution site.
Why DSD? Whenever the subject of DSD comes up, the question that always heats up the debate is "why bother with DSD and not hi-res PCM?" In short, the only answer is that it was the best way to go at the time. Also, unlike audiophiles, many studios use the same recording equipment for 10 or 20 years, because reliability is the priority.
If you go back to around 1999, there was a format war between DSD (SACD) from the Sony-Philips camp and high-res PCM (DVD-Audio), pushed by Panasonic, JVC and others, each claiming higher sound quality than CD. In hindsight, DVD-Audio with its high-resolution PCM looks like the right choice, but at the time it struggled to win sponsorship from the music industry and lost out to SACD in marketing terms. In fact, "better sound quality than CD" was just a pretext; the media industry's real intention was to find a new copy-protected format to replace the CD, which had become an all-you-can-copy format with the spread of computers and CD-R drives. Piracy was rampant, copy-control CDs and digital rights management (DRM) were in the spotlight, and there was a real sense of crisis in the industry. As a result, SACD became the next-generation medium, and for classical music lovers, long sensitive to 'high quality sound', SACD became the format of choice, with its ultra-high resolution and surround sound!
The advent of DSD: To compare the difference in sound quality between DSD and PCM, let's look back at the debut of the DSD format. CDs are recorded as 44.1kHz 16bit PCM, so why was 2.8MHz 1bit DSD born? From a listener's point of view, sound-quality theory aside, it seems to me that the biggest turning point for DSD taking root commercially was the introduction of Philips' SAA7350 series of D/A converter (DAC) chips in 1991.
At that time, despite the explosion of the CD, pure 16-bit PCM DACs were reaching their limits, and the quest for perfection was becoming a costly quagmire. A 16-bit chip embeds 16 resistive switches, and each resistor has to be exactly right: the thin resistors are trimmed by laser cutting, and an error of even a few microns is enough to ruin true 16-bit accuracy. Chips of the era such as the Philips TDA1541A and Burr Brown PCM56 were very popular with audiophiles, and the less accurate examples were sold at a lower price. The TDA1541A carried a crown mark on the chip surface, with the best parts getting two crowns (the 'double crown'), while rival Burr-Brown used a "K" rank for the highest grade, with "J", "no mark", "L", etc. below it. A K-rated chip could cost more than 10 times as much as an unmarked one. (You can still find fake chips on eBay, for example, where someone has forged the crown mark.) The distortion (THD) specs were around -92dB for the K-rated parts and -82dB for the unmarked ones, so they seemed to be doing their best, almost on the edge, when the theoretical dynamic range of a 16-bit CD is said to be 96dB.
SAA7350 and bitstream: This is where the Philips SAA7350 series DAC chip comes in: it converts 16-bit PCM data with a high-speed 1-bit conversion circuit, eliminating the need for 16 laser-trimmed resistors. With one bit, the analogue conversion circuit is simply a single fast-acting switch. Philips called this the "bitstream method" at the time. To borrow Nishio's example: a 1-bit DAC meters out water, beer or milk in a factory by how many times a single tap is opened and closed at high speed, whereas a 16-bit DAC meters the volume by how many of its 16 taps are opened simultaneously. Rather than carefully adjusting the size and water pressure of all 16 taps at the factory so that each is exactly as designed, moving just one tap at high speed is cheaper, easier and more accurate.
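The single-tap idea maps directly onto a first-order delta-sigma modulator. The sketch below is a toy illustration of my own, not the SAA7350's actual (higher-order, MHz-rate) circuit: it turns a stream of PCM sample values into a ±1 bitstream whose local density of +1s tracks the signal level.

```python
def delta_sigma_1bit(samples):
    """Toy first-order delta-sigma modulator.
    samples: floats in [-1.0, 1.0]. Returns a list of +1.0/-1.0 bits."""
    integrator = 0.0   # accumulates the running error
    feedback = 0.0     # last output bit, fed back like the DAC's analogue switch
    bits = []
    for x in samples:
        integrator += x - feedback                      # integrate input minus feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0   # 1-bit quantizer: the single "tap"
        bits.append(feedback)
    return bits

# A constant level of 0.5 comes out as a bit pattern whose average is ~0.5:
bits = delta_sigma_1bit([0.5] * 1000)
print(sum(bits) / len(bits))  # close to 0.5
```

Opening and closing one switch at the right density is all the analogue side has to do, which is exactly why a chip built this way could be cheap.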
The SAA7350 had an astonishing -96dB THD specification as a standard product, without any ranking or selection, and more importantly it could be bought for only about $3, whereas previous high-end 16-bit DACs cost around $20 each. In other words, a cheap CD player with the SAA7350 was, "on specs alone", more powerful than a high-end CD player that stuck with an expensive 16-bit DAC. At the time, high-end 16-bit designs were becoming ever more costly, installing multiple DAC chips in parallel instead of just one to reduce distortion.
With 16-bit DACs it was necessary to use a number of high-grade capacitors to stabilise the switch currents, so even with the same DAC chip, the cost of the peripheral components visibly changed the sound-quality specifications. With the SAA7350, by contrast, as long as a high-precision crystal clock was provided, the rest of the circuitry had relatively little impact and low cost. It was around this time that "DAC clock jitter" became a popular topic.
Even then there was a battle between PCM and 1-bit CD players in the audiophile world, just as there is now with hi-res PCM vs DSD. But as we all know, the sound quality of audio equipment is determined not only by the DAC chip's specification but by many other factors, such as power-supply stability and the performance of the analogue circuitry. There was a big difference in sound between the PCM CD players, which had evolved into big machines, and the light, small 1-bit players.
Apart from the Philips SAA7350, Sony was using a 1-bit DAC called the PULSE DAC at the same time, and there was also Panasonic's MASH. All were excellent chips, but only Philips managed to popularise its part widely. Burr Brown, one of the biggest 16-bit manufacturers, also introduced fully 1-bit DAC chips at this time, such as the PCM69, the successor to the PCM54. In other words, almost all DAC manufacturers recognised the benefits of 1-bit in one way or another. There are two reasons why the Philips SAA7350 was so important. Firstly, the chip was available in bulk at a low price and was used by many audio manufacturers, not just Philips; unsurprising, since it was cheap and powerful. The other reason the SAA7350 matters in this DSD story is that it had a digital output for its 1-bit data.
1-bit dedicated DAC TDA1547: The SAA7350 was certainly a good DAC chip, but its top priority from the outset was low-cost mass production, and Philips felt that further gains in sound quality were possible for high-end audio. So instead of releasing a higher-grade version of the SAA7350, Philips opted to take the high-speed 1-bit data from the SAA7350 and convert it to analogue on another, higher-performance chip. That analogue conversion chip is the TDA1547: a "1-bit data only" DAC that cannot handle PCM, and thanks to its accurate, unmatched current switching it achieved a distortion (THD) of -101dB and a signal-to-noise ratio of 111dB, dramatically better than the SAA7350 alone. At last, DAC chips in ordinary consumer equipment could consistently outperform the 44.1kHz 16-bit of CD. Sending and receiving 1-bit data was not yet common, so the SAA7350 and TDA1547 pair was intended to be used as a set, nicknamed the "DAC7", and it was widely used in high-end CD players of the time. DAC7 systems of that era are still among the most appealing in terms of sound quality. It was also possible to oversample the 44.1kHz PCM data from a CD by a factor of 8 before sending it to the SAA7350 (44.1 x 8 = 352.8kHz), using an oversampling chip such as the popular Japan Precision Circuits (now Seiko NPC) SM5803. In other words, the DAC section of a Philips CD player of the time was a combination of three chips: one that oversampled the CD's 44.1kHz PCM to 352.8kHz, one that converted the 352.8kHz PCM to high-speed 1-bit, and one that converted the 1-bit stream to analogue. The upper limit of the Philips 1-bit rate was 44.1kHz x 8 x 24 = 8.47MHz, though the actual rate varied by model. Around 1995 the TDA1307 chip combined the first two steps (the oversampling and the 1-bit conversion), further streamlining the 1-bit conversion of CDs.
SACD players: The configuration of this Philips DAC chain shows that already in the early 90s, two data formats, "352.8kHz PCM" and "high-speed 1-bit (i.e. DSD)", were being handled in real time inside CD players. The same was true of the A/D converters used in digital recording, which led to the idea that it would be better to record, store and play back music in one of these intermediate formats, rather than sticking to the CD's 44.1kHz 16-bit. It seems natural, then, that the SACD medium proposed by Sony and Philips was based on 2.8MHz 1-bit. From Philips' point of view, a 2.8MHz 1-bit recording can easily be converted to high-quality analogue using a single TDA1547 chip. The choice of 2.8MHz (DSD64) rather than 8.47MHz (DSD192), the highest rate the TDA1547 can convert, was probably down to the limited capacity of a single SACD disc (approx. 4GB), which also had to hold surround recordings. In fact the Marantz (Philips) SA-1, which marked the SACD debut, has a simple circuit in which PCM data from CD is converted to DSD via the TDA1307, while DSD data from SACD is fed directly into the TDA1547 chip. This was the shortest route to high-quality sound that Philips had devised. As a side note, Philips in the Netherlands and Marantz in Japan have had a close relationship since the early days of CD: Philips specialised in developing and manufacturing the reading lasers and DAC chips, but outsourced the finished CD players to Marantz, an established audio manufacturer, and many models were sold under the Philips name in Europe and under the Marantz name in Japan and the USA. However, Japan's bubble economy had burst, and PCs, MP3 and the iPod were coming; by the time SACD was introduced, the future of the music industry in Japan already looked bleak... Marantz did not stick only to Philips DAC chips, but also used Seiko NPC, Burr Brown, and the newly developed CS4397 DAC from Cirrus Logic (to which many Philips engineers had moved).
Sony: The other leading player in DSD, Sony, used a different 1-bit system from Philips at about the same time, called the PULSE DAC. It started in 1990 with the CXD2552 chip (used in the CDP-X777ES). The approach is the same as Philips': first oversample 44.1kHz to 352.8kHz with an oversampling chip, then convert to 1-bit pulses with the CXD2552; but while Philips used PDM (pulse density) to indicate amplitude, Sony used PLM (pulse length). Four years later, in 1994, Sony introduced the bridging "current-pulse DAC" system, in which the voltage pulses output by the CXD2552 are converted into powerful analogue current pulses by a subsequent chip. When it debuted it looked like a gimmick, but in hindsight what Sony was pursuing was "direct amplification of 1-bit pulses". In other words, for Sony, 1-bit was not just a digital data format but a neutral entity that treats digital as analogue. The S-MASTER amplifier, since used in Sony's AV amplifiers and high-resolution Walkmans, is in principle the same as the current-pulse DAC: digital data is first converted into PLM voltage pulses, which then drive the speakers or headphones directly through a powerful current switch. That is Sony's tradition. Unlike Philips, which supplied DAC chips to audio manufacturers large and small, Sony's DAC chips were used almost exclusively in its own CD players, so their performance remains largely a mystery. Around the arrival of SACD, Sony, like Philips, stopped developing dedicated DAC chips. The first Sony SACD player, the SCD-1, used Sony's CXD8594 DAC chip, but from the third-generation SCD-XA5400ES Sony began to use Burr Brown's DSD1796 and other DACs, as Marantz did. The sound quality seems to have changed dramatically during this transition.
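The PDM/PLM distinction is easy to show in miniature. In the toy sketch below (my own illustration; `plm_frame` and the 16-slot frame are assumptions, not the CXD2552's actual format), pulse-length modulation packs each sample into one fixed frame whose single pulse width encodes the amplitude, whereas pulse-density modulation would spread the same average across many independent pulses.

```python
def plm_frame(level, slots=16):
    """Toy pulse-length modulation: one frame of `slots` sub-intervals,
    with a single pulse at the start whose length encodes the level.
    level: float in [-1.0, 1.0]. Returns a list of +1.0/-1.0 values."""
    on = round((level + 1.0) / 2.0 * slots)   # map [-1, 1] -> pulse width 0..slots
    return [1.0] * on + [-1.0] * (slots - on)

frame = plm_frame(0.5)
print(frame.count(1.0), sum(frame) / len(frame))  # 12 slots high, average 0.5
```

Both encodings average out to the same analogue value; the difference is how the pulse energy is distributed in time, which changes the spectrum of the switching noise the analogue stage has to deal with.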
In 1999 the 2.8MHz 1-bit DSD system was proven, high-quality technology in 1-bit DACs for CD, but since then DAC chip manufacturers such as Burr-Brown and Cirrus Logic have developed even faster DACs with 128x and 256x oversampling, as well as 4-8 bit delta-sigma stages instead of 1-bit. As a result, the 2.8MHz DSD adopted for SACD has come to be treated as a format from a transitional period, with sound quality "halfway between PCM and multibit". For example, the advanced current segment method used by Burr-Brown since 2003, just after the SACD debut, combines the best of both PCM and 1-bit DACs: purely resistive conversion for the top 6 bits of the 24-bit data, and high-speed 5-bit delta-sigma modulation for the bottom 18 bits.
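The split described above can be illustrated with plain integer arithmetic. This toy Python snippet is my own sketch of the idea, not a description of the actual silicon: it separates a 24-bit code into the coarse part that precision current segments would handle and the fine remainder left to the fast multi-bit delta-sigma stage.

```python
def split_24bit(code):
    """Split an unsigned 24-bit sample into (top 6 bits, bottom 18 bits)."""
    assert 0 <= code < (1 << 24)
    top6 = code >> 18                # 64 coarse levels -> precision current segments
    low18 = code & ((1 << 18) - 1)   # fine remainder -> high-speed delta-sigma stage
    return top6, low18

top, low = split_24bit(0xABCDEF)
print(top, low)
assert (top << 18) | low == 0xABCDEF  # the split is lossless
```

The appeal of the architecture is visible in the numbers: the coarse part needs only 64 accurately matched elements (far easier than trimming for 2^24 levels), while the delta-sigma stage mops up the remaining 18 bits of resolution at high speed.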
In the end, it is interesting that whatever the recording format, the modern D/A conversion process sits right in the middle between DSD and PCM. This means the debate between PCM and DSD is rooted in historical evolution and the limitations of the technology of 20 years ago. Without getting into the debate about actual human hearing range, declaring that "numerical specifications" such as sample rate and dynamic range are "good enough already" is a futile exercise because, as with "megapixels" in digital cameras, progress will not stop while there is demand.
The problem with DSD64: One thing that needs to be taken into account when considering DSD is the fact that the DSD64 (2.8MHz 1bit) format used in SACD is "not as high-res as it sounds".
In many high-resolution download shops, DSD albums are priced higher than 192kHz PCM, but in reality they are not that high-performance "on spec" (which does not mean the sound quality is poor). In principle, DSD64 carries high-frequency noise from around 30kHz upward due to its noise-shaping technique, and this needs to be cut by an analogue filter during playback. Without noise shaping there would be no high-frequency noise, but the noise floor in the audible band would rise instead (the dynamic range would be worse than CD). In other words, the high-frequency range of DSD64 is not there, as some high-res believers claim, to "reproduce the ultrasonics of musical instruments inaudible to the human ear"; it is actually a kind of rubbish bin or storage space into which noise shaping drives the audible-band noise.
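The "rubbish bin" behaviour can be seen numerically: a 1-bit stream is almost all quantization noise, yet a crude lowpass filter (standing in for the player's analogue filter) recovers the signal almost exactly, because noise shaping has parked the error at high frequencies. A self-contained Python sketch, using a first-order modulator for simplicity (real DSD modulators are higher-order, but the principle is the same):

```python
def delta_sigma(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] -> +/-1 bits."""
    integ, fb, out = 0.0, 0.0, []
    for x in samples:
        integ += x - fb
        fb = 1.0 if integ >= 0.0 else -1.0
        out.append(fb)
    return out

N = 1 << 14
level = 0.3
bits = delta_sigma([level] * N)

# Error power of the raw bitstream: dominated by shaped quantization noise.
raw_err = sum((b - level) ** 2 for b in bits) / N

# 64-tap moving average, a stand-in for the analogue lowpass at the DAC output.
M = 64
filtered = [sum(bits[i:i + M]) / M for i in range(N - M)]
filt_err = sum((f - level) ** 2 for f in filtered) / len(filtered)

print(raw_err, filt_err)  # the filtered error is far smaller: the noise lives up high
```

In real DSD64 the (higher-order) shaped noise rises from around 30kHz, which is exactly why playback hardware needs the analogue filter described above.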
At the time, an SACD spokesman for Sony (a random suit not particularly interested in music) was asked, "How many kilohertz can DSD play?" The reply: "In principle, it can record up to 100kHz. But we use a technology called noise shaping, which means..." It was just like the hi-res boom: misleading brochures of this kind were common.
Again, the advantage of the DSD method is the simplicity of "direct analogue conversion of digital data"; it was never intended to reproduce high frequencies outside the audible band. In reality, many SACD players and DSD-compatible DACs have circuitry that cuts off frequencies above about 24kHz (or at most 50kHz) for safety reasons, and some filter as low as 30kHz. So in terms of high-frequency reproduction it is not that different from 48kHz PCM. The noise generated by noise shaping is of course unrelated to the music (though who knows, perhaps it has a healing effect). Some people might think that if the noise is inaudible to the human ear it could be left in without filtering, but that is the danger: even if high frequencies are inaudible to humans, amplifiers and speakers will do their best to reproduce them. There was a recall of Sony speakers that failed due to high-frequency noise. Most amplifiers carry a high-frequency filter as a safety measure, to stop incoming high-frequency noise from doing damage. However, audiophiles are usually averse to unnecessary filters and protection circuits, so the more expensive high-end amplifiers often lack these preventative measures, which means the amplifier amplifies the huge amount of high-frequency noise many times over and can burn out the speakers. In addition, some manufacturers who develop their own DACs and amplifiers neglected to test sufficiently, thinking naively, "our DACs have high-frequency filters, so the amplifiers we sell alongside them won't need any". In fact, there was an incident in which a British manufacturer's amplifier, whose circuit design would amplify without limit, was burnt black by high-frequency noise when connected to another company's SACD player.
The most annoying thing about DSD64 is that the starting point of this high-frequency noise, and the residual noise (the so-called noise floor), vary considerably depending on the generation and performance of the DSD recorder. This is especially true of older DSD converters: RCA Living Stereo's DSD remasters, for example, are quite loose in their shaping, but sound very good (the noise floor is dominated by the 1950s tape hiss being remastered anyway...).
DSD A/D converter: The number of DSD A/D converters, which convert analogue audio to DSD, is limited. It is often assumed that since there are many DACs (D/A converters) that convert DSD to analogue, there must also be many ADCs (A/D converters) that convert analogue to DSD, but this is not the case, for a simple reason: there is a huge difference in demand between ADCs and DACs. Release a low-cost, high-performance DAC chip and you can sell 100,000 or 1,000,000 of them, because they go into consumer audio products; ADCs sell in far smaller numbers, so they are a market where you cannot spend much money on performance because it is unprofitable. Field recorders, MD players and other devices that record and play digitally at the same time need both an ADC and a DAC, so at the time Sony produced chips containing both in one package; but if you look at the specifications of these chips, the ADC is often an order of magnitude inferior to the DAC. The same applies to PC codecs.
In the early days there were no mass-produced off-the-shelf DSD recorders, so even major labels like Philips used self-made converters. In the early days of SACD, the only option was the Meitner emmLabs DSD recorders, which Philips loaned to label studios (e.g. the TELARC label); later, Prism Sound ADA and dCS 904 converters became popular for both PCM and DSD (RCA Living Stereo's SACD remasters, early Pentatone, etc.). It is all a bit of a fashion, and "label XX bought a new converter" is always a topic of conversation in the inner circle. In the US, on the other hand, high-resolution PCM recording had spread through the film industry before DSD, so many studios were using 192kHz PCM converters such as Pacific Microsonics, the mainstream at the time, and DSD converters were comparatively rare there. Whether for high-resolution PCM or DSD, the A/D converters of this period all used delta-sigma modulation noise shaping, so there were large differences in sound quality between converters. The problem was that DSD64 was only just barely adequate in high-frequency reproduction and dynamic range, so for studios that had already moved to high-resolution PCM there was little benefit in switching.
Nevertheless, the DSD faithful have never given up, and many studios have continued to use DSD recorders for years. These days the technology has evolved in its own way, the simple answer being that if you record at twice the speed (5.6MHz) or four times the speed (11.2MHz), the high-frequency noise can be pushed further out and the audible noise floor (dynamic range) improves. Of course, this doesn't mean the noise-shaping noise disappears completely, but what used to start around 30kHz with DSD64 starts around 80kHz with DSD128, and the aim is to push it out to where it no longer matters. From a recording point of view it becomes: "no instrument of mine produces such high frequencies, and no microphone of mine could record them satisfactorily anyway..."
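The benefit of doubling the rate can be put in numbers. For an ideal order-L delta-sigma modulator, the in-band quantization noise falls by 3(2L+1) dB for each doubling of the oversampling ratio; this is a textbook result, and since the modulator order of any given recorder is not public, the 5th-order figure below is just an assumption for illustration.

```python
import math

def inband_noise_gain_db(order, rate_ratio):
    """In-band noise improvement (dB) when an ideal delta-sigma modulator's
    oversampling ratio is multiplied by rate_ratio: 3*(2L+1) dB per octave."""
    return 3.0 * (2 * order + 1) * math.log2(rate_ratio)

# DSD64 -> DSD128 (x2) with a hypothetical 5th-order modulator:
print(inband_noise_gain_db(5, 2))   # 33.0 dB lower audible-band noise floor
```

The same doubling also pushes the onset of the shaped noise hill correspondingly higher in frequency, which is the other half of the improvement described above.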
Difference in sound quality between DSD and PCM: But what about sound quality? The recent 'DSD vs hi-res PCM' debate exactly parallels the 1-bit vs multi-bit CD player debate of 25 years ago. Even back in 1991, 1-bit was described as "delicate, high-resolution, with great air and imaging", and multi-bit as "sturdy, with strong bass and energy". It is rare for the same performance to be recorded on DSD and PCM recorders simultaneously, and even if such a comparison could be made, the differences in sound between the recorders themselves would have to be taken into account. Analogue conversion in the DAC also follows a different path for PCM and DSD, and each manufacturer - ESS, AKM, BB, Cirrus Logic - has its own strengths and weaknesses. It is a difficult topic on which to reach a clear conclusion, and the most important thing is to listen. It is easy to fall into conclusions like "this label uses that company's A/D converters, so the sound is harder".
If you ask the people making such claims what microphones were used for a recording, many cannot answer. There are also factors such as microphone placement and the acoustics of the concert hall, and to begin with, the instruments and performers differ. The A/D converter is just one part of the recording chain that contributes to the sound, but because it is the easiest piece of equipment to discuss in spec-sheet terms, it tends to get talked about and compared, like the engine in a car. One interesting example is an album previously sold as SACD or DSD64 that was later sold as a download in DXD, the original master format (DXD is another name for 352.8kHz high-resolution PCM). One would naturally expect the original 352.8kHz PCM file to sound better than the DSD64 down-converted for SACD; however, comparing them in the same studio environment, the DXD sounded hard, crisp and "studio monitor-like", while the DSD sounded more realistic, with softened edges, more depth in the space and a clear emphasis on the "air" around the performer. I'll leave it at that; it was most likely down to the DAC, not the format. There seems little benefit in digging out obsolete papers on DSD at this point; we simply have to wait and see what new innovations arise in this space.
Sound quality does depend on the DAC you use, so being able to choose the sound/DAC you like is a luxury, and the short-sighted superiority debates that flare up on internet forums are probably just arguments between people who don't actually listen to much music! I am willing to defend DSD subjectively, and even to promote DSD recording, but since none of my dongles make DSD sound awesome, I use both: DSD for 44.1-based rates and DXD for 48kHz-based rates.