
The Truth About Music Streaming

Punter

[Image: Napster logo, 2003]

In 1999 the world of music was changed forever with the launch of Napster. Napster was a peer-to-peer (P2P) file sharing system geared towards music in MP3 format. Considering the internet was still largely on dial-up at the time, MP3 file sizes were favourable for sharing: a typical four-minute MP3 could be as small as 3 MB at a bitrate of 96 kbit/s. To access the P2P network, one just had to download and install the Napster app and start choosing files. As soon as your app had stored a file or even part of it, the P2P functionality would begin sharing that file with other users. It didn’t take long for the music industry to react, however, and by 2001 Napster was shut down, buried under an avalanche of lawsuits. Even though it ultimately failed, Napster woke the music business to the realisation that the internet was a threat to its existing business model. Over at Apple, however, Steve Jobs saw the coupling of music and the internet as an opportunity. Prior to the demise of Napster, Apple had launched iTunes in January 2001, which was aimed at allowing users to create a personal music library on their computer, and followed that up with the launch of the iPod in October 2001. The portable MP3 player was nothing new but Steve Jobs thought that the existing units were either cheap and nasty or too big and clunky. The iPod was an instant success and by the time the product was discontinued in 2022, Apple had sold around 450 million units worldwide. With the launch of the iTunes Store in 2003, Apple had a tightly integrated software/service/hardware environment that for a while was the dominant player in the digital music market. The next big thing would be launched in 2007 in the form of the first iPhone.

Devices and networks

As iPhones and other smartphones proliferated, so the mobile network became more capable and ubiquitous. It wasn’t until the 2G digital networks started coming online that there was any useful data bandwidth available to mobile phone users and, for practical purposes, the speeds available were nothing special (around 40 kbit/s) until 3G superseded it. 3G offered data speeds of around 144 kbit/s at first and, with ongoing revisions, eventually reached over 14 Mbit/s. 3G networks started coming online in the early 2000s and were widespread by the end of that decade, which boosted the utility of smartphones enormously.
[Image: Timeline of cellular network standards and generations]

The birth of streaming

Even with this expanded bandwidth, music streaming was still not a major force until well into the 2000s. One of the pioneers, Spotify, had started up in 2006 but didn’t make much of a splash until it opened up registrations for UK subscribers in 2009. The demand was so overwhelming that the open registration model was switched to invitation only to cope with the traffic. Spotify launched in the United States in July 2011. Apple’s answer to Spotify was Apple Music, which launched in 2015. Numerous other streaming services have been launched along the way, like Pandora and Deezer, with some like Tidal promising audio quality to “Master” level.

Formats, algorithms and compression

MP3 is the granddaddy of music compression; its full name is MPEG-1 Audio Layer III or MPEG-2 Audio Layer III. MPEG stands for the Moving Picture Experts Group. MPEG was established in 1988 on the initiative of Dr. Hiroshi Yasuda and Dr. Leonardo Chiariglione. The first MPEG meeting was in May 1988 in Ottawa, Canada. From the late 1990s to the present, MPEG has grown to include approximately 300–500 members per meeting from various industries, universities, and research institutions, combining their expertise for the express purpose of developing methods and standards for digital file compression. Video was the primary goal but audio compression was a natural development in the process. The outcome of the group’s efforts at compressing audio was impressively good. Compared to the file size of uncompressed CD audio, the same audio as an MP3 can commonly achieve a 75 to 95% reduction in file size. For example, an MP3 encoded at a constant bitrate of 128 kbit/s would result in a file approximately 9% of the size of the original CD audio. As a result of this and the popularity of the format, compact disc players increasingly adopted support for playback of MP3 files on data CDs by the early 2000s. Naturally, as bitrates are reduced, compression artefacts become more and more noticeable, often manifesting as an audible harshness which can be quite fatiguing to listen to. However, criticising MP3 for lack of audio quality is missing the point; its purpose was to enable the transfer of audio files over moderately fast computer networks.
[Image: MP3 file structure diagram]
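For anyone who wants to check the percentages quoted above, here is a minimal sketch of the arithmetic. It assumes CD audio is 44.1 kHz, 16-bit, stereo PCM; the function name and the list of MP3 bitrates are only chosen for illustration.

```python
# CD audio: 44,100 samples/s x 16 bits x 2 channels (assumed stereo PCM)
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbit/s


def size_relative_to_cd(mp3_kbps: float) -> float:
    """Return the MP3 size as a percentage of the uncompressed CD audio size."""
    return 100 * mp3_kbps / CD_BITRATE_KBPS


for kbps in (64, 128, 192, 320):
    pct = size_relative_to_cd(kbps)
    print(f"{kbps:>3} kbit/s MP3 = {pct:4.1f}% of CD size ({100 - pct:4.1f}% smaller)")
```

At constant bitrate the ratio of file sizes is simply the ratio of bitrates, which is why the track length never enters into it.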

So how is data compression achieved? Compression is achieved by editing out parts of the audio that the encoder determines are inaudible or masked by louder components of the audio. Because of this process of editing, the compression is classified as “lossy” as opposed to “lossless”. This process is the outcome of the study of psychoacoustics which is the branch of psychophysics involving the scientific study of sound perception and audiology—how humans perceive various sounds. More specifically, it is the branch of science studying the psychological responses associated with sound (including noise, speech, and music). Psychoacoustics is an interdisciplinary field of many areas, including psychology, acoustics, electronic engineering, physics, biology, physiology, and computer science. In audio file compression, the primary focus is on masking which is important for all types of lossy encoding. To explain masking I’m going to quote a Wikipedia article:

“Suppose a listener can hear a given acoustical signal under silent conditions. When a signal is playing while another sound is being played (a masker), the signal has to be stronger for the listener to hear it. The masker does not need to have the frequency components of the original signal for masking to happen. A masked signal can be heard even though it is weaker than the masker. Masking happens when a signal and a masker are played together—for instance, when one person whispers while another person shouts—and the listener doesn't hear the weaker signal as it has been masked by the louder masker. Masking can also happen to a signal before a masker starts or after a masker stops. For example, a single sudden loud clap sound can make sounds that immediately precede or follow inaudible. The effects of backward masking is weaker than forward masking. The masking effect has been widely studied in psychoacoustical research. One can change the level of the masker and measure the threshold, then create a diagram of a psychophysical tuning curve that will reveal similar features. Masking effects are also used in lossy audio encoding, such as MP3.”

https://en.wikipedia.org/wiki/Psychoacoustics
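To make the idea of masking a bit more concrete, here is a deliberately crude toy. It is NOT the psychoacoustic model used by MP3 or any real encoder; the 300 Hz neighbourhood and the 20 dB margin are made-up numbers, and the function name is hypothetical. It only shows the principle that a quiet component sitting close to a much louder one can be thrown away.

```python
def drop_masked(components, neighbourhood_hz=300.0, margin_db=20.0):
    """Toy simultaneous-masking rule (not a real psychoacoustic model).

    components: list of (frequency_hz, level_db) tuples.
    A component is treated as masked when another component within
    `neighbourhood_hz` of it is at least `margin_db` louder.
    """
    kept = []
    for freq, level in components:
        masked = any(
            abs(freq - other_freq) <= neighbourhood_hz
            and other_level - level >= margin_db
            for other_freq, other_level in components
        )
        if not masked:
            kept.append((freq, level))
    return kept


# A loud 1 kHz tone, a quiet tone right next to it, and a quiet tone far away.
spectrum = [(1000.0, 80.0), (1100.0, 50.0), (5000.0, 45.0)]
print(drop_masked(spectrum))
```

The quiet 1.1 kHz tone is discarded because the loud 1 kHz tone sits right beside it, while the equally quiet 5 kHz tone survives because nothing loud is nearby. Real encoders work on critical bands and time-varying thresholds, which this sketch ignores entirely.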

All lossy music compression encoders perform similar functions. Encoding software and the accompanying decoding software are collectively called a “codec”, and anyone who has made use of different file compression would be familiar with the term and maybe also the universe of options available. Among the more commonly known and used formats are AAC (Advanced Audio Coding) and Vorbis, which are lossy, and FLAC, which is lossless. WAV often gets listed alongside them, although strictly it is a container that usually holds uncompressed PCM rather than a codec.

Sample rate, bit depth and bit rate

  • Sample rate is the number of audio samples recorded per unit of time.
  • Bit depth measures how precisely the samples were encoded, the resolution.
  • Bit rate is the number of bits that are processed per unit of time and relates to the bandwidth needed on a network.
So a file encoded at 44.1 kHz with a bit depth of 16 and a bit rate of 128 kbit/s is a waveform which is sampled 44,100 times per second, at a resolution of 16 bits per sample, and requires 128 kbit/s of bandwidth on the network. For a recorded file sitting on a hard drive, the bigger the numbers, the bigger the file; for streaming, the bigger the numbers, the more bandwidth required to transmit the file.
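The relationship between these three numbers can be sketched in a few lines. This is only an illustration: the function names are my own, stereo is assumed for the PCM case, and a megabyte is taken as 10^6 bytes.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Bit rate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bit_depth * channels / 1000


def file_size_mb(bitrate_kbps: float, seconds: float) -> float:
    """Size in megabytes (10^6 bytes) of audio at a given constant bit rate."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000


four_minutes = 4 * 60
cd = pcm_bitrate_kbps(44_100, 16)
print(f"CD-quality PCM: {cd:.1f} kbit/s, {file_size_mb(cd, four_minutes):.1f} MB for 4 minutes")
print(f"128 kbit/s stream: {file_size_mb(128, four_minutes):.2f} MB for 4 minutes")
```

For uncompressed audio the bit rate follows directly from sample rate and bit depth. For a lossy codec the bit rate is whatever the encoder was told to target, which is why the 128 kbit/s figure cannot be derived from the other two.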

The mechanics of streaming

So what is the actual structure of a streaming audio service? Well, it’s pretty simple. The digital files are stored on the host system and exist there in their encoded form, AAC in the case of Apple Music and Vorbis in the case of Spotify. The subscriber accesses the files via an app or player that accesses the chosen file and streams it off the host system. The host system is usually a combination of file servers with intermediate streaming servers. The client is actually connected to the streaming server. The app or website will have various features built into it to create playlists and libraries and other user-centric functions but really it’s no different or more complex than you playing a file on your notebook pc that’s stored on your home server or NAS via your home router/modem. Whether lossy or lossless files are chosen the mechanics are all identical. The encoded audio streams are assembled in a container "bitstream" such as MP4 or FLV. The bitstream is delivered from a streaming server to a subscriber using a transport protocol, such as RTMP (Real-Time Messaging Protocol) or RTP (Real-time Transport Protocol) which are both Audio/Video specific protocols. Music streaming employs the “unicast” method of server to node connection so the server can supply multiple connections to individual destinations meaning that multiple destinations can select and receive the same content. This scenario means that thousands of subscribers can request and receive the same file at the same time and it will play for them individually. This method is demanding on servers and more importantly network infrastructure, more so than “multicasting” where a single source can be accessed by multiple subscribers but if the subscriber joins after the beginning of the stream, they won’t be able to see the start. This is common for live streamed events.
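To put rough numbers on why unicast is the more demanding of the two, here is a back-of-the-envelope sketch. The listener count and the 320 kbit/s stream rate are assumptions for illustration, and the function name is hypothetical.

```python
def server_egress_mbps(listeners: int, stream_mbps: float, multicast: bool = False) -> float:
    """Outbound bandwidth the source has to supply, in Mbit/s."""
    # Unicast: one separate copy of the stream per listener.
    # Multicast: a single copy leaves the source and the network replicates it.
    return stream_mbps if multicast else listeners * stream_mbps


listeners = 10_000
stream = 0.32  # assumed 320 kbit/s lossy stream
print(f"unicast:   {server_egress_mbps(listeners, stream):.1f} Mbit/s")
print(f"multicast: {server_egress_mbps(listeners, stream, multicast=True):.2f} Mbit/s")
```

Unicast load grows with the number of listeners, which is the price of letting each of them start any track whenever they like.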

Streaming at a network level

The common thread in all streaming services is the network which forms a backbone upon which the stream can be transmitted. This is the same for your home network or accessing a file from Tidal or Spotify. The streaming data is contained in “packets” of digital data with the most common being the IPv4 form. Internet Protocol version 4 is the fourth version of the Internet Protocol (IP). It is one of the core protocols of standards-based internetworking methods in the Internet and other packet-switched networks. This graphic shows the structure of a single “packet” of IPv4 data.
[Image: IPv4 packet header fields]

Now I’m not going to pull this apart in detail because………eyes……. glazing…………over…… However, one thing needs to be illuminated, particularly in the scope of music streaming and that is the error correction built into the protocol.

Error correction

IPv4 has a header checksum to detect errors in the layer-3 IPv4 packet header, and it discards any packet that does not match the header checksum, so the payload never reaches the transport layer. Routers only check the IPv4 header checksum: if the header is corrupted, the packet is dropped. Payload or higher-layer errors are not detected at the router; catching those is left to the Ethernet frame check at the link layer and to the transport layer above IP. The loss of packets is not terminal for the stream, as most systems will have buffered a cache of packets that are being decoded by the receiving system. The buffer also allows time for corrupt or missing packets to be re-sent. Packet losses are only significant when the buffer runs dry and incoming packets are corrupted or missing. In this situation, audible artefacts can be heard on the system decoding the stream. Ethernet networks are largely immune to any form of analogue noise or interference. Ethernet cabling employs the same techniques as balanced audio cabling inasmuch as the cable pairs are twisted to take advantage of “common mode rejection”, where hum and noise cancel in the cable and subsequent buffer amplifier circuits. Regardless of this, any kind of “noise” would have to cause packet corruption or loss to impinge on the Ethernet signal at all.
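For the curious, the header checksum is simple enough to sketch. The code below computes the standard 16-bit one's-complement checksum over an example 20-byte header; the header bytes and the function names are just an illustration, not anything taken from a real capture of a music stream.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement of the one's-complement sum of the 16-bit header words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    # Running the same sum over a header that contains a correct checksum gives 0.
    return ipv4_header_checksum(header) == 0


# Example header with the checksum field (bytes 10-11) set to zero.
header = bytearray.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
checksum = ipv4_header_checksum(bytes(header))
struct.pack_into("!H", header, 10, checksum)
print(hex(checksum), header_is_valid(bytes(header)))

header[8] ^= 0x01  # flip a single bit in the TTL field
print(header_is_valid(bytes(header)))
```

A single flipped bit is enough to make the check fail, at which point the packet is dropped rather than passed on in a damaged state. Note that this covers the header only, as described above.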

A to D, D to A

If there’s one area where audiophile fantasies can run wild it’s in the analogue-to-digital and the digital-to-analogue part of the process. In truth, there isn’t a lot of variety in the hardware involved in this part of the operation. There might be a slew of manufacturers and models but inside, there are a limited range of operational integrated circuits that perform the lion’s share of the A-D/D-A work. ESS, Analog Devices, Texas Instruments and AKM are a few of the leading DAC manufacturers but when you dip into the specs, you find very similar approaches to the job of decoding a digital audio signal. Indeed there has to be a relatively uniform method as the job of conversion relates to the standard that the audio was encoded to in the first place. Much of the variation at the manufacturing level comes down to the choice of outboard components and critical circuit board design. In a practical sense, the subscriber hanging off a streaming server connection has zero influence over how the stream is encoded, the only choice that can be made is the equipment on the decoding end.

The “High End” DAC
[Image: Weiss DAC502]

Let’s pick the Weiss Engineering DAC502. This unit retails for something in the order of US$10,000. It seems to have garnered very positive reviews from such sage institutions as Stereophile. However, it’s just a shiny box that contains a circuit board with a bunch of op-amps, resistors and caps surrounding two ESS Sabre D/A chips.

https://www.esstech.com/wp-content/uploads/2021/03/ES9038PRO-Datasheet-v3.7.pdf

This manufacturer has chosen to run one decoder per channel and parallel the outputs to improve DNR and SNR levels. Nothing special, it’s a configuration suggested on the chip manufacturer’s data sheet. If you read the manufacturer’s blurb on the unit you’ll find a lot of stuff that seems to suggest that they’re doing something special with the clock and “room EQing” etc., which is all built into the chip and nothing to do with Weiss Engineering. In fact, I would suggest that the only “custom” part of this unit is the daughter board which runs the front panel display. So how much is an ESS Sabre chip? Fifty bucks will get you one, so this Weiss 502 has $100 worth of DAC chips in it. ESS chips also feature in DAC units from US$500 to US$1,200, and even in portable units for thirty bucks, as the VCC can be as low as 3.3V.
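As a rough guide to what paralleling outputs buys you, here is the textbook estimate. It assumes the noise of each paralleled output is uncorrelated and of equal level, which is an idealisation and not a figure from the ESS data sheet; the function name is my own.

```python
import math


def snr_gain_db(n_parallel: int) -> float:
    """Theoretical SNR gain from summing n identical outputs with uncorrelated noise."""
    # Signal voltages add directly (power x n^2); uncorrelated noise powers add (x n).
    return 10 * math.log10(n_parallel)


for n in (1, 2, 4, 8):
    print(f"{n} outputs in parallel: +{snr_gain_db(n):.1f} dB SNR")
```

Under those assumptions every doubling of paralleled outputs is worth about 3 dB, so it is a cheap and well-known trick rather than secret sauce.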

https://hifimediy.com/product/sabre-dac-uae23/

Audiophile network equipment

So what do we know about music streaming? We know that we have no control over the encoding and streaming of the music we listen to. We also know that a group of up to 500 experts formulated the compression algorithms that form the basis of all audio and video file compression techniques. We also know that the methods used to transmit these files over networks are done so in ‘packets’ based on protocols that have been designed by network engineers to be robust and reliable, they have error correction built into them and their only enemy is a slow/intermittent connection that can cause lost, late or corrupted packets. Moreover, analogue noise is not an issue on digital networks unless it is causing corrupt/lost packets. Another bogeyman in the audiophile universe is “jitter”. Jitter is a product of signal degradation and can be caused by external interference or simple cable attenuation. Buffering is the primary anti-jitter tactic and any modern digital electronic system will have some capacity for buffering. If this proves insufficient for the prevailing conditions, jitter would present itself as audible artefacts like clicks, pops or dropouts.
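Here is a minimal sketch of how buffering absorbs uneven packet arrival. The arrival times are invented for illustration, packets are assumed to carry 20 ms of audio each, and the function name is hypothetical.

```python
def late_packets(arrival_ms, packet_ms=20.0, buffer_ms=100.0):
    """Count packets that miss their playout deadline.

    arrival_ms[i] is the time packet i arrives. Playback starts buffer_ms
    after the first packet arrives, so packet i is needed at
    arrival_ms[0] + buffer_ms + i * packet_ms.
    """
    start = arrival_ms[0] + buffer_ms
    return sum(
        1 for i, arrived in enumerate(arrival_ms)
        if arrived > start + i * packet_ms
    )


# Delivery is uneven, and the network stalls badly before the sixth packet.
arrivals = [0, 25, 38, 70, 81, 260, 262, 263]
print(late_packets(arrivals, buffer_ms=100.0))
print(late_packets(arrivals, buffer_ms=200.0))
```

With a 100 ms buffer the stall leaves three packets arriving after they were needed, which is where a dropout would be heard. With a 200 ms buffer the same arrivals all make it in time and the listener notices nothing.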

Therefore, it is absolute fantasy to represent any type of computer network hardware as being special from the perspective of audio streaming. This is particularly true of modern equipment and networks, whose capacity and speed far exceed what a simple audio stream requires. If you’re a fan of Tidal, their (lossy) MQA format streams at 1.4 Mbit/s. If you’re regularly watching HD video on your TV, that’s streaming at 5 Mbit/s, and 4K is running at 20 Mbit/s. Not even a thirty buck dumb switch or a poverty spec home router will have a problem with a 1.4 Mbit/s stream! But there’s plenty of breathless BS out there supporting manufacturers’ claims of improved performance/sound with their overpriced, repackaged junk products.
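To put those stream rates in perspective, here is a quick sketch comparing them against nominal Ethernet port speeds. The stream figures are the ones quoted above; the link speeds are nominal line rates and protocol overhead is ignored, so treat the results as ballpark only.

```python
STREAMS_MBPS = {"1.4 Mbit/s audio": 1.4, "HD video": 5.0, "4K video": 20.0}
LINKS_MBPS = {"Fast Ethernet": 100.0, "Gigabit Ethernet": 1000.0}  # nominal rates


def utilisation_pct(stream_mbps: float, link_mbps: float) -> float:
    """Share of a link's nominal capacity used by one stream, in percent."""
    return 100 * stream_mbps / link_mbps


for link, capacity in LINKS_MBPS.items():
    for name, rate in STREAMS_MBPS.items():
        print(f"{name} on {link}: {utilisation_pct(rate, capacity):.2f}% of the link")
```

The audio stream uses well under 2% of even an old 100 Mbit/s port, so there is no capacity problem for a fancy switch to solve.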


https://www.stereophile.com/content/nordost-qnet-network-switch-qsource-linear-power-supply

My assertion is that none of this junk is purpose built for audio. Unfortunately, network terminology has opened up another avenue for makers of bogus products to dazzle suggestible punters. They can now spout their drivel, peppering it with acronyms and technical terminology to sound as plausible as ever to the uneducated. Quality issues with music playback are almost exclusively related to file compression and, in this era of terabyte disk drives and gigabit network hardware, there’s almost no reason to compress audio to the point it begins to sound bad.

I just HAVE to include this excerpt from the comments section on this review:

"All I can say is... wow. This is an IMPRESSIVE amount of nonsense. Absolutely anyone who knows how the Ethernet system actually works will agree. Ethernet data is packetised and error checked at every stage. Each data packet arrives either wholly intact or is discarded and re-sent as many times as needed for an intact delivery. The final data assembled from packets can only be 100% perfect or is rejected entirely. So you'll have complete data dropouts or perfect data. Nothing in between whatsoever." Amen brother...

As always, I don’t represent my posts as the last word. I’m happy to be corrected or educated by other contributors.
 
The problem I have with streaming services is a different one:

1: You don't own any music. If the service goes down, so does your "music collection", and you have no way to recover it. The same goes when you don't pay your subscription fee (for whatever reason). This is by far my biggest objection to streaming services. Also, the artists don't get paid a lot for streaming their tracks. I'm close to many local artists, and they earn pennies from it, while buying their CD or LP makes them a lot more money.

2: The stream is often compressed in a lossy format. I don't want that; I want full quality, at least Red Book CD quality (44.1 kHz, 16-bit, uncompressed). FLAC or ALAC format is not an issue, but no MQA and no lossy formats.

But in the end, I'm my own streaming service, with all my ripped CDs and vinyl records stored on a central NAS server that I can access from any digital device on my home network. That network is very standard by IT standards (I'm an IT system engineer) with very standard Cisco and D-Link gear. And like that I stream all the time. Most of my music there is in either FLAC or WAV format... I don't use an iPhone or the like to listen to music (I hate headphones), and in my car I've got the old-fashioned USB stick full of music that I change from time to time...
 
I've read right here on ASR that the internet is dirty. I imagine all the bits need a good scrub ;)
 
Impressive post. I'll take the time to read it carefully when work allows. Funny to see an IPv4 packet diagram. I (and many others) used to live and breathe the hardware/microcode details of all sorts of protocols and encapsulations. A lot of it has evaporated from my head.

Thanks for the effort to produce this overview.
 
It would not be a smart move to get rid of the music collection just because you subscribe to a streaming service, I don't think many people do that.

Today you got many platforms to choose from that have lossless audio.
 
It would not be a smart move to get rid of the music collection just because you subscribe to a streaming service, I don't think many people do that.
Oh, you would be surprised how many do. I got some very nice vinyl records out of sales like that.

Today you got many platforms to choose from that have lossless audio.
That is a very recent evolution, and a very good one, but still see point 1. I prefer to buy more albums with the money I save from not subscribing.
 
Streaming is like rent: if you pay for a limited amount of time (or for a specific purpose) then it's great, it does the job and that's it! But if you pay for years and years, it's better to buy your favorite tracks and go with a free/ad sponsored account. Time is of the essence - you may end up in that situation when after many years of paying for streaming, the moment you stop, you don't have anything...
 
I spent far more on CDs than I do on streaming. At one point I was buying 10+ CDs a month. Apple and Amazon start at about $10 a month... Not really a big deal and less than what I was paying for one CD 20 years ago. My biggest concern with streaming is, when will they start raising the cost to something really painful?
 
Being quite ignorant about streaming, it's not feasible to record the files in some manner?
 
I spent far more on CDs than I do on streaming. At one point I was buying 10+ CDs a month. Apple and Amazon start at about $10 a month... Not really a big deal and less than what I was paying for one CD 20 years ago. My biggest concern with streaming is, when will they start raising the cost to something really painful?
Same here...
I don't remember how much I was spending on CDs... I got to close to 3000 CDs. I ripped these to a NAS; this was my favorite source of music. This NAS was backed up by another in my office and yet another NAS at home, and... streaming came.
Then I realized that I could not reliably tell the difference between music in 250 kbit/s MP3, even less VBR 250 AAC or Ogg Vorbis, and started listening through Spotify. Game changer: more music, more convenience. Then I discovered Tidal in lossless, then Apple Music, and my collection is gathering digital dust. It is on a 2 HDD...
Paying about $20/month is, IMO, worth it for the virtually infinite libraries of music at my disposal via Apple and Spotify. I am sometimes concerned about the streaming companies raising their prices... I find myself inching closer to $75.00/month for movie subscriptions with Netflix, Disney, Amazon, etc.
Still, music streaming has been a game-changer and a source of enjoyment for me.

Peace.
 
Thank you so very much for posting! I learned quite a lot from this!
 
You don't own any music
And amen to that. I've bought so many bad records and CDs that I couldn't throw away because I'd paid for them that I couldn't count them all. If the service goes down, I've got 80k tracks on my NAS from all those bad CDs; I can live a day without the stream.
 
I've read right here on ASR that the internet is dirty. I imagine all the bits need a good scrub ;)

And, apparently, it's also "noisy". I put my ear against my router the other day but I couldn't hear anything, so I stuck the coaxial connector in my ear: still nothing! I obviously don't have the required hearing sensitivity for picking up Internet noise... :p
 
A little snake oil will clean that noise right up!
 
Streaming is like rent: if you pay for a limited amount of time (or for a specific purpose) then it's great, it does the job and that's it! But if you pay for years and years, it's better to buy your favorite tracks and go with a free/ad sponsored account. Time is of the essence - you may end up in that situation when after many years of paying for streaming, the moment you stop, you don't have anything...

I listen to lots of new music, stuff that I would otherwise not hear if I didn't subscribe to a streaming service, since I cannot abide listening to the radio - that's for various reasons: partly because radio stations don't generally play the kind of music I like, but primarily because I absolutely despise having to listen to advertising. So, the so-called "free" accounts are of no use to me, either, because they often don't carry the less popular music I prefer, and have advertising. If I really like a band I will buy their music, but I don't see "not having anything" as being a problem, since I'm happy to carry on paying for the streaming services.
 
Same here...
I don't remember how much I was spending on CDs... I got to the point of almost 3000 CDs. I ripped these to a NAS and, at one point, this was my favorite method of listening to music. I had a NAS backed up by another in my office and one at home, and... streaming came.
Then I realized that I could not reliably tell the difference between music in 250 kbit/s MP3, even less VBR 250 AAC or Ogg Vorbis, and started listening through Spotify. Game changer: more music, more convenience. Then I discovered Tidal in lossless, then Apple Music, and my collection is gathering digital dust. It is on a 2 HDD...
Paying about $20/month is, IMO, worth it for the virtually infinite libraries of music at my disposal via Apple and Spotify. I am sometimes concerned about the streaming companies raising their prices... I find myself inching closer to $75.00/month for movie subscriptions with Netflix, Disney, Amazon, etc.
Still, music streaming has been a game-changer and a source of enjoyment for me.

Peace.
Here’s a fun trick I’ve learned... Between my girl, her kids, my daughter and me, I am paying a ridiculous amount in monthly streaming. The TV streams alone are ridiculous and impossible to manage unless I quit my job. So I put all my streaming subscriptions on a “backup“ credit card that I say I lost every 12 months or so. Now I’m not paying for Outlander on xyz network cause my girl already binged it. Yup, you do have to update the card info for services you are using but they will send you an email to renew, I promise. :) I figure I’ve saved enough to buy a pair of $1000 speakers doing it already.
 
The IP protocol only provides error detection for the header (a checksum), not for the payload. On top of it, you have the two main transport protocols, TCP and UDP. TCP guarantees error-free delivery by providing checksums, acknowledgements and retransmission. UDP does not - it just delivers whatever payload and lets the layers above implement whatever they need to work.

Early on, UDP was used for a lot of "real time" applications. I think pretty much every streaming platform now leverages TCP. I know Spotify does as does Tidal. That said, it's in no way necessarily superior - you could conceivably implement error-free connectivity on UDP as well, and in some ways, you could do a better job at implementing "application aware" optimizations. But clearly the streaming providers have over time defaulted to TCP because of the built-in error correction for payloads and it means they don't have to re-invent anything, just use it: that's faster and cheaper.
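To illustrate the point that reliable delivery can be layered on top of an unreliable transport, here is a toy stop-and-wait scheme over a simulated lossy link. Nothing here is how TCP, Spotify or Tidal actually do it: the channel, the loss and corruption rates, the chunk size and the function names are all invented for the sketch, and acknowledgements are assumed never to get lost.

```python
import random
import zlib


def lossy_channel(frame: bytes, rng: random.Random, loss=0.2, corrupt=0.2):
    """Hypothetical unreliable link: may drop a frame or flip one bit in it."""
    roll = rng.random()
    if roll < loss:
        return None  # frame lost in transit
    if roll < loss + corrupt:
        damaged = bytearray(frame)
        damaged[rng.randrange(len(damaged))] ^= 0x01
        return bytes(damaged)
    return frame


def send_reliably(data: bytes, chunk_size=8, seed=1):
    """Deliver data intact by resending each chunk until its CRC checks out."""
    rng = random.Random(seed)
    received, transmissions = bytearray(), 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        frame = chunk + zlib.crc32(chunk).to_bytes(4, "big")
        while True:  # stop-and-wait: resend until an intact copy arrives
            transmissions += 1
            got = lossy_channel(frame, rng)
            if got is None:
                continue  # lost: the sender times out and resends
            payload, crc = got[:-4], int.from_bytes(got[-4:], "big")
            if zlib.crc32(payload) == crc:
                received += payload
                break  # intact: the receiver acknowledges and we move on
    return bytes(received), transmissions


message = b"The bits either arrive intact or they are sent again."
delivered, attempts = send_reliably(message)
print(delivered == message, attempts)
```

Even with a large share of frames lost or damaged, the received data is bit-identical to what was sent; the only cost is extra transmissions and therefore time, which is what the playback buffer is there to absorb.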

That also means that the whole talk about "audiophile grade" network equipment and Ethernet cables is utter and complete mental diarrhea. If banks and hospitals and the military run their mission critical apps on networks, we know that the network is only relevant because of availability considerations, and *never* because you need "better cleaner bits". If your music system is as important as a heartrate monitor or missile detection systems or billion $ financial transactions, then go invest in *reliability* via equipment and interface *redundancy* with quality network equipment but avoid any "audiophile" claims that immediately highlight the fact that vendor has no clue what they are doing other than milking your wallet shamelessly.
 
Time is of the essence - you may end up in that situation when after many years of paying for streaming, the moment you stop, you don't have anything...
Oh please please let's not turn experiencing art into yet another race of consumption and ownership;) I don't need to own music in order for it to enrich my life. Following this false analogy, you should stop attending concerts, theatre plays, movie screenings, museum exhibitions and so on, since you pay for interacting with stuff you don't possess...
 