
When is AI going to regenerate the lost data on recordings from CD or other digital sources?

The 16-bit/44.1 kHz standard loses musical information from the original performance; this cannot be recreated by an ordinary DAC. Will AI eventually fill in the gaps?
 
Any data to back this assertion?
It should be obvious there are only one or two microphones recording the soundscape in a live performance, and all digital music is frequency band-limited. The resolution of CD means that the waveform is not exactly the same as when it was recorded, because of the sample rate. Then there is reproduction in your listening room that needs optimising with AI.
 
A musical performance does not just create one wave, like a CD. It produces a multitude of waves, which are simplified down to one waveform when recorded.
 
It should be obvious there are only one or two microphones recording the soundscape in a live performance, and all digital music is frequency band-limited. The resolution of CD means that the waveform is not exactly the same as when it was recorded, because of the sample rate. Then there is reproduction in your listening room that needs optimising with AI.
The resolution of the CD is higher than the resolution of your ears.
 
A musical performance does not just create one wave, like a CD. It produces a multitude of waves, which are simplified down to one waveform when recorded.
If you capture a performance with only one microphone, you only have that amount of spatial information. No AI in the world could faithfully recreate the missing info from the other 2 or 10 mics you would have placed in the room to gather more info on how it sounds in the other corners of the room. You can just add something like a virtual surround DSP to your output at home, which tries to upmix/"up-spatialise" the resulting sound to somehow sound "better" on your specific setup (stereo speakers, headphones, 5.1, whatever). You could also train a machine learning/"AI" model to perform that upmix, and it would probably do it well. This does not re-create the lost information, though: the DSP or the AI model just alters the signal in a way that creates a specific illusion of how it might have sounded in a simulated room/surround setup.
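
To make that concrete, here's a toy mid/side "widener" in Python; the function name and the width value are made up for illustration. It only rearranges what is already in the two channels and cannot add spatial information that was never captured:

Code:
import numpy as np

def widen_stereo(left, right, width=1.5):
    # Mid/side trick: mid is what the channels share, side is what differs.
    # Scaling the side signal exaggerates the apparent stereo width.
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right) * width
    return mid + side, mid - side   # new left, new right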

The idea that, apart from that spatial info, you lose "waves" (?) by capturing music is not correct. All waves coming from different instruments, humans, speakers and so on create sound pressure fields, which are superimposed at all points in a space, resulting in a local sound pressure level. The microphone captures that sound pressure at one specific point (its own position) as a function of time. The only things you lose are sounds below the noise or sensitivity floor, those above the maximum level, and the frequencies outside the bandwidth of the recording system. As recording equipment captures a broader frequency range than humans can hear, the latter is not a concern for recording or reproduction. And as long as the recording engineer dialed in the levels correctly, noise or clipping should also be (almost) inaudible.
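
A quick numpy sketch of that superposition, with three arbitrary tones standing in for different sources:

Code:
import numpy as np

fs = 44100                                # CD sample rate, Hz
t = np.arange(fs) / fs                    # one second of time
# three sources radiating different tones (frequencies are arbitrary)
sources = [np.sin(2 * np.pi * f * t) for f in (220.0, 277.2, 329.6)]
# at the microphone's position they superimpose into ONE pressure trace
pressure = np.sum(sources, axis=0)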
 
It should be obvious there are only one or two microphones recording the soundscape in a live performance, and all digital music is frequency band-limited. The resolution of CD means that the waveform is not exactly the same as when it was recorded, because of the sample rate. Then there is reproduction in your listening room that needs optimising with AI.
The waveform is never the same as the event in the room: microphones are transducers, after all, subject to all the usual issues with transducers. It's not for nothing that recording engineers who are making money have a lot of microphones to work with, as they all sound different. Nobody records with just two microphones for professional work, with the rare exception of minimalist recordings of chamber groups. The resolution of the waveform is far more likely to be accurate with digital, Red Book standard recordings than analog recordings. Analog tape has inherent and quite audible flaws baked into the formula, like various speed variations and the loss of high-frequency content as the level increases. Digital recording at 16 bits and a sample rate of 44.1 kHz covers the dynamic range that would be audible in a domestic environment, and its upper limit of 22.05 kHz covers the upper limit of what we humans can hear. I've done a lot of work transcribing LPs to reel-to-reel tape at 15 ips. Those transfers lost high-frequency energy, had audible hiss, and simply weren't as good as transfers to 16/44.1 digital media.
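
The back-of-the-envelope numbers behind that, for anyone who wants to check:

Code:
import math

bits, fs = 16, 44100
print(20 * math.log10(2 ** bits))   # ~96.3 dB theoretical dynamic range
print(fs / 2)                       # 22050.0 Hz, the Nyquist limit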

I don't think running older digital recordings through some AI process will make anything any better. I made a lot of digital recordings using a pair of Neumanns and a DAT recorder; they were fine as long as the group playing the music was small and the venue was acoustically friendly. I've heard plenty of early (1980s) digital recordings of classical music, and the best were very good. I see no point in using any AI tech to spruce them up.

On the other hand, as demonstrated by the MAL de-mixing technology* used in various Beatles-related projects, there is the possibility of using AI technologies to turn mono recordings into something like genuine stereo recordings. I suspect this technology will come into play in ways both good and bad in our futures.

*Developed by the award-winning sound team led by Emile de la Rey at Peter Jackson's WingNut Films Productions Ltd; it's a variety of machine learning.

 
If there were only one sound wave, why would the sound wave change as I move a microphone in an anechoic chamber?
 
AI can create near-perfect video from its data set, including voice and background sound. I think it would manage to create near-perfect audio from nothing, which does not require video.
 
AI can create near-perfect video from its data set, including voice and background sound. I think it would manage to create near-perfect audio from nothing, which does not require video.
Yes, it can. You can create full songs and albums from just a simple prompt now. Therefore, you can create pretty much any audio effect using AI.

But again: information that is lost is lost. You can use any filter, DSP or AI to add some effect to a recording. But it will just be some version of what could have been; it will not be what actually was.
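
A small scipy sketch of what "lost is lost" means; the tone frequencies and filter cutoff are arbitrary. Once the low-pass removes the 18 kHz component, nothing applied to the filtered signal can recover that tone's true amplitude or phase:

Code:
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 44100
t = np.arange(fs) / fs
original = np.sin(2 * np.pi * 1000 * t) + 0.3 * np.sin(2 * np.pi * 18000 * t)

# a 4 kHz low-pass wipes out the 18 kHz component
sos = butter(8, 4000 / (fs / 2), output='sos')
filtered = sosfiltfilt(sos, original)
# any "restoration" of the 18 kHz tone from `filtered` is a guess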
 
a musical performance does not just create one wave, like cd. It produces a multitude of waves which are simplified down to one waveform when recorded.
A "sound" (acoustic energy in the human frequency range) is a sum of many pressure disturbance waves in the air. The red book audio standard is enough to store these waves information in the digital domain for a human.
If there were only one sound wave, why would the sound wave change as I move a microphone in an anechoic chamber?
There is no "only one sound wave"; as I said before, the sound is composed of a sum of waves in superposition. You might have seen an RTA spectrogram (FFT): the electric signal analyzed by the RTA is just one signal, but "inside" it there are many waves, which the FFT can partially decompose. The change you are talking about is related to the source and microphone positions; at each point in space there will be a different summation of waves. Supposing an ideal anechoic chamber with an ideal omnidirectional point source, an ideal microphone would only "perceive" a pressure level variation, not a frequency response change.
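
A minimal numpy illustration: one signal carrying two superimposed tones, which the FFT pulls back apart (the tones are chosen arbitrarily):

Code:
import numpy as np

fs = 44100
t = np.arange(fs) / fs
# one "electric signal" made of two superimposed waves
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
print(freqs[spectrum > 0.25 * spectrum.max()])   # [440. 880.]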

I'm going to suppose you are talking about the sound that a microphone wouldn't capture in different locations? If so, the problem is not the "resolution" of the digital medium. Those non-captured waves can't be stored if they weren't captured first. In this case, we might be talking about rendering sounds, not storing them. Imagine an FPS game: the game doesn't store sound information for each point the receiver (player) passes through, it renders the sound.
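
Something like this toy point-source renderer, where delay and level are recomputed for the listener's position instead of being stored; all names and values here are illustrative:

Code:
import numpy as np

def render_at(source, distance_m, fs=44100, c=343.0):
    # delay by travel time and attenuate by 1/r -- a game recomputes
    # this per player position rather than storing a recording for
    # every point in the room
    delay = int(round(distance_m / c * fs))
    gain = 1.0 / max(distance_m, 1.0)
    return gain * np.concatenate([np.zeros(delay), source])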
 
A musical performance does not just create one wave, like a CD. It produces a multitude of waves, which are simplified down to one waveform when recorded.
Nope. You can sum a bunch of sines together and create a complex waveform. There's no limit on how many.
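
The classic demonstration: summing odd harmonics at 1/n amplitude approaches a square wave. A short numpy sketch, with a 200 Hz fundamental chosen arbitrarily:

Code:
import numpy as np

fs, f0 = 44100, 200
t = np.arange(fs) / fs
# 50 odd harmonics (up to 19.8 kHz, still below Nyquist at this rate)
square_ish = sum(np.sin(2 * np.pi * (2 * k + 1) * f0 * t) / (2 * k + 1)
                 for k in range(50))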
 
It is much less exciting than you would think, but we already have all of this.

MQA added false harmonics to extend frequency content past the Nyquist limit. It was a dumb idea and a flawed plan.

There are already algos to address record pops and tape hiss.

Recordings have been broken down by AI back into their constituent tracks and remixed into new recordings (e.g. separating vocals from backing instruments).
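
For a flavour of how that works, here's a deliberately crude time-frequency masking sketch in scipy. Real de-mixers such as Demucs or the MAL system learn far smarter masks; this toy just keeps one frequency band:

Code:
import numpy as np
from scipy.signal import stft, istft

def keep_band(x, fs, lo_hz, hi_hz):
    # zero everything outside [lo_hz, hi_hz] in the time-frequency plane
    f, _, Z = stft(x, fs=fs, nperseg=2048)
    Z[(f < lo_hz) | (f > hi_hz), :] = 0
    _, y = istft(Z, fs=fs, nperseg=2048)
    return y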

The Beatles' Abbey Road album has a remaster with bass that was too low to be recorded on the 1960s-era equipment but was reconstructed by data analysis of the overtones.
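
A toy stand-in for that kind of reconstruction, assuming the overtone frequencies have already been measured; real restoration also has to model the fundamental's level and phase over time, which this doesn't attempt:

Code:
import numpy as np

def resynth_fundamental(overtones_hz, t, amp=0.2):
    # overtones near n*f0 are spaced ~f0 apart, so their minimum
    # spacing is a crude estimate of the missing fundamental
    f0 = float(np.min(np.diff(np.sort(overtones_hz))))
    return amp * np.sin(2 * np.pi * f0 * t)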
 