
AIiven

manisandher

I thought some people here might be interested in my new venture:
[Attached image: AIiven - header v4 _ white.png]

AIiven ('aiiven', but pronounced “aliven”) is a purist, AI-driven audio processing system focused on undoing the damage caused by the loudness war. The core task is to recover lost dynamics from over-compressed music while preserving the original tonal balance.

I've completed the proof of concept, and am very encouraged. Now pressing on with the first complete training run. (Was hoping the DGX-Spark would be here by now, so the 5090 will have to do.)

Mani.
 
Cool.

Be cooler if you provided an example of your proof of concept. I admit I am skeptical, not in the sense of "I don't believe you," but rather in the "let me see more so I can form an opinion" sense.
 
I admit I am skeptical, not in the sense of "I don't believe you," but rather in the "let me see more so I can form an opinion" sense.

An 'opinion' of what, and from whom?

I intend to use this thread as a sort of journal, for anyone who's interested in following the AIiven journey. And happy to give some pointers to anyone who's interested in training their own net.

It may ultimately be that I offer a free app for anyone to use on their own material. But let's see how things pan out.

Cool.

Be cooler if you provided an example of your proof of concept.

I can't share any pre/post Alivened audio files, because I wouldn't want to get into any copyright problems. And I definitely can't share my current (PoC) network/weights.

Not much to go by, but here's what my current network does to a very compressed track:
[Attached image: 1759146661922.png]

No strange artifacts in the post-processed file. I'm expecting the first full training run to perform substantially better than this.

Mani.
 
An 'opinion' of what, and from whom?

My opinion, based on whatever information you share. Seeing "4" and "6" doesn't give me much to work with. You did something that changed a number, sure. But so what? What does that show me that I can hang my hat on? How do I know that the "6" result isn't just from adding something that should NOT be there outside the dynamic range of "4"?

I can't share any pre/post Alivened audio files, because I wouldn't want to get into any copyright problems.

Waveforms are not copyrighted.

And I definitely can't share my current (PoC) network/weights.

Don't care about those, would never ask for them if I did. How about a GENERAL description of your method?
 
(Was hoping the DGX-Spark would be here by now, so the 5090 will have to do.)
The 5090 will probably be faster than a DGX Spark if the model fits into its memory. The DGX Spark has quite poor memory bandwidth (the 5090 is about 6x faster!), and therefore doesn't really have top performance.

As for your venture: very cool if it works!

Where do you get the non-war-damaged source files to train with?
 
The 5090 will probably be faster than a DGX Spark if the model fits into its memory. The DGX Spark has quite poor memory bandwidth (the 5090 is about 6x faster!), and therefore doesn't really have top performance.

5090 definitely more than good enough for now, was just looking to the future.

As for your venture: very cool if it works!

Cheers. I'm very confident it'll work, based on my PoC.
Where do you get the non-war-damaged source files to train with?

Probably best not to comment right now. The corpus isn't huge, ~1TB for this first full training run.

Mani.
 
Thanks for the waveforms.

I get not being ready to share your general approach yet. You are developing something, there may be commercial applications, IP issues and such. Totally understand.

So I'll just assume something like this for the time being: "Tell AI to do things and keep telling it to do different things until good results happen." Feel free to correct me if I am wrong on that. ;)
 
I love models!
"consider a spherical recording, of radius r and mass m"*

_____________
* this is a reference to an old joke. I heard it from a physical biochemist of no mean repute, but I reckon it's familiar to most of the folks who lurk at a place like ASR.
[Attached image: 1759152416942.jpeg]
 
AIiven ('aiiven', but pronounced “aliven”)
Nothing personal but I don't like the name, it's almost clever. Putting AI in the name isn't necessary any more than it was necessary to prefix everything with 'e', 'i' or 'cyber' in 1999.
I can't share any pre/post Alivened audio files, because I wouldn't want to get into any copyright problems. And I definitely can't share my current (PoC) network/weights
This is concerning. There are tons of Creative Commons and public domain recordings out there. Download one, slam it through Waves L3 or whatever, Aiiven it, show us all 3 versions, maybe even DeltaWave results?
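A rough sketch of what the measurement side of that test could look like, in plain Python. Everything here is an assumption for illustration: the synthetic tone stands in for a public-domain recording, `crude_limiter` is a hypothetical boost-and-clip stand-in (not Waves L3), and `residual_db` is only a simple difference level, not a replacement for a proper null test with level and timing alignment. The restored version would come from the tool under test and is not modelled here.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; heavier limiting generally lowers it."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak / rms(samples))

def crude_limiter(samples, gain=12.0, ceiling=1.0):
    """Hypothetical stand-in for a mastering limiter: boost, then hard-clip."""
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]

def residual_db(reference, candidate):
    """RMS of (candidate - reference) relative to the reference RMS, in dB."""
    diff = [c - r for r, c in zip(reference, candidate)]
    diff_rms = rms(diff)
    if diff_rms == 0.0:
        return float("-inf")  # identical signals null completely
    return 20.0 * math.log10(diff_rms / rms(reference))

# Synthetic "recording": a 440 Hz tone with a slowly varying envelope.
RATE = 8000
original = [
    0.25
    * (0.2 + 0.8 * abs(math.sin(2 * math.pi * 0.5 * n / RATE)))
    * math.sin(2 * math.pi * 440 * n / RATE)
    for n in range(2 * RATE)
]
squashed = crude_limiter(original)

print(f"crest factor, original: {crest_factor_db(original):.1f} dB")
print(f"crest factor, squashed: {crest_factor_db(squashed):.1f} dB")
print(f"residual, original vs itself: {residual_db(original, original)} dB")
```

A restored file could be passed through the same two functions: its crest factor should move back toward the original's, and its residual against the original shows how much of the change is real recovery rather than added material.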

Btw how are you training it? There are only so many recordings where there are 2+ versions where the only difference is compression.

Despite my cynicism about AI in general it's an interesting idea, but I have no idea if it would work well, so definitely keen to hear something.

All creative tools are sold on the strength of the demo, I look forward to hearing one!
 
Can the user feed it a sample (or a set of samples/examples) and tell it, "I want it to sound like this"?

I assume AI has a lot of potential in fixing up old recordings, or badly produced recordings, but I don't know if we are there yet, and we need to be able to tell it what we want.

Dynamics are very complicated. It's impossible to know what was done or what the sound/waveform was like before compression/limiting. Maybe the song originally started out quiet and ended loud and now it's loud all the way through, etc... And usually different processing is done on different tracks before mixing, and then more compression & limiting during mastering, so the song might have to be unmixed first.
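A toy illustration of why the "before" waveform can't be known for certain, assuming the simplest possible limiter (a hard clip, which real mastering chains are not): two different inputs give exactly the same output, so any restoration is an informed guess rather than an inversion. The function name and sample values are made up for the example.

```python
def hard_clip(samples, ceiling=0.5):
    """Simplest possible limiter: anything beyond the ceiling is flattened."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

# Two different "originals": same shape, different peak height.
quiet_peak = [0.1, 0.4, 0.6, 0.4, 0.1]
loud_peak = [0.1, 0.4, 0.9, 0.4, 0.1]

clipped_quiet = hard_clip(quiet_peak)
clipped_loud = hard_clip(loud_peak)

# Both collapse to the same clipped waveform, so the peak height is lost.
print(clipped_quiet)
print(clipped_loud)
print(clipped_quiet == clipped_loud)
```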




...My general opinion about AI is: The future will be annoying. ;P And I "don't feel good" about taking the humanity out of art.

But fixing up old analog recordings, or recordings that would have been better if they'd had the technology, is OK with me. And altering the "art" of recordings that were intentionally "destroyed" by the loudness war is also OK with me.
 
And altering the "art" of recordings that were intentionally "destroyed" by the loudness war is also OK with me.
Capitalism in its purest form. Reminds me of Dr. Seuss's The Sneetches, wherein Sylvester McMonkey McBean takes advantage of the two tribes of Sneetches' prejudices.
 