As far as coding is concerned, I don't use AI to write the overall logic or the algorithms I use. I may occasionally ask it for suggestions about alternative approaches, optimizations, etc., but I always ask it to discuss and not to generate any code (in that respect, all the models I use handle the "do not" case properly; that was an issue with earlier models).
One area where it does help is user interface generation. The user interface is typically my least worry and, by the time I wrap up a coding session, I am usually too mentally exhausted to bother implementing a user-friendly one. The progress in recent models is that I can feed them a code base, instruct them not to touch the core, and have them generate a GUI.
This typically works very well: the interface is clean and it exposes the tunable parameters in a logical way. However, the models totally lack common sense...
I was recently working on a program that consists of very memory- and computationally intensive tasks. The tunable parameters, the legends, and other textual information were accessible in the left panel, and the result of the computations appeared in a preview pane on the right. I of course wanted to test different parameters, update the result (a large image), and add comments when satisfied.
As soon as I started playing with the sliders, I noticed my computer went to 100% CPU/GPU for a very long period of time. There was no way to move a slider without triggering multiple new computations... I raised the problem with the model.
Model:
Yes! The slider is triggering the update on every tiny movement. We can make it only update when you release the slider. Here's how:
Solution: Use mouse button release event instead of continuous updates
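The post doesn't reproduce the model's code or name the toolkit, but the fix it describes is the standard commit-on-release pattern: drag events only update state, and the expensive recompute runs once on mouse-button release. A toolkit-agnostic sketch in Python (the `Slider` class and `on_commit` callback are hypothetical stand-ins for the real widget and the heavy computation):

```python
class Slider:
    """Minimal stand-in for a GUI slider widget (hypothetical, not a real toolkit API)."""

    def __init__(self, on_commit):
        self.value = 0
        self.on_commit = on_commit  # the expensive recompute, run only on release

    def drag(self, value):
        # Fired continuously while the mouse moves: update state only, no recompute.
        self.value = value

    def release(self):
        # Fired once on mouse-button release: now trigger the heavy work.
        self.on_commit(self.value)


calls = []
s = Slider(on_commit=lambda v: calls.append(v))
for v in range(50):  # the user drags through 50 intermediate positions
    s.drag(v)
s.release()  # only now does the recompute run: calls == [49]
```

In a real toolkit the same idea maps onto, e.g., binding the recompute to the release signal instead of the value-changed signal.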
So far so good. But some sliders that were supposed to be inactive (their parent section was inactive) were still active.
Me:
Again from a user interface responsiveness point of view, it seems that if I play with the sliders when the option is disabled, the computation still happens in the background. Could those two sliders have an inactive state when the option they parameterize is disabled?
Model:
Excellent observation! Let's disable the sliders when their parent options are disabled. We need to...
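Again, no code is shown in the post; the idea is simply that toggling the parent option propagates an enabled/disabled state down to its child controls, and a disabled control swallows input without firing anything. A minimal sketch, with the same kind of hypothetical stand-in classes as above:

```python
class Slider:
    """Hypothetical slider that ignores input while disabled."""

    def __init__(self, on_commit):
        self.enabled = True
        self.value = 0
        self.on_commit = on_commit

    def set_enabled(self, flag):
        self.enabled = flag

    def release(self, value):
        # A disabled slider ignores the event entirely: no state change, no recompute.
        if not self.enabled:
            return
        self.value = value
        self.on_commit(value)


class OptionGroup:
    """Toggling the option propagates its active state to the child sliders."""

    def __init__(self, sliders):
        self.sliders = sliders

    def set_active(self, active):
        for s in self.sliders:
            s.set_enabled(active)


calls = []
sliders = [Slider(on_commit=calls.append) for _ in range(2)]
group = OptionGroup(sliders)
group.set_active(False)
sliders[0].release(10)  # ignored: the parent option is off
group.set_active(True)
sliders[0].release(10)  # now it fires: calls == [10]
```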
But that wasn't the end. If you assumed the model would have fixed all the controls, you would be wrong.
Me:
Oh, I just noticed that the problem I had with the sliders is general - changing the text leads to image regeneration for each deleted letter and each typed letter. Can we make all those options trigger only when the change is final? For the text I assume it would be when the mouse leaves the box?
Model:
Yes! Let's make all the text inputs only update when you're done editing.
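The same commit-when-done pattern applies to text: per-keystroke events only mutate the buffer, and the expensive regeneration fires once when focus leaves the box, and only if the text actually changed. Once more a hypothetical sketch, not any specific toolkit's API:

```python
class TextField:
    """Hypothetical text box that defers its callback until editing ends."""

    def __init__(self, on_commit):
        self.text = ""
        self._last_committed = None
        self.on_commit = on_commit  # the expensive image regeneration

    def type_char(self, ch):
        # Fired per keystroke: mutate the buffer only, no regeneration.
        self.text += ch

    def focus_out(self):
        # Fired when the cursor leaves the box: commit once, only if changed.
        if self.text != self._last_committed:
            self._last_committed = self.text
            self.on_commit(self.text)


regens = []
box = TextField(on_commit=regens.append)
for ch in "new caption":
    box.type_char(ch)  # eleven keystrokes, zero regenerations
box.focus_out()  # one regeneration with the final text
box.focus_out()  # nothing changed since the last commit: no-op
```

In a real toolkit this corresponds to reacting to an "editing finished" or focus-out event rather than a "text changed" event.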
Maybe I am a bit harsh on the model here, or optimistic about humans, but if I had had a human intern tasked with coding an interface around my code, I would have expected saying something like "Kid, don't restart the heavy compute hundreds of times when a parameter slider moves" to be enough for them to fix all of those triggers.
And the model had everything in its context. We had even discussed some potential optimizations, and I had received an encyclopedic (and mostly correct) lecture on the pluses and minuses of each. It gave the impression that it "knew" about the heavy compute; it even gave me a very good ballpark estimate of the potential performance gains. But it did not see a problem with calling the heavy part constantly.
It could be, as Steve Jobs once famously put it, that "I am using it wrong". It could be that if, instead of going at it conversationally, I had offloaded the whole thing to a fully agentic / deep-thinking pipeline of agents, they would have figured it all out by themselves. But the drawbacks would have been up to 100-1000 times more tokens (burn, baby, burn) and my complete disconnection from the code.
Amodei recently said that soon we won't verify code, just as we don't verify compiler output (well, some of us do, but that's another topic; most users don't). That's a bit scary imho. Even if hallucinations/mistakes are solved, we might end up as satisfied monkeys passively watching code doing stupid things in the background.