Just want to clarify, this is not my Substack, I’m just sharing this because I found it insightful.
The author describes himself as a “fractional CTO” (no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI, so he decided to experience it for himself. From the author (emphasis mine):
I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.
I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.
Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.
Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive.
And all they’ll hear is “not failure, metrics great, ship faster, productive” and go against your advice because who cares about three months later, that’s next quarter, line must go up now. I also found this bit funny:
I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me… I was proud of what I’d created.
Well, you didn’t create it; you said so yourself. Not sure why you’d be proud. It’s almost like the conclusion should’ve been blindingly obvious right there.
The top comment on the article points that out.
It’s an example of a far older phenomenon: once you automate something, the corresponding skill set and experience atrophy. It’s a problem that predates LLMs by quite a bit. If the only experience gained is with the automated system, the skills are never acquired. I’ll have to find it, but there’s a story about a modern fighter-jet pilot not being able to handle a WWII-era Lancaster bomber. They don’t know how to do the stuff that modern warplanes do automatically.
It’s more like the ancient phenomenon of spaghetti code. You can throw enough code at something until it works, but the moment you need to make a non-trivial change, you’re doomed. You might as well throw away the entire code base and start over.
And if you want an exact parallel, I’ve said this from the beginning, but LLM coding at this point is the same as offshore coding was 20 years ago. You make a request, get a product that seems to work, but maintaining it, even by the same people who created it in the first place, is almost impossible.
I agree with you, though proponents will tell you that’s by design. Supposedly, it’s like with high-level languages: you don’t need to know the actual assembly instructions anymore to write a program with them. I think the difference is that high-level language instructions are still (mostly) deterministic, while an LLM prompt certainly isn’t.
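To make that difference concrete, here’s a toy sketch in Python (no real compiler or model API, just the shape of the two behaviors; in practice even temperature 0 isn’t fully deterministic once batching and floating point get involved):

```python
import random

# A compiler is (mostly) a pure function: same source in, same instructions out.
def compile_statement(src: str) -> str:
    table = {"x = 1 + 2": "MOV R0, 3"}  # toy instruction selection
    return table[src]

# An LLM samples from a probability distribution over tokens, so the same
# prompt can yield a different completion on every run when temperature > 0.
def toy_llm(prompt: str, temperature: float = 0.8) -> str:
    candidates = ["for i in range(n):", "while i < n:", "for i, x in enumerate(xs):"]
    weights = [0.5, 0.3, 0.2]
    if temperature == 0:
        return candidates[0]  # greedy decoding: always the most likely option
    return random.choices(candidates, weights=weights)[0]

print(compile_statement("x = 1 + 2"))  # identical on every run
print(toy_llm("write me a loop"))      # may differ from run to run
```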
Once you automate something, the corresponding skill set and experience atrophy. It’s a problem that predates LLMs by quite a bit. If the only experience gained is with the automated system, the skills are never acquired.
Well, to be fair, different skills are acquired. You’ve learned how to create automated systems, that’s definitely a skill. In one of my IT jobs there were a lot of people who did things manually, updated computers, installed software one machine at a time. But when someone figures out how to automate that, push the update to all machines in the room simultaneously, that’s valuable and not everyone in that department knew how to do it.
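Something like this, roughly (a minimal sketch; it assumes key-based SSH access and a hypothetical hosts.txt listing one machine per line):

```python
# Push the same update to every machine in the room at once
# instead of walking desk to desk.
import subprocess
from concurrent.futures import ThreadPoolExecutor

UPDATE_CMD = "sudo apt-get update && sudo apt-get -y upgrade"

def update_host(host: str) -> tuple[str, int]:
    result = subprocess.run(["ssh", host, UPDATE_CMD],
                            capture_output=True, text=True)
    return host, result.returncode

with open("hosts.txt") as f:  # hypothetical inventory file
    hosts = [line.strip() for line in f if line.strip()]

with ThreadPoolExecutor(max_workers=16) as pool:
    for host, rc in pool.map(update_host, hosts):
        status = "ok" if rc == 0 else f"failed ({rc})"
        print(f"{host}: {status}")
```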
So yeah, I guess my point is, you can forget how to do things the old way, but that’s not always bad. Like, so you don’t really know how to use a scythe, that’s fine if you have a tractor, and trust me, you aren’t missing much.
yeah i don’t get why the ai can’t do the changes
don’t you just feed it all the code and tell it? i thought that was the point of 100% AI
So there’s actual developers who could tell you from the start that LLMs are useless for coding, and then there’s this moron & similar people who first have to fuck up an ecosystem before believing the obvious. Thanks fuckhead for driving RAM prices through the ceiling… And for wasting energy and water.
They are useful for doing the kind of boilerplate, boring stuff that any good dev should have largely optimized and automated already. If it’s 1) dead simple and 2) extremely common, then yeah, an LLM can code for you, but ask yourself why you don’t already have a time-saving solution in place for those common tasks. As with anything LLM, it’s decent at replicating how humans in general have responded to a given problem, if the problem is not too complex and not too rare, and not much else.
As you said, “boilerplate” code can be script generated - and there are IDEs that already do this, but in a deterministic way, so that you don’t have to proof-read every single line to avoid catastrophic security or crash flaws.
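To illustrate what “deterministic” buys you, a toy generator (the field spec here is made up): same input in, same code out, every single run, with nothing to proofread for hallucinations:

```python
# Toy deterministic boilerplate generator: emit a dataclass from a field spec.
FIELDS = [("user_id", "int"), ("email", "str"), ("active", "bool")]

def generate_dto(name: str, fields: list[tuple[str, str]]) -> str:
    lines = ["from dataclasses import dataclass", "", "@dataclass", f"class {name}:"]
    lines += [f"    {fname}: {ftype}" for fname, ftype in fields]
    return "\n".join(lines)

print(generate_dto("User", FIELDS))  # byte-for-byte identical on every run
```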
And then there are actual good developers who could or would tell you that LLMs can be useful for coding, in the right context and if used intelligently. No harm, for example, in having LLMs build out some of your more mundane code like unit/integration tests, have it help you update your deployment pipeline, generate boilerplate code that’s not already covered by your framework, etc. That it’s not able to completely write 100% of your codebase perfectly from the get-go does not mean it’s entirely useless.
Other than that it’s work that junior coders could be doing, to develop the next generation of actual good developers.
If it’s boilerplate, copy/paste and find/replace work just as well, without needing data centers in the desert to develop.
And then there are actual good developers who could or would tell you that LLMs can be useful for coding
The only people who believe that are managers and bad developers.
You’re wrong, whether you figure that out now or later. Using an LLM where you gatekeep every write is something that good developers have started doing. The most senior engineers I work with are the ones who have adopted the most AI into their workflow, and with the most care. There’s a difference between vibe coding and responsible use.
There’s a difference between vibe coding and responsible use.
There’s also a difference between the occasional evening getting drunk and alcoholism. That doesn’t make an occasional event healthy, nor does it mean you are qualified to drive a car in that state.
People who use LLMs in production code are - by definition - not “good developers”. Because:
- a good developer has a clear grasp on every single instruction in the code - and critically reviewing code generated by someone else is more effort than writing it yourself
- pushing code to production without critical review is grossly negligent and compromises data & security
This already means the net gain with use of LLMs is negative. Can you use it to quickly push out some production code & impress your manager? Possibly. Will it be efficient? It might be. Will it be bug-free and secure? You’ll never know until shit hits the fan.
Also: using LLMs to generate code, a dev will likely violate open-source copyrights left and right, effectively copy-pasting licensed code from other people without attributing authorship, i.e. they exhibit parasitic behavior and outright violate the law. Furthermore, the stuff that applies to all users of LLMs applies:
- they contribute to the hype, fucking up our planet, causing brain rot and skill loss on average, and pumping hardware prices to insane heights.
We have substantially similar opinions, actually. I agree on your points of good developers having a clear grasp over all of their code, ethical issues around AI (not least of which are licensing issues), skill loss, hardware prices, etc.
However, what I have observed in practice is different from the way you describe LLM use. I have seen irresponsible use, and I have seen what I personally consider to be responsible use. Responsible use involves taking a measured and intentional approach to incorporating LLMs into your workflow. It’s a complex topic with a lot of nuance, like all engineering, but I would be happy to share some details.
Critical review is the key sticking point. Junior developers also write crappy code that requires intense scrutiny. It’s not impossible (or irresponsible) to use code written by a junior in production, for the same reason. For a “good developer,” many of the quality problems are mitigated by putting roadblocks in place to…
- force close attention to edits as they are being written,
- facilitate handholding and constant instruction while the model is making decisions, and
- ensure thorough review at the time of design/writing/conclusion of the change.
When it comes to making safe and correct changes via LLM, specifically, I have seen plenty of “good developers” in real life, now, who have engineered their workflows to use AI cautiously like this.
Again, though, I share many of your concerns. I just think there’s nuance here and it’s not black and white/all or nothing.
While I appreciate your differentiated opinion, I strongly disagree. As long as there is no actual AI involved (and considering that humanity is dumb enough to throw hundreds of billions at a gigantic parrot, I doubt we would stand a chance of developing true AI, even if it were possible to create), the output has no reasoning behind it.
- it violates licenses and denies authorship, and if everyone were indeed equal before the law, this alone would disqualify the code output from such a model, because it’s simply illegal to use code in violation of license restrictions and stripped of licensing/authorship information
- there is no point. Developing code is 95-99% solving the problem in your mind, and 1-5% actual code writing. You can’t have an algorithm do the writing for you and then skip the thinking part. And if you do the thinking part anyway, you have gained nothing.
A good developer has zero need for non-deterministic tools.
As for potential use in brainstorming ideas / looking at potential solutions: that’s what Usenet was good for, before those very corporations fucked it up for everyone. Now they’re force-feeding everyone the snake oil, pretending it has any semblance of intelligence.
violates licenses
Not a problem if you believe all code should be free. I’m being cheeky, but this has nothing to do with code quality, even though it’s true
do the thinking
This argument can be used equally well in favor of AI assistance, and it’s already covered by my previous reply
non-deterministic
It’s deterministic
brainstorming
This is not what a “good developer” uses it for
Maybe they’ll listen to one of their own?
The kind of useful article I would expect, then, is one explaining why word prediction != AI
Don’t worry. The people on LinkedIn and tech executives tell us it will transform everything soon!
Fractional CTO: some small companies benefit from the senior experience of this kind of executive but don’t have the money or the need to hire one full time. These people spend a fraction of their time serving as C-suite for various companies.
Or he’s some deputy assistant vice president or something.
Deputy assistant to the vice president
Great article, brave and correct. Good luck getting through to the same leaders who blindly believe in a magical trend for this quarter’s or next quarter’s numbers; they don’t care about things a year away, let alone ten.
I work in HR and was struck by the parallel with the management jobs gutted by major corps starting in the 80s and 90s during “downsizing,” which they either never replaced or offshored. They had the Big 4 telling them it was the future of business. Know who is now providing consultation to them on why they have poor ops, processes, high turnover, etc.? Take $ on the way in, and on the way out. AI is just the next in a long line of smart people pretending they know your business while you abdicate knowing your business or your employees.
Hope leaders can be a bit braver and wiser this go ’round so we don’t get to a cliff’s edge in software.
Tbh I think the true leaders are high on coke.
I’m trying
I cannot understand and debug code written by AI. But I also cannot understand and debug code written by me.
Let’s just call it even.
To quote your quote:
I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.
I think the author just independently rediscovered “middle management”. Indeed, when you delegate the gruntwork under your responsibility, those same people are who you go to when addressing bugs and new requirements. It’s not on you to effect repairs: it’s on your team. I am Jack’s complete lack of surprise. The idea that you can rely on AI to do nuanced work like this and arrive at the exact correct answer to the problem is naive at best. I’d be sweating too.
The problem, though (with AI compared to humans): the human team learns, i.e. at some point they probably know what the mistake was and avoid doing it again. With AI instead of humans: well, maybe the next or a different model will fix it… maybe.
And what is very clear to me after trying to use these models: the larger the code base, the worse the AI gets, to the point of not helping at all or even being destructive. The exception is dissecting small, isolatable pieces of independent code (i.e. keeping the context small for the AI).
Humans likely get slower with a larger code base, but they (usually) don’t reach a point where they can’t progress any further.
Something any (real, trained, educated) developer who has even touched AI in their career could have told you. Without a 3 month study.
What’s funny is this guy has 25 years of experience as a software developer, but three months was all it took to make it worthless. He also said it was harder than if he’d just written the code himself. Claude would make a mistake, he would correct it. Claude would make the same mistake again, having learned nothing, and he’d fix it again. Constant firefighting, he called it.
As someone who has been shoved in the direction of using AI for coding by my superiors, that’s been my experience as well. It’s fine at cranking out stackoverflow-level code regurgitation and mostly connecting things in a sane way if the concept is simple enough. The real breakthrough would be if the corrections you make would persist longer than a turn or two. As soon as your “fix-it prompt” is out of the context window, you’re effectively back to square one. If you’re expecting it to “learn” you’re gonna have a bad time. If you’re not constantly double checking its output, you’re gonna have a bad time.
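A toy model of why that correction evaporates (real limits are measured in tokens, not messages, but the effect is the same):

```python
# Once the conversation outgrows the window, the model literally never
# sees your earlier fix-it prompt again.
CONTEXT_LIMIT = 4  # messages the model can "see"
history: list[str] = []

def send(message: str) -> list[str]:
    history.append(message)
    return history[-CONTEXT_LIMIT:]  # everything older is invisible to the model

send("user: write the parser")
send("assistant: <parser that uses eval()>")
send("user: never use eval(), it's unsafe")  # the fix-it prompt
send("assistant: <parser without eval()>")
send("user: now add CSV support")
send("assistant: <CSV parser... using eval() again>")
visible = send("user: why is eval() back?!")
print(any("never use eval" in m for m in visible))  # False: the fix fell out
```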
It’s still useful to have an actual “study” (I’d rather call it a POC) with hard data you can point to, rather than just “trust me bro”.
Untrained dev here, but the trend I’m seeing is spec-driven development where AI generates the specs with a human, then implements the specs. Humans can modify the specs, and AI can modify the implementation.
This approach seems like it can get us to 99%, maybe.
I do a lot with AI, but it is not good enough to replace humans, not even close. It repeats the same mistakes after you tell it no, and it doesn’t remember things from three messages ago when it should. You have to keep re-explaining the goal to it. It’s wholly incompetent. And yeah, when you have it do stuff you aren’t familiar with or didn’t create, definitely: I have it write commentary, or I take the time right then to ask it what x or y does, then I add a comment.
They never actually say what “product” they make; it’s always “shipped product,” like they’re a fucking Amazon warehouse. I suspect that’s because it’s some trivial webpage a student could whip up in an afternoon, which they spent three days arguing with an autocomplete to shit out.
Just ask the ai to make the change?
Just sell it to AI customers for AI cash.
You just won capitalism. You and Musk can go to Mars now. We’ll send a postcard.
My big fear with this stuff is security. It just seems so “easy”, without knowledgeable people, for AI to write a product that functions from a user perspective but is wide open to attack.
No shit
What’s interesting is what he found out. From the article:
I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.
I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.
Typical C-suite. It takes them three months to come to the same conclusion that would be blindingly obvious to anyone with half a brain: if you build something that no one understands, you’ll end up with something impossible to maintain.
Personally I tried using LLMs for reading error logs and summarizing what’s going on. I can say that even with somewhat complex errors, they were almost always right and very helpful. So basically the general consensus of using them as assistants within a narrow scope.
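For anyone curious, the pattern is just “tail of the log plus one focused question.” A rough sketch, where llm_complete() is a hypothetical stand-in for whatever chat client your workplace actually provides:

```python
from pathlib import Path

def llm_complete(prompt: str) -> str:
    # Hypothetical stand-in: wire this up to whatever LLM client you use.
    raise NotImplementedError

def summarize_errors(log_path: str, tail_lines: int = 200) -> str:
    # Keep the scope narrow: only the tail of the log, only one question.
    tail = Path(log_path).read_text().splitlines()[-tail_lines:]
    prompt = ("Summarize the root cause of the errors in this log excerpt "
              "in three sentences or fewer:\n\n" + "\n".join(tail))
    return llm_complete(prompt)
```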
Though it should also be noted that I only did this at work. While it seems to work well, I think I’d still limit such use in personal projects, since I want to keep learning more, and private projects are generally much more enjoyable to work on.
Another interesting use case I can highlight is using a chatbot as documentation when the actual documentation is horrible. However, this only works within the same ecosystem, so for instance Copilot with MS software. Microsoft definitely trained Copilot on its own stuff and it’s often considerably more helpful than the docs.
AI is hot garbage and anyone using it is a skillless hack. This will never not be true.
Wait so I should just be manually folding all these proteins?
Do you not know the difference between an automated process and machine learning?
Yes? Machine learning has been huge for protein folding, and not because anyone is stupid; it’s because protein folding is a task uniquely suited to machine learning, of which there are many. But none of that is what this AI bubble is really about, and even though I find the underlying math and technology fascinating, I share the disdain for how the bulk of it is currently being used.
The thing with being cocky is, if you’re wrong, it makes you look like an even bigger asshole.
https://en.wikipedia.org/wiki/AlphaFold
The program uses a form of attention network, a deep learning technique that focuses on having the AI identify parts of a larger problem, then piece it together to obtain the overall solution.
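For the curious, the “attention” being described boils down to something like this minimal numpy sketch of scaled dot-product attention (AlphaFold’s actual architecture is, of course, far more elaborate):

```python
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    # Score how relevant each part of the problem is to each query...
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # ...turn the scores into weights (softmax)...
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # ...and piece the parts together as a weighted combination.
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((5, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (5, 8)
```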
Cool, now do an environmental impact on it.