Vibecoding: The good, the bad, and the ugly
Author
joprocorp
If you were to ask me, "Do you vibecode?", then, like most (if not all) questions asked of a developer, the answer is probably going to be: it depends.
I see LLMs and code copilot tools a bit like I see my IDE, the compiler, a framework, or my PC: as tools. But unlike my IDE, the compiler, a framework, or my PC, I also see them as "juniors". Frankly, the temptation to just throw a problem at a model and leave it to figure things out on its own is strong.
How do I see code copilots?
I said earlier that I saw copilots as juniors. That's a bit of an overstatement. More specifically, I see them as fresh-out-of-school interns: lots of knowledge, lots of theory, but very little practice. That's not a bad thing per se, but at the very least I can expect interns to ask me when they don't know something (sometimes they don't ask, but that's an entirely different problem).
How do I use code copilots?
In general, the process I go through with my interns is a bit like this: we receive a feature story and the appropriate designs (I'm skipping some steps for simplicity's sake), we extrapolate from the designs the necessary API endpoints, database table columns, and so on. Then we discuss the implementation and architecture, and how it links back to the existing structure, and then we implement.
I do the same with LLMs.
First, I give it the context: a bunch of pre-written markdown files, much like the documentation I would give an intern before they even touch their keyboard. Then I give it the feature story and ask: how would you do it? I read the answer, then begin a discussion in Ask mode. And once I am happy with the plan, I open a new session and give it the context and the implementation plan.
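To make that concrete, here is a hypothetical sketch of what one of those pre-written context files might look like. The project details below are invented for illustration; the structure is the point, not the specifics:

```markdown
# Project context (given to the copilot before any feature work)

## Stack
- Backend: REST API over a relational database
- Frontend: component-based UI

## Conventions
- New endpoints follow the existing controller/service/repository layering.
- Database changes go through migrations, never manual schema edits.
- Validation lives at the API boundary, not in the UI.

## Ways of working
- Planning first: propose an approach in Ask mode and wait for approval.
- Do not start implementing until the plan is explicitly agreed on.
```

The exact contents will differ per project; what matters is that the copilot gets the same onboarding material an intern would.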
How good is it?
In terms of quality, to be honest, I was surprised. Even without going through a planning phase, it does pretty okay. Not as good as I would've liked sometimes, but again: I kind of see it as an intern, and I don't expect much from interns. With the planning phase, however, I get results that I would call pretty professional. It could be better, but then again, so could most of the code everyone writes.
I do want to emphasise that I don't give it free rein on everything, nor do I give it gigantic feature stories. The use cases mostly range from "add a button here" to "create a new form and store the submissions in the database". I'm not asking it to build an entire app in one go. I do it pretty much the way we always do: step by step, story by story, sprint by sprint.
Now, on to the first good point: it lowers the activation energy. There are tasks that are not hard. Just… annoying: boilerplate, wiring things together, adding validation, updating types, updating the documentation! None of it is intellectually challenging, but it still takes time and focus. A copilot removes a lot of that friction. You describe what you want, and suddenly 70% of the scaffolding is there. It reduces the "blank page" problem to almost zero.
Second: it is very good at filling in the obvious gaps. Sometimes you know exactly what you want to do, but you don't want to spend five minutes remembering the exact syntax. Or the correct parameters. Or the specific way a library expects something to be structured. A copilot is surprisingly good at the boring parts of memory: the stuff you've done a hundred times but don't want to context-switch for. It keeps you in flow instead of forcing you to search documentation every ten minutes.
Third: it forces you to be clearer. If you give vague instructions, you get vague or wrong code. To get good output, you need to provide context such as constraints, existing patterns, expected behavior, edge cases. In a way, it rewards structured thinking. Explaining a feature clearly to a copilot feels a lot like explaining it to a junior. And if you can’t explain it clearly, that’s usually a signal that the feature itself isn’t well defined yet. This is a good feedback loop before or after refinements!
Fourth: it is fast. Obvious, but still worth stating. Even when it gets things slightly wrong, it produces something instantly. Iterating with it is cheap. You can explore approaches, discard them, try another angle, all in minutes. What I love about it is it dramatically compresses the "try something and see" cycle.
How bad is it?
If, like me, you don't have high expectations, it's pretty okay, so in terms of "bad" there isn't much. However, I do have a few... things I dislike.
First: it always wants to jump into the implementation. When I told you it reminds me of interns fresh out of school, I really wasn't joking. During planning phases, it almost always asks me whether I want to start implementing. It jumps straight into "are we done? Can I code?" mode. A bit like an impatient junior trying to prove themselves.
It's fine when interns do it, since coding actually helps them learn and is, overall, a good way for them to get to know the codebase and the ways of working. But an AI doesn't learn by coding. It just spits out code and then forgets about it. So I get a little frustrated.
Second: it doesn't ask questions. As an example, if I tell it to make a button at a certain spot, it sometimes doesn't even ask what kind of button. Sure, sometimes it can guess based on context and adjacent code snippets, but even when it can't, it just makes stuff up. I already know that LLMs suffer a lot from hallucinations, but this is still a small pain point I wanted to mention.
Third: it is overly confident. Interns are sometimes overly confident too, but they usually show hesitation. They’ll say "I think this works" or "I’m not entirely sure about this part." There’s at least a signal. Copilots don’t hesitate. Even when they are guessing or the context is incomplete because the correct answer depends on business rules that were never specified. The output looks clean, structured, and authoritative. If you’re not careful, you can mistake that tone for correctness. As I said earlier, it rarely says: "I don’t have enough information."
And because the code compiles, or looks coherent at first glance, the mistakes can be subtle. Which means you can’t lower your guard. If anything, you have to review it more critically than you would review an intern’s work precisely because it sounds so sure of itself.
One ChatGPT feature I liked was the "alternative options", where you could choose between different outputs. That would be a good fix here. Now, I know that serious vibe coders run different models in parallel, but that's a bit overkill in my opinion. That's why I see this not so much as a pain point and more as a "could be better" point.
Finally, it amplifies existing mistakes. Copilots are very good at picking up patterns from the codebase. That's one of their strengths. They look at what's already there, and they try to be consistent with it. In theory, that's exactly what you want. In practice, no codebase is perfect. There are legacy shortcuts, old patterns you wouldn't introduce today, and slight architectural compromises made under deadline pressure six months ago. Things you fully intend to clean up "later." Heh.
An intern might copy those too, sure. But at least there's room for discussion. You can tell them: "Yes, that's there, but we don't do it like that anymore." Or they might even question it by themselves. A copilot usually doesn't. If something was done a certain way once, it may assume that's the canonical way and replicate the same workaround, simply because, to it, existing makes it true.
It doesn't know which patterns are intentional and which are historical accidents. So instead of just helping you move faster, it can also help you quietly scale your technical debt. And that's probably the most ironic part: the better it gets at mimicking your codebase, the more faithfully every following prompt will replicate its flaws.
Do I vibecode?
As I said before, the answer depends. Depends on what, you ask? Well, on the definition of "vibecoding."
- If vibecoding means throwing a vague idea at a model and shipping whatever comes out, then no. I don’t. That sounds less like engineering and more like gambling.
- If vibecoding means using a copilot to move faster, to scaffold, to explore, to handle the boring parts, then yes. Absolutely.
So do I vibecode? It depends.