AI made coding more enjoyable
A couple weeks ago I tried agentic coding tools for the first time. Very late to the party, I know. I didn’t know what tasks I would use them for. And the notion of letting AI do all my coding, as many AI zealots suggest, doesn’t sit well with me. I’m responsible for the work I put out, AI-generated or not. So vibe coding is out of the question for anything beyond a first draft or prototype.
For these reasons, trying out agentic AI for coding was a low priority for me. But as these tools got more and more hyped, I wanted to see if there was something I could use. If they can make me more productive, I want to have them.
Another factor was that I discovered a couple weeks ago that Poe, the chat product I use (which supports virtually all models), started offering API access. This means I can use an open source coding agent, connect it to the Poe API, and use the tokens I already pay for. Until now I’ve used only ~20% of my Poe tokens, so this is a great way to make use of the remaining 80%. At the same time, I don’t have to pay for multiple AI subscriptions.
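For the curious, “connect it to the Poe API” boils down to pointing an OpenAI-compatible client (or coding agent) at Poe instead of OpenAI. A minimal sketch, assuming Poe exposes an OpenAI-compatible endpoint at something like api.poe.com/v1 (the base URL and model name here are assumptions, check the current docs):

```python
# Minimal sketch: use any OpenAI-compatible client against Poe's API.
# Base URL and model identifier are assumptions -- verify against Poe's docs.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_POE_API_KEY",         # key issued by Poe, not OpenAI
    base_url="https://api.poe.com/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="Grok-4-Fast",                # hypothetical model identifier on Poe
    messages=[{"role": "user", "content": "Summarize what this repo does."}],
)
print(response.choices[0].message.content)
```

In a coding agent like Cline, the equivalent should be selecting an OpenAI-compatible provider and entering the same base URL, API key, and model name.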
As a start I immersed myself in the AI coding landscape. I had a list of products and a bunch of articles saved in my feed reader, which I checked out and read in one go. When I do this, it sparks ideas. Ideas like automatically adding logging and observability code based on the rules I defined. This was a strong motivational boost.
Enjoyable parts of coding with AI
Cline, the agent I ended up using (more on that in the aside below), works so well that I quickly got accustomed to it. The Cline team built something amazing, kudos!
Tedious writing
To me, by far the most enjoyable thing about coding with AI is that it can take over tedious writing for me. Because honestly, writing fatigue is real. When I procrastinate, it’s often because there’s a lot of tedious writing involved (boilerplate code, lots of error handling, etc.), not because the task is complicated.
In these cases, I usually know exactly how to do it. That means reviewing the resulting code is trivial, which makes it a perfect task for AI.
Does it make me faster? Probably not. But does it take less effort and make the work more enjoyable, especially when I’m exhausted? Hell yes!
With Cline I just tell it what to do. First in plan mode, to make sure it does what I want and won’t do anything weird or unexpected. Then in act mode, to change the code. The last step is reviewing the changes, usually making a few adaptations, and committing.
Code analysis
Another use case is analyzing code. Either code I just wrote, to see if it could be better (whatever that means), or existing code, for faster understanding. Sometimes even for navigation (find where X happens), but only when I’m feeling especially lazy.
It’s like having an all-knowing coworker at your fingertips, who knows every pattern, every API, every best practice, and can compare the code against it. This is particularly helpful when I’m working in areas I’m not very familiar with.
It makes mistakes, and this is where my own judgment comes in. I decide whether I like its output or not. But the suggestions it makes draw from a huge repository of knowledge that otherwise wouldn’t be accessible to me.
Small tasks
Coding agents can often implement small changes, or find and fix simple bugs, all on their own. If I know roughly where the bug is I can add hints to increase the success rate.
This is not revolutionary by any means. What I love about it is that I can write the message, leave the computer, come back a couple minutes later, and have code to review. If it is what I expected it’s a nice little dopamine hit.
Lower activation cost
I’m not a machine. My motivation rises and falls during the day and week. When motivation is low, sometimes there’s a tedious writing task I don’t want to do, sometimes it’s a larger task I put off. I know I have to do these tasks, and eventually I’ll start.
With AI, I can delegate the task to it. I could tell it to do the whole thing, or some part of it. Or maybe to give me an overview of how a specific subsystem currently works. Or even, as stupid as it sounds, find a specific file for me. Whatever it is that I need as a starting point. And that’s basically zero effort.
This workflow lowers the motivation and willpower it takes to start a task, which reduces the time I spend procrastinating, especially during periods of low motivation.
After a couple weeks with coding agents I can say that, for me, they’re not about becoming 10x faster. I doubt current AI tools can do that once the complexity of the codebase is beyond a certain threshold. Maybe that changes in the future. They’re more about raising my output quality and helping me stay productive through the lows in my motivation.
Aside: Evaluation phase
The first tool I tried was Cline. Mostly because it’s open source, allows arbitrary OpenAI-compatible APIs, and lets you choose whatever model you like. So actually the first thing I did was try out all the state-of-the-art models and compare how they work. There was one specific task I made them all do: analyzing and refactoring a specific file, a relatively large React hook. Never before has this file been edited so much.
The models I tried out first were:
- GPT-5
- Claude Sonnet 4
- Claude Opus 4.1
- Gemini 2.5 Pro
- Qwen3 Coder
GPT-5 was worse than expected, but that’s mostly because the Poe API doesn’t support reasoning effort yet, which means it runs on minimal reasoning. That makes it comparatively bad and rules it out for anything useful.
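For context: reasoning effort is just a per-request parameter on OpenAI-style chat APIs, and if the provider in between doesn’t forward it, the model falls back to its lowest setting. A minimal sketch of what that parameter looks like, assuming a provider that passes it through (parameter names and accepted values may differ):

```python
from openai import OpenAI

# Pointed at a provider that actually forwards reasoning parameters.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-5",
    reasoning_effort="high",  # e.g. "minimal" | "low" | "medium" | "high"; dropped by some proxies
    messages=[{"role": "user", "content": "Refactor this React hook into smaller functions."}],
)
print(response.choices[0].message.content)
```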
Qwen3 Coder also didn’t work well. It didn’t properly call the tools Cline provided. With the hype around Qwen models I expected more, so it was kind of a letdown.
Gemini 2.5 Pro was ok, but not as good as the others.
Claude Sonnet 4 and Opus 4.1 were the best from that list. They solved the task quite well and worked very well with Cline, which is not surprising since Cline recommends these models and is designed with them in mind.
What I found interesting in my tests is that, even though Opus costs ~7x as much as Sonnet, in many cases the resulting costs were close to even. Opus used far fewer tokens to achieve the same result.
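A back-of-the-envelope illustration with made-up numbers (only the ratios matter): if Opus charges roughly 7x per token but needs only a fraction of the tokens, the bills land in the same ballpark.

```python
# Hypothetical relative prices and token counts, purely to illustrate the ratio effect.
sonnet_price_per_token, opus_price_per_token = 1.0, 7.0
sonnet_tokens_used, opus_tokens_used = 70_000, 11_000   # Opus finishes in far fewer tokens

print(sonnet_price_per_token * sonnet_tokens_used)  # 70000.0
print(opus_price_per_token * opus_tokens_used)      # 77000.0, close to even despite the 7x price
```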
After these initial tests I discovered additional models that had just been released:
- Grok-4 Fast
- GPT-5 Codex
Initially I disregarded Grok because of the controversy around Twitter/X and the early missteps of the Grok models (ahem MechaHitler ahem). But when I tried it out I was blown away.
It costs a fraction of the Claude models, about a tenth of Sonnet, and the quality is very close. It doesn’t work quite as well, but in my experience the generated code is less verbose and makes fewer changes to the original code, which I like. I don’t want the AI to rewrite half my codebase for a simple request; that just adds unnecessary code review effort.
So, unexpectedly, Grok-4 Fast is a top contender.
Later I tried out GPT-5 Codex as well. It worked a lot better than plain GPT-5; with Cline I’d put it on par with Grok-4 Fast (I can’t say how well it works with other coding agents). One thing I noticed and highly appreciate: during the agent loop, GPT-5 Codex checked out other instances of the same pattern in my code. At first I was confused why it wanted to read seemingly random files, until I realized what it was doing. I was impressed.
I haven’t tried it yet, but it’s likely possible to get the Claude models or Grok to do the same with a simple rule. Nonetheless, it’s a nice detail of GPT-5 Codex.
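If I do try it, the idea would be a plain-language instruction in the agent’s rules file (a .clinerules file in Cline’s case, if I understand the feature right). Untested, just a sketch of the wording:

```
Before changing a file, search the codebase for other places that use the
same pattern or API, read one or two of them, and keep the change consistent
with those existing conventions.
```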
These are my experiences with the different models. For now I’m sticking with Grok-4 Fast, simply because the quality to cost ratio is incredible, and the minor improvements of other models aren’t worth the increased cost.
Claude 4.5 Sonnet
Between writing the first draft and now, Claude 4.5 Sonnet was released, which shows that a post like this can never really be up to date.
Anyway, I tried it out briefly. In my tests it wasn’t much better than Claude 4 or Grok-4. This is likely because the tasks I let AI handle are relatively simple. I assume that the improvements of the new models are more apparent for more complex tasks.
Other tools
There are a bunch of other tools as well. I tried out Continue, which was a big step down from Cline. Other tools like Cursor, Windsurf, Claude Code, etc. are still on the list. My expectation is that they may work slightly better, but won’t be a step change. Maybe I’ll be proven wrong.