My experience with the current wave of AI tooling in software development

As one of the authors of the Thoughtworks Technology Radar I’d known about the transformer technology at the heart of the current wave of AI for quite a while. We had mentioned Google’s BERT in 2019. I had also read the Stochastic Parrot paper when it came out in 2021.

However, for me, as for almost everyone else, the release of ChatGPT proved to be a turning point. Now this technology wasn’t just a component to be used in the systems we built; it had the potential to assist in the writing of software itself.

Then GitHub Copilot became available, and I started experimenting with it almost immediately. You can read my post Taking Copilot to difficult terrain on this site. At the time I found Copilot helpful to an extent, but I didn’t feel it could write complex code.

What I did last summer

Fast forward to 2025 and the arrival of much more capable models and better tooling. During the summer I used Windsurf with Sonnet 3.5 to implement a feature for CCMenu. (CCMenu is an open source Mac app that I maintain.) If you want to see the details, have a look at the commits in July 2025.

Windsurf did speed up writing code, but I felt short-changed. The AI was doing what I love, writing code, while I now had to do stuff I don’t like: writing specifications and reviewing mediocre code. My LinkedIn post on the topic got a fair bit of attention:

Agentic coding

The semantic diffusion of the term “agentic” happened with breathtaking speed, but in the context of software development it’s probably fair to say that an agentic tool is defined by its ability to work across multiple files at once and to call external tools.
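To make that definition concrete, the core of such a tool can be sketched as a loop in which the model proposes tool calls and the harness executes them and feeds the results back. This is a minimal Python sketch with a stubbed model standing in for the LLM; the tool names and message format are illustrative assumptions, not any vendor’s actual API:

```python
# Minimal sketch of an agentic loop: the "model" decides which tool
# to call next, the harness runs the tool and appends the result to
# the conversation history. stub_model scripts a fixed sequence; a
# real agent would send `history` to an LLM API at that point.

def read_file(path: str, files: dict) -> str:
    return files.get(path, "<not found>")

def write_file(path: str, content: str, files: dict) -> str:
    files[path] = content
    return "ok"

def stub_model(history: list) -> dict:
    # Decide the next action based on how many tool results exist so far.
    step = len([m for m in history if m["role"] == "tool"])
    if step == 0:
        return {"action": "tool", "name": "read_file",
                "args": {"path": "main.py"}}
    if step == 1:
        return {"action": "tool", "name": "write_file",
                "args": {"path": "main.py", "content": "print('hello')\n"}}
    return {"action": "finish", "summary": "edited main.py"}

def run_agent(task: str, files: dict) -> str:
    history = [{"role": "user", "content": task}]
    while True:
        decision = stub_model(history)
        if decision["action"] == "finish":
            return decision["summary"]
        if decision["name"] == "read_file":
            result = read_file(files=files, **decision["args"])
        else:
            result = write_file(files=files, **decision["args"])
        history.append({"role": "tool", "content": result})

files = {"main.py": "print('hi')\n"}
print(run_agent("change greeting", files))  # → edited main.py
```

The point of the sketch is the shape of the loop, not the stub: multi-file work and tool calls both fall out of letting the model drive that loop.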

I started experimenting with Claude Code, reimplementing the feature that I had written with Windsurf in the summer. Of course, that’s the ultimate irony: rather than speeding me up, the tools slowed me down because I couldn’t resist experimenting with them. That said, this time round I felt that, finally, the LLM-based tool was an actual help.

Parts of my experience are covered in my report on martinfowler.com. Or, if you understand German, you can watch me replay the experiment while chatting with Roland on his Never Code Alone channel.

What’s next?

During my time at Thoughtworks I used Copilot in commercial settings, too. And I was involved with the work on CodeConcise, an internal accelerator that helps teams understand large legacy codebases. With legacy modernisation in particular, I saw a lot of value in LLM-based tooling.

In my experience, though, the reality is far, far removed from visions of fleets of agents building complex systems out of thin air. Understanding what needs to be built is not something that the current breed of tools addresses in any sensible form, an idea I expressed in this post:

So, 2026 looks like another interesting year. Will the tooling evolve further? Will overblown expectations meet reality? And what will use of those tools cost in the end, once the vendors stop heavily discounting the cost of tokens in their race to “win” the market?
