Agents Everywhere, Apparently
For the last few years companies have been working hard to convince us that AI is going to replace huge swathes of jobs and workers. This isn't cynical marketing from a few outliers. It seems to be a genuine belief held at the top of some of the biggest companies in the world, and there's enough real-world output to make the claim sound plausible on paper. Yet every time I actually sit and use one of these tools, or watch someone else use one, I see the same thing happen. It makes things up. It gets sidetracked. It forgets simple instructions halfway through a task. It builds the wrong thing entirely and then presents it back to you with full confidence.
So where exactly are all these replaced jobs going to come from, when the tool that's supposed to be doing the replacing can't reliably finish what you asked it to do without a person sat next to it checking the output?
The latest version of this story is agents. The world-changing breakthrough of feeding an LLM a constant stream of screenshots and asking it what to do next. Burning tokens, and water, and electricity at a frankly alarming rate, all in the name of replacing human tasks that a person could have done in half the time without setting fire to a lake in the process. I've been pitched a few of these now, sat through the slick demos and the carefully chosen scenarios, and what's being sold is a complete fantasy. They all need human oversight. They all get things wrong a meaningful chunk of the time. Some of the failure rates being quoted internally are around twenty percent, and that's the figure the people selling them are willing to admit to.
A twenty percent failure rate on a human worker would get them sacked. On an AI agent, it's apparently a triumph of innovation worth billions in investment. The maths only works if you're allowed to ignore everything that goes wrong, and right now an awful lot of the industry is happy to do exactly that. We've convinced ourselves these tools are more effective than they actually are.
What keeps getting glossed over is that the oversight isn't a small layer on top. It's most of the work. If I have to read every email the agent drafts, check every form it fills in, verify every decision it tried to make on my behalf, then I've not been replaced. I've been turned into a quality controller for a machine that produces output faster than I can meaningfully check it. That isn't a productivity gain. That's just a different shape of the same job, with worse pay and more frustration.
I'm not anti-AI. I use these tools all the time, and there are bits of my work where they genuinely help. The point isn't that the technology is useless. The point is that the gap between what's being sold and what's being delivered is enormous, and somewhere in that gap is the assumption that millions of people are about to lose their jobs to a thing that can't currently be trusted to book a meeting room without inventing a building that doesn't exist.
Maybe the agents will get there eventually. Maybe in five years the failure rate is down to something that justifies the hype. Right now, though, the pitch and the product are not the same thing, and I'm tired of pretending otherwise every time someone wheels out another screenshot loop and calls it the future of work.