Humans have been automating tasks for centuries. Now AI companies see a path to profit in tapping into our passion for efficiency, and they have a name for their solution: agents.
AI agents are autonomous programs that perform tasks, make decisions, and interact with environments with little human intervention, and they are now the focus of every major company working on AI. Microsoft has “copilots” designed to help companies automate things like customer service and administrative tasks. Google Cloud CEO Thomas Kurian recently outlined a pitch for six different AI productivity agents, and Google DeepMind just poached the co-lead of OpenAI’s Sora video product to work on developing simulations for training AI agents. Anthropic has released a feature for its AI chatbot Claude that lets anyone create their own “AI assistant.” OpenAI includes agents as Stage 3 in its five-stage approach to reaching AGI, or human-level artificial intelligence.
Of course, autonomous systems are nothing new in computing. Many people have visited a website with a pop-up customer service bot, used an automated voice assistant feature like Alexa Skills, or written a humble IFTTT script. But AI companies argue that “agents” (it’s better not to call them bots) are different. Instead of following simple, memorized instructions, they believe agents will be able to interact with environments, learn from feedback, and make decisions without constant human input. They could perform tasks such as making purchases, booking trips, or planning meetings, adapting to unforeseen circumstances and interacting with systems that might include humans and other AI tools.
Artificial intelligence companies hope agents will provide a way to monetize powerful, expensive AI models. Venture capital is pouring into AI agent startups that promise to revolutionize the way we interact with technology. Companies envision a leap in efficiency, with agents handling everything from customer service to data analysis. For individuals, AI companies promise a new era of productivity, with routine tasks automated and time freed up for creative and strategic work. The end goal for true believers is an AI that is a true partner, not just a tool.
“What you really want,” OpenAI CEO Sam Altman told MIT Technology Review earlier this year, “is just this thing that helps you.” Altman described the killer AI app as a “super-competent colleague that knows absolutely everything about my whole life, every email, every conversation I’ve ever had,” but that “doesn’t feel like an extension.” It can handle simple tasks right away, Altman added, and for more complex ones it will attempt them but come back with questions when necessary. Tech companies have been trying to automate the personal assistant since at least the 1970s, and now they’re promising they’re finally getting close to that goal.
At an OpenAI press event ahead of the company's annual Dev Day, Head of Developer Experience Romain Huet demonstrated the company's new real-time API with an assistant agent. Huet gave the agent a budget and some constraints for purchasing 400 chocolate-covered strawberries and asked it to place the order via a phone call to a fictitious store.
The service is similar to a 2018 Google reservation bot called Duplex. However, this bot could only handle the simplest scenarios – it turned out that a quarter of its calls were actually made by humans.
While this order was placed in English, Huet told me he gave a more complex demo in Tokyo: he asked an agent to book a hotel room for him in Japanese, carry out the conversation in Japanese, and then call him back in English to confirm it was done. “Of course I wouldn’t understand the Japanese part; it just handles that,” Huet said.
But Huet's demonstration immediately sparked concern in the room full of journalists. Couldn't the AI assistant be used for spam calls? Why didn't it identify itself as an AI system? (Huet updated the demo for the official Dev Day, according to one attendee, so that the agent introduced itself as “Romain's AI assistant.”) The unease was palpable, and not surprising: even without agents, AI tools are already being used for deception.
There was another, arguably more immediate problem: the demo didn't work. The agent lacked sufficient information and got the dessert flavors wrong, automatically filling in flavors like vanilla and strawberry in a column instead of noting that it didn't have that information. Agents often run into trouble with multi-step workflows or unexpected scenarios. And they use more energy than a traditional bot or voice assistant: their need for significant computing power, particularly when reasoning or interacting with multiple systems, makes large-scale operation costly.
AI agents offer plenty of potential, but for everyday tasks they are not yet significantly better than bots, assistants, or scripts. OpenAI and other labs aim to improve their reasoning through reinforcement learning and hope that Moore's Law will continue to deliver cheaper, more powerful computing.
So if AI agents aren't very useful yet, why is the idea so popular? In short: market pressure. These companies are sitting on powerful but expensive technology and are desperate to find practical use cases they can exploit and charge users for. The gap between promise and reality also creates a compelling hype cycle that drives funding; coincidentally, OpenAI raised $6.6 billion just as it started hyping agents.
Big tech companies have rushed to incorporate all kinds of “AI” into their products, but they are hoping that agents in particular could be the key to unlocking revenue. Huet's AI calling demo goes beyond what models can currently do at scale, but he told me he expects features like it to appear more frequently as early as next year as OpenAI refines its o1 model.
Currently, the concept appears to be largely limited to enterprise software stacks rather than consumer products. Salesforce, a customer relationship management (CRM) software provider, launched an “agent” feature to much fanfare a few weeks before its annual Dreamforce conference. The feature lets customers build a customer service chatbot in minutes using natural language in Slack, instead of spending a lot of time coding one. The chatbots have access to a company's CRM data and can process natural language more easily than a bot that doesn't rely on large language models, potentially making them better at limited tasks like answering questions about orders and returns.
AI agent startups (still an admittedly nebulous category) are already attracting lively investment. According to PitchBook data, they secured $8.2 billion in investor money in the last 12 months across 156 deals, an 81.4 percent increase year over year. One of the better-known players is Sierra, a customer service agent startup similar to Salesforce's latest offering, founded by former Salesforce co-CEO Bret Taylor. There's also Harvey, which offers AI agents for lawyers, and TaxGPT, an AI agent for handling your taxes.
Despite all the enthusiasm for agents, these high-stakes use cases raise an obvious question: can they really be trusted with something as serious as law or taxes? AI hallucinations, which have frequently tripped up ChatGPT users, show no sign of going away. Even more fundamental, as IBM presciently stated in 1979, “A computer can never be held accountable,” and as a consequence, “A computer must never make a management decision.” Rather than autonomous decision-makers, AI agents are best viewed as what they really are: powerful but imperfect tools for low-stakes tasks. Is that worth the big bucks AI companies are hoping people will pay?
For now, the market pressure is on, and AI companies are scrambling to monetize. “I think 2025 will be the year that agentic systems finally reach the mainstream,” said Kevin Weil, OpenAI's new chief product officer, at the press event. “And if we do it right, we'll get to a world where we can actually spend more time doing the human things that matter and a little less time staring at our phones.”