Top 30 AI Agents: Functions and Autonomy Compared

Summary
– MIT’s CSAIL has published an AI Agent Index, analyzing the functionality and capabilities of 30 leading AI agents based on 1,350 data points.
– The three main categories of agents identified are enterprise workflow platforms, chat applications with agentic tools, and browser-based agents.
– The most common use cases for these agents are research/information synthesis and automating business workflows like HR, sales, and IT.
– Levels of autonomy vary significantly, from low-autonomy chat assistants to high-autonomy browser and enterprise agents that operate with minimal human intervention.
– The development of these AI agents is geographically concentrated, with most developers based in the United States and China.
A recent analysis from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) provides a detailed look at the current landscape of AI agents, categorizing their functions and comparing their levels of autonomy. The study, which draws on 1,350 data points across 30 agents, identifies the most impactful agents available to developers and users today, moving beyond the usual headline-grabbers to highlight specialized tools across enterprise, consumer, and developer domains.
The research team organized the leading agents into three primary categories. The largest segment, comprising 13 of the 30 systems reviewed, focuses on enterprise workflow automation. These platforms are designed to streamline business tasks across departments like HR, sales, and IT. Notable examples in this group include Microsoft 365 Copilot, ServiceNow AI Agents, and IBM watsonx Orchestrate. Following closely are chat applications enhanced with agentic tools. This category includes 12 systems, such as Anthropic Claude Code and OpenAI’s ChatGPT Agent, which integrate extensive tool access within conversational interfaces. The third category consists of five browser-based agents, like Perplexity Comet and ByteDance Agent TARS. These tools interact directly with web browsers and computer systems, performing tasks with a higher degree of background execution, which researchers note presents distinct risks compared to standard chat-based search tools.
In practical terms, research and information synthesis emerges as the most common use case, featured in 12 of the profiled agents. This functionality spans both consumer assistants and enterprise platforms. The second most prevalent use is workflow automation, enabled by 11 agents, primarily within enterprise products. Another significant function is automating graphical user interface (GUI) or browser tasks, such as form filling or online booking, which is present in seven of the agents.
The study reveals considerable variation in how independently these agents operate. Chat-first assistants like Google Gemini and the standard OpenAI ChatGPT exhibit the lowest autonomy levels, functioning in a turn-based manner where they execute a single action and await further user instruction. On the opposite end, browser agents offer higher autonomy with limited options for user intervention mid-task. For instance, Perplexity’s Comet operates autonomously once a query is submitted, completing its process before allowing user feedback. Enterprise platforms show a split in autonomy. Their setup often involves a manual design phase where users configure triggers and safeguards, sometimes with AI assistance. However, once deployed, agents like Microsoft 365 Copilot or Glean can run with high autonomy, triggered by events such as a new email or database update without requiring human involvement during execution.
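The three operating styles described above (turn-based, run-to-completion, and event-triggered) can be illustrated with a toy sketch. Everything here is hypothetical and for illustration only: the function names, the `plan` list, and the `triggers` mapping are assumptions, not taken from the study or from any of the products it covers.

```python
from typing import Callable, Dict, List

# An "action" is any zero-argument step that returns a short result string.
Action = Callable[[], str]

def run_turn_based(actions: List[Action], step: int) -> str:
    """Low autonomy: execute exactly one action, then return control to the user."""
    return actions[step]()

def run_to_completion(actions: List[Action]) -> List[str]:
    """High autonomy: execute every planned action with no pause for feedback."""
    return [action() for action in actions]

def on_event(event: str, triggers: Dict[str, List[Action]]) -> List[str]:
    """Event-triggered: a human configures triggers up front; matching events
    then run their actions to completion without further involvement."""
    return run_to_completion(triggers.get(event, []))

# Hypothetical three-step plan.
plan: List[Action] = [
    lambda: "searched",
    lambda: "summarized",
    lambda: "drafted reply",
]

print(run_turn_based(plan, 0))                      # one step, then wait
print(run_to_completion(plan))                      # all steps in one go
print(on_event("new_email", {"new_email": plan}))   # fires on a configured event
```

The only difference between the modes in this sketch is where control returns to a human: after every step, after the whole plan, or only at configuration time.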
A smaller subset of agents is designed for developers, operating through command-line interfaces (CLI). These tools, such as certain coding agents, typically require explicit user confirmation for sensitive operations like file edits. Some systems also offer a “watch mode” for real-time oversight of critical actions, providing a balance between automation and control.
Geographically, the development of these advanced agents is heavily concentrated, with the United States and China serving as the primary hubs, while other regions have limited representation in the current ecosystem.
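The confirmation step described for CLI coding agents can be sketched as a simple gate in front of sensitive operations. This is a minimal illustration under stated assumptions: the `SENSITIVE` set, the `execute` function, and the operation names are hypothetical and do not reflect how any specific agent implements its checks.

```python
from typing import Callable

# Hypothetical set of operations that should never run without approval.
SENSITIVE = {"edit_file", "delete_file"}

def execute(operation: str, target: str, confirm: Callable[[str], bool]) -> str:
    """Run an operation, asking `confirm` first if the operation is sensitive."""
    if operation in SENSITIVE and not confirm(f"Allow {operation} on {target}?"):
        return f"skipped {operation} on {target}"
    return f"ran {operation} on {target}"

# Stand-in for a user who declines every prompt.
deny_all = lambda prompt: False

print(execute("read_file", "main.py", deny_all))   # not sensitive, runs
print(execute("edit_file", "main.py", deny_all))   # sensitive, blocked
```

In an interactive tool, `confirm` would typically prompt the user at the terminal; passing it in as a callable keeps the gate easy to exercise without real input.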
(Source: ZDNET)

