Why AI Agents Still Can’t Replace Freelancers

Summary
– A new study found that top AI agents can automate less than 3% of the tasks handled by the average freelancer and fail to complete most projects at an acceptable level.
– The study established the Remote Labor Index (RLI) as a benchmark to measure AI’s ability to perform economically valuable remote freelance work.
– Six leading AI agents, including Google’s Gemini 2.5 Pro and OpenAI’s GPT-5, were evaluated across 23 freelance work categories using real project briefs from platforms like Upwork.
– The highest-performing AI agent, Manus, achieved only a 2.5% automation rate, revealing a significant gap between AI’s promised capabilities and current performance.
– Despite rapid AI development, human freelancers currently have little reason to fear job replacement, as most work requires complex skills beyond current AI capabilities.
For freelance professionals concerned about artificial intelligence taking over their work, a recent study offers considerable reassurance. New research reveals that today’s most sophisticated AI agents can only handle a tiny fraction, less than 3%, of the tasks typically managed by independent contractors. This finding comes from an evaluation of systems including Google’s Gemini 2.5 Pro and OpenAI’s GPT-5, which were tested using a new benchmark called the Remote Labor Index (RLI).
The RLI was specifically developed to measure how effectively AI can perform economically valuable remote work. This benchmark provides a quantitative framework for assessing AI capabilities against the backdrop of ambitious predictions from tech leaders about AI’s potential to disrupt labor markets. The study, which has not yet undergone peer review, was posted on the preprint server arXiv.
Researchers evaluated six leading AI agents across 23 distinct categories of freelance work. These categories, identified through platforms like Upwork, included graphic design, product design, computer-aided design, and game development. Each AI was given a project brief along with any necessary files, and its final deliverables were then manually compared against work produced by human freelancers. The central question was whether an AI’s output would be acceptable to a reasonable client as commissioned work.
The results were telling. The top-performing agent, Manus, achieved an automation rate of just 2.5%. Grok 4 and Claude Sonnet 4.5 followed closely, each scoring 2.1%. These figures show that even the most advanced agents are currently far from being able to autonomously manage the diverse and complex demands of real remote labor.
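To make the automation rate concrete, here is a minimal Python sketch. It assumes the rate is simply the share of projects whose AI deliverable a reviewer judged acceptable as commissioned work; the article does not spell out the exact formula. The `Judgment` record, the `automation_rate` function, and the 40 sample projects are hypothetical illustrations, not data or code from the study.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One reviewer verdict on one AI deliverable (hypothetical record)."""
    project_id: str
    category: str
    accepted: bool  # True if a reasonable client would accept the work


def automation_rate(judgments: list[Judgment]) -> float:
    """Return the percentage of projects whose deliverable was accepted.

    Assumption: the rate is accepted projects divided by all projects.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    accepted = sum(1 for j in judgments if j.accepted)
    return 100.0 * accepted / len(judgments)


# Illustrative data only: 40 made-up projects, exactly one of them accepted.
judgments = [
    Judgment(project_id=f"p{i}", category="graphic design", accepted=(i == 0))
    for i in range(40)
]

print(f"{automation_rate(judgments):.1f}%")
```

Under this reading, a 2.5% rate corresponds to roughly one acceptable deliverable for every 40 projects attempted.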
This research underscores a critical point: human labor involves a complex blend of technical proficiency and interpersonal skills that AI has not yet mastered. Freelance work, in particular, demands a high degree of self-sufficiency, organization, and the ability to negotiate and communicate with clients, capabilities that the RLI benchmark does not even fully capture. The gap between the promise of AI and its current practical application in the freelance economy remains substantial.
It’s important to recognize that AI agents represent a significant step beyond simple chatbots. They are designed to interact with digital tools and execute multi-step tasks, positioning them as precursors to more advanced artificial general intelligence (AGI). AGI is often loosely defined as a system capable of matching or exceeding human performance on any economically valuable task. If that definition holds, the RLI study suggests we are still very far from achieving such a milestone.
While the capabilities of AI agents are advancing quickly, and tech companies are investing heavily in their development, the present reality is clear. For now, freelancers can operate with confidence that their roles remain secure. The nuanced, dynamic, and multifaceted nature of their work presents a challenge that today’s AI simply cannot meet. The prospect of companies hiring AI freelancers may belong to a future five or ten years away, but for the moment, human contractors have little reason to fear being replaced.
(Source: ZDNET)





