Study: AI Coding Tools Don’t Boost All Developers’ Speed

Summary
– AI coding tools like Cursor and GitHub Copilot, powered by models from OpenAI and others, promise productivity gains by automating coding tasks.
– A METR study found that AI tools increased experienced developers' task completion time by 19%, even though the developers themselves had predicted a 24% speedup.
– Only 56% of participants had prior experience with Cursor, though 94% had used web-based LLMs, and all received training for the study.
– Researchers suggest AI slowed developers due to time spent prompting and waiting for responses, especially in complex codebases.
– While AI tools have improved, the study cautions against assuming universal productivity gains, noting potential errors and security risks.
The impact of AI coding assistants on developer productivity isn’t as straightforward as many assume, according to fresh research examining how these tools perform in real-world scenarios. While platforms like GitHub Copilot and Cursor have become staples in modern software development, promising faster coding and fewer errors, their effectiveness varies significantly depending on the developer’s experience and project complexity.
A recent controlled experiment conducted by AI research organization METR revealed unexpected results. The study involved 16 seasoned open-source contributors working on 246 actual coding tasks from their regular projects. Half the assignments permitted the use of advanced AI tools like Cursor Pro, while the other half required manual coding. Developers working with AI assistance took 19% longer to complete tasks, a stark contrast to their initial prediction that AI would cut the time required by nearly a quarter.
One key factor behind the slowdown appears to be the overhead of interacting with AI systems. Developers spent considerable time refining prompts and waiting for responses rather than actively writing code. Additionally, AI tools struggled with large, intricate codebases, precisely the kind of project used in the study. While 94% of participants had prior experience with web-based large language models, only 56% were familiar with Cursor, suggesting a learning curve may have contributed to the inefficiency.
The findings don’t entirely dismiss AI’s potential. Previous large-scale studies have demonstrated measurable productivity gains, and METR acknowledges that AI capabilities evolve rapidly: what holds true today might not in just a few months. However, the research highlights that AI isn’t a universal shortcut for experienced developers, particularly when handling complex systems where manual oversight remains critical.
Beyond speed concerns, other studies have flagged risks like AI-generated errors and security vulnerabilities, reinforcing the need for cautious adoption. While these tools continue to improve, developers should approach them as supplements rather than replacements for human expertise, especially in high-stakes environments. The study serves as a reminder that technological advancements, no matter how impressive, don’t always translate to immediate efficiency gains in practice.
(Source: TechCrunch)