
OpenAI’s Push for Fully Automated Research

Summary

– AI researchers are excited about extending the success of coding agents like Codex to broader, long-running scientific research.
– OpenAI’s Pachocki believes continuous improvements in general model capability, like the leap from GPT-3 to GPT-4, naturally extend the duration models can work on problems.
– Training models with “reasoning” techniques, such as step-by-step problem-solving and backtracking, further increases their ability to work for longer periods.
– OpenAI specifically trains models on complex tasks like hard puzzles to teach them skills like managing large text chunks and breaking problems into subtasks.
– The current focus is on applying Codex-like problem-solving to real-world research rather than specialized goals like an automated mathematician; researchers have already used GPT-5 to crack unsolved math and science problems.

The prospect of autonomous AI research agents is generating significant excitement within the scientific community. This momentum is largely fueled by the capabilities of current coding agents, which can already handle substantial programming tasks on their own. Their success raises a compelling question: can this delegation of complex work extend beyond software into broader scientific discovery? According to OpenAI’s Jakub Pachocki, the answer is a definitive yes; he views it as a natural progression of existing technological trends.

Pachocki notes that general increases in model capability translate directly into an ability to work independently for longer. He points to the evolution from GPT-3 to GPT-4: the latter could stay on a single problem for far longer stretches without any specialized training. The development of reasoning models has provided another significant boost. By training large language models to work through problems step by step and to backtrack from errors, researchers have extended how long the models can sustain complex chains of thought. Pachocki is confident this trajectory of improvement will continue.
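For readers unfamiliar with the term, "backtracking" here borrows its name from a classic search technique: extend a partial solution one step at a time, and when a step leads to a dead end, undo it and try an alternative. The toy sketch below illustrates that idea on the 4-queens puzzle; it is a generic textbook example, not a depiction of how OpenAI actually trains its models.

```python
# Toy illustration of backtracking search: place one queen per row on an
# n x n board, extending a partial solution step by step and undoing
# (backtracking) whenever a constraint is violated.
def solve_queens(n, cols=()):
    if len(cols) == n:                       # every row has a queen: done
        return cols
    for col in range(n):
        # the new queen sits in row len(cols); check it against all
        # previously placed queens for column and diagonal conflicts
        if all(col != c and abs(col - c) != len(cols) - r
               for r, c in enumerate(cols)):
            result = solve_queens(n, cols + (col,))
            if result is not None:           # a full solution was found below
                return result
    return None                              # dead end: backtrack to caller

print(solve_queens(4))  # -> (1, 3, 0, 2)
```

A reasoning model trained to "backtrack from errors" is doing something loosely analogous in natural language: abandoning a line of argument that hits a contradiction and resuming from an earlier point.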

To cultivate this autonomy directly, OpenAI trains its systems on complex tasks such as hard puzzles from math and coding competitions. These tasks push models to develop advanced skills, including managing long text contexts and breaking large problems into manageable subtasks. The objective, however, is not merely to produce champion puzzle-solvers. As Pachocki explains, these exercises validate the underlying technology before it is applied in the real world. Building an automated mathematician is technically within reach, he says, but it is not a current priority; the focus has shifted to more urgent, practical applications.

The company’s immediate goal is to generalize the problem-solving prowess demonstrated by Codex in programming to other domains. Pachocki observes a fundamental shift in how programmers work, moving from constant manual code editing to managing teams of AI agents. This paradigm suggests that if an AI can navigate the structured logic of coding, its problem-solving framework can be adapted elsewhere. Recent achievements support this vision. Researchers utilizing GPT-5, the model behind Codex, have reported breakthroughs, employing it to find novel solutions to previously unsolved mathematical problems and to overcome stubborn obstacles in biology, chemistry, and physics research.

(Source: MIT Technology Review)
