GPT-5.4: OpenAI’s Leap Toward Autonomous AI Agents

Summary
– OpenAI has launched GPT-5.4, its latest AI model featuring advancements in reasoning, coding, and handling professional documents, and it is the first with native computer operation capabilities.
– The model can write code to operate computers and issue keyboard/mouse commands in response to screenshots, improving its ability to use web browsers and tools.
– GPT-5.4 is better at gathering and synthesizing information from multiple sources for complex questions and is claimed to be OpenAI’s most factual model yet.
– Within ChatGPT, the GPT-5.4 Thinking model provides an outline for complex queries and allows users to adjust their request mid-response; the feature is available now on web and Android.
– GPT-5.4 is rolling out across ChatGPT, Codex, and the API, with specialized versions like GPT-5.4 Thinking for Plus/Team/Pro users and GPT-5.4 Pro for enterprise and complex tasks.
OpenAI has unveiled GPT-5.4, its latest AI model and a step toward autonomous digital assistants. The model combines improved reasoning and coding with better handling of professional documents such as spreadsheets and presentations. Most notably, it is OpenAI’s first model with native computer-use capabilities: it can operate a computer directly, carrying out tasks across applications on a user’s behalf.
GPT-5.4 is rolling out through OpenAI’s developer API and its coding tool, Codex. For everyday users, a version called GPT-5.4 Thinking is being introduced to ChatGPT. The model can write code to operate a computer and can issue keyboard and mouse commands in response to screenshots. It is also better at navigating web browsers and more accurate and efficient when using external tools and APIs to complete multi-step tasks.
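To make the screenshot-driven control described above concrete, here is a minimal sketch of an observe-then-act loop. It does not use OpenAI’s API or reflect how GPT-5.4 is implemented; `Action`, `FakeDesktop`, `scripted_policy`, and `run_agent_loop` are all hypothetical names, and a scripted function stands in for the model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One command the agent issues (hypothetical schema)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """Stand-in for a real screen: a single text field that needs focus."""

    def __init__(self) -> None:
        self.focused = False
        self.field = ""

    def screenshot(self) -> dict:
        # A real system would return pixels; a dict keeps the sketch testable.
        return {"focused": self.focused, "field": self.field}

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = True
        elif action.kind == "type" and self.focused:
            self.field += action.text


def scripted_policy(observation: dict) -> Action:
    """Placeholder for the model: picks the next action from the observation."""
    if not observation["focused"]:
        return Action("click", x=120, y=80)
    if observation["field"] == "":
        return Action("type", text="hello")
    return Action("done")


def run_agent_loop(desktop: FakeDesktop,
                   policy: Callable[[dict], Action],
                   max_steps: int = 10) -> List[Action]:
    """Repeat: take a screenshot, ask the policy for an action, apply it."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(desktop.screenshot())
        history.append(action)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


if __name__ == "__main__":
    desktop = FakeDesktop()
    for step in run_agent_loop(desktop, scripted_policy):
        print(step)
```

The `max_steps` cap is a design choice in this sketch: an agent that acts on its own observations needs a hard stop in case it never decides it is done.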
A key advancement lies in the model’s research skills. It can more persistently gather and cross-reference information from multiple sources over several steps, which is particularly useful for finding obscure details in large datasets, and then synthesize what it finds into coherent, well-structured answers. OpenAI calls GPT-5.4 its “most factual model yet,” reporting that individual claims are 33 percent less likely to be false than those generated by the earlier GPT-5.2.
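The “33 percent less likely to be false” figure is a relative reduction, so its absolute effect depends on a baseline error rate that the article does not give. The short sketch below shows the arithmetic with a purely hypothetical baseline; the 6 percent value and the `reduced_rate` helper are illustrative assumptions, not reported numbers.

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction to a baseline rate."""
    return baseline_rate * (1.0 - relative_reduction)


if __name__ == "__main__":
    hypothetical_baseline = 0.06   # assumed false-claim rate, NOT from the article
    relative_reduction = 0.33      # the reduction OpenAI reports versus GPT-5.2

    new_rate = reduced_rate(hypothetical_baseline, relative_reduction)
    absolute_drop = hypothetical_baseline - new_rate

    print(f"new rate: {new_rate:.4f}")
    print(f"absolute drop: {absolute_drop:.4f}")
```

Under that assumed baseline, the same 33 percent relative improvement amounts to a drop of about two percentage points; a lower baseline would make the absolute change smaller still.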
Within the ChatGPT interface, the GPT-5.4 Thinking model provides users with an outline of its step-by-step process for tackling complicated queries. This transparency allows individuals to adjust or refine their request while the model is still formulating its response. The approach is designed to help users steer the AI toward their desired outcome without needing to restart the entire conversation or go through numerous additional prompts. This interactive feature is currently accessible on the ChatGPT web platform and Android devices, with plans to extend it to the iOS application soon.
Availability of GPT-5.4 is being phased in across ChatGPT, Codex, and the API. The GPT-5.4 Thinking model is being offered to ChatGPT Plus, Team, and Pro subscribers. For demanding professional and academic needs, a high-performance variant called GPT-5.4 Pro is being released. This version is tailored for “maximum performance on complex tasks” and will be accessible via the API, as well as to users of ChatGPT Enterprise and Edu plans.
(Source: The Verge)





