AINewswire

OpenAI’s New o3, o4-mini Models ‘Think With Images’

Summary

– OpenAI introduced two new AI models, o3 and o4-mini, on April 16th, focusing on enhanced reasoning capabilities.
– The models can natively process and reason about visual information, allowing users to upload and analyze photos, diagrams, and screenshots.
– o3 is positioned as OpenAI’s most powerful reasoning model, excelling in complex tasks such as coding, math, and visual perception, while o4-mini is a faster, more cost-efficient alternative optimized for speed and high volumes of requests.
– Both models can use tools within ChatGPT, like web search and Python code execution, to solve multi-step problems independently.
– The new models are available to paying ChatGPT users and developers via API, with free users accessing reasoning capabilities through a “Think” option.

OpenAI has unveiled its latest AI developments, introducing two new models, o3 and o4-mini, belonging to its “o-series” focused on enhanced reasoning capabilities. Announced April 16th, these models represent a significant step forward, particularly in their ability to natively process and reason about visual information, alongside improvements in areas like coding and math.

Reasoning and Visual Integration

The core advancement highlighted with o3 and o4-mini is their capacity to “think with images.” Unlike previous models that might simply identify objects, these new systems can incorporate visual data directly into their reasoning process, known as the chain of thought. According to OpenAI, this means users can upload photos, diagrams (even blurry or imperfect ones), or screenshots, and the models can analyze, interpret, and use that visual information to solve problems or answer queries. For example, o3 could analyze an image of a complex scientific poster and potentially draw conclusions not explicitly stated in its text. This integration allows the models to manipulate images internally – cropping, zooming, rotating – as part of their problem-solving, without relying on separate specialized tools.
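To make the three operations named above concrete, here is a minimal sketch of cropping, zooming, and rotating on a tiny grid of pixel values. It is purely illustrative and says nothing about how o3 or o4-mini handle images internally, which the announcement does not detail. The `Image` type and the function names `crop`, `zoom`, and `rotate90` are hypothetical, chosen only for this example.

```python
from typing import List

# Hypothetical stand-in for an image: rows of grayscale pixel values.
Image = List[List[int]]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the height x width region whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def zoom(img: Image, factor: int) -> Image:
    """Enlarge by an integer factor by repeating each pixel (nearest neighbour)."""
    out: Image = []
    for row in img:
        wide = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row `factor` times, copying so rows stay independent.
        out.extend(list(wide) for _ in range(factor))
    return out


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*img[::-1])]


if __name__ == "__main__":
    image = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    region = crop(image, top=0, left=1, height=2, width=2)
    print(region)            # the top-right 2x2 corner
    print(zoom(region, 2))   # the same corner at twice the size
    print(rotate90(region))  # the corner turned clockwise
```

Chaining the steps, for instance cropping to a region of interest and then zooming in on it, mirrors the kind of sequence the article describes, though a production system would work on real image data rather than nested lists.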

Meet the Models: o3 and o4-mini

OpenAI positions o3 as its most powerful reasoning model to date, pushing performance boundaries in complex tasks involving coding, math, science, and visual perception. It reportedly makes fewer errors than previous models on difficult tasks and excels at analyzing charts and graphics. o4-mini is designed as a smaller, faster, and more cost-efficient alternative. While still highly capable, especially in math, coding, and visual tasks, it’s optimized for scenarios requiring speed or handling high volumes of requests. Both models are also described as being able to “agentically” use all available tools within ChatGPT – like web search, Python code execution, and image generation – to tackle multi-step problems more independently.

Availability and Context

The new reasoning models are rolling out first to paying ChatGPT users (Plus, Pro, Team) and to developers via the API. Free users can reportedly access the reasoning capability via a “Think” option. This release follows closely on other recent OpenAI announcements, including the GPT-4.1 family of models, indicating a rapid pace of development. Integrating visual understanding deeply into language models is a key area of advancement across the AI industry, and o3 and o4-mini are OpenAI’s latest contribution to it.



The Wiz

Wiz Consults, home of the Internet, is led by "the twins", Wajdi & Karim, experienced professionals who are passionate about helping businesses succeed in the digital world. With over 20 years of experience in the industry, they specialize in digital publishing and marketing and have a proven track record of delivering results for their clients.