Topic: AI model release

  • GPT-5.4: OpenAI's Leap Toward Autonomous AI Agents


    OpenAI has launched GPT-5.4, a major AI upgrade featuring native computer use capabilities, allowing it to directly operate a computer and applications on a user's behalf. The model offers enhanced reasoning, superior coding, and professional software proficiency, with a "Thinking" version in Cha...

  • OpenAI's GPT-5.1 Introduces 8 Custom AI Personalities


    OpenAI has released GPT-5.1 Instant and GPT-5.1 Thinking, which are more responsive and personable models designed to address past criticisms of excessive agreeableness and to handle different types of queries effectively. The new models feature eight preset personalities for varied interaction s...

  • GPT-5.2 vs. Gemini 3: Can It Finally Surpass the Competition?


    OpenAI has released GPT-5.2, a model designed for professional knowledge work. The company claims it is its most capable model yet for workplace productivity, and the release was accelerated to compete with rivals like Google and Anthropic. The model shows significant performance gains, scoring 70.9% on an ec...

  • Open-Source AI Coding Model Rivals Proprietary Options


    Mistral AI has launched Devstral 2, a powerful open-source AI coding model that achieves a 72.2% score on the SWE-bench benchmark, positioning it as a strong competitor to proprietary tools. The release includes the Mistral Vibe command-line tool for project-wide AI assistance and a smaller, loca...

  • Gemini 3 Leads the AI Race - For Now


    Google's Gemini 3 has achieved record-breaking performance and adoption, quickly topping AI leaderboards and attracting over one million users in its first day, setting a new industry benchmark. The model demonstrates a clear lead in key areas like coding, mathematics, and visual comprehension, w...

  • China's Free AI Model Outperforms GPT-5 and Sonnet 4.5


    Moonshot's new open-source AI model, Kimi K2 Thinking, claims to outperform top proprietary models like GPT-5 and Claude Sonnet 4.5 on key benchmarks including reasoning and information retrieval. The model is freely available, trained for just $4.6 million, and uses a Mixture-of-Experts architec...

  • Claude Haiku 4.5 matches top AI models at a fraction of the cost


    Anthropic released Claude Haiku 4.5, a compact AI model that matches the performance of its earlier Sonnet 4 model while being faster and one-third the cost. The model is designed for efficient coding assistance and rivals top-tier models in specific tasks but lacks the extensive general knowledg...

  • Nous Research Launches Hermes 4 AI, Outperforming ChatGPT Without Restrictions


    Hermes 4 is a family of open-source large language models that challenges proprietary AI systems by offering comparable performance with fewer content restrictions and greater user control. It introduces a hybrid reasoning feature for transparency in problem-solving and achieves top-tier results,...

  • The Most Misunderstood Graph in AI Explained


    METR's report on Claude Opus 4.5's performance, suggesting it could handle tasks estimated to take humans up to five hours, sparked intense discussion and alarm within the AI community. The findings are highly uncertain, with METR emphasizing substantial error bars in its estimates and clarifying...
