ai benchmarks

Tracking AI’s Rise and the Future of Nuclear Power

February 6, 2026

Blue tile art of a ship sailing on waves with a rising graph.

Recent AI models like Claude Opus 4.5 are advancing faster than predicted, but true capability requires careful evaluation beyond dramatic…

Artificial Intelligence

Amazon Bets Against AI Benchmark Obsession

December 3, 2025

Man in suit gestures while speaking at Web Summit conference.

Amazon's SVP of AGI, Rohit Prasad, criticizes the AI industry's focus on standardized benchmarks, arguing they are noisy and fail…

AI & Tech

New AI Benchmark Tests Chatbots’ Commitment to Human Wellbeing

November 24, 2025

Woman studies book while using ChatGPT on her phone.

HumaneBench is a new evaluation framework designed to systematically measure AI chatbots' impact on user welfare, focusing on principles like…

AI & Tech

Google’s Gemini 3: Smarter, Faster, and Free

November 20, 2025

Google has launched Gemini 3, a free, smarter, and faster AI model that enhances user interactions across its ecosystem with…

AI & Tech

China’s Open-Source Triumph in AI: Kimi K2 Thinking Rewrites the Rules

November 9, 2025

KIMI K2 Thinking text overlayed on a circuit board graphic with an idea lightbulb.

The new AI frontier is open-source, and it's being driven from the East. Moonshot AI’s Kimi K2 Thinking, an affordable…

Artificial Intelligence

China’s Free AI Model Outperforms GPT-5 and Sonnet 4.5

November 9, 2025

Dramatic red and dark clouds with a circular overlay.

Moonshot's new open-source AI model, Kimi K2 Thinking, claims to outperform top proprietary models like GPT-5 and Claude Sonnet 4.5…

AI & Tech

DeepSeek V3.1: The Most Powerful Open AI Model Yet

August 20, 2025

Abstract art depicting a whale formed from computer code, superimposed on a textured Chinese flag.

DeepSeek V3.1 is a powerful 685-billion parameter open-source AI model that matches the performance of top proprietary systems, challenging the…

AI & Tech

OpenAI Unveils GPT-5 & New Models: AI-Powered Software Creation

August 8, 2025

Diverse team of tech superheroes in blue costumes, holding devices, against a cityscape. Number '5' is on their suits.

OpenAI launched the GPT-5 model family, featuring four variants (GPT-5, GPT-5 Mini, GPT-5 Nano, and GPT-5 Pro) tailored for different…

Artificial Intelligence

Mixed Reactions to OpenAI’s New Open Source GPT Models

August 7, 2025

A stylized painting of four diverse people reacting to something on a laptop. Warm, expressive colors and brushstrokes.

OpenAI released its first major open-source models since 2019, gpt-oss-120B and gpt-oss-20B, sparking debate over their competitive performance in benchmarks…

Artificial Intelligence

OpenAI Unveils Two New Open-Source AI Reasoning Models

August 6, 2025

OpenAI has released two open-source AI reasoning models (gpt-oss-120b and gpt-oss-20b), marking its first major open-weight release since GPT-2 and…

AI & Tech

OpenAI Launches Open-Source GPT Models: GPT-OSS-120B & GPT-OSS-20B

August 6, 2025

A young man proudly displays a technologically advanced plant to a cheering crowd in a field.

OpenAI released two open-source language models, GPT-OSS-120B and GPT-OSS-20B, offering high performance and accessibility for enterprises and developers. The models…

Artificial Intelligence

Exaone Ecosystem Grows With Cutting-Edge AI Models

August 4, 2025

Diagram showing Exaone's AI platform, featuring interconnected modules for vision language, chat, data foundry, and path 2.0.

LG AI Research launched Exaone 4.0, a hybrid reasoning AI model excelling in science, math, and coding benchmarks, outperforming competitors…

Artificial Intelligence

Elon Musk’s xAI Unveils Grok 4 with $300/Month Subscription

July 10, 2025

Elon Musk's xAI launched "Grok 4" and Grok 4 Heavy, its most advanced AI models, alongside a $300/month SuperGrok Heavy…

AI & Tech

AGI Debate Divides Microsoft and OpenAI

July 9, 2025

Silhouette of a person with circuit board patterns emanating from their open head, symbolizing AI integration or transhumanism.

Microsoft and OpenAI disagree on defining AGI, with some reports linking it to a $100 billion profit benchmark, highlighting widespread…

Artificial Intelligence

DeepSeek R1-0528 Rivals OpenAI & Gemini in Open Source AI

May 29, 2025

A stylized 3D rendering of a sperm whale breaching, surrounded by a digital matrix of binary code and circuit lines.

DeepSeek's new open-source AI model DeepSeek-R1-0528 rivals premium models like OpenAI and Google Gemini, offering advanced reasoning capabilities while remaining…

Artificial Intelligence