Amazon's SVP of AGI, Rohit Prasad, criticizes the AI industry's focus on standardized benchmarks, arguing they are noisy and fail…
Read More »ai benchmarks
HumaneBench is a new evaluation framework designed to systematically measure AI chatbots' impact on user welfare, focusing on principles like…
Read More »Google has launched Gemini 3, a free, smarter, and faster AI model that enhances user interactions across its ecosystem with…
Read More »The new AI frontier is open-source, and it's being driven from the East. Moonshot AI’s Kimi K2 Thinking, an affordable…
Read More »Moonshot's new open-source AI model, Kimi K2 Thinking, claims to outperform top proprietary models like GPT-5 and Claude Sonnet 4.5…
Read More »DeepSeek V3.1 is a powerful 685-billion parameter open-source AI model that matches the performance of top proprietary systems, challenging the…
Read More »OpenAI launched the GPT-5 model family, featuring four variants (GPT-5, GPT-5 Mini, GPT-5 Nano, and GPT-5 Pro) tailored for different…
Read More »OpenAI released its first major open-source models since 2019, gpt-oss-120B and gpt-oss-20B, sparking debate over their competitive performance in benchmarks…
Read More »OpenAI has released two open-source AI reasoning models (gpt-oss-120b and gpt-oss-20b), marking its first major open-weight release since GPT-2 and…
Read More »OpenAI released two open-source language models, GPT-OSS-120B and GPT-OSS-20B, offering high performance and accessibility for enterprises and developers. The models…
Read More »LG AI Research launched Exaone 4.0, a hybrid reasoning AI model excelling in science, math, and coding benchmarks, outperforming competitors…
Read More »Elon Musk's xAI launched "Grok 4" and Grok 4 Heavy, its most advanced AI models, alongside a $300/month SuperGrok Heavy…
Read More »Microsoft and OpenAI disagree on defining AGI, with some reports linking it to a $100 billion profit benchmark, highlighting widespread…
Read More »DeepSeek's new open-source AI model DeepSeek-R1-0528 rivals premium models like OpenAI and Google Gemini, offering advanced reasoning capabilities while remaining…
Read More »OpenAI has upgraded ChatGPT Pro with the new o3 reasoning model, enhancing capabilities like Operator, its autonomous web browsing tool,…
Read More »AI2's Olmo 2 1B, a 1-billion-parameter AI model, outperforms similar-sized models from Google, Meta, and Alibaba across benchmarks while being…
Read More »














