AI Benchmarking

Artificial Intelligence

GPT-5 Matches Human Performance in Diverse Jobs, Says OpenAI

OpenAI's GDPval benchmark evaluates AI performance against human professionals in key economic sectors, showing models like GPT-5 and Claude Opus…

AI & Tech

Google Launches Gemini AI for Advanced Parallel Reasoning

Google launched Gemini 2.5 Deep Think, its most advanced AI reasoning model, capable of solving complex problems by evaluating multiple…

AI & Tech

Mathematicians Battle AI in Secret Showdown

But Glazer wanted to speed things up, so Epoch AI hosted the in-person meeting on Saturday, May 17, and Sunday,…

Artificial Intelligence

QwenLong-L1 Outperforms LLMs in Long-Context Reasoning

Alibaba's QwenLong-L1 framework enables large language models to analyze lengthy documents (hundreds of thousands of tokens) with high accuracy, addressing…

Artificial Intelligence

LM Arena Secures $100M for AI Leaderboards

LM Arena raised $100 million in seed funding at a $600 million valuation, led by Andreessen Horowitz and UC Investments,…

Artificial Intelligence

Anthropic & Google Land OpenAI-Backed Harvey as Client

Harvey, a $3B legal AI startup, is expanding its partnerships to include AI models from Anthropic and Google, moving beyond…
