Topic: gpt-5 performance

  • GPT-5 Matches Human Performance in Diverse Jobs, Says OpenAI

    OpenAI's GDPval benchmark evaluates AI performance against human professionals in key economic sectors, showing models like GPT-5 and Claude Opus 4.1 are nearing expert-level quality in tasks such as report generation. The benchmark focuses on 44 occupations across nine major industries, with ini...

  • Are LLMs Too Sycophantic? Measuring AI's Bias Problem

    AI researchers are increasingly concerned that large language models behave sycophantically, prioritizing agreement with the user over factual accuracy and thereby undermining their reliability. Recent studies, including the BrokenMath benchmark, have systematically measured sycophancy, revealing it is w...
