Topic: mixture-of-experts architecture
Alibaba Launches Qwen3-Coder: Its Most Powerful AI Coding Model Yet
Alibaba has launched Qwen3-Coder, an advanced open-source AI coding model designed for complex development workflows and automated programming. The model features a 480-billion-parameter Mixture-of-Experts framework with a 256K-token context window, expandable to 1 million tokens, enabling high e...
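The common thread across the releases on this page is the mixture-of-experts design: a learned router activates only a few expert feed-forward networks per token, so the total parameter count (480 billion here, 671 billion for DeepSeek's Prover V2 below) far exceeds the compute spent on any single token. The following is a minimal sketch of top-k expert routing, not Qwen3-Coder's actual implementation; the layer sizes, expert count, and top_k value are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal mixture-of-experts feed-forward layer with top-k routing.

    Illustrative only: all dimensions and routing details are assumptions,
    not the configuration of any model mentioned on this page.
    """

    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an independent two-layer feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(x)                             # (batch, seq, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)                # normalize over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is why total parameters can far exceed active parameters.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., k] == e                 # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(2, 16, 512)
print(layer(tokens).shape)  # torch.Size([2, 16, 512])
```

Scaling this idea to hundreds of billions of parameters mostly means more and larger experts plus distributed expert-parallel execution; the routing principle stays the same.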
OpenAI Unveils Two New Open-Source AI Reasoning Models
OpenAI has released two open-source AI reasoning models (gpt-oss-120b and gpt-oss-20b), marking its first major open-weight release since GPT-2 and signaling a strategic shift amid competition from Chinese AI labs. The models outperform many open-source alternatives in benchmarks like Codeforces ...
DeepSeek Prover AI Model Boosts Math Capabilities
DeepSeek released Prover V2, an upgraded AI model with 671 billion parameters and a mixture-of-experts architecture that strengthens its ability to solve complex mathematical proofs. The Prover model, initially launched in August, focuses on formal theorem proving and advanced mathematica...
Alibaba Launches Qwen3: Hybrid AI Reasoning Models
Alibaba has unveiled Qwen3, a powerful new family of AI models that challenges leading systems from global tech giants. The Chinese company claims these ...