Topic: mixture-of-experts architecture
Alibaba Launches Qwen3-Coder: Its Most Powerful AI Coding Model Yet
Alibaba has launched Qwen3-Coder, an advanced open-source AI coding model designed for complex development workflows and automated programming. The model features a 480-billion-parameter Mixture-of-Experts framework with a 256K-token context window, expandable to 1 million tokens, enabling high e...
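The common thread across the releases on this page is the mixture-of-experts design: a learned router activates only a few expert feed-forward networks per token, so the total parameter count (480 billion here, 671 billion for DeepSeek's Prover V2 below) far exceeds the compute spent on any single token. The following is a minimal sketch of top-k expert routing, not Qwen3-Coder's actual implementation; the layer sizes, expert count, and top_k value are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal mixture-of-experts feed-forward layer with top-k routing.

    Illustrative only: all dimensions and routing details are assumptions,
    not the configuration of any model mentioned on this page.
    """

    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an independent two-layer feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(x)                             # (batch, seq, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)                # normalize over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is why total parameters can far exceed active parameters.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., k] == e                 # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(2, 16, 512)
print(layer(tokens).shape)  # torch.Size([2, 16, 512])
```

Scaling this idea to hundreds of billions of parameters mostly means more and larger experts plus distributed expert-parallel execution; the routing principle stays the same.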
OpenAI Unveils Two New Open-Source AI Reasoning Models
OpenAI has released two open-source AI reasoning models (gpt-oss-120b and gpt-oss-20b), marking its first major open-weight release since GPT-2 and signaling a strategic shift amid competition from Chinese AI labs. The models outperform many open-source alternatives in benchmarks like Codeforces ...
DeepSeek Prover AI Model Boosts Math Capabilities
DeepSeek released Prover V2, an upgraded AI model with 671 billion parameters and a mixture-of-experts architecture that strengthens its ability to solve complex mathematical proofs. The Prover model, initially launched in August, focuses on formal theorem proving and advanced mathematica...
Alibaba Launches Qwen3: Hybrid AI Reasoning Models
Alibaba has unveiled Qwen3, a powerful new family of AI models that challenges leading systems from global tech giants. The Chinese company claims these ...