Is Qwen2.5 the Meta AI LLaMA of China?
Qwen2.5 is a family of 14 foundation models created by Alibaba Cloud. Three of these models are math-specific LLMs, three are coding-specific, and the rest are general-purpose.
Qwen2.5-Coder-1.5B, Qwen2.5-Coder-7B, and Qwen2.5-Coder-32B were trained on 5.5 trillion tokens of source code, text-code grounding data, and synthetic data. They support context windows of up to 128,000 tokens, cover 92 programming languages, and perform well at code generation, code completion, and code repair. The open-source 7B version of Qwen2.5-Coder has outperformed larger models such as DeepSeek-Coder-V2-Lite and Codestral-22B on a number of benchmarks.
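Since the Coder weights are published on Hugging Face, trying them locally takes only a few lines. Below is a minimal sketch using the standard transformers chat API; the exact model ID, prompt, and generation settings are illustrative assumptions, not the team's reference setup.

```python
# Minimal sketch: code generation with Qwen2.5-Coder via Hugging Face transformers.
# Assumes the instruct-tuned 7B checkpoint; swap the ID for another size if needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"  # assumed open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

# Example coding request; any code-generation prompt works here.
messages = [
    {"role": "user",
     "content": "Write a Python function that checks whether a string is a palindrome."}
]

# apply_chat_template wraps the conversation in Qwen's chat markup.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern works for the base (non-instruct) Coder checkpoints, except those are typically prompted with raw code prefixes for completion rather than chat messages.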
The base models (Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, and Qwen2.5-72B) were pre-trained on a dataset of 18 trillion tokens. The largest of them, Qwen2.5-72B, significantly outperforms its peers across a wide range of tasks, achieving results comparable to LLaMA-3-405B while using roughly one-fifth the parameters.