Phi-2 is the latest small language model (SLM) from Microsoft. The first model in the Phi series, the 1.3-billion-parameter Phi-1, achieved state-of-the-art Python coding performance among existing SLMs, specifically on the HumanEval and MBPP benchmarks. Extending the focus to common-sense reasoning and language understanding, the follow-up 1.3-billion-parameter model, Phi-1.5, performed comparably to models 5x larger. Phi-2 is a 2.7-billion-parameter language model that demonstrates outstanding reasoning and language understanding capabilities, achieving state-of-the-art performance among base language models with fewer than 13 billion parameters. On complex benchmarks, Phi-2 matches or outperforms models up to 25x larger, thanks to new innovations in model scaling and training-data curation.
In this video, I will cover the following: Why is Phi-2 important? A recap of Phi-1 and Phi-1.5. Phi-1.5 vs. Phi-2. How is Phi-2 trained?
Phi-1 video: [ Link ]
For more details, please see [ Link ], [ Link ], and [ Link ]:
Phi-2 blog post: Javaheripi, Mojan, and Sébastien Bubeck. "Phi-2: The Surprising Power of Small Language Models." Microsoft Research Blog, December 12, 2023.
Phi-1 paper: Gunasekar, Suriya, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, et al. "Textbooks Are All You Need." arXiv preprint arXiv:2306.11644 (2023).
Phi-1.5 paper: Li, Yuanzhi, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, and Yin Tat Lee. "Textbooks Are All You Need II: Phi-1.5 Technical Report." arXiv preprint arXiv:2309.05463 (2023). [ Link ]