Blog: [Link]
Gemma 2
1. It comes in two sizes: 9B and 27B.
2. Gemma 2 27B approaches Meta Llama 3 70B performance.
3. The 27B model was trained on 13T tokens (the 9B on 8T).
4. Both base and instruction-tuned versions have been released.
5. It has an 8192-token context window.
6. Key architectural techniques are sliding-window attention, logit soft-capping, and grouped-query attention.
7. Key training techniques are SFT, knowledge distillation, RLHF, and model merging.
8. It was trained on Google TPUs.
9. Commercial use is allowed.
10. It is available on both Hugging Face and Ollama.
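Of the architectural techniques listed above, logit soft-capping is simple enough to sketch: logits are squashed through tanh so their magnitude can never exceed a fixed cap, which keeps attention and output scores from blowing up. A minimal NumPy sketch, assuming the cap values reported for Gemma 2 (50.0 for attention logits, 30.0 for final output logits):

```python
import numpy as np

def soft_cap(logits, cap):
    # Soft-capping: scale down, squash with tanh, scale back up.
    # Output always lies in (-cap, cap), and the mapping stays
    # smooth and differentiable (unlike a hard clip).
    return cap * np.tanh(np.asarray(logits, dtype=np.float64) / cap)

# Cap values assumed from the Gemma 2 report:
# 50.0 for attention logits, 30.0 for the final output logits.
print(soft_cap([5.0, 100.0], 50.0))   # ≈ [4.983, 48.201]
```

Small logits pass through nearly unchanged, while very large ones saturate just below the cap, so relative ordering is preserved.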
#gemma2 #google #huggingface #ollama #llm #generativeai #opensource #opensourcecommunity