While many early large language models were trained predominantly on English-language data, the field is evolving rapidly. Newer models are increasingly trained on multilingual datasets, and there is a growing focus on building models that serve more of the world's languages. Challenges remain, however, in ensuring equitable representation and performance across diverse languages, particularly those with less available data and fewer computational resources.
Gemma, Google's family of open models, is designed to address these challenges by enabling the development of projects in non-English languages. Its tokenizer and large token vocabulary make it particularly well suited to handling diverse languages. Watch how developers in India used Gemma to create Navarasa, a Gemma model fine-tuned for Indic languages.
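A quick sketch of why a large token vocabulary matters for non-English text (this is an illustration using plain UTF-8 bytes, not Gemma's actual SentencePiece tokenizer): a tokenizer whose vocabulary lacks dedicated tokens for a script must fall back to raw bytes, which inflates sequence length for non-Latin scripts, whereas a large vocabulary like Gemma's (roughly 256k entries) can cover such scripts with far fewer tokens.

```python
# Illustrative only: compares character count to UTF-8 byte count to show
# how byte-level fallback penalizes non-Latin scripts. Not Gemma's tokenizer.

hindi = "नमस्ते दुनिया"    # "Hello world" in Hindi (Devanagari script)
english = "Hello world"

# Each Devanagari code point costs 3 bytes in UTF-8, so a byte-fallback
# tokenizer would spend ~3x more tokens per character on Hindi text.
print(len(hindi), len(hindi.encode("utf-8")))      # 13 characters vs 37 bytes
print(len(english), len(english.encode("utf-8")))  # 11 characters vs 11 bytes
```

A tokenizer with dedicated vocabulary entries for Devanagari (as large multilingual vocabularies tend to have) avoids this blow-up, which is one reason fine-tuning efforts like Navarasa are practical on Gemma.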
Watch the full keynote: [ Link ]
To watch this keynote with American Sign Language (ASL) interpretation, please click here: [ Link ]
#GoogleIO #GoogleIO2024
Subscribe to our Channel: [ Link ]
Find us on X: [ Link ]
Watch us on TikTok: [ Link ]
Follow us on Instagram: [ Link ]
Join us on Facebook: [ Link ]