👉 Need help with AI? Reach out: [ Link ]
This is the 6th video in a series on using large language models (LLMs) in practice. Here, I review key aspects of developing a foundation LLM, drawing on how models such as GPT-3, Llama, and Falcon were built.
More Resources:
👉 Series Playlist: [ Link ]
📰 Read more: [ Link ]
[1] BloombergGPT: [ Link ]
[2] Llama 2: [ Link ]
[3] LLM Energy Costs: [ Link ]
[4] arXiv:2005.14165 [cs.CL]
[5] Falcon 180B Blog: [ Link ]
[6] arXiv:2101.00027 [cs.CL]
[7] Alpaca Repo: [ Link ]
[8] arXiv:2303.18223 [cs.CL]
[9] arXiv:2112.11446 [cs.CL]
[10] arXiv:1508.07909 [cs.CL]
[11] SentencePiece: [ Link ]
[12] Tokenizers Doc: [ Link ]
[13] arXiv:1706.03762 [cs.CL]
[14] Andrej Karpathy Lecture: [ Link ]
[15] Hugging Face NLP Course: [ Link ]
[16] arXiv:1810.04805 [cs.CL]
[17] arXiv:1910.13461 [cs.CL]
[18] arXiv:1603.05027 [cs.CV]
[19] arXiv:1607.06450 [stat.ML]
[20] arXiv:1803.02155 [cs.CL]
[21] arXiv:2203.15556 [cs.CL]
[22] Training with Mixed Precision (Nvidia): [ Link ]
[23] DeepSpeed Doc: [ Link ]
[24] [ Link ]
[25] [ Link ]
[26] arXiv:2001.08361 [cs.LG]
[27] arXiv:1803.05457 [cs.AI]
[28] arXiv:1905.07830 [cs.CL]
[29] arXiv:2009.03300 [cs.CY]
[30] arXiv:2109.07958 [cs.CL]
[31] [ Link ]
[32] [ Link ]
--
Book a call: [ Link ]
Socials
[ Link ]
[ Link ]
[ Link ]
[ Link ]
The Data Entrepreneurs
🎥 YouTube: [ Link ]
👉 Discord: [ Link ]
📰 Medium: [ Link ]
📅 Events: [ Link ]
🗞️ Newsletter: [ Link ]
Support ❤️
[ Link ]
Intro - 0:00
How much does it cost? - 1:30
4 Key Steps - 3:55
Step 1: Data Curation - 4:19
1.1: Data Sources - 5:31
1.2: Data Diversity - 7:45
1.3: Data Preparation - 9:06
Step 2: Model Architecture (Transformers) - 13:17
2.1: 3 Types of Transformers - 15:13
2.2: Other Design Choices - 18:27
2.3: How big do I make it? - 22:45
Step 3: Training at Scale - 24:20
3.1: Training Stability - 26:52
3.2: Hyperparameters - 28:06
Step 4: Evaluation - 29:14
4.1: Multiple-choice Tasks - 30:22
4.2: Open-ended Tasks - 32:59
What's next? - 34:31
How to Build an LLM from Scratch | An Overview
Tags
large language models, large language model, llm, language model, hugging face, python, programming, tutorial, lecture, workshop, training, for beginners, made easy, guide, lesson, open-source, free, how to build llm, how to build large language model, how to build chatgpt, how to build large language models, how to build large language models from scratch, how to make your own large language model, how to develop large language model, large language models explained, from scratch