Is TinyLlama any good? Let's run it on a Raspberry Pi, benchmark the baseline and some possible optimizations, and then talk to the TinyLlama living inside our Raspberry Pi through a simple barebones web server for inference.
Github Gist (benchmark commands, benchmark results, prompts):
[ Link ]
TinyLlama/TinyLlama-1.1B-Chat-v1.0
[ Link ]
TinyLlama/TinyLlama-1.1B-Chat-v1.0 (GGUF)
[ Link ]
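A "barebones web server for inference" like the one mentioned above can be sketched with nothing but the Python standard library. This is only an illustration, not the code from the video: the `generate` function below is a hypothetical stub that you would replace with a real TinyLlama call (for example via llama.cpp or its Python bindings).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate(prompt: str) -> str:
    """Stub for the model call. Swap in a real TinyLlama
    inference call (e.g. llama.cpp) here."""
    return f"(echo) {prompt}"


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body: {"prompt": "..."}
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = generate(body.get("prompt", ""))

        # Send the reply back as JSON.
        data = json.dumps({"reply": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


# To serve on the Pi:
#   HTTPServer(("0.0.0.0", 8080), ChatHandler).serve_forever()
```

You can then chat with the model from any machine on the network, e.g. `curl -d '{"prompt": "Hello"}' http://raspberrypi.local:8080/` (hostname and port are assumptions).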