On-Device LLM Inference at 600 Tokens/Sec: All Open Source