Run 70B Llama 3 Inference on a Single 4GB GPU