Wei Wei, Developer Advocate at Google, shares general principles and best practices for improving TensorFlow Serving performance. He discusses how to reduce latency across API surfaces, configure batching, and tune other serving parameters.
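One of the tunable areas mentioned above is request batching. As a rough illustration (the specific values below are placeholders, not recommendations), TensorFlow Serving accepts a batching parameters file in protobuf text format, enabled with `--enable_batching --batching_parameters_file=<path>`:

```
# Illustrative batching_parameters.txt for TensorFlow Serving.
# Values are examples only; tune them for your model and hardware.
max_batch_size { value: 128 }          # largest batch the server will form
batch_timeout_micros { value: 1000 }   # how long to wait to fill a batch
num_batch_threads { value: 8 }         # parallelism for processing batches
max_enqueued_batches { value: 100 }    # queue depth before rejecting requests
```

See the batching configuration link in the resources below for the authoritative list of parameters and guidance on choosing values.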
Resources:
TensorFlow Serving performance guide → [ Link ]
Profile Inference Requests with TensorBoard → [ Link ]
TensorFlow Serving batching configuration → [ Link ]
TensorFlow Serving SavedModel Warmup → [ Link ]
XLA homepage → [ Link ]
How to make TensorFlow models run faster on GPUs (with XLA) → [ Link ]
How OpenX Trains and Serves for a Million Queries per Second in under 15 Milliseconds → [ Link ]
ResNet complete example → [ Link ]
Deploying Production ML Models with TensorFlow Serving playlist → [ Link ]
Subscribe to TensorFlow → [ Link ]
#TensorFlow #MachineLearning #ML