AutoML and Training Working Group Updates - Andrey Velichkevich, Apple; Yuki Iwai, CyberAgent; Johnu George, Nutanix; Amber Graner, Open Source Evangelist
The AutoML Working Group (WG) is responsible for all aspects of AutoML features in Kubeflow, with Katib as its sub-project. Katib is a Kubernetes-native project with rich support for hyperparameter tuning, neural architecture search, and early stopping algorithms. The speakers will share Katib's 2024 roadmap, which includes graduation of the Katib V1 APIs, support for advanced parameter distributions (e.g. uniform or log-uniform), and tuning of large language model (LLM) parameters.

The Training WG is responsible for operating scalable distributed training jobs on Kubeflow, with Training Operator and MPI Operator as its sub-projects. Training Operator is a unified service for model training with various ML frameworks such as PyTorch, TensorFlow, XGBoost, and PaddlePaddle. MPI Operator makes it easy to run HPC and all-reduce training tasks on Kubernetes. The speakers will share Training Operator's 2024 roadmap, which includes LLM API support via the Python SDK, advanced suspend semantics, and indexed job support.