ZeRO & Fastest BERT: Increasing the scale and speed of deep learning training in DeepSpeed
Published 2021-04-13