ZeRO & Fastest BERT: Increasing the scale and speed of deep learning training in DeepSpeed
Published 2021-04-13