Democratizing Foundation Models via k-bit Quantization - Tim Dettmers | Stanford MLSys #82
Published 2023-10-23