Sparsity for Efficient Long Sequence Generation of LLMs

Published 2023-12-08