Towards Monosemanticity: Decomposing Language Models Into Understandable Components
Published 2023-10-25