Let's build GPT: from scratch, in code, spelled out.
AI Summary
GPT from Scratch - Video Summary
• Building a GPT model from scratch in code, following the paper "Attention Is All You Need"
• Implementation based on OpenAI's GPT-2 and GPT-3 architectures
• Detailed, step-by-step code walkthrough with explanations
• Covers connections between foundational transformer concepts and modern GPT models
• Educational focus on understanding how generative pre-trained transformers work at a fundamental level
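As a rough illustration of the central transformer concept these points refer to, the sketch below implements a single head of causal (masked) scaled dot-product self-attention, as described in "Attention Is All You Need", in plain Python. It is not taken from the video: the function names `softmax` and `causal_self_attention` and the toy inputs are assumptions made for this example, and a real implementation would use a tensor library with learned query, key, and value projections.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(q, k, v):
    """Single-head causal attention (illustrative helper, not from the video).

    q, k, v are lists of T vectors (lists of floats). Position t attends
    only to positions 0..t, which is what makes the model autoregressive.
    """
    d = len(q[0])  # query/key dimension, used for the 1/sqrt(d) scaling
    out = []
    for t in range(len(q)):
        # Scaled dot-product scores against the current and earlier positions only.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output at t is the attention-weighted average of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out


if __name__ == "__main__":
    q = k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(causal_self_attention(q, k, v))
```

Because the first position can only see itself, its output equals its own value vector; later positions blend the values of everything up to and including themselves.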
Video description
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention Is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections t...