Let's build GPT: from scratch, in code, spelled out.

Andrej Karpathy · 1:56:20 · 4 views

AI Summary

GPT from Scratch - Video Summary

• Building a GPT model from scratch in code, following the "Attention is All You Need" paper

• Implementation based on OpenAI's GPT-2 and GPT-3 architectures

• Detailed, step-by-step code walkthrough with explanations

• Covers connections between foundational transformer concepts and modern GPT models

• Educational focus on understanding how generative pre-trained transformers work at a fundamental level

Video description

We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections t...