Building LLMs From Scratch
Season 1: GPT-2 134M · Season 2: Llama 1B
$175 total · One GPU · Zero shortcuts
By Jun Park. Dangerously curious life scientist. Currently unsupervised with an A100.
A life scientist who got curious about transformers — the neural network kind, not the protein kind — and decided to build one from scratch.