Season 1 · Ch. 3
Scaling Up: From Tiny Model to GPT-2 Small
How I went from ‘cute toy model’ to ‘134 million parameters that need an A100 to breathe.’