Season 2 · Ch. 3

What GPUburnout-1B Actually Learned

Time to face the music. Training a language model is the fun part. You watch the loss drop, you generate text samples that are slightly less incoherent than yesterday’s, you tell yourself “look, it almost knows what France is.” It’s addictive. It’s rewarding. It also tells you absolutely nothing about how good your model actually is. Benchmarking is where the universe hands you a report card you didn’t ask for. ...

March 6, 2026 · 10 min · Jun Park

Season 1 · Ch. 5

The Results Are In (And My Wallet Is Empty)

Final loss curves, the damage to my compute budget, and 22 lessons I paid dearly to learn.

February 6, 2026 · 6 min · Jun Park
GPUburnout
Will Code for Tokens
S1 GPT-2 134M
S2 Llama 1B