Season 1 · Ch. 6

Training Optimizations Deep Dive: How I Made the A100 Actually Work

The complete technical reference for achieving a 16x speedup. Every optimization explained with code and diagrams.

February 12, 2026 · 21 min · Jun Park

Season 1 · Ch. 4

11 Training Challenges and How I Solved Them

A comprehensive guide to every way I shot myself in the foot training GPT-2 Small. Learn from my pain.

February 2, 2026 · 6 min · Jun Park
GPUburnout
Will Code for Tokens