Torch-Compile

Season 1 · Ch. 6

Training Optimizations Deep Dive: How I Made the A100 Actually Work

The complete technical reference for achieving a 16x speedup. Every optimization explained with code and diagrams.

Season 1 · Ch. 4

A comprehensive guide to every way I shot myself in the foot training GPT-2 Small. Learn from my pain.

GPUburnout

Will Code for Tokens
