Essential Math & Concepts for LLM Inference: Back-of-the-envelope calculations to estimate a model's GPU memory requirements, plus insights into HW/SW optimizations (May 31, 2024 · 12 min read)
CPU & GPU - The Basics: A digestible high-level overview of what happens in The Die (Apr 8, 2024 · 16 min read)