In this presentation, Micron will discuss the memory bandwidth requirements of large language models (LLMs) as a function of their parameter count and quantization level.
Several candidate memory technologies will be evaluated against these requirements. The storage bandwidth requirements driven by model-loading latency will also be presented and evaluated against storage technologies such as UFS and NVMe.
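
As a rough illustration of how parameter count and quantization level translate into bandwidth figures, the sketch below uses a common first-order estimate. It is not taken from the presentation. It assumes a dense model at batch size 1 in which every weight is read once per generated token, and it ignores KV-cache and activation traffic. The function names and the example numbers (7 billion parameters, 4-bit weights, 10 tokens per second, a 2-second load target) are hypothetical.

```python
def model_size_bytes(num_params: float, bits_per_weight: float) -> float:
    """Weight footprint in bytes for a given parameter count and quantization."""
    return num_params * bits_per_weight / 8


def decode_bandwidth_bytes_per_s(
    num_params: float, bits_per_weight: float, tokens_per_s: float
) -> float:
    """Memory bandwidth needed to sustain a token rate.

    Assumption: all weights are streamed from memory once per token.
    """
    return model_size_bytes(num_params, bits_per_weight) * tokens_per_s


def load_bandwidth_bytes_per_s(
    num_params: float, bits_per_weight: float, load_time_s: float
) -> float:
    """Storage bandwidth needed to load all weights within a target latency."""
    return model_size_bytes(num_params, bits_per_weight) / load_time_s


if __name__ == "__main__":
    # Hypothetical example values, chosen only to show the arithmetic.
    params, bits = 7e9, 4
    size_gb = model_size_bytes(params, bits) / 1e9
    mem_gbps = decode_bandwidth_bytes_per_s(params, bits, tokens_per_s=10) / 1e9
    sto_gbps = load_bandwidth_bytes_per_s(params, bits, load_time_s=2) / 1e9
    print(f"weights: {size_gb:.2f} GB")
    print(f"memory bandwidth for 10 tokens/s: {mem_gbps:.2f} GB/s")
    print(f"storage bandwidth for a 2 s load: {sto_gbps:.2f} GB/s")
```

Under these assumptions, both requirements scale linearly with the weight footprint, so halving the bits per weight halves the memory and storage bandwidth needed for the same token rate and load time.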