In this post, we'll discuss the major differences between GPUs and SRAM-centric accelerators (e.g. Cerebras, Groq, and d-Matrix), explain why the choice between near-compute and far-compute memory is the key tradeoff these architectures make, and look at what this means for inference workloads.