Cache Management for Mixture-of-Experts LLMs
Published in Euro-Par, 2025
The paper proposes a new formulation of cache management for Mixture-of-Experts LLMs.
Recommended citation: Spyros Angelopoulos, Loris Marchal, Adrien Obrecht, Bertrand Simon. Cache Management for Mixture-of-Experts LLMs. Euro-Par 2025: Parallel Processing, Aug 2025, Dresden, Germany. ⟨10.1007/978-3-031-99872-0_2⟩. ⟨hal-05226723⟩