Loki: Low-Rank Keys for Efficient Sparse Attention

Published in NeurIPS 2024 (to appear)

Recommended citation: Prajwal Singhania, Siddharth Singh, Shwai He, Soheil Feizi, Abhinav Bhatele, "Loki: Low-Rank Keys for Efficient Sparse Attention." arXiv, 2024. https://arxiv.org/abs/2406.02542