Loki: Low-Rank Keys for Efficient Sparse Attention
Published in NeurIPS 2024 (to appear)
Recommended citation: Prajwal Singhania, Siddharth Singh, Shwai He, Soheil Feizi, Abhinav Bhatele, "Loki: Low-Rank Keys for Efficient Sparse Attention." arXiv, 2024. https://arxiv.org/abs/2406.02542