Loki: Low-Rank Keys for Efficient Sparse Attention

Published in NeurIPS 2024

Recommended citation: Prajwal Singhania, Siddharth Singh, Shwai He, Soheil Feizi, Abhinav Bhatele, "Loki: Low-Rank Keys for Efficient Sparse Attention." Advances in Neural Information Processing Systems 37 (2024): 16692-16723. https://proceedings.neurips.cc/paper_files/paper/2024/hash/1e027da6bec9ceb2ec37951ceeccae93-Abstract-Conference.html