Repository for "A Simple and Effective L2 Norm-Based Method for KV Cache Compression", presented at EMNLP 2024. TL;DR: Tokens with low $L_2$ norm in their key embeddings ...
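
To make the idea concrete, here is a minimal sketch of norm-based cache pruning for a single attention head. It is an illustration, not the code in this repository. It assumes that the tokens to keep are the ones whose key vectors have the lowest $L_2$ norm; the function name `compress_kv_cache`, the `keep_ratio` parameter, and the plain-list data layout are all hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def compress_kv_cache(keys, values, keep_ratio=0.5):
    """Prune a per-head KV cache by key norm (hypothetical helper).

    keys, values: lists of per-token vectors, one entry per cached token.
    keep_ratio:   fraction of tokens to retain (assumed parameter).

    Assumption: tokens with the LOWEST key L2 norm are retained.
    Returns the pruned keys, pruned values, and the kept token indices.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")

    n_keep = max(1, int(len(keys) * keep_ratio))
    norms = [l2_norm(k) for k in keys]

    # Rank token positions by ascending key norm, keep the first n_keep,
    # then restore the original token order.
    ranked = sorted(range(len(keys)), key=lambda i: norms[i])
    kept = sorted(ranked[:n_keep])

    return [keys[i] for i in kept], [values[i] for i in kept], kept


# Toy cache with four tokens and 2-dimensional keys.
keys = [[3.0, 4.0], [0.1, 0.2], [1.0, 0.0], [6.0, 8.0]]
values = [[1.0], [2.0], [3.0], [4.0]]

small_keys, small_values, kept = compress_kv_cache(keys, values, keep_ratio=0.5)
print(kept)  # positions of the two tokens with the smallest key norms
```

The selection needs only the cached keys, so no attention scores have to be computed or stored to decide which entries to drop.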