MoM: Linear Sequence Modeling with Mixture-of-Memories Paper • 2502.13685 • Published Feb 19, 2025 • 36
Gated Delta Networks: Improving Mamba2 with Delta Rule Paper • 2412.06464 • Published Dec 9, 2024 • 17