Dec 2, 2024 | You don’t need step-wise labels to train strong process reward models. Check out our new preprint. |
Oct 2, 2024 | Check out LongGen and S2-Attention, simple and effective architectures that substantially reduce the KV cache overhead of long-context LLMs. We also released an efficient and easy-to-use CUDA kernel library for various types of sparse attention. |
Oct 1, 2024 | Our new preprint, FactCheckmate, shows that hallucinations can be detected and mitigated before they even happen! |
May 2, 2024 | We found that the retrieval capabilities of long-context LLMs can be attributed to a small set of attention heads. Check out our new preprint! |
Apr 2, 2024 | Check out Eurus, our state-of-the-art open-source LLMs! |