news

Dec 2, 2024 You don’t need step-wise labels to train strong process reward models. Check out our new preprint.
Oct 2, 2024 Check out LongGen and S2-Attention, simple and effective architectures that substantially reduce the KV cache overhead of long-context LLMs. We also released an efficient and easy-to-use CUDA kernel library for various types of sparse attention.
Oct 1, 2024 Our new preprint, FactCheckMate, shows that hallucinations can be detected and mitigated before they even happen!
May 2, 2024 We found that the retrieval capabilities of long-context LLMs can be attributed to a small set of attention heads. Check out our new preprint!
Apr 2, 2024 Check out Eurus, our state-of-the-art open-source LLMs!