news

Dec 2, 2024 You don’t need step-wise labels to train strong process reward models. Check out our new preprint.
Oct 2, 2024 Check out LongGen and S2-Attention, simple and effective architectures that substantially reduce the KV cache overhead of long-context LLMs. We also released an efficient and easy-to-use CUDA kernel library for various types of sparse attention.
Oct 1, 2024 Our new preprint, FactCheckMate, shows that hallucinations can be detected and mitigated before they even happen!
May 2, 2024 We found that the retrieval capabilities of long-context LLMs can be attributed to a small set of attention heads. Check out our new preprint!
Apr 2, 2024 Check out Eurus, our state-of-the-art open-source LLMs!