Announcement_3
Contrary to many recent claims, we show that RL actually learns new skills and generalizes surprisingly well in LLMs. Check out our new preprint.