The inefficiency of RL, and implications for RLVR progress
Link: https://www.dwarkesh.com/p/bits-per-sample
Discussion: https://news.ycombinator.com/item?id=46067011