Measuring AI Ability to Complete Long Tasks: Opus 4.5 has 50% horizon of 4h49m
Link: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
Discussion: https://news.ycombinator.com/item?id=46342166