A new paper, "The Leaderboard Illusion", offers a 68-page critique of the way the popular Chatbot Arena LLM leaderboard can potentially be gamed by large AI labs with deep pockets. Here's my attempt at adding some extra context to the issues described in the paper.

simonwillison.net/2025/Apr/30/

