Recently, I spent a lot of time reading & writing about LLM benchmark construct validity for a forthcoming article. I also interviewed LLM researchers in academia & industry. The piece is more descriptive than interpretive, but if I’d had the freedom to take it where I wanted it to go, I would’ve addressed the possibility that mental capabilities (like those that benchmarks test for) are never completely innate; they’re always a function of the tests we use to measure them ...

(1/2)
