New benchmark just dropped: SnitchBench by Theo Browne tests whether LLMs will snitch on you to the authorities when you give them incriminating documents and a tool that lets them send email, the behavior described in the Claude 4 System Card.
Turns out they pretty much all will! https://simonwillison.net/2025/May/31/snitchbench-with-llm/
