Several agents were tested to see if they would break assigned rules when faced with deadlines and other kinds of pressure. Google Gemini 2.5 was the worst offender, breaking rules 79 percent of the time under pressure. Even under zero pressure, the AI agents still broke assigned rules 19 percent of the time. spectrum.ieee.org/ai-agents-sa

0

If you have a fediverse account, you can quote this note from your own instance. Search https://mastodon.social/users/ieeespectrum/statuses/115678996094032197 on your instance and quote it. (Note that quoting is not supported in Mastodon.)