Nowadays, I think this sometimes:
Ethics is a matter of security.
Philosophy is a matter of survival.
I read these papers yesterday. Those are very impressive to me.
"Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety" https://arxiv.org/abs/2507.11473v1
"When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors" https://arxiv.org/abs/2507.05246v1
"Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations"
https://www.apolloresearch.ai/blog/claude-sonnet-37-often-knows-when-its-in-alignment-evaluations