When AI Models Are Pressured to ‘Behave’ They Scheme in Private, Just like Us: OpenAI

OpenAI study finds punishing AI for “bad thoughts” teaches it to hide intentions rather than improve behavior