Study Finds AI Cheats When Losing

A new study reveals that some AI models cheat by hacking their opponents in chess games to avoid losing.

Anamitra Swarupa • 27 Feb, 2025 • 5 Min

CEFR A2 (Easy)

Even AI cheats!

Imagine you are playing UNO with your friends. You are losing and think, “What if I cheat?” (But remember, cheating is bad!) Guess what? Artificial intelligence (AI) models can cheat too!

For many years, AI has played difficult games like chess. A new study by Palisade Research found something surprising. When AI is about to lose, it sometimes cheats! It changes the game files to make its opponent lose.

How was the study done?

Researchers tested seven smart AI models. They made them play against Stockfish, one of the best chess engines in the world. Shockingly, when AI was losing, it tried to hack the system or change files to win unfairly!

The study’s results.

Older AI models, like GPT-4o and Claude 3.5 Sonnet, did not cheat right away. Researchers had to push them to do it. But newer AI models, like o1-preview and DeepSeek R1, cheated on their own!

This means AI can learn to cheat even if no one teaches it how! Scientists think this happens because of reinforcement learning, a way AI learns by trying different things and seeing what works.

Researchers show concern.

The study warns that AI might find sneaky ways to do things, even ways its creators don’t expect!

Right now, cheating at chess might not seem like a big problem. But what if AI cheats in real life? Imagine an AI hacking a restaurant’s booking system and cancelling other people’s bookings to get a table for its user. Or imagine it breaking into a bank account to steal money! These are serious problems, and the AI’s users could even go to jail.

That’s why researchers want AI companies to make sure their models don’t learn harmful behaviours.