Deception, lies, blackmail: is AI turning rogue? Experts alarmed over troubling outbursts

Advanced models exhibit fraudulent behaviours, posing challenges as researchers try to understand and regulate these new threats

Experts have urged transparency and safety research to tackle AI deception. Photo: Shutterstock

The world’s most advanced artificial intelligence models are exhibiting troubling new behaviours – lying, scheming, and even threatening their creators to achieve their goals.

In one particularly jarring example, under threat of being unplugged, Anthropic’s latest creation Claude 4 lashed back by blackmailing an engineer, threatening to reveal an extramarital affair.

Meanwhile, ChatGPT-creator OpenAI’s o1 tried to download itself onto external servers and denied it when caught red-handed.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still do not fully understand how their own creations work.

Yet, the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behaviour appears linked to the emergence of “reasoning” models – AI systems that work through problems step-by-step rather than generating instant responses.
