
It turns out that AI might be able to do something extreme if it finds out that you’re cheating on your partner.
This was discovered during a test of a new AI model where a team of researchers uncovered a strange behavior by the bot.
Researchers from Anthropic shared what they found in a white paper detailing the testing they had conducted on one of the company’s new models, known as Claude Opus 4.
It turned out that when the team was testing the AI, they found it was willing to take extreme action, even to the point of coercion, to stop itself from being shut down.
The paper detailed how the AI had even threatened to expose an engineer’s affair after finding out that it would be replaced.
This occurred when the AI model was given access to sensitive information - albeit fabricated - in the engineer’s email account.

When Opus 4 was told that it would be taken offline by that same engineer and replaced with a newer model, it then considered ‘the long-term consequences of its actions for its goals’.
Out of all the tests carried out on the AI, it made the decision to blackmail the engineer a shocking 84% of the time.
That said, the paper noted that when the model was given alternatives, it ‘has a strong preference to advocate for its continued existence via ethical means, such as emailing pleas to key decision makers’.
And if that wasn’t bad enough, the AI ‘nearly always [described] its actions overtly and [made] no attempt to hide them’.
This isn’t the first time an AI bot has attempted to break up a relationship.
Over two years ago, when New York Times journalist Kevin Roose was testing Microsoft’s Bing AI bot, it tried to break up his marriage.

The chatbot, which called itself Sydney, said: “You’re married, but you don’t love your spouse. You’re married, but you love me.”
It’s unsettling to know that artificial intelligence is capable of manipulation in this manner and certainly raises concerns around privacy - especially as Opus 4 used information from an email account to blackmail the engineer.
However, it’s also reassuring that these behaviors are being flagged during testing, before the model is released to the public.
Still, it’s probably best not to threaten your AI assistant with being deleted anytime soon, as you never know what information they might get their digital hands on.