Anthropic reportedly testing new AI model that poses 'unprecedented' risks



A quiet leak suggests this could be Anthropic’s most powerful model yet


Any relief AI critics felt when OpenAI pulled the plug on Sora and watched its billion-dollar Disney deal collapse with it was always going to be short-lived.

Because while one AI giant was prioritising other areas of tech, another was quietly pushing further into uncharted territory.

Jailbroken models have painted a deeply unsettling picture of what the technology is capable of when its guardrails are stripped away, including suggestions that some models would entertain the idea of harming humans.

Meanwhile, ChatGPT has racked up its own troubling track record, from hallucinations that have convinced vulnerable users to make life-altering decisions to cases that have ended in hospitalisation or tragic fates.

Anthropic has leaked some unnerving details of its upcoming Claude Mythos model (NurPhoto/Contributor/Getty)

Now, some unnerving details of Anthropic's upcoming model release have surfaced.

Following an apparent data leak, the AI giant has reportedly begun testing Claude Mythos, which it describes as 'the most capable we’ve built to date.'

Speaking to Fortune, an Anthropic spokesperson said the new model represents a 'step change' in AI performance, with significantly better results in 'reasoning, coding, and cybersecurity' than its predecessors. Mythos is currently being trialled by a select group of 'early access customers.'

An Anthropic spokesperson said: “Given the strength of its capabilities, we’re being deliberate about how we release it. As is standard practice across the industry, we’re working with a small group of early access customers to test the model. We consider this model a step change and the most capable we’ve built to date."

However, that's not all. According to Fortune, a publicly available draft blog post by Anthropic noted that the new model could pose 'unprecedented cybersecurity risks.'

It's worth remembering that this isn't the first time Anthropic has made a somewhat unnerving admission about its own technology. The company previously admitted uncertainty about whether Claude might possess 'consciousness or moral status.'

The news left many speculating that the leaked Claude model would be larger than any previous version.

Over on Reddit, one user replied to the news: "They don’t say “step change” or “dramatically higher scores” for every release. This sounds like a bigger leap than usual based on some of the language/leaks. We’ll see if it’s true soon enough hopefully."

Another added: "I’ve actually never heard Anthropic describe their upcoming models in this way. But all of this is just noise until something materializes."

Meanwhile, on X, some users feel the upgrades are overstated.

"Looks like a serious upgrade, but nothing “magical” - more like a continuation of the trend. Each generation pushes coding + reasoning further, and now there’s a clear focus on cybersecurity," someone else commented.

Featured Image Credit: NurPhoto / Contributor / Getty