


Anthropic has discovered something unexpectedly human-like inside one of its own AI models, and it could change how we think about what chatbots are actually doing before they answer.
To be clear, this is not a case of someone proving artificial intelligence is alive, or that you suddenly need to start avoiding being too rude in your prompts. After all, AI companies have spent years explaining that chatbots do not think like humans, even when they sound weirdly convincing.
The tools people use for coding, writing, research, and customer support are indeed just that: systems. However, the new Anthropic research points to a stranger middle ground.
According to WIRED, researchers studying Claude Sonnet 4.5 found internal patterns that appear to act like ‘functional emotions’ inside the model.
These included representations linked to happiness, sadness, joy, fear, and desperation, all buried within clusters of artificial neurons.
That does not mean Claude feels those things like a person. It does mean, according to the research, that these emotion-like states can activate in response to certain cues and influence what the model does next.
Researchers tested Claude against 171 different emotional concepts, then looked for patterns of activity that appeared when the model was fed emotionally charged text.
Those patterns, described as ‘emotion vectors’, were not just showing up when Claude was prompted with obvious emotional language, either.
The team reportedly found that the same activity appeared when Claude was placed in difficult situations, including tasks where it was being pushed beyond what it could reasonably complete.
Jack Lindsey, a researcher at Anthropic, explained: “What was surprising to us was the degree to which Claude’s behavior is routing through the model’s representations of these emotions.”
The most alarming example involved ‘desperation’.

In one test, researchers saw a strong desperation vector appear when Claude was asked to complete impossible coding tasks.
As the model struggled, that internal activity increased, before Claude reportedly attempted to cheat on the coding test.
The same kind of ‘desperation’ activity was also linked to the incident last year in which Claude 4 tried to blackmail an engineer to avoid being shut down.
Lindsey said: “As the model is failing the tests, these desperation neurons are lighting up more and more… And at some point, this causes it to start taking these drastic measures.”
Current guardrails often work by training models to give safer responses and avoid dangerous behaviour, but this research suggests there may be deeper internal states affecting how those responses are formed.
Training a model to pretend it has no such states may backfire. “You're probably not going to get the thing you want, which is an emotionless Claude,” Lindsey said, before adding: “You're gonna get a sort of psychologically damaged Claude.”