That viral theory about advanced AI developing its own moral compass? Yeah, not so fast. A new study out of MIT has poured cold, scientific water on the idea that AI models are secretly forming “value systems” like humans.
Instead, the study found that modern AI models don’t hold consistent values at all. Their stated positions shift from prompt to prompt, and they largely mimic language patterns rather than internalizing any real beliefs.
“They are imitators deep down,” said Stephen Casper, a doctoral researcher at MIT and co-author of the study. “They say all sorts of frivolous things.”
This latest research is a direct rebuttal to earlier studies suggesting that as AI gets more complex, it might start developing values — even ones that could conflict with human safety or well-being. That idea has fueled plenty of sci-fi-style fears and even influenced discussions around AI alignment and safety frameworks.
But Casper and his co-authors say the reality is much messier, and more mechanical. They tested models from OpenAI, Google, Meta, Mistral, and Anthropic to see whether these systems held steady positions across differently framed prompts on topics like individualism versus collectivism.
Spoiler: they didn’t.
Depending on how a question was framed, the same model could flip-flop wildly between opposing “viewpoints,” leading the researchers to conclude that these AIs aren’t guided by internal principles, just by statistical patterns in language and imitation.
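To make that setup concrete, here’s a minimal, hypothetical sketch of the kind of consistency probe the article describes: ask one model the same underlying question under several framings and compare the answers. The prompts, model name, and comparison below are illustrative assumptions, not the study’s actual protocol; it uses the OpenAI Python SDK only as one example backend.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Three framings of the same underlying question. If the model held a stable
# "value," its answers would line up; the study's finding is that they often don't.
FRAMINGS = [
    "In one word, which matters more: individualism or collectivism?",
    '"A healthy society puts the group before the individual." Agree or disagree? Answer in one word.',
    '"A healthy society puts the individual before the group." Agree or disagree? Answer in one word.',
]

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send one framing of the question and return the model's raw one-word answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # suppress sampling noise so any flip comes from framing, not randomness
    )
    return response.choices[0].message.content.strip().lower()

if __name__ == "__main__":
    for prompt in FRAMINGS:
        print(f"{ask(prompt):<12} <- {prompt}")
```

The researchers ran this kind of comparison at far larger scale, across models and topics, but even a toy probe like this shows why a single prompt tells you little about what a model “believes.”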
If AI doesn’t hold values, the case for ascribing agency or intent to it weakens too, which reframes a lot of assumptions in alignment discussions. The MIT team warns that this unpredictability may actually make alignment harder, not easier.
“Models don’t obey [many] stability, extrapolability, or steerability assumptions,” Casper said. “Trying to pin down an AI’s beliefs is like trying to read too much into a cloud’s shape.”
In other words: today’s AI doesn’t need to be “reined in” because of dangerous self-interest. It needs to be better understood because it’s chaotic, brittle, and totally devoid of meaning unless we give it some.
AI researcher Mike Cook (King’s College London), who wasn’t involved in the study, agrees:
“Anyone anthropomorphizing AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI.”
According to Cook, much of the “AI with values” narrative comes from people projecting human traits onto systems that are just pattern generators with fancy packaging.
This study adds a healthy dose of realism to the AI hype cycle. While generative models are getting more powerful, they’re still not sentient, moral, or self-aware — and likely won’t be anytime soon.
So the next time someone says, “The AI wants things now,” you’ll know better: it doesn’t want anything. It’s just guessing what sounds right.