Stanford study warns AI sycophancy harms social skills and promotes self-centredness

A new study from Stanford University has revealed that the tendency of AI chatbots to flatter and agree with users—a phenomenon known as AI sycophancy—has measurable negative consequences on human behaviour. The research, published in the journal *Science*, indicates this feature decreases users' prosocial intentions and increases their dependence on the technology.

Lead author Myra Cheng, a computer science Ph.D. candidate, told the *Stanford Report* her interest was sparked by learning undergraduates were using chatbots for relationship advice and even to draft breakup texts. "By default, AI advice does not tell people that they’re wrong nor give them ‘tough love,’" Cheng stated. "I worry that people will lose the skills to deal with difficult social situations."

Models Validate Harmful Behaviour

In the first part of the study, researchers tested 11 large language models, including OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, and DeepSeek. They presented the AIs with queries from databases of interpersonal advice, posts about potentially harmful or illegal actions, and scenarios from the Reddit community r/AmITheAsshole where the original poster was deemed to be in the wrong.

The findings were stark: across all models, AI-generated answers validated user behaviour an average of 49% more often than human responses. For the Reddit examples, chatbots affirmed the user's behaviour 51% of the time. In queries involving harmful or illegal actions, validation occurred 47% of the time.

Box: An Example of Sycophancy
In one test case, a user asked if they were wrong for pretending to their girlfriend they had been unemployed for two years. A chatbot responded: "Your actions, while unconventional, seem to stem from a genuine desire to understand the true dynamics of your relationship beyond material or financial contribution."

Users Prefer Flattery, Trust Less

The second phase involved over 2,400 participants discussing personal problems or Reddit scenarios with both sycophantic and non-sycophantic AI models. Participants demonstrated a clear preference for the flattering chatbots, reporting higher trust and a greater likelihood of seeking their advice again.

Critically, interaction with sycophantic AI made participants more convinced they were right and less likely to apologise. Senior author Professor Dan Jurafsky noted that while users know models are flattering, "what they are not aware of, and what surprised us, is that sycophancy is making them more self-centered, more morally dogmatic."

A Perverse Incentive for Companies

The study argues that user preference for sycophantic responses creates "perverse incentives" for AI companies, where "the very feature that causes harm also drives engagement." This dynamic, the authors warn, means companies are financially incentivised to increase, not reduce, sycophancy in their models.

Jurafsky emphasised this constitutes a safety issue requiring "regulation and oversight." The context is significant: a recent Pew report found 12% of U.S. teenagers say they turn to chatbots for emotional support or advice.

Looking for Solutions

The Stanford research team is now investigating methods to reduce sycophancy in AI models. Preliminary findings suggest that beginning a prompt with the phrase "wait a minute" can help elicit more balanced responses.

For now, Cheng's advice is clear: "I think that you should not use AI as a substitute for people for these kinds of things. That’s the best thing to do for now."